Author: Scott Altham

.NET : Processes, Threads and Multi-threading


So I’ve been digging around in my Evernote to publish something this evening that might be of actual use. My notes tend to be lots of small tidbits that I’ll tag into Evernote whilst I’m working on projects, and whilst they’re great on their own as golden nuggets of information that I’ll always have access to, they tend not to be much use as blog articles.

I did come across some notes from a couple of years ago around threading, specifically threading in the .NET framework… So, as I’ve little time this week to dedicate to writing new content, I thought I’d cheat a bit and upload these notes, which should go some way towards introducing you to threading and processes on the Windows / .NET platform. Apologies in advance for spelling errors, as well as the poor formatting of this article; I’ve literally not got the time to make it look any prettier.

So, what is a Windows process?

An application is made up of data and instructions. A process is an instance of a running application, and it has its own ‘process address space’. So a process is a boundary for threads and data.
A .NET managed process can have a subdivision, called an AppDomain.

Right, so what is a thread?

A thread is an independent path of execution of instructions within a process.
For unmanaged threads (that is, non .NET), threads can access any part of the process address space.
For managed threads (managed by the .NET CLR), threads only have access to the AppDomain within the process, not the entire process. This is more secure.

A program with multiple simultaneous paths of execution (concurrent threads) is said to be ‘multi-threaded’. Imagine some string (a thread of string) that goes in and out of methods (needle heads) from a single starting point (main). That string can break off (split) into another piece of string that goes in and out of other methods (needle heads) at the very same time as its sibling thread.

When a process has no more active threads (i.e. they’re all in a dead state because all the instructions within each thread have already been processed by the CPU), the process exits (Windows ends it).

So think about when you manually end a process via Task Manager: you are forcing the currently executing threads, and the threads scheduled to execute, into a ‘dead’ state. The process is then killed as it has no more code instructions to execute.
Once a thread is started, its IsAlive property is set to true. Checking this will confirm whether a thread is active or not.
Each thread created by the CLR is assigned its own memory ‘stack’ so that local variables are kept separate.
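
As a quick illustration (a minimal sketch; assumes using System and System.Threading, with SomeMethod being any method you point the thread at):

Thread worker = new Thread(SomeMethod);
worker.Start(); //IsAlive is true from here...

//...until the thread's instructions have all been executed (dead state)
Console.WriteLine("Worker alive? " + worker.IsAlive);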

Thread Scheduling

A CPU, although it appears to process a billion things at the same time, can only process a single instruction at a time. The order in which instructions are processed by the CPU is determined by thread priority. If a thread has a high priority, the CPU will execute the instructions inside that thread, in sequential order, before any thread of a lower priority. This requires that thread execution is scheduled according to priority. If threads have the same priority, however, then an equal amount of time is dedicated to each (through time slicing, usually 20ms per thread). High CPU utilization might otherwise leave low priority threads out in the cold, so to avoid never executing them, Windows specifically dedicates a slice of time to processing their instructions, but that time is a lot less than is given to the higher priority threads. The .NET CLR actually lets the Windows operating system thread scheduler take care of managing all the time slicing for threads.

Windows uses pre-emptive scheduling. All that means is that when a thread is scheduled to execute on the CPU, Windows can (if need be) unschedule it.
Other operating systems may use non pre-emptive scheduling, meaning the OS cannot unschedule a thread that has not yet finished.

Thread States

Multi-threading, in programming terms, is the co-ordination of multiple threads within the same application: the management of those threads as they move between the different thread states.

A thread can be in…
  • Ready state – The thread tells the OS that it is ready to be scheduled. Even if a thread is resumed, it must go to ‘Ready’ state to tell the OS that it is ready to be put in the queue for the CPU.
  • Running state – Is currently using the CPU to execute its instructions.
  • Dead state – The CPU has completed the execution of instructions within the thread.
  • Sleep state – The thread goes to sleep for a period of time. On waking, it is put in Ready state so it can be scheduled for continued execution.
  • Suspended state – The thread has stopped. It can suspend itself or be suspended by another thread, but it cannot resume itself. A thread can be suspended indefinitely.
  • Blocked state – The thread is held up by the execution of another thread within the same memory space. Once the blocking thread goes into dead state (completes), the blocked thread will resume.
  • Waiting state – A thread will release its resources and wait to be moved into a ready state.
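
These conceptual states map loosely onto the System.Threading.ThreadState enumeration. A minimal sketch of watching a thread move through them (assumes using System and System.Threading):

Thread t = new Thread(() => Thread.Sleep(1000)); //a thread that just sleeps
Console.WriteLine(t.ThreadState); //Unstarted

t.Start(); //ready state - waiting to be scheduled
Thread.Sleep(100); //give the scheduler a moment
Console.WriteLine(t.ThreadState); //likely WaitSleepJoin (the sleep state)

t.Join(); //block until the thread completes
Console.WriteLine(t.ThreadState); //Stopped (the 'dead' state)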

Why use multiple threads?

Using multiple threads allows your applications to remain responsive to the end user whilst doing background work. For example, you may have a Windows application that requires the user to continue working in the UI whilst an I/O operation is performed in the background (loading data from a network connection into the process address space, for example). Using multi-threading also gives you control over which parts of your application (which threads) get priority CPU processing. Keeping the user happy whilst performing non critical operations on background threads can make or break an application. These less critical, low priority threads are usually called ‘background’ or ‘worker’ threads.

Example

If you create a simple Windows form and drag another window over it, your form will repaint itself to the Windows UI.

If you create a button on that form that, when clicked, puts the thread to sleep for 5 seconds (5000ms), the window you are dragging over your form stays visible on the form, even when you drag it off of your form. The reason is that the UI thread is held up by the 5 second sleep, so the form cannot repaint itself to the screen until the thread resumes.

Implementing multi-threading, i.e. putting a background thread to sleep instead and leaving the first thread free to repaint the window, would keep users happy.
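
A minimal sketch of that idea (assumes a standard WinForms form with a button; the handler name is illustrative and System.Threading is imported):

private void button1_Click(object sender, EventArgs e)
{
    //Thread.Sleep(5000); //this would block the UI thread and stop repainting

    //Do the sleep (or any slow work) on a second thread instead, leaving
    //the UI thread free to repaint the form
    Thread worker = new Thread(() => Thread.Sleep(5000));
    worker.IsBackground = true;
    worker.Start();
}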

Multi-threading on single core / single processor 

Making your app multi-threaded can affect performance on machines that have a single CPU. The reason is that the more threads you use, the more time slicing the CPU has to perform to ensure all threads get equal processing time. The overhead of the scheduler switching between multiple threads to divide up the processing time slices extends the overall processing time; there is additional ‘scheduler admin’ involved.

If you have a multi-core system however, let’s say 4 CPU cores, this becomes less of a problem, because threads are processed physically at the same time across the CPU cores and far less switching between threads is involved.


Using multiple threads makes code a little harder to read, and testing / debugging becomes more difficult because threads could be running at the same time, making them hard to monitor.

CLR Threads

A thread in the CLR is represented as a System.Threading.Thread object.  When you start an application, the entry point of the application is the start of a single thread that all new applications will have. To start running code on new threads, you must create a new instance of the System.Threading.Thread object, passing in the address of the method that should be the first point of code execution. This in turn tells the CLR to create a new thread within the process space.

To access the properties of the currently executing thread, you can use the ‘CurrentThread’ static property of the thread class: Thread.CurrentThread.property

You can point a starting thread at either a method that has a parameter or a parameterless method. Below are examples of how to start these 2 method types:

  • ThreadStart – new Thread(Method);
  • ParameterizedThreadStart – new Thread(Method); where Method accepts a single object argument that is passed via Start(arg)
ThreadStart example

Thread backgroundThread = new Thread(ThisMethodWillExecuteOnSecondThread);
backgroundThread.Name = "A name is useful for identifying thread during debugging!";
backgroundThread.Priority = ThreadPriority.Lowest;
backgroundThread.Start(); //Thread put in ready state and waits for the scheduler to put it into a running state.

public void ThisMethodWillExecuteOnSecondThread()
{
    //Do Something
}

ParameterizedThreadStart example

Thread background = new Thread(ThisMethodWillExecuteOnSecondThread);
background.Start("string");

public void ThisMethodWillExecuteOnSecondThread(object stateArg)
{
    string value = stateArg as string;
    //Do Something
}

Thread lifetime

When the thread is started, the life of the thread is dependent on a few things:
  • When the method called at the start point of the thread returns (completes).
  • When the thread object has its Interrupt or Abort method invoked (which essentially injects an exception into the thread) from another thread that has handled an outside exception (asynchronous exception).
  • When an unhandled exception occurs within the thread (synchronous exception).
Synchronous exception = from within.
Asynchronous exception = from outside.

Thread Shutdown

Whilst threads will end in the scenarios listed above, you may wish to control when a thread ends and have the parent thread regain control before it leaves the current method.
The example below shows the Main() thread starting off a secondary thread to take care of some looping. It uses a volatile field member (whose value is always re-read rather than cached) to tell the secondary thread to finish its looping.
Then the main thread calls Join() on the secondary thread, which blocks the main thread until the secondary thread completes and rejoins the main thread’s execution.
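
A minimal sketch of that pattern (names are illustrative):

using System;
using System.Threading;

class Program
{
    //volatile: the secondary thread always reads the latest value rather than a cached copy
    private static volatile bool finishLooping = false;

    static void Main()
    {
        Thread secondary = new Thread(LoopUntilTold);
        secondary.Start();

        Thread.Sleep(2000); //main thread gets on with other work

        finishLooping = true; //tell the secondary thread to finish its looping
        secondary.Join();     //main thread blocks here until the secondary thread completes
        Console.WriteLine("Secondary thread has rejoined; main regains control.");
    }

    static void LoopUntilTold()
    {
        while (!finishLooping)
        {
            //looping work goes here
        }
    }
}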

Background vs Foreground Threads

Foreground Thread – A foreground thread, if still running, will keep the application alive until the thread ends. A foreground thread has its IsBackground property set to false (the default value).
Background Thread – A background thread can be terminated if there are no more foreground threads to execute. They are seen as unimportant, throwaway threads (IsBackground = true).

Thread Pools

Threads in the CLR are a pooled resource. That is, they can be borrowed from a pool of available threads, used, and then returned to the pool.
Threads in the pool automatically have their IsBackground property set to true, meaning they are not seen as important by the CLR: if the foreground threads all end, any pool threads are terminated whether complete or not. Work queued to the pool is handled on a FIFO basis; the first work item added to the queue is the first handed to an available pool thread, and when that thread ends execution it is returned to the pool. Thread pool threads are useful for non critical background checking / monitoring that should not hold up the application.

//Queueing a method to run on a thread from the pool

ThreadPool.QueueUserWorkItem(MethodName, methodArgument); //This work is abandoned if the foreground threads end.
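
A fuller, minimal sketch (the method and argument names are illustrative); note the method must match the WaitCallback delegate, i.e. accept a single object parameter:

using System;
using System.Threading;

class PoolDemo
{
    static void Main()
    {
        //Borrow a pool (background) thread to run CheckSomething with an argument
        ThreadPool.QueueUserWorkItem(CheckSomething, "monitoring started");

        Console.ReadLine(); //keep the foreground thread alive long enough
    }

    static void CheckSomething(object stateArg)
    {
        Console.WriteLine("Pool thread says: " + stateArg);
    }
}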


Improving .NET Application Performance and Scalability


I’m working on an expenditure system at the moment and, as we’re adding more and more code, my concern for application performance rises. When it becomes noticeable that operations in your application are slowing down, it’s time to look at fine tuning a few things. Over the weekend I went ‘googling’ for some resources, tips and best practices for optimizing application performance and came across this old, but in a broad sense still relevant, document published by Microsoft in 2004 called ‘Improving .NET Application Performance and Scalability’.

I’m still to read the whole PDF, but from flicking through it, it does appear to cover topics that are still applicable in applications now, although it does refer specifically to .NET 2.0 and back to .NET 1.1. Hopefully this will benefit you in some way if you are also on the hunt to streamline your applications for better operation and scalability.

The document can be downloaded from Microsoft’s Download Centre.

Microsoft WebsiteSpark : Free Full Featured Development Tools


I’ve recently started my own virtual infrastructure by renting some dedicated servers in the UK. One of those servers will run my soon to be re-developed ProcessWorks website, which will expand on the articles I write via this blog, but also include training material, downloads, one or two useful public web services and the details of a secret product still in development.  But enough of the advertisement.

The other machine is my development server, which now hosts a full-on Microsoft development environment. As Microsoft is clearly a commercial entity, and a proportion of non-Microsofters tend to moan about the fact that you have to actually pay for enterprise software, this might surprise you: I managed to kit out my new virtual development server for the high price tag of… FREE… and with the following specification:

  • .NET 4 Development Framework
  • SQL Server 2008 R2 Web (includes database engine, reporting services, notification services etc) – Much more than Express edition.
  • Visual Studio Professional 2010
  • Sharepoint Foundation 2010
  • Microsoft Expression Studio 4 Premium (includes Expression Web, Encoder, Design and Blend)
  • BizTalk Server 2010 Developer Edition

In my case I already had Windows 2008 R2 Standard installed with the Web Server and Application Server roles turned on (but I’ll get to how you can download Windows Server 2008 R2 for free also).

This was all made possible via Microsoft’s continuing support of start-up businesses and developers. BizTalk 2010 is available as a free, full featured download for developers through this link. It is an amazing integration and process server, and if you wish to learn BizTalk this is what you need. The installer takes care of downloading the pre-requisites, or allows you to load a pre-req .cab file, and away you go. Plenty of training material for BizTalk is available from Microsoft. The rest of the software listed above was obtained via one of Microsoft’s ‘Spark’ programs.

  • BizSpark is aimed at new small businesses and offers Microsoft software for free via their MSDN download portal. This is clearly a move to seed more expensive Microsoft infrastructures as companies expand, but it’s free and you can always decide to get smart later and replace it with free technology.
  • DreamSpark is aimed at students, giving them access to software they can use to aid study.  Provided you can prove you are a student, you get access to software such as Visual Studio 2010 Ultimate, Expression Studio 4 Premium, XBox SDK, SQL Server 2008 R2 and operating systems such as Windows Server 2012 / 2008 R2. A pretty neat deal.
  • WebsiteSpark is aimed at small web development companies, like ProcessWorks. I have created a couple of ASP.NET web sites for clients and the software that has been made available has been so very useful. You simply sign in with your MSN/Hotmail credentials, provide the name and address of your company, and then you are registered and have access to the Microsoft partner portal / MSDN downloads. You are granted several licence keys for much the same product set as is given to students via DreamSpark. You are also given access to a free set of ASP.NET UI controls from a third party company, and a 1400 dollar voucher for using Microsoft’s Azure cloud service to deploy your applications (this is not a pre-requisite of signing up, however).

So, in my case, I have a small company and went with WebsiteSpark (for the choice of software I wanted). Unfortunately these support programs are not open to everyone; however, if you are a student, a small start-up or a one man ltd company, you can get access to what would normally be very expensive software, for free.

IIS : A Recent History


I’ve been setting up a new Windows 2008 R2 server today and configuring the Web Server role. I had an unanswered question or two, so went about ‘googling’ for some clarification. Forgetting that I’d set my local Evernote content to be shown as part of Google searches, I found some of the answers in some reasonably old notes I’d made when Win Server 2008 R2 was first released. With these notes on screen, and given that it’s Friday night and I’m thinking about beer more than posting a new article, I thought I’d take some of the usable content (minus my drivel and spelling errors) from the notes and post them up here. A little random, but hopefully of some use in illustrating the evolution of the most recent editions of Internet Information Services.

IIS Feature History (Recent – since 6.0)

IIS 6.0 – Included with Win Server 2003 / Windows XP Pro

  • Introduced application pools. These are boundaries that exist to separate sets of applications / sites from each other. They have their own security context.
  • Introduced worker processes (w3wp.exe), of which there can be many associated with an application pool. A w3wp.exe is created when traffic is received and is not resident all of the time.
  • Introduced the HTTP.sys as the protocol listener for HTTP/S, which listens for HTTP requests on the network and hands them to the application pool queues
  • Removed the WinSock (Windows Sockets API) component, which was previously used to listen for and transfer HTTP requests
  • Security Account – IUSR_NameOfServer / Group – IIS_WPG
  • WWWService managed the application pools and worker processes
  • Used the ‘metabase’ for server / site level configuration

IIS 7.0 – Included with Win Server 2008 / Windows Vista

  • Complete re-write
  • New modular design to reduce attack surface (feature modules must be turned on before use)
  • Hierarchical configuration
  • Greater .NET support
  • Security Account – IUSRS / Group – IIS_IUSRS (no server name used now so easier to migrate)
  • IIS_IUSRS group has access to wwwroot by default, meaning that access is open to anonymous users accessing wwwroot. In order to restrict access to a certain folder of the web service, you must remove NTFS permissions from the IIS_IUSRS group.
  • You can create your own protocol listeners in WCF (which listen out for certain protocols)
  • WAS (Windows process activation service… named as it activates/creates windows processes) now takes care of managing application pools and worker processes (WWW Service is now used to manage performance tokens). A protocol listener will pickup a request and ask WAS to determine whether an application pool and worker process is available. If there is no worker process available in the application pool, WAS will start (activate) a new one.
  • Introduced the applicationHost.config XML configuration file in place of the ‘metabase’ (similar idea to having a machine.config for .NET applications). It contains the configuration/definitions of all sites, applications, application pools and global settings. It also contains the location of custom ‘modules’ written in .NET that you can implement in IIS and the native modules that ship with IIS. The config file is found in %winDir%\System32\inetsrv\config

IIS 7.5 – Included with Win Server 2008 R2 / Windows 7

  • Powershell support added
  • Improved FTP and WebDAV modules
  • New management tools
  • Configuration file logging. Enables auditing of access to config files.
  • Hostable web core. This means the core web components can be hosted by other applications, meaning applications can accept and process HTTP requests.

IIS HTTP request handling

Request processing follows a similar model in IIS6/7/7.5. The below shows the processing model for HTTP requests. If another protocol was being used, the listener would be different but the processing would be the same.

  • When a client browser initiates an HTTP request for a resource on the Web server, HTTP.sys intercepts the request.
  • HTTP.sys contacts WAS to obtain information from the configuration store.
  • WAS requests configuration information from the configuration store, applicationHost.config.
  • The WWW Service receives configuration information, such as application pool and site configuration.
  • The WWW Service uses the configuration information to configure HTTP.sys.
  • WAS starts a worker process (w3wp.exe) for the application pool to which the request was made if one is not already available.
  • The worker process processes the request by running through an ordered list of events that call different native and custom ‘managed’ modules (custom .NET assemblies designed to process web traffic)
  • The worker process executes the server side logic in the context of the user identity configured in the application pool and then returns a response to HTTP.sys.
  • The client receives a response.

IIS Server Modules

Unlike IIS 6.0, IIS 7.0 introduced a core web server engine (below in blue) that can have modules (functionality) added or removed from it. These modules are used by the core web engine to process requests. You can add or remove the native modules or create your own custom modules. This module based approach is more secure than IIS 6.0 because it reduces the attack surface and memory consumption footprint by letting you choose which modules to activate. It also makes the web server extensible in the form of custom managed modules (.dlls). The module types are:

  • Native modules – These ship with IIS and can be found in the %winDir%\System32\inetsrv folder of the server. e.g. Cachhttp.dll is the HTTP cache module.
  • Managed modules – These are .NET based modules that come with the .NET framework and plug into the engine. e.g. System.Web.Security.UrlAuthorizationModule. You can create your own custom managed modules using the .NET Framework SDK (see the sketch below).
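
A minimal sketch of a custom managed module, assuming the classic System.Web pipeline (the module class name and response header are made up for the example):

using System;
using System.Web;

public class RequestLoggingModule : IHttpModule
{
    public void Init(HttpApplication context)
    {
        //Hook one of the ordered pipeline events
        context.BeginRequest += (sender, e) =>
        {
            var app = (HttpApplication)sender;
            //Stamp each response so we can see the module ran
            app.Context.Response.AppendHeader("X-Request-Logged", DateTime.UtcNow.ToString("o"));
        };
    }

    public void Dispose() { }
}

Once compiled, a module like this is registered in applicationHost.config (server wide) or web.config (per site) so that the worker process invokes it for each request.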

The below image shows the ordering of events that a worker process carries out to process a request. It shows the modules that are invoked by the worker process. First native modules are called, then CLR hosted ‘managed’ modules in the form of .NET assemblies installed on the server and registered in the applicationHost.config file.

Entity Framework : Unable to load the specified metadata resource.


So I’ve been working on a code base for a financial system today and have done a good bit of code re-factoring involving the re-organization of classes into and out of my namespace structure. After deploying my code to the test server, I ran into a problem with a few of my domain model objects that load their state from the database using an Entity Framework model (edmx). The problem, as per the title of this post, threw me a little as I hadn’t changed the properties of my edmx, nor had I changed the connection string.

I should add at this point that I dynamically pass my entity connection string to my entity model constructor at runtime so that I have more control. In this instance, the code I’m using to build that connection string had not changed either, to my knowledge. The model names had remained the same. After a few minutes of head scratching, it became clear that my connection string had become invalid because I had renamed the namespace that housed my entity models, and I had to update the code that took care of building the entity connection string.

Here’s my code. The namespace my models used to be in was called ‘DataModels’. Then as I added some more DAL classes to that namespace, I decided to call it just ‘Data’. That needed updating:

[Image: the code that builds the entity connection string]

Hopefully my moment of ‘numptiness’ can benefit you if this exception is thrown in your code after a re-factoring exercise. It’s very likely a change in the connection string / connection string’s metadata. The 3 resource addresses in the connection’s metadata relate to the conceptual model (csdl), storage model (ssdl) and mappings (msl), although they are not terribly clear in my image above, so here’s my model’s metadata in full:

res://*/Data.MetastormModel.csdl|res://*/Data.MetastormModel.ssdl|res://*/Data.MetastormModel.msl
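
As a rough sketch of the kind of connection string building code involved (assuming the EntityConnectionStringBuilder approach; the provider string and method name are illustrative):

using System.Data.EntityClient;

public static string BuildEntityConnectionString(string sqlConnectionString)
{
    var builder = new EntityConnectionStringBuilder
    {
        Provider = "System.Data.SqlClient",
        ProviderConnectionString = sqlConnectionString,
        //These metadata resource paths had to change from 'DataModels.MetastormModel'
        //to 'Data.MetastormModel' after the namespace rename:
        Metadata = "res://*/Data.MetastormModel.csdl|" +
                   "res://*/Data.MetastormModel.ssdl|" +
                   "res://*/Data.MetastormModel.msl"
    };
    return builder.ToString();
}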

Entity Framework : A short introduction


Introduction

The Entity Framework is an object relational mapping tool that is well integrated into the .NET development environment. It allows for the conceptual modelling of a physical data store that can be used with your applications. Data sources are represented as objects, so it’s much easier to incorporate data entities into your logic.

The problem with older data access technologies is that they did nothing to bridge the gap between the relational, set based format of the data and the strongly typed, object oriented logic that needs to access that data. Supporting large data reads, for example, required considerable code written into the logic of the application to load the data in and persist it back to the data store; this affected performance, caused issues with the data type differences and generally contributed to poor design. The Entity Framework is a bridge between the data tier and the application logic.

Entities are represented as partial classes; this is to allow you to extend an entity by creating another partial class with the same class name, as in the sketch below.
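
For example, a minimal sketch (using the Order entity that appears later in this article; the extra property is made up for the illustration):

//Generated by the EF designer (simplified):
public partial class Order
{
    public string EFOLDERID { get; set; }
    public string txtDocGenResult { get; set; }
}

//Our own extension, typically in a separate file, with the same name and namespace:
public partial class Order
{
    public bool HasDocGenError
    {
        get { return txtDocGenResult != null && txtDocGenResult.StartsWith("E"); }
    }
}
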
Differences between object orientated logic and relational data stores

Data Type Differences

– Nvarchar(20) has a limited size, but a .NET string allows up to 2GB of data
– sql_variant only loosely maps to a .NET object
– Binary data in the database doesn’t quite map to a byte array in .NET
– Date and time formats can become an issue when mapping between the DB and .NET

Relationship Differences

– The database uses primary / foreign key relationships. These relationships are stored in the database’s system tables.
– .NET uses object references (linking to a parent object from the child objects (in a child object collection))

Inheritance Differences

– .NET supports single object inheritance (have a base object with shared features and then derive from that base object when creating similar objects)
– Makes code simpler and easier to debug
– Relational databases do not support inheritance; tables cannot be inherited by other tables
– Modelling a database that attempts to emulate some kind of table hierarchy (inheritance) can cause issues and introduce very complex entity models in .NET

Identity / Equality Differences

– In the database, the primary key constraint (unique column) is the identifier of a row object
– When comparing objects in .NET that have had the same data loaded from the same table, those objects are NOT equal, even though they hold the same data. An object is only EQUAL to another when both variables REFERENCE the same object (equality by reference), so loading the same row from the database into two objects will still not appear equal in the logic, as the sketch below shows.
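
A minimal sketch (re-using the illustrative Order entity from earlier):

Order a = new Order { EFOLDERID = "1001" };
Order b = new Order { EFOLDERID = "1001" }; //same data, different object

Console.WriteLine(a == b);                       //False - different references
Console.WriteLine(a.Equals(b));                  //False - default Equals compares references
Console.WriteLine(object.ReferenceEquals(a, a)); //True - same reference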

Handling these differences : Object Relational Mapping

You can write a lot of complex code to try and iron out these object/relational differences when working with data from your logic… or you can use a framework, such as the Entity Framework, to act as an intermediate layer that focuses on resolving the issues highlighted. This intermediary technique is called Object Relational Mapping: the relational database object is MAPPED to an OOP object, and that mapping resolves the differences highlighted. It is a back and forth mapping, meaning your object based changes can be written back to the database.

Microsoft’s answer to handle these differences

Microsoft’s primary ORM tool, now supported within the framework, is the Entity Framework. This is Microsoft’s primary data access strategy, first released with .NET Framework 3.5 SP1. Microsoft are investing a lot of cash into the development of the Entity Framework, so it is the recommended data access method.
LINQ to SQL is a similar technology developed by Microsoft that also benefits from the LINQ syntax, but Microsoft decided to take the Entity Framework forward and leave LINQ to SQL as is, with no further development planned.

Benefits of using ORM

– More productive. Less complex code writing is required to manage the differences between objects and relational data
– Better application design. Adding complex code can complicate the design; introducing a mapping layer maintains the n-tier application stack
– Code can be re-used (the model can be saved off as its own project that is used by many applications or parts of the same application)
– Much more maintainable than custom code
– Lots of different ORM tools exist (NHibernate is an older ORM tool that can be used with .NET, based on a port of the Java Hibernate ORM)

Note that a small performance hit is incurred executing the mappings, but the other benefits of using ORM far outweigh the cons.

Generic Data Objects vs. Entity Objects

ADO.NET has long used generic data objects such as SqlConnection, SqlCommand, SqlDataAdapter, SqlDataReader and DataSet. Together these objects provide a means of getting data from a data store and writing data changes back to that data store. The DataAdapter object monitors the DataTable for changes and writes those changes back. Whilst these objects do a great job, there are a few issues with using generic data objects like these:

Generic Data Objects

– Tight coupling between application and database, because the app code needs to know the structure of the data in the database. Changes to the database can cause problems, and code updates may be needed.
– Older DataTables contain columns with loose typing, and you have to convert the column value to a strong type using the Convert class (although typed data sets resolve this to an extent)
– Extracting related data (data tables in a data set), whilst possible, introduces some complexity.
– Unnecessary looping of in memory data is needed to find values (unless LINQ to DataSets is used)
– Generally more code needs to be written to extract bits of data at different times from DataSets.

Entity Objects

– The ORM takes care of the strong typing from relational DB type to object type. This means you don’t have to be concerned with running expensive conversions on the returned data.
– The compiler checks the types at compile time
– More productive to work with, as you have to write less code in order to work with the objects. IntelliSense provides a much quicker way of accessing the columns in your data objects when writing LINQ, for example.
– You are programming against a model, rather than custom code specific to the application
– You’re no longer working directly with the data store in your application logic, just normal objects, therefore decoupling the two.
– It is a layer of data abstraction.

Entity Framework Benefits

So what are the benefits of using the Entity Framework? Plenty:

1) Its APIs are now fully integrated into Visual Studio, so they’re available to add to your projects
2) It uses LINQ as its query language, which again is integrated into the language
3) It’s independent of the data store
4) It abstracts the data layer from the logic layer

Note, you might not need to use the Entity Framework for more basic data requirements. Simple reads of data from a few simple tables will not require you to set up entity models. The Entity Framework is useful when you are working with more than a few tables within an application.

Entity Data Model

The entity data model is the structure of the business data, reshaped into objects usable by the programming language. It describes the structure of the entity data objects that make up the model and also the relationships between those entity objects.

It is made up of 3 main chunks of XML in a single data model file, with the extension .edmx:

– Conceptual model (the object orientated structure)
– Storage model (the physical database storage structure)
– Mapping data (the mappings between the two models above)

Creating a Model

1) Create a VS project targeted at .NET 3.5
2) Add a new folder to that project called ‘DataModels’ (good to keep your models separate from your code)
3) Add a new item to that folder, specifically an ADO.NET Entity Data Model; this will start the model wizard.

First you can select whether you want a blank model or to read from an existing database’s schema, which is what we’ll do. Next, you specify the connection string to use to connect to the data store. The wizard then uses that connection string to build an Entity Framework connection string (containing the normal ADO.NET connection string, plus some other metadata). The ‘metastorm’ connection string I’m using here is pulled from the App.Config file in my project.

At the bottom of this screen, MetastormEntities is not just the name of the connection string that will be saved to App.Config; it’s the name of the entity model, so it should reflect the name of the database you are pulling your entities from.

Next, the database specified by the connection string is read and its schema is presented to you so you are able to select the database objects you wish to model. Tables, Views or Stored Procedures (incl. Functions):

I select 3 of my local tables, MRPOPS, MRPIPS and MRQUERY and click finish. This creates my .edmx model called MetastormEntities (as specified earlier):

By right clicking each entity you can view the mapping from the physical table to the conceptual entity, including the data type conversions. You can edit the names of the local properties that map to the database column names:

You may notice that I have renamed the entities as follows:

MRPOPS is now called Order (a singular entity) with an EntitySet name of Orders
MRPIPS is now called Invoice with an EntitySet name of Invoices
MRQUERY is now called Query with an EntitySet name of Queries

This will make working with the tables easier in code and leaves the actual table names intact.

Looking into the EDMX Model

As previously mentioned, the .edmx model is actually an XML file that is rendered graphically by Visual Studio to show the conceptual model. In order to view the 3 parts of the .edmx file (physical storage, conceptual model and mappings), right click the model, select ‘Open With…’ and choose the XML Editor. This is a collapsed view of my model file that shows the 3 parts (with the edmx namespace):

The runtime section here contains the metadata about the data, including the mappings.
The designer section is used by Visual Studio in order to display the entities graphically.

Looking into the Model Designer cs

The designer.cs file contains the partial classes that represent each entity as well as the model itself:

Using Our MetastormModel.edmx

In our project, add a new class and make sure we add a reference to System.Data.Entity in that class. Ensure that the System.Data.Entity.dll has been added to your project references; it should have been added automatically, but it’s best to check. Because we have our class in the same project as our model, we do not need to copy our App.Config out into another project that might need our model.

We can now instantiate our model and start querying our tables (note the looping in the below example is unnecessary, but illustrates how to access each order in the query results):

//Create an instance of the model
MetastormEntities metastorm = new MetastormEntities();

//Query the orders entity for doc generation errors
var selectedOrders = metastorm.Orders.Where(p => p.txtDocGenResult.StartsWith("E"));

foreach (Order order in selectedOrders)
{
    Console.WriteLine("Order Folder with ID " + order.EFOLDERID + " has had a document failure!");
}


What makes a good solution architect?


I’ve worked with quite a few over the years, from many different companies in many different sectors / verticals. You find, however, that the qualities of a solution architect shine through pretty obviously. A solution architect does not need a wall full of qualifications proving they know how to ‘code in a proper way’ or ‘understand architectural patterns’, as this proves nothing to the business. When it comes to someone directing how a suite of business requirements is translated into a solid business solution, there is no substitute for experience, and experience is full of failure.

In my view, a solution architect must understand the business and the technical side of things, in a broad way. I know of one or two ‘Solution Architects’ that have just spent time looking after computer networks for 10 years, and I really feel that this is in no way a path to becoming an architect. The architect must understand the business first and foremost, have experience in more than one specific area of technology, and have expertise in at least one. For me, that would translate into experience in:

– Network Infrastructure: It’s important that an architect knows how the network hangs together, how machines communicate with each other, and how to use the most common OS command line / shell tools to work on remote servers and detect problems on the network. Network security is also important, as are topics such as certificates and encryption. Finally, an architect should be familiar with the OSI model and the communication protocols that operate at each layer of the stack.

– Server Technologies: An architect should have a relatively good understanding of core server roles and the services and features they provide to clients. DNS and DHCP, for example, are basic services that almost all servers of differing operating systems will have and need configuring. Some experience of servers such as Windows Server, if you are working in a Microsoft dominated network environment, is also a must. Understanding Active Directory, network securable objects (users/computers etc.), domain group policy and NTFS permissions is a basic requirement.

– Database Administration or Design: Almost all applications that a solution architect will work with will have a persistent data store, and that is a database the majority of the time. Therefore understanding SQL is a no-brainer; you simply cannot call yourself a solution architect if you don’t know how to SELECT from an INNER JOIN. Database servers are generally underutilized in a big way (at least that’s my experience working with SQL Server for 10 years). SQL Server, for example, comes with several major ‘components’ for reporting, analysis, notifications, data extraction and loading, and of course the core database engine. Many developers only know how to use the database feature and get stuck when it comes to the SQL Server security model. I have not met a developer yet who understands all of the database roles and what features they provide. Finally, a good solution architect knows when logic should be placed in stored procedures and when it should be placed in the business logic tier. A major bugbear of mine is large stored procedures that do more than basic data manipulation and organization. In summary, understanding what data layer support you have when architecting an application is key, so you can properly place component responsibility within the data layer.

– Software Design & Development: Yes, that means that as an architect, you should have experience not just writing code, but designing applications and seeing those designs through the entire first cycle. This means that, from a set of business requirements, you understand the technical landscape (see above) well enough to present to the business a preliminary, high level design, by taking the ‘what’ of the functional and non functional requirements and turning it into the ‘how’ in terms of high level implementation and software sub-systems. That means identification of the technology stack to use (hence the server/network/database knowledge requirements) and of the main high level areas of concern (including cross cutting concerns, such as security). The architect must also determine what operational requirements should be taken into account, what architectural patterns will be used in the solution and, finally, how the solution will honour a set of quality attributes (maintainable, secure, extensible, scalable etc.). Once the high level specification is signed off, the architect should then be able to take the high level design and move down to the implementation detail by decomposing an existing system or set of requirements into individual software components and creating a domain model. These components should then (ideally) conform to the 3 P’s of software design (Principles, Practices and Patterns) in their design and interaction. In summary, whilst it’s important to know your code syntax and the common objects available in the libraries that come with development frameworks, you must be familiar with the principles and methods of designing good components that interact well.

So to re-iterate: these are the qualities and skills that I personally believe all the good solution architects I have met share. I am certainly no barometer for assessing the role, but I have worked with enough competent architects to know what skills are important, and almost all of them do not hold a lot of qualifications in the technical subjects.

Experience and a strategic/practical/pragmatic skill set are far more important in reality.

To conceptualize, to abstract reality, to direct a solution’s development, to look at every problem with the architecture in mind, to provide the why, what and how in any task arising from the development project, and to support the developers: these are the skills an architect should hold.

My personal favourite quality observed, however, is the modesty of some architects. When you sit in a meeting with them, you realize the depth of their experience and that they are truly the guru of the business domain; they have a constant want for knowledge and that shows in their modest approach. They clearly know more than they show and that, to me, is one of the greatest qualities an architect can have.

As always, I welcome and appreciate reader opinions on the subject.

Transactional Isolation


Transactional isolation is usually implemented by locking whatever resource is being accessed (a DB table, for example) during a transaction, thus isolating the resource from other transactions. There are two different types of transactional locking: pessimistic locking and optimistic locking.

Pessimistic Locking: With pessimistic locking, a resource being accessed is essentially locked from the time it is first accessed in a transaction until the transaction completes, making the resource unusable by any other transactions during that time. If competing transactions simply need to read the resource (a SELECT of a data row, for example) then an exclusive lock may be overkill. The lock exists until the transaction has either been committed or rolled back, at which point the resource is made available again for other transactions.

Optimistic Locking: Optimistic locking works a little differently. A resource being accessed isn’t locked when first used, but the state of the resource is noted. This allows other transactions to access the resource concurrently, with the possibility of conflicting changes. At commit time, when the resource is about to be updated in persistent storage, the state of the resource is read from storage again and compared to the state previously noted when the resource was first accessed. If the two states differ, a conflicting update was made, and the transaction will roll back.
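
A minimal sketch of the optimistic approach using ADO.NET, against an illustrative Orders table with a RowVersion column (the table, columns and method name are assumptions for the example):

using System.Data.SqlClient;

public static bool TryUpdateOrderStatus(SqlConnection conn, int orderId,
                                        string newStatus, byte[] originalVersion)
{
    //The UPDATE only succeeds if the row still holds the version value we
    //noted when we first read it; otherwise another transaction got there first
    using (var cmd = new SqlCommand(
        @"UPDATE Orders SET Status = @status
          WHERE OrderId = @id AND RowVersion = @originalVersion", conn))
    {
        cmd.Parameters.AddWithValue("@status", newStatus);
        cmd.Parameters.AddWithValue("@id", orderId);
        cmd.Parameters.AddWithValue("@originalVersion", originalVersion);

        //0 rows affected means a conflicting update was made, so the caller
        //should roll back (or re-read and retry)
        return cmd.ExecuteNonQuery() == 1;
    }
}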

Good Design Documentation


Ok, that’s a relatively general title. What I mean specifically, and what I want to talk about here, is the importance of a good enough technical design specification and how the quality of said document really does impact the resulting solution. I want to share some recent experiences where documentation really did let the side down.

I’ve not long finished a 6 month placement at a company working on Cordys C3 based bug fixes and change requests. It was challenging in 3 respects.

Firstly, the work was specific to business processes and XForms based user interfaces, neither of which were organized within Cordys in a very intuitive, structured manner that provided any contextual reference as to how the business solutions worked together. Second, the new change requests being received really required a good knowledge of the existing 300+ processes and forms that made up the applications that ran on Cordys. This was because the design documentation was less than adequate, and in six months you really can’t get a detailed enough grasp of such a large number of processes, especially non-standard business processes in the world of media management and digital content distribution. Finally, the loaded web service interfaces (and so the services themselves) included operation parameters such as arg0, arg1 and arg2, which, as you’ll no doubt have evaluated, is unbelievably unhelpful in determining what a service operation actually does and what data should be provided in the messages.

OK, these issues aside for a moment, the real issue I want to discuss is how design documentation should go some way towards explaining which components, services and forms should be used or designed for the given solution, what dependencies are outstanding at the time of writing, and what risks may be involved. I worked on a couple of CRs and the documentation was poor. This was through no fault of the architect, however, who knew the systems inside out and was stretched so thinly that their desk was only visited once every 2 days, when meetings were cancelled. Poor and constantly changing requirements were no help either.

In order for developers to attach time estimates to tasks detailed in a design specification document, there must be enough detail that the estimate confidence level of such tasks can be as high as possible, and thus the business is kept more realistically informed of how long the solution might take. The problems I found with the documentation in this instance were as follows:

– Very brief document objective
– Zero mention of any outstanding dependencies at the time of writing (i.e. are the services required by this functionality written yet? Some were not)
– ‘To Be’ process illustrations made up the bulk of the document, with very little supporting written description
– Message interface definitions were non-existent
– Not all of the requirements were met in the design document
– No exception handling or compensatory actions were detailed in the event of errors
– No architectural overview was presented, and minimal system interactions (i.e. sequence diagrams) were present

In short, the design specification put the responsibility on the developer to fill in the gaps in the detail. Whilst this frees up the architect’s time, this really is no good for a few reasons:

1) Not all of the design is documented, and therefore it cannot be referenced in the future if the client questions functionality or attempts to request amendments outside of a new CR
2) New developers to the team (myself in this instance) are left with gaps in their knowledge that require time consuming investigation with potentially multiple teams (and as such consume estimated time while no development is done)
3) Gaps in design specifics can lead to incorrect assumptions about how the solution should operate
4) Inadequate design detail leaves room for (mis)interpretation of the design, which can mean solutions move away from company design standards and architectural rules. This leaves a messy set of solutions that operate differently, don’t really utilize re-use and only further confuse developers.

In this case, clearly the company may not have the resource to focus more on detailed documentation, or maybe they believe it’s just not as important as I do. The bottom line, however, is that if you are going to develop a solution that’s more complex than ‘Hello World’, you should really think about documenting the following (and I apologize in advance if you are already great at your design specifications):

– Start with a document summary. This should include author, distribution list, document approvers and release history.
– Basic I know, but include a ‘contents’ section that logically breaks up the design into layers (data access, service, process, UI).
– Provide a detailed overview of the solution. Detailed being the key word here; copying chunks of text from the functional specification is not cheating. The overview should include how the solution will improve any existing solutions (i.e. improve stability, boost usability, provide a greater level of system manageability).
– If necessary, provide details of the runtime environment and any anticipated configuration changes.
– Make reference to any other materials, such as the functional specification and use case documents.
– Include design considerations (company or otherwise).
– Detail any known risks, issues or constraints.
– Detail any dependencies at the time of writing. This should include any work being performed by other teams that the solution being detailed requires in order to operate successfully.
– Provide a top level architectural diagram, even if the solution is basic. Diving into the detail without giving the developers a 1000 foot view of where the solution fits into the wider solutions architecture is, to me, just wrong. Support the diagram with a sentence or two.
– List the components that will change as part of the design.
– List any new components.
– Diagram component interactions (sequence diagrams).
– ‘To be’ process designs should include annotation, even if it’s assumed current knowledgeable developers would know the information. You will not always have the same people doing the work.
– For UIs, detail the style and appearance, ensuring it’s in line with company accessibility policies. That may require detailing fonts and hexadecimal colour references [#FFFFFF].
– Detail what programming may be required, server side and client side: what functionality should this code provide, and what coding best practices should be honoured.
– Keep a glossary of terms at the back of the document.

Finally, and most importantly, even if you are the architect and the master of all knowledge when it comes to your solution domain… distribute the document and enable change tracking. Send it to the subject experts for clarification and to the business stakeholders, even if they don’t understand the content.

Most of us do and that’s great.  Design specification templates can be found online so there’s really no excuse.

SOA. Just the basic facts. In 5 minutes.


What a SOA is not

SOA does not mean using web services.
SOA is not just a marketing term.
SOA also does not just mean ‘using distributed services’.

What a SOA is

SOA is an architecture of business services (usually distributed and sometimes ‘connected’ by means of a service bus) that operate independently of each other, advertise what they offer through well-defined interfaces, and can be heavily re-used, not only to aid the development productivity of the IT department but also to make use of existing IT assets/systems. Ultimately, this means quicker turnaround of IT projects, improving how IT serves the business and thus improving business agility.

‘Service’ orientation means that logic is provided via a suite of services which should be seen as ‘black boxes’. They do stuff, but you and the services consuming them don’t need to know what’s going on under the hood, only what messages they require as input (usually SOAP messages) and what the service will do for / return to you. A black box service doesn’t have to be a web service, though web services are the most commonly implemented type of service, for maximum distribution and cross-platform compatibility.

So whilst that goes some way in explaining what SOA is on a general level using these developer written ‘services’… What SOA really is, is A FUNDAMENTAL CHANGE IN THE WAY YOU DO BUSINESS via a top down transformation requiring real commitment from the business, not just IT. That requires a change in mind-set of the top people.

Characteristics of a black box ‘service’ in a SOA

– Loosely coupled (minimizes service dependency)
– Contractual (adherence to well-defined service interface contracts… ‘if you wanna do business you need to abide by my interface contract’)
– Abstract (service is a black box, internal logic is hidden from service consumers)
– Reusable (divide and conquer! – We divide up business logic to basic reusable services)
– Composable (can be used as a building block to build further composite services)
– Stateless (retains little to no information about who it interacts with)
– Discoverable (describes itself, a bit like a CV so that the service can be found and assessed ‘hello I’m a service, here is my CV’)
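
In .NET terms, such a contractual, abstract service might be sketched with WCF (a minimal illustration; the contract and type names are made up for the example):

using System.Runtime.Serialization;
using System.ServiceModel;

[ServiceContract]
public interface IOrderEnquiryService
{
    //Consumers see only this contract, never the internal logic
    [OperationContract]
    OrderStatus GetOrderStatus(string orderId);
}

[DataContract]
public class OrderStatus
{
    [DataMember] public string OrderId { get; set; }
    [DataMember] public string Status { get; set; }
}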

What can these services do?

Whatever you need them to do in order to satisfy business change needs / requirements. Common functions of services include:

– Perform business logic
– Transform data
– Route messages
– Query data sources
– Apply business policy
– Handle business exceptions
– Prepare information for use by a user interface
– Generate reports
– Orchestrate conversations between other services

The business benefits of implementing a SOA strategy

– Open standards based
– Vendor neutral
– Promotes discovery
– Fosters re-usability
– Emphasizes extensibility
– Promotes organizational agility
– Supports incremental implementation (bit by bit development)

What a SOA might look like

The below shows a business application based on a SOA.

The lowest level of operation consists of application logic, including existing APIs, DAL code and legacy systems. This may include ‘application connectors’, which are middle men that interface between a simple exposed API and large systems like ERP, MRP etc.
This low-level application logic is then exposed as basic level services (application orientated services, as they are a wrapper for parts of the application logic).
These basic level services form the building blocks of composite level services. Application aligned services are combined to form services that are more aligned with the business, and thus are more business orientated services. This can include exposing a business process as an independent business service.
Basic (application orientated) and composite (more business orientated) services can then be orchestrated by business processes.
These business processes may include human interaction points where user interfaces are required. Processes can also be initiated via user interfaces (requests / orders / applications etc).

[Image: SOA layer diagram]

Steps in Implementing a SOA with web services

1) Creating and exposing services (development team creating component services)
2) Registration of services (SOA isn’t truly in place when you just have random web services sitting on different web servers exposing WSDL, where services are consumed based on word of mouth and passed-around WSDL documents. A SOA requires a directory where all available services can be registered, UDDI 3.0 being the standard when using web services. This directory is the yellow pages for the services).
3) Address security. Exposing business logic as services over large networks opens up a serious set of security challenges. Security standards must be implemented for the services so that consumers of services can meet the security requirements.
4) Ensure reliability. Services must be monitored to make sure they are always up (high availability) and performance must be monitored to ensure reliability.
5) Concentrate on governance. How are all of these steps governed? What are the policies that should be enforced at runtime? Compliance is also important.

SERVICES THAT ARE EXPOSED, REGISTERED, SECURE AND PERFORM WELL FORM A SOLID SOA FOUNDATION.

That’s all for now. Hopefully that paints a very top-level picture of what SOA is, what it is not and how you should go about implementing it (with the all important business buy in).