BPM Technology Platforms

.NET : Processes, Threads and Multi-threading


So I’ve been digging around in my Evernote this evening for something to publish that might be of actual use. My notes tend to be lots of small titbits that I tag into Evernote whilst I’m working on projects. They’re great on their own as golden nuggets of information that I’ll always have access to, but they are rarely much use as a blog article.

I did come across some notes from a couple of years ago around threading, specifically threading in the .NET framework. As I’ve little time this week to dedicate to writing some new content, I thought I’d cheat a bit and upload these notes, which should go some way towards introducing you to threading and processes on the Windows / .NET platform. Apologies in advance for spelling errors as well as the poor formatting of this article. I’ve literally not got the time to make it look any prettier.

So, what is a windows process?

An application is made up of data and instructions. A process is an instance of a running application. It has its own ‘process address space’. So a process is a boundary for threads and data.
A .NET managed process can have a subdivision, called an AppDomain.

Right, so what is a thread?

A thread is an independent path of execution of instructions within a process.
For unmanaged threads (that is, non .NET), threads can access any part of the process address space.
For managed threads (.NET CLR managed), threads only have access to the AppDomain within the process, not the entire process. This is more secure.

A program with multiple simultaneous paths of execution (concurrent threads) is said to be ‘multi-threaded’. Imagine some string (a thread of string) that goes in and out of methods (needle heads) from a single starting point (main). That string can break off (split) into another piece of string that goes in and out of other methods (needle heads) at the very same time as its sibling thread.

When a process has no more active threads (i.e. they’re all in a dead state because all the instructions within each thread have already been processed by the CPU), the process exits (Windows ends it).

So think about what happens when you manually end a process via Task Manager: you are forcing the currently executing threads, and those scheduled to execute, into a ‘dead’ state. The process is killed because it has no more code instructions to execute.
Once a thread is started, its IsAlive property returns true. Checking this will confirm whether a thread is active or not.
Each thread created by the CLR is assigned its own memory ‘stack’ so that local variables are kept separate.

Thread Scheduling

A CPU core, although it appears to process a billion things at the same time, can only execute the instructions of one thread at a time. Which thread gets the CPU next is determined by thread priority. If a thread has a high priority, the CPU will execute the instructions inside that thread, in sequential order, before those of any thread with a lower priority. This requires that thread execution is scheduled according to priority. If threads have the same priority, an equal amount of time is dedicated to each (through time slicing, usually around 20ms for each thread). This might leave low priority threads out in the cold when the CPU is being highly utilized, so to avoid never executing low priority threads, Windows specifically dedicates a slice of time to processing their instructions, although that time is a lot less than is given to the higher priority threads. The .NET CLR actually lets the Windows operating system thread scheduler take care of managing all the time slicing for threads.

Windows uses pre-emptive scheduling. All that means is that once a thread is running on the CPU, Windows can interrupt (unschedule) it if need be, for example to run a higher priority thread.
Other operating systems may use non pre-emptive scheduling, meaning the OS cannot unschedule a running thread until that thread has finished or gives up the CPU itself.

Thread States

Multi-threading, in programming terms, is the co-ordination of multiple threads within the same application. It is the management of those threads as they move between different thread states.

A thread can be in…
  • Ready state – The thread tells the OS that it is ready to be scheduled. Even if a thread is resumed, it must go to ‘Ready’ state to tell the OS that it is ready to be put in the queue for the CPU.
  • Running state – Is currently using the CPU to execute its instructions.
  • Dead state – The CPU has completed the execution of instructions within the thread.
  • Sleep state – The thread goes to sleep for a period of time. On waking, it is put in Ready state so it can be scheduled for continued execution.
  • Suspended state – The thread has stopped. It can suspend itself or be suspended by another thread, but it cannot resume itself. A thread can be suspended indefinitely.
  • Blocked state – The thread is held up by the execution of another thread within the same memory space. Once the blocking thread goes into dead state (completes), the blocked thread will resume.
  • Waiting state – A thread will release its resources and wait to be moved into a ready state.
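
The CLR exposes a thread’s current state through the Thread.ThreadState property. The names don’t line up exactly with my list above: as far as I can tell, a sleeping, blocked or waiting thread all report WaitSleepJoin, and a ‘dead’ thread reports Stopped. Here is a minimal sketch of watching one thread move through a few states. The class and method names (ThreadStateDemo, ObserveStates) are just made up for illustration.

```csharp
using System;
using System.Threading;

public static class ThreadStateDemo
{
    // Returns the state seen before the thread starts, while it is
    // waiting on an event, and after it has finished.
    public static ThreadState[] ObserveStates()
    {
        var gate = new ManualResetEvent(false);
        var worker = new Thread(() => gate.WaitOne()); // waits until the gate is opened

        ThreadState beforeStart = worker.ThreadState;

        worker.Start();
        // Poll until the worker has actually reached its wait.
        while ((worker.ThreadState & ThreadState.WaitSleepJoin) == 0)
        {
            Thread.Sleep(1);
        }
        ThreadState whileWaiting = worker.ThreadState;

        gate.Set();    // let the worker carry on and finish
        worker.Join(); // block this thread until the worker has ended
        ThreadState afterEnd = worker.ThreadState;

        return new[] { beforeStart, whileWaiting, afterEnd };
    }

    public static void Main()
    {
        foreach (ThreadState state in ObserveStates())
        {
            Console.WriteLine(state);
        }
    }
}
```

Note that ThreadState is a flags value, which is why the loop tests for the WaitSleepJoin bit rather than comparing for equality.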

Why use multiple threads?

Using multiple threads allows your applications to remain responsive to the end user whilst doing background work. For example, you may have a Windows application that requires that the user can continue working in the UI whilst an I/O operation is performed in the background (loading data from a network connection into the process address space, for example). Using multi-threading also gives you control over which parts of your application (which threads) get priority CPU processing. Keeping the user happy whilst performing non critical operations on background threads can make or break an application. These less critical, low priority threads are usually called ‘Background’ or ‘Worker’ threads.

Example

If you create a simple Windows form and drag another window over it, your form will repaint itself to the Windows UI.

If you create a button on that form which, when clicked, puts the thread to sleep for 5 seconds (5000ms), the window you are dragging over your form will stay visible on the form, even after you have dragged it off your form. The reason is that the form’s only thread is being held up by the 5 second sleep, so the form has to wait until the thread resumes before it can repaint itself to the screen.

Implementing multi-threading, i.e. moving the slow work (the sleep) onto a background thread and leaving the first thread free to repaint the window, would keep users happy.
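
I don’t have the original form code to hand, so here is a rough console stand-in for the same idea. The slow work (a sleep) runs on a second thread, and the first thread counts how many times it was free to do something else in the meantime, which in the form example would be repainting. ResponsiveDemo and CountTicksWhileWorking are names I’ve invented for the sketch.

```csharp
using System;
using System.Threading;

public static class ResponsiveDemo
{
    // Runs the slow work on a background thread and returns how many
    // times the calling thread got to run while that work was going on.
    public static int CountTicksWhileWorking(int workMilliseconds)
    {
        var worker = new Thread(() => Thread.Sleep(workMilliseconds)); // stand-in for slow work
        worker.IsBackground = true;
        worker.Start();

        int ticks = 0;
        while (worker.IsAlive)
        {
            ticks++;          // the calling thread is still free to respond here
            Thread.Sleep(10); // pretend to handle a UI message
        }
        return ticks;
    }

    public static void Main()
    {
        int ticks = CountTicksWhileWorking(200);
        Console.WriteLine("Main thread stayed responsive for " + ticks + " ticks");
    }
}
```

Had the sleep been on the calling thread instead, the tick count would never have moved until the sleep was over, which is exactly the frozen form described above.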

Multi-threading on single core / single processor 

Making your app multi-threaded can affect performance on machines that have a single CPU.  The reason for this is that the more threads you use, the more time slicing the CPU has to perform to ensure all threads get equal time being processed.  The overhead involved in the scheduler switching between multiple threads to allow for processing time slices extends the processing time. There is additional ‘scheduler admin’ involved.

If you have a multi-core system however, let’s say 4 CPU cores, this becomes less of a problem because threads can be processed physically at the same time across the CPU cores. As long as there are no more busy threads than cores, very little switching between threads is involved.

Using multiple threads makes code a little harder to read and testing / debugging becomes more difficult because threads could be running at the same time, so they’re hard to monitor.

CLR Threads

A thread in the CLR is represented as a System.Threading.Thread object.  When you start an application, the entry point of the application is the start of a single thread that all new applications will have. To start running code on new threads, you must create a new instance of the System.Threading.Thread object, passing in the address of the method that should be the first point of code execution. This in turn tells the CLR to create a new thread within the process space.

To access the properties of the currently executing thread, you can use the ‘CurrentThread’ static property of the thread class: Thread.CurrentThread.property

You can point a new thread at either a parameterless method or a method that takes a single object parameter. The Thread constructor accepts a delegate for each of these 2 method types:

  • ThreadStart – new Thread(Method); where Method is declared as void Method()
  • ParameterizedThreadStart – new Thread(Method); where Method is declared as void Method(object stateArg)
ThreadStart example

Thread backgroundThread = new Thread(ThisMethodWillExecuteOnSecondThread);
backgroundThread.Name = "A name is useful for identifying thread during debugging!";
backgroundThread.Priority = ThreadPriority.Lowest;
backgroundThread.Start(); // Thread is put in the ready state and waits for the scheduler to move it into a running state.

public void ThisMethodWillExecuteOnSecondThread()
{
    // Do something
}

ParameterizedThreadStart example

Thread background = new Thread(ThisMethodWillExecuteOnSecondThread);
background.Start("string"); // The argument is passed to the method as an object.

public void ThisMethodWillExecuteOnSecondThread(object stateArg)
{
    string value = stateArg as string; // Cast the state back to its real type.
    // Do something
}

Thread lifetime

When the thread is started, the life of the thread is dependent on a few things:
  • When the method called at the start point of the thread returns (completes).
  • When another thread invokes the thread object’s Interrupt or Abort method, which essentially injects an exception into the thread from outside (asynchronous exception).
  • When an unhandled exception occurs within the thread (synchronous exception).
Synchronous exception = From within.
Asynchronous exception = From outside.
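
To make the ‘from outside’ case a bit more concrete, here is a small sketch using Thread.Interrupt, which raises a ThreadInterruptedException inside the target thread the next time it is sleeping, waiting or joining. I’ve stuck to Interrupt because, as far as I’m aware, Thread.Abort is not supported on the newer .NET Core / .NET 5+ runtimes. The class and method names are invented for the example.

```csharp
using System;
using System.Threading;

public static class ThreadExceptionDemo
{
    // Starts a thread that sleeps forever, interrupts it from the
    // calling thread, and reports what the sleeping thread saw.
    public static string InterruptSleepingThread()
    {
        string outcome = "not interrupted";

        var worker = new Thread(() =>
        {
            try
            {
                Thread.Sleep(Timeout.Infinite); // sleep until someone interrupts us
            }
            catch (ThreadInterruptedException)
            {
                outcome = "interrupted"; // the exception arrived from outside
            }
        });

        worker.Start();
        worker.Interrupt(); // takes effect as soon as the worker is (or becomes) blocked
        worker.Join();      // wait for the worker to end before reading the outcome
        return outcome;
    }

    public static void Main()
    {
        Console.WriteLine(InterruptSleepingThread());
    }
}
```

Because the worker catches the exception, its start method returns normally and the thread ends through the first route in the list above.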

Thread Shutdown

Whilst threads will end in the scenarios listed above, you may wish to control when the thread ends and have the parent thread regain control before it leaves the current method.
A common pattern is for the Main() thread to start a secondary thread to take care of some looping work, and to use a volatile field member (its value is always re-read, so updates from another thread are seen) to tell the secondary thread to finish its looping.
The main thread then calls Join on the secondary thread, which blocks the main thread until the secondary thread has finished.
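
My original example for this hasn’t survived, so the following is a minimal reconstruction of the idea rather than the original code: a volatile flag asks the looping thread to stop, and Join makes the calling thread wait for it. LoopingWorker, RequestStop and ShutdownDemo are hypothetical names.

```csharp
using System;
using System.Threading;

public class LoopingWorker
{
    // volatile: every read goes back to the field, so the loop sees the update.
    private volatile bool _stopRequested;

    // Written only by the worker thread; read by the caller after Join.
    public int Iterations;

    public void Run()
    {
        do
        {
            Iterations++;
            Thread.Sleep(1); // stand-in for one unit of real work
        }
        while (!_stopRequested);
    }

    public void RequestStop()
    {
        _stopRequested = true;
    }
}

public static class ShutdownDemo
{
    public static int RunAndStop()
    {
        var worker = new LoopingWorker();
        var thread = new Thread(worker.Run);
        thread.Start();

        Thread.Sleep(50);     // let the secondary thread loop for a while
        worker.RequestStop(); // ask it to finish its looping
        thread.Join();        // block the calling thread until it has ended

        return worker.Iterations;
    }

    public static void Main()
    {
        Console.WriteLine("Worker looped " + RunAndStop() + " times before stopping");
    }
}
```

The nice thing about this approach is that the secondary thread ends by returning from its start method, so no exception has to be injected into it.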

Background vs Foreground Threads

Foreground Thread – A foreground thread, if still running, will keep the application alive until the thread ends. A foreground thread has its IsBackground property set to false (the default value).
Background Thread – A background thread can be terminated if there are no more foreground threads to execute. They are seen as non important, throw away threads (IsBackground = true).

Thread Pools

Threads in the CLR are a pooled resource. That is, a thread can be borrowed from a pool of available threads, used, and then returned to the pool.
Threads in the pool automatically have their IsBackground property set to true, meaning that they are not seen as important by the CLR: if all of the foreground threads in the process end, the pool threads are ended too, whether their work is complete or not. Work queued to the pool is handled on a broadly FIFO basis, so the first work item added to the queue is the first to be given an available thread from the pool. When that work finishes, the thread is returned to the pool. Thread pool threads are useful for non important background checking / monitoring that does not need to hold up the application.

// Queuing work onto a thread from the pool

ThreadPool.QueueUserWorkItem(MethodName, methodArgument); // The pool thread will be ended if all foreground threads end.
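
For a slightly fuller picture, here is a sketch that queues a work item and reports back what kind of thread it ran on. The PoolDemo class and RunOnPool method are invented for illustration. The event is only there so that the caller can wait for the work item to finish before reading the results.

```csharp
using System;
using System.Threading;

public static class PoolDemo
{
    // Queues one work item and returns { ran on a pool thread?, was a background thread? }.
    public static bool[] RunOnPool(object argument)
    {
        bool isPoolThread = false;
        bool isBackground = false;
        var done = new ManualResetEvent(false);

        ThreadPool.QueueUserWorkItem(state =>
        {
            isPoolThread = Thread.CurrentThread.IsThreadPoolThread;
            isBackground = Thread.CurrentThread.IsBackground;
            Console.WriteLine("Working on: " + state);
            done.Set(); // tell the caller the work item has finished
        }, argument);

        done.WaitOne(); // wait here, otherwise the process could exit first
        return new[] { isPoolThread, isBackground };
    }

    public static void Main()
    {
        bool[] result = RunOnPool("some state");
        Console.WriteLine("Pool thread: " + result[0] + ", background: " + result[1]);
    }
}
```

Without the WaitOne call, a console application could reach the end of Main and exit before the work item had even started, precisely because pool threads are background threads.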

Microsoft WebsiteSpark : Free Full Featured Development Tools


I’ve recently started my own virtual infrastructure by renting some dedicated servers in the UK. One of those servers will run my soon to be re-developed ProcessWorks website, which will expand on the articles I write via this blog, but also include training material, downloads, one or two useful public web services and the details of a secret product still in development.  But enough of the advertisement.

The other machine is my development server, which now hosts a full on Microsoft development environment. As Microsoft is clearly a commercial entity and a proportion of non Microsofters tend to moan about the fact that you have to actually pay for enterprise software, this might surprise you. I managed to kit out my new virtual development server for the high price tag of… FREE… and with the following specification:

  • .NET 4 Development Framework
  • SQL Server 2008 R2 Web (includes database engine, reporting services, notification services etc) – Much more than Express edition.
  • Visual Studio Professional 2010
  • Sharepoint Foundation 2010
  • Microsoft Expression Studio 4 Premium (includes Expression Web, Encoder, Design and Blend)
  • BizTalk Server 2010 Developer Edition

In my case I already had Windows 2008 R2 Standard installed with the Web Server and Application Server roles turned on (but I’ll get to how you can download Windows Server 2008 R2 for free also).

This was all made possible via Microsoft’s continuing support of new start businesses and developers.  BizTalk 2010 is available as a free full featured download for developers through this link.  This is an amazing integration and process server and if you wish to learn BizTalk, this is what you need.  The installer takes care of downloading the pre-requisites or allows you to load a pre-req .cab file and away you go. There is plenty of BizTalk training material available from Microsoft.  The rest of the software listed above was obtained via one of Microsoft’s ‘Spark’ programs.

  • BizSpark is aimed at new small businesses and offers Microsoft software for free via their MSDN download portal.  This is clearly a move to seed more expensive Microsoft infrastructures as companies expand, but it’s free and you can always decide to get smart and replace it with free technology.
  • DreamSpark is aimed at students, giving them access to software they can use to aid study.  Provided you can prove you are a student, you get access to software such as Visual Studio 2010 Ultimate, Expression Studio 4 Premium, XBox SDK, SQL Server 2008 R2 and operating systems such as Windows Server 2012 / 2008 R2. A pretty neat deal.
  • WebSiteSpark is aimed at small web development companies, like ProcessWorks.  I have created a couple of ASP.NET web sites for clients and the software that has been made available has been so very useful.  You simply sign in with your MSN/Hotmail credentials, provide the name and address of your company and then you are registered and have access to the Microsoft partner portal / MSDN downloads.  You are granted several licence keys for much the same product set as is given to students via DreamSpark.  You are also given access to a free set of ASP.NET UI controls from a third party company and get a 1400 dollar voucher for using Microsoft’s Azure cloud service to deploy your applications (this is not a pre-requisite to signing up however).

So, in my case, I have a small company so I went with WebsiteSpark (for the choice of software I wanted).  So unfortunately these support programs are not open to everyone, however if you are a student, small start-up or a one man ltd company, you can get access to what would normally be very expensive software, for free.

IIS : A Recent History


I’ve been setting up a new Windows 2008 R2 Server today and configuring the Web Server role. I had an unanswered question or two so went about ‘googling’ for some clarification. Forgetting that I’d set my local Evernote content to be shown as part of google searches, I found some of the answers from some reasonably old notes I’d made when Win Server 2008 R2 was first released.  With these notes on screen and the fact that it’s Friday night and I’m thinking about beer more than posting a new article, I thought I’d take some of the usable content  (minus my drivel and spelling errors) from the notes and post them up here. A little random, but hopefully some use in illustrating the evolution of the most recent editions of Internet Information Services.

IIS Feature History (Recent – since 6.0)

IIS 6.0 – Included with Win Server 2003 / Windows XP Pro

  • Introduced application pools. These are boundaries that exist to separate sets of applications / sites from each other. They have their own security context.
  • Introduced worker processes (w3wp.exe), of which there can be many associated with an application pool. The w3wp.exe process is created when traffic is received and is not resident all of the time.
  • Introduced the HTTP.sys as the protocol listener for HTTP/S, which listens for HTTP requests on the network and hands them to the application pool queues
  • Removed winsocks (windows sockets API) component which was previously used to listen and transfer HTTP requests
  • Security Account – IUSR_NameOfServer / Group – IIS_WPG
  • WWWService managed the application pools and worker processes
  • Used the ‘metabase’ for server / site level configuration

IIS 7.0 – Included with Win Server 2008 / Windows Vista

  • Complete re-write
  • New modular design to reduce attack surface (feature modules must be turned on before use)
  • Hierarchical configuration
  • Greater .NET support
  • Security Account – IUSR / Group – IIS_IUSRS (no server name used now so easier to migrate)
  • IIS_IUSRS group has access to wwwroot by default, meaning that access is open to anonymous users accessing wwwroot. In order to restrict access to a certain folder of the web service, you must remove NTFS permissions from the IIS_IUSRS group.
  • You can create your own protocol listeners in WCF (which listen out for certain protocols)
  • WAS (Windows Process Activation Service… named as it activates/creates Windows processes) now takes care of managing application pools and worker processes (the WWW Service is now used to manage performance counters). A protocol listener will pick up a request and ask WAS to determine whether an application pool and worker process is available. If there is no worker process available in the application pool, WAS will start (activate) a new one.
  • Introduced the applicationHost.config XML configuration file in place of the ‘metabase’ (similar idea to having a machine.config for .NET applications). It contains the configuration/definitions of all sites, applications, application pools and global settings. It also contains the location of custom ‘modules’ written in .NET that you can implement in IIS and the native modules that ship with IIS. The config file is found in %windir%\system32\inetsrv\config

IIS 7.5 – Included with Win Server 2008 R2 / Windows 7

  • Powershell support added
  • Improved WebDAV and FTP modules
  • New management tools
  • Configuration file logging. Enables auditing of access to config files.
  • Hostable web core. This means the core web components can be hosted by other applications, meaning applications can accept and process HTTP requests.

IIS HTTP request handling

Request processing follows a similar model in IIS6/7/7.5. The below shows the processing model for HTTP requests. If another protocol was being used, the listener would be different but the processing would be the same.

  • When a client browser initiates an HTTP request for a resource on the Web server, HTTP.sys intercepts the request.
  • HTTP.sys contacts WAS to obtain information from the configuration store.
  • WAS requests configuration information from the configuration store, applicationHost.config.
  • The WWW Service receives configuration information, such as application pool and site configuration.
  • The WWW Service uses the configuration information to configure HTTP.sys.
  • WAS starts a worker process (w3wp.exe) for the application pool to which the request was made if one is not already available.
  • The worker process processes the request by running through an ordered list of events that call different native and custom ‘managed’ modules (custom .NET assemblies designed to process web traffic)
  • The worker process executes the server side logic in the context of the user identity configured in the application pool and then returns a response to HTTP.sys.
  • The client receives a response.

IIS Server Modules

Unlike IIS 6.0, IIS 7.0 introduced a core web server engine that can have modules (functionality) added or removed from it. These modules are used by the core web engine to process requests. You can add or remove the native modules or create your own custom modules. This module based approach is more secure than IIS 6.0 because it reduces the attack surface and memory consumption footprint by letting you choose which modules to activate. It also makes the web server extensible in the form of custom managed modules (.dlls). The module types are:

  • Native modules – These ship with IIS and can be found in the %windir%\system32\inetsrv folder of the server. e.g. Cachhttp.dll is the http cache module.
  • Managed modules – These are .NET based modules that come with the .NET framework and plug into the engine. e.g. System.Web.Security.UrlAuthorizationModule. You can create your own custom managed modules using the .NET Framework SDK.

To process a request, a worker process works through an ordered sequence of events, invoking modules along the way. First native modules are called, then CLR hosted ‘managed’ modules in the form of .NET assemblies installed to the server and registered in the applicationHost.config file.

Entity Framework : Unable to load the specified metadata resource.


So I’ve been working on a code base for a financial system today and have performed a good bit of code re-factoring involving the re-organization of classes into and out of my namespace structure. After deploying my code to the test server, I ran into a problem with a few of my domain model objects that load their state from the database using an entity framework model (edmx). The problem, as per the title of this post threw me a little as I hadn’t changed the properties of my edmx, nor had I changed the connection string.

I should add at this point that I dynamically pass my entity connection string to my entity model constructor at runtime so that I have more control.  In this instance, the code I’m using to build that connection string had not changed either, to my knowledge. The model names had remained the same.  After a few minutes of head scratching, it became clear that my connection string had become invalid because I had renamed the namespace that housed my entity models, and I had to update the code that took care of building the entity connection string.

Here’s my code. The namespace my models used to be in was called ‘DataModels’. Then as I added some more DAL classes to that namespace, I decided to call it just ‘Data’. That needed updating:

[Image: the code that builds the entity connection string]

Hopefully my moment of ‘numptiness’ can benefit you if this exception is thrown in your code after a re-factoring exercise. It’s very likely a change in the connection string / the connection string’s metadata. The 3 resource addresses in the connection’s metadata relate to the storage model (ssdl), conceptual model (csdl) and mappings (msl), although they are not terribly clear in my image above, so here’s my model’s metadata in full:

res://*/Data.MetastormModel.csdl|res://*/Data.MetastormModel.ssdl|res://*/Data.MetastormModel.msl
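
Since the image of my code isn’t much use, here is a minimal sketch (not my original code) of building that metadata string from the namespace and model name, so that a rename only needs changing in one place. EntityMetadata and Build are names invented for this example. The resulting string is what would go into the metadata part of the entity connection string, for instance via the Metadata property of EntityConnectionStringBuilder.

```csharp
using System;

public static class EntityMetadata
{
    // Builds the three embedded resource addresses (csdl, ssdl, msl)
    // for a model that lives under the given namespace.
    public static string Build(string modelNamespace, string modelName)
    {
        string prefix = "res://*/" + modelNamespace + "." + modelName;
        return prefix + ".csdl|" + prefix + ".ssdl|" + prefix + ".msl";
    }

    public static void Main()
    {
        // Before my re-factoring the first argument would have been "DataModels".
        Console.WriteLine(Build("Data", "MetastormModel"));
    }
}
```

If the namespace passed in no longer matches where the model’s resources actually live, you are back to the ‘Unable to load the specified metadata resource’ exception.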

Entity Framework : A short introduction


Introduction

The Entity Framework is an object relational mapping tool that is well integrated into the .NET development environment. It allows for the conceptual modelling of a physical data store that can be used with your applications. Data sources are represented as objects so it’s much easier to incorporate data entities into your logic.

The problem with older data access technologies is that they did nothing to bridge the gap between the relational set based format of the data and the strongly typed object oriented logic that needs to access that data. Supporting large data reads for example required considerable code written into the logic of the application to load the data in and persist it back to the data store, this affected performance, caused issues with the data type differences and generally contributed to poor design. The Entity Framework is a bridge between the data tier and the application logic.

Entities are represented as partial classes. This allows you to extend your entity by creating another partial class with the same class name.
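
As a quick illustration of the partial class idea, the sketch below fakes the generated half of an entity and then extends it with a hand written half. Product and its properties are hypothetical; in a real project the first half would be generated for you in the model’s designer file.

```csharp
using System;

// Stand-in for the generated half of the entity (normally in the designer file).
public partial class Product
{
    public int Quantity { get; set; }
    public decimal UnitPrice { get; set; }
}

// Hand written half, normally in its own file: same class name, same namespace.
public partial class Product
{
    // Extra behaviour that survives the model being regenerated.
    public decimal Total
    {
        get { return Quantity * UnitPrice; }
    }
}

public static class PartialClassDemo
{
    public static void Main()
    {
        var product = new Product { Quantity = 3, UnitPrice = 2.50m };
        Console.WriteLine(product.Total);
    }
}
```

The compiler merges the two halves into one class, so the rest of your code can’t tell which members were generated and which were added by hand.
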
Differences between object orientated logic and relational data stores

Data Type Differences

– nvarchar(20) has a limited size, but a .NET string allows up to 2GB of data
– sql_variant only loosely maps to a .NET object
– Binary data in the database doesn’t quite map to a byte array in .NET
– Date and time formats can become an issue when mapping between DB and .NET

Relationship Differences

– The database uses primary / foreign key relationships. These relationships are stored in a master database table.
– .NET uses object references (linking to a parent object from the child objects (in a child object collection))

Inheritance Differences

– .NET supports single object inheritance (have a base object with shared features and then derive from that base object to create similar objects)
– Makes code simpler and easier to debug
– Relational databases do not support inheritance; tables cannot be inherited by other tables
– Modelling a database that attempts to emulate some kind of table hierarchy (inheritance) can cause issues and introduce very complex entity models in .NET

Identity / Equality Differences

– In the database, the primary key constraint (unique column) is the identifier of a row object
– When comparing objects in .NET that have had the same data loaded from the same table, these objects are NOT equal, even though they hold the same data (only when the variable REFERENCES the same object… equality by reference… is the object EQUAL to another). So loading the same data from the database (the same row) will still not seem equal in the logic.
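
The identity point is easy to see in a few lines. In this sketch, PlainCustomer relies on the default reference equality, whereas KeyedCustomer overrides Equals to compare on its key in the way a database compares rows on the primary key. Both classes are made up for the example and have nothing to do with the Entity Framework itself.

```csharp
using System;

// Default behaviour: two objects are equal only if they are the same instance.
public class PlainCustomer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Key based behaviour: two objects are equal if they share the same Id.
public class KeyedCustomer
{
    public int Id { get; set; }
    public string Name { get; set; }

    public override bool Equals(object obj)
    {
        var other = obj as KeyedCustomer;
        return other != null && other.Id == Id;
    }

    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

public static class EqualityDemo
{
    public static void Main()
    {
        // Pretend each pair was loaded twice from the same database row.
        var plainA = new PlainCustomer { Id = 1, Name = "Acme" };
        var plainB = new PlainCustomer { Id = 1, Name = "Acme" };
        var keyedA = new KeyedCustomer { Id = 1, Name = "Acme" };
        var keyedB = new KeyedCustomer { Id = 1, Name = "Acme" };

        Console.WriteLine(plainA.Equals(plainB)); // same data, different references
        Console.WriteLine(keyedA.Equals(keyedB)); // compared on the key instead
    }
}
```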

Handling these differences : Object Relational Mapping

You can write a lot of complex code to try and iron out these object/relational differences when working with data from your logic… or you can use a framework, such as the Entity Framework, to act as an intermediate layer that focuses on resolving the issues highlighted. This intermediary is called Object Relational Mapping. That means that the relational database object is MAPPED to an OOP object, and that mapping resolves the differences highlighted. It is a back and forth mapping, meaning your object based changes can be written back to the database.

Microsoft’s answer to handle these differences

Microsoft’s primary ORM tool, now supported within the framework, is the Entity Framework. This is Microsoft’s primary data access strategy, first released with .NET Framework 3.5 SP1. Microsoft are investing a lot of cash into the development of Entity Framework so it is the recommended data access method.
LINQ to SQL is a similar technology developed by Microsoft and also benefits from the LINQ syntax, but Microsoft decided to take Entity Framework forward and leave LINQ to SQL as is, with no further development planned.

Benefits of using ORM

– More productive. Less complex code writing required to manage the differences between objects and relational data
– Better application design. Adding complex code can complicate the design. Introducing a mapping layer, maintains the n-tier application stack
– Code can be re-used (the model can be saved off as its own project that is used by many applications or parts of the same application).
– Much more maintainable than custom code.
– Lots of different ORM tools (NHibernate was an older ORM tool that could be used with .NET, based on a port from the Java Hibernate ORM)

Note that a small performance hit is incurred in executing the mappings, but the benefits of using ORM far outweigh the cons.

Generic Data Objects vs. Entity Objects

ADO.NET has long used generic data objects such as SqlConnection, SqlCommand, SqlDataAdapter, SqlDataReader, DataSet etc. Together these objects provide a means of getting data from a data store and writing data changes back to that data store. The DataTable tracks the changes made to its rows and the DataAdapter object writes those changes back. Whilst these objects do a great job, there are a few issues with using generic data objects like these:

Generic Data Objects

– Tight coupling between application and database because the app code needs to know the structure of the data in the database. So changes to the database can cause problems and code updates may be needed.
– Older DataTables contain columns with loose typing and you would have to convert the column value to a strong type using the methods of the Convert class (although typed data sets resolve this to an extent)
– Extracting related data (data tables in a data set), whilst possible, introduced some complexities.
– Unnecessary looping over in memory data is needed to find values (unless LINQ to DataSets is used)
– Generally more code needs to be written to extract bits of data at different times from DataSets.

Entity Objects

– The ORM takes care of the strong typing from relational DB type to object type. This means you don’t have to be concerned with running expensive conversions on the returned data.
– The compiler checks the types at compile time, rather than errors surfacing at runtime
– More productive to work with as you have to write less code in order to work with the objects. Intellisense provides a much quicker way of accessing the columns in your data objects when writing LINQ for example.
– You are programming against a model, rather than custom code specific to the application
– You’re no longer working directly with the data store in your application logic, just normal objects, therefore decoupling the two.
– Is a layer of data abstraction.
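
To show the typing difference side by side, here is a small sketch that reads the same value from a loosely typed DataTable and from a strongly typed object. The Invoice class here is a hand written stand-in for a generated entity, and the table is filled in memory rather than from a database, so treat it as an illustration only.

```csharp
using System;
using System.Data;

// Hand written stand-in for a generated entity class.
public class Invoice
{
    public int Id { get; set; }
    public decimal Amount { get; set; }
}

public static class TypingDemo
{
    // Generic data object: the column is looked up by name and comes back as object.
    public static decimal AmountFromDataTable()
    {
        var table = new DataTable("Invoices");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Amount", typeof(decimal));
        table.Rows.Add(1, 99.50m);

        object raw = table.Rows[0]["Amount"]; // a typo in "Amount" only fails at runtime
        return Convert.ToDecimal(raw);        // explicit conversion to a strong type
    }

    // Entity object: the property is already a decimal and is checked by the compiler.
    public static decimal AmountFromEntity()
    {
        var invoice = new Invoice { Id = 1, Amount = 99.50m };
        return invoice.Amount; // a typo here would not compile
    }

    public static void Main()
    {
        Console.WriteLine(AmountFromDataTable());
        Console.WriteLine(AmountFromEntity());
    }
}
```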

Entity Framework Benefits

So what are the benefits of using the Entity Framework? – Plenty of reasons.

1) Its APIs are now fully integrated into Visual Studio so are available to add to your projects
2) Uses LINQ as its query language, which again is integrated into the language
3) It’s independent of the data store
4) It abstracts data layer from the logic layer

Note, you might not need to use the Entity Framework for more basic data requirements. Simple reads of data from a few simple tables will not require you to set up entity models. Entity Framework is useful when you are working with more than a few tables within an application.

Entity Data Model

The entity data model is the structure of the business data, but reshaped into objects usable by the programming language. It describes the structure of the entity data objects that make up the model and also the relationships between these entity objects.

It is made up of 3 main chunks of XML in a single data model file, with the extension .edmx:

– Conceptual model (the object orientated structure)
– Storage model (the physical database storage structure)
– Mapping data (the mappings between the two models above)

Creating a Model

1) Create a VS project targeted at .NET 3.5
2) Add a new folder to that project called ‘DataModels’ (good to keep your models separate from your code)
3) Add a new item to that folder, specifically an ADO.NET Entity Data Model. This will start the model wizard.

First you can select whether you want a blank model or to read from an existing database’s schema, and the latter is what we’ll do. Next, you specify the connection string to use to connect to the data store. The wizard then uses that connection string to build an entity framework connection string (which contains the normal ADO.NET connection string, but also some other metadata). The ‘metastorm’ connection string I’m using here is pulled from the App.Config file in my project.

At the bottom of this screen, the MetastormEntities is not just the name of the connection string that will be saved to App.Config, it’s the name of the entity model, so should reflect the name of the database you are pulling your entities from.

Next, the database specified by the connection string is read and its schema is presented to you so you are able to select the database objects you wish to model. Tables, Views or Stored Procedures (incl. Functions):

I select 3 of my local tables, MRPOPS, MRPIPS and MRQUERY and click finish. This creates my .edmx model called MetastormEntities (as specified earlier):

By right clicking each entity you can view the mapping from the physical table to the conceptual entity, including the data type conversions. You can edit the names of the local properties that map to the database column names :

You may notice that I have renamed the entities as follows:

MRPOPS is now called Order (a singular entity) with an EntitySet name of Orders
MRPIPS is now called Invoice with an EntitySet name of Invoices
MRQUERY is now called Query with an EntitySet name of Queries

This will make working with the tables easier in code and leaves the actual table names intact.

Looking into the EDMX Model

As previously mentioned, the .edmx model is actually an XML file that is rendered graphically by Visual Studio to show the conceptual model. To view the 3 parts of the .edmx file (physical storage, conceptual model and mappings), right click the model, select ‘Open With…’ and choose the XML Editor. This is a collapsed view of my model file that shows the 3 parts (with the edmx namespace):

The runtime here contains the metadata about the data, including the mappings.
The designer section is used by Visual Studio in order to display the entities properly in the graphical designer.
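As a rough sketch of the collapsed view described above, the following parses a skeleton .edmx document and lists the sections found under the runtime element. The skeleton is an assumption on my part rather than a copy of my model file: the sections are empty here, and the namespace URI is the one I would expect for a version 1.0 (.NET 3.5) model. The class and method names are purely illustrative.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class EdmxSkeleton
{
    // Skeleton only: a real .edmx carries full schema content inside each section.
    // The namespace URI is assumed to be the version 1.0 (.NET 3.5) one.
    public const string Xml = @"<edmx:Edmx Version=""1.0"" xmlns:edmx=""http://schemas.microsoft.com/ado/2007/06/edmx"">
  <edmx:Runtime>
    <!-- physical database storage structure -->
    <edmx:StorageModels />
    <!-- object orientated structure -->
    <edmx:ConceptualModels />
    <!-- mappings between the two models -->
    <edmx:Mappings />
  </edmx:Runtime>
  <!-- layout data used by the Visual Studio designer -->
  <edmx:Designer />
</edmx:Edmx>";

    // Returns the local names of the sections found under the Runtime element.
    public static string[] RuntimeSections()
    {
        XDocument doc = XDocument.Parse(Xml);
        XNamespace edmx = doc.Root.Name.Namespace;
        return doc.Root.Element(edmx + "Runtime")
                  .Elements()
                  .Select(e => e.Name.LocalName)
                  .ToArray();
    }

    public static void Main()
    {
        foreach (string section in RuntimeSections())
        {
            Console.WriteLine(section);
        }
    }
}
```

Running it should print the three runtime section names in document order. The designer element sits alongside the runtime element, not inside it.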

Looking into the Model Designer.cs

The designer.cs file contains the partial classes that represent each entity as well as the model itself:
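Because the generated entity classes are partial, you can extend them in a separate file without touching the generated code. Below is a minimal sketch of that idea. The first Order class is a heavily simplified stand-in for what the designer generates (the real class carries mapping and change tracking code), the property names are the ones used in the query later in this article, and HasDocGenError is a hypothetical helper I have made up for illustration.

```csharp
using System;

// Stand-in for the generated designer.cs (heavily simplified; the real generated
// class carries mapping attributes and change-tracking code).
public partial class Order
{
    public string EFOLDERID { get; set; }
    public string txtDocGenResult { get; set; }
}

// Hand-written file: the same partial class, extended without editing generated code.
public partial class Order
{
    // Hypothetical helper: true when the doc generation result starts with "E".
    public bool HasDocGenError
    {
        get
        {
            return txtDocGenResult != null
                && txtDocGenResult.StartsWith("E", StringComparison.Ordinal);
        }
    }
}

public static class PartialClassDemo
{
    public static void Main()
    {
        Order order = new Order { EFOLDERID = "0001", txtDocGenResult = "Error: template missing" };
        Console.WriteLine(order.EFOLDERID + " failed: " + order.HasDocGenError);
    }
}
```

The benefit of this split is that regenerating the model from the database overwrites only the designer file, so anything in the hand-written partial survives.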

Using Our MetastormModel.edmx

In our project, add a new class and make sure we reference System.Data.Entity in that class. Ensure that System.Data.Entity.dll has been added to your project references. It should have been added automatically, but it’s best to check. Because we have our class in the same project as our model, we do not need to copy our App.Config out into another project that might need our model.

We can now instantiate our model and start querying our tables (note the looping in the below example is unnecessary but illustrates how to access each order in the query results):

//Create an instance of the model (disposed when the using block ends)
using (MetastormEntities metastorm = new MetastormEntities())
{
    //Query the Orders entity set for doc generation errors
    var selectedOrders = metastorm.Orders.Where(p => p.txtDocGenResult.StartsWith("E"));

    //Write one line per failed order (a return inside the loop would exit after the first)
    foreach (Order order in selectedOrders)
    {
        Console.WriteLine("Order Folder with ID " + order.EFOLDERID + " has had a document failure!");
    }
}
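If you want to try the shape of the query above without a database to hand, the same LINQ operators work against an in-memory list. This is only a sketch under that assumption: OrderStub is a made-up stand-in for the generated Order entity with just the two properties the query touches, and the sample data is invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory stand-in for the Order entity; only the two properties used by the query.
public class OrderStub
{
    public string EFOLDERID { get; set; }
    public string txtDocGenResult { get; set; }
}

public static class DocGenErrorReport
{
    // Builds one message per order whose doc generation result starts with "E".
    public static List<string> FailureMessages(IEnumerable<OrderStub> orders)
    {
        return orders
            .Where(p => p.txtDocGenResult != null
                     && p.txtDocGenResult.StartsWith("E", StringComparison.Ordinal))
            .Select(o => "Order Folder with ID " + o.EFOLDERID + " has had a document failure!")
            .ToList();
    }

    public static void Main()
    {
        // Invented sample data
        var orders = new List<OrderStub>
        {
            new OrderStub { EFOLDERID = "A1", txtDocGenResult = "E: template not found" },
            new OrderStub { EFOLDERID = "B2", txtDocGenResult = "OK" }
        };

        foreach (string message in FailureMessages(orders))
        {
            Console.WriteLine(message);
        }
    }
}
```

Against the real model the Where clause is translated into SQL by the Entity Framework, whereas here it simply filters the list in memory, so treat this as a way to experiment with the query syntax and nothing more.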

Entity Framework Architecture

Cordys BPM Constructs 101a


I’m now really quite comfortable with Cordys process modelling and the model’s component constructs. There are simple differences between certain constructs and those I’ve used in other products such as MS BizTalk, so I wanted to absorb the Cordys documentation and present a sort of quick and dirty intro to each construct and how Cordys uses them to execute your processes. As there are several constructs and I wanted to go into some detail, I’ll present some of them here and the rest in a subsequent article.

If you don’t already have a copy of the Cordys community edition VM that runs C3 or BOP4 on a CentOS virtual hard disk, I’d suggest getting hold of it so you can ‘play’.

/*Note that most of the constructs can only be used if a user is granted the Business Analyst role, from the Cordys Classic Studio ISV package.*/

Start Event – Only one per process. Can have several trigger types:

– Message (a defined process specific message)
– Timer (by setting a time unit (e.g. days) and number of occurrence times)
– No Message or Timer

End Event – Process can have multiple end events. Error, Message and Rollback. You define an output message by dragging a ‘Document Type’ from the BPM Components repository to the end event. This then invokes a web service as the process ends. You can specify custom error XML in the ‘error details’ tab of the end event.

Activity – The activity is the construct that represents a generic step in the process and should be configured with an activity ‘type’. It can be of type ‘Application’ (web service calls) or ‘XForm’. When of type XForm, an additional XForm tab becomes available in the activity properties. An activity can also be set to be a ‘dummy activity’ meaning it does nothing. Each activity can be conditionally executed based on a conditional evaluation returning a boolean result.

Decision – The decision is a basic construct that allows multiple mutually exclusive outputs. Its actual conditions are defined in the connectors that exit the decision construct, and XPath evaluations can use message map data for data driven decisions.

Intermediate Message Event – The intermediate message should have a process specific message defined and is invoked when a message is received by the process. Normally this is used to ‘wake’ a waiting process instance. The inbound message is matched to the waiting instance based on the process instance id (ensuring message correlation). For asynchronous web service calls, an intermediate message can be used to receive the response message from a service operation call. Dragging the output message from the message map will set this. Therefore it is useful to know whether the operations you are calling via your activities are sync or async.

Compensate Event – Used to roll back any effects of an activity or context of activities when an exception is caught. A compensate event can be defined for an activity, sub-process, context (embedded sub-process), and for each, while and until loops. An activity or sub-process can only have one compensate event associated. Compensate is represented as two left pointing arrows, side by side in a circle.

Delay Event – An intermediate event that stalls the process for a configurable amount of waiting time. Delays can be of type:
– Fixed delay. That is a set amount of days, hours, minutes or seconds.
– Message read delay. That is a delay value read from a process specific message using an XPath reference.
As with a standard activity, a delay may be conditionally executed based on a boolean return evaluation.

Exception Event – An exception is an event which is fired when a Cordys error occurs (e.g. a SOAP fault is detected via external service responses or via SOA grid messages). As you would in programming code, you can specify a specific exception name to catch or just catch any exception. Once an exception is thrown, process execution is diverted to the connected exception event where subsequent exception handling activities should follow. There are several error types which can be caught and which are configured on the exception event construct:
– All. All exceptions are caught, regardless of the error code.
– Custom Error. You can set an error code you may expect from a service response indicating an error has occurred (e.g. a SOAP fault code).
– System Error. Simply put, a Cordys internal error where some part of Cordys may not be operational (a SOAP processor may be down, for example) or syntax rules are broken.
– Communications Error. These errors are specific to SOAP faults returned from external service calls. Error code 11 is returned for comms failures.
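To make the programming analogy concrete, here is roughly how the same idea looks as ordered catch blocks in C#. This is an analogy only and has nothing to do with how Cordys implements exception events: the two exception classes and the error code string are hypothetical names invented for the sketch.

```csharp
using System;

// Hypothetical exception types standing in for the error categories listed above.
public class CustomServiceFault : Exception
{
    public string Code { get; private set; }

    public CustomServiceFault(string code) : base("Service fault " + code)
    {
        Code = code;
    }
}

public class CommunicationFault : Exception
{
    public CommunicationFault() : base("Communication failure") { }
}

public static class ExceptionRouting
{
    // Runs a step and reports which handler path was taken.
    public static string Run(Action step)
    {
        try
        {
            step();
            return "completed";
        }
        catch (CustomServiceFault fault)   // comparable to a 'Custom Error' with a known code
        {
            return "custom:" + fault.Code;
        }
        catch (CommunicationFault)         // comparable to a 'Communications Error'
        {
            return "communication";
        }
        catch (Exception)                  // comparable to 'All': anything not caught above
        {
            return "all";
        }
    }

    public static void Main()
    {
        Console.WriteLine(Run(() => { }));
        Console.WriteLine(Run(() => { throw new CustomServiceFault("ORDER_NOT_FOUND"); }));
        Console.WriteLine(Run(() => { throw new CommunicationFault(); }));
        Console.WriteLine(Run(() => { throw new InvalidOperationException(); }));
    }
}
```

As with catch blocks, the more specific handlers only make sense alongside a catch-all if they are considered first, so it is worth thinking about which exception event you expect to fire for a given fault.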

There are error codes that are only raised when a sub-process is being invoked from a parent. These are:
– Process Loading Error
– Process Instantiation Error
– Process Model Not Found Error

… more to follow soon!

Metastorm BPM : It’s not an application development tool


After 2 years of designing a large operational system using Metastorm v7.6, I wanted to reflect on why it’s a bad idea to use Metastorm BPM to build big workflow based systems.

The problem with the finished system is not that it doesn’t satisfy the requirements or doesn’t perform well enough (considering), it’s that it is a maintenance nightmare. I wrote an article this time last year, whilst travelling back home to Holland after being snowed in, about why maintainability is the most important design factor (over and above scalability and extensibility). Coupled with a ‘big bang’ design approach (over an agile dev approach) and constant requirement changes, it’s a surprise the system runs in its operational state.

I don’t wish to run the product down, because for small to medium workflow driven applications, it does the job. But its clear lack of object orientation is the biggest single product flaw, and when building a big system with Metastorm this cripples maintainability. A solid design architecture is obviously of major importance. Basic application architecture fundamentals, such as breaking a system design down into cohesive functional ‘components’ that represent ‘concern’ areas for the application, can be difficult to implement. This is down to the fact that process data is stored in the database per process, and passing data between processes using flags can become messy, especially when those flags carry characters that Metastorm classes as delimiters. Sub-processes are then an option, but these also have inherent flaws.

Forms, which again are application components, are process specific, so re-use suffers and forms have to be replicated, further undermining maintainability.

Having data repeated in processes and having no code dependency features is bad enough, but because you have to remember where you have used process variables and keep in mind when and where values may change, the tool puts all the responsibility on the developer. Once the system gets very large, the event code panels (the ‘on start’, ‘on complete’ etc.) get very complicated, and tracking variables and when they may change becomes a struggle in itself. Changing a variable value in a large process risks breaking other parts of the process because ‘you’ve forgotten that the variable is used in a conditional expression later on’.

This then begs the question: should you even use the Metastorm event/‘do this’ panels for ANY business logic? I’d say no. Only UI or process specific logic should live there, and you should push ALL of your business logic to server-side scripts and a suite of external .NET assemblies. You can then at least implement a fully swappable business logic layer.
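To give a feel for what a ‘swappable’ business logic layer means in an external .NET assembly, here is a minimal sketch. None of these names come from Metastorm or from my system: IDocumentRules, PrefixDocumentRules and OrderStep are hypothetical, and the rule itself is invented. The point is only that the process-facing code depends on an interface, so the implementation behind it can be replaced.

```csharp
using System;

// Hypothetical contract for one piece of business logic, kept outside the process.
public interface IDocumentRules
{
    bool IsFailure(string docGenResult);
}

// One implementation; it could live in its own assembly and be replaced later.
public class PrefixDocumentRules : IDocumentRules
{
    public bool IsFailure(string docGenResult)
    {
        return docGenResult != null
            && docGenResult.StartsWith("E", StringComparison.Ordinal);
    }
}

// Stand-in for the thin, process specific code that an event panel would call into.
public class OrderStep
{
    private readonly IDocumentRules rules;

    public OrderStep(IDocumentRules rules)
    {
        this.rules = rules;
    }

    // Decides the next stage name from the rule outcome.
    public string NextStage(string docGenResult)
    {
        return rules.IsFailure(docGenResult) ? "Review" : "Complete";
    }
}

public static class BusinessLayerDemo
{
    public static void Main()
    {
        OrderStep step = new OrderStep(new PrefixDocumentRules());
        Console.WriteLine(step.NextStage("E: template missing"));
        Console.WriteLine(step.NextStage("OK"));
    }
}
```

With this shape, a change to the business rule means shipping a new implementation of the interface, not hunting through event panels for every place the rule was typed in.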

So along comes v9. This product is a great move towards stronger application architectures. OOP design and the ability to debug alone save a whole lot of system maintenance time. So although this version takes us closer to being able to create solid, maintainable operational applications, it was released too early. It is slow (halving development productivity against version 7), it had many broken features, and grids, one of the most used visual components, especially for data driven applications (which is most business apps), were just terrible. They continue to perform almost independently from the rest of the system and patch 9.1.1 is still addressing grid shortfalls. These are obvious shortfalls which should have been picked up by a thorough QC team at (OpenText) Metastorm.

The new OOP approach means that designers and developers no longer have to use the ‘Line by line interpreted’ syntax of v7 and can design re-usable components.  So there is a greater case for using Metastorm BPM as an application development tool for fair-sized applications but whilst development productivity is still poor and the designer is still very buggy, it’s not quite there yet.

Cordys BOP4 : Messaging on the SOA Grid


I’ve spent a few weeks getting used to Cordys BOP 4 and as I usually try and do with a new product, I wanted to know more about what’s going on under the bonnet with it. The central coordinating component of Cordys is its SOA grid, which takes care of messaging between all of the core Cordys services and other web services. Based on the information provided in the Cordys offline documentation and because I’m a visual learner, I’ve drawn up the following image that should hopefully shed some light on how Cordys organises its internal services and how they communicate via the SOA grid.


What I’m trying to show here is how Cordys deals with an inbound service request.  The dark line represents the path of the message along the service bus.

One example of how the above image can be used to understand what Cordys does is the request of an XForm from the Cordys client. The client requests an XForm by sending an HTTP request to the web server for a resource with the .caf file extension. The web server, based on the .caf file extension, hands the request over to the Cordys web gateway. The web gateway contacts the LDAP service container and checks for the location of the XForms service container (the LDAP service must always be up and running for proper SOA grid functioning). The LDAP service container has an LDAP application connector which talks to CARS. Next the SOAP request is sent to the XForms service container and the XForms engine takes care of rendering the HTML response. Not only that, but the XForms engine also validates controls against schemas and automatically contacts other web services required whilst rendering. Once the HTML is generated, it is returned via the SOA grid to the Cordys web gateway, then back to the calling client.

I should mention at this point that web services on the SOA grid are called based on the service operation name and namespace in the SOAP request.
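The idea of resolving a service from the operation name and namespace can be pictured as a simple lookup keyed on that pair. The sketch below is only an illustration of the principle and is not how Cordys implements it (as described above, the real lookup goes through the LDAP service container). OperationRouter, the namespace URI, the operation name and the container name are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

public class OperationRouter
{
    // Maps "namespace#operation" to the name of the service container that handles it.
    private readonly Dictionary<string, string> containers = new Dictionary<string, string>();

    private static string Key(string ns, string operation)
    {
        return ns + "#" + operation;
    }

    public void Register(string ns, string operation, string container)
    {
        containers[Key(ns, operation)] = container;
    }

    // Returns the container name, or null when nothing is registered for the pair.
    public string Resolve(string ns, string operation)
    {
        string container;
        return containers.TryGetValue(Key(ns, operation), out container) ? container : null;
    }
}

public static class RouterDemo
{
    public static void Main()
    {
        OperationRouter router = new OperationRouter();

        // Hypothetical registration; the names are invented for this sketch.
        router.Register("http://example.com/xforms", "GetXForm", "XForms Service Container");

        Console.WriteLine(router.Resolve("http://example.com/xforms", "GetXForm"));
        Console.WriteLine(router.Resolve("http://example.com/other", "GetXForm") ?? "(no container)");
    }
}
```

Note that the same operation name under a different namespace resolves to nothing here, which is the reason the namespace matters as much as the operation name in the SOAP request.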

This is very high level and it’s always a good idea to read further into the Cordys documentation, but I hope this graphic helps to illustrate the architecture of services, service containers and service groups on the Cordys SOA Grid.

Understanding Multi-tenancy


I’m doing a lot of research and practical ‘playing’ with the Cordys BOP 4 environment at the moment.  It’s a relatively young product from a young company (founded in 2001) but from what I have understood about the product and its architecture it is a strong, versatile collaborative tool that really supports rapid application development and maintenance for business change thus reducing general cost of ownership.  I don’t want to talk about the product itself, I will be doing that soon enough, but I wanted to cover one of the software’s best features in my opinion and that is its ability to operate within the enterprise and in the cloud utilizing a fully multi-tenant architecture.

Multi-tenancy is an architectural principle which says that a single install of a software program / server application can service multiple client ‘organizations’, providing each with a customized, isolated application environment with its own data. This is in complete contrast to the idea of multiple instances for each organization (i.e. installing an instance of some server software and underlying database to serve one organization and store only that organization’s data). Each ‘organization’ (in the general sense of the word) is classed as a tenant using the application, so if one single installed application can serve multiple tenants their customized view of the application, it is said to have multi-tenancy (it supports multiple tenants). Google Apps is a perfect example of a multi-tenant architecture. Multi-tenancy is the foundation architecture for most if not all Software as a Service applications, thus cloud applications support multi-tenancy.

How multi-tenancy is implemented can vary from application to application. The general principle however is to have the application virtually partition portions of its data and services to different consuming organizations. A very simple example at the data level would be to have the application generate a set of related data tables under new database schemas as new organizations are set up to use the application (so a schema per organization). This separates off the data into logical groups that have their own security context on the database. There are other ways to partition the data, but this is just to illustrate one potential method.
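The schema per organization approach can be sketched in a few lines. This is a minimal illustration under my own assumptions and is not taken from Cordys or any other product: TenantSchema, the ‘tenant_’ prefix and the table name are all hypothetical, and a real application would also need to create the schema and grant permissions on it.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TenantSchema
{
    // Derives a schema name from the organization name.
    // Only simple identifiers are accepted, because the result ends up in SQL text.
    public static string SchemaFor(string organization)
    {
        if (organization == null || !Regex.IsMatch(organization, @"\A[A-Za-z][A-Za-z0-9_]*\z"))
        {
            throw new ArgumentException("Organization name must be a simple identifier.", "organization");
        }
        return "tenant_" + organization.ToLowerInvariant();
    }

    // Builds the schema-qualified table name that this tenant's queries would use.
    public static string Qualify(string organization, string table)
    {
        return SchemaFor(organization) + "." + table;
    }

    public static void Main()
    {
        // Two tenants, same logical table, physically separate schemas.
        Console.WriteLine(Qualify("Acme", "orders"));
        Console.WriteLine(Qualify("Globex", "orders"));
    }
}
```

Each tenant’s queries are then pointed at its own copy of the tables, and database permissions can be granted per schema, which is where the separate security context mentioned above comes from.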

So multi-tenancy is a software architecture and one that is prevalent in cloud applications.  Cordys BOP 4 does this very well and I’m looking forward to investigating this product and its virtualization capabilities further.

Metastorm 9 : Ignores ‘Client Paging’ property


Grids were one of the biggest reasons I upgraded to 9.1 from an application written in 9.0.0.1, mainly because the ability to wrap text in grids was removed in v9.0 and so columns with large amounts of text to display became unreadable. Now although this has been fixed in 9.1, grids are still throwing up issues. The first is noted in my previous post, where you cannot assign a business object field to more than one column of a grid (this worked in v9). Second, the grid page size on many (if not all) grids appeared to reset to 50, regardless of what was previously set. Lastly, it appears on deployment that 9.1 is completely ignoring the fact that I’ve unchecked the client paging check-box, as I don’t want to deal with pages of data in the grid. The below example shows what is set in the designer and the resulting form after deployment….

It is clear that the client paging property is not checked for this grid.

You can see from the above however that paging is still active on the grid (the pages are shown to the left of the screen grab, but the white space represents the paging bar). Clearing the browser cache doesn’t help in any way.  It really makes me wonder how much in-house quality control testing is done at Metastorm.