March 13, 2013 at 4:10 PM

Often we find ourselves at a customer or project explaining why it is better to have a dedicated SQL server or cluster for your BizTalk platform.

One can easily come up with the most obvious reason as to why this is important:

BizTalk is designed and optimized for parallel processing and throughput, and the BizTalk databases are resource intensive: they are the heart of your BizTalk environment.
If you share a SQL server between BizTalk and any of your other applications, you are pulling those resources away from BizTalk. This applies to memory, threads, CPU cycles, etc.

However, I find that customers are hard to persuade with that argument, and I cannot blame them:

  • If they look at the resource usage of the SQL server that is currently hosting their BizTalk platform, it is hardly doing anything. Memory usage is high of course (it is quite normal behavior for SQL to take up all memory for caching purposes), but the load on CPU and disk is mostly quite low.
  • They have other applications (either old or new) that need a SQL server environment. Most of the time the SQL database they need hosted is nothing more than a few dozen MB, so the tendency is to say: what harm can it do to host it on the BizTalk SQL?
  • They paid good money for the SQL server license(s) and BizTalk license(s) they are running their BizTalk platform on. They do not want to ask their manager for extra money again (cost of hardware/storage/licenses) because of budget limits or any other reason for that matter.

So in the end, what often happens at some projects is that the SQL server or SQL cluster hosting your BizTalk platform becomes a shared server or cluster hosting the BizTalk environment as well as several other databases.

 

Why is a shared SQL a bad thing?

Like I mentioned above, any resources you’re taking away from your SQL server, you are effectively taking away from your BizTalk platform.

Additionally, there is something on SQL called the “Max DOP” parameter or the parameter for the “maximum Degree Of Parallelism”. Learn more about max DOP on MSDN.

In short, it is the maximum number of processors that a SQL Server instance may use to execute a single query in parallel, in order to return the results as fast as possible.
If your Max DOP parameter is set to something higher than 1, SQL will try to process queries in parts by spreading them over a number of processors. All done by the SQL engine without you having to care about it much.

The reason why I talk about this here is that BizTalk, during configuration, will actually set this max DOP parameter to one (1) on the SQL server instance where your Message Box database is located. This is due to the fact that the BizTalk Message Box database is highly optimized and works quite differently from other "normal" databases. Other databases are just data stores that SQL retrieves data from and stores data to. BizTalk is built quite differently in many ways, while in essence it remains a data store of course. The BizTalk databases are so optimized that setting Max DOP to something higher than one (1) will actually hurt BizTalk throughput and performance.

Having said this, your “other” non-BizTalk databases will actually perform worse in 99.9% of cases with Max DOP set to 1.

 

So there it is, another reason to have a dedicated SQL server/cluster for your BizTalk platform.

Do you know any other reasons why sharing your BizTalk SQL environment is a bad idea? Feel free to put them in the comments.


November 21, 2012 at 3:36 PM

Suppose you want to make a custom BizTalk pipeline component that handles content from an inbound XML message, e.g. getting a value with XPath. You could use a MemoryStream or load the message into an XmlDocument. This is not the most performant way of doing this, though. Loading the message into memory with an XmlDocument can take up to 10 times as much space as the actual message size.

 

If you want to make a pipeline component that performs well, you might want to use an Xml(Text)Reader, a ReadOnlySeekableStream or a VirtualStream. These do not load the message entirely into memory.
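As a minimal sketch of the streaming idea, assuming plain .NET only (no BizTalk assemblies), the following reads one attribute value with a forward-only XmlReader instead of loading an XmlDocument. The class and method names are hypothetical and only serve as illustration.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper, for illustration only.
public static class StreamingReadSketch
{
    // Returns the given attribute of the first element with the given
    // local name, or null when no such element (or attribute) exists.
    // Only the current node is held in memory while reading.
    public static string ReadFirstAttribute(Stream input, string elementName, string attributeName)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == elementName)
                {
                    // Stop as soon as the element is found; the rest of the
                    // message is never read.
                    return reader.GetAttribute(attributeName);
                }
            }
        }
        return null;
    }
}
```

In a real pipeline component the body stream has been partly consumed after such a read, so it would still need to be rewound or wrapped in a seekable stream before the message is passed on.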

 

In order to do this, you will need the assembly Microsoft.BizTalk.Streaming.dll.

You can find this assembly in the GAC and copy it out by executing this in cmd.exe:

cd C:\Windows\assembly\GAC_MSIL\Microsoft.BizTalk.Streaming\3.0.1.0__31bf3856ad364e35
xcopy Microsoft.BizTalk.Streaming.dll "C:\Program Files (x86)\Microsoft BizTalk Server 2010"

 

For an example you can check out this post:

http://blog.codit.eu/post/2011/10/15/XslTransform-pipeline-component.aspx

For more info about the VirtualStream and its advantages, you can take a look at this topic:

http://msdn.microsoft.com/en-us/library/ee377071(v=bts.10).aspx

Posted in: BizTalk | Performance | XML



October 15, 2011 at 8:58 PM

The BizTalk Server SDK comes with an interesting pipeline component sample that allows executing Xsl mapping files in a pipeline.

This functionality can be of interest when the receive port or send port mappings are not executed at the desired time! Receive port mappings are executed after the pipeline execution, send port mappings are executed before the pipeline execution. Thanks to this pipeline component you get control over the exact time when a transformation gets executed.

Although this functionality can be very useful, the code of this sample contains some problems that need fixing before it can be used in production environments.

Let’s take a look at the business end of the sample, the TransformMessage method (you can find the original sample code in this location: <BizTalk installation directory>\SDK\Samples\Pipelines\XslTransformComponent):

 

private Stream TransformMessage(Stream stm)
{
	 MemoryStream ms = null;
	 string validXsltPath = null;
 
	 try 
	 {
  		// Get the full path to the Xslt file
  		validXsltPath = GetValidXsltPath(xsltPath);
  
  		// Load transform
  		XslTransform transform = new XslTransform();
  		transform.Load(validXsltPath);
    
  		//Load Xml stream in XmlDocument.
  		XmlDocument doc = new XmlDocument();
  		doc.Load(stm);
    
  		//Create memory stream to hold transformed data.
  		ms = new MemoryStream();
   
  		//Preform transform
  		transform.Transform(doc, null, ms, null);
  		ms.Seek(0, SeekOrigin.Begin);
 	}
	catch(Exception e) 
 	{
  		System.Diagnostics.Trace.WriteLine(e.Message);
  		System.Diagnostics.Trace.WriteLine(e.StackTrace);
  		throw e;
	}

 	return ms;
}


The signature of the method accepts a Stream as input parameter and returns a Stream as result. Perfect to keep everything streaming…

Then an XslTransform object is created. At first, this might look like a good idea, but XslTransform will load the message into memory internally! For smaller messages this will not cause any issues, but bigger messages will cause a System.OutOfMemoryException…

To get rid of this problem I replaced the XslTransform class with the BTSXslTransform class. This class uses the BizTalk transformation engine. The BizTalk transformation engine will only load small messages into memory and will use disk space if the message size reaches a certain threshold.

 

The following line instantiates the BizTalk mapper engine (add a reference to Microsoft.XLANGs.BaseTypes):

BTSXslTransform trans = new BTSXslTransform();


Use the following line to execute the transform:

trans.ScalableTransform(inputReader, null, vs, new XmlUrlResolver(), false);

 
The next problem we encounter in the original sample is this section:

XmlDocument doc = new XmlDocument();
doc.Load(stm);

This section reads the stream into memory again via an XmlDocument.
I replace this section with this line:

XmlTextReader inputReader = new XmlTextReader(stm);

Instead of feeding the message to the transform method as one memory chunk, I present it a stream that can be pulled by the BTSXslTransform object.

There is one last thing to change in this sample to make it suitable for large messages:

ms = new MemoryStream();

A MemoryStream is used to store the result of the transform operation. Unfortunately a MemoryStream uses memory to store the stream. This makes it unusable for large messages.

Luckily BizTalk Server comes with a stream class that has the same functionality as MemoryStream but one that uses disk space to store large streams. This class is the VirtualStream class (add a reference to Microsoft.BizTalk.Streaming).

Instantiate a VirtualStream object to hold the result of the transformation:

vs = new VirtualStream(VirtualStream.MemoryFlag.AutoOverFlowToDisk);

The AutoOverFlowToDisk flag instructs this stream to use disk space for large messages.
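To make the overflow idea concrete, here is a minimal sketch in plain .NET of a stream that buffers in memory and moves to a temporary file once a threshold is exceeded. This is not the VirtualStream source code and makes no claim about how VirtualStream is implemented internally; the class name OverflowToDiskStream, the IsOnDisk property and the threshold parameter are all assumptions made for illustration.

```csharp
using System;
using System.IO;

// Hypothetical illustration of "overflow to disk"; not the BizTalk VirtualStream.
public sealed class OverflowToDiskStream : Stream
{
    private readonly long threshold;
    private Stream inner = new MemoryStream();

    public OverflowToDiskStream(long thresholdBytes)
    {
        threshold = thresholdBytes;
    }

    // True once the data has been moved to a temporary file.
    public bool IsOnDisk { get { return inner is FileStream; } }

    public override void Write(byte[] buffer, int offset, int count)
    {
        // Switch to disk before a write would push the data past the threshold.
        if (!IsOnDisk && inner.Position + count > threshold)
        {
            SwitchToDisk();
        }
        inner.Write(buffer, offset, count);
    }

    private void SwitchToDisk()
    {
        // The temporary file is deleted automatically when the stream is closed.
        var file = new FileStream(Path.GetTempFileName(), FileMode.Create,
            FileAccess.ReadWrite, FileShare.None, 4096, FileOptions.DeleteOnClose);
        long position = inner.Position;
        inner.Position = 0;
        inner.CopyTo(file);          // copy what was buffered in memory so far
        file.Position = position;    // continue at the same position
        inner.Dispose();
        inner = file;
    }

    // Everything else is delegated to whichever stream is currently active.
    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return true; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { return inner.Length; } }
    public override long Position
    {
        get { return inner.Position; }
        set { inner.Position = value; }
    }
    public override void Flush() { inner.Flush(); }
    public override int Read(byte[] buffer, int offset, int count) { return inner.Read(buffer, offset, count); }
    public override long Seek(long offset, SeekOrigin origin) { return inner.Seek(offset, origin); }
    public override void SetLength(long value) { inner.SetLength(value); }

    protected override void Dispose(bool disposing)
    {
        if (disposing) { inner.Dispose(); }
        base.Dispose(disposing);
    }
}
```

Callers only see a seekable Stream, so a consumer such as a transform can write to it without knowing whether the data ended up in memory or on disk.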

 

The changed TransformMessage function now looks like this:

 

private Stream TransformMessage(Stream stm)
{
    VirtualStream vs = null;
    string validXsltPath = null;

    try
    {
        // Get the full path to the Xslt file
        validXsltPath = GetValidXsltPath(xsltPath);
        XmlTextReader stylesheet = new XmlTextReader(validXsltPath);

        // Load transform
        BTSXslTransform trans = new BTSXslTransform();
        trans.Load(stylesheet);

        // Wrap the input stream in a reader so the transform can pull from it
        XmlTextReader inputReader = new XmlTextReader(stm);

        // Create virtual stream to hold transformed data (uses disk for large results)
        vs = new VirtualStream(VirtualStream.MemoryFlag.AutoOverFlowToDisk);

        // Perform transform
        trans.ScalableTransform(inputReader, null, vs, new XmlUrlResolver(), false);

        vs.Seek(0, SeekOrigin.Begin);
    }
    catch (Exception e)
    {
        System.Diagnostics.Trace.WriteLine(e.Message);
        System.Diagnostics.Trace.WriteLine(e.StackTrace);
        // Rethrow with "throw;" so the original stack trace is preserved
        throw;
    }
    return vs;
}

 

When this code is compiled into a pipeline component, it allows executing large transformations exactly where you want them in the BizTalk pipeline infrastructure. This component will keep memory consumption flat, even when processing very large Xml files.

 

The source code of this article can be downloaded from this post as an attachment.

 

XslTransform.zip (30,62 kb)

 

Peter Borremans

Posted in: BizTalk | Performance



By Sam
January 22, 2010 at 7:12 AM

 

A lot of custom pipeline development involves reading message values, mostly using Xpath queries.  As we all know, it is a best practice to implement pipeline components in a streaming way, because of memory usage.  Luckily BizTalk comes with a very nice implementation of streaming Xpath.  This logic is in the XPathMutatorStream (part of Microsoft.BizTalk.Streaming.dll).

This post shows what Xpath functionality is supported and (more importantly) what is not supported in this stream.

Usage

I wrote a sample pipeline component that reads an XML file from disk containing a collection of promoted properties and their Xpath statements.

<Promotions>
  <Property enabled="false" name="name" propNamespace="ns" xPath="" />
</Promotions>

 

The component keeps all the Xpaths and properties in a collection and passes the Xpaths to the XPathMutatorStream.  This stream will call a delegate function in the pipeline component, where we will promote the property, linked to that Xpath with the Xpath value.

Code

Collection of properties

We maintain a list of PropertyConfig objects and a reference to the BizTalk message.

 

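class PropertyConfig
{
    public string PropertyName { get; set; }
    public string PropertyNamespace { get; set; }
    public string XPath { get; set; }
}

public class XpathMutatorTester : IComponent, IPersistPropertyBag, IBaseComponent, IComponentUI
{
    // Configured properties, one per Xpath that is passed to the stream
    private List<PropertyConfig> properties = new List<PropertyConfig>();
    // Reference to the message whose context receives the promoted values
    private IBaseMessage biztalkMessage = null;
}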

 

Execute method

 

public IBaseMessage Execute(IPipelineContext pContext, IBaseMessage pInMsg)
{

 

We start by reading the (hardcoded) XML file and interpreting it, using LINQ to XML.  Notice that we also add the Xpaths to the XPathCollection object that will be passed to the XPathMutatorStream.

 

// Set variables
biztalkMessage = pInMsg;
XmlReader reader = XmlReader.Create(pInMsg.BodyPart.Data);
XPathCollection xpaths = new XPathCollection();
// Load configuration file
var xmlConfig = XDocument.Load(@"d:\Projects\R&D\XPathMutatorStreamExample\Sample.xml");
// Load all properties that are enabled
var propertyList = from p in xmlConfig.Elements("Promotions").Elements("Property")
                   where p.Attribute("enabled").Value == "true"
                   select p;
// Load all properties to collection
foreach (var property in propertyList)
{
    PropertyConfig config = new PropertyConfig()
    {
        PropertyName = property.Attribute("name").Value,
        PropertyNamespace = property.Attribute("propNamespace").Value,
        XPath = property.Attribute("xPath").Value
    };
    properties.Add(config);
    xpaths.Add(config.XPath);
}

 

The only thing left in this method is replacing the stream of the incoming message with the XPathMutatorStream and passing the delegate function that will be called once an Xpath is matched.

 

// Set delegate method that will be called when Xpath is found
ValueMutator mutator = new ValueMutator(handleXpathFound);
// Define the XPathMutator Stream
pInMsg.BodyPart.Data = new XPathMutatorStream(reader, xpaths, mutator);
return pInMsg;

 

The ValueMutator is the delegate function where the Xpath value gets promoted to the correct promoted property.

 

private void handleXpathFound(int matchIdx, XPathExpression matchExpr, string origVal, ref string finalVal)
{
    // Select property config from the configuration, based on the xpath expression
    var property = properties.First(p => p.XPath == matchExpr.XPath);
    biztalkMessage.Context.Promote(property.PropertyName, property.PropertyNamespace, origVal);
}

 

Supported Xpath functionality

The sample queries are tested on the following sample XML:

<PurchaseOrder>
  <Header number="1302" deliverydate="1/2/2010"></Header>
  <Party name="Microsoft" type="BY" />
  <Party name="CODit" type="SU" />
  <Lines>
    <Line id="1" amount="39" currency="USD">
      <enabled>true</enabled>
    </Line>
    <Line id="2" amount="29" currency="USD">
      <enabled>true</enabled>
    </Line>
    <Line id="3" amount="31" currency="USD">
      <enabled>true</enabled>
    </Line>
    <Line id="4" amount="9" currency="EUR">
      <enabled>true</enabled>
    </Line>
  </Lines>
</PurchaseOrder>

Basic Xpath queries. 

Queries without functions, sums or conditions are fully supported.

  • /PurchaseOrder/Header/@number
  • /PurchaseOrder/Lines/Line/@amount

Indexed Xpaths.

Indexed Xpaths seem to be supported by the XPathMutator implementation.  Indexed Xpaths are Xpaths that search for an element at a specific position.  The following sample worked.

  • /PurchaseOrder/Lines/Line[3]/@amount

Xpath queries with sub queries. 

Queries with ‘forward where conditions’ in one of the elements are supported.  The following expression is supported, because the name and the type attribute are on the same element.

  • /PurchaseOrder/Party[@type='SU']/@name

This query is also supported, because the currency attribute in the sub clause is one level higher (and is read earlier) than the enabled element that is being read.

  • /PurchaseOrder/Lines/Line[@currency='USD']/enabled

The following query is not supported, because the enabled element in the sub query is one level down, compared to the currency attribute we are reading.  The exception we get is: Cannot get the child value.

  • /PurchaseOrder/Lines/Line[enabled='true']/@currency[1]

The following query is also not supported, with the same exception (Cannot get the child value).  To be honest, I was expecting this one to work, because we are checking and getting the enabled element, which is on the same level.

  • /PurchaseOrder/Lines/Line[enabled='true']/enabled[1]

Bottom line, we can conclude that sub queries are only supported when the items in the sub query occur before the item that gets selected.
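The reason behind this conclusion becomes clearer when the supported query /PurchaseOrder/Lines/Line[@currency='USD']/enabled is written by hand against a forward-only reader. The following is a simplified sketch in plain .NET, not how the XPathMutatorStream is implemented; the class and method names are hypothetical, and it matches on element names only instead of the full path.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper, for illustration only.
public static class ForwardOnlySketch
{
    // Collects the text of every <enabled> element inside a <Line>
    // whose currency attribute equals the given value.
    public static List<string> EnabledValuesForCurrency(Stream input, string currency)
    {
        var values = new List<string>();
        bool inMatchingLine = false;
        bool inEnabled = false;

        using (XmlReader reader = XmlReader.Create(input))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == "Line")
                {
                    // The condition is on an attribute, so it can be decided
                    // right here, before any child of <Line> is read.
                    inMatchingLine = reader.GetAttribute("currency") == currency;
                }
                else if (reader.NodeType == XmlNodeType.EndElement && reader.LocalName == "Line")
                {
                    inMatchingLine = false;
                }
                else if (inMatchingLine && reader.NodeType == XmlNodeType.Element && reader.LocalName == "enabled")
                {
                    inEnabled = !reader.IsEmptyElement;
                }
                else if (inEnabled && reader.NodeType == XmlNodeType.Text)
                {
                    values.Add(reader.Value);
                    inEnabled = false;
                }
            }
        }
        return values;
    }
}
```

For the unsupported query Line[enabled='true']/@currency the situation is reversed: when the reader is positioned on the Line element and its currency attribute, the enabled child has not been read yet, so a forward-only reader would have to buffer or look ahead to evaluate the condition.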

Functions.

It looks like functions are not supported, which makes sense in some cases (sum, average, total), but not in others (concat, substring…).

These Xpath queries threw an exception when they were added to the XPathCollection object.

  • sum(/PurchaseOrder/Lines/Line/@amount)
  • count(/PurchaseOrder/Lines/Line)
  • substring(/PurchaseOrder/Header/@number, 1, 2)

The use of namespaces

For the sake of simplicity, the above sample did not contain any namespaces in the XML, which is uncommon of course.  Even with namespaces, I prefer to keep the short notation used above, with a namespace alias in front of each element name.

Luckily, defining namespace aliases is possible by using the NamespaceManager on the XPathCollection object:

 

XmlReader reader = XmlReader.Create(pInMsg.BodyPart.Data);
XPathCollection xpaths = new XPathCollection();
xpaths.NamespaceManager = new XmlNamespaceManager(reader.NameTable);
xpaths.NamespaceManager.AddNamespace("po", "purchaseOrder");

 

 

This allows executing the Xpath as follows:

  • /po:PurchaseOrder/po:Header/@number

Disadvantages and risks of the streaming xpath approach

While we all agree that the XPathMutatorStream is well suited to keep memory usage low and to support larger messages, there are quite a few pitfalls in using this component in a truly streaming fashion.

  • In the sample, the properties only get promoted when the stream is being read.  This makes it impossible to read the property in a component further down the pipeline while the stream has not yet been consumed.
  • It is difficult to check whether all Xpaths have been found, because this check should only occur when the message is fully read (which ideally is done by the Messaging Agent).

Using Xpaths in the CODit products

A lot of our products (especially Transco & Matrix) make intensive use of Xpath for lookup, database translation, routing…  (more information can be found here)

We have made the design decision not to use the XPathMutatorStream in our products, because we want to give the full functionality of Xpath to our customers and we prefer functionality over performance.

All the above Xpaths are supported in our products (including functions) and we also extended the functionality by adding the possibility of concatenating fields and constants, by using the ‘+’ sign.

Conclusion

The XPathMutatorStream is a great class to read basic Xpaths on large messages in pipeline components that are built for specific (pre-known) message types.  To make use of it in generic components with configurable Xpaths, it might lack too much flexibility. 

Sam Vanhoutte, CODit

 


By sam
December 9, 2009 at 8:18 AM

At my customer, we saw very strange behavior in the BizTalk environment (2 BTS 2006 R2’s running on a SQL 2005 SP2 cluster).

 

We had an orchestration that took about 1000 milliseconds to execute at the beginning of our performance test (20000 executions per hour of this orchestration).

After many hours of testing, we started noticing a performance degradation.

After 2-3 hours the orchestration took an average of 2000 milliseconds to execute, and after 8 hours already 5000 milliseconds! It kept getting worse and never recovered.

 

I did very detailed monitoring of this environment and finally found the problem…

 

The problem is related to SQL 2005 SP2.  This version of SQL contains a bug that leaves the TokenAndPerUserStore cache uncleared (these cache entries are used for cumulative permission checks for queries). As this cache gets bigger and bigger, queries take longer and longer to complete.

When the response times became high (> 5000 milliseconds after hours of testing), I cleared this cache.  The result was immediate: performance went back to normal!

 

This bug is fixed with a cumulative update package or by installing SP3 of SQL Server.