Slow performance and large MessageBox

This was a tricky one.  We’d seen performance dropping over a period of days.  Symptoms were as follows:

  • Constant SQL churning.
  • Large MessageBox Database size.
  • Large MessageBox backups for system volume.
  • Horrendously slow response times.

I decided at first it was the lack of the “Back up BizTalk Databases” job (this job archives completed messages in the MB database as it runs).  Once enabled, performance got worse as the large MessageBox created large backup files and lots of processing.  Of course, running out of space on the backup drive didn’t help, but that’s another discussion.

In digging (using SQL Query Analyzer), I found thousands of messages (140,000 in our case), all dehydrated, in the following tables:

  • SELECT Count(*) AS BizTalkServerIsolatedHostQ FROM BizTalkServerIsolatedHostQ
  • SELECT Count(*) AS DynamicStateInfo_BizTalkServerIsolatedHost FROM DynamicStateInfo_BizTalkServerIsolatedHost
  • SELECT Count(*) AS Instances FROM Instances
  • SELECT Count(*) AS MessageParts FROM MessageParts
  • SELECT Count(*) AS MessageRefCountLogTotals FROM MessageRefCountLogTotals
  • SELECT Count(*) AS Spool FROM Spool

So, I identified two potential causes:

  • Long-running scope of STP orchestrations (I don’t think this caused it, but it’s something we should try to resolve, time allowing).
  • BizTalk defect as described in MSKB 867449.  Hotfix released on 6/11/2004; but I don’t see it in any patches as of this writing.  This is an actual defect, relating to a BizTalk server that loses connection spamming the message tables once connection is re-established.  For the REALLY sleepy, read this article:
    http://support.microsoft.com/default.aspx?scid=kb;en-us;867449&Product=bizt2002

As it turns out, In this environment, the BizTalk and SQL servers are on Virtual Server .VHD files and undergo a nightly backup.  If the BizTalk server starts sooner than the SQL server, BizTalk frantically (and correctly) attempts to reconnect.  However, as the defect article describes, it pushes large numbers of messages into the MB.

The manual fix:

  • Use HAT (Operations, Messages with no filter) and delete dehydrated messages.  The downside: HAT can only delete 2047 at a time .. and is S-L-O-W.
  • Write something in WMI (I found some incomplete examples, but couldn’t use them).

I settled on backing up the MessageBox database and execute the following SQL statements in SQL Query Analyzer:

  • DELETE FROM BizTalkServerIsolatedHostQ
  • DELETE FROM DynamicStateInfo_BizTalkServerIsolatedHost
  • DELETE FROM Instances
  • DELETE FROM MessageRefCountLogTotals
  • DELETE FROM MessageParts
  • DELETE FROM Spool

This has the effect of orphaning all the dehydrated message parts.  After this, run all the maintenance jobs in SQL Enterprise Manager (Management, SQL Server Agent, Jobs) to tidy up the orphan references. Restart both servers so they’ll recover, and watch performance on each as they get to know each other again.

I’d appreciate comments as to the veracity of this solution; did I get all the tables, am I causing other damage, etc.

About Michael Coates
I am a pragmatic evangelist. The products, services and solutions I write about fulfill real-world expectations and use cases. I stay up-to-date on real products I use and review, and share my thoughts here. I apply the same lens when designing an architecture, product or when writing papers. I am always looking for ways that technology can create or enhance a business opportunity .. not just technology for technology's sake. My CV says: Seasoned technology executive, leveraging years of experience with enterprise and integration architectural patterns, executed with healthy doses of business acumen and pragmatism. That's me. My web site says: Technology innovations provide a myriad of opportunities for businesses. That said, having the "latest and greatest" for its own sake isn't always a recipe for success. Business successes gained through exploiting innovation relies on analysis of how the new features will enhance your business followed by effective implementation. Goals vary far and wide: streamlining operations, improving customer experience, extending brand, and many more. In all cases, you must identify and collect the metrics you can apply to measure your success. Analysis must be holistic and balanced: business and operational needs must be considered when capitalizing on a new technology asset or opportunity.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: