Promote a group member to master secret server

If your master secret server (MSS) fails, here are the steps to promote a group member to MSS:

  • Change the master secret server name from the original to the new one using ssomanage: “ssomanage -updatedb UpdateFile”, where UpdateFile is an XML file that names the new master secret server.
  • Stop the ENTSSO service on the new master secret server.
  • Start the ENTSSO service on the new master secret server. It will recognize that it is the master secret server and that it has no secrets.
  • On the new master secret server, restore the backed-up master secret file using ssoconfig: “ssoconfig -restoresecret BackupFile”.

The new server is now the master of the group.
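The steps above can be scripted. The following is a minimal sketch, not a tested procedure: it builds the XML update file that ssomanage expects and lists the commands in the order given above. The element names (sso, globalInfo, secretServer) follow my recollection of the documented update-file format, so verify them against your SSO documentation. The server name NEWMSS and both file names are hypothetical, and the use of net stop/net start to restart the ENTSSO service is my assumption.

```python
import xml.etree.ElementTree as ET


def build_update_xml(new_server):
    """Return XML text naming the new master secret server.

    Element names are assumed from the documented ssomanage update-file
    format; confirm them before use.
    """
    name = new_server.strip()
    if not name:
        raise ValueError("server name must not be empty")
    sso = ET.Element("sso")
    global_info = ET.SubElement(sso, "globalInfo")
    ET.SubElement(global_info, "secretServer").text = name
    return ET.tostring(sso, encoding="unicode")


def promotion_commands(update_file, backup_file):
    """List the commands in the same order as the steps in the post."""
    return [
        'ssomanage -updatedb "%s"' % update_file,
        "net stop ENTSSO",   # assumption: service restarted with net stop/start
        "net start ENTSSO",
        'ssoconfig -restoresecret "%s"' % backup_file,
    ]
```

Write the result of build_update_xml("NEWMSS") to a file, then run the commands from promotion_commands by hand on the new server so you can watch each one succeed before moving on.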

Slow performance and large MessageBox

This was a tricky one.  We’d seen performance dropping over a period of days.  Symptoms were as follows:

  • Constant SQL churning.
  • Large MessageBox Database size.
  • MessageBox backups far larger than the system’s message volume would justify.
  • Horrendously slow response times.

At first I blamed the missing “Back up BizTalk Databases” job (this job archives completed messages in the MessageBox database as it runs).  Once I enabled it, performance got worse, because the large MessageBox produced large backup files and a lot of processing.  Of course, running out of space on the backup drive didn’t help, but that’s another discussion.

Digging in with SQL Query Analyzer, I found thousands of messages (140,000 in our case), all dehydrated.  These queries count the rows in the affected tables:

  • SELECT Count(*) AS BizTalkServerIsolatedHostQ FROM BizTalkServerIsolatedHostQ
  • SELECT Count(*) AS DynamicStateInfo_BizTalkServerIsolatedHost FROM DynamicStateInfo_BizTalkServerIsolatedHost
  • SELECT Count(*) AS Instances FROM Instances
  • SELECT Count(*) AS MessageParts FROM MessageParts
  • SELECT Count(*) AS MessageRefCountLogTotals FROM MessageRefCountLogTotals
  • SELECT Count(*) AS Spool FROM Spool
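Two of the tables above carry the host name (BizTalkServerIsolatedHost), so the same checks for another host need different table names. The sketch below generates the count queries for a given host. It assumes other hosts follow the same naming pattern as the one shown here (HostNameQ and DynamicStateInfo_HostName); the function names are mine, not part of any BizTalk tooling.

```python
import re

# Tables from the post that do not depend on the host name.
SHARED_TABLES = ["Instances", "MessageParts", "MessageRefCountLogTotals", "Spool"]


def host_tables(host):
    """Per-host tables, assuming the naming pattern seen in the post."""
    if not re.fullmatch(r"[A-Za-z0-9_]+", host):
        raise ValueError("unexpected characters in host name: %r" % host)
    return [host + "Q", "DynamicStateInfo_" + host]


def count_queries(host="BizTalkServerIsolatedHost"):
    """Build one SELECT Count(*) statement per table, in the post's order."""
    tables = host_tables(host) + SHARED_TABLES
    return ["SELECT Count(*) AS [%s] FROM [%s]" % (t, t) for t in tables]
```

Paste the output of count_queries() into SQL Query Analyzer against the MessageBox database, and keep the numbers so you can compare them after any cleanup.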

So, I identified two potential causes:

  • Long-running scope of STP orchestrations (I don’t think this caused it, but it’s something we should try to resolve, time allowing).
  • BizTalk defect as described in MSKB 867449.  A hotfix was released on 6/11/2004, but I don’t see it in any patches as of this writing.  This is an actual defect: a BizTalk server that loses its connection floods the message tables once the connection is re-established.  For the REALLY sleepy, read this article:
    http://support.microsoft.com/default.aspx?scid=kb;en-us;867449&Product=bizt2002

As it turns out, in this environment the BizTalk and SQL servers run from Virtual Server .VHD files and undergo a nightly backup.  If the BizTalk server starts sooner than the SQL server, BizTalk frantically (and correctly) attempts to reconnect.  However, as the defect article describes, it then pushes large numbers of messages into the MessageBox.

The manual fix:

  • Use HAT (Operations, Messages with no filter) and delete dehydrated messages.  The downside: HAT can only delete 2047 at a time, and it is S-L-O-W.
  • Write something in WMI (I found some incomplete examples, but couldn’t use them).

I settled on backing up the MessageBox database and executing the following SQL statements in SQL Query Analyzer:

  • DELETE FROM BizTalkServerIsolatedHostQ
  • DELETE FROM DynamicStateInfo_BizTalkServerIsolatedHost
  • DELETE FROM Instances
  • DELETE FROM MessageRefCountLogTotals
  • DELETE FROM MessageParts
  • DELETE FROM Spool

This has the effect of orphaning all the dehydrated message parts.  After this, run all the maintenance jobs in SQL Enterprise Manager (Management, SQL Server Agent, Jobs) to tidy up the orphan references. Restart both servers so they’ll recover, and watch performance on each as they get to know each other again.

I’d appreciate comments on the soundness of this solution: did I get all the tables, am I causing other damage, etc.

BizTalk 2004 Configuration Failures

I started this post to list suspect issues that may impact loading a BizTalk configuration (i.e., running configframework.exe in either a silent or interactive mode).  I may eventually split it into separate posts (WMI, security, permissions, etc.).  I am not trying to offer specific solutions to all failures in this post, although I do provide some suggestions and pointers to solutions where I’ve seen these issues.  This really is a collection of things you can confirm during the configuration steps.

Applying a BizTalk configuration file can be your delight or your downfall, and it has gone differently in almost every environment in which I’ve installed.  Kudos to the big brains at MS for spending their resources on all that’s great about BizTalk; I’m confident they’ll come around to my wish list of easier/faster/better configuration options.

So, that said, let’s look at what makes a configuration fail, easy to hard (note: these are from a multi-server deployment on a domain; some items will not apply to single-server deployments).

Easy (failures in this group typically roll back and do not require restoring a server to vanilla or running configframework with the /u switch):

  • Unreadable configuration file (immediate; confirm you can open the XML file in IE before attempting to apply it).
  • Missing configuration file or bad path.
  • Non-existent user accounts (immediate; confirm domain and account names).
  • Non-existent group accounts (immediate; confirm domain and group names).
  • Bad passwords (the wizard checks these as you enter them).
  • Non-existent SQL server, or leftover databases/MDFs already present on the SQL server.
  • SQL connectivity (this can be easy, medium, or very hard; the very hard case is losing the connection during the configuration step).

With these issues (for the most part), you either cannot complete the wizard, or configframework will fail on invocation with a pointer to your configuration file.  You will spot and resolve most of them while the wizard is running (confirm/create accounts, check passwords, remove databases or MDFs, etc.).  Note: BizTalk wants to connect to SQL over TCP/IP with Kerberos, so save some time: from the BizTalk server, ensure you can connect to the SQL server over port 1433 in SQL Enterprise Manager before you run the configuration wizard.
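Two of the easy checks lend themselves to a quick script you can run on the BizTalk server before starting the wizard. This is a rough sketch under a few assumptions: the function names are mine, the default port 1433 comes from the note above, and a successful TCP connect only shows the port is reachable. It says nothing about Kerberos or whether your account can log in, so still confirm with SQL Enterprise Manager.

```python
import socket
import xml.etree.ElementTree as ET


def config_is_well_formed(path):
    """True if the file exists and parses as XML (same idea as opening it in IE)."""
    try:
        ET.parse(path)
        return True
    except (OSError, ET.ParseError):
        # OSError covers a missing file or bad path; ParseError covers broken XML.
        return False


def sql_port_reachable(host, port=1433, timeout=3.0):
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, call config_is_well_formed with the path to your configuration file and sql_port_reachable with your SQL server’s name; if either returns False, fix that before invoking configframework.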

Medium (will likely require restoration to vanilla, and removal of the present configuration):

  • IIS Application Pool identity failure (if installing BAS, ensure you log on with the same account that the BAS application pool runs under).
  • Inadequate permissions on the SQL Server for the WSS (BAS) administration account (be sure this user has database creator and security administrator roles).
  • Improper group membership (when configuring with domain accounts, omitting the domain name for a particular service makes the configuration look for a local group that does not exist).

The result of these errors is a rollback, where BizTalk removes the configured components from the server, leaving behind the SQL databases and other artifacts.  See the post titled “Restoring BizTalk to a Vanilla State” to recover from these errors.
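The group-membership mistake in the medium list is easy to catch before you start. As a small sketch, the helper below flags any account or group name that lacks a DOMAIN\ prefix. It is my own illustration, not something the configuration wizard provides; the domain CONTOSO in the usage note is hypothetical, and names in user@domain form are not handled.

```python
def unqualified_accounts(accounts):
    """Return the names that are not written as DOMAIN\\name."""
    missing = []
    for name in accounts:
        domain, sep, account = name.partition("\\")
        # A qualified name needs a backslash with text on both sides of it.
        if not (sep and domain.strip() and account.strip()):
            missing.append(name)
    return missing
```

Given the list "CONTOSO\BizTalk Server Administrators" and "SSO Administrators", it returns only "SSO Administrators", the entry that would be treated as a local group.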

Hard (cannot remove present configuration, or configuration times out):

  • SQL Connection loss during configuration (has the result of putting the server in a half-configured state); look to network or WMI issues.
  • WMI recycle failure (I’ve seen some services refuse to stop and prevent WMI from recycling during configuration; the configuration times out and the process must be killed).

When configuration must be terminated, the server is left in a half-configured state: it can be neither unconfigured nor configured.  See the post titled “When Good Configurations go Bad” for the fix to this state, then restore the server to vanilla.  Confirm SQL connectivity and do test recycles of WMI-related services with the account under which you’re installing; some environments have a Group Policy (GPO) to prevent local administrators from managing services (ceding this responsibility to domain admins).

I am keen to receive any feedback and other posts.  Please submit comments.