Over the last year and a half of monitoring and maintaining several SQL Server Transactional Replication instances, I’ve found a good system for tracking down the sometimes mysterious causes of errors. When I ran into my first error, I didn’t even know where to begin. I now use the following steps to not only find the replication transaction (the insert/update/delete that failed), but also the origination transaction (what the actual user’s command was). My methods all originated from a great post I found at http://www.sqlservercentral.com/blogs/robert_davis/2010/03/27/Making-Sense-of-Replication-Errors-in-Replication-Monitor/.
My system is based upon trying to find causes of duplicate key failures, but you could use it to find the source of virtually any error.
- Copy the transaction sequence number out of the error details in the Replication Monitor. This should be listed with the error in the Agent Details window.
- Use the sp_browsereplcmds system stored procedure while connected to the distributor (and the distribution database). I like to use the procedure in the link I posted above to find results.
- Look for the command definition in the results. Go to the subscriber database and check the definition of the stored procedure to find the table being inserted/updated/deleted to.
- Run sp_depends on the table found in step 3 to find potential procedures or triggers that caused the stored procedure from step 3 to be executed.
- Look at the definitions of any stored procedures or triggers found in step 4 to be sure they are flagged with “not for replication”.