SQL Database Replication : Repairing

SQL Database Replication : Repairing

I recently came across an issue where SQL database replication was failing between two nodes. The replication queue was full of un-replicated transactions,I could verify this looking at the following tables:

  • MSrepl_queuedtraninfo
  • MSreplication_queue

In this case there was over 8 million unreplicated transactions. Due to the shear number of unreplicated transactions I decided that it would probably be best to simply re-seed the database on the remote node. The process below outlines how this was achieved.

Please note that this process requires permissions to be reset on the subscriber once completed.

1. Perform backup of publisher (COMMS-S01) and subscriber (COMMS-M01) database.
2. Stop OTS (stop OTS service on COMMS-TRANS01/02 and kill any OTS.exe processes via task manager)
3. Stop the log reader and distribution replication agents.
4. Use the following commands to flush the replication queue: [less than one minute to run]
a. EXEC sp_repldone @xactid = NULL, @xact_segno = NULL, @numtrans = 0, @time = 0, @reset = 1
b. EXEC sp_replflush
5. Verify that the queue is now empty on the subscriber: [less than one minute to run]
a. EXEC sp_replshowcmds
b. DBCC opentrans
6. Now un-publish and truncate the transaction log on the publisher (comms-s01):
a. BACKUP LOG OTSTrack WITH TRUNCATE_ONLY
b. DBCC SHRINKFILE (OTSTrackLog, NOTRUNCATE)
c. DBCC SHRINKFILE (OTSTrackLog, TRUNCATEONLY)
7. Finally shrink the database on both comms-s01:
a. DBCC SHRINKDATABASE(OTSTrack, 10)
8. Drop the OTSTrack database on COMMS-M01 (standby)- this will be recreated during replication setup.
9. Check the rowcount of the following tables in the distribution database (should not be 8 million):
a. MSrepl_queuedtraninfo
b. MSreplication_queue
10. Re-republish the database and configure sunscriptions as per GL documentation.
11. Check that replication agents are started, if not start log reader first then distribution agent.
12. Re-create user accounts on COMMS-M01 using attached ‘script.’
13. Run Stored Procedure Permissions Script on subscriber (comms-s01)
14. Start OTS (start OTS service on COMMS-TRANS01/02)