Problem
I am encountering the following SQL Server transactional replication error:
"The row was not found at the Subscriber when applying the replicated UPDATE command for Table '[dbo].[REPLT_1]' with Primary Key(s): [ID] =10 (Source: MSSQLServer, Error number:20598)".
How can I resolve this replication error?
Solution
This tip is a continuation of the earlier tip, which describes steps to fix transactional replication when it fails due to primary key violations. We will perform a demo of a replication failure, specifically when the row is not found in the subscriber database while performing an update operation on the publisher database.
This tip assumes you have prior knowledge of transactional replication and are comfortable configuring transactional replication between two databases. Refer to these replication tips on MSSQLTips to learn more. Additionally, Robert Pearl wrote an excellent article regarding the common data consistency errors faced in transactional replication. Another excellent tip, by Jeffrey Yao, uses PowerShell automation to fix errors of this type in replication.
Configure Transactional Replication and Create Tables for Replication
As a first step, ensure transactional replication is configured. Once done, use the sample script below to create a few tables in the publisher database that can be used for replication.
CREATE TABLE [dbo].[REPLT_1]( [ID] [int] NOT NULL, [newIDs] [int] NULL, [ContactPerson] [nvarchar] (20) NULL PRIMARY KEY CLUSTERED ( [ID] ASC )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY] ) ON [PRIMARY] GO
Add Article to Transactional Replication
Next, add the newly created table to replication. Refer to this tip for more information on how to setup SQL Server transactional replication. Now, using the script below, insert values into the table and confirm replication is working as expected.
INSERT INTO [dbo].[REPLT_1] ([ID] ,[newIDs] ,[ContactPerson]) VALUES (1 ,1 ,'ddd') GO
The sample output of this table:
Confirm that the tables are replicating as expected by checking the replication monitor.
Simulate Replication Failure - Update the Subscriber Table Primary Key
In the subscriber database, perform an update to a primary key value in the existing row by using the script below.
--Run this in the subscriber database update [dbo].[REPLT_1] set ID=13 where ID=10
Below is the output from the subscriber table.
Using the script below, we updated the primary key value in the subscriber database to 13, which was originally 10.
-- Run this on publisher DB update [dbo].[REPLT_1] set ID=112 where ID=10
Replication Errors Begin - Unable to Find the Corresponding Row
As expected, replication starts to fail. After a few minutes, check the replication status through the Replication Monitor, as seen in the screenshot below. This error message will display � "The row was not found at the Subscriber when applying the replicated UPDATE command for Table '[dbo].[REPLT_1]' with Primary Key(s): [ID] =10 (Source: MSSQLServer, Error number:20598)".
The error number is 20598.
The error message indicates that the issue is with the primary key ID 10 on the publisher database. As expected, replication fails as it cannot locate the primary key ID value of 10 in the subscriber database.
Using the script below, you can view the details in the distribution database by filtering by the error code to get the details of the replication failure.
Use Distribution go select * from dbo.MSrepl_errors where error_code in ('20598')
You can see that the replication is failing continuously with the error because the value 10 is not found in the subscriber.
Options to Collect Additional Information Regarding the Replication Errors
As described in the previous tip, you can investigate further using the sp_browsereplcmds command. To identify additional details of the failed transaction, copy the transaction sequence number from the replication monitor (below).
Replace the @xact_seqno_start and @xact_seqno_end with the transaction sequence number copied from the replication monitor.
Use distribution go exec sp_browsereplcmds @xact_seqno_start = '0x0000002B00012DD0000600000000', @xact_seqno_end = '0x0000002B00012DD0000600000000', @publisher_database_id = '1'--Query MSpublisher_databases on distributor DB
Details of Internal Commands and Procedures
The update operation is failing for this value, '10'. You can view additional information if you carefully review the 'Command' section in the output.
{CALL [sp_MSupd_dboREPLT_1] (112,,,10,0x01)}
The section in the command column shows the actual procedure being executed on the subscriber database. You can copy this procedure and run it directly on the subscriber database to get the actual details of the stored procedure.
--Run this on the subscriber database sp_helptext sp_MSupd_dboREPLT_1
The last part of the sp_MSupd procedure refers to the table impacted by replication errors. In this case, the table name is REPLT_1. The output on the subscriber is below.
The full contents of the procedure are below.
create procedure [sp_MSupd_dboREPLT_1] @c1 int = NULL, @c2 int = NULL, @c3 nvarchar(20) = NULL, @pkc1 int = NULL, @bitmap binary(1) as begin declare @primarykey_text nvarchar(100) = '' if (substring(@bitmap,1,1) & 1 = 1) begin update [dbo].[REPLT_1] set [ID] = case substring(@bitmap,1,1) & 1 when 1 then @c1 else [ID] end, [newIDs] = case substring(@bitmap,1,1) & 2 when 2 then @c2 else [newIDs] end, [ContactPerson] = case substring(@bitmap,1,1) & 4 when 4 then @c3 else [ContactPerson] end where [ID] = @pkc1 if @@rowcount = 0 if @@microsoftversion>0x07320000 Begin if exists (Select * from sys.all_parameters where object_id = OBJECT_ID('sp_MSreplraiserror') and [name] = '@param3') Begin set @primarykey_text = @primarykey_text + '[ID] = ' + convert(nvarchar(100),@pkc1,1) exec sp_MSreplraiserror @errorid=20598, @param1=N'[dbo].[REPLT_1]', @param2=@primarykey_text, @param3=13233 End Else exec sp_MSreplraiserror @errorid=20598 End end else begin update [dbo].[REPLT_1] set [newIDs] = case substring(@bitmap,1,1) & 2 when 2 then @c2 else [newIDs] end, [ContactPerson] = case substring(@bitmap,1,1) & 4 when 4 then @c3 else [ContactPerson] end where [ID] = @pkc1 if @@rowcount = 0 if @@microsoftversion>0x07320000 Begin if exists (Select * from sys.all_parameters where object_id = OBJECT_ID('sp_MSreplraiserror') and [name] = '@param3') Begin set @primarykey_text = @primarykey_text + '[ID] = ' + convert(nvarchar(100),@pkc1,1) exec sp_MSreplraiserror @errorid=20598, @param1=N'[dbo].[REPLT_1]', @param2=@primarykey_text, @param3=13233 End Else exec sp_MSreplraiserror @errorid=20598 End end end
If you look at the procedure closely, you will see it has parameters based on the number of columns in the article. We can simulate the error in the replication monitor by executing the procedure directly in the subscriber database. Make sure to provide the correct parameters.
--Run this on the subscriber database exec [sp_MSupd_dboREPLT_1] 112,111,'MM',10,1
The output of the error from the subscriber database is below.
Earlier in this tip, we updated the publisher table, where the primary key value was 10, to 112. You can see the actual value in the publisher table from this command section column. If you perform a Select on the publisher database for a value of 10, it will yield no results since there is none.
select * from [dbo].[REPLT_1] where ID=10
However, from the command section of the output of sp_browsereplcmds, you know the primary key value of 10 was updated to 112. So, if you query for 112, you can locate the row.
Another option to view the error details is to review the distribution agent job directly on the Job Activity Monitor. From here, click on the correct distribution agent job for that database and view the job history.
Right-click on the distribution agent job, review the job steps (as shown) and view the job history. You will notice the details of the values causing replication to fail.
As you can see, the distribution agent job repeatedly tries and fails because it cannot find the correct row.
Steps to Fix the Replication Errors
To fix the replication errors, use the sample script below to enter the value into the subscriber database from the publisher database.
insert into [Subsriber].[Subscriber_database].[dbo].[REPLT_1] select * from [dbo].[PublisherDB] where ID=112
Unfortunately, only doing this one step will not fix the replication. Once the value is inserted into the subscriber database, it needs to be updated to reflect the missing row on which the replication is failing. In this case, the value needs to be updated to 10.
--Run this on subscriber database update [dbo].[REPLT_1] set ID=10 where ID=112
You can refresh the Replication Monitor to check the status or stop and start the correct distribution agent job. In a short while, you will see that replication has recovered, and the replication monitor's status has also changed.
You can use the command sp_replmonitorsubscriptionpendingcmds to check for undistributed commands. In this demo, we fixed issues related to one entry in the subscriber database. But you may have to deal with multiple entries in a real scenario.
Option to Skip the Errors in the Distribution Agent Profile
As mentioned in the previous tip, another option is to skip the data consistency errors in the distribution agent. It is generally not recommended to change this setting straight away as it is crucial to analyze the reasons and the source for the replication failures. That way, the issue can be fixed permanently, and there is no need to worry about missing or skipping transactions. Refer to the tip described earlier to ignore the errors. While using the custom profile, make sure to use the correct error number, which in this case is 20598. Click on the correct subscription from the replication monitor to go to the correct distribution agent profile. Click to create the new custom profile and fill in the details below.
Enter error number 20598, which is related to the missing rows in the subscriber database. Click OK. Be sure to change the agent profile to use the new custom profile.
Using this method, you only skip specific data consistency errors related to the correct error number.
NOTE: After creating the new custom profile, restart the distribution agent job for the change to take effect. You can use the same script provided in the earlier tip. Or, locate the distribution agent job in the Job activity monitor and stop and start the job.
-- To STOP the Distribution Agent: sp_MSstopdistribution_agent @publisher, @publisher_db, @publication, @subscriber, @subscriber_db --To START the Distribution Agent: sp_MSstartdistribution_agent @publisher, @publisher_db, @publication, @subscriber, @subscriber_db
If replication still does not work, go to the subscription node as shown. Right-click and click on View Synchronization Status.
Make sure to stop (1) and start (2) as shown. Depending on the number of outstanding transactions to be delivered, it may take time to complete.
Once the synchronization process is complete, recheck the replication monitor to verify the status of the undistributed commands. Using this option, you should be able to get the replication working again. Also, on the replication monitor, check to see if any data consistency errors are being skipped, as shown below.
In this tip, you saw a demo of a known issue in transactional replication due to missing rows in the subscriber database. The next few tips will continue to discuss other errors that we may encounter in transactional replication.
No comments:
Post a Comment