Distributed File System Replication (DFSR) on Windows Server 2012 R2: Bug KB 2951262

I came across a bug in Windows Server 2012 R2 where a spoke server unexpectedly stopped replicating to the hub server. In total, three DFSR servers took part in replication within a single replication group.

The following diagram shows an overview of the topology and the server experiencing issues:

Without warning, SPOKE2 stopped replicating to the hub server, HUBSERVER, and stopped responding to DFS health reports, which caused the health report generation process to hang indefinitely.


After the DFS-R service on SPOKE2 is restarted, SPOKE2 is reported as being in an Indeterminate state for approximately 2-3 hours.


After being in an Indeterminate state for 2-3 hours, the status changes to "Auto Recovery" for approximately 6 more hours. During this time the DFS-R service generates a large amount of disk activity as it goes through and checks all the files. This can be observed using Windows Resource Monitor.

The Auto Recovery process never completes successfully, yet no errors are logged in the event log on SPOKE2. Rebooting SPOKE2 or restarting the DFS-R service sends the server back to an Indeterminate state for another 2-3 hours, after which the Auto Recovery process starts again.
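The state of each replicated folder can also be read from WMI rather than waiting on a health report that may hang. The following is a minimal sketch, not a definitive tool: it assumes the `wmic` command is available on the server, that the `DfsrReplicatedFolderInfo` class in the `root\microsoftdfs` namespace is queried, and that `wmic` CSV output is captured as ordinary text. The function names are my own.

```python
import csv
import subprocess

# State codes reported by the DfsrReplicatedFolderInfo WMI class.
DFSR_STATES = {
    0: "Uninitialized",
    1: "Initialized",
    2: "Initial Sync",
    3: "Auto Recovery",
    4: "Normal",
    5: "In Error",
}

# Assumption: wmic is on the PATH and run locally on the DFSR member.
WMIC_COMMAND = [
    "wmic",
    r"/namespace:\\root\microsoftdfs",
    "path",
    "dfsrreplicatedfolderinfo",
    "get",
    "replicationgroupname,replicatedfoldername,state",
    "/format:csv",
]


def parse_folder_states(csv_text):
    """Return (group, folder, state name) tuples parsed from wmic CSV output."""
    # wmic pads its output with blank lines and stray carriage returns.
    lines = [line.strip() for line in csv_text.splitlines() if line.strip()]
    results = []
    for row in csv.DictReader(lines):
        # Lower-case the column names so header casing does not matter.
        row = {key.lower(): value for key, value in row.items()}
        code = int(row["state"])
        name = DFSR_STATES.get(code, "Unknown (%d)" % code)
        results.append(
            (row["replicationgroupname"], row["replicatedfoldername"], name)
        )
    return results


def query_folder_states():
    """Run wmic on the local server and return the parsed folder states."""
    completed = subprocess.run(
        WMIC_COMMAND, capture_output=True, text=True, check=True
    )
    return parse_folder_states(completed.stdout)
```

Calling `query_folder_states()` on SPOKE2 every few minutes would show whether a folder ever leaves "Auto Recovery" and reaches "Normal".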

Resolution

This issue is caused by a bug in Windows Server 2012 R2, documented in Microsoft KB 2951262.

http://support.microsoft.com/kb/2951262

You must manually request the hotfix, which Microsoft will email to you, and install it on the affected servers. After the hotfix is installed, the server reverts to an Indeterminate state and then starts Auto Recovery again. This time, however, once the Auto Recovery process finishes, replication resumes as normal.
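To confirm which servers already have the hotfix, the installed update list can be checked from a script. This is a rough sketch under stated assumptions: it relies on `wmic qfe get HotFixID` being available on the server, and the helper names are hypothetical. A later update may supersede this hotfix, so a missing entry is a prompt to investigate rather than proof that the server is unpatched.

```python
import subprocess

HOTFIX_ID = "KB2951262"


def hotfix_listed(qfe_text, hotfix_id=HOTFIX_ID):
    """Return True if hotfix_id appears as an entry in `wmic qfe` output."""
    wanted = hotfix_id.upper()
    # Each non-empty line holds one hotfix ID, padded with whitespace.
    return any(line.strip().upper() == wanted for line in qfe_text.splitlines())


def local_hotfix_listed(hotfix_id=HOTFIX_ID):
    """Query the local server's installed updates via wmic (assumed present)."""
    completed = subprocess.run(
        ["wmic", "qfe", "get", "HotFixID"],
        capture_output=True,
        text=True,
        check=True,
    )
    return hotfix_listed(completed.stdout, hotfix_id)
```

Running `local_hotfix_listed()` on each DFSR member gives a quick yes or no before and after the install.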