NDMP backups of NetApp filer fail with error "Final error: 0xe000fefd - The media server has lost the network connection to the remote agent"

Article ID: 100025587

Description

Error Message

V-79-57344-65277 - The media server has lost the network connection to the remote agent.

NDMP Job Log Error:

NDMP Log Message: Mover encountered internal socket error.
NDMP Mover Halted: Internal Error

 

When the Backup Exec services are debugged with SGMON and NDMP debugging is enabled on the NetApp filer, the following information is observed in the debug logs:


Backup Exec SGMON Debug:

BENGINE: [ndmp\ndmpcomm]     - ndmp_readit: Caught message on closed connection. Socket 0x7e4 len 0xffffffff
BENGINE: [ndmp\ndmpcomm]     - ndmp_readit: ErrorCode :: 10053 : An established connection was aborted by the software in your host machine.
BENGINE: [ndmp\ndmpcomm]      - ERROR: processing message 0x705: error decoding arguments.
BENGINE: [ndmp\ndmpcomm]      - ERROR: ndmpcSendRequest->connection error
BENGINE: [ndmp\ndmpcomm]      - ERROR: ndmpSendRequest failed:
BENGINE: [loops]              - NASBackupBSDProcessor::UpdateByteCountStats(). GetBytesProcessedCount() failed with NDMP_CONNECT_ERR !!
BENGINE: [tpfmt]              - BEWSNdmpMover: MoverAbort called.
BENGINE: [tpfmt]              - BEWSNdmpMover: Mover halting. ndmpError = 7, internal_error = 258, mover_halt_reason = INTERNAL_ERROR
BENGINE: BEWSNdmpMover: Switching state ACTIVE ==> HALTED
BENGINE: [tpfmt]              - BEWSNdmpMover: Mover halted.
BENGINE: [loops]              - NDMP Log Message: Mover encountered internal socket error.
BENGINE: [loops]              - NDMP Notify Mover halted: Internal Error.
BENGINE: [ndmp\ndmpcomm]      - ERROR: ndmpcSendRequest->connection error
BENGINE: [ndmp\ndmpcomm]      - ERROR: ndmpSendRequest failed:
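
When an SGMON capture is large, it can help to count how often these signatures occur. The following is a minimal sketch, not a Backup Exec tool: the patterns are copied from the excerpt above, while the function name and the labels are hypothetical.

```python
import re

# Patterns copied from the SGMON excerpt above; the labels are illustrative only.
SIGNATURES = {
    "winsock_abort": re.compile(r"ErrorCode :: 10053\b"),
    "connection_error": re.compile(r"ndmpcSendRequest->connection error"),
    "mover_halted": re.compile(r"mover_halt_reason = INTERNAL_ERROR"),
}


def scan_sgmon(lines):
    """Return {label: count} for each signature found in an iterable of log lines."""
    counts = {label: 0 for label in SIGNATURES}
    for line in lines:
        for label, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[label] += 1
    return counts
```

For example, passing an open file handle for a saved SGMON log to scan_sgmon() returns the three counts. Non-zero counts only indicate that the log matches this article's pattern; they do not identify the network fault itself.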
 

NetApp NDMP Debug Log:

[ndmpd:141]: IO exception: Connection reset by peer: Connection reset by peer
[ndmpd:141]: Connection reset by peer: Connection reset by peer
[ndmpd:141]: Cleaning up connection
[ndmpd:141]: Error sending notify shutdown message
[ndmpd:141]: Ndmpd session closed successfully
[ndmpd:141]: Calling NdmpServer.kill

Cause

This failure is caused by a network configuration issue or by poor network performance between the Backup Exec media server and the NetApp filer.

 

Resolution

1) If the network cards on the Backup Exec media server are teamed, disable the teaming.

2) Disable the TCP Chimney Offload and Receive Side Scaling (RSS) features on the Backup Exec media server, then reboot the server.


Open an elevated command prompt and issue the following commands:


netsh int tcp set global chimney=disabled

netsh int tcp set global rss=disabled


Information about the TCP Chimney Offload, Receive Side Scaling, and Network Direct Memory Access features in Windows Server 2008
http://support.microsoft.com/kb/951037
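
After the reboot, the current state can be checked with `netsh int tcp show global`. The sketch below parses that output and reports which of the two features are still not disabled. It assumes English-language output with "Name : value" lines such as "Chimney Offload State" and "Receive-Side Scaling State"; the function names are hypothetical.

```python
import subprocess


def parse_tcp_globals(text):
    """Parse 'Name : value' lines from `netsh int tcp show global` output."""
    settings = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep and value.strip():
            settings[name.strip().lower()] = value.strip().lower()
    return settings


def offload_still_enabled(settings):
    """Return the Chimney/RSS setting names whose value is not 'disabled'."""
    return [
        name
        for name, value in settings.items()
        if ("chimney" in name or "receive-side scaling" in name)
        and value != "disabled"
    ]


def read_tcp_globals():
    """Run netsh on the media server (Windows only) and return the parsed settings."""
    completed = subprocess.run(
        ["netsh", "int", "tcp", "show", "global"],
        capture_output=True, text=True, check=True,
    )
    return parse_tcp_globals(completed.stdout)
```

Calling offload_still_enabled(read_tcp_globals()) on the media server returns an empty list when both features report "disabled". On localized Windows installations the setting names differ, so the substring match would need adjusting.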

 

3) If Broadcom network cards are present on the Backup Exec media server, check the properties of the network card and make the following adjustments if needed:


Flow Control: Disable -> change this setting to Auto (default)

Receive Buffers: 0 -> change this setting to 750 (default)


4) Enable the TCP NoDelay feature on the NetApp NDMP server.


 

Issue/Introduction

V-79-57344-65277 - The media server has lost the network connection to the remote agent. During a backup of a NetApp NDMP filer, the job may fail with "The media server has lost the network connection to the remote agent". This is primarily observed when the network connection between the Backup Exec media server and the NetApp filer is poor, or when the network or the network cards are misconfigured.
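
As a rough first check of the connection described here, the media server can time repeated TCP connects to the filer's NDMP port. This is a minimal sketch under stated assumptions: port 10000 is the registered NDMP port but the filer may be configured differently, and the function name and defaults are hypothetical.

```python
import socket
import time


def probe_tcp(host, port=10000, attempts=5, timeout=3.0):
    """Return a list of TCP connect times in seconds (None for a failed attempt)."""
    results = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            # Open and immediately close a connection; no NDMP messages are exchanged.
            with socket.create_connection((host, port), timeout=timeout):
                results.append(time.monotonic() - start)
        except OSError:
            results.append(None)
    return results
```

Failed attempts or widely varying connect times suggest a problem worth investigating at the network layer. A clean result does not rule out the issue, because the failure in this article occurs during sustained data transfer rather than at connection setup.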

Additional Information

ETrack: 0xe000fefd UMI: V-79-57344-65277