When a DFSClient hits socket timeouts while accessing DataNodes during a Hadoop MapReduce job, two parameters in hdfs-site.xml are worth tuning:
dfs.socket.timeout, for read timeouts
dfs.datanode.socket.write.timeout, for write timeouts
In fact, the read timeout value is used for various connections inside DFSClient, so if you only increase dfs.datanode.socket.write.timeout, the timeouts can keep happening.
I tried to generate 1 TB of data with teragen across more than 40 DataNodes; increasing only the write timeout did not fix the problem. Once I raised both values above 600000 (milliseconds), the timeouts disappeared.
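As a sketch, the resulting hdfs-site.xml fragment would look like the following, using the 600000 ms (10 minute) value that worked in the run above. Note the property names shown are the older ones from this setup; newer Hadoop releases expose the read timeout under dfs.client.socket-timeout instead of dfs.socket.timeout.

```xml
<configuration>
  <!-- Read timeout used by DFSClient connections to DataNodes, in ms -->
  <property>
    <name>dfs.socket.timeout</name>
    <value>600000</value>
  </property>
  <!-- Write timeout for the DataNode write pipeline, in ms -->
  <property>
    <name>dfs.datanode.socket.write.timeout</name>
    <value>600000</value>
  </property>
</configuration>
```

After changing these values, the cluster's DataNodes and any long-running clients need to be restarted to pick up the new configuration.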