
Wednesday, April 11, 2012

DFSClient write timeout in Hadoop MapReduce

When DFSClient hits socket timeouts while accessing DataNodes in a Hadoop MapReduce job, two parameters in hdfs-site.xml are worth considering:

dfs.socket.timeout, for the read timeout
dfs.datanode.socket.write.timeout, for the write timeout

In fact, the read timeout value is used for various connections inside DFSClient, so if you only increase dfs.datanode.socket.write.timeout, the timeouts can keep happening.


I tried to generate 1 TB of data with teragen across more than 40 DataNodes, and increasing only the write timeout did not fix the problem. When I raised both values above 600000 (milliseconds), the timeouts disappeared.
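For reference, the two settings above can be raised together in hdfs-site.xml like this. This is a minimal sketch; 600000 ms (10 minutes) is simply the value that worked in my case, so tune it to your cluster:

```xml
<!-- hdfs-site.xml: raise both DFSClient timeouts (values in milliseconds) -->
<property>
  <name>dfs.socket.timeout</name>
  <value>600000</value> <!-- read timeout, also used for other DFSClient connections -->
</property>
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>600000</value> <!-- write timeout to DataNodes -->
</property>
```

The file must be updated on the client side (and restarting the affected daemons is the safe assumption), since the timeouts are enforced by the DFSClient making the connections.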
