IT Interview Questions and Answers: Why do we use HDFS for applications with large data sets and not when there are a lot of small files?

Saturday, May 31, 2014

Why do we use HDFS for applications with large data sets and not when there are a lot of small files?

HDFS is better suited to a large amount of data stored in a few large files than to the same amount of data spread across many small files. The reason is the NameNode, which keeps the metadata for every file, directory, and block of the file system in its memory. The NameNode usually runs on expensive, high-performance hardware, so it is not prudent to fill that memory with the metadata generated by a large number of small files. Each file adds its own metadata entries no matter how small it is, so the same data stored in a single large file takes up far less space on the NameNode. For optimized performance, HDFS is therefore designed around large data sets rather than many small files.
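To make the difference concrete, here is a rough back-of-the-envelope sketch in Python. It is not part of Hadoop. It assumes the commonly quoted rule of thumb of about 150 bytes of NameNode memory per file or block object and a block size of 128 MiB; both numbers are assumptions that vary by Hadoop version and configuration, and the function names are made up for this illustration.

```python
import math

# Assumptions for illustration only (not read from any real cluster):
BYTES_PER_OBJECT = 150            # rule-of-thumb memory cost per file or block object
BLOCK_SIZE = 128 * 1024 ** 2      # assumed HDFS block size of 128 MiB

MIB = 1024 ** 2
GIB = 1024 ** 3


def namenode_objects(file_size, file_count, block_size=BLOCK_SIZE):
    """Count the metadata objects for `file_count` files of `file_size` bytes each."""
    # Every file needs one file object plus one object per block it occupies.
    blocks_per_file = max(1, math.ceil(file_size / block_size))
    return file_count * (1 + blocks_per_file)


def namenode_heap_bytes(file_size, file_count, block_size=BLOCK_SIZE):
    """Estimate the NameNode memory used by those metadata objects."""
    return namenode_objects(file_size, file_count, block_size) * BYTES_PER_OBJECT


# The same 1 GiB of data, stored two different ways.
one_large_file = namenode_heap_bytes(GIB, 1)        # 1 file of 1 GiB
many_small_files = namenode_heap_bytes(MIB, 1024)   # 1024 files of 1 MiB

print("one 1 GiB file     :", one_large_file, "bytes of metadata")
print("1024 files of 1 MiB:", many_small_files, "bytes of metadata")
```

Under these assumptions the single large file needs 9 metadata objects (1 file and 8 blocks), while the 1024 small files need 2048 objects (1 file and 1 block each), even though both hold the same amount of data.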
