Evolving Apache Hadoop YARN to Provide Resource and Workload Management for Services


Almost two years ago to the day, the Apache Hadoop community voted to make YARN a sub-project of Apache Hadoop. The GA release followed last fall, nearly a year ago.

Since then, it has become plainly obvious that Apache Hadoop 2.x, with YARN as its architectural center, is the best platform for workloads such as Apache Hadoop MapReduce, Apache Pig, and Apache Hive, which were designed to process data on Apache Hadoop HDFS. Furthermore, YARN has been embraced wholesale by other open-source communities that provide data-processing frameworks, such as Apache Giraph, Apache Tez, Apache Spark, Apache Flink, and many others.

Equally exciting, YARN has already evolved beyond data-processing applications to long-running services, with support from Apache Helix and with applications such as Apache Storm, Apache HBase, Apache Accumulo, and many others running on YARN via Apache Slider. […]