Spring For Apache Hadoop Introduction:
Spring for Apache Hadoop simplifies developing Apache Hadoop applications by providing a unified configuration model and easy-to-use APIs for working with HDFS, MapReduce, Pig, and Hive. It also provides integration with other Spring ecosystem projects such as Spring Integration and Spring Batch, enabling you to develop solutions for big data ingest/export and Hadoop workflow orchestration.
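The unified configuration model centers on a dedicated `hadoop` XML namespace. As a minimal sketch (the NameNode address `hdfs://localhost:9000` is an assumed example value), a shared Hadoop configuration can be declared like this:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:hdp="http://www.springframework.org/schema/hadoop"
       xsi:schemaLocation="
         http://www.springframework.org/schema/beans
         http://www.springframework.org/schema/beans/spring-beans.xsd
         http://www.springframework.org/schema/hadoop
         http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">

    <!-- Shared Hadoop configuration bean; properties are set inline.
         fs.defaultFS is an example value pointing at a local NameNode. -->
    <hdp:configuration>
        fs.defaultFS=hdfs://localhost:9000
    </hdp:configuration>

</beans>
```

Other elements in the namespace (jobs, scripts, Hive and Pig runners) then reference this configuration, so cluster connection details live in one place.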
Overview Of Spring For Apache Hadoop Job Support:
Spring for Apache Hadoop integrates with the Spring Framework to create and run Hadoop MapReduce, Hive, and Pig jobs, as well as to work with HDFS and HBase. If your needs are simple, including basic scheduling, you can add the Spring for Apache Hadoop namespace to your Spring-based project and get going with Hadoop quickly.
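For example, a MapReduce job can be declared and run directly from the namespace. The mapper/reducer classes and paths below are hypothetical placeholders; a sketch might look like:

```xml
<!-- Declare a MapReduce job; the mapper/reducer classes and
     input/output paths are example values, not real artifacts. -->
<hdp:job id="wordcountJob"
         input-path="/data/in"
         output-path="/data/out"
         mapper="com.example.WordCountMapper"
         reducer="com.example.WordCountReducer"/>

<!-- Run the job when the application context starts. -->
<hdp:job-runner id="wordcountRunner"
                job-ref="wordcountJob"
                run-at-startup="true"/>
```

The `job-runner` can also be triggered later (e.g. from a scheduler) instead of at startup, which covers the basic scheduling case mentioned above.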
As the complexity of your Hadoop application increases, you may want to use Spring Batch and Spring Integration to rein in the complexity of developing a large Hadoop application. Spring for Apache Hadoop is built and tested with JDK 7 and Spring Framework 4.2, and is built against Apache Hadoop 2.7.1 by default.
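With Spring Batch, a Hadoop job becomes one step in a larger workflow. As a hedged sketch (the job and step names are illustrative), a declared MapReduce job can be wrapped in a Batch tasklet:

```xml
<!-- Wrap a previously declared MapReduce job (here assumed to be
     "wordcountJob") so Spring Batch can execute it as a step. -->
<hdp:job-tasklet id="hadoopTasklet" job-ref="wordcountJob"/>

<batch:job id="ingestJob">
    <batch:step id="runMapReduce">
        <batch:tasklet ref="hadoopTasklet"/>
    </batch:step>
</batch:job>
```

Batch then contributes restartability, step sequencing, and job repository bookkeeping around the Hadoop work.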
Spring for Apache Hadoop is tested daily against a number of Hadoop distributions. To take full advantage of Spring for Apache Hadoop you need a running Hadoop cluster. If you don't already have one in your environment, a good first step is to create a single-node cluster.
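For a single-node (pseudo-distributed) setup, Hadoop's own configuration files are the starting point. A minimal sketch of the two standard files, with a replication factor of 1 since there is only one DataNode:

```xml
<!-- core-site.xml: point clients at the local NameNode -->
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

<!-- hdfs-site.xml: single node, so replicate blocks only once -->
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
```

The `fs.defaultFS` value here should match whatever you later configure in your Spring application so both talk to the same cluster.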
Spring for Apache Hadoop supports reading from and writing to HDFS, running various types of Hadoop jobs (Java MapReduce, Streaming), scripting, and HBase, Hive, and Pig interactions. An important goal is to provide excellent support for non-Java developers so they can be productive with Spring for Apache Hadoop without having to write any Java code to use the core feature set.
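The scripting support illustrates the non-Java goal: HDFS operations can be written in a JVM scripting language inside the configuration itself, with a shell-like `fsh` object exposed to the script. The paths below are example values:

```xml
<!-- Groovy script run at startup; 'fsh' is the FsShell-style helper
     that Spring for Apache Hadoop exposes to embedded scripts.
     The paths and file name are illustrative only. -->
<hdp:script id="setupScript" language="groovy" run-at-startup="true">
    if (!fsh.test("/data/in")) {
        fsh.mkdir("/data/in")
        fsh.copyFromLocal("localFile.txt", "/data/in")
    }
</hdp:script>
```

This lets a deployment prepare HDFS directories declaratively, with no compiled Java code involved.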