Archive for March, 2012

Pangool: Hadoop API made easy

We are proud to announce Pangool, an open-source Java library that aims to replace the Hadoop API. Hadoop has a steep learning curve. Pangool’s goal is to simplify Hadoop development without losing the performance or flexibility of Hadoop’s low-level API.


Pangool is a Tuple MapReduce implementation for Hadoop. By using an intermediate Tuple-based schema and a convenient Job configuration, it removes many of the accidental complexities that arise from the Hadoop Java MapReduce API. Things like secondary sort and reduce-side joins become extremely easy to implement and understand. Pangool’s performance is comparable to that of the Hadoop Java MapReduce API. Pangool also augments Hadoop’s API by making multiple outputs and inputs first-class and by allowing configuration through object instances instead of static classes.
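
To make the idea of secondary sort concrete, here is a minimal sketch in plain Java. It does not use Pangool or Hadoop at all: the `Visit` class, its `user` and `timestamp` fields, and the `groupAndSort` method are hypothetical names chosen for illustration. It only shows what "group by one tuple field, sort by another" means, which is the result a tuple-based schema is meant to express declaratively.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TupleSecondarySortSketch {

    // Hypothetical two-field tuple; this is NOT a Pangool class.
    static final class Visit {
        final String user;
        final long timestamp;

        Visit(String user, long timestamp) {
            this.user = user;
            this.timestamp = timestamp;
        }
    }

    // Sorts by (user, timestamp), then groups by user only.
    // Each group therefore receives its timestamps in ascending order,
    // which is the effect of a secondary sort on the reduce side.
    static Map<String, List<Long>> groupAndSort(List<Visit> visits) {
        List<Visit> sorted = new ArrayList<>(visits);
        sorted.sort(Comparator.comparing((Visit v) -> v.user)
                              .thenComparingLong(v -> v.timestamp));

        // LinkedHashMap keeps the groups in sorted key order.
        Map<String, List<Long>> groups = new LinkedHashMap<>();
        for (Visit v : sorted) {
            groups.computeIfAbsent(v.user, k -> new ArrayList<>()).add(v.timestamp);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Visit> visits = List.of(
            new Visit("bob", 30L), new Visit("alice", 20L),
            new Visit("bob", 10L), new Visit("alice", 5L));
        // prints {alice=[5, 20], bob=[10, 30]}
        System.out.println(groupAndSort(visits));
    }
}
```

In a real MapReduce job the sorting and grouping happen during the shuffle rather than in memory, so this is only an analogue of the behavior, not a description of how Pangool implements it.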
