Datasalt has extensive expertise in Big Data, distributed systems, scalability, search engines, web crawling and web mining. Our philosophy is based on continuously renewing our knowledge and researching new techniques intensively, while consolidating the best practices we have learned. The technologies on which Datasalt’s solutions are based include:
Hadoop and its ecosystem
Google revolutionized the world of massively parallel processing (MPP) systems with the invention of the MapReduce paradigm, developed to manage and process the huge amounts of data gathered by its search engine. Hadoop is an open-source implementation that opens the MapReduce paradigm to the rest of the industry. It is a mature technology, used by hundreds of companies including Facebook and Yahoo, with clusters of more than 4000 nodes. Hadoop is the technology behind the concept of Big Data.
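To make the MapReduce paradigm concrete, here is a minimal single-process sketch of its three steps (map, shuffle, reduce) applied to a word count. It is an illustration only and does not use Hadoop's API; the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are hypothetical, and a real cluster would run the map and reduce steps in parallel across many nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group the emitted values by key, as a framework would do
    between the map and reduce steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce sequentially in one process."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data", "big clusters"])
print(counts)
```

Because each document is mapped independently and each key is reduced independently, both steps can be spread across machines, which is what allows the approach to scale.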
Real-time stream processing
Nowadays, being able to provide quick answers when processing Big Data is becoming more and more valuable. Real-time processing systems such as Storm, together with scalable queuing systems such as Kafka, provide the necessary means for cleaning, pre-aggregating and unlocking the value in massive streams of events that must be handled and processed in a scalable and reliable way. Many companies are already powering their solutions with efficient real-time stream processing using Storm.
NoSQL databases
The NoSQL trend arose from the scalability and flexibility limitations of relational databases. NoSQL databases give up certain features, such as ACID guarantees, fixed table schemas, secondary indexes or foreign keys, in exchange for greater flexibility and scalability. They do not usually support SQL queries, hence the name “no SQL”. NoSQL databases are useful for developing scalable applications and for storing Big Data.
Lucene and Solr search systems
Powerful yet simple methods for querying and extracting information from Big Data archives are needed. Search engines based on inverted indexes make it possible to search through the deluge of information simply and quickly. At Datasalt, we rely on the strength of the Lucene and Solr search systems.
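As a rough illustration of why inverted indexes make search fast, the sketch below maps each term to the set of documents that contain it, so a query only touches the entries for its own terms instead of scanning every document. This is a toy example, not how Lucene or Solr are implemented; the names `build_index` and `search` and the sample documents are assumptions made for illustration.

```python
from collections import defaultdict


def build_index(documents):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start from the first term's postings and intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


docs = {
    1: "hadoop cluster logs",
    2: "solr search cluster",
    3: "search logs",
}
index = build_index(docs)
print(search(index, "search cluster"))
```

Production search engines add tokenization, relevance ranking and compressed on-disk index structures on top of this basic idea.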