Google launches DataFlow (a successor to MapReduce)

June 30th, 2014|

Adaptive MapReduce Scheduling in Shared Environments

May 31st, 2014|

Is Hadoop showing its age?

May 22nd, 2014|

Spark Ecosystem

April 21st, 2014|

In a previous post we introduced Spark, a framework that will play an important role in the Big Data area. This page from DataBricks is a good starting point for understanding what Spark is; however, let me reproduce an overview in this post. Spark runs on top of existing Hadoop clusters to provide enhanced and additional functionality. Although Hadoop is effective for storing vast amounts of data cheaply, the computations it enables with MapReduce are highly limited. MapReduce can execute only simple computations and uses a high-latency batch model. Spark provides a more general and powerful alternative to Hadoop's MapReduce, offering rich functionality such as stream processing, machine learning, and graph computations. Spark provides out-of-the-box support for deploying within an existing Hadoop [...]

Spark: Big Data Analytics Beyond Hadoop

April 20th, 2014|

Hadoop is definitely the de facto standard for large-scale data processing across nearly every industry and enterprise. However, as the "Volume", "Variety" and "Velocity" of data increase, Hadoop as a batch processing framework cannot cope with the requirement for real-time analytics. As we saw in our Technology Basics for Data Scientist course, the scientific community is offering alternatives such as Storm, a framework open sourced by Twitter that provides event processing and distributed computation capabilities. Storm uses custom-created "spouts" and "bolts" to define information sources and manipulations, allowing batch, distributed processing of streaming data. A Storm application is designed as a topology of interfaces which create a "stream" of transformations. It provides functionality similar to a MapReduce job, except that the topology will theoretically run indefinitely until it is manually terminated. Hortonworks, one of the [...]

Big Data: An Opportunity for Entrepreneurs and Companies

December 13th, 2011|

The emergence of Linux empowered innovative developers, and the Linux, Apache, MySQL and PHP software stack (LAMP, which completely changed the web application landscape) let them build powerful web servers from open source code. All of this led to the creation of new companies in the ICT sector and became the foundation of what is known as Web 2.0. In my view MapReduce, the cornerstone of this new world called Big Data, could have the same effect, as the central point of an ecosystem of open source tools for large-scale analysis of the tide of data available today, both private and public. A whole sea of opportunities for entrepreneurs [...]