16 Mar 2016, 09:30

Around the Data - March 2016 - Kafka Connect and Streams, Hortonworks HDP 2.4, Elasticsearch cluster sizing, Spark Graph Frames, Flink 1.0


  • Kafka and Confluent 2.0.1 were released (bugfix release)
  • A more detailled presentation about Kafka Connect.
  • Confluent released a custom version of what will be in Kafka 0.10 and it's called Kafka Streams ; it aims to manipulate data within Kafka without requiring any external system such as Spark & co. Interesting move to make it feature richer and more autonomous to enirch data but challenge may be that kafka was performant due to it's low level of features. Will it remain performant with such additions ?

Hortonworks (Hadoop Distribution)



  • Databricks introduces Graph Frames ; it"s built on top of Spark Data Frames (and related APIs) and aim to ease manipulating graph data.