10 Feb 2016, 09:30

Around the Data - February 2016 - ELK, Kafka, Flink, Spark





  • Spark 2015 year in review : Databricks (core developer of Spark) published a review of 2015 : the 4 releases, the features, how Spark is used, etc. I was surprised to see that the majority of Spark usage was as a standalone cluster and not in a Hadoop context.

11 Nov 2015, 09:30

Around the Data - November 2015 - Diving in Kafka and ELK 2.0 releases


  • A 2-part blog post series (part 1, part 2) on the genesis of Kafka within LinkedIn before it was open sourced and hosted by the Apache foundation ; always interesting to know the first use cases the software was built for and how it became the distributed messaging system we now know. It also covers current use cases for Kafka.
  • Putting Apache Kafka to use : a practical guide to building a streaming data platform (part 1 ; part 2) : the first part is about the shift to the event-based approach and the definition of a streaming data platform. The second part is about implementation best practices.
  • A Kafka presentation from the MixIt event (in French), which introduces Kafka and how it is used at EDF for the "Linky" device (energy monitor)

ElasticSearch / Logstash / Kibana

29 Apr 2015, 09:30

Around the Web - April 2015

Ergonomy / User Experience

  • The best icon is a text label : a reminder that icons must be meaningful, with some examples of do's/don'ts and, in the end, that a text label may be more accurate than an icon.


Over the 4 years we have slowly moved away from device specific breakpoints in favour of content specific breakpoints, i.e. adding a breakpoint when the content is no longer easy to consume (whatever the device is).



  • M6Web Tech team published an article (English version ; French version) on how they mocked a backend application while they were building the frontend one, until the backend is finalised. Beyond the tool involved, the most important point is the "interface agreement" in which backend and frontend teams agreed on how the coming API would work and be used, to avoid bad surprises as much as possible at the end.
  • A visual guide to CSS3 Flexbox properties : the title is self-explanatory !
  • Introduction to Service Workers : Service Workers will allow offline experiences, periodic background syncs, push notifications and other things that would normally require a native application. The article introduces how service workers work and some current limitations ; if you are not familiar with the Javascript promises syntax, have a look at the article "Javascript Promises".

Responsive Web Design


Virtualisation (Docker)

[Edit 18/5, 21/5, 28/5,19/6,10/7 - Update docker tutorial list]

13 Feb 2014, 12:50

Elasticsearch 1.0 - distributed & RESTful search engine

ElasticSearch (ES) is a distributed, RESTful search engine based on the (famous) Apache Lucene library ; if you use indexing services in your application, you may already be using it or Solr, which is also based on Lucene.

Besides ElasticSearch, the company behind the product also releases 3 other open source products linked with ES :

  • Kibana to produce dashboard and reports from ES and more widely to interact with data in ES.
  • Logstash combined with ES to analyse logs & events.
  • Marvel was also just released to monitor your ES cluster.

So with the 1.0 release, a lot of things have been included in ES, which already had a lot of interesting features (Documentation).

There are also bindings for PHP, Java, Perl, Python, Ruby, Javascript ; so you should be able to integrate it with your app easily.

Xebia (French consulting company) published some blog posts on ES last December that show you how to get started with ES :

I played with it a little bit and was quite impressed by its relevance. The only "issue" was pushing content through the REST API, as binary indexing is not native, but there is a file system river for that.
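
To give an idea of what pushing content through the REST API looks like, here is a minimal sketch in Python. It only builds the HTTP request for indexing one JSON document and does not send it. The index name "blog", the type "post" and the helper build_index_request are hypothetical names picked for the illustration ; the URL layout /index/type/id is the one used by the 1.x index API, and 9200 is the default HTTP port.

```python
import json
from urllib.request import Request


def build_index_request(base_url, index, doc_type, doc_id, document):
    """Build (but do not send) a PUT request that indexes one JSON document.

    Hypothetical helper for illustration; not part of any ES client library.
    """
    # ES 1.x addresses a document as /{index}/{type}/{id}
    url = "{}/{}/{}/{}".format(base_url.rstrip("/"), index, doc_type, doc_id)
    # The document itself is sent as a JSON body
    body = json.dumps(document).encode("utf-8")
    return Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Assumed local node and made-up index/type names
req = build_index_request(
    "http://localhost:9200",
    "blog",
    "post",
    1,
    {"title": "Elasticsearch 1.0", "tags": ["search", "lucene"]},
)
print(req.get_method(), req.full_url)
```

With a node running locally, passing req to urllib.request.urlopen would actually send the document ; the bindings mentioned above wrap this kind of call for you.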

So if you need to index/retrieve content or manipulate data, I would recommend having a look at the ElasticSearch ecosystem.

ElasticSearch is also used in Graylog² and, according to a colleague, it would be very relevant for log analysis when used with the Graylog Extended Log Format (GELF). If someone has experience with Graylog² vs ES/Logstash/Kibana on this, I'm interested to have their opinions !