April 4, 2017 Jan Krag   Event Git

Report from FOSDEM 2017

Blogging from the FOSDEM Conference in Brussels

We came to Brussels for Git Merge, heard about FOSDEM and I decided to stay.

This blogpost is an attempt at telling the story of my experience and summarizing many of the projects, tools and technologies I ran into at presentations and booths.


A newbie at FOSDEM 2017

This is the story of my first ever visit to FOSDEM in Brussels. It was written partly during the event and partly afterwards. You will find “in the moment” reports, post-event summaries or comments on various tech I encountered and even musings about the overall experience.

What is this FOSDEM anyway?

In short, a crazy big conference, very different from anything I have ever visited. It might not even be fair to use the term conference. FOSDEM refers to it as an event. FOSDEM is free and requires no registration - just show up.

Who knows how many people came? The expected crowd is about 8000 people and I heard estimates closer to the 10K mark. Other impressive numbers: 610 speakers, 600-700 individual “events” and up to 55 parallel tracks.

Amazingly, the whole event is non-commercial and run by volunteers.

If you want to read more, check out the official website FOSDEM or go directly to the about page, which also includes a section on the history.

How I happened to be here…

I happened to be in Brussels the days before FOSDEM, attending Git Merge 2017 where my colleague Johan Abildskov and I gave a Git Jedi mind tricks workshop on Thursday. See the blog post Report from Git Merge 2017. During Git Merge I heard people talk about this thing called FOSDEM and found out that it was happening the following weekend. I have probably heard it mentioned before, but I had no real idea of what it was.

Thursday evening we attended the Git Merge speakers’ dinner where I happened to be seated together with Karen Sandler from the Software Freedom Conservancy (A great non-profit that helps promote, improve, develop, and defend Free, Libre, and Open Source Software (FLOSS) projects). Some of the talk during dinner revolved around the free software movement and about FOSDEM where the Conservancy has a booth. All this got me curious.

Late that evening I studied the FOSDEM homepage and the immense schedule.

Crazy schedule

Then and there I decided that I wanted to stay if at all possible. I had no real idea what I was getting into, but the opportunity was not to be missed.

The rest was mostly a question of asking the boss if it was ok (thanks Lars Kruse!) and booking a new plane ticket.

Follow the crowd

Saturday morning I found my way to the nearest metro station to buy a ticket and then went searching for the #71 bus. Once I got around the right corner the rest was easy. There were already 20 or more people waiting for the same bus and it was quite obvious that they were headed the same way - an indicator of what awaited. The bus ended up packed beyond bursting point and when we were dropped off at the end, it was really just about following the crowd!

Follow the crowd to FOSDEM

Sorry, this room is FULL

It turns out that a lot of FOSDEM is running around campus for interesting talks, only to find “Room is FULL” signs on the door. I guess this is just the consequence of a no-bounds 8000+ participant event, where most talks are in rooms with 50-150 seats. The signs are there for fire safety reasons, and every single room has a volunteer that watches the capacity and closes the door when needed.

Sorry, this Room is FULL

Once you figure this out, you learn that you might have to come early for talks you really want to hear. This might not help though. Most rooms are themed Devrooms, e.g. there is a room dedicated to Perl for the entire day. This means that many people involved with a specific project are just in that room the entire day. The worst case I encountered was a popular talk where over 100 people were waiting in line, and only about 5 left the room in the break.

Luckily, and quite amazingly for a volunteer event of this magnitude, every single talk in every room is live streamed and recorded. For one talk I just sat in the hallway in front of the door and streamed the talk going on in the room behind me.


Mini-reports from talks and booths

This is where we get into the technical part of the story. The following section consists of a lot of individual reports from various talks and booths. Some are longer sections, some not much more than honorable mentions of projects I encountered that should be worth checking out.

I got a seat at the lightning talks Saturday afternoon. Always a gamble, but quite a few of the stories below are based on these lightning talks.

Kubernetes on the road to GIFEE

My first talk of the conference was a keynote by Brandon Philips from CoreOS, talking about Kubernetes. After a short intro about the amazing growth of the internet, he gave a brief introduction to containers and why they are cool.

Brandon then ventured into a live demo of etcd, demonstrating voting and election.
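For readers who have not seen this kind of demo, the core idea behind the voting is that a node only becomes leader when a strict majority of the cluster votes for it. The snippet below is a minimal sketch of just that majority rule in plain Python. It is not etcd code and not what was shown on stage: etcd uses the Raft consensus algorithm, which also deals with terms, logs and timeouts that are left out here, and the node names and function names are made up for illustration.

```python
def has_quorum(votes_received, cluster_size):
    """A candidate wins only with votes from a strict majority of the cluster."""
    return votes_received > cluster_size // 2


def elect(votes, cluster_size):
    """Return the candidate holding a majority of the votes, or None if nobody does."""
    tally = {}
    for candidate in votes.values():
        tally[candidate] = tally.get(candidate, 0) + 1
    for candidate, count in tally.items():
        if has_quorum(count, cluster_size):
            return candidate
    return None


# Hypothetical three-node cluster: each key is a voter, each value is its vote.
votes = {"node-a": "node-a", "node-b": "node-a", "node-c": "node-c"}
leader = elect(votes, cluster_size=3)
print(leader)
```

With three nodes, two votes are enough to win. If only one of the three nodes votes at all, nobody reaches a majority and the function returns None, which is why such clusters are usually run with an odd number of members.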

Then the fun started. Kubernetes federation! Running another layer of Kube on top of multiple data centers or cloud providers running Kubernetes. Currently WIP, but sounds very cool.

It is estimated that we now have 100M servers worldwide and 3.5 billion internet users. We need self-driving infrastructure where software can be deployed globally with the same ease as new apps are distributed on Android or iOS.

Mentions:

  • Tectonic by CoreOS - self-driving container infrastructure. (free tier available).
  • Quay.io name-dropped a number of times as a good tool to build and distribute container images.

Brandon also claimed that Kubernetes is now the fastest growing open source community on GitHub.

Bitergia

I had a good chat with Bitergia at their booth, mainly about GrimoireLab, which is their open source platform for analytics of software projects. It is built on Elasticsearch and Kibana and provides lots of nice visualizations.

One notable detail is that it goes way beyond the classic source code analysis and can also integrate sources such as CI data from Jenkins, Issue and task management data, code review info and even data from mailing lists, wikis and chat.

Bitergia are also kind enough to offer a hosted version of GrimoireLab at Cauldron.io that allows you to generate nice dashboards for the public repos of your GitHub organisation or personal account. I took the opportunity to create a dashboard for Praqma.

Cauldron Praqma Dashboard


Browsing the O’Reilly booth

O’Reilly had a booth with a long open table full of books for sale and a 10% discount. I could have bought at least 10 relevant books, but they would have been too heavy for my carry-on flight allowance. I couldn’t fully resist a few thin books though.

  • Introducing Go by Caleb Doxsey. Praqma’s very productive intern, Simon, has already read it, and says it is really good
  • Your Code as a Crime Scene by Adam Tornhill. This book has been on my personal must read list for a long time and is quite relevant in the context of some of our customers
  • Exercises for Programmers - 57 Challenges… by Brian P. Hogan. We had just decided to start up some Friday Code Kata sessions to learn or practice Go-lang and TDD. I figure this book could give me inspiration for some exercises to introduce
  • Building Tools with GitHub by Chris Dawson. I just thought this one looked interesting and probably contains useful tips that we could use at Praqma. Lars Kruse has started reading it and is quite impressed

Twitter streaming importer

The first talk apart from the keynote where I actually managed to get a seat was in the Graph DevRoom. The talk by self-proclaimed graph enthusiast Matthieu Totet was about real-time graphing of data streamed from Twitter. Part of the talk was spent exploring a Gephi model graphing live tweets from FOSDEM. Apart from being a quite impressive demo of the technology, it also gave a fun picture of what was happening in different metaphorical corners of the big FOSDEM campus. For instance, the hashtag #Mozdem had a clearly visible island in the bigger graph of tweets.

Virtuozzo containers

This talk was given by Alexander Stefanov. Sorry to say that I never really found out what he was talking about. I think I was missing some prerequisites or a basic understanding of the problem domain he was addressing.

LizardFS - distributed file system

The speaker mainly showed how he could get a bunch of LizardFS nodes up and running in Docker containers in a very short time, but as there was zero intro to what LizardFS actually is, I had a hard time really following along. This was further exacerbated by the challenges of presenting from a super high-res laptop where you can’t resize the terminal font. I had no idea what he was doing in the terminal. This was actually a recurring symptom in quite a few other talks as well. I don’t want to tease the Linux users at an Open Source conference, but it seems that it is still not easy enough to change font size or resolution on the fly.

I did get the opportunity to look up LizardFS, and found out that it is described as a distributed, parallel, scalable, fault-tolerant, Geo-Redundant and highly available file system. Good to know. At least now it is on my radar if I ever happen to need one of those in my life.

Encryption for the masses with pretty Easy privacy (p≡p)

Privacy and encryption are not my strong topics, but hey, I might just learn something and the abstract promises “A rough overview” so maybe not too hard core. Either way, I had decided to stay for the following talk about Passbolt anyway, so no reason to leave the room.

My takeaways can be summarized as:

p≡p is…

  • Pretty easy privacy
  • Made to be super easy to understand with traffic light grading of security
  • Integrated with major email tools like Exchange, Office365, Gmail etc.
  • An abstraction to use existing crypto tools
  • “Privacy by default”. The goal is to make privacy so easy that it no longer becomes a burden added only when really warranted by circumstances, but instead becomes the default
  • Made with a unified encrypted inbox in mind. Right now the focus is email, but it should end up covering chat etc. as well

Passbolt - Open source password manager for teams

Remy Bertot presented Passbolt - A fully open source password manager for teams. As far as I understood, it is not quite ready for prime time but close. For instance mobile clients are not available yet, but that’s on the roadmap.

I think the tool looks super nice and promising, and it is definitely something I will keep my eyes on, both for personal use and maybe to solve some of the needs we have in Praqma for managing shared account credentials.

Among the notable things that Remy presented was a quite impressive build environment utilizing Jenkins and Travis. Another cool feature is that they have made all the assets for their official style guide available as an npm package. Great idea.

From pipelines to graphs

Diomidis Spinellis presented dgsh a.k.a. Directed Graph Shell. dgsh is a new shell that can parallelize tasks with fork-join type operations, instead of the purely linear pipe style of most existing Unix shells.

Many standard unix tools have been adapted to be dgsh aware, and Diomidis showed some impressive examples of what can be done using this technique.

There are lots of cool examples on the project page.

As a bonus feature, dgsh is capable of generating nice Graphviz illustrations of the graph pipelines processed:

git committer plot
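To give a feel for what “fork-join” means compared to a linear pipe, here is a minimal sketch in plain Python. It is not dgsh syntax and not taken from the talk: the same input is handed to two branches that run concurrently, and their results are joined at the end. The function names and the sample data are made up for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_lines(lines):
    """Branch 1: how many lines are in the input."""
    return len(lines)


def top_word(lines):
    """Branch 2: the most frequent word in the input."""
    words = Counter(word for line in lines for word in line.split())
    return words.most_common(1)[0][0]


def fork_join(lines):
    """Fan the same input out to both branches, then join the results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Fork: both branches receive the same input and run concurrently.
        line_future = pool.submit(count_lines, lines)
        word_future = pool.submit(top_word, lines)
        # Join: wait for both branches and combine their outputs.
        return {"lines": line_future.result(), "top_word": word_future.result()}


# Hypothetical input, standing in for something like a list of commit subjects.
sample = ["fix build", "fix tests", "update docs"]
summary = fork_join(sample)
print(summary)
```

In a classic linear pipeline you would have to read the input twice or stash it in a temporary file to compute both values. The fork-join shape lets one stream feed several consumers at once.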

Scaling logging infrastructure with syslog-ng

In this talk, Peter Czanik from Balabit talked about a high performance logging daemon called syslog-ng.

Syslog-ng provides central logging for your infrastructure. It performs various roles like:

  • collecting: gather logging data from many different platforms and applications
  • processing:
    • classifying
    • normalizing
    • rewriting - e.g. anonymizing
    • reformatting to meet the needs of receiving systems - e.g. converting to JSON
    • enriching - e.g. adding geo-ip info
  • filtering: e.g. discarding surplus lines, routing to different destinations
  • destination handling:
    • traditional: file, network, sql, etc.
    • big data & NoSQL: e.g. Hadoop, MongoDB, Elasticsearch
    • messaging: e.g. Kafka

Peter talked about how classic free-form log messages (typically date + host + text) are good for human consumption, while machines much prefer structured logging like name-value pairs. Parsers in syslog-ng are good at turning unstructured logs into name-value pairs and sending these on for further processing and storage.
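The idea of turning a free-form line into name-value pairs can be sketched in a few lines of Python. This is not syslog-ng configuration or its parser, just an illustration of the concept. The line layout (date, host, program with optional pid, then text), the field names and the sample line are my own assumptions.

```python
import json
import re

# Assumed layout of a classic free-form line: "<date> <host> <program>[<pid>]: <text>".
LINE_PATTERN = re.compile(
    r"^(?P<date>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<program>[^:\[\s]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Turn one free-form log line into name-value pairs, or None if it does not match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    # Drop optional fields (such as pid) that were absent from the line.
    return {name: value for name, value in match.groupdict().items() if value is not None}


# Hypothetical sample line; 203.0.113.7 is a documentation-only IP address.
raw = "Feb  4 10:15:02 web01 sshd[4242]: Failed password for root from 203.0.113.7"
record = parse_line(raw)
print(json.dumps(record, indent=2))
```

Once the line is a dictionary, reformatting it to JSON for a receiving system is a one-liner, and downstream tools can filter on fields like host or program instead of grepping through text.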

He also talked about a client - relay - server model vs. a more classic client-server model. Example: relay in each datacenter. This allows processing and especially filtering to be done locally before transmitting over the slower wire to the central log handler.
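As a rough illustration of why filtering at the relay helps, the sketch below drops low-severity records locally and only forwards the rest. It is plain Python, not syslog-ng configuration. The record layout, the threshold and the function name are assumptions made for the example; the severity numbers follow the usual syslog convention where a lower number is more severe.

```python
# Syslog-style severities: a lower number means a more severe message.
SEVERITY = {"error": 3, "warning": 4, "info": 6, "debug": 7}


def relay_filter(records, max_level="warning"):
    """Keep only records at least as severe as max_level; the rest never leave the datacenter."""
    threshold = SEVERITY[max_level]
    return [r for r in records if SEVERITY[r["severity"]] <= threshold]


# Hypothetical records collected by the relay from clients in one datacenter.
local_records = [
    {"host": "web01", "severity": "debug", "message": "cache miss"},
    {"host": "web01", "severity": "error", "message": "upstream timeout"},
    {"host": "db01", "severity": "info", "message": "checkpoint complete"},
    {"host": "db01", "severity": "warning", "message": "slow query"},
]

forwarded = relay_filter(local_records)
print(f"forwarding {len(forwarded)} of {len(local_records)} records to the central server")
```

In this toy example half of the records never cross the slow link, which is the whole point of putting a relay in each datacenter.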

Network Policy Controller in Weave Net

This talk was about blocking unwanted network traffic in Kubernetes. The Weave Network Policy Controller uses iptables and ipsets to govern which Linux containers can talk to which other containers under the control of Kubernetes.

Learning programming in the 21st century

This talk by Juan Julián Merelo (@jjmerelo) was more philosophical in nature. Among other things, Juan talked about how the programming environment of today is much more multifaceted and polyglot than the more monotone single-language environments where many of us learned our programming chops.

These days the applications we build typically involve many different languages across many architectural boundaries, but most programming is still taught in a classic approach focusing on a single language and with emphasis on basic operations like data-structures, flow control and IO. Juan discussed how the modern learning experience should instead be organized and focused around this multi-environment (console, UI, web, embedded), multi-language (in a single application) and multi-tool world.

Memorable quotes from the talk - reproduced here from memory and of course totally out of context:

  • Stop programming in C in every language.
  • If you are a self-taught programmer: remember that the person who taught you knew NOTHING about the language.
  • You don’t learn a language in isolation. Language and ecosystem go hand in hand.

Analysing GitHub data in Google BigQuery

Another very interesting lightning talk, presented by Felipe Hoffa from Google, was about analyzing terabytes of open source code with Google BigQuery.

The short version is that Google have taken nearly all the open source projects on GitHub and 5 years of metadata and made the whole data set publicly available in their BigQuery platform.

What is BigQuery:

  • terabytes in seconds
  • simple sql interface
  • scalable
  • interop: Tableau, R and many others

Three main data sets:

  • GitHub Archive - 8.7 billion events, hourly updates
  • GHTorrent - annotated versions of above events, real time updates
  • 46 TB of source code

Who would want to analyze GitHub data:

  • Maintainers - who is interacting with my project, contributions, forking and so on and how?
  • Project users
  • Project choosers - is this project popular, healthy, active? related projects?
  • Language and library maintainers: How is language A or library B used across the world?
  • Data lovers

Example queries:

  • When was my project starred? Does this correlate with e.g. Hacker News mentions?
  • top java imports over time
  • tabs vs. spaces…
  • More examples

Felipe has a blog post with more details.

Bazel - Google’s new build tool

Google rather recently open sourced their internal build tool under the name Bazel. It is made specifically for speed, scalability and reliability. They had a booth and I had a really good chat with them. I concluded that Bazel is absolutely worth a closer look. It has a declarative approach to build definitions, but with a language similar to Python. A strong focus of Bazel is reproducible builds. This focus on correct build results and near-guaranteed reproducibility is a feature that I believe could be influential on how we design build pipelines.

Bazel runs clean on Linux and Mac and currently runs on Windows with MSys. The famous distributed build farm part of Google’s builds is not included in the open source tool yet, but it is on the roadmap.

OwnCloud

OwnCloud is an open source self-hosted alternative to cloud storage solutions like Dropbox. It seems like these alternatives are getting quite mature by now and worth pursuing if you want to be in control of your own data, but still have the convenience of cloud sharing. OwnCloud also have an enterprise/commercial version available with some closed source extensions.

NextCloud

NextCloud is a break-away fork of OwnCloud. They have a slightly different business model for their enterprise offerings and make all their features available in Open Source. They have added a number of extensions to the platform including things like video conferencing and chat, so I guess it could replace much more than Dropbox.

The rabbit hole is deep

One thing that struck me, especially while visiting the booths, is how deep the Open Source rabbit hole goes on the hardware side. I have been aware of a few open source hardware projects, especially in the small board sector (Arduino and the like), but I saw a number of other projects and companies that go much further than that, and I feel the desire to call out a few of them:

  • Olimex TERES-I Do-It-Yourself Open Source Hardware and Software Laptop
  • Turris Omnia router - Open source secure router
  • Vikings - Libre-friendly sustainable hosting built on 100% free (libre) software all the way down to boot firmware, run with 100% green energy. If I was in the market for server hosting, this is where I would go.


End credits

Coming up with a good closing statement for the whole FOSDEM weekend is really hard. As the post in total suggests, it was somewhat of a wild ride and I probably left out half of the experience. Apart from a few talks and many booths that remain unmentioned, there were also the interesting chats with random people in the coffee line and impromptu sessions in the hallways outside full rooms. I also have a long “check out later” list of websites, tools and projects to keep me busy. Maybe some will even spark future blog posts.

Would I go again? Absolutely. It might not be at the top of my must-have conference list just yet, as I am currently not deeply involved with any of the participating Open Source projects, but if I got another chance to combine it with a conference like Git Merge I would do it all over again just for the inspiration that it provides.

Author: Jan Krag
