AngelHack Berlin! Team Veebibi: Lessons Learned and how we made it

This weekend I attended AngelHack Berlin, which took place at the ImmobilienScout campus and is part of a worldwide event series leading to the “crown” of hackers.

(Promotion) Personally I’m absolutely not into the “location” world; I work on a project for social content curation called Qurate, which is pretty close to the value proposition of the winning team (Edgar tells…), who won the two-week trip to San Francisco / Silicon Valley. Thanks to Qurate you can relive the event based on tweets and photos taken by the crowd. Here’s my personal “story” in pictures. If you want to build your own, feel free to use our media table at https://www.qurate.de/angelhackber . I’ll ask the YouIsNow team to contribute their photos as well 😉

This time I wanted to do something visual, something I can show off and something that does good for those in need. Two weeks ago the news broke that the “Verkehrsverbund Berlin Brandenburg” (VBB, hence the project name 😉 ) has opened up its data (article on Golem), so just before I went to sleep on May 3rd I had the idea to integrate that data somehow. Here’s what we came up with: http://veebibi.herokuapp.com

The story behind Veebibi

I was born in Berlin, and even as a child I often found myself asking, “Where is this bus going?” (I mean the route, not the destination, obviously). Most of us who live in the inner circle of Berlin use the metro or the S-Bahn for transit. It’s rather comfortable and, at least for the metro, mostly on time and very frequent, at least during the day. But I found that I had no clue where to find a good overview of the bus network (obvious solution: look at the wall map, I know; I mean on my smartphone). Actually that wouldn’t have solved my problem anyway: as I said, “I want to know which route this bus is taking!“. Another example: once in January (it was very cold) I waited at the main station for the S-Bahn to come; suddenly the speaker said: “unfortunately your train will be delayed for an unknown time“. I only had to go 3 stations to Hackescher Markt and took the U55 to Brandenburger Tor. From there I walked to work and nearly froze an ear and two toes off. I could’ve taken some bus for sure (the TXL maybe?) but finding that out with a smartphone is not as easy as it might sound, especially not if you only have O2 at hand. You usually visit http://www.fahrinfo-berlin.de/, say where you are and where you want to go, and pick a line from the search results. Google Maps doesn’t show VBB transit lines (except the trains) at all.

Whether that kind of app already exists or not wasn’t important to us anyway. We thought: let’s honour the effort of Berlin Brandenburg to finally open up their data set and use that data to draw transit lines on a Google map. Actually there is a little political background behind the data, too: the VBB never would’ve opened up without the pressure of projects like OpenPlanB. Apps like Öffi (everyone loves you for that one, Andreas!!) had to use unofficial data sets to get transit information, and as I heard, Deutsche Bahn is far from happy that suddenly thousands of hackers can write apps for their unsatisfied customers (some details at the end of the Golem article mentioned above).

How we did it

We fetched the open data set from the official web site. It happens to be in a standard format called GTFS, which has been adopted globally by public transit authorities. Now, we’re hackers, we wanted to learn and didn’t give a shoot about what the specification says, so we tried to import everything on our own. The data delivered by the VBB is split into 8 CSV files that make up a relational data structure. Relational? Come on, it’s 2013, NoSQL is the big buzz (NewSQL is, but that’s another topic) so we wanted to have the stuff in MongoDB. Don’t shake your head before you’ve seen the results 🙂

Our team member Robert tried to import the data into Mongo directly but, as you can imagine, that was a disaster. Lesson One (for the beginners): don’t import relational data into a document database! You can’t join it there anyway. So I suggested a “workaround”: first import the data into a relational system (MySQL is always a good decision), transform it into a document-like representation and import that into Mongo. At that point Robert decided to skip the timetable data from the set because that would’ve blown the overall result up to millions of rows (it’s basically a cross join of 7 tables, so we reduced it to 5). He exported the result set to CSV and it looks like this:

CSV:

9087171,1,0,1,170,"S+U Alexanderplatz via Hauptbahnhof","Bus TXL",BVB---,"Flughafen Tegel (Airport) (Berlin)",52.5540690,13.2928370
9019105,2,0,0,170,"S+U Alexanderplatz via Hauptbahnhof","Bus TXL",BVB---,"Buchholzweg (Berlin)",52.5469730,13.3177930
9020202,3,0,0,170,"S+U Alexanderplatz via Hauptbahnhof","Bus TXL",BVB---,"S Beusselstr. (Berlin)",52.5343140,13.3287030
9002102,4,0,0,170,"S+U Alexanderplatz via Hauptbahnhof","Bus TXL",BVB---,"Turmstr./Beusselstr. (Berlin)",52.5273390,13.3287430
9003104,5,0,0,170,"S+U Alexanderplatz via Hauptbahnhof","Bus TXL",BVB---,"U Turmstr. (Berlin)",52.5258370,13.3424010
9003204,6,0,0,170,"S+U Alexanderplatz via Hauptbahnhof","Bus TXL",BVB---,"Kleiner Tiergarten (Berlin)",52.5249900,13.3457640
9003201,7,0,0,170,"S+U Alexanderplatz via Hauptbahnhof","Bus TXL",BVB---,"S+U Berlin Hauptbahnhof",52.5258470,13.3689240

The first id marks the stop, the next column the position of that stop in the route. Skipping some columns, we find the target of the line, the line name (“Bus TXL”), the company that’s responsible for it (BVG), the stop’s name and its geo coordinates. Next we needed to transform those lines into JSON documents that fit into MongoDB. With one eye open and one half of his brain already shut down at 2am, Robert hacked together a PHP script that did the job pretty well. I don’t know how he made his way home alive after that (he went by bike) but I’m glad he made it! I spent one hour fixing the bugs he left behind and came up with JSON data that’s compatible with mongoimport. Here’s an example document:

{ "target":"Flughafen Tegel (Airport) (Berlin)","line":"Bus TXL","stations":[ 
 { "id":"9003104", "name":"U Turmstr. (Berlin)", "loc": {"lng":13.3424010,"lat":52.5258370} },
 { "id":"9002102", "name":"Turmstr./Beusselstr. (Berlin)", "loc": {"lng":13.3287430,"lat":52.5273390} },
 { "id":"9020202", "name":"S Beusselstr. (Berlin)", "loc": {"lng":13.3287030,"lat":52.5343140} },
 { "id":"9019105", "name":"Buchholzweg (Berlin)", "loc": {"lng":13.3177930,"lat":52.5469730} },
 { "id":"9087171", "name":"Flughafen Tegel (Airport) (Berlin)", "loc": {"lng":13.2928370,"lat":52.5540690} }
 ]
}
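Robert’s PHP script isn’t reproduced here, but the idea behind it can be sketched in a few lines of JavaScript. This is only a rough illustration under assumptions: the column positions are read off the sample rows above, the function names parseCsvLine and rowsToDocuments are made up, and grouping rows by target plus line name is my guess at what identifies one route. The tiny CSV parser handles quoted fields but not escaped quotes.

```javascript
// Hypothetical sketch, not the original PHP script.
// Splits one CSV line into fields, honouring double-quoted fields
// (escaped quotes inside a field are not supported).
function parseCsvLine(line) {
  var fields = [], current = "", inQuotes = false;
  for (var i = 0; i < line.length; i++) {
    var ch = line[i];
    if (ch === '"') {
      inQuotes = !inQuotes;            // entering or leaving a quoted field
    } else if (ch === "," && !inQuotes) {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Groups the flat rows into one document per target + line name.
// Assumed columns: 0 = stop id, 1 = position in route, 5 = target,
// 6 = line name, 8 = stop name, 9 = latitude, 10 = longitude.
function rowsToDocuments(csvText) {
  var byKey = {};
  csvText.split("\n").forEach(function (raw) {
    var line = raw.trim();
    if (!line) return;                 // skip empty lines
    var f = parseCsvLine(line);
    var key = f[5] + "|" + f[6];
    if (!byKey[key]) {
      byKey[key] = { target: f[5], line: f[6], stations: [] };
    }
    byKey[key].stations.push({
      seq: parseInt(f[1], 10),
      id: f[0],
      name: f[8],
      loc: { lng: parseFloat(f[10]), lat: parseFloat(f[9]) }
    });
  });
  return Object.keys(byKey).map(function (key) {
    var doc = byKey[key];
    // order the stops by their position, then drop the helper field
    doc.stations.sort(function (a, b) { return a.seq - b.seq; });
    doc.stations = doc.stations.map(function (s) {
      return { id: s.id, name: s.name, loc: s.loc };
    });
    return doc;
  });
}
```

Writing each returned document with JSON.stringify on its own line gives a file in the one-document-per-line shape that mongoimport reads by default.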

It was already 5am, but from then on everything went smoothly. I imported the JSON into a Mongo instance hosted at MongoLab, an add-on you can get from Heroku (mongoimport -d mongo -c veebibi converted.json), and put an index on the stations’ loc fields:

db.veebibi.ensureIndex( { "stations.loc" : "2d"} )

That makes querying lines (!) by position as simple as:

db.collection("veebibi").find({
 // legacy coordinate pairs are [lng, lat], the same order as in the stored loc objects
 "stations.loc": { "$near": [10.447311, 54.036022] }
});

This yields an array of up to 100 lines including all their stops, which is the perfect foundation for Veebibi since it’s exactly what we want. Since routes are stored more than once (a bus line might fork depending on the time of day and runs in both directions), I “consolidated” the response data by picking, for each line, the variant with the most stops.
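The consolidation step is simple enough to sketch. The following is a minimal illustration rather than the code that runs on Veebibi: the function name consolidateLines is made up, and it assumes that variants of the same route share the same line field, as in the example document above.

```javascript
// Hypothetical sketch: keep only the variant with the most stops per line name.
// Each entry of `lines` is assumed to look like the example document
// ({ target, line, stations: [...] }).
function consolidateLines(lines) {
  var best = {};
  lines.forEach(function (doc) {
    var current = best[doc.line];
    if (!current || doc.stations.length > current.stations.length) {
      best[doc.line] = doc;            // first variant seen, or a longer one
    }
  });
  return Object.keys(best).map(function (name) {
    return best[name];
  });
}
```

If two variants have the same number of stops, the one that came first in the query result wins. That is an arbitrary choice, but good enough for drawing a line on a map.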

The frontend is a piece of cake since everything’s just JSON. We let your browser acquire your current position (navigator.geolocation.getCurrentPosition), send it to our backend and transform the result coordinates into Google Maps polylines, one for each returned line:

this.locator.findLines(latlng, function(lines) {
 // note: self is expected to be defined outside this excerpt (e.g. var self = this)
 var polyOptions = {
   strokeOpacity: 1.0,
   strokeWeight: 3
 };
 _.each(lines, function(line) {
   // _.random(max) includes max, so the upper bound has to be length - 1
   polyOptions.strokeColor = VB.Frontend.COLORS[_.random(VB.Frontend.COLORS.length - 1)];
   var poly = new google.maps.Polyline(polyOptions);
   poly.vbbLine = line; //remember the line document on the polyline
   google.maps.event.addListener(poly,'click', function(e) {
      alert(this.vbbLine.line); //show line details to the user
   });
   self.polyLines.push(poly); //prepare polylines for removal on next click
   poly.setMap(self.gmap);
   var path = poly.getPath();
   //add one point per stop, in route order
   _.each(line.stations, function(station) {
      var latLng = new google.maps.LatLng(station.loc.lat, station.loc.lng);
      path.push(latLng);
   });
 });
});

And that’s what you see when you click on a map on Veebibi. Interested readers will notice the usage of underscore iterators and the Google Maps V3 API.
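The geolocation part mentioned above isn’t shown in the snippet, so here is a minimal sketch of how it could look. The function name locateUser is made up, and the geolocation object is passed in as a parameter (in the page it would be navigator.geolocation) so the function can also be tried outside a browser. The timeout value is an arbitrary assumption.

```javascript
// Hypothetical sketch: ask the browser for the current position and hand
// it on as a plain { lat, lng } object.
function locateUser(geolocation, onLatLng, onError) {
  if (!geolocation) {
    onError(new Error("Geolocation is not available"));
    return;
  }
  geolocation.getCurrentPosition(
    function (position) {
      onLatLng({
        lat: position.coords.latitude,
        lng: position.coords.longitude
      });
    },
    onError,
    { enableHighAccuracy: false, timeout: 10000 } // assumed settings
  );
}
```

In the page this would be called as locateUser(navigator.geolocation, ...), with the first callback passing the position on to the backend lookup.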

[Image: veebibi_berlin]

While I was hacking the core of all that stuff, our team member Gabriel (I never remembered his name on location, now I can) spent some hours writing most of the “frontend” you see when visiting the page for the first time. He used Backbone.js for many elements and tried to make everything normalized and responsive. Here are some lessons he learned while working with me:

1. You should not do git push origin master if it’s not working well. Instead, push a branch that the maintainer can merge. The “real way” is actually: fork the project, push to your fork’s master and create a pull request for the maintainer on the root project.

2. You don’t put CSS information inside your main file. All <style> is evil.

3. The JavaScript mongodb-native driver doesn’t compile on Windows. At least not at 3am.

4. You should configure your git to not ask for username and password every time. If you reject that advice, be sure that you don’t accidentally push a new publicly visible branch when doing: git push origin my-branchgabi@somedomain.com-pA22w0rD . It’s very easy to forget pressing enter at 6am with no sleep.

Our fourth colleague Alexander did research on the Google Maps API in the meantime. Unfortunately the results he came up with didn’t make it into the final code, but he found that it’s pretty simple to make polylines follow actual streets. If you have a look at the Veebibi output you’ll notice that bus routes are assembled out of straight lines. Usually buses don’t go right across the Tiergarten lawn, so this obviously can be improved. He sent me this Gist around midnight. It describes how you can use the waypoints option to let Google Maps render a correct route along streets. For buses that might not be 100% exact but it’s totally sufficient to render a nice view.
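I’m not reproducing Alexander’s Gist here. As a rough sketch of the idea, assuming the Directions service of the Google Maps API with its waypoints option, the request for one line could be built like this. The function name buildDirectionsRequest and the maxWaypoints parameter (expected to be at least 1) are made up for illustration. The service only accepts a limited number of waypoints per request, so the sketch thins out the intermediate stops.

```javascript
// Hypothetical sketch: turn the stops of one line into a request object for
// the Google Maps Directions service. First and last stop become origin and
// destination, a thinned-out selection of the stops in between become waypoints.
function buildDirectionsRequest(stations, maxWaypoints) {
  if (stations.length < 2) {
    throw new Error("need at least two stations");
  }
  var toLatLng = function (s) {
    return { lat: s.loc.lat, lng: s.loc.lng };
  };
  var inner = stations.slice(1, -1);
  // take every step-th intermediate stop so that at most maxWaypoints remain
  var step = Math.max(1, Math.ceil(inner.length / maxWaypoints));
  var picked = inner.filter(function (station, index) {
    return index % step === 0;
  });
  return {
    origin: toLatLng(stations[0]),
    destination: toLatLng(stations[stations.length - 1]),
    waypoints: picked.map(function (s) {
      // stopover: false marks a point the route passes through
      return { location: toLatLng(s), stopover: false };
    }),
    travelMode: "DRIVING" // same value as google.maps.TravelMode.DRIVING
  };
}
```

In the browser the request would be handed to new google.maps.DirectionsService().route(request, callback) and the result drawn with a google.maps.DirectionsRenderer instead of the straight polyline.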

Our fifth colleague Zachary, who came a long way from Ohio to join us at AngelHack BER (just kidding, he’s in the city for his studies), was taking care of an idea that Gabriel came up with: since the bus lines are pretty uninteresting at first glance, why not spice up the view with a heatmap that’s rendered from a Twitter search result for popular / trending hashtags (e.g. “#party” or “#bbq”), so you know where to go once you’ve figured out how to get there? We actually call that the “party mode” component of Veebibi: buy a beer, get on a bus and head to a party. In Berlin that can be really fun :)

Unfortunately we never integrated that stuff, but Zachary did an amazing job analysing Google’s Fusion Tables concept, which can be used to generate data sources for map overlays with a huge amount of location data. In our case we could simply have used the standard way of doing things (for a limited set of data the Google Maps API alone is sufficient to render heatmaps).

The Pitch

I was the lucky one to pitch the project on stage, using a presentation assembled by Alexander, and I made clear in the first second that this wasn’t going to be the next “We have a brilliant business concept and here is how to make money with it” pitch. It’s simply a product of some productive minds that used a day and a night to hack the shoot out of their brains. The audience cheered when they saw that you can actually travel from Berlin to Stralsund just using public transit lines, so I’m glad we achieved our goal: we made something to make people cheer!

[Image: veebibi_stralsund]

[tweet https://twitter.com/picsoung/status/331035922592301057 ]

Thank You All!

So I can only finish this article with an especially grateful “Thank You!” to Alexander from Westech Ventures, who honored our team’s effort with a special prize for “an idea that could possibly grow to a business”. The core idea of using GTFS data to build up a global transit information system is definitely not unique but could lead to a B2B approach that works worldwide. The way we used the data is far from production-ready but we showed that it’s absolutely possible. So we got away with 4 Chinese Android tablets; imho more than we could’ve expected.

I’d like to thank Robert, Gabriel, Alexander (and your girlfriend: thanks for the logo 😉 ) and Zachary for making this possible! Not to forget the orga team of ImmobilienScout24 / You Is Now, who offered a brilliant location for the hackathon and did a great job feeding us the whole time (I won’t have donuts for the next couple of months!).

PS: don’t forget to visit Qurate and contribute your impressions. And tell your own story if you want to 🙂
