Feb 28, 2015

Building Waze for the Boston subway: my first adventure in civic hacking

Last weekend I attended CodeAcross Boston 2015, a civic innovation hackathon hosted by Code for Boston as part of the national CodeAcross weekend.

Having been driven insane by the wintry woes of Boston’s public transit system, I came to the event with an idea: build “Waze for the T”, a crowdsourced alerting system to let commuters directly warn each other of delays, crowded platforms, and other problems on the subway. (The T is Boston’s nickname for the public transit system.)

I joined forces with two coworkers from Panorama Education, and over a weekend we built out the idea as a mobile web app using Meteor. We affectionately called it MBTA Ninja. (As weird as the new gTLDs are, .ninja domains are a gold mine for hackathons!) We weren’t really expecting anyone besides our coworkers and friends to use the app at first…

…and then it went viral on Twitter.

The last week has been a whirlwind. In just five days, 14,000 people have visited the site over 21,000 times. It’s been featured on various news sites, and we even got interviewed for local TV news on Thursday—the clip does a great job of telling the story.

In thinking over this experience, I’ve come away with two main thoughts.

1. Creating new data sources is important.

When I hear the term “open data”, I tend to think of governments opening up existing data sources so that they can be more readily accessed, or mashed up in interesting ways. Boston has a respectable list of hundreds of open datasets, as well as a fantastic transit data API that now powers dozens of apps. As expected, we saw some innovative projects at CodeAcross Boston using this type of data, including tools for analyzing building permits and 311 service requests.

MBTA Ninja doesn’t fit into that category though. Rather than relying on existing data, the site actually creates a new data source based on decentralized reporting from people on the ground who are seeing the reality of a situation. This might seem counterproductive—isn’t the problem these days that we have too much data to sort through? There are certainly domains where that’s the case, but there are still dark spots out there where we could benefit from having way more data.

At Panorama Education, the startup I work at, we’re helping schools incorporate feedback surveys so that they can better understand what students are experiencing behind classroom doors and on the playground. Another example is Ushahidi, which is an incredible open platform for crowdsourced reporting that has been used for things like tracking child deaths in Syria and tracking the status of water wells in Afghanistan.

My favorite project at CodeAcross Boston was also oriented around creating a new data source. The founder of a nonprofit called Foster Skills created Rate My Foster Home, which would enable kids to fill out online surveys about their foster care conditions.

I think these examples show that there are still tons of places—ranging from as trivial as a train platform to as important as a foster home—where we can benefit from more data. I loved this tweet from the hackathon, and it made me want to work on a meatier problem next time around:

When building new data sources, it’s also important to realize the difference in value between unstructured and structured data. People were already tweeting up a storm about the subway (I know because I would check every morning), but it was hard to interpret that data because it was completely unstructured. MBTA Ninja requires people to submit alerts using specific categories, and also enables them to upvote existing alerts rather than creating their own independent tweets. It turns out that imposing a domain-specific structure makes the data far more actionable in this case.
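To make the structured-versus-unstructured point concrete, here is a minimal sketch of what a categorized, upvotable alert could look like. It is not MBTA Ninja's actual schema or code: the category names, field names, and the `AlertBoard` class are all hypothetical, chosen only to illustrate the idea.

```typescript
// Hypothetical sketch: these categories and fields are assumptions made
// for illustration, not the real MBTA Ninja data model.
type AlertCategory = "delay" | "crowded_platform" | "disabled_train" | "other";

interface Alert {
  id: number;
  line: string;            // e.g. "Red"
  station: string;         // e.g. "Park Street"
  category: AlertCategory; // fixed set of choices instead of free text
  upvotes: number;         // confirmations from other riders
}

class AlertBoard {
  private alerts: Alert[] = [];
  private nextId = 1;

  // Create a new alert in one of the fixed categories.
  report(line: string, station: string, category: AlertCategory): Alert {
    const alert: Alert = { id: this.nextId++, line, station, category, upvotes: 0 };
    this.alerts.push(alert);
    return alert;
  }

  // Confirm an existing alert; returns false if the id is unknown.
  upvote(id: number): boolean {
    const alert = this.alerts.filter(a => a.id === id)[0];
    if (!alert) return false;
    alert.upvotes += 1;
    return true;
  }

  // Alerts sorted with the most-confirmed first (returns a copy).
  ranked(): Alert[] {
    return this.alerts.slice().sort((a, b) => b.upvotes - a.upvotes);
  }
}
```

Because every alert has a category and a confirmation count, questions like "what is the most-confirmed problem right now?" become a sort rather than a text-mining exercise.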

On their page about tracking wells in Afghanistan, Ushahidi makes a similar point:

asking the right questions of what’s required goes a long way in removing the burden of inefficiency around processing unstructured information (what you will most likely receive using SMS or social media channels).

I’m sure there are many domains that could benefit from more structured reporting, rather than trying to parse unstructured Twitter data. The problem, though, is actually getting people to use these mechanisms. This is a challenge that applies to MBTA Ninja, since the utility of the app is entirely reliant on people actually using it. We initially thought we might have to incentivize contributions somehow, but so far we’ve had tons of alerts, and popular ones have been getting 30 or 40 confirmations.

I think people are getting enough value out of the app that they’re willing to contribute by reporting and upvoting since it’s a really easy process. There also seems to be some aspect of community solidarity involved—just the act of reporting or upvoting an alert might be an outlet for frustration. While other apps that create new data sources might require more thorough data entry than MBTA Ninja, I think the general principles still apply: clearly demonstrate the value of the data to users, and make it as easy as possible to contribute.

2. Open source is da bomb

My second takeaway is that the open source movement is the best thing that’s ever happened to the field of software development.

Thought exercise: imagine telling someone back in the year 2000 to build a website that provides an interface for reporting and viewing transit events, syncing everything in realtime to hundreds of mobile clients simultaneously. Of course it needs to have an intuitive, mobile-friendly UI that anyone can instantly use. No page refreshing obviously, because that’s annoying. Oh yeah, and it needs to go from zero to deployed in the cloud in just two days.

In 2015, this has somehow, unbelievably, become nearly trivial. Using Meteor for the realtime backend, MaterializeCSS for the frontend components, and Heroku for cloud deployment[1], it was just a matter of combining some parts in the right way to realize our idea. Talk about standing on the shoulders of giants.

We also open sourced our code (GitHub repo) and have already reaped the benefits of community contributions, with five pull requests from people outside the original team who just wanted to help out.

Among the countless ways that people have found to give back to the world, open source software feels special in terms of the leverage that an individual contribution can have. The crazy thing about the open source community is that such a large percentage of it is dedicated to producing better tools (from operating systems to programming languages to libraries), which often are in turn used to create even better tools. Intuitively, this type of ecosystem seems like it should lead to exponential progress, and we’re still in the early days of seeing that curve play out.

In conclusion…

In a world where political systems grow more gridlocked every day, and much of Silicon Valley is focused on peddling ads, the civic innovation and open data movements are a bright and optimistic exception to the zeitgeist. When the noble goals and pervasive reach of governments combine with the freewheeling innovation of the tech world, I’m confident that amazing things can happen.

I couldn’t have asked for a better first experience, and I’m excited to stay active in this space. If you live in Boston, you should come by a Code for Boston meetup sometime, and join dozens of people who are interested in hacking the future.



  1. Heroku itself isn’t open source, but is built on open-source foundations.