Technology Blog 1 – The MTA API (and beyond!)


No matter where you live in this great city, there is one irrefutable truth: you are at the mercy of the MTA. The trains may run on time or not run at all, and on rare occasions you can get from Astoria to Coney Island in 35 minutes (it’s happened to me) or you can be stuck in a tunnel with no cellphone service for 45.

Though the MTA is a cruel mistress, it has in recent years been forthcoming with information on almost every aspect of the transit system, in the form of an API.


[Image: WeWork's subway countdown clock]
Anyone who has been up to the 7th floor has seen the neat train countdown clock by the elevators. I saw that, looked on the MTA mobile website for such a feature, clicked around, and couldn’t find what I was looking for (I found it easily on the desktop site)… what if I looked into accessing the MTA API to get what I wanted?

The MTA API is in the GTFS format (more on that later), and I am looking specifically at train arrival times, so I want the realtime data (it’s updated every 30 seconds).

[Image: the feed I'm looking for]

Connecting to MTA…

In order to get the GTFS feed from the MTA, you need to ask for an API key. The API key gives the MTA a way to see who is using their data, and how often they are directing traffic to the MTA’s servers.
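A rough sketch of keeping the key out of the source file and attaching it to a feed URL. The host, path, and parameter names (`key`, `feed_id`) are placeholders made up for illustration, not the MTA's real endpoint, and `MTA_API_KEY` is just a suggested environment variable name. Use whatever URL the MTA hands you at signup.

```ruby
require 'uri'

# Build a feed URL with the API key as a query parameter.
# base_url and the parameter names are hypothetical placeholders.
def feed_uri(base_url, api_key, feed_id)
  uri = URI.parse(base_url)
  uri.query = URI.encode_www_form(key: api_key, feed_id: feed_id)
  uri
end

# Read the key from the environment so it never lands in a public repo.
api_key = ENV.fetch('MTA_API_KEY', 'YOUR_KEY_HERE')
puts feed_uri('https://example.com/realtime-feed', api_key, 1)
```

Reading the key from the environment means the same script can be pushed to GitHub without leaking it.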

It’s a violation of the MTA’s TOS to send queries directly to the MTA servers. You’re required to download the feeds and host them on your own website, which is something to know if you have an app that’s pinging their servers for each of your 40,000 users. But for our purposes, we can ignore all that!

Once you get your API key (it’s fairly simple, you really just need a US address) you can get your data and start to use it… right? Wrong!
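For an app that can't ignore it, the usual trick is to fetch the feed once and hand the same copy to everyone until it goes stale. Here is a minimal sketch, assuming a 30-second lifetime to match the feed's update interval. `FeedCache` is a hypothetical helper, not part of any MTA library, and the block passed to it stands in for the real download.

```ruby
# A tiny cache: run the expensive fetch at most once per `ttl` seconds.
# FeedCache is a hypothetical helper written for this illustration.
class FeedCache
  def initialize(ttl: 30, clock: -> { Time.now }, &fetcher)
    @ttl = ttl            # seconds a fetched copy stays fresh
    @clock = clock        # injectable so the cache can be tested without sleeping
    @fetcher = fetcher    # block that actually downloads the feed
    @fetched_at = nil
    @data = nil
  end

  def get
    now = @clock.call
    if @fetched_at.nil? || now - @fetched_at >= @ttl
      @data = @fetcher.call
      @fetched_at = now
    end
    @data
  end
end

calls = 0
cache = FeedCache.new(ttl: 30) { calls += 1; "feed bytes" }
cache.get
cache.get
puts calls  # the fetch block ran only once for two reads
```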


GTFS, originally called Google Transit Feed Specification, and now called General Transit Feed Specification, is a data format originally developed by (guess who?) Google to allow for improvements to Google Maps.

In the 12 years since its original release, GTFS has been adopted by hundreds, if not thousands, of public transportation networks worldwide.


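A static GTFS feed (schedules, stops, routes) comes as a .zip file containing a number of simple .txt files. The realtime feed I'm after is a different animal: it's a binary Protocol Buffers message, which is why decoding it takes a bit more work.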
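Those .txt files are plain comma-separated tables, so Ruby's standard `csv` library can read them. A small sketch, assuming a `stops.txt` with `stop_id` and `stop_name` columns (both are real GTFS field names). The rows are made-up sample data: only the F34S / Avenue P pairing comes from this post.

```ruby
require 'csv'

# Made-up sample in the shape of a GTFS stops.txt file.
# Only the F34S / Avenue P row reflects something from this post; the rest is filler.
STOPS_TXT = <<~CSV
  stop_id,stop_name
  F34S,Avenue P
  X99N,Example St
CSV

# Map each stop_id to its human-readable name.
stop_names = CSV.parse(STOPS_TXT, headers: true).each_with_object({}) do |row, names|
  names[row['stop_id']] = row['stop_name']
end

puts stop_names['F34S']
```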

We’re working in Ruby, so we need a way for Ruby to interpret this GTFS feed… we can either
a) code something ourselves, or…
b) see if there are any resources out there to help

Getting HALP!

Although my search turned up some promising gems, they weren’t what I was really looking for… I am looking to decode realtime feeds specifically. Fortunately, some really nice guy did some work on this very topic and posted a public repo on GitHub, GOING YOUR WAY!


A quick look at the documentation and we see a very friendly example. We can now start writing our app.

require 'protobuf'
require 'google/transit/gtfs-realtime.pb'
require 'net/http'
require 'uri'

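I got all that just from the example. And also

# the feed URL (with your API key) goes where the placeholder is
data = Net::HTTP.get(URI.parse("URL OF YOUR GTFS-REALTIME SOURCE GOES HERE"))
feed = Transit_realtime::FeedMessage.decode(data)
for entity in feed.entity do
  if entity.field?(:trip_update)
    p entity.trip_update
  end
end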

The Train Is Pulling Into The Station!!!!!!!!

Now we’re finally getting somewhere. I don’t exactly know what this does, but I know it’s a great start, and I plug in all the info that I can. That includes the URI from the MTA with my API key. I run the file and I get CRIT BY WALL OF TEXT (the console colors are pretty though, right?)
[Image: wall of API text]
Now this is where a more experienced programmer might be able to save a lot of time, but for me, I now entered experimentation mode…

It’s some kind of array or hash, and the example is looping over it with for entity in feed.entity do.
If I want just the first entry of this GTFS thing, I’ll pop a return in the loop to break me out of it so I can see what I’m dealing with…

for entity in feed.entity do
  if entity.field?(:trip_update)
    p entity.trip_update
    return # bail out after the first trip update
  end
end

I end up with:

# route_id="B" direction_id=0> stop_time_update=[# departure=# stop_id="D35S" schedule_relationship=#>, # departure=# stop_id="D39S" schedule_relationship=#>, # departure=# stop_id="D40S" schedule_relationship=#>] vehicle=nil timestamp=0 delay=0>

Well, shit… that’s ugly as hell, but at least it’s manageable. Maybe I can reorganize it in a text editor and see what’s what.

# route_id="B" direction_id=0>

# stop_time_update=
# [#
# departure=#
# stop_id="A14N" schedule_relationship=#>, # departure=# stop_id="D13N" schedule_relationship=#>] vehicle=nil timestamp=0 delay=0>

The Train, Boss, The Train!!!

Wow! now, that’s something waaaay more clear. All I have to do is look at the MTA documentation and figure out what means what, and bingo presto, I’m well on my way to being done!

I want This: stop-id

For once, I find the MTA’s documentation helpful, and I can surmise that the southbound F train station at Avenue P is going to be F34S… F for F train, 34 for Avenue P, and S for southbound (don’t ask me why I only need southbound… well, ok, it’s because there’s only southbound service at Avenue P… I’m not going to get into that).
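That reading of the ID can be written down as a tiny parser. This is only a sketch: the pattern (a station code followed by N or S) is inferred from the IDs that show up in this post, not taken from an official spec, and `parse_stop_id` is a hypothetical helper name.

```ruby
# Split a stop_id like "F34S" into its station code and direction.
# The pattern is inferred from the IDs seen in this post, not from an official spec.
DIRECTIONS = { 'N' => :northbound, 'S' => :southbound }.freeze

def parse_stop_id(stop_id)
  # Lazy station match so a trailing N or S is captured as the direction.
  match = /\A(?<station>[A-Z0-9]+?)(?<dir>[NS])?\z/.match(stop_id)
  return nil unless match

  { station: match[:station], direction: DIRECTIONS[match[:dir]] }
end

p parse_stop_id('F34S')
```

An ID with no trailing N or S comes back with a `nil` direction, and anything that doesn't fit the pattern returns `nil`.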

Looking at what I organized above, it looks like stop_id is inside stop_time_update, which is inside trip_update.

After some fiddling, I was able to write a functional iteration to get all the approaching F trains at Ave P.
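To make that nesting concrete, here is a sketch that models it with plain Structs so it runs without the protobuf gem. The Struct names mirror the feed's field names but are stand-ins, `arrival_times` is a hypothetical helper, and the timestamps are made-up small numbers. With the real gem you would likely keep the `field?(:trip_update)` check instead of relying on `nil`.

```ruby
# Stand-in structs that mimic the nesting of the decoded feed:
# entity -> trip_update -> stop_time_update -> arrival.time
StopTimeEvent  = Struct.new(:time)
StopTimeUpdate = Struct.new(:stop_id, :arrival)
TripUpdate     = Struct.new(:stop_time_update)
Entity         = Struct.new(:trip_update)

# Collect the arrival times for one stop, earliest first.
def arrival_times(entities, stop_id)
  entities
    .map(&:trip_update)
    .compact                                   # skip entities with no trip update
    .flat_map(&:stop_time_update)              # one flat list of stop updates
    .select { |upd| upd.stop_id == stop_id && upd.arrival }
    .map { |upd| upd.arrival.time }
    .sort
end

# Made-up sample data, not real feed output.
entities = [
  Entity.new(TripUpdate.new([
    StopTimeUpdate.new('F34S', StopTimeEvent.new(2000)),
    StopTimeUpdate.new('F35S', StopTimeEvent.new(2100))
  ])),
  Entity.new(nil),
  Entity.new(TripUpdate.new([StopTimeUpdate.new('F34S', StopTimeEvent.new(1500))]))
]

p arrival_times(entities, 'F34S')
```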

for entity in feed.entity do
  if entity.field?(:trip_update)
    for upd in entity.trip_update.stop_time_update do
      if upd.stop_id == "F34S"
        p upd.arrival.time
      end
    end
  end
end

and I get….


what the fuck is that?

For me, a complete beginner, this is a moment of pure frustration, because I don’t know what the heck this is. This time DuckDuckGo gives me some big time halp!

a search for “ten-digit time ruby” leads me to

i browse through and find this handy bit… #=> 1989-11-28 00:00:00 -0500

Now if you’re going to take anything away from this blog, take this away. This 10-digit and 13-digit time is a thing: it’s the time elapsed since the EPOCH, midnight UTC on January 1, 1970, counted in seconds (10 digits) or milliseconds (13 digits). You’ll probably see this again and again. So REMEMBER….
[Image: Spock makes Bones remember]
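A quick sketch of turning both flavors into something readable with Ruby's built-in `Time.at`. The timestamp below is an arbitrary sample value, not one from the MTA feed.

```ruby
# 10 digits: seconds since 1970-01-01 00:00:00 UTC.
seconds = 1_500_000_000
puts Time.at(seconds).utc            # 2017-07-14 02:40:00 UTC

# 13 digits: the same instant in milliseconds, so divide by 1000 first.
millis = 1_500_000_000_000
puts Time.at(millis / 1000).utc
```

Calling `.utc` just pins the output to one time zone. Without it, `Time.at` prints in whatever zone your machine is set to.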

a handy little adjustment to my code…

# wrap the raw number in Time.at to get a readable time
p Time.at(upd.arrival.time) if upd.stop_id == "F34S"

and VOY-LUH! I got my arriving trains!!!!

[Image: arriving train timetable]

It’s done… for now… I can now build on this to get all sorts of data… different stations, different subway lines, you name it!
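One obvious next step is the thing that started all this: showing "minutes away" like the countdown clock by the elevators. A minimal sketch, assuming you already have a list of epoch arrival times for your stop. `minutes_away` is a hypothetical helper and the timestamps are made up.

```ruby
# Turn epoch arrival times into "minutes away", countdown-clock style.
# Trains that have already left (negative wait) are dropped.
def minutes_away(arrival_epochs, now = Time.now)
  arrival_epochs
    .map { |t| ((t - now.to_i) / 60.0).floor }
    .reject(&:negative?)
    .sort
end

now = Time.at(1_500_000_000)                              # arbitrary sample "current time"
arrivals = [1_500_000_090, 1_500_000_600, 1_499_999_000]  # made-up arrival timestamps
p minutes_away(arrivals, now)                             # [1, 10]
```

Passing `now` in as an argument, instead of calling `Time.now` inside, keeps the helper easy to check against known values.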

That took waay too long, but I’m super glad I did it!