glow.mozilla.org: smoke and mirrors, and RESTful design

When I was a kid, my aunt gave me a book called The Art of Engineering. The title sounded weird to me at first – isn’t engineering the opposite of art?

It’s not – artful design can be visible in the best pieces of software, and not only at the user interface level. I find the realtime display of Firefox 4 downloads by glow.mozilla.org fascinating, and being my curious self I wondered how the data is transferred.

Starting with the requirement of broadcasting real-time data to millions of clients simultaneously, many of us would end up with expensive message queuing systems, RPC, WebSockets, SOAP^H^H^H^H (not SOAP – don’t make me cry). Lots of fun ways to add some powers of ten to your budget.

Don’t believe anyone who tells you that software has to be complicated, or that engineering cannot be artful. Simplicity always wins, and glow.mozilla.org is an excellent example of that.

The first thing that I noticed when looking at how glow gets its data (which was very easy thanks to the use of sane HTTP/JSON requests) is that glow is not real-time.

I’d call it smoke and mirrors real-time: the client just requests a new batch of data points every minute, and the server can change this interval at any time, which can be very handy if traffic increases. Fetching slightly old data every minute is more than enough for a human user who doesn’t care if the data is a bit outdated, and it makes the system a bit simpler.

The client makes two regular data requests. The first is to a URL like http://glow.mozilla.org/data/json/2011/03/21/14/42/count.json. The path already tells you a lot about what this is, which, although not required, is often a sign of a good RESTful design.
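To make the URL structure concrete, here is a minimal Python sketch that builds such a path from a timestamp. It assumes the path segments are year, month, day, hour and minute, which is only what the example URL suggests; BASE_URL and batch_url are names made up for this illustration, not anything taken from the glow code.

```python
from datetime import datetime

# Assumption: everything before the date segments is a fixed prefix,
# inferred from the example URL in this post.
BASE_URL = "http://glow.mozilla.org/data/json/"


def batch_url(moment: datetime, kind: str = "count") -> str:
    """Build <base>/YYYY/MM/DD/HH/MM/<kind>.json for the given minute."""
    return BASE_URL + moment.strftime("%Y/%m/%d/%H/%M/") + kind + ".json"


print(batch_url(datetime(2011, 3, 21, 14, 42)))
```

With a URL scheme like this, each minute maps to exactly one resource, so anyone reading the path can guess what it contains.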

The response contains an array of data points (number of downloads per minute), along with two very important items that control the data transfer:

{
   "interval":60,
   "data":[
      [
         [
            2011,3,21,13,43
         ],
         1349755
      ],
      [
         [
            2011,3,21,13,44
         ],
         1350332
      ],
      ...
   ],
   "next":"2011/03/21/14/43/count.json"
}

The interval tells the client when to ask for data next, and the next item is the path to the next batch of data. At least that’s what I assume; I haven’t checked the client code in detail, but it seems obvious.
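Here is a rough Python sketch of how a client might read such a response, under the assumptions made above: interval is a number of seconds, and next is a path relative to a base URL of http://glow.mozilla.org/data/json/. The base URL and the name parse_batch are my own guesses for illustration; the sample body contains only the two data points shown above.

```python
import json
from datetime import datetime
from urllib.parse import urljoin

# Assumption: "next" is resolved against this prefix, inferred from the
# example URL in this post.
BASE_URL = "http://glow.mozilla.org/data/json/"


def parse_batch(body: str, base_url: str = BASE_URL):
    """Return (data points, interval in seconds, absolute URL of next batch)."""
    doc = json.loads(body)
    # Each data point is [[year, month, day, hour, minute], count].
    points = [(datetime(*stamp), count) for stamp, count in doc["data"]]
    return points, doc["interval"], urljoin(base_url, doc["next"])


sample = """{
  "interval": 60,
  "data": [[[2011, 3, 21, 13, 43], 1349755], [[2011, 3, 21, 13, 44], 1350332]],
  "next": "2011/03/21/14/43/count.json"
}"""

points, interval, next_url = parse_batch(sample)
print(interval, next_url)
```

Note that the client never computes the next path itself: it just follows whatever the server hands back.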

Using URLs and data that seem obvious is the essence of the Web, and of a good RESTful design. Using RPC, WebSockets or any other supposedly more sophisticated mechanism would bring nothing to the user, and would only make things more complicated. Being able to throttle data requests from the server-side using the interval and next items is very flexible, obvious, and does not require any complicated logic on the client side.
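To show how little client logic this needs, here is a minimal polling loop in Python. It is a sketch of the mechanism as I understand it, not the actual glow client: BASE_URL, http_get and poll are hypothetical names, and the loop simply waits for the full interval before fetching the next batch.

```python
import json
import time
from typing import Callable, Iterator, Optional
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumption: prefix inferred from the example URL in this post.
BASE_URL = "http://glow.mozilla.org/data/json/"


def http_get(url: str) -> str:
    """Fetch a URL and return the body as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def poll(
    start_path: str,
    fetch: Callable[[str], str] = http_get,
    sleep: Callable[[float], None] = time.sleep,
    max_batches: Optional[int] = None,
) -> Iterator[list]:
    """Yield one batch of data points per request, following next/interval."""
    url = urljoin(BASE_URL, start_path)
    fetched = 0
    while max_batches is None or fetched < max_batches:
        doc = json.loads(fetch(url))
        fetched += 1
        yield doc["data"]
        # The server decides both where to look next and how long to wait.
        url = urljoin(BASE_URL, doc["next"])
        sleep(doc["interval"])
```

Because fetch and sleep are passed in, the loop can be exercised without touching the network. If the server starts returning a larger interval, the client slows down on its very next request, with no change to the client code.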

The second data URL looks like http://glow.mozilla.org/data/json/2011/03/21/14/42/map.json, and if my quick analysis is correct it returns geographic coordinates of the dots that represent geolocated downloads. It uses the same interval/next mechanism for throttling requests.

All in all, an excellent example of engineering smoke and mirrors applied in the right way, and of simple and clean RESTful design. No need for “sophisticated” tools when the use case doesn’t really require them. Kudos to whoever designed this!

Update: The Mozilla team has more details on their blog. Thanks to Alex Parvulescu for pointing that out.

5 Responses to glow.mozilla.org: smoke and mirrors, and RESTful design

  1. You’re right… now I come to think of it… I do agree that it’s clever that they can modify the fetch interval on the server side based on parameters such as the number of concurrent users. Nice to keep in mind for future use.

  2. Gregory says:

    The “whoever designed this” would be Jeff Balogh (backend; he did most of the stuff you talked about) and Matthew Claypotch (frontend; he did most of the stuff you see).

  3. ozten says:

    Excellent post on art vs engineering in practice. Anytime you see “Real-Time” you know it’s a lie. A computer is a digitized artifact with varying degrees of bias.

    Hats off to the creators for massaging an existing “close enough” stream of data into a beautiful and educational Real-Time illustration of Firefox 4 downloads.

  4. Jeff Balogh says:

    You nailed it. The interval tells the client how long the current blob of json should be played back, so we start trying to get the next URL halfway through the interval.

    I started out with grandiose plans of using node and redis to get really real time and use something like socket.io for pseudo-WebSockets, but I went for per-minute static JSON files in the interest of shipping something. We know how to scale serving static files and it’s easy to cache since a file is only written once.

    The next and interval keys were added so we didn’t have to manage file paths and date math in JavaScript. They let us grow or shrink the refresh rate, but the intention for that is to seamlessly move the client to faster updates once we figure out how to get closer to real-time. I wasn’t considering the scalability aspect of it.

    Thanks for this write-up. It’s fascinating to see someone else’s take on the design, especially since we stumbled upon it without a lot of forethought.

  5. bdelacretaz says:

    @Jeff thanks for your insights! Sounds like one of those “do a lot of thinking to come up with something simple” projects – those designs are the ones that tend to stick.
