glow.mozilla.org: smoke and mirrors, and RESTful design

22 March 2011 mozilla simplicity web

When I was a kid, my aunt gave me a book called the art of engineering. The title sounded weird to me at first - isn't engineering the opposite of art?

It's not - artful design can be visible in the best pieces of software, and not only at the user interface level. I find the realtime display of Firefox 4 downloads by glow.mozilla.org fascinating, and being my curious self I wondered how the data is transferred.

Starting with the requirement of broadcasting real-time data to millions of clients simultaneously, many of us would end up with expensive message queuing systems, RPC, WebSockets, SOAP^H^H^H^H (not SOAP - don't make me cry). Lots of fun ways to add some powers of ten to your budget.

Don't believe anyone who tells you that software has to be complicated, or that engineering cannot be artful. Simplicity always wins, and glow.mozilla.org is an excellent example of that.

The first thing that I noticed when looking at how glow gets its data (which was very easy thanks to the use of sane http/json requests) is that glow is not real-time.

I'd call it smoke and mirrors real-time: the client just requests a new batch of data points every minute, and the server can change this interval at any time, which can be very handy if traffic increases. Fetching slightly old data every minute is more than enough for a human user who doesn't care if the data is a bit outdated, and it makes the system a bit simpler.

The first of these two regular data requests is to an URL like http://glow.mozilla.org/data/json/2011/03/21/14/42/count.json. The path already tells you a lot about what this is, which although not required is often a sign of a good RESTful design.

The response contains an array of data points (number of downloads per minute), along with two very important items that control the data transfer:

  {
     "interval":60,
     "data":\[
        \[
           \[
              2011,3,21,13,43
           \],
           1349755
        \],
        \[
           \[
              2011,3,21,13,44
           \],
           1350332
        \],
        ...
     \],
     "next":"2011/03/21/14/43/count.json"
  }

The interval tells the client when to ask for data next, and the next item is the path to the next batch of data. At least that's what I assume, I haven't checked the client code in detail but that seems obvious.

Using URLs and data that seem obvious is the essence of the Web, and of a good RESTful design. Using RPC, WebSockets or any other supposedly more sophisticated mechanism would bring nothing to the user, and would only make things more complicated. Being able to throttle data requests from the server-side using the interval and next items is very flexible, obvious, and does not require any complicated logic on the client side.

The second data URL looks like http://glow.mozilla.org/data/json/2011/03/21/14/42/map.json, and if my quick analysis is correct it returns geographic coordinates of the dots that represent geolocated downloads. It uses the same interval/next mechanism for throttling requests.

All in all, an excellent example of engineering smoke and mirrors applied in the right way, and of simple and clean RESTful design. No need for "sophisticated" tools when the use case doesn't really require them. Kudos to whoever designed this!

Update: The Mozilla team has more details on their blog. Thanks to Alex Parvulescu for pointing that out.