BitWorking

These are Joe Gregorio's writings (archives), projects and status updates.

Sparklines in data: URIs in Python

Sparklines, as defined by Tufte, are intense, simple, word-sized graphics. Kind of like this: [sparkline image]. I seem to have stumbled across them at just the right time, as I have a suite of regression tests that I add to daily. The result is a flood of information. I believe sparklines may be the answer to my information avalanche.

All of my regression scripts are written in Python and the output of those scripts is HTML. Embedding sparklines in that report output sounds like a perfect application of "data: URIs" [RFC 2397], which allow you to take small bits of data, like small images, and instead of serving them up separately, embed the data right into the URI. In this case, I'll generate PNG formatted sparklines, then encode them as data URIs that can be included directly in my HTML formatted regression test results. So I dashed off to find a sparklines module for Python.

I found none.

I did find one for PHP, http://sparkline.org, but that would not be Python now would it?

Not to be discouraged, I set off to load up the standard image manipulation package for Python.

Crud.

Explain to me again why Python doesn't have a standard image manipulation package?

I settled on Python Imaging Library (PIL).

Making the sparklines turned out to be incredibly easy, making me think that the reason there weren't any libraries is that it's just so easy to glue the right pieces together in Python that a library would be overkill. Here is the code:


import Image, ImageDraw
import StringIO
import urllib

def plot_sparkline(results):
   """Returns a sparkline image as a data: URI.
       The source data is a list of values between
       0 and 100. Values greater than 50
       are displayed in red, otherwise they are displayed
       in gray."""
   im = Image.new("RGB", (len(results)*2, 15), 'white')
   draw = ImageDraw.Draw(im)
   for (r, i) in zip(results, range(0, len(results)*2, 2)):
       color = (r > 50) and "red" or "gray"
       draw.line((i, im.size[1]-r/10-4, i, (im.size[1]-r/10)), fill=color)
   del draw

   f = StringIO.StringIO()
   im.save(f, "PNG")
   return 'data:image/png,' + urllib.quote(f.getvalue())

if __name__ == "__main__":
    import random
    html = """
    <html>
        <body>
            <p>Does my sparkline 
                <img src="%s"> 
            fit in a nice paragraph of text?
            </p>
        </body>
    </html>"""
    print html % plot_sparkline([random.randint(0, 100) for i in range(30)])

The example output is just a plot of 30 random values; you should put more meaningful data in there. All the work is done in plot_sparkline, which plots the data as 4-pixel-high bars in an image that is 15 pixels high. After the binary PNG image is generated, converting it to a data URI is staggeringly easy: it's done in the return statement of plot_sparkline, which percent-encodes the PNG bytes and appends them to 'data:image/png,'. The output of the script is a sample HTML file that demonstrates the sparkline embedded in the HTML.
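
The data: URI step can be sketched on its own. This is a rough Python 3 version (an assumption, since the code in this post is Python 2, where quote lives at urllib.quote). It wraps arbitrary bytes either percent-encoded, as the post does, or in the ";base64" form that RFC 2397 also defines. The helper names to_data_uri and from_data_uri are made up for this illustration.

```python
import base64
from urllib.parse import quote, unquote_to_bytes


def to_data_uri(payload, mime="image/png", use_base64=False):
    """Wrap raw bytes in a data: URI (RFC 2397)."""
    if use_base64:
        encoded = base64.b64encode(payload).decode("ascii")
        return "data:%s;base64,%s" % (mime, encoded)
    # Percent-encode every byte outside the unreserved set, "/" included.
    return "data:%s,%s" % (mime, quote(payload, safe=""))


def from_data_uri(uri):
    """Recover the raw bytes from a URI produced by to_data_uri."""
    header, _, body = uri.partition(",")
    if header.endswith(";base64"):
        return base64.b64decode(body)
    return unquote_to_bytes(body)


if __name__ == "__main__":
    sample = bytes(range(256))  # stand-in for real PNG bytes
    for flag in (False, True):
        uri = to_data_uri(sample, use_base64=flag)
        assert from_data_uri(uri) == sample  # both forms round-trip
        print("base64" if flag else "percent", len(uri))
```

Base64 grows the payload by a fixed third, while percent-encoding triples every byte outside the unreserved set, so for compressed binary data such as PNG the base64 form will usually come out shorter.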

The above code produces output that should look like:

Does my sparkline [sparkline image] fit in a nice paragraph of text?

Nota bene: If you are not able to see the image in the above text then that means that you are probably using Internet Explorer, which does not implement data URIs. You might want to get a better browser.

Update: Ooops, I forgot to give Anil proper credit for bringing sparklines to my attention.

Update 2: If you want a continuous plot instead of a series of tick marks it is easy enough to code up. Note that it even has the red dot on the last data point.


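def plot_sparkline2(results):
    """Returns a continuous sparkline as a percent-encoded PNG.
       Prepend 'data:image/png,' to get a data: URI."""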
    im = Image.new("RGB", (len(results)+2, 20), 'white')
    draw = ImageDraw.Draw(im)
    coords = zip(range(len(results)), [15 - y/10 for y in results])
    draw.line(coords, fill="#888888")
    end = coords[-1]
    draw.rectangle([end[0]-1, end[1]-1, end[0]+1, end[1]+1], fill="#FF0000")
    del draw 

    f = StringIO.StringIO()
    im.save(f, "PNG")
    return urllib.quote(f.getvalue())
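
A note for anyone trying this on Python 3 (an assumption; the post targets Python 2): there zip returns an iterator and / is true division, so coords[-1] and the integer pixel math both need small changes. A minimal sketch of just the coordinate mapping, using the hypothetical helper name sparkline_coords:

```python
def sparkline_coords(results, baseline=15, scale=10):
    """Map values in 0..100 to (x, y) pixel coordinates.

    Image y grows downward, so larger values get smaller y.
    """
    # list(...) makes the result indexable; // keeps the pixels integral.
    ys = [baseline - y // scale for y in results]
    return list(zip(range(len(results)), ys))


if __name__ == "__main__":
    coords = sparkline_coords([0, 50, 100, 42])
    print(coords)      # [(0, 15), (1, 10), (2, 5), (3, 11)]
    print(coords[-1])  # (3, 11), the point that gets the red dot
```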

Update 3: Highlighting the minimum point is a matter of adding two more lines, one to find the minimum, and another to plot a rectangle at that point. While Tufte has pointed out that sparklines are targeted at high resolution printing, there are advantages to working with them on the computer. For example, on a web page we can put the raw data into the title of the image, and it will be displayed when the mouse hovers over the sparkline. Try it out, hover your mouse over the image: [sparkline image] 1 78. Here is the code that generated that sparkline; it not only generates the 'img' element but also prints the minimum point and the last data value in colors that match the corresponding points in the sparkline.


def plot_sparkline3(results):
    im = Image.new("RGB", (len(results)+2, 20), 'white')
    draw = ImageDraw.Draw(im)
    coords = zip(range(len(results)), [15 - y/10 for y in results])
    draw.line(coords, fill="#888888")
    end = coords[-1]
    draw.rectangle([end[0]-1, end[1]-1, end[0]+1, end[1]+1], fill="#FF0000")
    min_pt = coords[results.index(min(results))]
    draw.rectangle([min_pt[0]-1, min_pt[1]-1, min_pt[0]+1, min_pt[1]+1], fill="#0000FF")
    del draw 

    f = StringIO.StringIO()
    im.save(f, "PNG")
    return """<img src="data:image/png,%s" title="%s"/> 
         <b style="font-size: 10pt;font-family: Verdana, Arial, Helvetica, sans-serif">
            <span style="color:#0000FF">%d</span> 
            <span style="color:#FF0000">%d</span>
         </b>""" % (urllib.quote(f.getvalue()), results, min(results), results[-1] )

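One thing to watch with the title trick is that the raw data ends up inside an HTML attribute. A hedged Python 3 sketch follows; the helper name sparkline_img_tag and the comma-separated title format are assumptions for illustration, not what the code above emits. It escapes the attribute values before building the tag:

```python
from html import escape


def sparkline_img_tag(data_uri, results):
    """Build an <img> whose title shows the raw values on hover."""
    title = ", ".join(str(v) for v in results)
    return '<img src="%s" title="%s"/>' % (
        escape(data_uri, quote=True),
        escape(title, quote=True),
    )


if __name__ == "__main__":
    # Prints: <img src="data:image/png,%89PNG" title="3, 1, 78"/>
    print(sparkline_img_tag("data:image/png,%89PNG", [3, 1, 78]))
```
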
Update 4: I am deeply impressed with the work being done on RedHanded, which not only implements sparklines in Ruby but also generates BMPs and PNGs from scratch. Wow.
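
To give a feel for what "from scratch" involves, here is a sketch of the same idea in Python 3 using only struct and zlib. It is not RedHanded's code (that is Ruby and is not reproduced here), and the names png_chunk, encode_png and tick_sparkline_png are invented for this example. It assumes a non-empty list of values between 0 and 100 and only approximates the tick style of plot_sparkline.

```python
import struct
import zlib

WHITE, RED, GRAY = (255, 255, 255), (255, 0, 0), (128, 128, 128)


def png_chunk(kind, data):
    """Serialize one chunk: length, type, data, CRC-32 of type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def encode_png(rows):
    """Encode a list of rows of (r, g, b) tuples as an 8-bit RGB PNG."""
    height, width = len(rows), len(rows[0])
    # IHDR: width, height, bit depth 8, color type 2 (truecolor),
    # then compression, filter and interlace methods, all 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(
        b"\x00" + bytes(channel for pixel in row for channel in pixel)
        for row in rows
    )
    return (
        b"\x89PNG\r\n\x1a\n"
        + png_chunk(b"IHDR", ihdr)
        + png_chunk(b"IDAT", zlib.compress(raw))
        + png_chunk(b"IEND", b"")
    )


def tick_sparkline_png(results, height=15):
    """Draw one 4-pixel tick per value (0..100), two pixels apart."""
    width = len(results) * 2
    rows = [[WHITE] * width for _ in range(height)]
    for i, r in enumerate(results):
        color = RED if r > 50 else GRAY
        top = height - r // 10 - 4
        # Clip to the image in case a value falls outside 0..100.
        for y in range(max(top, 0), min(top + 4, height)):
            rows[y][i * 2] = color
    return encode_png(rows)


if __name__ == "__main__":
    png = tick_sparkline_png([10, 60, 95, 30])
    print(len(png), "bytes, signature ok:", png[:8] == b"\x89PNG\r\n\x1a\n")
```

The resulting bytes can be percent-encoded and prefixed with 'data:image/png,' exactly like the PIL output above.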

Update 5: What's better than sparklines? How about sparklines + imagemaps.

Update 6: If you want to use sparklines beyond just where data: URIs are available, please avail yourself of my Sparkline Generator. It's a web application, and a web service, for generating sparkline images, all with source code.

interesting. i could serialize whole Java object and send it as a parameter to the applet

Posted by arkady on 2005-04-26

The Python Imaging Library pretty much /is/ the standard image manipulation package - at least it's the only one I ever see used or discussed. Are you saying it should be included in the Python standard library?

Posted by Simon Willison on 2005-04-26

I like this solution, because it's self-contained. I was trying to solve the same problem, and my idea was to write a CGI script that returned the image, so you could do something like this:

<img src="/spark.cgi?data=1,2,3,4,5" />

But I think inline PNGs make a nice solution. And if you use the sparklines just for additional information, people with IE won't miss much.

PS: maybe you could put a note near the comment box saying which HTML tags are accepted?

Posted by Roberto on 2005-04-26

Simon: Yes, I am saying it should be included in the standard library.

Roberto: Will do.

Posted by Joe on 2005-04-26

Pythonware, which writes PIL, make their money off custom programming jobs, usually extending their code for clients with specific needs.  As a result, they need the ability to make releases on their own schedule (or on their clients' schedules), which doesn't mesh well with the standard library.  At least, that's my impression.

Posted by Ian Bicking on 2005-04-26

I just hacked my pyTextile plugin to allow inline sparklines using your function:

http://dealmeida.net/en/Projects/PyTextile/sparklines.html

Posted by Roberto on 2005-04-26

Just this weekend http://agiletesting.blogspot.com/2005/04/sparkplot-creating-sparklines-with.html came in through the ether. Another interesting sparklines-implementation in python using matplotlib.

Posted by Steffen Glückselig on 2005-04-26

Nice implementation. You might want to rethink your color scheme, however, the most common form of colorblindness being red-green.

Posted by Tom Moertel on 2005-04-27

What's the point in publishing pictures that can't be displayed by IE?

Posted by Ziv Caspi on 2005-04-27

Roberto: Excellent!

Tom: Thanks for the tip, now updated to be gray and red.

Ziv: What's the point in using IE?

Posted by Joe on 2005-04-27

Ziv: one word: Greasemonkey.

Posted by Mark on 2005-04-27

There's no "point". It's what I (and a few other readers, one may only assume) have. From your log stats: how many hits do you get from IE?

Posted by Ziv Caspi on 2005-04-29

Sorry to disappoint you Ziv, but the majority of my readership can see those images. And I believe I am performing a public service by encouraging the rest to upgrade to something besides IE.

55% Mozilla
37% IE
4% Safari
3% Opera

Posted by joe on 2005-04-29

I'm not disappointed at all. Your own stats mean that a third of your readership can't see what you meant (at least not without some considerable effort). If you don't care for leaving us out, well, that's your decision.

Posted by Ziv Caspi on 2005-04-29

Hmm, I thought livejournal was better indexed than this, but it also looks like I never got around to posting the python code I give example output from here: http://www.livejournal.com/users/eichin/39893.html

It's only about 50 lines of basic PIL plotting, and then using .tostring("jpeg") and base64.encodestring to produce the drop-into-blog output.  As noted, enough people-I-care about used browsers where the data trick didn't work... but also, livejournal has a (reasonable) 64K/post limit, and the data encoding ran into that very easily in my early experiments - remember that even if the image format is compressed, the wrapper-format has to be valid attribute data (and I think base64 should average smaller than urllib.quote in practice...)

Definitely something to integrate more directly, I really like the directly-in-textile approach mentioned above.

Posted by Mark Eichin on 2005-04-30

Browsers: konqueror 3.3.1 (fedora 3) can't show the data: sparkline either.

Posted by anonymous on 2005-05-06

To those it didn't occur to (almost me), do "view source" to see how big these things are.  ~800 bytes of %00%02...

konqueror?  more like luzor!  just kidding.

Posted by Steve Witham on 2005-07-11

Yeah, anything included via the data: URL scheme does take up a fair bit of space due to the encoding.  BUT... it saves a second round-trip to the server to fetch the image.  So whereas it might cost more in data, they'll usually load faster.  In fact, you're guaranteed that the image has loaded by the time the page has loaded. :-)

I'm a big fan of the data: URL scheme.  I used to use it with XML stuff... say a product catalog has a picture, if you store that picture Base64-encoded in the original XML content, then you can XSLT it into an image with a data: URL and pretty much go straight to the PDF with that.

It's also an elite way to fool sites like LiveJournal into hosting your images which they wouldn't ordinarily host. :-)

Posted by Trejkaz on 2005-08-28

2005-04-25