Snap together a new datasette

Posted by Katie McLaughlin on May 12, 2019

TL;DR:

pip install datasette sqlite-utils 
sqlite-utils insert SQLITE_DB.db TABLE_NAME FILE.JSON
datasette serve SQLITE_DB.db

lets you introspect a JSON file in datasette, like magic ✨🎩✨


Simon Willison gave a wonderful talk at PyCon US 2019 entitled Instant serverless APIs, powered by SQLite, where he demonstrated taking SQLite databases and visualising them with his project datasette.

He also mentioned a package he wrote, csvs-to-sqlite, which as the name suggests, converts csv files into SQLite databases.

Given this, you can take your CSV files and import them easily into datasette, which gives a rich environment in which to explore the data.

At the moment PyCon AU 2019 is in review period, and we are using Pretalx, which provides a JSON API to inspect data.

Earlier this morning I spent a bunch of time using this API and wrote a pipeline (NOTE: hacky code) that would take submission and review data and generate a XLSX file that I could then import into Google Sheets to share/edit as I wanted. This took me a few hours, and I learnt some things along the way (like ExcelWriter).

I like Google Sheets (and Microsoft Excel when I have access to it) because I can use my Excel-foo to manipulate data and share the results with others.

Excel-foo isn’t my only superpower; I’m also adept at SQL. So when I watched Simon’s talk, I asked if there’s a way to convert from JSON to SQLite and then import into datasette. Turns out, he’s also written sqlite-utils, which gives a CLI to import JSON directly into SQLite.

Therefore, it’s simple as heck to get any JSON API data into datasette:

First, you need to get the data from your API into JSON. That may involve credentials, pagination, etc.

import requests
import json
import os

EVENT = "YourEventName"
PRETALX_TOKEN = os.environ.get("PRETALX_API", None)

def slug(s):
    return "https://pretalx.com/api/events/%s/%s?limit=100" % (EVENT, s)


def get(uri):
    resp = requests.get(uri, headers={"Authorization": "Token %s" % PRETALX_TOKEN})
    if resp.ok:
        return resp.json()
    else:
        print(resp.text)
        return {}


def save_data(name):
    resp = get(uri=slug(name))
    results = resp["results"]

    while resp["next"]:
        resp = get(uri=resp["next"])
        results += resp["results"]

    with open(f"{name}.json", "w") as f:
        json.dump(results, f)

    print(f"{len(results)} {name} results saved")

save_data("submissions")
save_data("reviews")

From there, it’s a process of installing and invoking the apps:

$ pip install datasette sqlite-utils
$ sqlite-utils insert data.db submissions submissions.json
$ sqlite-utils insert data.db reviews reviews.json
$ datasette serve data.db

You then have a way to view and query your API data in datasette!