Using JSON functions in PostgreSQL to support GraphQL

Hold on Cowboy

This blog post is pretty old. Be careful with the information you find in here. It's likely dead, dying, or wildly inaccurate.

I love PostgreSQL, always have. I remember, it was easily more than a decade ago, when I first ran into PostgreSQL (PG). It seemed all major open source projects used MySQL. When I learned PG was also open source, plus had more SQL compliant features, I knew it would someday become the de facto open source database. At the time, the client tools, support, and pure speed were not there, but the foundation for a great product existed.

Anyway, this post is not about my ❤️ for PG, but about how PG can help your GraphQL efforts. First let’s start out with a database example. Here we have a table for people and the corresponding emails for those people.

CREATE TABLE people (
    id          serial PRIMARY KEY,
    fname       varchar(40) NOT NULL,
    lname       varchar(40) NOT NULL
);

CREATE TABLE emails (
    id          serial PRIMARY KEY,
    email       varchar(256) NOT NULL,
    person      integer REFERENCES people (id),
    primary     boolean DEFAULT false
);

Go ahead and load the query into an example database. If you have docker installed, you can bring up PG easily with docker run --name pg -p=5432:5432 -d postgres and use any PG client (like Postico).

You said something about GraphQL?

Yes. We are of course using GraphQL in our API servers because it just makes sense (I think you’ll get rid of REST once you give GraphQL a solid chance). So we have a GraphQL schema to resembling this type definition.

Query {
  person(id: ID): Person
}

type Person {
  id: ID
  fname: String
  lname: String
  emails: [Email!]!
}

type Email {
  id: ID
  email: String
  primary: Boolean
}

Here each person has an array of emails they could have associated with their record. The GraphQL client request would look something like this.

query {
  person(id: "1") {
    id
    fname
    lname
    emails {
      id
      email
      verified
    }
  }
}

So What’s the Big Deal?

Everything looks fine. Our client will request the GraphQL. The GraphQL resolvers (I’m using Apollo) will happily fetch the person from PG, then will take the person’s ID and fetch the emails for them. You’re resolvers might look like this

exports.type = {
  Person: {
    emails (person) {
      // SELECT * FROM emails WHERE person = person.id
    }
  },
}

exports.query = {
  person (_, { id }) {
    // SELECT * FROM people WHERE id = id
  }
}

Time for Some Postgres Muscle

The setup we have is nice, but it’s not going to scale. Think about loading a list of people, maybe 100. Then for each person you would have to make a database call to resolve their emails individually. You just flicked the insane dial up to 11. There are a number of ways to handle this seemingly weakness of GraphQL. One such method is to use the DataLoader utility provided by Facebook. DataLoader will help you batch your queries for a request. This would work, but could also get out of hand too if you needed to join other tables on top of the emails table.

Let’s look to the JSON functions PG supplies.

json_agg

This handly little function will allow us to collect all our emails and represent them as a JSON array for a field. You’ll notice this query uses the json_agg on an entire table emails and then aliases it as emails (otherwise the column would be called json_agg, yuck). Next you’ll see we use the GROUP BY to combine all the people records into a single column. Otherwise the LEFT JOIN emails makes multiple rows in our result set and the json_agg function needs to work with an aggregated result set.

SELECT people.*, json_agg(emails) as emails
FROM people
LEFT JOIN emails on emails.person = people.id
GROUP BY people.id

This query will return a response like this in JSON

[{
  id: 1,
  fname: "John",
  lname: "Stillwell",
  emails: [
    {"id":1,"email":"[email protected]","person":1,"primary":true}
    {"id":2,"email":"[email protected]","person":1,"primary":false},
    {"id":3,"email":"[email protected]","person":1,"primary":false},
  ]
}]

Now when a person is fetched from the database, we populate the emails as well in the same database call. This saves in performance and you no longer need the resolver for emails, but just in case you might happen to need to resolve the emails for a person, you can revise your Person type to a function taking advantage of Node.js new async / await features.

exports.type = {
  Person: {
    async emails (person) {
      if (person.emails) return person.emails
      const emails = await // SELECT * FROM emails WHERE person = person.id
      return emails
    }
  },
}

to_json

Another handy function from PG is the ability to just include a row as JSON. The to_json function is similar to the json_agg, but it does not aggregate a collection of rows. It takes one row and represents it as a JSON object. You would use this for a one to one type of relationship.

Let’s say our Person has a spouse, we want to include this subrecord for our person. Let’s modify our GraphQL definition for a Person like this

type Person {
  id: ID
  fname: String
  lname: String
  emails: [Email!]!
  spouse: Person
}

Now to populate the spouse field we could use to_json to grab the field by its id and populate spouses

SELECT people.*, json_agg(emails) as emails, to_json(spouses) as spouse
FROM people
LEFT JOIN emails on emails.person = people.id
LEFT JOIN people spouses on people.spouse = spouses.id
GROUP BY people.id, spouses.*

This will produce a result much like

[{
  id: 1,
  fname: "John",
  lname: "Stillwell",
  emails: [
    {"id":1,"email":"[email protected]","person":1,"primary":true}
    {"id":2,"email":"[email protected]","person":1,"primary":false},
    {"id":3,"email":"[email protected]","person":1,"primary":false},
  ],
  spouse: {
    id: 1,
    fname: "Dorothy",
    lname: "Stillwell",
  }
}]

JSON versus JSONB? That is the question. For my own purposes, I store items as JSONB in the database, it seems to be a better, faster format. However, when I’m just working with rows and JSON objects I just use JSON. The jury is still out for me.

Bonus jsonb_set

This is a newer function and really not used much by me, but could aid your efforts to return data in a shape GraphQL enjoys.

SELECT people.*, jsonb_set('{ "meta": {}}'::jsonb, '{meta, primaryEmail}'::text[], to_jsonb(emails.*)) as meta
FROM people
LEFT JOIN emails on emails.person = people.id AND emails.primary IS TRUE
GROUP BY people.id, emails.*;

You can see we use our friendly to_jsonb function to convert our email row to a JSON object. Take a look how we use jsonb_set to take an existing object { "meta": {}}, then specify a path {meta, primaryEmail}. Last we assign a value of the path to to_jsonb(emails.*). The result is a nicely formed JSON object like so.

[{
  id: 1,
  fname: "John",
  lname: "Stillwell",
  meta: {
    primaryEmail: {
      "id":1,"email":"[email protected]","person":1,"primary":true
    }
  }
  ...
}]

Conclusion: PG can easily handle all of your JSON needs

We’ve seen here just a small taste of what PostgreSQL can do converting responses into heart pleasing JSON. If you’re interested in more JSON functions goodness from PG, then head over to their docs or query StackOverFlow.

Until next time, enjoy our Wild America — Marty Stouffer