A REST View of GraphQL
Software designers often compare GraphQL, a language specification for defining, querying, and
updating data, with REST, an architectural style that describes the Web. We'll explore why this
comparison doesn't make sense, and what questions we should be asking instead.
What is REST?
REST is an architectural style that came out of Roy Fielding's PhD thesis in 2000. The work
examined the properties that made the World Wide Web successful, and derived constraints that
would preserve these properties. Roy Fielding was also part of the HTTP/1.0 and HTTP/1.1
working groups, and some of these constraints were added to the HTTP and HTML specs.
Before understanding REST, it is useful to look at the different kinds of participants on the web:
Websites: Programs that serve content for humans to consume and interact with on a browser.
API Service Providers: Programs meant to enable other programs to consume and interact with
data.
API Clients: Programs written to consume and interact with data from an API Service Provider.
Note that a program can act in multiple roles. For example, an API service provider can also be a
client consuming APIs from another API service provider.
Also note that the internet and the World Wide Web are not the same. There are other participants
on the internet that we don't talk about here (mail servers, torrent clients, blockchain-based
applications, etc.).
As an aside on design constraints, consider Unix command-line utilities. Communication between utilities is done only using stdin and stdout, via a text interface.
Re-usability: This allows mixing and matching any two utilities, as long as the second can process the data from
the first. For example, I can pipe the output of cat or ls or ps to grep. The author of grep does
not have to worry about where the input comes from.
But following this constraint adds latency to the processing, since each utility must write its
output to stdout and the next utility must read it from stdin.
An alternative design would make ls, grep, cat, etc. libraries with well-defined interfaces. The
end user would then write programs integrating different libraries to solve their problem.
This system would be more performant and roughly as reusable as the earlier one, but would be
more complicated to use.
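The pipe model above can be sketched in Python with generators, where each "utility" consumes and produces a stream of lines. The functions below are illustrative stand-ins, not the real Unix tools:

```python
# A sketch of the pipe model: each "utility" consumes an iterable of lines
# and yields lines, so any two can be composed as long as the second can
# process the first's output.

def cat(lines):
    # pass lines through unchanged, like `cat`
    for line in lines:
        yield line

def grep(pattern, lines):
    # keep only lines containing `pattern`, like `grep`
    for line in lines:
        if pattern in line:
            yield line

data = ["alpha", "beta", "alphabet"]
# the equivalent of `cat data | grep alpha`
result = list(grep("alpha", cat(data)))
```

Note that grep never needs to know whether its input came from cat, ls, or anywhere else, which is exactly the reusability property described above.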
Software design is about identifying the set of design constraints that best serve the requirements.
Let's talk about the constraints that you should follow to build a "RESTful" service:
Use the client/server pattern for communication
Keep the server stateless by having each request send everything required to process that request.
Uniform interface
Servers must expose resources with unique IDs
Requests and responses must have all the information needed to interpret them, i.e. they must be
self-descriptive
Hypermedia as the engine of application state (HATEOAS): Clients must rely only on the
response to a request to determine the next steps that can be taken. There must be no out-of-band
communication related to this.
Layered system: Each component in a system must only rely on the behaviour of systems it
immediately interacts with.
Code on demand: This is an optional constraint. A server can send code to be executed by the
client to extend the functionality of the client (JavaScript, for example).
The purpose of these constraints is to make the web simple to develop on, scalable, efficient, and
support independent evolution of clients and servers. This article explains these constraints in
more detail.
REST in Practice
HTTP has been the protocol of choice for implementing the REST architectural style. Some of the
constraints, such as the client/server pattern, marking responses as cacheable, and a layered system,
are baked into HTTP. Others need to be followed explicitly.
The uniform interface and HATEOAS constraints in particular are the most frequently violated. Let us look at each of the sub-constraints:
[Figure: The different components of a REST request]
By convention the URI plays the role of the resource ID, and HTTP methods are the uniform set of
actions that can be performed on any resource. The first constraint is automatically followed by
definition when using HTTP. Most backend frameworks (Rails, Django, etc) also nudge you in the
direction of following the second constraint.
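The uniform interface idea can be sketched in a few lines (a toy illustration, not a real framework): URIs identify resources, and the same small set of HTTP methods applies to every resource.

```python
# A toy sketch of the uniform interface: URIs identify resources, and the
# same small set of HTTP methods applies uniformly to each of them.

products = {1: {"id": 1, "title": "Widget", "price": 10}}
users = {7: {"id": 7, "email": "a@example.com"}}

STORES = {"products": products, "users": users}

def handle(method, path):
    # parse a path like "/products/1" into a collection name and an id
    _, collection, rid = path.split("/")
    store = STORES[collection]
    rid = int(rid)
    if method == "GET":
        item = store.get(rid)
        return (200, item) if item else (404, None)
    if method == "DELETE":
        return (204, store.pop(rid, None))
    return (405, None)

status, body = handle("GET", "/products/1")
```

The same handler serves any resource collection; nothing in it is specific to products or users, which is what makes the interface uniform.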
Clients retrieve and manipulate resources using media types such as HTML or JSON. A media type
such as HTML can also contain the actions the client can perform on the resources, e.g. using
forms. This allows independent evolution of servers and clients. Any client that understands a
particular representation can be used with any server that supports that representation.
This constraint is useful if you are expecting to have multiple services serving similar types of data,
and multiple clients accessing them. For example, any web browser can render any page with
content type text/html. Similarly, any RSS reader will work with any server that supports
application/rss+xml media type.
Self-descriptive messages: requests and responses must have all the information to
be able to interpret them
Again, this allows services and clients to independently evolve, since clients do not assume a
particular response structure. This works well in case of web browsers or RSS readers. However,
most API clients are built to access a particular service and are tied to the semantics of the
response. In these cases, the overhead of maintaining self-descriptive messages is not always
useful.
The HATEOAS constraint, too, is expected to allow services and clients to evolve independently, since clients
don't hard-code the next steps available. HATEOAS makes perfect sense when the consumer is an end
user with a browser. The browser simply renders the HTML along with the actions (forms and
anchor tags) the user can take. The user then understands what's on the page and takes the
action they prefer. If you change the URL or form parameters for the next set of actions available
to the user, nothing has to change in the browser. The user will still be able to read the page,
understand what's happening (maybe grudgingly), and take the right action.
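As an illustration, a HATEOAS-style JSON response might carry its available actions inline. The field names below are illustrative (loosely following the HAL convention), not prescriptive:

```python
# An illustrative HATEOAS-style response: the representation carries the
# next available actions, so a client need not hard-code any URLs.

order_response = {
    "id": 42,
    "status": "pending",
    "_links": {
        "self": {"href": "/orders/42"},
        "cancel": {"href": "/orders/42/cancel", "method": "POST"},
        "items": {"href": "/orders/42/items"},
    },
}

# a browser-like client just renders these links for a human to choose
# from; an API client would still need to know what "cancel" means
next_actions = sorted(order_response["_links"])
```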
Unfortunately, this won't work for an API client:
If you change the parameters required by an API, the developer of the client program most likely
has to understand the semantics of the change and update the client to accommodate it.
For the client developer, discovering APIs by browsing through them one by one is not very useful.
Comprehensive API documentation (such as a Swagger or Postman collection) makes more
sense.
In a distributed system with microservices, the next action is often taken by a completely different
system (by listening to events fired from the current action), so returning the list of
available actions to the current client is not useful.
So for API Clients, following the HATEOAS constraint does not imply independent evolvability.
A lot of API clients are written to target a single backend, and some amount of coupling will always
exist between an API client and the backend. In these cases, we can still reduce the amount of
coupling by making sure that:
Adding new parameters to an API call does not break existing clients.
It is possible to add a new serialization format (say, JSON, HTML, or Protobuf) for a new
client.
The first is usually achieved through API versioning and evolution; Phil Sturgeon has great
posts on API versioning and evolution that cover various best practices. The second can be achieved
by respecting the HTTP Accept header.
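Content negotiation via the Accept header can be sketched as follows. This is a minimal illustration; real servers also handle quality values and wildcards in the header:

```python
import json

# A minimal sketch of content negotiation: the same resource is serialized
# differently depending on the request's Accept header.

user = {"id": 1, "email": "a@example.com"}

def render(resource, accept):
    if accept == "application/json":
        return json.dumps(resource)
    if accept == "text/html":
        # a deliberately tiny HTML rendering for illustration
        return "<p>%s</p>" % resource["email"]
    raise ValueError("unsupported media type: %s" % accept)

json_body = render(user, "application/json")
html_body = render(user, "text/html")
```

Adding a new format for a new client is then a purely additive change: existing clients keep sending their old Accept header and are unaffected.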
What is GraphQL?
GraphQL is a language for defining data schemas, queries and updates developed by Facebook in
2012 and open sourced in 2015. The key idea behind GraphQL is that instead of implementing
various resource endpoints for fetching and updating data, you define the overall schema of the
data available, along with the relationships and mutations (updates) that are possible. Clients can
then query for exactly the data they need.
# Our schema consists of products, users, and orders. We first define these
# types and their relationships.
type Product {
  id: Int!
  title: String!
  price: Int!
}

type User {
  id: Int!
  email: String!
}

type OrderItem {
  id: Int!
  product: Product!
  quantity: Int!
}

type Order {
  id: Int!
  orderItems: [OrderItem!]!
  user: User!
}

# filters are input types, so that they can be used as query arguments
input OrderFilter {
  id: Int
  userEmail: String
  userId: Int
  productTitle: String
  productId: Int
}

# query orders
type Query {
  orders(where: OrderFilter, limit: Int, offset: Int): [Order!]!
}

# input types for creating orders
input OrderItemInput {
  productId: Int!
  quantity: Int!
}

input OrderInput {
  orderItems: [OrderItemInput!]!
  userId: Int!
}

scalar Void

# mutations to create and delete orders
type Mutation {
  createOrder(order: OrderInput!): Order!
  deleteOrder(id: Int!): Void
}
The key advantage of GraphQL is that once the data schema and resolvers have been defined:
Clients can fetch exactly the data they need, reducing the network bandwidth required (caching can
make this tricky; more on this later).
Frontend teams can execute with very little dependency on backend teams, since the backend
exposes essentially all the data available. This allows frontend teams to iterate faster.
Since the schema is typed, it is possible to generate type-safe clients, reducing type errors.
To really achieve #1 and #2, resolvers will usually need to take various filter, pagination, and sort
options.
Optimizing resolvers is tricky, because different clients will request different subsets of the data.
You need to worry about avoiding N+1 queries while implementing resolvers.
It is easy for the client to construct complex nested (and potentially recursive) queries that make
the server do a lot of work. This can lead to DoS for other clients.
GraphQL makes it effortless for frontend developers to iterate, but at the expense of backend
developers having to put in additional effort upfront.
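The N+1 problem mentioned above can be illustrated with a toy example (this is the idea behind dataloader-style batching; the code is illustrative, not Hasura's implementation): a per-item resolver issues one query per order item, while a batched resolver fetches all the needed products in a single query.

```python
# A toy illustration of the N+1 problem and batching: resolving the
# product for each order item one at a time issues one query per item,
# while a batched resolver issues a single query.

PRODUCTS = {1: "Pen", 2: "Book"}
query_count = 0

def fetch_product(product_id):
    global query_count
    query_count += 1  # one "database query" per call
    return PRODUCTS[product_id]

def fetch_products(product_ids):
    global query_count
    query_count += 1  # a single batched "database query"
    return {pid: PRODUCTS[pid] for pid in product_ids}

order_item_product_ids = [1, 2, 1, 2]

# naive: N queries for N items
naive = [fetch_product(pid) for pid in order_item_product_ids]
naive_queries = query_count

# batched: one query regardless of N
query_count = 0
fetched = fetch_products(set(order_item_product_ids))
batched = [fetched[pid] for pid in order_item_product_ids]
batched_queries = query_count
```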
With Hasura, you do not have to worry about the first three problems, since Hasura compiles your
GraphQL queries directly to SQL queries: you do not write any resolvers while using Hasura, and
the generated queries use SQL joins to fetch related data, avoiding the N+1 query problem. The
allow list feature also provides a solution to the fourth problem.
GraphQL vs REST
Having understood what GraphQL and REST are, you can see that "GraphQL vs REST" is the
wrong comparison. The better questions to ask are whether GraphQL is consistent with the REST constraints, and what we gain or lose where it is not.
GraphQL is consistent with the REST constraints of client/server, statelessness, layered system,
and code on demand, since GraphQL is usually used over HTTP and HTTP already enforces these
constraints. But it breaks the uniform interface constraint and, to some extent, the cache
constraint.
The cache constraint states that responses to requests must be labelled as cacheable or not. In
practice, this is implemented by using the HTTP GET method along with the Cache-Control, ETag,
and If-None-Match headers.
In theory, you can use GraphQL and remain consistent with the cache constraint if you use HTTP
GET for queries and use the headers correctly. In practice, however, if vastly different
queries are sent by different clients, you will end up with a cache key explosion (since each
query's result must be cached separately) and the utility of a caching layer will be lost.
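The cache key explosion can be sketched with a toy cache keyed by the full query text (the queries and keys below are illustrative):

```python
import hashlib

# A toy illustration of the cache key explosion: when the full query text
# is the cache key, every distinct query shape gets its own entry, so
# clients sending slightly different queries never share cached results.

cache = {}

def cache_key(query):
    return hashlib.sha256(query.encode()).hexdigest()

q1 = "{ orders { id } }"
q2 = "{ orders { id user { email } } }"

cache[cache_key(q1)] = {"orders": [{"id": 1}]}
cache[cache_key(q2)] = {"orders": [{"id": 1, "user": {"email": "a@b.c"}}]}

# the overlapping order data is stored twice, under two unrelated keys
distinct_entries = len(cache)
```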
GraphQL clients like Apollo and Relay implement a client-side cache that solves these problems.
These clients will decompose the response to a query and cache the individual objects instead.
When the next query is fired, only the objects not in the cache need to be refetched. This can
actually lead to better cache utilization than HTTP caching as even parts of the response can be
reused. So, if you are using GraphQL in a browser or in an Android or iOS app, you do not have to
worry about client-side caching.
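What these clients do can be sketched roughly as follows (a much-simplified toy normalized cache; Apollo and Relay are far more sophisticated):

```python
# A simplified sketch of a normalized client cache, in the spirit of
# Apollo/Relay: responses are decomposed into individual objects keyed by
# (type, id), so later queries can reuse any object already cached.

cache = {}

def store(typename, obj):
    cache[(typename, obj["id"])] = obj

# decompose a query response into its constituent objects
response = {"order": {"id": 1, "user": {"id": 7, "email": "a@example.com"}}}
store("User", response["order"]["user"])
store("Order", {"id": 1, "userId": 7})

# a later query that needs User 7 is answered from the cache, without a
# network round trip
cached_user = cache.get(("User", 7))
```

Because objects are cached individually rather than per-response, even a query with a shape never seen before can be partially answered from the cache.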
If you need shared caching (CDNs, Varnish, etc.), however, you need to make sure you are not
running into a cache key explosion.
Hasura Cloud supports data caching by adding an @cached directive to your queries. Read more
about how Hasura supports both query caching and data caching.
GraphQL breaks the uniform interface constraint. We've discussed above why it is okay for API
clients to break this constraint in certain scenarios. The uniform interface constraint is expected to
bring properties such as:
Simplicity, since everything is a resource and has the same set of HTTP methods applicable to it.
We've recently added Relay support to Hasura. Since Hasura auto-generates GraphQL queries and
mutations from your database schema, you also automatically get a uniform set of actions for each
of your resources.
The web of the 1990's and 2000's is different from today's web:
The frontend applications we build today are far richer and offer far more functionality.
They are also increasingly built with JavaScript frameworks such as React, Vue, or Angular, instead of
rendering templates from the backend. The JavaScript application then becomes an API client to
the backend.
Frontend applications are also increasingly accessed over mobile networks, which are often both
slower and flakier.
We need to take these into account while designing current systems. GraphQL helps build
performant applications in this context.
As described in the GraphQL section, the key advantage of using GraphQL is that data fetching on
the front end becomes easier and this makes iterating on the front end much faster. The tradeoffs
are:
The uniform interface constraint is broken. For the vast majority of API clients, this should not be a
problem.
You need to make sure every possible query that can be fired will be served efficiently.
With Hasura you can solve #2 above by using the permission system to disallow aggregate queries
or by setting an explicit list of allowed queries.
Hasura Cloud also comes with a solution for the caching problem.
Can't I just use query parameters to specify the exact data I need without breaking
REST?
Yes, you can, for example, by supporting the sparse fieldsets spec. In practice, though, you are better off using
GraphQL, since:
Any sort of query language needs parsing and an implementation on the backend; GraphQL has
much better tooling for this.
With GraphQL, the response shape is the same as the request shape, making it easier for clients to
access the response.
However, if you do need the uniform interface constraint to be followed, this is the approach you
should take. Note that using sparse fieldsets can also lead to a cache key explosion,
breaking shared caching.
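A sparse-fieldsets style projection can be sketched as follows (illustrative only; JSON:API expresses this as a fields[type] query parameter):

```python
# A sketch of sparse-fieldsets style projection: the client names the
# fields it wants, and the server projects the resource accordingly.

product = {"id": 1, "title": "Pen", "price": 10, "description": "A blue pen"}

def apply_sparse_fields(resource, fields_param):
    # fields_param mirrors the value of e.g. ?fields[product]=id,title
    wanted = set(fields_param.split(","))
    return {k: v for k, v in resource.items() if k in wanted}

projected = apply_sparse_fields(product, "id,title")
```

Each distinct fields value is effectively a distinct response shape, which is why this approach shares GraphQL's cache-key-explosion risk for shared caches.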
Conclusion
We've looked at both GraphQL and REST, and at whether it makes sense to compare them. System
design is a process of deriving design constraints that meet the requirements (especially the
non-functional ones).
REST describes these constraints for the Web, and as such, most of them are applicable to
most systems. For example, organizing your APIs by resources, having IDs for each of your
resources, and exposing a mostly common set of actions are all best practices for GraphQL API
design as well (and, for that matter, for an RPC-based system).
If you want to try out GraphQL, head over to the learn course. Hasura is a quick way to get started
and play with GraphQL, since your backend is set up automatically.