Serverless Best Practices - Paul Johnston - Medium
Within the community we’ve been debating the best practices for many
years, but there are a few that have been relatively accepted for most of
that time.
And remember that best practices are not “the only practices”. Best
practices rely on a set of underlying assumptions. If those assumptions
don’t fit your use case, then those best practices may not fit.
The problem with one (or a few) functions running your entire app is
that when you scale, you end up scaling your entire application rather
than scaling the specific element.
If you have one part of your web application that gets 1 million calls,
and another that gets 1 thousand calls, you have to optimise your
function for the million, whilst including all the code for the thousand.
That’s a waste, and you can’t easily optimise for the thousand. Separate
them out. There’s so much value in that.
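The split above can be sketched as two single-purpose handlers instead of one monolithic function that routes internally. This is a hypothetical illustration (the handler names, event shape, and traffic figures are mine, not from the article); the point is that each function can be deployed, tuned, and scaled on its own.

```python
# Hypothetical sketch: two separate single-purpose handlers rather than
# one function containing all the code for both paths.

def get_report_handler(event, context):
    # Hot path (the "million calls" function): kept lean so its memory,
    # timeout, and concurrency can be tuned independently.
    report_id = event["pathParameters"]["id"]
    return {"statusCode": 200, "body": f"report {report_id}"}

def admin_export_handler(event, context):
    # Cold path (the "thousand calls" function): any heavier dependencies
    # it needs never weigh down the hot path's deployment package.
    return {"statusCode": 202, "body": "export started"}
```

Because the two are deployed separately, optimising one never means carrying the other's code.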
There are a very few edge cases where this is a valid pattern, but they
are not easily broken down.
Basically, don’t have functions call other functions directly. You
simply double your cost, make debugging more complex, and remove the
value of the isolation of your functions.
Functions have cold starts (when a function is started for the first time)
and warm starts (it’s been started, and is ready to be executed from the
warm pool). Cold starts are impacted by a number of things, but the
size of the zip file (or however the code is uploaded) is a part of it. Also,
the number of libraries that need to be instantiated.
The more libraries that need instantiating, the slower it is to cold start.
Things like express are built for servers, and serverless applications do
not need all the elements in there. So why introduce all the code and
dependencies? Why bring in superfluous code? Not only is it code that
will never get run; it could also introduce a security risk.
There are so many reasons for this being a best practice. Of course, if
there is a library that you have tested, know and trust, then absolutely
bring it in, but the key element there is testing, knowing and trusting
the code. Following a tutorial is not the same thing.
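As a sketch of the point about frameworks: a function behind an HTTP front door (such as API Gateway) doesn’t need a server framework at all, since routing, TLS, and request parsing happen before the function is invoked. The event shape below is an assumption for illustration, loosely modelled on a proxy-style HTTP event.

```python
import json

# Minimal handler with no web framework: only the standard library,
# so the deployment package stays small and cold starts stay fast.
# The event/response shapes here are assumed for illustration.

def handler(event, context):
    payload = json.loads(event.get("body") or "{}")
    name = payload.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello {name}"}),
    }
```

Everything a framework like express would add here is either dead weight or already handled upstream.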
This one will get me into the most trouble. A lot of web application
people will jump on the “but RDBMS are what we know” bandwagon.
To be honest, serverless people are not against RDBMS, they are against
connections. Connections take time, and if you imagine a function
scaling up, each function environment needs a connection, and you’re
introducing both a bottleneck and an I/O wait into the cold start of the
function. It is needless.
The biggest point to make here is that serverless architecture may well
require you to rethink your data layer. That’s not the fault of serverless.
If you try to reuse your current data layer thinking and it doesn’t work,
that probably reflects a gap in understanding serverless architectures.
But then, you were using some sort of configuration management tool
anyway to run everything weren’t you? And you already used CI and
CD tools of some sort right? You still have to DevOps with serverless.
Going back to the functions not calling other functions, it’s important to
point out that this is how you chain functions together. A queue acts as
a circuit breaker in the chaining scenario, so that if a function fails, you
can easily drain down a queue that has got backed up due to a failure,
or push messages that fail to a dead letter queue.
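The queue-in-the-middle idea can be sketched in a few lines. This is a hypothetical in-memory simulation (`deque` standing in for a real queue service, and the message shapes are invented); the behaviour it shows is the one described above: a failed message is parked in a dead letter queue instead of cascading the failure back up the chain.

```python
from collections import deque

# In-memory stand-ins for a real queue service and its dead letter queue.
work_queue = deque()
dead_letter_queue = deque()

def upstream_function(event):
    # Instead of invoking the next function directly, publish a message.
    work_queue.append(event)

def downstream_function(msg):
    if msg.get("poison"):
        raise ValueError("cannot process message")
    return {"processed": msg["id"]}

def drain_queue():
    # Consumer loop: on failure, park the message for later inspection
    # or replay, and keep draining the rest of the backlog.
    results = []
    while work_queue:
        msg = work_queue.popleft()
        try:
            results.append(downstream_function(msg))
        except ValueError:
            dead_letter_queue.append(msg)
    return results
```

If the downstream function fails entirely, the upstream one keeps working and the backlog simply accumulates in the queue, which can then be drained or redirected.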
For client applications with a serverless back end, the best approach is
to look into CQRS. Separating out the point of retrieving data from the
point of inputting data is key to this kind of pattern.
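A minimal CQRS sketch, with hypothetical in-memory stores: the command side records the input and updates a projection, while the query side reads only the projection. The store names and shapes are mine; the separation is the pattern.

```python
# CQRS sketch: commands and queries never share a code path, so each
# side can be deployed and scaled independently.

command_log = []   # write side: append-only record of inputs
read_model = {}    # read side: denormalised view shaped for queries

def handle_command(item_id, name):
    command_log.append({"type": "upsert", "id": item_id, "name": name})
    read_model[item_id] = {"name": name}   # project into the read model

def handle_query(item_id):
    # Queries only ever touch the read model.
    return read_model.get(item_id)
```

In a serverless deployment these would typically be separate functions, so the read path (often the million-call path) scales without touching the write path.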
It’s not always possible, but try to avoid querying from a data lake
within a serverless environment.
Serverless requires you to rethink your data layer significantly. This is
the biggest gotcha for new people coming to serverless, who tend to
reach for the RDBMS and fall flat, not only because the scaling catches
them out, but because their data structures become too rigid too fast.
You will find that your flows will change as your application changes,
and scale will change all of it. If all you have to do is redirect a flow,
that’s easy. Damming a lake is far harder.
I know this point is a bit more “out there” than the others, but it’s not a
straightforward one to make.
If you don’t consider your application and how it will scale then you
set yourself up for problems. If you make something with a slow cold
start (lots of libraries and using an RDBMS for example) and then get a
spike in usage, you could end up significantly increasing concurrency of
your function, and then maxing out your connections, and slowing
your application down.
So, don’t just drop an application in, and then imagine that it will work
the same under load. Understanding your application under load is still
part of the job.
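The connection-exhaustion scenario above can be estimated with a back-of-envelope calculation (Little’s law: concurrent executions are roughly arrival rate times time spent in the function). The traffic numbers below are hypothetical.

```python
def estimated_concurrency(requests_per_second, avg_duration_seconds):
    # Little's law: concurrent executions ~= rate * duration.
    # A slow cold start (heavy libraries, a database connect) stretches
    # the duration, which multiplies concurrency and, with an RDBMS,
    # the number of open connections.
    return requests_per_second * avg_duration_seconds
```

Under these assumed numbers, 100 requests per second at 200 ms is roughly 20 concurrent executions; the same traffic at 2 seconds (cold start plus connect) is roughly 200, each potentially holding a database connection.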
Conclusion
There are lots more things I could have put in here; this is my opinion
on the things I have to explain most often to people when I talk to
them.