Paulo Lopes

10 things I learned making the fastest js server runtime in the world

This presentation is about server performance, which means that no time in the world would be enough to cover it all. Hopefully, I can share with you the top 10 things I've learned while putting JavaScript at the top of the server-side benchmarks.

You will learn about runtimes and engines, how some are more capable than others, and how the obvious choice is not always the right one…

This talk is about thinking outside the box, being creative, and not taking anything for granted. We will debunk myths about native code vs. scripts and about RAM usage. It's going to be fast! I promise!

Transcript

Thank you. Thank you, everyone. Well, there's one thing I've learned working with my team that I would like to share, and that I will never forget: writing fast applications makes our users and our customers happy. So, who doesn't want to write fast code? Raise your hand.

Naw. That's interesting.

So, before we start, I have a couple of questions that I need to ask, and then we'll see what the Internet tells us. The first question: if you go to your favorite web browser and your favorite search engine and you type "Is JavaScript fast? How fast is JavaScript?", you'll probably get something like this, and I'm just quoting: "Under the right circumstances it is very fast. Actually, as fast as C." If you search again, another result would be: "Why is it so fast? As soon as you understand the event loop and how it processes requests, you realize why it's so fast." You start to see a pattern: it's fast because it's fast.

And you keep going and you get stuff like, "How can it be so fast since it's a single thread?" And the answer, like in this example, is because it's lightweight. We keep going and then you find this interesting question: "How fast is it compared to Java?" Well, that's because most recruiters think that Java and JavaScript are the same thing, which is kind of interesting. So, if you look around on the Internet, you see that JS shines when it comes to a huge amount of short connections.

And finally, I could spend all day showing Google results, but what makes it faster than Java? Well, the answer is that the async ecosystem has more than 50,000 modules written in an asynchronous style. It's kind of a strange answer to the question. But given all these questions, we need to ask: do we trust the Internet? The Internet is full of stories, and as some Game of Thrones character said a few weeks ago, stories connect people.

However, stories are not exact science. And above all, they should not drive us as software engineers.

So, my interpretation is: do I trust the Internet? No, I don't. And why? Because I am a software engineer. If you don't know who coined the term, you can go to our exhibition hall, where there's an explanation of who did it. And if you look in the dictionary for engineering, it says engineering is the application of science and mathematics by which the properties of matter and the sources of energy in nature are made useful to people.

So, as software engineers, we should apply science and mathematics to solve our problems. So, going back to the question: is JavaScript fast? We must be able to reproduce a problem. We must be able to explain the results. And reproduce the results.

So, I think the right answer is, is JavaScript fast? I don't know. From these results, it's not clear.

So, starting now with the main topic. When I was planning the talk, I needed a title, and I ended up with "10 things I learned making the fastest JavaScript server runtime in the world." I carefully decided to pick the word "server." Going to Wikipedia for a definition, a server is a computer in a network that is used to provide services to other computers in that network.

So, what I'm about to tell you is not about command-line applications or lambdas. It's about long-running processes. We also need to define what fast is, because when I say fast, I don't mean my server is fast because I put it on a race car and drove it around.

No. What I mean by fast is that we need to agree on a common set of metrics.

And for this, I'm using what site reliability engineering has found out. If you don't know anything about site reliability engineering, there's this interesting link with nice books from Google, and Google has one of the biggest SRE teams. And SRE has identified five golden signals.

So, golden signals are critical for monitoring teams to monitor their systems and identify problems before they become really big problems. There are many metrics to monitor, but the SRE teams have shown that rate, errors, latency, saturation, and utilization contain virtually everything you need to know about what's going on and where. Getting the signals is quite challenging and relies a lot on the tools and services you have at your disposal.

But for now, I'm just considering rate as in requests per second, errors as in errors per second, of course, and latency as in response time, including waiting and queuing.
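As a toy illustration, those three signals can be derived from a window of request samples. The sample shape `{ status, latencyMs }` and the function name here are made up for this sketch; they are not part of any monitoring API:

```javascript
// Sketch: computing rate, errors, and latency (p99) over a window of
// request samples. The { status, latencyMs } shape is an assumption
// for illustration only.
function goldenSignals(samples, windowSeconds) {
  const rate = samples.length / windowSeconds;              // requests per second
  const errors =
    samples.filter(s => s.status >= 500).length / windowSeconds; // errors per second
  // Latency here includes waiting and queuing as observed by the client.
  const latencies = samples.map(s => s.latencyMs).sort((a, b) => a - b);
  const p99 = latencies[
    Math.min(latencies.length - 1, Math.floor(latencies.length * 0.99))
  ];
  return { rate, errors, p99 };
}
```

For example, four samples over a two-second window with one 500 response report a rate of 2 requests/second and an error rate of 0.5 errors/second.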

So, focusing on rate, errors, and latency, I'm focusing on the software, not on the hardware or the operating system. A typical server application has a well-known set of characteristics. We need to know how the application behaves, and only once we understand that can we talk about it.

So, what is a server application? My definition of a server application is a long-running process that should be deployed on a cloud or on bare metal. It should be attached to a fast network; otherwise, the network becomes your bottleneck. And, of course, it should have enough CPU and memory, so that your application is not constrained by your hardware.

So, a long-running process has different characteristics from a short-running process, of course. In a long-running process, startup and warm-up are not really relevant in the full life cycle of the server, because they are a very tiny moment. Again, this isn't true if you're talking about web applications in your browser, because there you want to start as fast as possible, since that's what drives the happiness of your users.

So, now we need to define our benchmark. Most Internet articles will tell you how fast something is, and most of the time when you read the whole article, you see some graphs, really nice graphs, but the information about how the tests were performed and how the results were obtained is missing.

From an engineering perspective, this is incorrect. We should be able to reproduce the tests and the results in a lab and get more or less exactly the same results, of course.

On top of that, we need to confirm that the results are not biased. So, when I write a benchmark, I don't want it to favor my code and penalize the others. It needs to be fair. And writing benchmarks, of course, is hard.

Because, first, every benchmark you write will never represent a real-world use case. It's always a tiny subset that doesn't really represent your application. So, you need to draw conclusions from just a tiny bit of its life cycle.

So, getting peers to review your code can be really hard, and getting peers who are willing to review what you wrote is even harder. What I'm trying to say is that benchmarking is hard. However, there is a very popular benchmark out there called the TechEmpower Framework Benchmarks. Why is this benchmark so interesting to me?

Well, this benchmark shows you the true nature of open source. It has more than 500 contributors. So, more than 500 different people have contributed tests and reviewed tests.

There are more than 3,000 merged pull requests. So, lots of people spend time reviewing or adding new tests to the framework. And they already have more than 10,000 commits. So, it shows that it's kind of a big project.

It's not something that someone just put together over a weekend to check how their own framework performs. No, it's something that has been growing steadily for the last couple of years.

And it already tests more than 630 different frameworks. And these frameworks are written in different languages. So, this makes my life easier because I don't need to invent my own benchmark. I don't need to explain it. I can just use it to prove what I want to say.

So, if you want the link, this is their GitHub repo, and from the GitHub repo you can get to the main website, of course.

And as I said, there are 630 different frameworks, so if I tried to print how the results look right now, they wouldn't fit on the screen. So, what I did was rotate my screen and take a screenshot. And don't worry about the size; it's not really relevant. What I'm trying to say is that there are lots of frameworks already being tested. And the quick question I want to ask the audience is: can you spot the best result for a JavaScript framework on this graph?

So, you probably cannot, because it's very small. So, I have a helper here. You'll find it as shocking as it can be: the first entry for JavaScript ranks at number 89, which performs at about 22.7% of the performance of the best result.

So, you look at this and think: well, we all have this idea that JavaScript is fast, but the results prove otherwise. It's not as fast as we think it is. So, what we need to do is look under the hood.

So, before we can do any optimization, we need to understand what's going on, and we shouldn't jump to conclusions and start tweaking the code of the benchmark. Because otherwise we are just yak shaving and not really looking into the problem.

You're just trying to mitigate what could be the cause. Instead of this, we need to take a scientific approach. And if you've never profiled a Node application, I would recommend that you look at the tutorial on profiling on the Node.js website.
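In a nutshell, that tutorial boils down to V8's built-in sampling profiler. A minimal session looks something like this (`app.js` is a placeholder for your entry point):

```shell
# Run the app with V8's sampling profiler enabled; this writes a
# tick log named isolate-0x....log into the current directory.
node --prof app.js

# Post-process the tick log into a human-readable summary that
# breaks down where CPU time went (JavaScript vs. C++/native frames).
node --prof-process isolate-*.log > processed.txt
```

The processed summary is what tells you whether your ticks land in JavaScript frames or in native ones.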

So, just to give you, in a nutshell, the information from this tutorial: if you look at one of the tests of the benchmark, which is a very simple "return a Hello World string from an HTTP server," the best result that you saw on the benchmark was implemented like this. It uses the cluster module.

The cluster module will fork the Node process once for each CPU that the environment has. And then it uses the Express server to set the content type and send the response.

Okay, Express is probably not the most performant library out there, but this is just for illustration.

So, once we do this and we profile it, we get a flame graph. Flame graphs are really interesting tools when we're talking about performance because they give you a visual explanation of where your CPU time is spent. The ordering of the bars and the coloring don't really matter; the coloring is just to make it look nice, and it's called a flame graph because usually we paint it from red to yellow, like a flame. But what is important to notice is that as you go from bottom to top, you see where the code is spending most of its time on your CPU.

So, if you observe this, what the flame graph is telling you is that there is a very tiny slice at the top where time is spent in JavaScript code, and then there's lots of time spent in native code. And native means the Node bindings, V8, libuv for the async IO, and also the event loop.

So, once we start trying to optimize this code, we end up trying to optimize just the tip of the iceberg. You cannot optimize everything, because, as we saw, most of the time is spent in native code. So, you're just optimizing the tip of the iceberg.

So, this makes you think, right? This is interesting. What can we do about this? If I ask you what is the first thing that comes to your mind when I say "JavaScript engine," most of you will say V8. And if you look at the mission statement of the V8 project, it reads something like: speed up real-world performance for modern JavaScript and enable developers to build a faster future web.

So, performance on V8 is great. But there are more engines out there. If you look at this table (it's not an authority on JavaScript engines; it just lists ES6 compatibility across many of them), you can see engines like ChakraCore, SpiderMonkey, and Safari.

And there was a new one added last year: GraalJS. So, what my experiment was all about is that, well, I should try other engines. Because if most of the CPU time is spent in native code, probably I should look into engines that handle the JavaScript runtime in a different way.

So, I decided to look into GraalJS. GraalJS is built on GraalVM, an extension of the Java Virtual Machine that supports more languages and execution models. The project includes a new high-performance compiler called Graal because, as you know, the most difficult thing in computer science is naming things. And the objective of Graal is to improve the performance of the virtual machine for any language.

And another goal is to allow free-form mixing of any programming language in a single program. So, it allows you to do polyglot programming: in the same program you can use Java, Scala, Ruby, Rust, C++. And what's interesting about this is that, because it's a new project and it's all up to date, they offer a modern JavaScript runtime based on ES2019 and ES2020, which isn't released yet, but they have already implemented most of the features.

And the ultimate goal is a very fast server. But I don't want to change my programming language; I want to stay on JavaScript. So, if I look at the definition of GraalJS on their website, their goals are to execute JavaScript code with the best possible performance, with full support for the latest ECMAScript specification.

And fast interoperability with all the languages either on the JVM or supported by GraalVM, like Ruby, Python, and R. There is research around this, because this project, although it was open-sourced last year, had been running for more than eight years behind closed doors. It has only been opened now because they feel it's a really stable, mature project.

So, the people working and researching on this have already shown that the engine is slightly better than, or on par with, V8 on pure language benchmarks. And you can read more about it in the paper there.

So, although you can even run unmodified Node applications on it, because it basically allows you to replace V8 under Node, I needed to formulate a hypothesis. What if we create a project, which I would call ES4X, that first replaces V8 with GraalJS; second, replaces the Node IO layer with Eclipse Vert.x; and relies on the Graal compiler for optimization?

It will not have Node bindings. It will have type definitions.

These are discarded at runtime. The code that you don't run is the best code; it's the fastest, because you don't need to run it. And it will offer a basic module loader and basic npm compatibility, and allow you to develop and profile the application with the tools you already know, like the Chrome DevTools.

So, if we were going to implement the previous example that I showed with Node and Express using this new style, this is how the old Express code would look. I guess it's not that hard to understand what's happening here. The important thing to notice is that the library I chose, Vert.x, by default uses all the available cores on your machine, so you don't need to use the cluster module to fork; this is all handled behind the scenes for you. And Vert.x provides an optimized async IO layer built on top of an open-source project used by big names like Google, Twitter, and Netflix, just to name a few.

If you want to test this, the first thing you need to do is install a very simple application called es4x-pm, short for ES4X project manager. We cannot run Node directly; we need to run through ES4X first.

If I show you, this is how it looks when I create a project. I can make it with the new module syntax. And I recorded this because I'm afraid I wouldn't have enough time live. So, it just has a couple of dependencies.

This is pure npm stuff. I just use Vert.x core and web, because I want to do a web application. So, I create it, and you can even use ES6 modules. I can say, okay.

My home page is a function that I will export, and it will just say hello: "Hello from Vert.x + ES4X." And, of course, now I need to bootstrap a server. I get my index, which is my main application.

And again, I just import some code from the Vert.x library, and I import my route from the module I just created.

And now I bootstrap my application. So, I create the router; it's kind of the same idea as the Express server. I create a route on the home page, and I just pass my callback.

And now I create the server. I specify who will handle my server requests (my router), and I start listening on port 8080. Same hello message.
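The two files from the demo can be sketched roughly like this. This is a sketch only: it assumes the ES4X runtime, which provides the `vertx` global, and the `@vertx/web` npm dependency, so it will not run on plain Node:

```javascript
// routes.js: export the home-page handler, as in the demo.
export function home(ctx) {
  ctx.response().end('Hello from Vert.x + ES4X!');
}

// index.js: bootstrap the application.
import { Router } from '@vertx/web';
import { home } from './routes';

// Same idea as an Express app: a router, a route, and a callback.
const app = Router.router(vertx);
app.route('/').handler(home);

// Create the server, hand requests to the router, listen on 8080.
vertx.createHttpServer()
  .requestHandler(app)
  .listen(8080);
```

Note there is no cluster boilerplate here; Vert.x spreads the load across the available cores behind the scenes.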

So, it's running. Now I can just install a couple of utilities through npm, it doesn't really matter which, and I can quickly get my application running in VS Code.

You can see it's already debugging. I can put a breakpoint, and if I put a breakpoint and now make an HTTP request, you see that the request is stopped there.

And what's interesting to see here is that, due to the nature of GraalVM, you can see in the debugger both the code from the Java side and the code that you wrote. Everything is optimized. So, the expectation is that once you write code in this way, your user code plus your runtime plus your interop plus your engine plus your IO libraries, the whole world that runs your application (in this case, the Graal JDK), will all be optimized by Graal.

Not just the script itself. You're not just optimizing the tip of the iceberg; you're optimizing everything. To test it, I submitted an implementation using this project to the TechEmpower benchmark. It was reviewed and accepted, and this is how things stand.

These are the CI builds. You can see that ES4X is now ranking at number five, which brings JavaScript from number 86, if I'm not mistaken, to number five in the single database query test, and number six when doing multiple queries. So, you can see the parallel loading and testing.

So, if I compare this experiment with all the frameworks that were already on the benchmark, this is how it stacks up. When working with JSON, the results are about two times better than the previous best result.

When going to a Postgres database and doing one query, it's three and a half times better. And, to be fair, comparing against the previous best result running on Postgres, it's six times better when doing multiple queries. When there's lots of concurrency going on, it's still two and a half times better than the previous one. And doing data updates, where the query is really the issue, it's about five times better than the previous one.

To put this in numbers: if you think about requests and responses, you see that the IO is better, of course. I'm not talking about small, tiny improvements; I'm talking about huge numbers. So, the final tip is that optimization is a never-ending job. For example, we could get better results if we used the enterprise edition of GraalVM instead of the open-source edition.

That gives you about 20% better performance. And because optimization is a never-ending job, you need to rinse and repeat, and keep going just like that.

So, the key points I want to leave you with are that there's nothing wrong with JavaScript, and JavaScript can be fast. You probably don't need to switch to Go, Rust, or whatever because you're having performance issues. If you dare to experiment, you can still remain on JavaScript.

So, if you want to learn more, you can either find me on Twitter or GitHub; the source code is on GitHub. And if there are any questions, you can catch me later. Thank you.

[ Applause ]