I/O-bound vs CPU-bound in Node.js
You may have heard that Node.js is good for I/O-bound applications, and that the commonly mentioned counterpart is the CPU-bound application. You may have wondered what these terms actually mean.
"What do the terms 'CPU bound' and 'I/O bound' mean?"
Bound implies performance bottleneck
A computation is said to be bound by a resource when that resource is the bottleneck that limits its performance. To figure out whether your program is CPU-, memory- or I/O-bound, ask yourself: which resource would have to be increased for the program to perform better? Would a faster CPU speed it up? More memory, a faster hard disk or a faster network connection? These questions lead you straight to the resource that is holding your program back.
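One rough way to ask that question of a running Node.js process is to compare the CPU time it consumed with the wall-clock time that passed. The sketch below is only an illustration: cpuShare, spin and sleep are hypothetical helper names, and the ratio is an approximation because process.cpuUsage() counts every thread in the process.

```javascript
// Hypothetical helper: fraction of wall-clock time the process spent on the CPU
// while `task` was running. Close to 1 suggests CPU-bound, close to 0 suggests waiting.
async function cpuShare(task) {
  const wallStart = process.hrtime.bigint(); // nanoseconds
  const cpuStart = process.cpuUsage();       // user + system time in microseconds
  await task();
  const cpu = process.cpuUsage(cpuStart);    // delta since cpuStart
  const wallMicros = Number(process.hrtime.bigint() - wallStart) / 1000;
  return (cpu.user + cpu.system) / wallMicros;
}

// Stand-in for CPU-heavy work: a busy loop.
function spin(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {}
}

// Stand-in for waiting on I/O: a timer.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function main() {
  console.log('busy loop:', (await cpuShare(() => spin(200))).toFixed(2));
  console.log('sleeping: ', (await cpuShare(() => sleep(200))).toFixed(2));
}

main();
```

The busy loop should report a share near 1 and the timer a share near 0, although the exact numbers vary from machine to machine.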
Express.js app is I/O-bound
This is the case for the typical Node.js web server application. Most of the time is spent waiting for network, filesystem and perhaps database I/O to complete. A faster hard disk or network connection improves the overall performance.
In its most basic form, Node.js is best suited for this type of computing. I/O in Node.js is non-blocking by default (the explicitly synchronous APIs such as fs.readFileSync() are the exception), which allows other requests to be served while waiting for a particular read or write to complete.
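A minimal sketch of what non-blocking means in practice: the filesystem call below returns immediately, and the program carries on while the operating system does the work. The demo function and the event labels are made up for illustration, and process.execPath is used only because that file is guaranteed to exist.

```javascript
const { stat } = require('node:fs/promises');

// Hypothetical demo: records the order in which things happen.
async function demo() {
  const events = [];

  // Start a filesystem operation; the promise resolves later.
  const pending = stat(process.execPath).then((info) => {
    events.push(`stat finished (${info.size} bytes)`);
  });

  // These lines run before the stat result is available.
  events.push('stat requested');
  events.push('doing other work');

  await pending;
  return events;
}

demo().then((events) => console.log(events.join('\n')));
```

The "stat finished" entry always comes last, because the callback can only run once the current synchronous code has returned control to the event loop.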
SHA-1 hasher is CPU-bound
An example of a CPU-bound application would be a service that calculates SHA-1 checksums. The majority of the time is spent crunching the hash - shifting and bitwise XORing the input data.
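For reference, this is roughly what such a calculation looks like with the built-in crypto module. The function name sha1Hex is an assumption made for this sketch. Note that the call is synchronous, so for a large input the main thread is busy for the whole duration of the hash.

```javascript
const { createHash } = require('node:crypto');

// Hypothetical helper: returns the SHA-1 checksum of a string or Buffer as hex.
function sha1Hex(input) {
  return createHash('sha1').update(input).digest('hex');
}

console.log(sha1Hex('abc')); // a9993e364706816aba3e25717850c26c9cd0d89d
```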
This kind of application leads to trouble in Node.js. If the application spends too much time on a CPU-intensive task, all other requests are held up. Node.js runs a single-threaded event loop to advance many computations concurrently, for example serving multiple incoming HTTP requests. This works well as long as every event handler is short and spends most of its time waiting for further events. But if you perform a CPU-intensive calculation, your concurrent Node.js web server will come to a screeching halt. Other incoming requests have to wait because only one request is served at a time - not a very good service level.
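The stall is easy to reproduce. In the sketch below a 10 ms timer stands in for another incoming request, and a busy loop stands in for the CPU-intensive calculation; measureTimerDelay is a hypothetical name chosen for this example.

```javascript
// Hypothetical demo: how late does a 10 ms timer fire when the main thread is busy?
function measureTimerDelay(blockMs) {
  return new Promise((resolve) => {
    const scheduled = Date.now();

    // The "other request": it should be handled after about 10 ms.
    setTimeout(() => resolve(Date.now() - scheduled), 10);

    // The CPU-intensive task: nothing else can run until this loop ends.
    const end = Date.now() + blockMs;
    while (Date.now() < end) {}
  });
}

measureTimerDelay(200).then((delay) => {
  console.log(`10 ms timer fired after ${delay} ms`);
});
```

The timer cannot fire before the loop has finished, so the reported delay is at least the blocking time rather than the requested 10 ms.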
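There are strategies for coping with CPU-intensive tasks. You can move the calculation elsewhere: fork a child process, use the cluster module, use worker threads (the worker_threads module, or libuv's thread pool from a native addon), or create a separate service. If you still want to do it in the main thread, the least you can do is give execution back to the event loop frequently with setImmediate(). Note that process.nextTick() and await on an already-resolved promise are not enough on their own: their callbacks run before the event loop moves on to pending I/O, so a long chain of them can still starve other requests.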
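As a sketch of the "stay in the main thread but yield often" approach, the example below splits a long loop into chunks and schedules each next chunk with setImmediate(), whose callbacks run after the event loop has had a chance to process pending I/O and timers. The function sumChunked and its chunkSize default are assumptions made for illustration; a sensible chunk size depends on how expensive each step is.

```javascript
// Hypothetical example: sum the integers 0..n-1 without hogging the event loop.
function sumChunked(n, chunkSize = 100000) {
  return new Promise((resolve) => {
    let i = 0;
    let total = 0;

    function step() {
      // Do a bounded amount of work...
      const end = Math.min(i + chunkSize, n);
      for (; i < end; i++) total += i;

      // ...then either finish or let the event loop run before the next chunk.
      if (i < n) setImmediate(step);
      else resolve(total);
    }

    step();
  });
}

// A timer scheduled alongside the computation gets a chance to run between chunks.
setTimeout(() => console.log('timer fired'), 0);
sumChunked(50000000).then((total) => console.log('sum:', total));
```

The total amount of CPU work is unchanged, so the calculation itself does not get faster. What improves is latency for everything else that shares the event loop.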
Not every Node.js program needs a high level of concurrency. For example, when running webpack during a build, it doesn't really matter whether it's CPU-intensive or not. Sure, it may take time, but it's running a dedicated job instead of serving HTTP requests.
Healthy Node.js application is I/O-bound
A typical healthy Node.js server application is I/O-bound. That is what Node.js was designed for and handles well using the single-threaded event loop. CPU-bound tasks cause trouble if not handled correctly - by yielding execution frequently back to the event loop or by moving them to another thread, process or service. There are also other classifications that we did not touch on here, such as memory-bound or cache-bound.
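Moving the work to another thread can be sketched with the built-in worker_threads module. This is a minimal illustration under a few assumptions: sha1InWorker is a hypothetical name, the worker code is passed as a string with the eval option only to keep the example in one file, and a real service would more likely reuse a pool of workers than start one per call.

```javascript
const { Worker } = require('node:worker_threads');

// Source of the worker, kept inline so the example is a single file.
const workerSource = `
  const { parentPort, workerData } = require('node:worker_threads');
  const { createHash } = require('node:crypto');
  // The CPU-heavy part runs here, off the main thread.
  parentPort.postMessage(createHash('sha1').update(workerData).digest('hex'));
`;

// Hypothetical helper: computes a SHA-1 checksum in a worker thread.
function sha1InWorker(input) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: input });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`worker exited with code ${code}`));
    });
  });
}

sha1InWorker('abc').then((hex) => console.log(hex));
// a9993e364706816aba3e25717850c26c9cd0d89d
```

While the worker is hashing, the main thread's event loop stays free to accept and answer other requests.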