Event Loop in Node.js - Concurrency Model
In the last article, I mentioned that Node.js uses the event loop and a callback mechanism to offload I/O operations from the JavaScript thread to the system's kernel. In this article, we are going to understand how this actually happens.
If you have not read the previous article, I highly recommend reading it before proceeding further: Blocking and Non-Blocking in Node.js - Asynchronous Operations and Callbacks
Node.js runs our JavaScript on a single thread, and one thread can do only one thing at a time. So, if 1000 concurrent users are requesting our web server, a single-threaded system designed in the traditional way will serve one request at a time, and the other users will have to wait until their request is picked up from the wait queue. This problem demands a better solution. Let's take an example, analyze it, and then evolve a better solution.
Consider this example: the Node.js thread is given the following tasks:
- Task1 - Read a file.
- Task2 - Request another HTTP server/service.
- Task3 - Perform some CPU-intensive calculations.
If we have a single thread that simply performs the tasks given to it, the execution will be serial. Task2 will be blocked by Task1, even though the thread is just sitting idle while the OS performs the filesystem operation.
If we are able to read the file and make the network call at the same time, this is called concurrent execution. We can have the result of each task delivered at a later point in time, when the task completes, which gives us an asynchronous design.
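The difference between the serial and the asynchronous style can be sketched as follows. This is a minimal illustration, not code from the original example: the scratch file and the labels pushed into `order` are hypothetical and exist only so the sketch runs on its own.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical scratch file, created only so the example is self-contained.
const file = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'demo-')), 'data.txt');
fs.writeFileSync(file, 'hello');

const order = [];

// Serial style: the thread waits here until the whole file has been read.
const text = fs.readFileSync(file, 'utf8');
order.push('sync read done');

// Asynchronous style: the read is only started here; the result arrives later.
fs.readFile(file, 'utf8', (err, data) => {
  if (err) throw err;
  order.push('async read done');
  console.log(order);
});

// Runs before the callback above, because the thread did not wait for the file.
order.push('next task started');
```

When run, `'next task started'` is recorded before `'async read done'`, which is the point: the thread moved on to other work while the read was in progress.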
To make a single thread issue non-blocking calls, we need to change the way in which we structure our tasks.
So, we will use the single Node.js thread not to complete the task itself but to accomplish it.
What is the difference between complete and accomplish?
Let's understand. Most kernels and operating systems we work on, whether it's Linux, macOS, or Windows, have async handlers, i.e. we can ask them to do something and they will provide the result in a callback.
The operating system has many threads, which it uses to access the different system resources. The OS can access the file on one thread and, similarly, make the network call on another.
In the above diagram, we can see that the main thread (the single Node.js thread) is responsible for delegating the tasks to the OS and then operating on the results received in the callbacks. This way it is not actually performing the tasks itself but getting them done by OS threads. This makes the design efficient, because the main thread never sits idle waiting for I/O.
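As a rough sketch of this delegation, the snippet below starts Task1 and Task2 back to back and only reacts when their callbacks arrive. The scratch file, the local server on `127.0.0.1`, and the `finish` helper are assumptions made for illustration so that the example needs no outside resources.

```javascript
const fs = require('fs');
const http = require('http');
const os = require('os');
const path = require('path');

// Hypothetical scratch file and local server, standing in for the real resources.
const file = path.join(fs.mkdtempSync(path.join(os.tmpdir(), 'loop-')), 'data.txt');
fs.writeFileSync(file, 'file contents');
const server = http.createServer((req, res) => res.end('pong'));

const events = [];

server.listen(0, '127.0.0.1', () => {
  const { port } = server.address();
  let pending = 2;
  const finish = (label) => {
    events.push(label);
    if (--pending === 0) {
      server.close(); // both results are in, so the process can exit
      console.log(events);
    }
  };

  // Task1: delegate the file read.
  fs.readFile(file, 'utf8', (err) => {
    if (err) throw err;
    finish('file read finished');
  });

  // Task2: delegate the HTTP request (agent: false means a one-off connection).
  http.get({ host: '127.0.0.1', port, path: '/', agent: false }, (res) => {
    res.resume(); // drain the body; we only care that the response arrived
    res.on('end', () => finish('http response received'));
  }).on('error', (err) => { throw err; });

  // Runs first: neither task has reported back yet.
  events.push('both tasks delegated');
});
```

The first entry in `events` is always `'both tasks delegated'`. The order of the other two depends on which task finishes first, so the code does not rely on it.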
Now, we saw that we can use OS threads to perform our tasks, but what about the CPU-intensive tasks?
These cannot be delegated to the OS because it has no handling for such tasks. So, the main thread would have to execute them. This would keep the main thread busy and leave the other operations waiting. Node.js solves this problem using a Thread Pool.
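To see why CPU-bound work on the main thread is a problem, consider this small sketch. The `busyWork` function and the 200 ms figure are hypothetical stand-ins for a real calculation.

```javascript
const start = Date.now();
let timerDelay = null;

// Ask for a callback as soon as possible.
setTimeout(() => {
  timerDelay = Date.now() - start;
  console.log(`timer fired after ${timerDelay} ms`);
}, 0);

// A hypothetical CPU-heavy task: spin for about 200 ms without yielding.
function busyWork(ms) {
  const end = Date.now() + ms;
  let iterations = 0;
  while (Date.now() < end) iterations++;
  return iterations;
}

busyWork(200);
```

Although the timer was requested with a delay of 0, its callback cannot run until `busyWork` returns, so it fires at least 200 ms late. Every other callback waiting on the main thread is delayed in the same way.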
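Node.js provides a Thread Pool which by default has 4 threads (this can be changed with the UV_THREADPOOL_SIZE environment variable). These threads are used to perform asynchronous tasks which cannot be performed by the OS, such as the hashing and compression done by the built-in crypto and zlib modules. Note that CPU-intensive code written in plain JavaScript still runs on the main thread.
Now, tasks 1, 2, and 3 can all be performed independently or concurrently, and the main thread will still be responsive to any other task delegation.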
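One way to observe the thread pool is with the asynchronous `crypto.pbkdf2`, which Node.js runs on pool threads. The sketch below is illustrative only: the password, salt, iteration count, and the `JOBS` constant are arbitrary choices, not values from this article.

```javascript
const crypto = require('crypto');

const log = [];
const JOBS = 4; // chosen to match the default pool size of 4

for (let i = 1; i <= JOBS; i++) {
  // pbkdf2 is deliberately expensive; the async form runs on a thread-pool thread.
  crypto.pbkdf2('secret', 'salt', 100000, 64, 'sha512', (err, key) => {
    if (err) throw err;
    log.push(`hash ${i} done`);
    if (log.length === JOBS + 1) console.log(log);
  });
}

// Reached immediately: the hashing is happening off the main thread.
log.push('main thread is free');
```

The first entry in `log` is `'main thread is free'`, because the four hashes are computed in the background. With more jobs than pool threads, the extra jobs would wait for a free thread, which is where raising `UV_THREADPOOL_SIZE` can help.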
This design is implemented in Node.js through the Event Loop. The event loop is divided into what are called phases.
Note: Here we have taken only a few phases for the explanation. There are other phases as well.
Each phase has a callback queue associated with it. For example, when we call setTimeout, a callback is added to the callback queue of the timers phase.
If we need an I/O operation performed, the request is handed to the OS (or the thread pool), and after the task's completion its result callback is enqueued in the callback queue of the poll phase.
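The phase order can be made visible with a small experiment. Note the assumption: `setImmediate` and its "check" phase are not covered in this article; they are used here only because the check phase comes right after the poll phase, which makes the ordering deterministic.

```javascript
const fs = require('fs');

const order = [];

// The fs.stat callback runs in the poll phase, so we schedule from inside it.
fs.stat('.', () => {
  // Queued for the timers phase, which is only reached on the next loop iteration.
  setTimeout(() => {
    order.push('timeout (timers phase)');
    console.log(order);
  }, 0);

  // Queued for the check phase, which follows the poll phase directly.
  setImmediate(() => {
    order.push('immediate (check phase)');
  });
});
```

Because both callbacks are scheduled from within an I/O callback, the immediate always runs before the timeout: the loop finishes the poll phase, visits the check phase, and only then wraps around to the timers phase.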
The event loop enters one phase at a time, traverses that phase's callback queue, and executes the callbacks. It then moves to the next phase, either when all the callbacks have been executed or when the limit on the number of callbacks for that phase has been reached. For example, if the limit were 50 callbacks, the loop would move to the next phase after executing 50 and leave the remaining callbacks pending until that phase is reached again.
The event loop makes Node.js async operations really fast by using OS threads and the thread pool. The application itself stays responsive, with the main thread free to handle incoming task requests.
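The traversal described above can be modelled in a few lines. This is a toy model only, not how Node.js is implemented internally: the `runLoop` function, the two phase names, and the limit of 2 are simplifying assumptions chosen to keep the output short.

```javascript
// Toy model: each phase owns a FIFO queue of callbacks.
function runLoop(phases, limit) {
  const executed = [];
  // Keep cycling while any phase still has callbacks queued.
  while (phases.some((phase) => phase.queue.length > 0)) {
    for (const phase of phases) {
      let ran = 0;
      // Run callbacks in order until the queue is empty or the limit is hit.
      while (phase.queue.length > 0 && ran < limit) {
        const callback = phase.queue.shift();
        executed.push(callback());
        ran++;
      }
      // Anything left over stays pending until this phase comes around again.
    }
  }
  return executed;
}

const phases = [
  { name: 'timers', queue: [() => 't1', () => 't2', () => 't3'] },
  { name: 'poll', queue: [() => 'p1'] },
];

const result = runLoop(phases, 2);
console.log(result); // [ 't1', 't2', 'p1', 't3' ]
```

With a limit of 2, the third timers callback is left pending while the loop moves on to the poll phase, and it runs only when the timers phase is reached again.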
I hope you will now appreciate the event loop design.
Next Article of this series: Chrome V8 Engine - Javascript Runtime for Node.js