Non-blocking I/O and Node.js
A while ago I researched non-blocking I/O. I started with Node.js (a non-blocking I/O framework built on Google Chrome's JavaScript engine, intended for writing highly scalable networking applications) and I was surprised by how fast an HTTP server built with this framework can handle thousands of concurrent requests, and how efficiently it uses memory.
It can do this because Node.js doesn't start a new thread or process when a new request comes to the server. Everything in Node.js runs in a single thread and nothing blocks. It makes asynchronous I/O calls and tells the operating system to notify it when the I/O task is completed, using epoll (Linux), kqueue (FreeBSD), select, or whatever your OS provides for this kind of thing. In the meantime, Node.js can continue processing other requests or doing extra work. It never, ever blocks.
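To make the readiness-notification idea concrete, here is a minimal sketch in Python using the standard selectors module (which wraps epoll/kqueue/select, the same primitives Node.js relies on). The callback name and the socketpair setup are mine, for illustration only:

```python
import selectors
import socket

# A tiny readiness-notification loop: one thread, no blocking reads.
# The OS (via epoll/kqueue/select) tells us which sockets are ready.
sel = selectors.DefaultSelector()

# socketpair() gives us two connected sockets to demo with.
a, b = socket.socketpair()
a.setblocking(False)
b.setblocking(False)

def on_readable(sock):
    # Guaranteed not to block: the OS just told us the socket is ready.
    return sock.recv(1024)

# Register interest in "b is readable", attaching a callback as data.
sel.register(b, selectors.EVENT_READ, on_readable)

a.send(b"hello")  # make b readable

# One iteration of the event loop: wait for readiness, run callbacks.
received = []
for key, _mask in sel.select(timeout=1):
    callback = key.data
    received.append(callback(key.fileobj))

sel.unregister(b)
a.close()
b.close()
print(received)  # [b'hello']
```

A real event loop would run `sel.select()` forever, dispatching callbacks as sockets become readable or writable; this is the loop Node.js hides from you.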
Another remarkable thing is that you don't have a separate stack for each connection, since you don't have threads. That means huge memory savings when your server runs at high concurrency levels.
Read more at Node's official page. It's a very promising project, and it's in its early stages.
The issues of the non-blocking Node.js programming model
An issue with this programming model is that your code must be written as a set of callbacks that are invoked when each I/O operation is done. To be more explicit, let's look at this example:
var http = require("http")

var server = http.createServer(function (req, res) {
  http.get({'host': 'google.com'}, function (google_response) {
    setTimeout(function () {
      res.end(google_response.headers['location'])
    }, 2000)
  })
  res.writeHead(200, {'Content-Type': 'text/plain'})
  res.write("hello ")
})

server.listen(8000)
The code just runs a server on localhost at port 8000. When you make a request to http://localhost:8000 it will write “hello”, make an HTTP GET request to Google, wait for 2 seconds, and then print the location header. Note that I wrote the code using callback functions. Normally in Node.js, almost all your code looks like this.
In addition, you need to write JavaScript on the server side. Although, if you don't like JS, you can write CoffeeScript for Node instead. If you come from languages like Python or Ruby you'll probably like CoffeeScript.
Eventlet (The Pythonic Way)
Since I'm a Python enthusiast and I don't want to write code the way Node.js proposes, I switched my research to Eventlet, a Python library that provides a synchronous interface for doing asynchronous I/O operations.
Green Threads And Coroutines
Eventlet uses green threads to achieve cooperative sockets. Python's green threads are built on top of greenlet, a module from Stackless Python that implements coroutines for the Python language. One good thing about green threads is that they are cheap: spawning a new green thread is much faster than creating a new POSIX thread, and it consumes much less memory too!
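Greenlet gives you real coroutines with their own (small) stacks. As a rough stdlib-only illustration of the cooperative idea, not the greenlet API itself, plain generators can play the role of green threads that voluntarily yield to a scheduler:

```python
# A rough sketch of cooperative scheduling using plain generators --
# not the greenlet API, but the same idea: tasks run until they
# voluntarily yield, and switching costs far less than an OS thread.
def task(name, steps, log):
    for i in range(steps):
        log.append(name + ":" + str(i))
        yield  # cooperative switch point (greenlet's switch() plays this role)

def run(tasks):
    # Round-robin scheduler: resume each task until all are finished.
    while tasks:
        current = tasks.pop(0)
        try:
            next(current)
            tasks.append(current)  # not done yet, requeue it
        except StopIteration:
            pass  # task finished

log = []
run([task("a", 2, log), task("b", 2, log)])
print(log)  # ['a:0', 'b:0', 'a:1', 'b:1']
```

The interleaved output shows both "threads" making progress on a single OS thread, with switches happening only at explicit yield points, which is what makes the behaviour deterministic.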
Taking advantage of coroutines, Eventlet can patch the socket-related modules of the Python standard library and work through them, changing their synchronous behaviour to asynchronous behaviour. This means you don't need to change your synchronous code to make it asynchronous!
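Under the hood Eventlet does this by monkey-patching: replacing blocking functions in modules like socket and time with cooperative versions at runtime, so unchanged caller code transparently gets the new behaviour. Here is a toy, stdlib-only sketch of the technique itself (patching time.sleep with a stand-in recorder; this is not Eventlet's actual code):

```python
import time

calls = []
_original_sleep = time.sleep

def cooperative_sleep(seconds):
    # Eventlet's replacement would yield to the event loop here instead
    # of blocking the whole process; we just record the call to show
    # that the swap is invisible to callers.
    calls.append(seconds)

# The monkey-patch: any code that already calls time.sleep(...) now
# runs our version, with no changes to the calling code.
time.sleep = cooperative_sleep
try:
    time.sleep(2)
finally:
    time.sleep = _original_sleep  # restore the original

print(calls)  # [2]
```

With Eventlet itself, the equivalent is a single call to eventlet.monkey_patch() near the top of your program.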
If you want examples of what Eventlet can do, read this.
Benchmarking
Finally, I ran a little benchmark comparing a Node.js server, a WSGI server using Eventlet, and the Python HTTPServer from the standard library.
The Node.js server:
var http = require('http');

http.createServer(function (req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end('Hello World\n');
  console.log(req.headers['host'] + " - - [" + req.client._idleStart + "] \"" +
              req.method + " " + req.url + " " + req.httpVersion + "\" " +
              res.statusCode + " -");
}).listen(6000, "127.0.0.1");

console.log('Server running at http://127.0.0.1:6000/');
The Eventlet WSGI server:
from eventlet import wsgi
import eventlet

def handler(env, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return ['Hello, World!\r\n']

wsgi.server(eventlet.listen(('', 7000)), handler)
The stdlib HTTP server:
from SocketServer import ThreadingMixIn
from BaseHTTPServer import HTTPServer, BaseHTTPRequestHandler

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-type", "text/plain")
        self.end_headers()
        self.wfile.write('Hello, World!\r\n')

class SimpleHTTPServer(ThreadingMixIn, HTTPServer):
    pass

server = SimpleHTTPServer(("localhost", 8000), Handler)
print "Serving on port: %s" % 8000
server.serve_forever()
Now I have the Node.js server running on port 6000, the Eventlet WSGI server on port 7000, and the Python HTTP server on port 8000.
Let's use the Apache benchmark tool to make 10K requests to each server with a concurrency level of 5 (e.g. `ab -n 10000 -c 5 http://localhost:8000/`):
Python HTTP Server Results:
Server Software:        BaseHTTP/0.3
Server Hostname:        localhost
Server Port:            8000

Document Path:          /
Document Length:        15 bytes

Concurrency Level:      5
Time taken for tests:   8.956 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      1320000 bytes
HTML transferred:       150000 bytes
Requests per second:    1116.51 [#/sec] (mean)
Time per request:       4.478 [ms] (mean)
Time per request:       0.896 [ms] (mean, across all concurrent requests)
Transfer rate:          143.93 [Kbytes/sec] received
Eventlet WSGI Server Results:
Server Software:
Server Hostname:        localhost
Server Port:            7000

Document Path:          /
Document Length:        15 bytes

Concurrency Level:      5
Time taken for tests:   3.796 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      1360000 bytes
HTML transferred:       150000 bytes
Requests per second:    2634.18 [#/sec] (mean)
Time per request:       1.898 [ms] (mean)
Time per request:       0.380 [ms] (mean, across all concurrent requests)
Transfer rate:          349.85 [Kbytes/sec] received
Node.js Server Results:
Server Software:
Server Hostname:        localhost
Server Port:            6000

Document Path:          /
Document Length:        15 bytes

Concurrency Level:      5
Time taken for tests:   1.821 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      790000 bytes
HTML transferred:       150000 bytes
Requests per second:    5489.98 [#/sec] (mean)
Time per request:       0.911 [ms] (mean)
Time per request:       0.182 [ms] (mean, across all concurrent requests)
Transfer rate:          423.54 [Kbytes/sec] received
Now let's increase the concurrency level. Let's set it to 100.
Eventlet WSGI Server Results:
Server Software:
Server Hostname:        localhost
Server Port:            7000

Document Path:          /
Document Length:        15 bytes

Concurrency Level:      100
Time taken for tests:   9.063 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      1360000 bytes
HTML transferred:       150000 bytes
Requests per second:    1103.35 [#/sec] (mean)
Time per request:       90.633 [ms] (mean)
Time per request:       0.906 [ms] (mean, across all concurrent requests)
Transfer rate:          146.54 [Kbytes/sec] received
Node.js Server Results:
Server Software:
Server Hostname:        localhost
Server Port:            6000

Document Path:          /
Document Length:        15 bytes

Concurrency Level:      100
Time taken for tests:   1.463 seconds
Complete requests:      10000
Failed requests:        0
Write errors:           0
Total transferred:      790000 bytes
HTML transferred:       150000 bytes
Requests per second:    6834.49 [#/sec] (mean)
Time per request:       14.632 [ms] (mean)
Time per request:       0.146 [ms] (mean, across all concurrent requests)
Transfer rate:          527.27 [Kbytes/sec] received
Python HTTP Server Results:
Benchmarking localhost (be patient)
Completed 1000 requests
Completed 2000 requests
Completed 3000 requests
Completed 4000 requests
Completed 5000 requests
Completed 6000 requests
Completed 7000 requests
apr_socket_recv: Connection reset by peer (104)
Total of 7830 requests completed
Oops! The server breaks with a connection reset (I ran the test several times and it never completed the 10K requests).
Note: I also ran the test with a concurrency level of 1K, and only Node.js passed. Both Python servers broke at some point.
Conclusion
Based on these benchmarks, I think there's no room for debate about which framework is more scalable and more efficient.
However, if you don't need to handle a huge number of concurrent requests and you want to write your app in pure Python, I recommend Eventlet over the standard synchronous socket library. The advantage of cheap green threads makes the difference when you need to do concurrent I/O operations. In addition, green threads offer deterministic behaviour and have no context-switch overhead (unlike POSIX threads and processes). This video shows it better.
A great feature of Eventlet is that you don't have to rewrite your code to make it asynchronous. You can start with ordinary synchronous code and change your application's behaviour later by patching the socket library with Eventlet.
Looking forward
This post was not intended to settle which framework or library is better, more efficient, or more beautiful. It's just a mind-opening article: I've shown you a different model for doing I/O in networking applications. This is just the start! I recommend you dig deeper into this model of I/O; it seems set to grow stronger in the coming years with the advent of real-time web applications and Comet technologies.
Now it’s time to think about my new project… And by the way, it includes non-blocking I/O, a bunch of networking, and of course, Python =).