How fast can the asyncio module and PostgreSQL work together?

Tasks where I/O matters

It's [web services] becoming just a proxy in front of database systems, or specialized web services, that sends back JSON responses to the web browser, and let it handle all the templating and all the display work.   
A new development era (2013) by Tarek Ziadé  

That blog post by Tarek Ziadé was quite relevant back then, three years ago, and it's even more relevant now. Nowadays there are a lot of different types of services that require almost no work other than receiving a request, sending a new request to a 3rd-party service such as a DB, and returning some part of the received data to the user. These are simple chats and message exchange services, services for collecting data, recommendation engines, etc... These are places where I/O actually matters, and it's the spot where Python's asyncio module fits perfectly.

I got very inspired after reading the article uvloop: Blazing fast Python networking, so I decided to try to build an asynchronous prototype of a service that would use all the new and cool async approaches in Python to get as many requests per second as possible. The important thing to understand is that asyncio does not actually increase Python's speed or productivity; for that there are the PyPy and Pyston projects. What the async approach can increase is the throughput of applications that don't do a lot of computation but have a lot of I/O; basically, it is very efficient for the proxy-like parts of an application.
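
To make that concrete, here is a tiny illustration (not from the service described below) of why asyncio helps exactly there: while one coroutine waits on I/O, others make progress, so three simulated one-second calls finish in about one second of wall-clock time.

import asyncio

async def fake_io(n):
    await asyncio.sleep(1)  # stands in for a DB or HTTP round-trip
    return n

async def main():
    # all three "requests" wait concurrently: total time is ~1s, not ~3s
    results = await asyncio.gather(fake_io(1), fake_io(2), fake_io(3))
    print(results)

asyncio.get_event_loop().run_until_complete(main())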

As a suitable problem, I decided to make a very primitive Ad Serving Service that works directly with the DB in a simple proxy-like manner: it logs received request data to the DB and fetches some data from the DB based on request parameters to serve to the user, and that is all; the service doesn't include any additional control logic or computations. I'd like to describe some of that experience in this post.

The technology stack used:

- uvloop and httptools
- aiopg (on top of the psycopg adapter)
- PostgreSQL 9.5

I chose uvloop and httptools because of their speed: in a few words, they are about 10 times faster than aiohttp with the built-in asyncio loop, and they completely suit my task. For more details on this, please check the amazing performance benchmark report from the MagicStack guys.
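
Swapping in uvloop is essentially a one-liner; a minimal sketch:

import asyncio
import uvloop

# replace the default event loop policy so every new loop is a uvloop one
asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
loop = asyncio.get_event_loop()  # this is now a uvloop event loop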

Why PostgreSQL? Well, with release 9.5, which introduced substantial performance improvements for multi-CPU machines and the new GROUP BY analysis features GROUPING SETS, CUBE and ROLLUP, it became very suitable for warehousing and OLAP, and in the context of building an AdServer those are features that are required. I also highly recommend checking out this presentation by Josh Berkus about using and setting up PostgreSQL as a DW database.
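
For illustration, this is the kind of single-pass aggregate that ROLLUP enables; the table and column names here are made up, not taken from my service:

-- per-(campaign, country) counts, per-campaign subtotals and a grand
-- total, computed in a single pass over a hypothetical request log
select campaign, country, count(*) as impressions
from requests_log
group by rollup (campaign, country);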

Working with PostgreSQL in async way

PostgreSQL supports async connections, but not everything goes that smoothly with them. I'm not sure whether it's only a psycopg adapter problem or PostgreSQL itself, but the asynchronous connections docs say:

When an asynchronous query is being executed, connection.isexecuting() returns True. Two cursors can’t execute concurrent queries on the same asynchronous connection.

There are several limitations in using asynchronous connections: the connection is always in autocommit mode and it is not possible to change it. So a transaction is not implicitly started at the first query and it is not possible to use the methods commit() and rollback(): you can manually control transactions using execute() to send database commands such as BEGIN, COMMIT and ROLLBACK. Similarly set_session() can’t be used but it is still possible to invoke the SET command with the proper default_transaction_... parameter.

With asynchronous connections it is also not possible to use set_client_encoding(), executemany(), large objects, named cursors.

COPY commands are not supported either in asynchronous mode, but this will be probably implemented in a future release.

There is, sure, a nice library aiopg that makes things easier, but since it's based on the psycopg adapter, it shares the same limitations for asynchronous connections. This actually was a bit disappointing to me, especially the cursors that can't execute concurrent queries... At some point it even looks like it demands a multi-threading/multiprocessing approach to get maximum throughput. I hope they will add support for concurrent queries on asynchronous connections. It also seems to put some limitations on setting the transaction type to read-only, and the same probably goes for setting the isolation level.
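
For illustration, here is a minimal sketch of working with aiopg under these constraints: the connection is always in autocommit mode, so transactions are driven manually through execute(). The DSN and table are placeholders, not from my service.

import asyncio
import aiopg

DSN = 'dbname=test user=postgres'  # placeholder connection string

async def log_one(payload):
    conn = await aiopg.connect(DSN)
    async with conn.cursor() as cur:
        # autocommit-only: transaction commands are sent as plain SQL
        await cur.execute('BEGIN')
        await cur.execute('INSERT INTO tbl (data) VALUES (%s)', (payload,))
        await cur.execute('COMMIT')
    conn.close()

asyncio.get_event_loop().run_until_complete(log_one('{"ad": 1}'))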

For the test DB I created two simple tables, one with some hacks to increase write throughput for the request log:

-- unlogged table: skips the WAL, so writes are much faster (data is lost on a crash)
create unlogged table tbl (ID SERIAL PRIMARY KEY, data Json);
-- disable autovacuum on this table to avoid background write overhead
ALTER TABLE tbl SET (autovacuum_enabled = false, toast.autovacuum_enabled = false);

and another to retrieve data from for serving to the user, with a good number of rows to represent the variety of possible ad variants to show:

-- uuid_generate_v4() requires the uuid-ossp extension
create table t_random as select s, uuid_generate_v4() from generate_series(1,50000000) s;
CREATE UNIQUE index on t_random (uuid_generate_v4);

My original intention was to use one connection with a single cursor for writing request logs to the DB and one connection with multiple cursors for fetching data, but in the end I decided to use a single cursor for each connection, with a queue of tasks for each worker. So both types of workers, for writing to and reading from the DB, look pretty much the same:

import aiopg

async def worker(loop, queue, conn_param, handler=None, dump_size=1):
    # one connection with a single cursor per worker; work arrives via the queue
    conn = await aiopg.connect(loop=loop, **conn_param)
    async with conn.cursor() as cur:
        values = []
        while True:
            values.append(await queue.get())
            # >= so that dump_size=1 means one statement per execute
            if len(values) >= dump_size:
                # each queued item is a complete SQL statement; send the
                # accumulated batch in a single round-trip
                await cur.execute('\n'.join(values))
                values.clear()
                if handler is not None:
                    # reading workers pass a handler to consume the rows
                    handler(await cur.fetchall())
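
One consequence of this design worth spelling out: each queue item has to be a complete SQL statement, since the worker joins a batch with newlines and executes it in one go. A hedged sketch of the producing side (names are illustrative, and real code should escape or parameterize values instead of interpolating them):

import json

async def enqueue_log(write_queue, request_params):
    # illustrative only: build one full INSERT statement per request
    stmt = "INSERT INTO tbl (data) VALUES ('%s');" % json.dumps(request_params)
    await write_queue.put(stmt)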

The writing worker accumulated 10000 log entries before writing them to the DB, while the reading worker queried the DB for each request and then responded to the client with that data. The code in this example is a bit different from what I actually wrote, but it represents what is going on quite accurately. I also needed to create my own HTTP protocol handler to have a custom and fast HTTP header parser:

import asyncio
import httptools


class HttpProtocol(asyncio.Protocol):
    __slots__ = (
        '_loop', '_transport', '_request', '_parser',
        '_url', '_headers', '_read_queue', '_write_queue')

    def __init__(self, loop, read_queue, write_queue):
        self._loop = loop
        self._read_queue = read_queue
        self._write_queue = write_queue
        self._reset()

    def _reset(self):
        self._request = None
        self._parser = None
        self._headers = []
        self._url = None

    def on_url(self, url):
        self._url = url

    def on_header(self, name, value):
        self._headers.append((name, value))

    def on_headers_complete(self):
        self._request = (
            self._url, self._headers,
            self._parser.get_http_version())

        # put the parsed request-related params on both worker queues
        # (the original post elides the exact payload; the just-built
        # request tuple is used here as a stand-in)
        self._loop.create_task(self._write_queue.put(self._request))
        self._loop.create_task(self._read_queue.put(self._request))

        self._reset()

    def connection_made(self, transport):
        self._transport = transport

    def connection_lost(self, exc):
        self._transport = None

    def data_received(self, data):
        # create the parser lazily so headers split across several TCP
        # chunks keep feeding the same parser instance
        if self._parser is None:
            self._parser = httptools.HttpRequestParser(self)
        self._parser.feed_data(data)

This HttpProtocol class is pretty much the same as the implementation from the MagicStack guys. The really different and most important part here is creating tasks in the on_headers_complete callback that put some request-related params onto the queues. I guess that's it for the code; if you are interested, you can find the full source code in my repo, but it is very buggy code with lots of possible issues, so I don't recommend using it for any purpose other than fun...
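
To give a feel for how it all fits together, here is a hedged sketch of the wiring; the DB params and the response handler are placeholders, not names from the original code:

import asyncio
import uvloop

asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
loop = asyncio.get_event_loop()

read_queue = asyncio.Queue(loop=loop)
write_queue = asyncio.Queue(loop=loop)

# placeholders: real code passes actual DB params and a response callback
loop.create_task(worker(loop, write_queue, WRITE_DB_PARAMS, dump_size=10000))
loop.create_task(worker(loop, read_queue, READ_DB_PARAMS,
                        handler=send_response, dump_size=1))

server = loop.run_until_complete(loop.create_server(
    lambda: HttpProtocol(loop, read_queue, write_queue), '127.0.0.1', 8888))
loop.run_forever()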

The numbers

All tests were performed with the wrk tool on an Amazon EC2 t2.micro instance, using RDS PostgreSQL 9.5 (db.t2.small).

The first test shows the speed of logging data to the PostgreSQL DB:

> wrk -t17 -c500 -d30s http://127.0.0.1:8888/
Running 30s test @ http://127.0.0.1:8888/
  17 threads and 500 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    21.33ms   38.17ms 440.43ms   89.86%
    Req/Sec   696.26    484.09     5.43k    66.68%
  328070 requests in 30.08s, 1.55GB read
Requests/sec:  10905.09
Transfer/sec:     52.70MB

The second test shows the speed of both logging and retrieving data at the same time:

> wrk -t17 -c500 -d30s http://127.0.0.1:8888/
Running 30s test @ http://127.0.0.1:8888/
  17 threads and 500 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   148.90ms   64.22ms 462.27ms   89.39%
    Req/Sec   215.01     84.39   363.00     80.28%
  101320 requests in 30.10s, 9.37MB read
Requests/sec:   3365.84
Transfer/sec:    318.83KB

Conclusion

The new asyncio module has its sweet spot in Python development, and it's getting more and more attractive with the appearance of new libraries and tools around it, especially for developing simple web-based services where you need to handle a lot of requests in a single thread without much computation. I will definitely use asyncio for simple tasks of data aggregation/retrieval and other cases of simple proxy-like behaviour, but I wouldn't use it in more complex applications where I/O doesn't matter that much or where more control is required. In the end, I highly recommend checking out the article Asynchronous Python and Databases by Mike Bayer, the creator of SQLAlchemy.
