Wasn't there an article about how the async syntax was benchmarked to actually be slower than the traditional way of using threads? What's the current story on python async?

reference: http://calpaterson.com/async-python-is-not-faster.html

The "slower" is not really the problem--as the article notes, the sync frameworks it tested have most of the heavy lifting being done in native C code, not Python bytecode, whereas the async frameworks are all pure Python. Pure Python is always going to be slower than native C code. I'm actually surprised that the pure Python async frameworks managed to do as well as they did in throughput. But of course this issue can be solved by coding async frameworks in C and exposing the necessary Python API using bindings, the same way the sync frameworks do now. So the comparison of throughput isn't really fair.

The real issue, as the article notes, is latency variation. Because async frameworks rely on cooperative multitasking, there is no way for the event loop to preempt a worker that is taking too long in order to maintain reasonable latency for other requests.
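
To make the latency point concrete, here is a minimal sketch using plain asyncio. It is not the article's benchmark code: the handler names are made up for illustration, and `time.sleep()` stands in for any CPU-bound or blocking work that never yields.

```python
import asyncio
import time


async def fast_request(started):
    # Well-behaved handler: awaiting hands control back to the event loop.
    await asyncio.sleep(0.01)
    return time.perf_counter() - started


async def hog_request():
    # Badly behaved handler (hypothetical): time.sleep() blocks the whole
    # event loop, and the loop has no way to preempt it.
    time.sleep(0.2)


async def measure(with_hog):
    started = time.perf_counter()
    tasks = [asyncio.create_task(fast_request(started)) for _ in range(5)]
    if with_hog:
        tasks.append(asyncio.create_task(hog_request()))
    results = await asyncio.gather(*tasks)
    # hog_request returns None, so only the fast handlers are measured.
    return max(r for r in results if r is not None)


def worst_latency(with_hog):
    # Worst completion time, in seconds, among the fast handlers.
    return asyncio.run(measure(with_hog))
```

Calling `worst_latency(False)` and then `worst_latency(True)` should show every fast handler being held up for roughly the full time the hog spends blocking, even though the fast handlers themselves did nothing wrong.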

There is one thing I wonder about with this article, though. The article says each worker is making a database query. How is that being done? If it's being done over a network, that worker should yield back to the event loop while it's waiting for the network I/O to complete. If it's being done via a database on the local machine, and the communication with that database is not being done by something like Unix sockets, but by direct calls into a database library, then that's obviously going to cause latency problems because the worker can't yield during the database call. The obvious way to fix that is to have the local database server exposed via socket instead of direct library calls.

leafboi

>whereas the async frameworks are all pure Python.

No, they're not pure Python. It's a combination. The underlying event loop uses libuv, the C library that forms the core of Node.js. The "Uvicorn" label is an indicator of this, as Uvicorn uses uvloop, an event loop built on libuv.

Overall, the benchmark is testing a bit of both. The event loop runs in C, but it has to execute some Python code when handling each request.

>If it's being done via a database on the local machine, and the communication with that database is not being done by something like Unix sockets, but by direct calls into a database library, then that's obviously going to cause latency problems because the worker can't yield during the database call.

I am almost positive it is being done with some form of non-blocking sockets. The only other way to do this without sockets is to write to a file and read back from it.

There are no "direct library calls", as the database server exists as a separate process from the server process. Here's what occurs:

  1. Server makes a socket connection to the database.
  2. Server sends a request to the database.
  3. Database receives the request and reads from the database file.
  4. Database sends the information back to the server.
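
The four steps above can be sketched with asyncio streams. This is only an illustration: `fake_database` is a stand-in running in the same process, the line-based "protocol" is invented, and none of it is the article's actual setup.

```python
import asyncio


async def fake_database(reader, writer):
    # Step 3: the "database" receives the request and looks up the answer.
    query = (await reader.readline()).decode().strip()
    await asyncio.sleep(0.05)  # pretend to read from the database file
    # Step 4: the "database" sends the information back.
    writer.write(f"result for {query}\n".encode())
    await writer.drain()
    writer.close()


async def run_query(port, query):
    # Step 1: the server opens a socket connection to the database.
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    # Step 2: the server sends its request.
    writer.write(f"{query}\n".encode())
    await writer.drain()
    # While this await is pending, the event loop is free to run other workers.
    reply = (await reader.readline()).decode().strip()
    writer.close()
    await writer.wait_closed()
    return reply


async def main():
    # Port 0 asks the OS for any free port.
    server = await asyncio.start_server(fake_database, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        # Three queries in flight at once on a single thread.
        return await asyncio.gather(*(run_query(port, f"q{i}") for i in range(3)))


def demo():
    return asyncio.run(main())
```

The point is that each `await` in `run_query` is a place where the worker yields. A client library that used blocking sockets at those same points would stall every other request on the loop.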

Any library call you're thinking of is probably a call into a "client-side" library, meaning that the library itself makes a socket connection to the SQL server.

pdonis

> I am almost positive it is being done with some form of non blocking sockets.

Database libraries in Python that support this (as opposed to blocking, synchronous sockets, which are of course common) are pretty thin on the ground. That's why I would have liked to see more details in the article about exactly how the benchmark was doing the database queries.

> There is no "direct library calls" as the database server exists as a separate process to the server process.

Yes, you're right, I wasn't being very clear. The key question is, as above, whether nonblocking sockets are being used or not.
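
For anyone unsure what the blocking/nonblocking distinction looks like at the socket level, here is a small standard-library sketch. The function names and the `b"row data"` payload are made up, and a local socket pair stands in for the server-to-database connection.

```python
import select
import socket


def try_recv(sock):
    # On a nonblocking socket, recv() raises BlockingIOError when no data
    # is available yet, instead of putting the caller to sleep.
    try:
        return sock.recv(1024)
    except BlockingIOError:
        return None


def nonblocking_demo():
    server_side, db_side = socket.socketpair()
    try:
        server_side.setblocking(False)
        early = try_recv(server_side)  # nothing sent yet, returns at once
        db_side.sendall(b"row data")
        # An event loop does roughly this: wait until the socket is readable.
        select.select([server_side], [], [], 1.0)
        late = try_recv(server_side)
        return early, late
    finally:
        server_side.close()
        db_side.close()
```

With a blocking socket, the first `recv()` would simply hang until data arrived, which is exactly what an async worker cannot afford to do.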

zzzeek

The psycopg2 driver for PostgreSQL supports an async mode that uses PostgreSQL's full-blown non-blocking API. This is what I used when I did my tests, and it might be what was used here. There is also the asyncpg driver, which is natively non-blocking. PG is the one database that does lend itself to async, because it has a fully non-blocking client library available.

https://www.postgresql.org/docs/12/libpq-async.html

https://www.psycopg.org/docs/advanced.html#green-support

https://github.com/MagicStack/asyncpg