I use recursive queries in SQL to search persistent suffix arrays (mainly when there is a LOT of data). You can do string search across hundreds of gigabytes of data in ~0.004s with SQLite on a cheap SSD and pretty much no RAM. My goal is to be able to search and align against all the genetic information we have so far (roughly 9 trillion base pairs in total).
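For concreteness, here is a minimal sketch of the kind of recursive-CTE binary search I mean, on a toy schema I am making up for illustration: `doc(body)` holds the text and `sa(rank, pos)` is the precomputed suffix array (`rank` = sort order, `pos` = 0-based suffix start). A real deployment chunks the text across rows, since a single SQLite TEXT value caps out around 1 GB.

```sql
-- Hypothetical schema: doc(body) = the text, sa(rank, pos) = suffix array.
-- Binary-search the suffix array for 'GATTACA' with a recursive CTE:
-- O(log N) iterations, each doing one point lookup in sa.
WITH RECURSIVE search(lo, hi) AS (
  SELECT 0, (SELECT count(*) FROM sa)      -- invariant: answer lies in [lo, hi]
  UNION ALL
  SELECT
    CASE WHEN (SELECT substr(body, (SELECT pos FROM sa WHERE rank = (lo+hi)/2) + 1,
                             length('GATTACA')) FROM doc) < 'GATTACA'
         THEN (lo+hi)/2 + 1 ELSE lo END,
    CASE WHEN (SELECT substr(body, (SELECT pos FROM sa WHERE rank = (lo+hi)/2) + 1,
                             length('GATTACA')) FROM doc) < 'GATTACA'
         THEN hi ELSE (lo+hi)/2 END
  FROM search
  WHERE lo < hi
)
-- lo = hi is the rank of the first suffix >= the pattern; you would still
-- verify that the suffix there actually starts with the pattern.
SELECT (SELECT pos FROM sa WHERE rank = lo) AS first_match_pos
FROM search
WHERE lo = hi;
```

A second search for the upper bound gives the full range of matching suffixes; each suffix's `pos` is a match position in the text.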
If anyone knows how to increase INSERT speeds on Postgres, I would appreciate it. I can get about 100K rows per second with COPY or by batching the inserts in a single transaction. I've tried mirrored NVMe drives and more RAM with ZFS, as well as dropping some unimportant indexes, but I'm still a little limited. Any thoughts?
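For reference, a minimal version of the load path described above (table and file names are placeholders):

```sql
-- Batched bulk load: one transaction, multiple COPYs.
BEGIN;
COPY reads (id, seq) FROM '/data/batch_0001.csv' WITH (FORMAT csv);
COPY reads (id, seq) FROM '/data/batch_0002.csv' WITH (FORMAT csv);
COMMIT;
```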
Steps:
- use ext2 or even tmpfs as the filesystem instead of ZFS; that drops both journaling and ZFS's copy-on-write overhead; mount options async,nobarrier
- set tables to UNLOGGED so Postgres skips WAL writes for them (sketch after this list)
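A sketch of the UNLOGGED step, with made-up table names. Note that UNLOGGED skips WAL only for that table, and its contents are truncated after a crash:

```sql
-- Unlogged staging table: no WAL, but contents are lost on unclean shutdown.
CREATE UNLOGGED TABLE reads_staging (LIKE reads INCLUDING DEFAULTS);
COPY reads_staging FROM '/data/batch_0001.csv' WITH (FORMAT csv);

-- Optionally make it durable afterwards (this rewrites the table through the WAL):
ALTER TABLE reads_staging SET LOGGED;
```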
This got things fast enough for my last use case. Some further ideas, since at some point you become CPU/memory bound:
- you could try sharding by partitioning the table (partitioning sketch below)
- another trick is to use SQLite as a backing store for inserts (it is quite fast if you turn off all logging) and then query it from Postgres via an FDW such as https://github.com/pgspider/sqlite_fdw (FDW sketch below)
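For the partitioning idea, a minimal hash-partitioned layout (names made up); each partition can then be loaded from its own connection so the COPY work parallelizes across cores:

```sql
-- Declarative hash partitioning (Postgres 11+).
CREATE TABLE reads (id bigint, seq text) PARTITION BY HASH (id);
CREATE TABLE reads_p0 PARTITION OF reads FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE reads_p1 PARTITION OF reads FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE reads_p2 PARTITION OF reads FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE reads_p3 PARTITION OF reads FOR VALUES WITH (MODULUS 4, REMAINDER 3);

-- One loader connection per partition, e.g.:
COPY reads_p0 FROM '/data/shard_0.csv' WITH (FORMAT csv);
```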
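And the SQLite-behind-an-FDW setup looks roughly like this (paths and names illustrative; syntax follows the sqlite_fdw README):

```sql
CREATE EXTENSION sqlite_fdw;
CREATE SERVER sqlite_ingest FOREIGN DATA WRAPPER sqlite_fdw
  OPTIONS (database '/data/ingest.db');   -- the SQLite file that takes the inserts
CREATE FOREIGN TABLE reads_sqlite (id bigint, seq text)
  SERVER sqlite_ingest OPTIONS (table 'reads');

-- Query (or join) the SQLite-held data from Postgres:
SELECT count(*) FROM reads_sqlite;
```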