What does HackerNews think of twemproxy?
A fast, light-weight proxy for memcached and redis
- Be asynchronous and based on Boost.Asio.
- Perform automatic command pipelining [1]. All C++ libraries I looked at at the time would open new connections for that, which results in unacceptable performance losses.
- Parse Redis responses directly in their final data structures avoiding extra copies. This is useful for example to store json strings in Redis and read them back efficiently.
With time I built more performance features:
- Full duplex communication.
- Support for RESP3 and server pushes on the same connection that is being used for request/response.
- Event demultiplexing: It can serve thousands of requests (e.g. websocket sessions) on a single connection to Redis with back pressure. This is important to keep the number of connections to Redis low and avoid latency introduced by countermeasures like [3].
This client was proposed and accepted in Boost (but not yet integrated). Interested readers can find links to the review here [2].
[1] https://redis.io/docs/manual/pipelining/
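To make the pipelining point concrete, here is a minimal sketch of what it amounts to on the wire, assuming the standard RESP request encoding (an array of bulk strings). Several commands are concatenated into one buffer and written to a single connection in one go, rather than opening a connection per command. The helper names `encode_command` and `encode_pipeline` are hypothetical and are not taken from the client described above.

```python
def encode_command(*args: str) -> bytes:
    """Encode one Redis command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        # Each argument is a bulk string: "$<length>\r\n<bytes>\r\n".
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def encode_pipeline(commands) -> bytes:
    """Concatenate several encoded commands into one write buffer."""
    return b"".join(encode_command(*cmd) for cmd in commands)


# Two commands, one buffer: this could be sent with a single socket write,
# and the replies would then be read back in the same order.
payload = encode_pipeline([("SET", "key", "value"), ("GET", "key")])
```

A real client would also have to match each reply to its request, which works because Redis answers pipelined commands in order on the same connection.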
[1] Heron, a realtime, distributed, fault-tolerant stream processing engine - https://github.com/twitter/heron
[2] Finagle, a fault tolerant, protocol-agnostic RPC system - https://github.com/twitter/finagle/
[3] FlockDB, a distributed, fault-tolerant graph database - https://github.com/twitter/flockdb
[4] Gizzard, a flexible sharding framework for creating eventually-consistent distributed datastores - https://github.com/twitter/gizzard
[5] Twemcache, a Twitter fork of Memcached - https://github.com/twitter/twemcache
[6] Twemproxy, a fast, light-weight proxy for memcached and redis - https://github.com/twitter/twemproxy
[1] Fabric, an SDK for mobile apps - https://docs.fabric.io/android/fabric/overview.html
[2] Heron, a realtime, distributed, fault-tolerant stream processing engine - https://github.com/twitter/heron
[3] Finagle, a fault tolerant, protocol-agnostic RPC system - https://github.com/twitter/finagle/
[4] FlockDB, a distributed, fault-tolerant graph database - https://github.com/twitter/flockdb
[5] Ruby implementation of the ICU (International Components for Unicode) - https://github.com/twitter/twitter-cldr-rb
[6] Clockwork Raven, Human-Powered Data Analysis with Mechanical Turk - https://github.com/twitter/clockworkraven
[7] Gizzard, a flexible sharding framework for creating eventually-consistent distributed datastores - https://github.com/twitter/gizzard
[8] Twemcache, a Twitter fork of Memcached - https://github.com/twitter/twemcache
[9] Twemproxy, a fast, light-weight proxy for memcached and redis - https://github.com/twitter/twemproxy
[10] Iago, a webapp load tester - https://github.com/twitter/iago
[11] Ospriet, a bestof/voting app - https://github.com/twitter/ospriet
In Twitter's case, we've also contributed to the Redis ecosystem via twemproxy: https://github.com/twitter/twemproxy
Twemproxy helps scale some of the traffic for the top websites in the world: https://github.com/twitter/twemproxy#users
EDIT: Looks like instead of using distinct ports to delineate separate clusters like twemproxy, it uses key prefix routing. It also supports "replicated pools", and a few other fancy/neat things. Interesting!
The key to scaling and maintaining HA with redis is using clustering, either built into the app or through something like nutcracker (https://github.com/twitter/twemproxy) and making sure you properly balance.
The performance of redis makes it very worthwhile to deal with the persistence/HA issues.
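As a rough illustration of what "properly balance" can mean when sharding keys across several Redis servers, here is a minimal consistent-hashing sketch. It is an assumption-laden toy, not twemproxy's implementation: the `HashRing` class, the MD5-based hash, and the 160 points per server are all illustrative choices.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring mapping keys to server names."""

    def __init__(self, servers, points_per_server=160):
        # Place several points per server on the ring to even out the load.
        ring = []
        for server in servers:
            for i in range(points_per_server):
                ring.append((self._hash(f"{server}-{i}"), server))
        ring.sort()
        self._ring = ring
        self._points = [point for point, _ in ring]

    @staticmethod
    def _hash(value: str) -> int:
        # First 4 bytes of the MD5 digest as an unsigned integer.
        digest = hashlib.md5(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big")

    def server_for(self, key: str) -> str:
        # Pick the first point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._points, self._hash(key)) % len(self._points)
        return self._ring[idx][1]


ring = HashRing(["redis-a:6379", "redis-b:6379", "redis-c:6379"])
owner = ring.server_for("user:42")
```

The useful property is that removing one server only remaps the keys that lived on it; keys owned by the remaining servers stay where they were.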
We ($dayjob) have been using it in production and it has been _solid_. twemproxy is quality engineering.
As a sidenote, look at the amount of complexity and redirect shenanigans in the "github repository link" contained in the article:
http://links.services.disqus.com/api/click?
format=go&
key=cfdfcf52dffd0a702a61bad27507376d&
loc=http%3A%2F%2Fantirez.com%2Fnews%2F44&
subId=804356&
v=1&
libid=1354646989332&
out=https%3A%2F%2Fgithub.com%2Ftwitter%2Ftwemproxy&
ref=http%3A%2F%2Fnews.ycombinator.com%2Fnews&
title=Twemproxy%2C%20a%20Redis%20proxy%20from%20Twitter%20-%20Antirez%20weblog&
txt=https%3A%2F%2Fgithub.com%2Ftwitter%2Ftwemproxy&
jsonp=vglnk_jsonp_13546470034491
HOLY MOLY!
At the time we adopted memcached, that's the version we went with and made sure it worked well in our production environment as we scaled as a company. We also open sourced twemproxy [https://github.com/twitter/twemproxy], which is a lightweight proxy for memcached that has worked well for us in combination with twemcache and may work well for others too.
We just want to reiterate that twemcache has worked well for our unique environment and any teams evaluating memcached should try all their options, just like any other piece of software you adopt in your stack.
One of the reasons for open sourcing our work was to share our ideas with the memcached community to see what worked well for us and help everyone. For example, this is also how we treat our work with our MySQL fork [https://github.com/twitter/mysql], which we maintain in the open and have signed an OCA with Oracle to help get work pushed upstream so everyone benefits in the long run.