What does HackerNews think of libcluster?
Automatic cluster formation/healing for Elixir applications
libcluster[0] has a bunch of strategies to form clusters. It seems that ECS supports service discovery through DNS, so the DNSPoll[1] strategy should work.
[0] https://github.com/bitwalker/libcluster
[1] https://hexdocs.pm/libcluster/Cluster.Strategy.DNSPoll.html
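For anyone wondering what the DNSPoll setup looks like in practice, here is a minimal config sketch. The topology name `ecs_dns`, the DNS name `myapp.local`, and the basename `myapp` are placeholders I made up, not values from the comment above. It also assumes each node is started with a name of the form `myapp@<ip>`, since DNSPoll builds node names from the basename plus the IPs returned by the query.

```elixir
# config/runtime.exs (sketch; all names below are hypothetical placeholders)
import Config

config :libcluster,
  topologies: [
    ecs_dns: [
      strategy: Cluster.Strategy.DNSPoll,
      config: [
        # How often to re-resolve the DNS name, in milliseconds
        polling_interval: 5_000,
        # DNS name that resolves to the IPs of all running instances
        query: "myapp.local",
        # Nodes are expected to be named myapp@<ip>
        node_basename: "myapp"
      ]
    ]
  ]
```

The topologies are then passed to `Cluster.Supervisor` in the application's supervision tree, e.g. `{Cluster.Supervisor, [topologies, [name: MyApp.ClusterSupervisor]]}`, where `topologies` is read with `Application.get_env(:libcluster, :topologies)`.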
I personally use mix release[2], which assembles a tarball of the application (together with the BEAM), then rsync that to Hetzner, restart the remote process, and voilà. I like simplicity.
As for clustering nodes, the easiest approach would be to use a library[3] to automate forming the cluster.
[0] https://www.gigalixir.com/
[1] https://fly.io/docs/elixir/getting-started/
[2] https://elixir-lang.org/getting-started/mix-otp/config-and-r...
When a stream starts, I start a supervisor that in turn starts a GenServer to manage the port. On init, a port is opened for FFmpeg (using the above bash wrapper) with args that make it send 16-bit PCM audio back through the port, where it arrives in the `handle_info/2` callback.
When a new live HLS segment is downloaded by FFmpeg, the entire segment's audio is sent to the GenServer all at once (it could be a few `handle_info/2` calls, but it happens quickly). Since I want to work in small fixed chunks, I send the segment's audio to an AudioBuffer GenServer (started as a sibling under the same supervisor). This buffer uses binary pattern matching to split the audio into chunks exactly 2 seconds long, keeping any remainder in the GenServer's state for the next buffer event. I then send the chunks to another ChunkBuffer GenServer that pops chunks at 2-second intervals for processing.
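The binary pattern matching step could look roughly like this. It is a sketch, not the actual AudioBuffer code: the module name `AudioChunker`, the 16 kHz sample rate, and mono audio are my assumptions (only 16-bit PCM and the 2-second chunk length come from the description above). In a GenServer, the returned remainder would be kept in state and prepended to the next incoming binary.

```elixir
defmodule AudioChunker do
  # Assumed format: 16-bit (2-byte) mono PCM at 16 kHz. Sample rate and
  # channel count are hypothetical; adjust to the real FFmpeg args.
  @sample_rate 16_000
  @bytes_per_sample 2
  @chunk_seconds 2
  @chunk_bytes @sample_rate * @bytes_per_sample * @chunk_seconds

  # Size in bytes of one 2-second chunk under the assumptions above.
  def chunk_bytes, do: @chunk_bytes

  # Splits a binary into full-size chunks and returns {chunks, remainder}.
  def split(buffer, acc \\ [])

  # A full chunk is available at the head of the binary: peel it off.
  def split(<<chunk::binary-size(@chunk_bytes), rest::binary>>, acc) do
    split(rest, [chunk | acc])
  end

  # Less than one chunk left: return the chunks in order plus the remainder.
  def split(remainder, acc) when is_binary(remainder) do
    {Enum.reverse(acc), remainder}
  end
end
```

Usage would be along the lines of `{chunks, rest} = AudioChunker.split(state.remainder <> new_audio)`, with `rest` stored back in the state.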
Since everything is supervised, if (when...) FFmpeg crashes, the supervisor just restarts it. Meanwhile, the audio already in the buffer keeps being processed and nothing goes down. There might be a duplicate word or two in the transcription if the restarted port processes a segment again, but everything keeps running smoothly.
For even more reliability, I have the application running clustered across four locations in the US, EMEA, and APAC using libcluster[^2]. The stream supervisor is started under a Horde.DynamicSupervisor[^3] with a custom distribution strategy. The strategy prefers the region closest to the company HQ, but if it goes down, the processes will be restarted in another region.
[^1]: https://hexdocs.pm/elixir/1.13.4/Port.html#module-zombie-ope...
Hot code updates aren't really worth it for most applications in my opinion, assuming you do something like blue/green rollover deployments. It's cool that it's possible, though. It does require appup files, and afaik Distillery is one of the release tools that has built-in support for it.
Don't know the specifics in Erlang, but in Elixir you can just use Node.connect/1 and Node.set_cookie/2: https://hexdocs.pm/elixir/Node.html
Edit: There's also stuff like libcluster (https://github.com/bitwalker/libcluster) that allows for this at a higher level afaik.
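A small sketch of the manual approach with `Node.set_cookie/2` and `Node.connect/1` mentioned above. The `ManualCluster` module and its return values are hypothetical, just a wrapper for illustration. It assumes the local VM was started as a distributed node (for example with `--name` or `--sname`); otherwise it returns an error instead of trying to connect.

```elixir
defmodule ManualCluster do
  # Hypothetical helper, not part of Elixir or libcluster.
  # Sets the cookie to use for `node`, then tries to connect to it.
  def join(node, cookie) when is_atom(node) and is_atom(cookie) do
    if Node.alive?() do
      # Use this cookie when talking to the remote node
      Node.set_cookie(node, cookie)

      case Node.connect(node) do
        true -> :ok
        false -> {:error, :unreachable}
        # Node.connect/1 returns :ignored when the local node is not alive
        :ignored -> {:error, :not_distributed}
      end
    else
      {:error, :not_distributed}
    end
  end
end
```

Called as `ManualCluster.join(:"b@127.0.0.1", :secret)`, where the node name and cookie are placeholders. libcluster essentially automates the discovery part and calls the connect step for you.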