What does HackerNews think of awesome-distributed-systems?

A curated list to learn about distributed systems

Distributed systems is a very broad research topic nowadays, so it might make sense to check conferences covering a narrower topic, e.g. cloud computing. Google Scholar might give some useful results for more specific search terms.

The reading lists for university courses on distributed systems might also be interesting, as might this "awesome" link list on GitHub:

https://github.com/theanalyst/awesome-distributed-systems

From "Ask HN: Learning about distributed systems?" https://news.ycombinator.com/item?id=23932271 :

> Papers-we-love > Distributed Systems: https://github.com/papers-we-love/papers-we-love/tree/master...

> awesome-distributed-systems also has many links to theory: https://github.com/theanalyst/awesome-distributed-systems

And links to more lists of distributed systems papers under "Meta Lists": https://github.com/theanalyst/awesome-distributed-systems#me...

In reviewing this awesome list, today I learned about this playlist: "MIT 6.824 Distributed Systems (Spring 2020)" https://youtube.com/playlist?list=PLrw6a1wE39_tb2fErI4-WkMbs...

> awesome-bigdata lists a number of tools: https://github.com/onurakpolat/awesome-bigdata

From a previous question re: "Ask HN: CS papers for software architecture and design?" (https://news.ycombinator.com/item?id=15778396 ) and distributed systems we eventually realize were needed in the first place:

> Bulk Synchronous Parallel: https://en.wikipedia.org/wiki/Bulk_synchronous_parallel .

Many/most (?) distributed systems can be described in terms of BSP primitives.
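To make "BSP primitives" a bit more concrete, here is a minimal sketch, assuming threads stand in for nodes and in-memory lists stand in for the network. The names `bsp_total`, `inboxes` and `worker` are made up for illustration and do not come from any library. Each superstep has the three BSP phases: local computation, communication, and a barrier.

```python
import threading


def bsp_total(chunks):
    """Every worker ends up with the global sum, using two BSP supersteps."""
    n = len(chunks)
    barrier = threading.Barrier(n)
    inboxes = [[] for _ in range(n)]            # inboxes[i]: messages sent to worker i
    locks = [threading.Lock() for _ in range(n)]
    results = [None] * n

    def worker(rank):
        # Superstep 1, local computation: reduce this worker's own chunk.
        partial = sum(chunks[rank])
        # Superstep 1, communication: send the partial sum to every worker.
        for dest in range(n):
            with locks[dest]:
                inboxes[dest].append(partial)
        # Barrier: nobody proceeds until all messages have been delivered.
        barrier.wait()
        # Superstep 2, local computation: combine the received messages.
        results[rank] = sum(inboxes[rank])

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(bsp_total([[1, 2], [3, 4], [5]]))
```

The barrier is what distinguishes this from free-form message passing: a worker only reads its inbox after every peer has finished sending, so there is no need to reason about partially delivered messages within a superstep.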

> Paxos: https://en.wikipedia.org/wiki/Paxos_(computer_science) .

> Raft: https://en.wikipedia.org/wiki/Raft_(computer_science) #Safety

> CAP theorem: https://en.wikipedia.org/wiki/CAP_theorem .

Papers-we-love > Distributed Systems: https://github.com/papers-we-love/papers-we-love/tree/master...

awesome-distributed-systems also has many links to theory: https://github.com/theanalyst/awesome-distributed-systems

- Byzantine fault: https://en.wikipedia.org/wiki/Byzantine_fault :

> A [Byzantine fault] is a condition of a computer system, particularly distributed computing systems, where components may fail and there is imperfect information on whether a component has failed. The term takes its name from an allegory, the "Byzantine Generals Problem", developed to describe a situation in which, in order to avoid catastrophic failure of the system, the system's actors must agree on a concerted strategy, but some of these actors are unreliable.
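The quote gives no numbers, so as a hedged aside: the classic result from the Byzantine Generals paper is that, with unsigned messages, agreement among n actors is only possible if fewer than a third are faulty, i.e. n >= 3f + 1. That bound is brought in here as background and is not stated in the thread. The helper names below are hypothetical.

```python
def max_byzantine_faults(n: int) -> int:
    """Largest f such that n >= 3f + 1 (assumes unsigned, 'oral' messages)."""
    if n < 1:
        raise ValueError("need at least one node")
    return (n - 1) // 3


def min_nodes_for(f: int) -> int:
    """Smallest cluster size that tolerates f Byzantine nodes under the same bound."""
    if f < 0:
        raise ValueError("fault count cannot be negative")
    return 3 * f + 1


if __name__ == "__main__":
    for n in (3, 4, 7, 10):
        print(n, "nodes tolerate", max_byzantine_faults(n), "Byzantine fault(s)")
```

Note that three nodes tolerate zero Byzantine faults under this bound, which is why four is the usual minimum for such a cluster.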

awesome-bigdata lists a number of tools: https://github.com/onurakpolat/awesome-bigdata

Practically, dask.distributed (joblib -> SLURM), Dask-ML, dask-labextension (a JupyterLab extension for Dask), and the Rapids.ai tools (e.g. cuDF) scale from one to many nodes.

Raft has an interactive visualisation that helps you understand it better: https://raft.github.io/

It might also be worth checking out https://github.com/theanalyst/awesome-distributed-systems, though I'm not sure whether any of those resources are interactive.