What does HackerNews think of awesome-distributed-systems?
A curated list to learn about distributed systems
The reading lists for university distributed systems courses might also be interesting, as might this "awesome" link list on GitHub:
> Papers-we-love > Distributed Systems: https://github.com/papers-we-love/papers-we-love/tree/master...
> awesome-distributed-systems also has many links to theory: https://github.com/theanalyst/awesome-distributed-systems
And links to more lists of distributed systems papers under "Meta Lists": https://github.com/theanalyst/awesome-distributed-systems#me...
In reviewing this awesome list, today I learned about this playlist: "MIT 6.824 Distributed Systems (Spring 2020)" https://youtube.com/playlist?list=PLrw6a1wE39_tb2fErI4-WkMbs...
> awesome-bigdata lists a number of tools: https://github.com/onurakpolat/awesome-bigdata
> Bulk Synchronous Parallel: https://en.wikipedia.org/wiki/Bulk_synchronous_parallel .
Many/most (?) distributed systems can be described in terms of BSP primitives.
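To make the BSP idea concrete, here is a minimal sketch of one computation split into supersteps (local compute, communication, barrier). It uses threads in a single process as stand-ins for nodes, and the names `bsp_sum`, `inbox`, and `worker` are made up for illustration; a real BSP system would pass messages over a network.

```python
import threading


def bsp_sum(values, num_workers=4):
    """Sum `values` with a BSP-style schedule (illustrative sketch only)."""
    # Partition the input so each worker owns a disjoint chunk.
    chunks = [values[i::num_workers] for i in range(num_workers)]
    inbox = [0] * num_workers  # one message slot per worker (hypothetical mailbox)
    barrier = threading.Barrier(num_workers)
    result = []

    def worker(rank):
        # Superstep 1, local computation: work only on local data.
        partial = sum(chunks[rank])
        # Superstep 1, communication: "send" the partial result to worker 0.
        inbox[rank] = partial
        # Barrier synchronization: nobody continues until all sends are done.
        barrier.wait()
        # Superstep 2: worker 0 combines the messages it received.
        if rank == 0:
            result.append(sum(inbox))

    threads = [threading.Thread(target=worker, args=(r,)) for r in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result[0]
```

The barrier is the key primitive here: it separates the supersteps, so worker 0 never reads a slot that has not been written yet.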
> Paxos: https://en.wikipedia.org/wiki/Paxos_(computer_science) .
> Raft: https://en.wikipedia.org/wiki/Raft_(computer_science)#Safety
> CAP theorem: https://en.wikipedia.org/wiki/CAP_theorem .
- Byzantine fault: https://en.wikipedia.org/wiki/Byzantine_fault :
> A [Byzantine fault] is a condition of a computer system, particularly distributed computing systems, where components may fail and there is imperfect information on whether a component has failed. The term takes its name from an allegory, the "Byzantine Generals Problem", developed to describe a situation in which, in order to avoid catastrophic failure of the system, the system's actors must agree on a concerted strategy, but some of these actors are unreliable.
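As a rough illustration of the "imperfect information" part, here is a toy sketch of one narrow case: a traitorous commander sends conflicting orders to loyal lieutenants. The function names and the tie-breaking default are assumptions made for this example. It does not handle lying lieutenants, which is what full Byzantine agreement protocols have to deal with.

```python
from collections import Counter


def majority(values, default="retreat"):
    """Return the most common value, falling back to `default` on a tie."""
    top = Counter(values).most_common(2)
    if len(top) == 2 and top[0][1] == top[1][1]:
        return default
    return top[0][0]


def obey_directly(orders):
    """Each lieutenant acts on the order it personally received."""
    return dict(orders)


def exchange_then_vote(orders):
    """Loyal lieutenants relay what they received, then each takes the majority.

    Assumes every lieutenant is honest, so all of them see the same set of
    relayed orders and therefore reach the same decision.
    """
    seen = list(orders.values())
    return {lieutenant: majority(seen) for lieutenant in orders}


# A traitorous commander tells L2 something different from L1 and L3.
orders = {"L1": "attack", "L2": "retreat", "L3": "attack"}
print(obey_directly(orders))       # lieutenants disagree
print(exchange_then_vote(orders))  # lieutenants agree after one exchange round
```

Acting on the direct order splits the lieutenants, while one round of exchanging and voting brings the loyal ones to a common decision in this simplified setting.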
awesome-bigdata lists a number of tools: https://github.com/onurakpolat/awesome-bigdata
Practically, dask.distributed (joblib -> SLURM), Dask-ML, dask-labextension (a JupyterLab extension for Dask), and the Rapids.ai tools (e.g. cuDF) scale from one to many nodes.
It might also be worth checking out https://github.com/theanalyst/awesome-distributed-systems (not sure if any of these resources are interactive, though).
- https://github.com/mxssl/sre-interview-prep-guide
- https://github.com/theanalyst/awesome-distributed-systems
One linked resource for example: