What does HackerNews think of cugraph?

cuGraph - RAPIDS Graph Analytics Library

Language: Cuda

> When applied to sparse adjacency matrices, these algebraic operations are equivalent to computations on graphs
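
To make the quoted equivalence concrete, here is a minimal pure-Python sketch (not cuGraph code, and not how any library implements it). It assumes a hypothetical 4-vertex graph stored as a sparse adjacency structure, and treats one BFS hop as a vector-matrix product over the boolean (OR, AND) semiring. The names `A`, `frontier_step`, and `bfs_levels` are made up for illustration.

```python
# Sparse adjacency "matrix" stored as {row: {col: weight}}; absent entries are zero.
# An edge u -> v is stored as A[u][v]. The graph itself is a made-up example.
A = {
    0: {1: 1, 2: 1},
    1: {3: 1},
    2: {3: 1},
    3: {},
}


def frontier_step(adj, frontier):
    """One vector-matrix product over the boolean (OR, AND) semiring.

    `frontier` is a sparse boolean vector, stored as the set of its nonzero
    indices. The result is the set of vertices one hop away from the frontier.
    """
    out = set()
    for u in frontier:
        # Only the stored (nonzero) entries of row u contribute to the product.
        out.update(adj.get(u, {}))
    return out


def bfs_levels(adj, source):
    """Repeat the product, masking out visited vertices, to get BFS levels."""
    levels = {source: 0}
    frontier = {source}
    depth = 0
    while frontier:
        depth += 1
        # difference(levels) drops vertices that already have a level assigned.
        frontier = frontier_step(adj, frontier).difference(levels)
        for v in frontier:
            levels[v] = depth
    return levels


levels = bfs_levels(A, 0)
print(levels)
```

Because only stored entries are visited, the work per step is proportional to the number of edges leaving the frontier rather than to the full matrix size, which is the reason sparsity matters for graph workloads.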

Sparse matrix: https://en.wikipedia.org/wiki/Sparse_matrix :

> The concept of sparsity is useful in combinatorics and application areas such as network theory and numerical analysis, which typically have a low density of significant data or connections. Large sparse matrices often appear in scientific or engineering applications when solving partial differential equations.

cuGraph has a NetworkX-like API, though so far only a subset of the NetworkX algorithms has been reimplemented, some of them with CUDA-specific optimizations.

From https://github.com/rapidsai/cugraph :

> cuGraph operates, at the Python layer, on GPU DataFrames, thereby allowing for seamless passing of data between ETL tasks in cuDF and machine learning tasks in cuML. Data scientists familiar with Python will quickly pick up how cuGraph integrates with the Pandas-like API of cuDF. Likewise, users familiar with NetworkX will quickly recognize the NetworkX-like API provided in cuGraph, with the goal to allow existing code to be ported with minimal effort into RAPIDS.

> While the high-level cugraph python API provides an easy-to-use and familiar interface for data scientists that's consistent with other RAPIDS libraries in their workflow, some use cases require access to lower-level graph theory concepts. For these users, we provide an additional Python API called pylibcugraph, intended for applications that require a tighter integration with cuGraph at the Python layer with fewer dependencies. Users familiar with C/C++/CUDA and graph structures can access libcugraph and libcugraph_c for low level integration outside of python.

/? sparse https://github.com/rapidsai/cugraph/search?q=sparse

pandas and SciPy have sparse data structures (NumPy itself does not ship a sparse array type): in pandas, arrays.SparseArray and the .sparse accessor; in SciPy, scipy.sparse; https://pandas.pydata.org/docs/user_guide/sparse.html#sparse...

From https://pandas.pydata.org/docs/user_guide/sparse.html#intera... :

> Series.sparse.to_coo() is implemented for transforming a Series with sparse values indexed by a MultiIndex to a scipy.sparse.coo_matrix.
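
As a rough sketch of what that conversion looks like for graph data, the snippet below builds a hypothetical weighted edge list as a Series with a two-level MultiIndex and converts it to a scipy.sparse.coo_matrix. The edge values and the level names `src` and `dst` are assumptions for illustration, and SciPy must be installed for to_coo() to work.

```python
import pandas as pd

# Hypothetical weighted edge list: (source, destination) -> weight.
edges = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=pd.MultiIndex.from_tuples(
        [(0, 1), (0, 2), (1, 2), (2, 3)], names=["src", "dst"]
    ),
)

# to_coo() lives on the .sparse accessor, so the Series needs a sparse dtype.
sparse_edges = edges.astype("Sparse[float64]")

# Map the "src" level to matrix rows and the "dst" level to matrix columns.
# Returns the COO matrix plus the row and column labels it was built from.
A, rows, cols = sparse_edges.sparse.to_coo(
    row_levels=["src"], column_levels=["dst"], sort_labels=True
)

print(A.shape, A.nnz)
print(A.toarray())
```

Note that the rows and columns cover only the labels that actually occur in each index level, so the result is not automatically a square adjacency matrix over all vertices; the returned labels are what tie matrix positions back to vertex ids.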

NetworkX graph algorithms reference docs https://networkx.org/documentation/stable/reference/algorith...

NetworkX Compatibility > Differences in Algorithms https://docs.rapids.ai/api/cugraph/stable/basics/nx_transiti...

List of algorithms > Combinatorial algorithms > Graph algorithms: https://en.wikipedia.org/wiki/List_of_algorithms#Graph_algor...

CuPy's sparse matrix support is still more limited than its dense functionality, but it's expanding quickly, in version 8.0 in particular: https://docs.cupy.dev/en/stable/reference/sparse.html

The C++/CUDA backend to cuGraph contains many low-level graph operations on really sparse graph structures as well: https://github.com/rapidsai/cugraph

RAPIDS has picked up Dask for the multi-GPU aspects of cuDF (think Spark/pandas on GPUs). As cuGraph is single-GPU (https://github.com/rapidsai/cugraph) and already fast on ~billion-row datasets, I'm guessing Dask + cuGraph is what will deliver the next 100-1000X, if it isn't happening already.

Graph partitioning is a weird world, so will be interesting to see!