Non-von Neumann architectures have a lot of catching up to do on the software side. Nearly every mainstream programming language has "get one thing from memory, compute on it, put it back" semantics.

Plenty don't, though: functional languages and relational databases, for example. They're common enough these days.

Eh. Most functional languages actually serialize operations pretty linearly in their default evaluation model, which is why they have explicit constructs for parallelism and concurrency.
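
To make that concrete, here's a rough sketch in Python. It isn't a functional language, so take it as an analogy rather than a claim about any particular compiler. The names `square`, `sequential_map`, and `parallel_map` are made up for illustration. A plain `map` walks the input one element at a time, and you only get concurrent evaluation by asking for it explicitly, here through the standard library's `concurrent.futures`.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    # A pure function: no shared state, so calls are independent of each other.
    return x * x


def sequential_map(xs):
    # Default evaluation: each element is processed one after another, in order,
    # even though nothing in `square` requires that ordering.
    return list(map(square, xs))


def parallel_map(xs, workers=4):
    # Concurrency has to be requested explicitly via an executor.
    # Assumption: threads are used only to show the explicit construct; under
    # CPython's GIL they will not speed up CPU-bound work like this.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, so the output matches
        # sequential_map regardless of which worker finishes first.
        return list(pool.map(square, xs))


if __name__ == "__main__":
    data = list(range(10))
    assert sequential_map(data) == parallel_map(data)
    print(parallel_map(data))
```

Both functions compute the same thing; the only difference is that the second one spells out the parallelism, which is the "explicit construct" part.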

There are some languages that are intended to parallelize implicitly, though. They tend to use keywords like "data-parallel" and "array language" to describe themselves. Futhark is a good example: https://futhark-lang.org/

And Co-dfns: https://github.com/Co-dfns/Co-dfns (a subset of APL compiled for the GPU).