A very clear and interesting post.
I've been trying to fit fairly large, long-running workloads into JVMs for a few years, and have found that minimizing the amount of garbage is paramount. It's a bit like game programming or C programming.
Recent JVM features like compact (8-bit) strings, and the interned-string pool no longer having a hard size limit, have been really helpful.
But for my workloads the big wastes are still things like java.time.Instant and the overhead of temporary strings (which, these days, copy the underlying data; my code worked better back when substrings were just views onto the original).
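For what it's worth, the pattern that has helped me most with the temporary-string problem is scanning with offsets instead of split(). A minimal sketch of what I mean (the class name and the log-line format are made up purely for illustration):

    // Sketch: pull two fields out of a "millis,level,message" line without
    // String.split(). split() allocates a String[] plus a fresh String (with
    // its own backing array) for every field; tracking offsets only
    // materializes what we actually need.
    public class LineParseSketch {
        static long timestampMillis(String line) {
            int comma = line.indexOf(',');
            // Java 9+: parse a slice of the original string, no substring allocated.
            return Long.parseLong(line, 0, comma, 10);
        }

        static String message(String line) {
            int second = line.indexOf(',', line.indexOf(',') + 1);
            // Only this one field gets copied out of the line.
            return line.substring(second + 1);
        }

        public static void main(String[] args) {
            String line = "1700000000000,INFO,started";
            System.out.println(timestampMillis(line)); // 1700000000000
            System.out.println(message(line));         // started
        }
    }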
There are collection libraries with much more memory-efficient (and faster) maps, and efficient (and fast) JSON parsers, etc. I've evaluated, benchmarked, and adopted a few of them.
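To give one concrete example of the genre (fastutil here, just as an illustration; I'm not claiming it's the one to pick):

    import it.unimi.dsi.fastutil.longs.Long2LongOpenHashMap;

    // Sketch: counting events per numeric id. A HashMap<Long, Long> boxes every
    // key and value into separate heap objects; a primitive-specialized map
    // keeps them in flat long[] arrays, which is smaller and kinder to the GC.
    public class CountsSketch {
        public static void main(String[] args) {
            Long2LongOpenHashMap counts = new Long2LongOpenHashMap();
            counts.defaultReturnValue(0L);   // what get() returns for missing keys
            counts.addTo(42L, 1L);           // increment in place, no boxing
            counts.addTo(42L, 1L);
            System.out.println(counts.get(42L)); // 2
        }
    }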
Now, when I examine heap dumps and try to work out where else I can save bytes to keep GC at bay, I mostly see fragments of Instant and String, which are heavily used in my code.
If only there were a library that did date manipulation and arithmetic with longs instead of Instant :(
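In the meantime I've mostly rolled it myself: keep epoch millis as plain longs on the hot path and only touch Instant at the edges. Something like this sketch (the class and helper names are mine, and it assumes UTC-only day boundaries):

    import java.time.Instant;
    import java.util.concurrent.TimeUnit;

    // Sketch: date arithmetic on primitive epoch-millis longs, so the hot path
    // allocates nothing. Instant is only used at the edges (parsing/printing).
    public final class EpochMillis {
        private static final long DAY_MS = TimeUnit.DAYS.toMillis(1);

        static long plusDays(long epochMillis, long days) {
            return epochMillis + days * DAY_MS;
        }

        static long truncateToDay(long epochMillis) {
            // Floor division keeps this correct for pre-1970 timestamps too.
            // Note: this is a UTC day boundary, time zones are ignored.
            return Math.floorDiv(epochMillis, DAY_MS) * DAY_MS;
        }

        public static void main(String[] args) {
            long now = System.currentTimeMillis();
            long tomorrowMidnight = truncateToDay(plusDays(now, 1));
            // Convert back to an Instant only when something needs an object.
            System.out.println(Instant.ofEpochMilli(tomorrowMidnight));
        }
    }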
You're absolutely right. It's one reason I struggle with the modern fashion for immutable classes and FP: they're always making copies of everything, which seems crazy.
Ideally, a good compiler that understands FP will, behind the scenes, detect when it's safe to mutate the old data rather than creating a copy. That's a big part of why Haskell manages to be neck-and-neck with C despite being functionally pure.
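To make the trade-off concrete in JVM terms, here's a tiny sketch (the Account record is purely illustrative, and the remark about escape analysis describes best-effort JIT behaviour, not a guarantee):

    // Sketch of the trade-off being discussed: a purely functional "update"
    // copies, while a compiler/runtime that can prove the old value is dead
    // could mutate in place instead. The JVM does not promise this today.
    public class CopyVsMutate {
        record Account(String owner, long balanceCents) {
            Account deposit(long amountCents) {
                // Functional style: every deposit allocates a brand-new Account.
                return new Account(owner, balanceCents + amountCents);
            }
        }

        public static void main(String[] args) {
            Account a = new Account("alice", 0);
            for (int i = 0; i < 1_000; i++) {
                a = a.deposit(1); // 1,000 short-lived Account objects for the GC
            }
            System.out.println(a.balanceCents());
            // If escape analysis proves the intermediates never escape, the JIT
            // may scalar-replace them -- but that's opportunistic, unlike the
            // guaranteed reuse a functional compiler can do when it knows the
            // old value has no other references.
        }
    }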
Where it gets tricky is in an environment like the JVM where programming in that style was not anticipated, and introducing any optimizations along these lines for the benefit of the proverbial Scala fans needs to be balanced against the obligation not to adversely impact idiomatic Java code.
That said, even without that, it's not necessarily crazy. It's just a value call: Do you believe that more functional code is easier to maintain, and perhaps value that above raw performance? I'm old enough to remember similar debates about how object-oriented C++ code should be, and to have at least encountered Usenet posts from similar debates about how structured C code should be. I don't bring this up by way of trying to weasel in some "historical inevitability" argument - these are legitimate debates, and there are still problem domains where coding guidelines may discourage, or even prohibit, certain structured programming practices. For very good reasons.
Haskell is only close to C in extremely rare cases or when using unsafe features and the FFI.
Would you say idiomatic Haskell is faster or slower than idiomatic use of Java and the JVM? I'm interested in actual experience, and preferably benchmarks of real cases; no thought experiments, please :)
(If this sounds harsh, it's not my intention. In another HN thread I had someone "explain" to me how Java and Java's OOP are "not suitable for business software development". That struck me as a bizarre statement that disregards more than a decade of business software development -- which is why I ask for actual experience rather than opinions or "I think this can't be right".)
https://www.techempower.com/benchmarks/
The source code is published too (https://github.com/TechEmpower/FrameworkBenchmarks)