I don't get why this library is better than GWT.
The essential argument seems to come down to two things:
1. It can generate node.js-compatible JavaScript, so you can run Java code in node. I'm not sure how many people have felt the need to do that, but it's not something GWT can do.
2. It understands TypeScript, so it can make use of the work people have done to (essentially) create type annotations over various JavaScript libraries, and automatically (? or at least painlessly) build Java wrappers for them.
Why would anybody want to run node.js instead of Java? Java is much better suited to the server side than node.
Nobody has commented yet. I can only guess this is because everyone agrees: it goes without saying that this sweeping generalisation is patently untrue.
Still, allow me to humor you and give just one (of the many) reason(s) why: async I/O.
Yes, it's possible in Java. Yes, Java has futures that make it not just possible, but also a reality in this universe.
Unfortunately, Java has made the inconvenient decision to retain backwards compatibility and keep all its synchronous I/O operations available. Perhaps not a complete folly. For all the good that decision has brought, there is one downside: any code (any lib) can still lock up an entire OS thread. Result: all libs actually do this. No matter how Future-istic your codebase is, use any lib and you're back to sync I/O.[1] This is an absolute killer for high-concurrency, I/O-bound servers, which now tie up not just an FD or two per connection, but also an entire OS thread.
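To make the point concrete, here's a minimal sketch (the URL fetch is just a stand-in for any blocking lib call; the class and method names are mine):

```java
import java.io.InputStream;
import java.net.URL;
import java.util.Scanner;
import java.util.concurrent.CompletableFuture;

public class FutureWrappedBlocking {
    // Looks async to the caller, but supplyAsync just moves the block onto
    // a pool thread: for the whole duration of the request, one OS thread
    // sits parked on the socket. The sync I/O didn't go away, it moved.
    static CompletableFuture<String> fetch(String url) {
        return CompletableFuture.supplyAsync(() -> {
            try (InputStream in = new URL(url).openStream();
                 Scanner s = new Scanner(in).useDelimiter("\\A")) {
                return s.hasNext() ? s.next() : "";
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        });
    }

    public static void main(String[] args) {
        fetch("https://example.com")
                .thenAccept(body -> System.out.println(body.length()))
                .join();
    }
}
```

Scale that to 10k concurrent connections and you have 10k parked threads (or a saturated pool), no matter how async the method signatures look.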
JavaScript's threading model, on the other hand, is so shit that it came out the other side. Because it's inherently single-threaded, all potentially blocking I/O has to go through callbacks. Result: you can't block an OS-level thread in JavaScript! So all libraries are inherently async (at the OS level).
There are, of course, many more reasons why Java is not "much better suited to the server side than node" (libraries, existing knowledge in your org, community, ...); async I/O is just one of them. In reality, both have their uses. Node has proven itself by now.
Disclaimer: I actually hate Javascript. And Java. Including Java 8 (for all the shit it, wisely, keeps around).
[1] E.g. Amazon's official AWS SDK, which does sync I/O. Want async I/O for S3 (or anything there)? Write it yourself. Which you won't. Because there's already a lib. "But it's not async!" // TODO.
I hope we can agree that whether Netty or Node can accept 100M connections per millisecond is irrelevant if the rest of your stack cannot keep up.
In most applications, eventually you need to hit the database to process the request... and the database does not care whether your original HTTP request was asynchronous or not... it will still take its sweet time in milliseconds (or seconds for complex queries) to respond.
And the best way to prevent that is caching.
And that is where Node is at a disadvantage.
In Java (and C, and C#, and a bunch of others), if your server has 16 cores, you will be running a single process on 16 cores and probably 32 concurrent threads, and they can all access some shared data (notably, database cache) with low intra-process latency, using some simple sharing primitives.
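A sketch of what that shared cache looks like (ConcurrentHashMap standing in for whatever cache lib you'd actually use; expensiveDatabaseQuery is a hypothetical placeholder):

```java
import java.util.concurrent.ConcurrentHashMap;

// One JVM process, N worker threads, one shared cache: a hit is a plain
// in-memory read, with no serialization and no hop to another process.
class QueryCache {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();

    String lookup(String key) {
        // computeIfAbsent runs the loader at most once per key, even when
        // all 32 threads ask for the same key at the same moment.
        return cache.computeIfAbsent(key, this::expensiveDatabaseQuery);
    }

    private String expensiveDatabaseQuery(String key) {
        return "row-for-" + key; // stand-in for the real DB round trip
    }
}
```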
Now, Node... since a Node process is single-threaded, one would need to spawn 16 (or 32) Node processes to fully utilize the server, and they will be using inter-process communication to talk to each other, which is much slower than Java's intra-process sharing.
You're right but there are downsides to this approach.
First, you're creating a single point of failure: if any part of the application crashes, for whatever reason, it brings everything down with it.
If you intend to have fault tolerance, maintain uptime during updates and/or provide A/B testing capabilities, you'll still require at least a redundant server and a load balancer to funnel incoming requests between the two.
Second, sharing memory across cores comes at a cost.
A simple mutex/lock strategy will work but will be inefficient as it blocks on both reads and writes to shared data.
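For illustration, the naive locked version of a shared cache (just a sketch):

```java
import java.util.HashMap;
import java.util.Map;

// Coarse-grained locking: correct, but every reader queues behind the same
// monitor, even when nobody is writing.
class LockedCache {
    private final Map<String, String> cache = new HashMap<>();

    synchronized String get(String key) {
        return cache.get(key);
    }

    synchronized void put(String key, String value) {
        cache.put(key, value);
    }
}
```

(A ReentrantReadWriteLock lets readers proceed in parallel, but writers still stall everyone.)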
You could choose to use a CAS (compare-and-swap) strategy to avoid locks altogether, as long as you can ensure that your models, as well as all of the internal data structures they use, are immutable and thread-safe. Considering the nested nature of OOP inheritance, as well as language-level access restrictions on classes/methods/members, it can be difficult or impossible to ensure immutability without hand-rolling your own data structures.
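And a CAS version, which is only safe because the published map is never mutated after it's swapped in (again, just a sketch):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Lock-free reads via CAS: readers never block, writers copy and retry.
class CasCache {
    private final AtomicReference<Map<String, String>> ref =
            new AtomicReference<>(Collections.emptyMap());

    String get(String key) {
        return ref.get().get(key); // a plain volatile read, no lock taken
    }

    void put(String key, String value) {
        Map<String, String> current, updated;
        do {
            current = ref.get();
            updated = new HashMap<>(current); // copy-on-write
            updated.put(key, value);
        } while (!ref.compareAndSet(current, updated)); // retry if we lost the race
    }
}
```

Note that every write copies the whole map; real implementations use persistent data structures instead, which is exactly why the immutability requirement above is load-bearing.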
Third, sharing state across contexts comes with hidden costs that need to be taken into consideration. For instance, when the shared state is updated (i.e. a cached value is added or changed), the corresponding cache lines in the other cores' L1, L2, and L3 caches are invalidated. The overhead of cache invalidation grows in proportion to the number of cores/processors involved.
Source: http://www.drdobbs.com/architecture-and-design/sharing-is-th...
I assume you have experience with multi-threaded programming. I won't touch on the transient nature of the bugs it introduces, or how hard they are to reproduce, other than to say I've done enough of it in the past to develop a very healthy fear.
-----------------------
What about Node...
You're right to assume that Node instances will run as a cluster, ideally one instance per core, to reduce context switching and to prevent cache invalidations between the foreground and background workers internal to a Node instance.
As for a cache layer between the DB and the Node instances: since communication between Node instances happens at the OS level via IPC, there's no benefit to implementing a global cache inside the Node application itself. Instead, it's better to offload that responsibility to a separate service specialized for caching (e.g. redis). This reduces the complexity of the application, the surface area for bugs, and development time.
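The cache-aside pattern looks the same regardless of which platform sits in front of the cache; here's a sketch in Java with the Jedis client, to stay in one language with the examples above (the client choice, key name, and TTL are all mine):

```java
import redis.clients.jedis.Jedis;

// Cache-aside against a cache that lives outside the app process, so every
// worker (thread or separate Node-style process) sees the same data.
public class SharedCacheLookup {
    public static void main(String[] args) {
        try (Jedis redis = new Jedis("localhost", 6379)) {
            String key = "user:42";
            String value = redis.get(key);           // 1. try the cache
            if (value == null) {
                value = expensiveDatabaseQuery(key); // 2. miss: hit the DB
                redis.setex(key, 60, value);         // 3. populate, 60s TTL
            }
            System.out.println(value);
        }
    }

    static String expensiveDatabaseQuery(String key) {
        return "row-for-" + key; // stand-in for the real query
    }
}
```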
Load balancing will be required anyway, so it makes sense to use Nginx. Nginx is much more fault tolerant, provides sane limits for incoming requests, and lets you define routes that short-circuit requests for static files.
What's interesting about deploying a Node cluster is its 'redundant by default' nature, including auto-respawning of cluster instances (https://www.npmjs.com/package/express-cluster).
If you want more granular control over cluster management, you could use https://github.com/Unitech/pm2.
To make this work, the servers themselves need to be stateless, which means handling session management requires additional workarounds.
-----------------------
Java is likely 'faster' when deployed in a strictly vertically scaled setup; Node.js is more geared toward horizontal scaling by default. Just as Java now includes good support for async request handling, I have no doubt the community will design tools that favor scaling out as well.
Either way, I don't think raw performance is going to be the deciding factor. Choice of platform is a business decision that depends on the resources, skills, and preferences of the team responsible for development.