> It may sound obvious, but optimizing your app to fulfill a request in 1/10 the time is like adding 9 servers to a cluster. Optimizing to 1/100 the time (reducing requests from, say, 1.5 sec to 15 ms) is like adding 99 servers.
Who the hell is casually optimizing away 90% of their latency in the time it takes to spin up 9 more pods? That's insane; the latter is an operation on the scale of minutes.
Also, this conflates throughput with latency: adding servers lets you handle more requests in parallel, but no amount of extra hardware makes a single request return in 15 ms instead of 1.5 sec.
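To make that concrete (illustrative numbers, assuming each server handles one request at a time):

    10 servers x (1 req / 1.5 s)  = ~6.7 req/s, each request still takes 1.5 s
     1 server  x (1 req / 0.15 s) = ~6.7 req/s, each request takes 0.15 s

Both buy you the same throughput; only the optimization buys you latency.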
90% of the cost a startup incurs is in the things it could be doing instead, not in the infra. Opportunity cost dominates.
I've optimized software at startups for 10x to 100x performance improvements. It's not hard to find these areas when you have the fortune of hindsight. These past 2 weeks I took CI from ~40 min to ~2 min, which not only improved everyone's lives at the company but also made it way cheaper to host beefier CI machines: a build that takes 20 minutes & 32 GB of RAM burns 20x the machine-minutes of one that takes 1 minute & 32 GB of RAM, so the costs are way different. At previous companies, re-architecting core systems to allow better binpacking saved similarly large amounts of money ($15k/month -> $1k/month).
At way smaller companies (2 people), though, this is way harder unless someone has done something really wrong.
What sort of things did you optimize to improve the build times?
Dropping our Docker+Gradle builds for Bazel, leaning on Bazel's caching, and running a persistent Bazel daemon on the CI machines so I don't pay startup/analysis costs on every build. On builds with no changes, CI time is under 1 minute.
I could have done that with Gradle, but we also want to support multiple languages (Java, Python, NodeJS & React, Golang).
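The wiring for this is just a few flags; a minimal .bazelrc sketch (the cache endpoint and idle timeout here are illustrative assumptions, not our actual config):

    # .bazelrc (sketch; endpoint and values are made up)
    # Share build/test results across CI runs via a remote cache.
    build --remote_cache=grpc://bazel-cache.internal:9092
    # Also keep a per-machine disk cache as a fallback.
    build --disk_cache=~/.cache/bazel-disk
    # Keep the Bazel server alive between CI jobs so repeat builds
    # skip JVM startup and analysis.
    startup --max_idle_secs=86400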
Great answer. Honestly, while the theory is that you can Dockerize your build and do remote caching with Bazel, I've never seen anyone do it. It seems to be a confluence of steps that just doesn't occur in practice. I wouldn't use Docker for builds at all, just because of this performance regression.
Bazel's caching abilities are by far the best I've ever worked with because it understands the full source tree. It can also cache test executions. There are some tests in my code that make sure I'm calling out to crypto libraries correctly; these tests take >30 seconds to execute but almost never change. With Bazel I can feel free to write as many of those integration tests as I want, since they will only ever be rerun when something affects them (e.g. I change the version of my crypto library).
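Concretely, that falls straight out of the dependency graph; a hypothetical BUILD sketch (target and dep names made up):

    # BUILD (illustrative names)
    load("@rules_java//java:defs.bzl", "java_test")

    java_test(
        name = "crypto_integration_test",
        srcs = ["CryptoIntegrationTest.java"],
        deps = [
            # Reruns only when this subtree or the pinned library changes.
            "//libs/crypto:client",
            "@maven//:org_bouncycastle_bcprov_jdk18on",
        ],
    )

Test results are cached keyed on the target's transitive inputs (--cache_test_results defaults to auto), so `bazel test //...` keeps skipping this target until something in its closure actually changes.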
> Honestly, while the theory is that you can Dockerize your build and do remote caching with Bazel, I've never seen anyone do it
Yea, you likely don't want to run Bazel within a Docker container; you want to build a Docker container within Bazel [0]. The performance of this approach is much better. My monorepo has >30 services, and `docker-compose up --build` was becoming super slow. To address this I've written bazel_compose [1] to get the same workflow docker-compose offers you, with Bazel as your container build system. It also supports a gradual migration scheme and will build both the Dockerfile AND the Bazel version of your container to make sure they both start.
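To give a flavor of what "building the container within Bazel" looks like, a minimal rules_docker sketch (assuming [0] points at rules_docker; the base image and target names are made up):

    # BUILD (illustrative; assumes a container_pull named @java_base)
    load("@io_bazel_rules_docker//container:container.bzl", "container_image")

    container_image(
        name = "api_image",
        base = "@java_base//image",
        files = [":api_deploy.jar"],
        cmd = ["java", "-jar", "api_deploy.jar"],
    )

Because the image layers are ordinary Bazel actions, they hit the same remote/disk cache as the rest of the build instead of relying on Docker's layer cache.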
Unfortunately the Bazel community is mainly populated with companies that are 100x the size of the average; they already can't run all of their services on their dev machines, so they don't see the value of something like this. This version of bazel_compose is out of sync with HEAD @ caper, but if you're adventurous I'd recommend checking it out. It has extra features to watch all of the source files using ibazel and will automatically build & restart containers (well under 10 seconds in my experience) as you edit and save code.
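The watch-and-restart piece is standard bazel-watcher; the equivalent single-service invocation would be something like (target name made up):

    # Rebuild and rerun the target whenever a source file in its graph changes
    ibazel run //services/api:api_image

bazel_compose applies that same loop across every service in the compose file.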