Bit of a warning: if you do not set CPU requests, your pods may end up with cpu.shares=2. Kubernetes converts requests.cpu into cpu.shares at roughly 1 CPU = 1024 shares, and with no request you get the kernel minimum of 2, which is almost no scheduling weight when the node is contended.
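If you want to see what your container actually got, here is a minimal sketch, assuming cgroup v1 and the usual mount point (on cgroup v2 the knob is cpu.weight instead, on a different scale):

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

func main() {
	// Assumes cgroup v1 mounted at /sys/fs/cgroup/cpu.
	data, err := os.ReadFile("/sys/fs/cgroup/cpu/cpu.shares")
	if err != nil {
		fmt.Println("could not read cpu.shares:", err)
		return
	}
	fmt.Println("cpu.shares =", strings.TrimSpace(string(data)))
}
```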

Java, for example, derives its idea of how many CPUs it has partly from this value, and sizes things like GC and compiler thread pools accordingly, in ways you're not gonna like.

The Go runtime also locks in some unwarranted assumptions at process start time: GOMAXPROCS defaults to the number of CPUs visible on the machine, and it is never revisited if the number of available CPUs changes.
Explicitly setting GOMAXPROCS is probably the cleanest way to limit CPU among the runtimes out there, though. For example, if you set requests = 1, limits = 1, GOMAXPROCS=1, then you will never run into the latency-increasing CFS CPU throttling; you would only be throttled if you used more than 1 CPU, and since you can't (modulo forks, of course), it won't happen. There is https://github.com/uber-go/automaxprocs to set this automatically, if you care.
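If you'd rather derive it than hard-code it, a rough sketch of doing the same thing by hand, assuming cgroup v1 and the usual cfs_quota_us/cfs_period_us files (automaxprocs does this dance for you, with more edge cases handled):

```go
package main

import (
	"fmt"
	"os"
	"runtime"
	"strconv"
	"strings"
)

// readCgroupInt reads a single integer from a cgroup v1 file; the
// /sys/fs/cgroup/cpu path is an assumption about how the container is set up.
func readCgroupInt(name string) (int64, error) {
	data, err := os.ReadFile("/sys/fs/cgroup/cpu/" + name)
	if err != nil {
		return 0, err
	}
	return strconv.ParseInt(strings.TrimSpace(string(data)), 10, 64)
}

func main() {
	quota, err1 := readCgroupInt("cpu.cfs_quota_us")
	period, err2 := readCgroupInt("cpu.cfs_period_us")
	if err1 == nil && err2 == nil && quota > 0 && period > 0 {
		// Round down, but never below 1: a limit of 1.5 CPUs becomes GOMAXPROCS=1.
		procs := int(quota / period)
		if procs < 1 {
			procs = 1
		}
		runtime.GOMAXPROCS(procs)
	}
	fmt.Println("GOMAXPROCS =", runtime.GOMAXPROCS(0))
}
```

Or just blank-import go.uber.org/automaxprocs and let it set GOMAXPROCS from the quota at init time.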

You are right that by default, the logic that sets GOMAXPROCS is unaware of the limits you've set. That means GOMAXPROCS will be something much higher than your cpu limit, and an application that uses all available CPUs will use all of its quota early on in the cfs_period_us interval, and then sleep for the rest of it. This is bad for latency.
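To put illustrative numbers on it: with limits = 1 on a 16-core node and the default 100ms CFS period, the container gets 100ms of CPU time per period while GOMAXPROCS defaults to 16. Sixteen busy threads can burn through that quota in roughly 100/16 ≈ 6.25ms of wall clock, after which everything in the container, including whatever goroutine was about to answer a request, sleeps for the remaining ~94ms of the period.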