Instead of fixing something RIGHT NOW, meaning adding another commit to your build, why aren't you rolling back to a known good commit?

Image is already built.

CI already certified it.

The RIGHT NOW fix is just a rollback and deploy, which takes less time than verifying new code in any situation. I know you don't want to hear it, but really, if you need a RIGHT NOW fix that isn't a rollback, you need to look at how you got there in the first place. These systems are literally designed around never needing a RIGHT NOW fix again: Blue/Green, canary, auto deploys, rollbacks. Properly designed container infrastructure takes the guesswork and stress out of deploying. Period. Fact. If yours doesn't, it's not set up correctly.
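To make the "roll back to a known good commit" idea concrete, here is a minimal sketch of how a deploy script might pick the rollback target. Everything in it is an assumption for illustration: the `Release` record, its `ci_passed` and `healthy` fields, the `pick_rollback_target` helper, and the `app:1.4.x` image tags are hypothetical and do not come from any particular tool.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Release:
    image: str        # hypothetical image tag, e.g. "app:1.4.2"
    ci_passed: bool   # CI certified this image when it was built
    healthy: bool     # it ran in production without incident


def pick_rollback_target(history: Sequence[Release]) -> Optional[Release]:
    """Return the newest known-good release before the current one.

    `history` is ordered oldest to newest; the last entry is the release
    that is live (and presumed broken) right now.
    """
    for release in reversed(history[:-1]):
        if release.ci_passed and release.healthy:
            return release
    return None


history = [
    Release("app:1.4.0", ci_passed=True, healthy=True),
    Release("app:1.4.1", ci_passed=True, healthy=False),
    Release("app:1.4.2", ci_passed=True, healthy=True),
    Release("app:1.4.3", ci_passed=True, healthy=False),  # live and broken
]

target = pick_rollback_target(history)
print(target.image if target else "no known-good release")  # prints "app:1.4.2"
```

The point is that the target image already exists and was already certified, so the only work left is redeploying it. How that redeploy happens depends on whatever platform you are running.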

> Properly designed container infrastructure takes the guesswork and stress out of deploying. Period. Fact. If yours doesn't, it's not set up correctly.

Hmmm, I don't have the experience to know if it's set up correctly or not. All I can do is watch it fail and then learn from my mistakes.

Is there a container "framework" that out of the box gives me all of "Blue/Green, canary, auto deploys, rollbacks..." so I don't have to guess if I'm doing it right?

We’ve been using Convox[0] for the last 2 years. I’ve been pretty happy with how simple it is to work with. We’re still on version 2, which uses AWS ECS or Fargate. Version 3 has migrated to k8s and is provider agnostic. We just haven’t had the bandwidth to upgrade yet.

[0] https://convox.com/

We are using Convox v2 too and are happy with it, but I'm hesitant to upgrade. It would introduce the complexity of Kubernetes to our devs, and I'm not sure Convox is the right abstraction on top of Kubernetes when k8s itself is already a pile of abstractions (and there are so many other tools to choose from in the k8s universe).

https://github.com/aws/copilot-cli isn't ready for our use cases, but it is more or less Convox v2 built by AWS.