Out of curiosity, have you considered using a workflow management system such as AWS Step Functions [1] instead of a set of queues? If so, what made you decide to go with queues?
Workflow engines neatly solve a lot of the challenges you mention in the article essentially for free:
* Model your problem as a directed graph of (small) dependent operations. Adding new steps does not require adding new queues, just new code.
* let the workflow engine manage the state of your transactions (the important, strongly consistent part) and just focus on implementing the logic. Scale horizontally by adding more worker nodes.
* get free error handling tools, such as automatic re-tries on failed steps, support for special error handling steps, and metrics you can alarm on (e.g. number of open workflows, number of errored workflows, etc.)
* get a free dashboard where you can look up the state of a given transaction, look at error messages, (mass) re-drive failed transactions (e.g. after an outage or after having fixed a bug), etc.