One thing that came up in the Rails Performance Slack this morning (https://www.railsspeed.com):
Efficiency vs Resiliency
If you hyper-optimize a system to run at 100% utilization, almost any future change to that app will cause a problem. Imagine a bug that adds 10% more CPU load: now you have CPU thrashing.
Even Google runs their machines at 80-90% utilization because that 10-20% is the slack required to safely absorb most changes.
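A rough sketch of the arithmetic behind that slack, assuming load scales proportionally with the change and 100% utilization is a hard ceiling. The function names are made up for illustration, and real systems usually degrade well before the ceiling:

```python
# Hypothetical helpers; a simplified model, not a capacity-planning tool.

def utilization_after_change(current: float, relative_increase: float) -> float:
    """Projected utilization if a change adds `relative_increase` more load.

    Both arguments are fractions: 0.85 means 85%, 0.10 means +10%.
    """
    return current * (1 + relative_increase)


def absorbs_change(current: float, relative_increase: float) -> bool:
    """True if the projected utilization still fits within 100% capacity."""
    return utilization_after_change(current, relative_increase) <= 1.0


def max_absorbable_increase(current: float) -> float:
    """Largest relative load increase that still fits within 100% capacity."""
    return 1.0 / current - 1.0


if __name__ == "__main__":
    for u in (0.80, 0.90, 1.00):
        print(
            f"at {u:.0%}: +10% load fits? {absorbs_change(u, 0.10)}, "
            f"max absorbable increase is about {max_absorbable_increase(u):.0%}"
        )
```

Under this model a system at 100% has no room at all for the 10% bug, while one at 80-90% can take it without going over capacity.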
@getajobmike I feel like we’ve seen a lot of this same issue in other kinds of systems recently … supply chains in various industries, airline capacity optimization, and much more.
@glv Absolutely. I believe there's a fundamental law for all systems engineering here.