> Enterprise architecture is immutable by default now, and destroying and replacing is the norm.
Real life is harder. If I have a cluster of 8 H200 machines running a training job, I can't really destroy it and redeploy. Technically I can, but first I need to spend time with the data scientists to make sure they've configured everything to resume training from checkpoints. And if this cluster sits idle for a day, the money wasted is around my monthly salary.
Hm, maybe more enterprisey clusters are used in such a way that any node can be replaced at any time.
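For what "configured to continue from checkpoints" means in practice, here is a minimal sketch, not anyone's real training setup. It uses only the Python standard library; `train`, `save_checkpoint`, `load_checkpoint`, the JSON state format, and the `fail_at` knob are all hypothetical names made up for illustration, and the "weights" value is a toy stand-in for real model and optimizer state.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state to a temp file, then rename it over the old checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # The rename is atomic on POSIX, so a crash mid-write leaves the
    # previous checkpoint intact instead of a half-written file.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists."""
    if not os.path.exists(path):
        return {"step": 0, "weights": 0.0}
    with open(path) as f:
        return json.load(f)


def train(path, total_steps, checkpoint_every=10, fail_at=None):
    """Toy training loop that resumes from the last checkpoint on disk."""
    state = load_checkpoint(path)
    for step in range(state["step"], total_steps):
        if fail_at is not None and step == fail_at:
            # Hypothetical failure injection: the node is destroyed here.
            raise RuntimeError("simulated node loss")
        state["weights"] += 0.5  # stand-in for one optimizer step
        state["step"] = step + 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state
```

If the job dies between checkpoints, rerunning `train` with the same path redoes only the steps since the last save. The part that takes time with the data scientists is making sure everything that matters (optimizer state, data loader position, RNG seeds) actually lands in that saved state, which this toy skips entirely.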
And this gets into another complication of ET that doesn't happen with PT: with Product Tech, the onus is on the customers to modernize around a new update, whereas with ET, it's our responsibility to work around the customers, on their schedule and their timeline, unless we want to be fired for "bad customer service".
We cannot simply rip and tear like Product can, placing trust in our orchestrators to rebuild from configs with brand-new instances. We can't spool up Chaos Monkey and test-tank the ERP system, because the ERP team has no interest (or political benefit) in modernizing their infrastructure to support Configuration Management tools or pipelines.