Operational Cognizance

While it’s true that solving tomorrow’s problems should not be our goal for today, it’s still important to be cognizant of them. There’s nothing worse than being blindsided by a problem only to realize we could’ve done better had we had better visibility into it sooner.

We can anticipate the known, hard failure modes of today and “monitor” for them, but the known, hard failure modes of tomorrow most often don’t manifest in any explicit way today. They need to be teased out of subtle behaviors our system exhibits only during certain anomalous situations or under certain traffic patterns, which might be rare, not yet a cause for concern, or not immediately actionable today.

I see “monitoring” as something that requires being on-call. Observability, on the other hand, I see as something that necessarily requires developer/software engineer participation.

Having data at our disposal doesn’t, by itself, solve problems. Solving them also takes the right amount of engineering intuition and domain experience to ask the right questions of the data and get to the bottom of an issue.