Pitfall deep-dive #5: ignoring flaky tests until culture normalized red
What would you do differently next time, and what signal told you too late?
Thread index 5 — add your angle.
15 replies
The 'quick hack' had seventeen owners over two years; ownership drift is real. In engineering pitfalls (thread 5): We learned that writing 'why this circle exists' in the header measurably reduces mis-posts and moderator load. The mentor who said 'prove value with retention inside the circle' sharpened our roadmap debates.
We stopped treating 'tech debt paydown' as a bucket without measurable quarterly outcomes. In engineering pitfalls (thread 5): The architecture spike that time-boxed exploration prevented endless research tickets. We learned that kindness plus accountability is the combo that actually ships quality.
We learned that naming a risk does not summon it — silence does not protect you. In engineering pitfalls (thread 5): We learned that gratitude in tickets is cheap and improves cross-team goodwill. We should have deleted unused Slack integrations firing noise into incident channels.
We stopped confusing 'innovation' with 'complexity' in engineering interviews. In engineering pitfalls (thread 5): We should have deleted unused TLS certificates from old endpoints — scanners nag forever otherwise. The flaky test quarantine process without expiry became permanent — process decay is real.
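To make the quarantine point concrete, here is a minimal sketch of a quarantine list where every entry carries an owner and an expiry date, and CI fails once an entry outlives it. The names `QuarantineEntry`, `expired_entries`, and `check_quarantine` are hypothetical and not taken from any particular test framework; treat it as one possible shape, not a definitive implementation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class QuarantineEntry:
    test_id: str   # hypothetical test identifier
    owner: str     # person or team responsible for the fix
    expires: date  # date after which the quarantine is no longer honoured


def expired_entries(entries, today):
    """Return the quarantined tests whose expiry date has passed."""
    return [e for e in entries if e.expires < today]


def check_quarantine(entries, today):
    """Raise if any quarantine has outlived its expiry, so the build goes red."""
    stale = expired_entries(entries, today)
    if stale:
        names = ", ".join(f"{e.test_id} (owner: {e.owner})" for e in stale)
        raise RuntimeError(f"Expired quarantines: {names}")
```

Running `check_quarantine` as an early CI step would force someone to either fix the test or consciously renew the date, which is the decision the permanent quarantine let us avoid.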
We learned that customers notice when you fix the papercuts they stopped reporting. In engineering pitfalls (thread 5): Two strong opinions without data turned into a week nobody wants back. The mentor who said 'draw the data flow' ended a circular debate in minutes.
The quiet win was deleting duplicate metrics that disagreed politely in Grafana. In engineering pitfalls (thread 5): Good incident comms reduce duplicate tickets more than faster fixes sometimes. Honest capacity planning hurt feelings once and saved quarters of thrash.
The quiet win was deleting duplicate metrics that disagreed politely in Grafana. In engineering pitfalls (thread 5): The quiet win was aligning on a single severity matrix across eng and support. The architecture spike that listed operational costs prevented surprise cloud bills later.
The architecture spike that listed multi-cloud egress costs prevented surprise bills. In engineering pitfalls (thread 5): We should have named a communications owner for incidents before marketing tweeted early. The architecture review that asked about backup RPO/RTO numbers changed hosting assumptions.
We learned that customers forgive slow fixes if communication is honest and frequent. In engineering pitfalls (thread 5): We learned that humour about deploy Fridays is funny because it is true — policy beats memes eventually. The best teams treat on-call improvements as product work with roadmap space.
We learned that 'temporary' flags need owners and expiry dates in writing. In engineering pitfalls (thread 5): The quiet win was standardising environment names across repos and dashboards. We learned that small trustworthy releases beat big risky bangs for morale.
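One way to put "owners and expiry dates in writing" is a small check over a flag registry. This is only a sketch: the registry layout (a dict of flag name to `owner` and `expires`) and the function name `flag_problems` are assumptions made for illustration, not any real feature-flag product's API.

```python
from datetime import date


def flag_problems(flags, today):
    """List flags that lack an owner, lack an expiry date, or are past expiry."""
    problems = []
    for name, meta in sorted(flags.items()):
        if not meta.get("owner"):
            problems.append(f"{name}: no owner")
        expires = meta.get("expires")
        if expires is None:
            problems.append(f"{name}: no expiry date")
        elif expires < today:
            problems.append(f"{name}: expired on {expires.isoformat()}")
    return problems


# Hypothetical registry: one expired flag, one with neither owner nor expiry.
flags = {
    "new_checkout": {"owner": "payments", "expires": date(2024, 3, 1)},
    "tmp_banner": {"owner": "", "expires": None},
}
for problem in flag_problems(flags, date(2024, 6, 1)):
    print(problem)
```

A non-empty result could block a merge or simply post a reminder to the owner; which of those fits depends on how much friction the team will tolerate.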
The architecture principle 'fewer moving parts' aged better than our clever choreography. In engineering pitfalls (thread 5): We learned that transparent salary bands reduce whisper networks and attrition surprises. The mentor who paired on log reading taught me more than any logging vendor demo.
We should have invested in canary metrics tied to business KPIs, not only HTTP 200 counts. In engineering pitfalls (thread 5): The 'quick hack' had seventeen owners over two years — ownership drift is real. We learned that customers trust changelog entries that credit external reporters by first name.
Honest timelines are a competitive advantage once customers believe you. In engineering pitfalls (thread 5): We learned that writing 'communication plan' in launch checklists consistently reduces stakeholder surprise. The mentor who said 'show the customer quote' ended abstract prioritisation debates.
We learned that naming owners for analytics dashboards prevents contradictory KPI arguments. In engineering pitfalls (thread 5): We stopped shipping 'just a quick script' without code review because scripts run in prod too. The 'quick hack' had seventeen owners over two years — ownership drift is real.
The spreadsheet everyone hated was also the source of truth — respect the ugly tools. In engineering pitfalls (thread 5): We stopped shipping secrets in env files shared in chat — finally. The linter rule everyone hated prevented a class of bugs we stopped counting.