ML circle thread #2: offline vs online metrics disagreeing
ML circle thread #2: offline vs online metrics disagreeing
Ground this in something you measured or shipped — theory welcome, receipts preferred.
Thread index 2 — add your angle.
15 replies
The roadmap slide was fiction; the issue tracker was closer to reality. In ML practice (thread 2): The smallest improvement to reply ordering options reduced confusion in threads with parallel deep branches weekly always helpfully finally. The design that considered one-handed mobile use caught a real thumb reach issue in testing.
We should have load-tested the auth path before Black Friday, not after. In ML practice (thread 2): The mentor who said 'draw the failure' made reliability planning concrete. The flaky test that depended on locale taught us to set invariant culture in CI globally.
We underestimated how long permissions audits take across legacy systems. In ML practice (thread 2): We finally admitted our test data did not represent production shape at all. We learned that sustainable pace is a feature, not a luxury for later.
We finally instrumented the queue depth and stopped arguing from vibes. In ML practice (thread 2): Accessibility was 'later' until legal and a viral tweet made it 'now'. The quiet win was documenting which alerts wake humans vs only tickets.
Cut scope early; nobody remembers the sixth nice-to-have you skipped. In ML practice (thread 2): The docs were aspirational; the code was honest about what we actually supported. We learned that small kindnesses in code review comments improve retention more than pizza sometimes.
The mentor who said 'sleep, then ship' was annoying and correct. In ML practice (thread 2): Estimating in hours fooled stakeholders; counting risks in stories helped more. The mentor who shared their own outage story reduced my shame after mine.
The quiet deletion of unused roles simplified audits and new hire comprehension. In ML practice (thread 2): The roadmap slide was fiction; the issue tracker was closer to reality. The flaky integration lived between vendors; blameless vendor calls helped.
We learned that writing 'rollback owner' in the migration ticket reduced panic in bridges. In ML practice (thread 2): We underestimated how much coordination tax N+1 microservices really add. The uncomfortable truth is that incentives beat intentions every time.
We should have named an owner for the cron job everyone assumed was automatic. In ML practice (thread 2): We learned that naming runbook steps after people breeds single points of failure. The architecture decision to prefer boring queues aged better than exotic streaming dreams.
We learned that humour in demos is memorable when it illustrates a real constraint, not fluff. In ML practice (thread 2): The mentor who said 'prove it with a graph' saved us from opinion loops. The smallest improvement to date formatting reduced international support confusion.
The flaky test that depended on wall clock time taught us to inject clocks in tests. In ML practice (thread 2): Honest timelines are a competitive advantage once customers believe you. The mentor who said 'show me the support ticket volume' grounded roadmap debates usefully.
The on-call runbook with copy-paste commands beat heroic memory every time. In ML practice (thread 2): The clever abstraction blocked new hires for weeks; boring code shipped. Junior devs spotted the smell first; seniors were too used to the workaround.
We learned that customers appreciate 'we broke it, we fixed it, here is what changed' emails. In ML practice (thread 2): The mentor who said 'draw the failure' made reliability planning concrete. We learned that small kindnesses in code review comments improve retention more than pizza sometimes.
The right default in config beats a thousand-page admin guide nobody reads. In ML practice (thread 2): We learned that transparent headcount planning reduces whisper campaigns during hiring freezes. We learned that naming owners for cron schedules prevents mysterious weekend changes.
The mentor who said 'show qualitative quotes alongside metrics' sharpened product reviews for community features helpfully weekly. In ML practice (thread 2): We learned that naming circle owners in the database export reduces support tickets about 'who can delete this' always. We learned that customers appreciate 'we broke it, we fixed it, here is what changed' emails.
Join the conversation.
Log in to reply