I lost some momentum on my bootcamp journey in recent days. Both “embers” in the flame panel have gone dim (see banner), signifying I haven’t submitted any work for two days. I had made it a point of pride that those embers were never lost. So, what’s really going on?
I was, and still am, working on Chapter 5: Storage. The lessons so far:
- CH 5 - Lesson 2 (Goose Migrations) - Postgres SQL schema migrations
- CH 5 - Lesson 3 (SQLC) - Code generator from SQL commands
- CH 5 - Lesson 4 (Database Review) - Quick notes about dialects and a pop quiz
- CH 5 - Lesson 5 (Create User) - We have the SQL defined, now wire up the API handler
- CH 5 - Lesson 6 (Create Chirp) - CRUD operations for “chirps” (where I am now)
I’d already been down the Goose and SQLC road in a previous course, so I thought I’d blast right through this part.
Instead, I felt something snap.
In between Lesson 4 and Lesson 5, I was overwhelmed by a need to not be the guy in R&D responsible for operations issues. There were some major alarm bells echoing in my head from my most recent five-year ops gig. At the time, I felt like it was all I could do to hold back the flood of issues.
But now, here I was, actually writing code. And I’ve wanted to fix some code for years, code that had been keeping us up at night. As trivial as the changes were between lessons 4 and 5, my code was NOT going to behave like THOSE programs!
What went through my head was something like:
Okay, so you’ve added a new feature. Very nice. Now, how are you going to prevent this from waking up the operations team at 3AM? And how will they roll back the change while they wait for you to get out of bed?
That first question unleashed a torrent of others.
- What’s the best way to merge schema migration contributions from multiple developers?
- Does the customer’s ops team need better migration tools, and should they be built into the binary?
- How would a production environment handle rollbacks?
- Should a new binary refuse to start if it depends on un-processed migrations?
- What part of the ecosystem runs the migrations, and when?
- If this process is running in Kubernetes, what should liveness and readiness probes look like?
- How can a rolling update roll back if some part of it already did the migration?
Most importantly:
- How would we test all this, including the ecosystem-dependent tooling around this new process’s codebase?
- If this was the day we broke up the monolith, what would that look like?
Alarm bell echoes
The past decade of post-graduation work experience has forged in brass a bank of alarm bells that only experience offers. In my most recent job I was a DevOps Engineer by title, and in addition to my full-time duties, I had been on call 24/7 every second week for over a year. The pressure in banking is pretty intense. This can be complicated by organizational design.
Conway’s Law tells us that organizations design systems that mirror their own communication structures. Put another way, a company’s systems mirror its org chart. My DevOps job was under the cloud operations team. That made it remote from pretty much everyone in R&D. Cloud ops was a new team … well, we were new relative to the nearly 40 years the company had been around. There wasn’t even a dotted line between my manager and the developers, and so it was tough to apply pressure on R&D from where I sat.
I know because I tried. You can tell from my questions what was going through my mind: downtime budgets, delivery surprises, runbooks, etc.
Quieting the alarms
As I lost those two Boot.dev embers, and while I learned that Anthropic was afraid to release their latest model Mythos lest it shatter our lovely little Internet, I produced for my own private edification (for now) the following artifacts:
- A design governance doc, a feature development loop, an API lifecycle guide, a schema versioning strategy
- A twelve-discipline DevOps-aware manifesto inspired by the Twelve-Factor App
- A report about how listening to Kim and Spear could have prevented recent real world ops catastrophes
- A K8s readiness probe design that classifies dependencies as private vs shared, informed by SRE
- A devops doc series that compares health check semantics across five different orchestrators
- A six-node sandbox K8s cluster so I can practice demonstrating I’m right to be worried
… all of this to answer the question: as a coder, and assuming something like this code will be successful enough one day, what do I owe to the person who deploys the 40th or 400th edition of this program at 3am, when the bank traffic is light?
The products of all this extra work might come out in these pages in time. A lot of the extra documentation was aided by AI. I asked it to summarize the intent behind all the design questions I was asking, and from its productions I could see a familiar pattern of organizational and technical struggles emerging. I doubt the documents will emerge here as they are today, if at all. I hope some will.
I haven’t worked with them “in anger” yet (to quote a brilliant trainee), so I can’t say how much will survive my editorial filter. But the story is the urgency to produce the documents in the first place, instead of just jumping to the next bootcamp lesson.
Stop running with soup
If I had to sum up the “snap”, it was cognitive dissonance reaching a breaking point. The gap between how quickly I was working through code and the real-world problems I’m familiar with was so wide that I couldn’t keep rushing to completion. I needed to slow down and resolve not to spill the metaphorical soup. Let me explain.
Going through Boot.dev is an intense pleasure: I finally get to write application code myself again after 5 years away from that role. It is also making me face my demons. The course is excellent at what it does, and what it does is get you building, but there’s a part they deliberately leave out. The courses can’t tell you how to put your own personal stamp on your work. I think the cognitive dissonance snap is part of my personal stamp.
My personal stamp, my signature and the brand it conveys, can only come from listening to the alarm bells I still hear months after leaving operations. This week I started doing that.
Having most recently been the organizational shlimazel, who gets the soup spilled on them, and previously the shlemiel, who clumsily spills the soup, it’s great to have the chance to think deeply about how architecture, design, and code quality can impact the lived experience of everybody in the food chain. Every organization has their reasons for having become who they are (perhaps a subject for another post), and nobody comes to work planning to punt a bowl of bisque, but bootcamp is my reset/sabbatical, and I’m making it mine.
I’m pumping the brakes because the way I see it, AI is definitely making a lot of people run with soup, myself included. If I don’t work out how to follow Kim and Spear’s advice to slowify, simplify, and amplify, and instead follow the FOMO in Sam and Dario’s marketing that tends to accelerate, complicate, and obfuscate, I don’t think I’m doing anybody any favours.
Now, before I get back to coding camp, I think I’d better relax and do my taxes.
