My wife and I recently had a discussion about the Agile Methodology, and how I ran it for my teams. She also manages large technical projects, but wasn’t familiar with Agile/Scrum/Sprints, so I spent a little time going over it with her. After thinking about it for a while, I figured this was as good a time as any to get my thoughts on Agile out there. I feel like this is a post I’ve written already, somewhere, so if I’m repeating (or contradicting!) myself, I apologize for that.
Also, an upfront warning: I’m going to swear a lot in this blog post. I tend to swear when I have strong emotions about things that are broken, and I desperately want to fix them. I believe lots of things with software development processes are broken, so this will creep in. If this bothers you, you probably won’t be very comfortable in the software industry, and you certainly should not apply for a job at SyncBuildRun. There, you’ve been warned.
There are several good posts on how Agile has failed, and I certainly have my own thoughts. If one looks at the original manifesto, there are great intentions there. However, Agile either gets a bad rep, or it has become a strict ritualized process where any deviation from what was taught in a Scrum Master Certification Course is cursed as being “Not True Agile!”
So Agile can clearly be a religion, by which I mean that there are many people practicing it many different ways, some of them certain that their way is the right way, but with no clear scientific ability to demonstrate that any one way is better than any other. If there were a “right way” to do Agile, everyone would be doing it that way. Thus, it’s important to distil what matters about Agile, and why one should follow a Scrum or Kanban process, and furthermore which parts of the process to follow.
These are my tips and tricks. They may not work for you. They worked well for my teams. This discussion presumes some knowledge of Agile, but I’ll try to define things as I go if I guess that they’re not common terminology.
The goal of Agile is to deliver working software to the customer, as frequently as possible, and have it be the software that the customer actually needs (or says they need) even if requirements change along the way. That’s the crux of it. The two most critical bits of process around this are 1) Communication, and 2) Iteration. Everything else is just details.
The typical Agile “work segment” is the Sprint, which is amusingly named, since software is often a marathon, and one does not typically complete a marathon well by running multiple short segments of it as quickly as possible. The Sprint is, typically, a two-to-four week period of work, during which a team signs up for a number of tasks, and then commits to complete those tasks by the end of that time period. At the beginning of the Sprint is a planning phase, and at the end there is a review phase (intended to be with the customer, be they internal or external, but rarely done so in practice) and a retrospective phase. Then the process starts over with a new Sprint. Tasks are selected from the Backlog by the team based on an overall goal (or goals) for the Sprint, the tasks are “Story Pointed”, there may be a round of “Planning Poker”, then there are daily Stand Up meetings… Oy, what the fuck are we even talking about anymore? This already sounds like a crazy cult activity.
Okay, so here’s what worked for my teams, and made them productive and happy. This is the result of running this process with four teams at Amazon and one team at Novel. I’ll almost certainly kick off the same process once we scale up at SyncBuildRun, but it will ultimately be up to the team to decide how to run their processes.
We keep a Backlog. This is just a list of all the things we believe we need to do. We know this won’t be complete, probably ever. That’s fine, it’s a list, and we add or remove things from the list as necessary. We try not to spend a lot of time doing “backlog grooming” because most people would rather eat a bullet than sit through a grooming session. Maybe once a quarter we do a sanity check and blaze through the items looking for duplicates or bullshit tasks that have become meaningless in light of new discoveries.
We Story Point our tasks, usually at the beginning of a Sprint. Actually, we do two rounds of Story Pointing (which is atypical), and I’ll get to that in a second, but first I want to talk about what Story Points are. For a long time before Amazon, I was convinced that Story Points should reflect the amount of time required to complete a task. Even at Amazon, I had multiple discussions with TPMs (Technical Program Managers, a thankless job of coordinating people, requirements, and schedules that often resulted in someone quitting because management did not want to hear how an insisted-upon timeline was impossible no matter how the numbers were massaged, but I digress) about what Story Points meant, and it took almost a year to arrive at my own clear conclusion.
Story Points are numbers assigned to the larger “tasks”, called Stories because they are usually a story about a user, as in “as a user, I want to be able to load a document”. Story Points are a number from the Fibonacci Sequence (1, 2, 3, 5, 8, 13, 21…), and they are chosen from this set because Story Points are a combined metric of time and uncertainty. Read that sentence again; Story Points do not simply measure the time it will take to complete the functionality for a Story, they also measure how much “unknown” there is about getting it working correctly. Thus, a 1 point task may take a day or less to complete, and it is absolutely obvious to the developer what code they have to write (“I need a class that stores two floats and a string, and it needs these four simple mathematical methods plus one getter for the name. 4 hours with unit tests.”). An 8 point task, on the other hand, may involve two or three weeks of work, and there is a large amount of uncertainty about how to do this work. (“Hmm, I know I need to do some network calls, but I don’t know if I need UDP or TCP. I may need some cryptographic functions too, but I’ll have to research what libraries are available…”). A 21 point task is a multi-month research effort, and anything larger than 21 points should be broken down into multiple Stories of 21 points or less.
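To make the scale concrete, here is a minimal sketch in Python. The function name `snap_to_scale` and the round-up rule are my own illustrative assumptions, not part of any official Scrum tooling; the only things taken from the description above are the scale itself and the rule that anything over 21 gets split.

```python
# The scale quoted above. Treating it as a fixed tuple is an assumption.
STORY_POINT_SCALE = (1, 2, 3, 5, 8, 13, 21)


def snap_to_scale(raw_estimate):
    """Round a raw gut-feel number up to the next value on the scale.

    Rounding up (instead of to the nearest value) is an illustrative
    choice: the gaps between values widen as the numbers grow, so a
    bigger guess absorbs more uncertainty.
    Returns None when the guess is above 21, meaning "split this Story".
    """
    if raw_estimate <= 0:
        raise ValueError("estimate must be positive")
    for points in STORY_POINT_SCALE:
        if raw_estimate <= points:
            return points
    return None  # larger than 21: break it into multiple Stories


if __name__ == "__main__":
    for guess in (1, 4, 9, 30):
        print(guess, "->", snap_to_scale(guess))
```

The point of the sketch is only that there is no such thing as a "6" or a "10"; once a Story feels bigger than a 5, the next stop is 8, and the jump is deliberate.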
The goal, of course, is to break up Stories into a bunch of 5s or less; reasonable time periods of effort, manageable uncertainty. By manageable we mean, if the task slips, it only slips by two days, max. Typically a team will estimate Story Points on a per-story basis, and they may use special Playing Cards or index cards or hand signals… Don’t waste your time with this. If you have a good, solid team, with trust and good leadership, just have people call out their estimates. The manager or team lead should ensure that once in a while he asks the quiet guy or the new woman on the team what they think, so that it’s not just one superstar calling all the shots and being either aggressively optimistic or sandbagging everything. As your team builds up trust, your estimates will get better. Call out outliers, and if you don’t agree, ask some detailed questions. “Why do you think it will take that long? Is it really that easy? What are you uncertain about? Did you think about…” A good team can blaze through over a story a minute if they’re on the ball and they understand the stories well. Again, the goal here is to wind up with a bunch of Stories for a Sprint that are mostly 5 points or less. A few 8s are okay. Anything higher should be broken apart into smaller Stories.
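If it helps to see the Sprint thresholds written down, this is a rough sketch of them, assuming stories arrive as simple `(title, points)` pairs. The function name, the bucket names, and the sample stories are all hypothetical; only the cutoffs (5 or less is the goal, a few 8s are okay, anything higher gets split) come from the paragraph above.

```python
def triage_for_sprint(stories):
    """Sort (title, points) pairs into buckets using the thresholds above."""
    buckets = {"fine": [], "tolerable": [], "split": []}
    for title, points in stories:
        if points <= 5:
            buckets["fine"].append(title)       # the goal: 5s or less
        elif points <= 8:
            buckets["tolerable"].append(title)  # a few 8s are okay
        else:
            buckets["split"].append(title)      # break into smaller Stories
    return buckets


if __name__ == "__main__":
    # Hypothetical stories, purely for illustration.
    candidates = [
        ("Load a document", 3),
        ("Pick a network transport", 8),
        ("Offline sync", 13),
    ]
    print(triage_for_sprint(candidates))
```

None of this replaces the conversation. The numbers only mean something because the team argued about them first.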
Okay, so that’s Story Points. If you can wrap your head around the purpose and usefulness of that bit of bullshit terminology, everything else will be easy.
Now for the process. When I ran Sprints, I ran them in three-week cycles. However, we didn’t do Story Work for the full three weeks. Those three weeks were broken up like this:
- Monday Morning, Week 1 – Do Sprint Planning. Look at the backlog, see what we think we can pull into the Sprint, Story Point it and assign out tasks as a team. Spend no more than two or three hours kicking this off.
- Monday Afternoon through Wednesday Morning – Prototype, Research, Investigate, Design, and talk to other teams. There will be unknowns, and collaboration with other teams may be required. This is an opportunity to reduce uncertainty, and ensure there isn’t anything that was forgotten. Do not write production code during this time.
- Wednesday, 11 AM – Sprint Planning, Round 2. Update the Story Pointing now that we have some solid data, pull stories from the Sprint if they’re the wrong stories or they can’t be completed, add new discoveries that fit into the Sprint, or put them in the backlog.
- Wednesday Noon, Week 1 to Wednesday Noon, Week 3 – Work on those Stories, doing daily standups to track status on the Scrum Board, talk about issues publicly, commit to the deliverables for the next day, etc. Main thing is, for two weeks straight, the team does work.
- Wednesday Afternoon, Week 3 – Pencils down, start testing and integrating. Any new check-ins from here on that are not bug fixes or integration fixes are a breach of trust.
- Friday Noon, Week 3 – Sprint Review and Retrospective. If everything works, is integrated, “shippable”, and there are no bugs, great, we party or go home early. If not, keep working and maybe I’ll see you on Saturday. Success is a working, shippable build that is tested and integrated, not simply checked in code that kinda does what the customer wanted.
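For anyone who wants to drop this cadence into a calendar, here is a small sketch that turns the schedule above into concrete timestamps. The function name and the dictionary keys are made up for illustration, and the 9:00 start for "Monday Morning" is an assumed hour; the post only fixes the 11 AM and noon checkpoints.

```python
from datetime import date, datetime, time, timedelta


def sprint_milestones(week1_monday):
    """Return the checkpoints of one three-week Sprint as datetimes."""
    if week1_monday.weekday() != 0:
        raise ValueError("a Sprint in this schedule starts on a Monday")

    def at(day_offset, hour):
        day = week1_monday + timedelta(days=day_offset)
        return datetime.combine(day, time(hour))

    return {
        "planning_round_1": at(0, 9),    # Monday morning, Week 1 (9:00 assumed)
        "planning_round_2": at(2, 11),   # Wednesday 11 AM, Week 1
        "story_work_starts": at(2, 12),  # Wednesday noon, Week 1
        "pencils_down": at(16, 12),      # Wednesday noon, Week 3
        "review_and_retro": at(18, 12),  # Friday noon, Week 3
    }


if __name__ == "__main__":
    for name, when in sprint_milestones(date(2024, 1, 1)).items():
        print(f"{name:18} {when:%a %Y-%m-%d %H:%M}")
```

Note that the gap between `story_work_starts` and `pencils_down` is exactly fourteen days: that is the "two weeks straight" of Story work, with the planning and integration windows sitting on either side of it.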
“But that’s not Scrum, that’s Scrumerfall!” some might say. These people should go work someplace else. Agile is not about always writing code all the time. It’s about delivering something solid to the customer. The up-front planning time is there to enhance communication, specifically between different teams, and reduce uncertainty. It is difficult to do this during the work periods, but it is easy to do during collaborative periods and planning times. By allocating a full two days to this design and collaboration work, we increase the opportunity for understanding the potential complexity and interaction issues across teams and functional areas.
Likewise, the two days at the end reduce “Big Bang Integration”, and give the team an opportunity to test their code. Far too often, a developer has slipped in some large chunk of code literally in the last minutes of the Sprint, only for the rest of the team to discover that it changed APIs and broke contracts, requiring lots of extra integration work. The smug developer usually stands up proudly and says “well, I got my work done!” and then leaves others to clean up the integration mess he left. This is the kind of Difficult Genius behavior that we don’t tolerate at SyncBuildRun, and this process of baking in testing and integration time helps to flush out and eliminate these problems.
Why does this work? Solid teams who embrace this pattern know that every third weekend they can potentially go home early, and not have to worry about anything breaking. They start again the following Monday not with a scramble to speed ahead as fast as possible, but by planning thoughtfully, and then building on something that they know was solid from the week before. They still get all the benefits of selecting their work, scoping the effort, adjusting to changing requirements on a three-week basis, and finishing with something solid, but they reduce uncertainty and increase stability. It’s a process for happy developers, and running it with five teams has resulted in five happy, productive teams. It may not be the right Agile Process for your team, but there’s solid data that it’s a process that has worked for some very good teams of mine.