Refactoring: Change Your Project's Culture

Introduction

Several years ago, the company I was working for was renovating its technical interview process and was soliciting interview questions from its employees.

I was a junior developer at the time, and as the more senior software developers contributed advanced Java, Ruby, and .NET puzzlers, I scrolled through the categories doubting I had anything to add.

But while browsing I noticed a section for "design best practices", and on a whim I wrote up two questions on refactoring that had been inspired by some bad code I had recently encountered in a legacy application we were supporting. I felt they were really basic, and probably would not actually get used, but the section was empty and it was a contribution... So I typed them up.

Much to my surprise, those questions have proved to be the ones most frequently used in technical interviews, and the ones interview candidates most frequently struggle with. An alarming majority of software developers do not give satisfactory answers to these questions even while doing well on other parts of the technical interview.

The two questions were:

What is wrong with this code?

boolean someCondition = someMethod();  
if (someCondition) {  
    doA();
    doB();
    doC();
} else {
    doX();
    doY();
    doA();
    doB();
    doC();
}

And:

What is wrong with this code?

if (someExpensiveCalculation(x, y, z)) {  
    doX();
}
doA();  
doB();  
doC();  
System.out.println("Some message...");  
if (someExpensiveCalculation(x, y, z)) {  
    doX();
}

Perhaps it was the wording... (When a software developer hears "what is wrong..." they look for a bug.) But even after assurances that the code compiles and runs correctly, as many as 80% of software developer interviewees still struggled with answering these questions correctly.

Why?

Both questions center on refactoring.

The first question is simply a test of whether or not the duplication of

doA();  
doB();  
doC();  

bothers you as a software developer. The code can just as easily be rewritten to:

boolean someCondition = someMethod();  
if (!someCondition) {  
    doX();
    doY();
}
doA();  
doB();  
doC();  

(doA(), doB(), and doC() are executed regardless of the value of someCondition, so they can move out of the conditional. This reduces the amount of code on the screen, which improves readability, and removes duplication, which reduces the likelihood of errors the next time the code is updated.)

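The second question is similar in that it encourages pulling the result of the call to someExpensiveCalculation() into a separate variable, caching the value instead of computing it twice. (This is safe as long as doA(), doB(), and doC() do not change anything the calculation depends on.)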

boolean result = someExpensiveCalculation(x, y, z);  
if (result) { doX(); }  
doA();  
doB();  
doC();  
System.out.println("Some message...");  
if (result) { doX(); }  
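
To make the saving concrete, here is a minimal sketch that runs both versions side by side. Everything in it is an assumption for illustration: the body of someExpensiveCalculation(), the call counter, and the trace string that stands in for doX() and doA() through doC(). It also assumes the calculation is a pure function of its arguments.

```java
final class CachedResultSketch {
    // Counts how often the (hypothetical) expensive calculation runs.
    static int expensiveCalls = 0;

    // Stand-in for someExpensiveCalculation(x, y, z); assumed to depend only on its arguments.
    static boolean someExpensiveCalculation(int x, int y, int z) {
        expensiveCalls++;
        return x + y > z;
    }

    // Original shape: the calculation is evaluated twice.
    static String original(int x, int y, int z) {
        StringBuilder trace = new StringBuilder();
        if (someExpensiveCalculation(x, y, z)) { trace.append("X"); }
        trace.append("ABC"); // stands in for doA(); doB(); doC();
        if (someExpensiveCalculation(x, y, z)) { trace.append("X"); }
        return trace.toString();
    }

    // Refactored shape: the result is computed once and reused.
    static String refactored(int x, int y, int z) {
        StringBuilder trace = new StringBuilder();
        boolean result = someExpensiveCalculation(x, y, z);
        if (result) { trace.append("X"); }
        trace.append("ABC"); // stands in for doA(); doB(); doC();
        if (result) { trace.append("X"); }
        return trace.toString();
    }

    public static void main(String[] args) {
        expensiveCalls = 0;
        String before = original(1, 2, 0);
        int callsBefore = expensiveCalls;

        expensiveCalls = 0;
        String after = refactored(1, 2, 0);
        int callsAfter = expensiveCalls;

        System.out.println(before + " using " + callsBefore + " expensive call(s)");
        System.out.println(after + " using " + callsAfter + " expensive call(s)");
    }
}
```

Both versions produce the same trace; the refactored one simply pays for the calculation once.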

Why the struggle over this?

The answer probably involves many more reasons than I will write about here. It's a complex issue that involves several facets of human nature, and it is not something you can distill into a couple of bullet points. So in this post I want to focus specifically on one class of reasons why refactoring is neglected, all involving project culture, and on what you can do to change it.

NOTE: If you need a solid definition of what refactoring is and how to do it, I highly recommend this book.

Why Should I Refactor?

Before discussing why we don't refactor, it is worth asking: why do we want to refactor?

Refactoring is the art of rewriting existing code to simplify it, remove redundancy, and/or restructure it to perform a broader function. Defining refactoring that way seems to make it obvious why you would want to refactor... After all, who doesn't want simpler, easier-to-maintain code that constantly adapts to meet new requirements?

The problem is that refactoring has a cost. It takes time to refactor. In fact, it often takes longer to refactor code than it does to simply slap in a new feature or hack in a tweak to functionality. It also introduces risk by changing much larger amounts of code than the bare minimum necessary to solve the problem at hand.

What makes refactoring important, though, is that its cost is short-term and one-time, while its benefits are long-term and recurring. It is like investing money. When you invest a small amount of money early, it becomes a large amount of money later.

When you reduce the complexity of code, you make it easier to change that code the next time. Each time you change that code and your life is easier because you have refactored it, you are earning interest on your one-time investment. Likewise, removing duplication in code means it will be less work the next time you need to change that particular vein of logic, and it reduces the risk of introducing bugs because you forgot to change the same logic in multiple places. Again, you are earning interest on your one-time refactoring.

Redesigning your code to meet new demands is the most extreme example. While it takes the most time, it also has the most lasting benefits, because it keeps your code closely aligned to the problems you are solving and avoids the accumulation of technical debt.

As an example, I recently refactored my project's audit system to use a "plug-in" architecture, even though, at the time, we had only a single, simple file-based logger. Just recently we got a request to add two additional audit streams to the application. Because of that prior refactoring, instead of having to trawl through the whole code base to find all the auditing logic and cram in two new streams, I can simply write the streams as well-encapsulated plug-ins, add them to a config file, and be done. Hours instead of a few days.
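
For readers who have not built something like this, here is a rough sketch of the shape such a plug-in design can take. It is not the code from my project: the names AuditStream, AuditService, InMemoryAuditStream, and ConsoleAuditStream are all hypothetical, and the list of enabled names stands in for whatever a real config file would contain.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

final class AuditPluginSketch {

    // Hypothetical plug-in contract: every audit stream receives every event.
    interface AuditStream {
        void record(String event);
    }

    // Illustrative plug-in that keeps events in memory; a file-based logger
    // would implement the same interface.
    static final class InMemoryAuditStream implements AuditStream {
        final List<String> events = new ArrayList<>();

        @Override
        public void record(String event) {
            events.add(event);
        }
    }

    // Illustrative plug-in that writes events to standard output.
    static final class ConsoleAuditStream implements AuditStream {
        @Override
        public void record(String event) {
            System.out.println("AUDIT " + event);
        }
    }

    static final class AuditService {
        private final List<AuditStream> streams = new ArrayList<>();

        // enabledNames stands in for the entries of a config file;
        // registry maps each configured name to a factory for that stream.
        AuditService(List<String> enabledNames, Map<String, Supplier<AuditStream>> registry) {
            for (String name : enabledNames) {
                Supplier<AuditStream> factory = registry.get(name);
                if (factory == null) {
                    throw new IllegalArgumentException("Unknown audit stream: " + name);
                }
                streams.add(factory.get());
            }
        }

        // Callers audit through this one method and never name a concrete stream.
        void audit(String event) {
            for (AuditStream stream : streams) {
                stream.record(event);
            }
        }
    }

    public static void main(String[] args) {
        InMemoryAuditStream memory = new InMemoryAuditStream();
        Map<String, Supplier<AuditStream>> registry = new HashMap<>();
        registry.put("memory", () -> memory);
        registry.put("console", ConsoleAuditStream::new);

        // Enabling another stream means one more name here (in practice, one more config entry).
        AuditService service = new AuditService(List.of("memory", "console"), registry);
        service.audit("user=alice action=login");
        System.out.println("Events held in memory: " + memory.events.size());
    }
}
```

The point of the design is that the rest of the application only ever calls audit(); a new stream is a new class plus a new config entry, with no hunting through existing code.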

This all sounds good in theory, but how about some numbers to make it more convincing? The current application that I work on is just over 10 years old. When we started working on it, it had over 250,000 lines of Java code. Just over a year later, the amount of functionality in the application has increased (i.e., we have added new features and expanded existing ones) but the lines of code have dropped to 36,000. That is a decrease of more than 85% in the amount of code that we have to maintain and change, while delivering more functionality than before. This was all done through a sweeping regime of refactoring that has been built into our development process.

As we worked tasks, we refactored the application component by component. Any time we worked a task on a new component, we took extra time to refactor that component to some degree. Over time, the refactoring had a ripple effect, allowing broader and deeper refactoring to happen faster and faster.

What has that bought us? We do not measure velocity very strictly on my project (I personally think that measuring velocity is overrated), but by conservative estimates our velocity is 50% to 60% faster than before, and maybe as high as a twofold increase. Because the code is now so tightly factored, changes can be implemented very quickly without having to wade through lots of redundant code and get entangled in errors caused by unforeseen consequences.

Our customer gave us the latitude to refactor as we went, trusting us to fulfill their requirements on time and improve the code as we did it. Now they are reaping the benefits. We are emptying the backlog faster than they can fill it and, according to their measure, the application has improved more in a year under our team than it had in the prior five under a much larger team.

That is why we refactor. Refactoring keeps a project lean and healthy and helps it grow at a good pace without eventually crumbling under its own weight.

Objections

Since originally writing those interview questions I have worked on several large projects and have gradually come to realize that the culture of large software projects is often what discourages refactoring. There are many good software developers I have worked with who understand the importance of refactoring but are inhibited from doing it because the project is resistant to refactoring, for various reasons.

I want to address several of the common reasons that I have encountered and argue why these objections are shortsighted.

"I have already paid for that code"

My customer, several projects back, literally used those words when faced with the decision whether or not to rewrite some particularly horribly written, buggy, and underperforming code. Do not fix badly written code, because I already paid for it. I managed to keep my outside voice muted, but my inside voice was screaming "yes, and now you are going to keep paying and paying and paying!"

There are two fallacies in this mindset. The first is the assumption that code you already have does not cost you anything to keep: I bought a piece of code to do a job, and since it is still doing that job I do not need to buy another one.

If all code were well written, this would be valid. But not all code is written equally well. Code can have a lasting effect on a project, either by adding value like earned interest on an investment or by siphoning value away like the interest on a debt.

A sloppy job from a programmer rushing to get something done late at night can introduce bad code / bad design into your system that costs extra effort to maintain for years afterwards unless it is rewritten. If you have bad code, you will keep paying for it until you fix it. The sooner you fix it the better.

What makes this worse is that bad code tends to spread, both because it often requires other design compromises to keep it working and also because of the psychological impact on the team.

Second, this objection wrongly assumes that the premises of the project have not changed since a given piece of code was written. The requirements of a project change over time. Sometimes faster, sometimes slower, but they always change.

Embracing this change is the core philosophy of agile project management, and it should affect the code base as well as the project management technique. Code that was written years ago in a project's life has probably grown stale. It was good code when it was written, but circumstances and technology have changed, and what once was good code is now bad code that needs an update.

Projects are risk averse

I once worked on a project on which a senior developer took several opportunities to chastise me for refactoring because I was "changing a lot of things that could break." (On top of that he was also mad because my refactoring made it hard to read the svn diffs when we wanted to perform code reviews... I'll leave that for the jury to decide.)

But in his defense, he had a fair point. Refactoring does increase your odds (at least at first) of introducing new bugs (or exposing old ones you hadn't seen before).

The problem is that "risk aversion" is often equated with "change aversion" and is a paralyzing force that prevents you from solving problems that actually reduce risk in the long run. A closely related problem is that people also often define risk only one way: the chances that a bug will appear to your users.

This mindset of avoiding risk paralyzes large projects and discourages refactoring because refactoring is perceived as "changing a lot of code that might break" instead of as "removing technical debt so we have fewer bugs in the long run." In order to overcome this objection you need to first redefine what risk is, and then create tools to reduce it.

Viewing risk only as the chance that a bug will impact a user is naive; it ignores other factors, such as the time it takes to resolve the bug.

As an example, I worked on a project where our users found a bug in our production system that was preventing a certain class of them from performing their job. They were at a complete standstill because of this bug, which, after investigation, was determined to require only a simple JavaScript fix to our web tier.

But management refused to let us patch it directly in production, and instead directed us to follow the normal test / release procedure, which lasted a minimum of three weeks. Why? Because "deploying to production without testing is risky".

Tell me: what is more risky? Having a three-week minimum to touch production, or making a live JavaScript fix which could be safely reverted in less than 30 seconds?

Refactoring your code not only reduces risk directly by fixing and simplifying code, it also reduces risk indirectly by (1) leaving less code to fix if a bug does surface and (2) educating developers about the code so they can diagnose and fix bugs more quickly.

Risk should not be seen as just the chance that a bug will occur, but more holistically: how likely is it to occur, how quickly can we fix it, and how can we make sure it does not happen again?

Refactoring does have the added chance that you will unintentionally introduce a new bug because you are changing swaths of existing code. But instead of being paralyzed by the fear of introducing a bug, focus on building tools that make refactoring safer.

A large suite of automated unit tests goes a long way toward ensuring that a refactoring will be safe. It also gives you the ability to learn from past bugs by writing a new unit test for each bug you fix. Shortening your release process so that bugs in production can be addressed immediately, and not enmeshed behind weeks of process, also goes a long way to mitigate the true risk of refactoring while still reaping its benefits.
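
As a small illustration, here is a sketch of the kind of check that makes the refactoring from the first interview question safe. It uses plain Java rather than a real test framework to stay self-contained, and recording the doA(), doB(), and so on calls as strings in a list is my own assumption about how you might observe the behavior.

```java
import java.util.ArrayList;
import java.util.List;

final class RefactoringSafetyNetSketch {

    // Original shape of the first interview question; each call is recorded by name.
    static List<String> original(boolean someCondition) {
        List<String> calls = new ArrayList<>();
        if (someCondition) {
            calls.add("doA");
            calls.add("doB");
            calls.add("doC");
        } else {
            calls.add("doX");
            calls.add("doY");
            calls.add("doA");
            calls.add("doB");
            calls.add("doC");
        }
        return calls;
    }

    // Refactored shape with the duplication removed.
    static List<String> refactored(boolean someCondition) {
        List<String> calls = new ArrayList<>();
        if (!someCondition) {
            calls.add("doX");
            calls.add("doY");
        }
        calls.add("doA");
        calls.add("doB");
        calls.add("doC");
        return calls;
    }

    // The "test": both branches must produce the same calls in the same order.
    static void check() {
        for (boolean condition : new boolean[] {true, false}) {
            if (!original(condition).equals(refactored(condition))) {
                throw new AssertionError("Behavior changed for someCondition=" + condition);
            }
        }
    }

    public static void main(String[] args) {
        check();
        System.out.println("Refactoring preserved behavior for both branches");
    }
}
```

In a real project the same idea would live in your unit test suite, written against the old code first and then kept green while you refactor.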

"We don't have time to refactor"

A common, and very reasonable, objection to refactoring is "We just don't have the time." You are in a position where deadlines are tight, there are ten "number 1 priorities", and your customer is constantly screaming "moar code!" It's hard to take the time out on a task to refactor your code base when you can just slap a fix in and move on to the next burning fire.

The problem is that this is a vicious cycle. The answer to the objection really is "you don't have the time to not refactor!"

The longer you postpone refactoring, the worse and more costly your code becomes to maintain, which consumes more and more of your time. Eventually you can even find yourself trapped in a spin cycle, where you cannot make any forward progress because you are simply trying to maintain the code / process that you have.

As hard as it may be, you need to stop and refactor. Refactoring costs you time in the short run, but buys you time in the long run. Like good finances, you have to stop charging the credit card and start paying down the principal or you will max out your credit line and have no options left.

If delaying refactoring is a vicious downward spiral, building in time to refactor is an upward one. The more you refactor, the easier and faster it becomes to do it, and the more time you save when implementing other stories. These time savings snowball, hopefully enabling you to eventually catch up to the needs of your customers (unless they are completely unreasonable...)

Like saving for your retirement, though, sacrifices have to be made to stop the crazy cycle and do the refactoring even when you are at your busiest. If you don't, it will only get worse.

A Call To Arms

In my experience, common sense seldom prevails over bureaucracy and the entrenched bad management that plagues many software projects. Unless you are fortunate enough to be at a developer-run startup or working in a cutting-edge technology company, you are going to be faced with a backward, stodgy culture that is antithetical to refactoring.

Instead of letting it beat you down though, take charge of your project! In many cases, I have found that the best thing to do is to just do it. Don't ask for permission, don't make it into a meeting, don't try to build a defense of it. Just refactor as you go.

It should be part of your style, inseparable from how you code. Don't view refactoring as "something else you do"; make it part of the fiber of your daily coding. You are not going to get fired for refactoring, and the best way to win others to your side is to show them the benefits of doing it.

(And if your boss really ends up firing you for refactoring, it was time to get a new job anyway... We're hiring.)

As a young industry we need to start building more respect for ourselves as professionals. We need to move away from the stereotype of software developers as sloppy hackers and more towards a customer / professional relationship resembling that of a medical doctor or lawyer. We have a professional responsibility to deliver what is good for our customers, even if they are asking for something which they don't realize is bad for them.