📖 Refactoring: Improving the Design of Existing Code


As is often the case with refactoring, the early stages were mostly driven by trying to understand what was going on. A common sequence is: Read the code, gain some insight, and use refactoring to move that insight from your head back into the code. — location: 1459 ^ref-25001


Whenever I’ve shown people how I refactor, they are surprised by how small my steps are, each step leaving the code in a working state that compiles and passes its tests. — location: 1472 ^ref-50183


If someone says their code was broken for a couple of days while they are refactoring, you can be pretty sure they were not refactoring. — location: 1495 ^ref-64759


I use “restructuring” as a general term to mean any kind of reorganizing or cleaning up of a code base, and see refactoring as a particular kind of restructuring. — location: 1497 ^ref-6284


Kent Beck came up with a metaphor of the two hats. When I use refactoring to develop software, I divide my time between two distinct activities: adding functionality and refactoring. When I add functionality, I shouldn’t be changing existing code; I’m just adding new capabilities. I measure my progress by adding tests and getting the tests to work. When I refactor, I make a point of not adding functionality; I only restructure the code. I don’t add any tests (unless I find a case I missed earlier); I only change tests when I have to accommodate a change in an interface. — location: 1513 ^ref-43818


I can visualize this state of affairs with the following pseudograph: — location: 1563 ^ref-7685


The best time to refactor is just before I need to add a new feature to the code base. — location: 1584 ^ref-12410


Whenever I have to think to understand what the code is doing, I ask myself if I can refactor the code to make that understanding more immediately apparent. — location: 1602 ^ref-23835


Whether I’m adding a feature or fixing a bug, refactoring helps me do the immediate task and also sets me up to make future work easier. This is an important point that’s frequently missed. Refactoring isn’t an activity that’s separated from programming—any more than you set aside time to write if statements. — location: 1626 ^ref-58094


But I think the most dangerous way that people get trapped is when they try to justify refactoring in terms of “clean code,” “good engineering practice,” or similar moral reasons. The point of refactoring isn’t to show how sparkly a code base is—it is purely economic. We refactor because it makes us faster—faster to add features, faster to fix bugs. It’s important to keep that in front of your mind and in front of communication with others. — location: 1722 ^ref-51328


Perhaps the calling code is owned by a different team and I don’t have write access to their repository. Perhaps the function is a declared API used by my customers—so I can’t even tell if it’s being used, let alone by who and how much. Such functions are part of a published interface—an interface that is used by clients independent of those who declare the interface. Code ownership boundaries get in the way of refactoring — location: 1731 ^ref-33035


Integrating branches that are four weeks old is more than twice as hard as those that are a couple of weeks old. Many people, therefore, argue for keeping feature branches short—perhaps just a couple of days. Others, such as me, want them even shorter than that. This is an approach called Continuous Integration (CI), also known as Trunk-Based Development. With CI, each team member integrates with mainline at least once per day. — location: 1763 ^ref-27305


Refactorings often involve making lots of little changes all over the code base—which are particularly prone to semantic merge conflicts (such as renaming a widely used function). — location: 1769 ^ref-61557


CI and refactoring work well together, which is why Kent Beck combined them in Extreme Programming. — location: 1772 ^ref-15848


Refactoring can be a fantastic tool to help understand a legacy system. — location: 1807 ^ref-40272


Yagni makes it easier to do refactoring. This is because it’s easier to change a simple system than one that has lots of speculative flexibility included. — location: 1893 ^ref-19613


The second approach is the constant attention approach. Here, every programmer, all the time, does whatever she can to keep performance high. This is a common approach that is intuitively attractive—but it does not work very well. Changes that improve performance usually make the program harder to work with. This slows development. This would be a cost worth paying if the resulting software were quicker—but usually it is not. — location: 1913 ^ref-470


Even if you know exactly what is going on in your system, measure performance, don’t speculate. You’ll learn something, and nine times out of ten, it won’t be that you were right! — Ron Jeffries — location: 1940 ^ref-42561


To do refactoring properly, the tool has to operate on the syntax tree of the code, not on the text. Manipulating the syntax tree is much more reliable to preserve what the code is doing. This is why at the moment, most refactoring capabilities are part of powerful IDEs—they use the syntax tree not just for refactoring but also for code navigation, linting, and the like. This collaboration between text and syntax tree is what takes them beyond text editors. — location: 2007 ^ref-23160


The power of using the syntax tree to analyze and refactor programs is a compelling advantage for IDEs over simple text editors, but many programmers prefer the flexibility of their favorite text editor and would like to have both. A technology that’s currently gaining momentum is Language Servers [langserver]: software that will form a syntax tree and present an API to text editors. — location: 2029 ^ref-15612


This book has taught refactoring to many people, but I have focused more on a refactoring reference than on taking readers through the learning process. If you are looking for such a book, I suggest Bill Wake’s Refactoring Workbook [Wake] that contains many exercises to practice refactoring. — location: 2036 ^ref-7397


Josh Kerievsky tied these two worlds closely together with Refactoring to Patterns [Kerievsky], which looks at the most valuable patterns from the hugely influential “Gang of Four” book [gof] and shows how to use refactoring to evolve towards them. — location: 2040 ^ref-47533


Refactoring HTML [Harold] (by Elliotte Rusty Harold). — location: 2046 ^ref-46624


Michael Feathers’s Working Effectively with Legacy Code [Feathers] is primarily a book about how to think about refactoring an older codebase with poor test coverage. — location: 2047 ^ref-63817


Mysterious Name — location: 2074 ^ref-3627


Long Function In our experience, the programs that live best and longest are those with short functions. Programmers new to such a code base often feel that no computation ever takes place—that the program is an endless sequence of delegation. When you have lived with such a program for a few years, however, you learn just how valuable all those little functions are. All of the payoffs of indirection—explanation, sharing, and choosing—are supported by small functions. — location: 2099 ^ref-64987


A heuristic we follow is that whenever we feel the need to comment something, we write a function instead. Such a function contains the code that we wanted to comment but is named after the intention of the code rather than the way it works. — location: 2109 ^ref-55833


They often signal this kind of semantic distance. A block of code with a comment that tells you what it is doing can be replaced by a method whose name is based on the comment. Even a single line is worth extracting if it needs explanation. — location: 2127 ^ref-46539
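
For example, a minimal sketch of this heuristic in my own words (the invoice shape is made up): the comment becomes a function name.

```js
// Before: a comment explains what the block computes.
function printInvoice(invoice) {
  // calculate outstanding amount
  let outstanding = 0;
  for (const order of invoice.orders) {
    outstanding += order.amount;
  }
  console.log(`amount owing: ${outstanding}`);
}

// After: the block is extracted and named after its intention,
// so the comment is no longer needed.
function printInvoiceRefactored(invoice) {
  console.log(`amount owing: ${calculateOutstanding(invoice)}`);
}

function calculateOutstanding(invoice) {
  return invoice.orders.reduce((total, order) => total + order.amount, 0);
}
```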


Conditionals and loops also give signs for extractions. Use Decompose Conditional (260) to deal with conditional expressions. — location: 2129 ^ref-60543
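
A sketch of Decompose Conditional (260) with illustrative names of my own, not the book's code:

```js
// Before: the condition and both branches all have to be decoded by the reader.
function baseCharge(date, quantity, plan) {
  if (date >= plan.summerStart && date <= plan.summerEnd) {
    return quantity * plan.summerRate;
  } else {
    return quantity * plan.regularRate + plan.regularServiceCharge;
  }
}

// After: each piece gets an intention-revealing function.
function baseChargeDecomposed(date, quantity, plan) {
  return isSummer(date, plan)
    ? summerCharge(quantity, plan)
    : regularCharge(quantity, plan);
}

function isSummer(date, plan) {
  return date >= plan.summerStart && date <= plan.summerEnd;
}

function summerCharge(quantity, plan) {
  return quantity * plan.summerRate;
}

function regularCharge(quantity, plan) {
  return quantity * plan.regularRate + plan.regularServiceCharge;
}
```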


With loops, extract the loop and the code within the loop into its own method. — location: 2135 ^ref-28561
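
A small sketch of that move (my own example):

```js
// Before: the accumulation loop is tangled up with the reporting code.
function reportYoungestAge(people) {
  let youngest = Infinity;
  for (const person of people) {
    if (person.age < youngest) youngest = person.age;
  }
  console.log(`youngest: ${youngest}`);
}

// After: the loop lives in a function named for what it computes.
function reportYoungestAgeRefactored(people) {
  console.log(`youngest: ${youngestAge(people)}`);
}

function youngestAge(people) {
  return Math.min(...people.map(person => person.age));
}
```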


Data is more awkward to manipulate than functions. Since using a function usually means calling it, I can easily rename or move a function while keeping the old function intact as a forwarding function — location: 3393 ^ref-20858


(so my old code calls the old function, which calls the new function). I’ll usually not keep this forwarding function around for long, but it does simplify the refactoring. Data is more awkward because I can’t do that. — location: 3394 ^ref-26486
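
A sketch of the forwarding-function trick with hypothetical names:

```js
// The new, better-named function.
function calculateBaseCharge(usage) {
  return usage.quantity * usage.rate;
}

// The old name is kept temporarily as a pure forwarder, so existing callers
// keep working and passing tests while they are migrated one at a time.
// Once no callers remain, the forwarder is deleted.
function calcBC(usage) {
  return calculateBaseCharge(usage);
}
```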


When I have a function that gives me a value and has no observable side effects, I have a very valuable thing. I can call this function as often as I like. I can move the call to other places in a calling function. It’s easier to test. In short, I have a lot less to worry about. — location: 6949 ^ref-51813
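
A tiny illustration of such a side-effect-free query (mine, not the book's):

```js
// A pure query: it reads data and returns a value, nothing else.
// It can be called any number of times, moved freely within its callers,
// memoized, or tested in isolation without setting up any mutable state.
function totalOutstanding(customer) {
  return customer.invoices
    .filter(invoice => !invoice.paid)
    .reduce((sum, invoice) => sum + invoice.amount, 0);
}
```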


Mutable data isn’t a big problem when it’s a variable whose scope is just a couple of lines—but its risk increases as its scope grows. — location: 2192 ^ref-40954


When we make a change, we want to be able to jump to a single clear point in the system and make the change. When you can’t do this, you are smelling one of two closely related pungencies. — location: 2201 ^ref-25811


context boundaries are usually unclear in the early days of a program and continue to shift as a software system’s capabilities change. — location: 2208 ^ref-33556


Shotgun surgery is similar to divergent change but is the opposite. You whiff this when, every time you make a change, you have to make a lot of little edits to a lot of different classes. — location: 2219 ^ref-15964


Feature Envy When we modularize a program, we are trying to separate the code into zones to maximize the interaction inside a zone and minimize interaction between zones. A classic case of Feature Envy occurs when a function in one module spends more time communicating with functions or data inside another module than it does within its own module. We’ve lost count of the times we’ve seen a function invoking half-a-dozen getter methods on another object to calculate some value. — location: 2237 ^ref-28461
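
A hedged sketch of the smell and the usual cure, Move Function, with made-up statement/account names:

```js
// Smell: the charge calculation lives in Statement but spends all of its
// time pulling data out of Account through getters.
class Statement {
  overdraftChargeFor(account) {
    const base = account.getOverdraftRate() * account.getDaysOverdrawn();
    return account.getDaysOverdrawn() > 7 ? base * 1.75 : base;
  }
}

// After Move Function: the logic sits next to the data it was envying.
class Account {
  constructor(daysOverdrawn, overdraftRate) {
    this._daysOverdrawn = daysOverdrawn;
    this._overdraftRate = overdraftRate;
  }
  getDaysOverdrawn() { return this._daysOverdrawn; }
  getOverdraftRate() { return this._overdraftRate; }
  overdraftCharge() {
    const base = this.getOverdraftRate() * this.getDaysOverdrawn();
    return this.getDaysOverdrawn() > 7 ? base * 1.75 : base;
  }
}
```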


We find many programmers are curiously reluctant to create their own fundamental types which are useful for their domain—such as money, coordinates, or ranges. — location: 2275 ^ref-62307


You can move out of the primitive cave into the centrally heated world of meaningful types by using Replace Primitive with Object (174). — location: 2281 ^ref-52735
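
A sketch of Replace Primitive with Object (174) for a money-like value (illustrative, not the book's code):

```js
// Before: money is passed around as a bare number, so currency rules and
// formatting get re-implemented wherever it is used.
// const price = 9.99;

// After: a small value type gives the concept a home for its behavior.
class Money {
  constructor(amount, currency) {
    this.amount = amount;
    this.currency = currency;
  }
  add(other) {
    if (other.currency !== this.currency) throw new Error('currency mismatch');
    return new Money(this.amount + other.amount, this.currency);
  }
  toString() {
    return `${this.amount.toFixed(2)} ${this.currency}`;
  }
}

const price = new Money(9.99, 'USD');
console.log(price.add(new Money(0.51, 'USD')).toString()); // 10.50 USD
```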


Speculative Generality Brian Foote suggested this name for a smell to which we are very sensitive. You get it when people say, “Oh, I think we’ll need the ability to do this kind of thing someday” and thus add all sorts of hooks and special cases to handle things that aren’t required. — location: 2318 ^ref-51602


Often, you’ll hear advice that all superclasses should be abstract. You’ll guess from our snide use of “traditional” that we aren’t going to advise this—at least not all the time. We do subclassing to reuse a bit of behavior all the time, and we find it a perfectly good way of doing business. There is a smell—we can’t deny it—but usually it isn’t a strong smell. — location: 2436 ^ref-39907


Don’t worry, we aren’t saying that people shouldn’t write comments. In our olfactory analogy, comments aren’t a bad smell; indeed they are a sweet smell. The reason we mention comments here is that comments are often used as a deodorant. It’s surprising how often you look at thickly commented code and notice that the comments are there because the code is bad. — location: 2447 ^ref-64830


Often, sections of code work only if certain conditions are true. This may be as simple as a square root calculation only working on a positive input value. With an object, it may require that at least one of a group of fields has a value in it. Such assumptions are often not stated but can only be deduced by looking through an algorithm. Sometimes, the assumptions are stated with a comment. A better technique is to make the assumption explicit by writing an assertion. An assertion is a conditional statement that is assumed to be always true. Failure of an assertion indicates a programmer error. Assertion failures should never be checked by other parts of the system. Assertions should be written so that the program functions equally correctly if they are all removed; indeed, some languages provide assertions that can be disabled by a compile-time switch. — location: 6876 ^ref-18265


I often see people encourage using assertions in order to find errors. While this is certainly a Good Thing, it’s not the only reason to use them. I find assertions to be a valuable form of communication—they tell the reader something about the assumed state of the program at this point of execution. I also find them handy for debugging, and their communication value means I’m inclined to leave them in once I’ve fixed the error I’m chasing. — location: 6882 ^ref-7912


When you see that a condition is assumed to be true, add an assertion to state it. — location: 6888 ^ref-50821
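
A sketch of Introduce Assertion with a hypothetical discount calculation:

```js
const assert = require('assert');

// The calculation silently assumes a discount rate has been set and is
// non-negative; the assertion states that assumption for the reader and
// fails loudly on programmer error rather than producing a wrong answer.
function applyDiscount(base, discountRate) {
  assert(discountRate !== undefined && discountRate >= 0,
    'discountRate must be set and non-negative');
  return base - base * discountRate;
}
```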


There is a real danger of overusing assertions. I don’t use assertions to check everything that I think is true, but only to check things that need to be true. Duplication is a particular problem, as it’s common to tweak these kinds of conditions. — location: 6911 ^ref-9937


When you feel the need to write a comment, first try to refactor the code so that any comment becomes superfluous. — location: 2458 ^ref-51203


Now it was easy to run tests—as easy as compiling. So I started to run tests every time I compiled. Soon, I began to notice my productivity had shot upward. — location: 2483 ^ref-39068


I became more aggressive about doing the tests. Instead of waiting for the end of an increment, I would add the tests immediately after writing a bit of function. — location: 2489 ^ref-8474


This chapter is just an introduction to the world of self-testing code, so it makes sense for me to start with the easiest case—which is code that doesn’t involve user interface, persistence, or external service interaction. Such separation, however, is a good idea in any case: Once this kind of business logic gets at all complicated, I will separate it from the UI mechanics so I can more easily reason about it and test it. — location: 2523 ^ref-26130


So I like to see every test fail at least once when I write it. My favorite way of doing that is to temporarily inject a fault into the code, for example: — location: 2587 ^ref-7520
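
The highlight cuts off before the example, so here is a hedged illustration of the kind of temporary fault I might inject (not the book's code; the Province fields are made up):

```js
class Province {
  constructor(demand, production) {
    this._demand = demand;
    this._production = production;
  }
  get shortfall() {
    // return this._demand - this._production;  // real implementation
    return 0; // temporary fault, injected only to watch the test go red
  }
}
```

Once the shortfall test fails for the expected reason, I revert the fault and watch it pass again.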


Testing should be risk-driven; — location: 2625 ^ref-31080


trying to write too many tests usually leads to not writing enough. — location: 2626 ^ref-46178


confidence in the tests at worst. Instead, I prefer to do this: — location: 2656 ^ref-65121

```js
describe('province', function() {
  let asia;
  beforeEach(function() {
```
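
The snippet is truncated; a sketch of how this shared-fixture shape typically fills out (Province and sampleProvinceData follow the chapter's example, and the expected value here is illustrative):

```js
const assert = require('assert');

describe('province', function() {
  let asia;
  beforeEach(function() {
    // Build a fresh standard fixture for every test, so no test can
    // corrupt data that another test depends on.
    asia = new Province(sampleProvinceData());
  });
  it('shortfall', function() {
    assert.strictEqual(asia.shortfall, 5);
  });
});
```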


I like my tests to all operate on a common bit of fixture, so I can become familiar with that standard fixture and see the various characteristics to test on it. — location: 2670 ^ref-1966


Notice how I’m playing the part of an enemy to my code. I’m actively thinking about how I can break it. I find that state of mind to be both productive and fun. It indulges the mean-spirited part of my psyche. — location: 2720 ^ref-59463


Putting in lots of validation checks between modules in the same code base can result in duplicate checks that cause more trouble than they are worth, especially if they duplicate validation done elsewhere. — location: 2738 ^ref-52141


Don’t let the fear that testing can’t catch all bugs stop you from writing tests that catch most bugs. — location: 2747 ^ref-48817


An important habit to get into is to respond to a bug by first writing a test that clearly reveals the bug. — location: 2768 ^ref-3576
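
A sketch of that habit with a hypothetical bug: the test is written first, reproduces the report, and fails, which both pins the bug down and guards against regression afterwards.

```js
const assert = require('assert');

// Hypothetical function with the reported bug: it stacks the loyalty
// discount on top of an item that is already discounted.
function priceWithLoyaltyDiscount(item) {
  return item.price * 0.9; // bug: ignores item.discounted
}

describe('loyalty discount', function() {
  it('is not applied to already-discounted items', function() {
    const item = { price: 100, discounted: true };
    assert.strictEqual(priceWithLoyaltyDiscount(item), 100); // fails until fixed
  });
});
```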


The best measure for a good enough test suite is subjective: How confident are you that if someone introduces a defect into the code, some test will fail? This isn’t something that can be objectively analyzed, and it doesn’t account for false confidence, but the aim of self-testing code is to get that confidence. — location: 2775 ^ref-37115


It is possible to write too many tests. One sign of that is when I spend more time changing the tests than the code under test—and I feel the tests are slowing me down. But while over-testing does happen, it’s vanishingly rare compared to under-testing. — location: 2778 ^ref-28150


Every refactoring in this book has a logical inverse refactoring, — location: 2828 ^ref-60261


The argument that makes most sense to me, however, is the separation between intention and implementation. If you have to spend effort looking at a fragment of code and figuring out what it’s doing, then you should extract it into a function and name the function after the “what.” — location: 2867 ^ref-26399

Created at 2024-04-05, last modified at 2025-08-18.
This is a Book note.