Code refactoring

Code refactoring is the process of changing a computer program's internal structure without modifying its external functional behavior, in order to improve internal quality attributes of the software. Reasons for refactoring include improving code readability, simplifying code structure, making code adhere to a given programming paradigm, and improving maintainability, performance, or extensibility.

Overview

In software engineering, "refactoring" source code means improving it without changing its overall results. The process could be informally referred to as "cleaning up" or "taking out the garbage." Refactoring neither fixes bugs nor adds new functionality, though it might precede either activity. Rather, it improves the understandability of the code, changes its internal structure and design, and removes dead code.

These changes are intended to make the code easier to comprehend, more maintainable, and more amenable to change. Refactoring is usually motivated by the difficulty of adding new functionality to a program or fixing a bug in it.

As refactoring is an operation that produces revised programs from originals, it is a special case of program transformation.

In extreme programming and other agile methodologies, refactoring is an integral part of the software development cycle: developers first write tests, then write code to make the tests pass, and finally refactor the code to improve its internal consistency and clarity. Automatic unit testing helps to preserve correctness of the refactored code.
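
As a minimal sketch of how a unit test can guard a refactoring, the following Python example runs the same check against a function before and after it is cleaned up. The function names, the item layout (price in cents, quantity, discount flag) and the 10% discount rule are all hypothetical, chosen only for illustration.

```python
import unittest

# All names and the discount rule below are hypothetical illustrations.


def order_total_before(items):
    """Original version: correct, but hard to read."""
    t = 0
    for it in items:
        if it[2]:
            t = t + it[0] * it[1] * 90 // 100
        else:
            t = t + it[0] * it[1]
    return t


def line_total(price_cents, quantity, discounted):
    """Price of one order line in cents, with an optional 10% discount."""
    subtotal = price_cents * quantity
    return subtotal * 90 // 100 if discounted else subtotal


def order_total_after(items):
    """Refactored version: same results, clearer structure."""
    return sum(line_total(price, qty, disc) for price, qty, disc in items)


class OrderTotalTest(unittest.TestCase):
    ITEMS = [(1000, 2, False), (500, 1, True)]

    def test_behavior_is_preserved(self):
        # 1000 * 2 + (500 * 1 reduced by 10%) = 2000 + 450 = 2450 cents
        self.assertEqual(order_total_before(self.ITEMS), 2450)
        self.assertEqual(order_total_after(self.ITEMS), 2450)
```

Saved as a module, the test can be run with `python -m unittest`. If the refactored function ever returned a different total, the test would fail and flag the change in behavior.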

Code smells are a heuristic to indicate when to refactor, and what specific refactoring techniques to use.

The simplest example of a refactoring is to change an identifier (such as a variable name) into something more meaningful, such as from a single letter 'i' to 'interestRate'. While the concept of renaming an identifier is relatively simple, its implementation may not be. A global search-and-replace operation on a body of text can produce undesired results because of identifiers that contain the original identifier as a substring, overloaded identifiers, and scoping rules. A more complex refactoring is to turn the code within a block into a subroutine. An even more complex refactoring is to replace a conditional statement with polymorphism.
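
The substring problem with search-and-replace can be illustrated with a short Python sketch. The source snippet being renamed is made up for this example, and the regular-expression approach is shown only as a partial improvement, not as how refactoring tools actually work.

```python
import re

# A made-up two-line program in which the variable `i` should be renamed.
source = "i = 0.05\ninterest = principal * i\n"

# Naive textual replacement also rewrites the letter "i" inside
# other identifiers such as `interest` and `principal`.
naive = source.replace("i", "interestRate")

# Matching whole words only avoids the substring problem, but it is still
# purely textual: it knows nothing about scopes or overloaded names, so an
# unrelated `i` elsewhere (for example a loop counter) would be renamed too.
whole_word = re.sub(r"\bi\b", "interestRate", source)

print(naive)
print(whole_word)
```

A tool that renames reliably has to resolve each occurrence to its declaration, which is why automated rename support usually works on a parsed program rather than on raw text.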

While "cleaning up" code has happened for decades, the key insight in refactoring is to intentionally "clean up" code separately from adding new functionality, using a known catalogue of common useful refactoring methods, in conjunction with testing the code, to ensure that existing behavior is preserved. The new aspect is explicitly wanting to improve an existing design without altering its intent or behavior. Failure to perform refactoring can result in accumulating technical debt.

Some refactorings are difficult to apply in practice.^[1] Refactoring the business layer stored in a database schema is difficult or impossible, because of the schema transformation and data migration that must occur while the system may be under heavy use. Refactoring that affects an interface can also cause difficulties unless the programmer has access to all users of the interface. For example, a programmer changing the name of a method in an interface must either edit all references to the old name throughout the entire project or maintain a stub with the old method name. That stub would then call the method under its new name.
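
The stub approach might look like the following Python sketch. The `Account` class and the method names `calc` and `apply_interest` are hypothetical; the deprecation warning is an optional extra that the text above does not require.

```python
import warnings


class Account:
    """Hypothetical class whose method `calc` was renamed to `apply_interest`."""

    def __init__(self, balance):
        self.balance = balance

    def apply_interest(self, rate):
        """New, more descriptive name for the operation."""
        self.balance += self.balance * rate
        return self.balance

    def calc(self, rate):
        """Stub kept under the old name so that existing callers still work."""
        warnings.warn(
            "calc() is deprecated; use apply_interest()",
            DeprecationWarning,
            stacklevel=2,
        )
        # The stub only forwards the call to the method under its new name.
        return self.apply_interest(rate)
```

Callers outside the programmer's control can keep using `calc` for a while, and the stub can be deleted once they have migrated.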

Hardware refactoring

While the term refactoring originally referred exclusively to refactoring of software code, in recent years code written in hardware description languages (HDLs) has also been refactored. The term hardware refactoring is used as shorthand for refactoring of code in hardware description languages. Since HDLs are not considered to be programming languages by most hardware engineers,^[2] hardware refactoring is considered a separate field from traditional code refactoring.

Automated refactoring of analog hardware descriptions (in VHDL-AMS) has been proposed by Zeng and Huss.^[3] In their approach, refactoring preserves the simulated behavior of a hardware design. The non-functional property that improves is that the refactored code can be processed by standard synthesis tools, while the original code cannot. Refactoring of digital HDLs, albeit manual, has also been investigated by Synopsys fellow Mike Keating.^[4]^[5] His goal is to make complex systems easier to understand, which increases designers' productivity.

In the summer of 2008, there was an intense discussion about refactoring of VHDL code on the news://comp.lang.vhdl newsgroup.^[6] The discussion revolved around a specific manual refactoring performed by one engineer, and the question of whether automated tools for such refactoring exist. The responses did not suggest any hardware refactoring tools.

History

Code has been refactored informally for years, but William Opdyke's 1992 Ph.D. dissertation^[7] is the first known paper to specifically examine refactoring,^[8] although all the theory and machinery have long been available as program transformation systems.

Martin Fowler's book Refactoring: Improving the Design of Existing Code^[1] is the canonical reference. These resources provide a catalog of common refactoring methods; each method has a description of how to apply it and indicators for when it should (or should not) be applied.

The first known use of the term "refactoring" in the published literature was in a September 1990 article by William F. Opdyke and Ralph E. Johnson.^[9] Opdyke's Ph.D. thesis,^[7] published in 1992, also used this term.^[8] The term "refactoring" was almost certainly used before then.

The term "factoring" has been used in the Forth community since at least the early 1980s. Chapter Six of Leo Brodie's book Thinking Forth (1984) is dedicated to the subject.

In extreme programming, the Extract Method refactoring technique has essentially the same meaning as factoring in Forth: to break down a "word" (or function) into smaller, more easily maintained functions.

List of refactoring techniques

The following is a partial list of code refactorings. A longer list can be found in Fowler's Refactoring book and on Fowler's refactoring website.^[10]

Techniques that allow for more abstraction
- Encapsulate Field - force code to access the field with getter and setter methods
- Generalize Type - create more general types to allow for more code sharing
- Replace type-checking code with State/Strategy
- Replace conditional with polymorphism
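
As an illustration of the last item, the following Python sketch replaces a conditional on a type tag with polymorphic classes. The shape example and all names in it are hypothetical, picked only because the branches are easy to compare.

```python
import math


# Before: one function branches on a "kind" tag stored in the data.
def area_before(shape):
    if shape["kind"] == "circle":
        return math.pi * shape["radius"] ** 2
    elif shape["kind"] == "rectangle":
        return shape["width"] * shape["height"]
    raise ValueError("unknown shape kind: " + shape["kind"])


# After: each branch becomes a method on its own class.
class Circle:
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Rectangle:
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


def total_area(shapes):
    # The caller no longer needs to know which kinds of shape exist.
    return sum(shape.area() for shape in shapes)
```

Adding a new kind of shape now means adding a class with an `area` method, rather than editing every conditional that inspects the tag.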

Techniques for breaking code apart into more logical pieces
- Extract Method - turn part of a larger method into a new method. Breaking code down into smaller pieces makes it easier to understand. This also applies to functions.
- Extract Class - move part of the code from an existing class into a new class.
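
A minimal Python sketch of Extract Method, applied to a plain function, is shown below. The invoice example and the names `invoice_text_before`, `total_due` and `invoice_text_after` are hypothetical.

```python
def invoice_text_before(customer, amounts):
    """Before: building the header and summing the amounts are interleaved."""
    lines = ["Invoice for " + customer]
    total = 0
    for amount in amounts:
        total += amount
    lines.append("Total due: " + str(total))
    return "\n".join(lines)


def total_due(amounts):
    """Extracted from the original function: the summing loop, now named."""
    total = 0
    for amount in amounts:
        total += amount
    return total


def invoice_text_after(customer, amounts):
    """After: the same output, with the calculation delegated to total_due."""
    lines = ["Invoice for " + customer]
    lines.append("Total due: " + str(total_due(amounts)))
    return "\n".join(lines)
```

The extracted function can be tested and reused on its own, and the remaining function reads as a short description of what an invoice contains.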

Techniques for improving names and location of code
- Move Method or Move Field - move to a more appropriate class or source file
- Rename Method or Rename Field - change the name to one that better reveals its purpose
- Pull Up - in OOP, move to a superclass
- Push Down - in OOP, move to a subclass
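
The Pull Up refactoring can be sketched in Python as follows. The employee classes and the `annual_cost` method are hypothetical; the "Before" suffix exists only so that both versions can sit in one file.

```python
# Before: two unrelated classes duplicate the same field and method.
class EngineerBefore:
    def __init__(self, monthly_salary):
        self.monthly_salary = monthly_salary

    def annual_cost(self):
        return self.monthly_salary * 12


class SalespersonBefore:
    def __init__(self, monthly_salary):
        self.monthly_salary = monthly_salary

    def annual_cost(self):
        return self.monthly_salary * 12


# After: the shared field and method are pulled up into a superclass.
class Employee:
    def __init__(self, monthly_salary):
        self.monthly_salary = monthly_salary

    def annual_cost(self):
        return self.monthly_salary * 12


class Engineer(Employee):
    pass


class Salesperson(Employee):
    pass
```

Push Down is the reverse move: a member that only one subclass actually uses is moved out of the superclass and into that subclass.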

Automated code refactoring

Many software editors and IDEs have automated refactoring support. The following are a few of these editors, sometimes called refactoring browsers.

- IntelliJ IDEA (for Java)
- Eclipse's Java Development Toolkit (JDT)
- NetBeans (for Java)
- Embarcadero Delphi
- Bicycle Repair Man (for Python, works with emacs and vi)
- Visual Studio (for .NET)
- ReSharper (an add-on for Visual Studio)
- Refactor Pro (an add-on for Visual Studio)
- Visual Assist (an add-on for Visual Studio with refactoring support for VB, VB.NET, C# and C++)
- DMS Software Reengineering Toolkit (implements large-scale refactoring for C, C++, C#, COBOL, Java, PHP and other languages)
- Photran (a Fortran plugin for the Eclipse IDE)

References

[1] Fowler, Martin (1999). Refactoring. Addison-Wesley. ISBN 0-201-48567-2.
[2] Hardware_description_languages#HDL_and_programming_languages
[3] Kaiping Zeng, Sorin A. Huss: "Architecture refinements by code refactoring of behavioral VHDL-AMS models". ISCAS 2006.
[4] M. Keating: "Complexity, Abstraction, and the Challenges of Designing Complex Systems", in DAC'08 tutorial "Bridging a Verification Gap: C++ to RTL for Practical Design".
[5] M. Keating, P. Bricaud: "Reuse Methodology Manual for System-on-a-Chip Designs", Kluwer Academic Publishers, 1999.
[6] http://newsgroups.derkeiler.com/Archive/Comp/comp.lang.vhdl/2008-06/msg00173.html
[7] Opdyke, William F. (1992). "Refactoring Object-Oriented Frameworks" (compressed Postscript). Ph.D. thesis. University of Illinois at Urbana-Champaign. Retrieved 2008-02-12.
[8] Martin Fowler, "MF Bliki: EtymologyOfRefactoring".
[9] Opdyke, William F. (1990). "Refactoring: An Aid in Designing Application Frameworks and Evolving Object-Oriented Systems". Proceedings of the Symposium on Object Oriented Programming Emphasizing Practical Applications (SOOPPA). ACM.
[10] Refactoring techniques in Fowler's refactoring website.

External links

What Is Refactoring? (c2.com article)
Martin Fowler's homepage about refactoring
Aspect-Oriented Refactoring by Ramnivas Laddad
A Survey of Software Refactoring by Tom Mens and Tom Tourwé
Refactoring To Patterns Catalog
Extract Boolean Variable from Conditional (a refactoring pattern not listed in the above catalog)
Test-Driven Development With Refactoring
