Without Haste: Object-Oriented Programming Notes

Encapsulation

The internal representation of an object is hidden
The hiding of data implementation by restricting access to accessors and mutators
Prevents outside users from setting values to invalid or inconsistent states
Reduces system complexity

Abstraction

Code to interfaces instead of to detailed implementations
Decompose complex systems into smaller components to manage complexity

Inheritance

Reuse code by establishing an object hierarchy

Polymorphism

Literally, one name with many forms

Overloading

You can have several methods with the same name and different parameter lists
The correct method is determined at compile-time
(The return type does not differentiate the method signatures)

Overriding

A derived class can override a base class method
The correct method is determined at run-time

These are the SOLID principles.

Single Responsibility

A class should have only one responsibility.
Therefore, it will have only one reason to change.

For example, class Calculator should not format its outputs for display.

Open/Closed

A class should be open for extension and closed for modification.
Similarly, make a class easy to extend so it does not have to be modified.

The goal is to design modules such that their behavior can be extended without modifying their source code.

For example, you might use the Decorator Pattern when writing a "discount price" module, instead of one class that contains all the discounts. The first design can be extended by adding new classes; the second requires editing the module whenever requirements change.

Liskov Substitution

Derived classes can be substituted for their base classes.

That means that anything you can call on the base class, and anything you can pass the base class into, will also work with any derived class.

Don't break base class functionality when writing derived classes.

Ex from Robert Martin: if you have base class Bird with Flying methods, do not make derived class Penguin under Bird because Penguins cannot fly.

(This principle is named for Barbara Liskov, who introduced the idea in 1987)

Interface Segregation

Clients should not be forced to depend on (or implement) methods they do not use.

For example, interface IShape should not include a method Volume() because 2D shapes do not have a volume but would be forced to implement the method.

Segregate (separate) the methods into as many different interfaces as makes sense.

Dependency Inversion

High-level modules should not depend on low-level modules. Both should depend on abstractions.

For example, a PasswordReminder class does not care what sort of database it accesses, so it should accept a database connection as an argument instead of creating its own. It should also be accepting the base class DBConnection rather than the more specific derived class MySQLConnection.

This will also make test cases much easier to write.
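
A minimal Python sketch of the PasswordReminder example, assuming a hypothetical DBConnection abstraction with a single fetch_email method. InMemoryConnection is an invented stand-in for a real driver such as MySQLConnection, which also shows why tests get easier.

```python
from abc import ABC, abstractmethod


class DBConnection(ABC):
    """Abstraction that both the high-level and low-level code depend on."""

    @abstractmethod
    def fetch_email(self, username):
        ...


class InMemoryConnection(DBConnection):
    """Hypothetical stand-in for a concrete driver such as MySQLConnection."""

    def __init__(self, rows):
        self._rows = rows

    def fetch_email(self, username):
        return self._rows[username]


class PasswordReminder:
    def __init__(self, connection):
        # The connection is passed in; this class never creates its own.
        self._connection = connection

    def remind(self, username):
        email = self._connection.fetch_email(username)
        return f"Reminder sent to {email}"


reminder = PasswordReminder(InMemoryConnection({"ann": "ann@example.com"}))
message = reminder.remind("ann")
print(message)
```

Because PasswordReminder only knows about DBConnection, a test can hand it the in-memory version without touching a real database.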

(Modules aka Packages aka Assemblies)

This discussion is about the Module level, but coupling and cohesion apply equally to every level of detail in programming.

How large is a Module? A programmer should be able to hold the whole Module in their head at one time.

Coupling

Coupling is the degree of interdependence between software modules.
For instance, the more Types from ModuleA that ModuleB depends on (uses), the more tightly coupled the two modules are.
The goal is to reduce coupling between Modules.

Benefits of low coupling:
- Reduced mental strain, because you can focus on one section of the code at a time.
- Supports parallel development, where teams can work on different parts of the code at once without affecting each other.
- Easier deployments, because you don't have to deploy the entire code base every time.
- Fewer bugs in general.

Cohesion

Cohesion is the degree to which Elements inside a Module belong together.
In other words, how closely related are the Elements (conceptually, or through code dependency).
The goal is to have highly cohesive Modules.

"Incoherent fragments of ideas are as hard to understand as an undifferentiated soup of ideas." from Domain Driven Design.

Benefits of high cohesion:
- Usually occurs simultaneously with low coupling, because everything is where it should be. An element in ModuleA that belongs in ModuleB means that coupling is higher than it needs to be.
- Demonstrates a deep understanding of the domain, which will improve the entire project.

A paradigm or style in which the language expresses the logic of a computation without describing its control flow. This is an abstraction; specific implementations of the logic exist within the language.

Ex: SQL
Ex: Regular Expressions
Ex: LINQ

The focus is on WHAT must be accomplished rather than the details of HOW.

These languages often also lack side effects or, in other words, are referentially transparent. They also frequently have a clear correspondence to mathematical logic.

Uses statements that change a program's state.

Procedural Programming is a subset of Imperative Programming.

This section describes different ways programming languages handle passing arguments/parameters into functions/methods.

Categories

There are two common ways to divide programming languages.

(1) Pass-by-value vs Pass-by-reference
In this case, the pass-by-reference umbrella includes both pass-by-reference and pass-by-object-reference.

(2) Pass-by-value vs Pass-by-reference vs Pass-by-object-reference

Pass By Value

Pass-by-value means pass a copy into the method. Any change to the parameter in the method will not affect the argument that was passed in.

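Primitives in Java and C# are passed by value. That includes data types like int, double, and bool (boolean in Java). Strings are reference types in both languages, but because they are immutable they behave much like values when passed.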


//C#
using System;
class Program
{
    static void Main(string[] args)
    {
        int x = 5;
        Console.WriteLine(x); //outputs 5

        IncrementInt(x);
        Console.WriteLine(x); //outputs 5 still

        Console.ReadLine();
    }

    public static void IncrementInt(int y) //x is the argument being passed in to y, the parameter
    {
        y += 1; //only the local copy of x (called y) is edited
    }
}

The data types called "primitives" in Java and C# are all immutable objects in Python. They are technically passed by object reference, but the behavior is de facto pass-by-value, because every "edit" operation creates a new object and rebinds the local name instead of changing the original.


#Python
def incrementInt(y): #x is the argument being passed in to y, the parameter
    y += 1 #rebinds the local name y to a new int object; x is not affected

x = 5
print(x) #outputs 5

incrementInt(x) #the function must be defined before it is called
print(x) #outputs 5 still

What C# calls "primitives" are all immutable objects in Python. Technically, Python does not use pass-by-value, but the observable behavior is the same for types like int, str, and bool.

Pass By Object Reference

Pass-by-object-reference means passing a copy of the pointer (aka reference) to the object. Most operations on the parameter will affect the original argument.

Objects in Java and C# are passed by object reference (although this is commonly called simply pass-by-reference).


//C#
using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        List<int> x = new List<int>() { 1, 2, 3 };
        Console.WriteLine(x.Count); //outputs 3

        AppendToList(x);
        Console.WriteLine(x.Count); //outputs 4

        ClearList(x);
        Console.WriteLine(x.Count); //outputs 0

        InitializeList(x);
        Console.WriteLine(x.Count); //outputs 0 still

        Console.ReadLine();
    }

    public static void AppendToList(List<int> y)
    {
        y.Add(4); //operations on the object affect the original argument
    }

    public static void ClearList(List<int> y)
    {
        y.Clear(); //operations on the object affect the original argument
    }

    public static void InitializeList(List<int> y)
    {
        y = new List<int>() { 1, 2, 3, 4, 5, 6, 7 }; //instantiation edits the pointer y, and does not affect the original argument
        //now x and y are pointing to different objects, so any operations on y will not affect x
    }
}

Everything is an object in Python, so everything is passed by object reference. Immutable objects (integers, strings, booleans, etc.) are no exception, but since every edit operation actually creates a new object, they behave as if passed by value.

In Python, collections like list, set, and dictionary are all mutable objects, so they have the expected pass-by-object-reference behavior.


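#Python
def appendToList(y):
    y.append(4) #operations on the object affect the original argument

def clearList(y):
    y.clear() #operations on the object affect the original argument

def initializeList(y):
    y = [1, 2, 3, 4, 5, 6, 7] #rebinds the local name y, and does not affect the original argument

x = [1, 2, 3]
print(len(x)) #outputs 3

appendToList(x)
print(len(x)) #outputs 4

clearList(x)
print(len(x)) #outputs 0

initializeList(x)
print(len(x)) #outputs 0 still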

Pass By Reference

This definition will differentiate pass-by-reference from pass-by-object-reference. The term pass-by-reference is very frequently used loosely to mean either definition.

Pass-by-reference means everything you do to the parameter affects the original argument, even instantiation.

In C#, you can specify pass-by-reference with the ref keyword.


//C#
using System;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        List<int> x = new List<int>() { 1, 2, 3 };
        Console.WriteLine(x.Count); //outputs 3

        AppendToList(x);
        Console.WriteLine(x.Count); //outputs 4

        ClearList(x);
        Console.WriteLine(x.Count); //outputs 0

        InitializeList(x);
        Console.WriteLine(x.Count); //outputs 0 still

        InitializeList(ref x);
        Console.WriteLine(x.Count); //outputs 7

        Console.ReadLine();
    }

    public static void AppendToList(List<int> y)
    {
        y.Add(4); //operations on the object affect the original argument
    }

    public static void ClearList(List<int> y)
    {
        y.Clear(); //operations on the object affect the original argument
    }

    public static void InitializeList(List<int> y)
    {
        y = new List<int>() { 1, 2, 3, 4, 5, 6, 7 }; //instantiation edits the pointer y, and does not affect the original argument
        //now x and y are pointing to different objects, so any operations on y will not affect x
    }

    public static void InitializeList(ref List<int> y)
    {
        y = new List<int>() { 1, 2, 3, 4, 5, 6, 7 }; //instantiation affects the original argument
    }
}

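You cannot specify pass-by-reference in Python: a function cannot rebind the caller's variable. The usual alternatives are to return the new value and reassign it at the call site, or to mutate a container that both sides share.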

A software metric related to the complexity of the program. A measure of the linearly independent paths through a program's source code.

Ex: A to B to C to D
    Cyclomatic Complexity equals 1 because there is only 1 path

Ex: A to B to D
    A to C to D
    Cyclomatic complexity equals 2 because there are 2 paths

Generally, in a control flow graph, complexity = edges - nodes + 2*P
(P is the number of Connected Components; P = 1 for a single function or program)
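
The formula can be checked against the two examples above. This is a small sketch that assumes the control flow graph is given as a list of (from, to) edge pairs; the function name and the graph encoding are made up for illustration.

```python
def cyclomatic_complexity(edges, components=1):
    """Return edges - nodes + 2 * components for a control flow graph."""
    # Every node appears in at least one edge, so collect nodes from the edges.
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2 * components


# A to B to C to D: 3 edges, 4 nodes
straight_line = [("A", "B"), ("B", "C"), ("C", "D")]

# A to B to D, and A to C to D: 4 edges, 4 nodes
branch = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]

print(cyclomatic_complexity(straight_line))  # 3 - 4 + 2 = 1
print(cyclomatic_complexity(branch))         # 4 - 4 + 2 = 2
```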

Continuous Integration

Developers continuously merge their commits into the main branch, so that merge conflicts appear quickly and stay smaller than they would if merges were delayed.

Ideally, every developer does a fresh merge from main to their branch, resolves conflicts, then merges up to main.

Continuous Delivery

Continuous Integration PLUS automated testing that runs when the main branch is updated PLUS a one-button push-to-production process.

Your product is always ready to push to production, whenever you need it.

Continuous Deployment

Continuous Delivery BUT if all automated tests pass THEN every update to main is automatically pushed to production.

The CAP theorem for distributed computing, published by Eric Brewer:

It is not possible for a distributed computer system to simultaneously provide Consistency, Availability, and Partition Tolerance. At most, two of these can be provided at a time.

Consistency: all nodes see the same data at the same time.

Availability: every request made to a non-failing node receives a response, whether FAILED or SUCCEEDED.

Partition Tolerance: the system continues to operate despite partial system failure or arbitrary message loss.

Principles for building software-as-a-service applications. The goal is portability and resilience when deployed to the web.

Codebase

There should be only one codebase for a deployed service.

Dependencies

All dependencies should be declared explicitly, including dependencies on system tools and libraries.

Config

Configuration that varies between deployments should be stored in the environment.

Backing Services

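Backing services (databases, message queues, caches, mail servers, etc.) are treated as attached resources. They are located through configuration, so the execution environment can attach, detach, or swap them without code changes.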

Build, Release, Run

The delivery pipeline should strictly consist of Build, Release, Run.

Processes

Applications should be deployed as stateless processes. Persistent data should be stored on a backing service.

Port Binding

Self-contained services should make themselves available to other services by specified ports.

Concurrency

Scale out by running more processes, giving each type of work its own process type, rather than by growing a single large process.

Disposability

Fast startup and shutdown support a robust and resilient system.

Dev/Prod Parity

All environments should be as similar as possible.

Logs

Applications should produce logs as event streams and leave aggregation to the execution environment.

Admin Processes

Admin tasks should be included in source control and be packaged with the application.

The original 23 design patterns were described by the Gang of Four: Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides.

Creational Patterns
    Abstract Factory
    Builder
    Factory Method
    Prototype
    Singleton

Structural Patterns
    Adapter
    Bridge
    Composite
    Decorator
    Facade
    Flyweight
    Proxy

Behavioral Patterns
    Chain of Responsibility
    Command
    Interpreter
    Iterator
    Mediator
    Memento
    Observer
    State
    Strategy
    Template Method
    Visitor

Strategy Aka Policy

Decouple the highly variable part of a process from the stable part.

(similar to State)
Make algorithms interchangeable by creating an Interface for them
Let your class reference the Interface type, so any implementation of the algorithm can be set
Allows you to easily change class behavior

Variation
There are multiple ways to accomplish an objective
Each way is made up of several algorithm choices
Group each set of choices into its own object that inherits from a base class
Now you can swap between the cohesive sets of algorithms

If you see the same conditional sprinkled all over a process, it might be good to split it into two processes with one conditional deciding between them.

Ex: should we judge a Route by the Total Time or the Total Distance? That's a policy decision.
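
The Route example can be sketched in Python, where a plain function type plays the role of the Interface. The Route fields, the two policy functions, and best_route are all hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Route:
    name: str
    total_time: float      # minutes
    total_distance: float  # kilometres


# Each policy is an interchangeable scoring function (lower is better).
RoutePolicy = Callable[[Route], float]


def by_total_time(route: Route) -> float:
    return route.total_time


def by_total_distance(route: Route) -> float:
    return route.total_distance


def best_route(routes: Sequence[Route], policy: RoutePolicy) -> Route:
    # Stable part: pick the minimum. Variable part: the policy passed in.
    return min(routes, key=policy)


routes = [Route("highway", 30, 42), Route("back roads", 45, 28)]
fastest = best_route(routes, by_total_time)
shortest = best_route(routes, by_total_distance)
print(fastest.name, shortest.name)
```

Adding a third policy (say, lowest toll cost) would mean writing one new function, with no change to best_route.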

State

(similar to Strategy)
Allow an object to change its behavior as its internal state changes, by changing what instance of an interface or derivation of a base class is used for an action

Observer

This is how events work
A publisher class allows any subscriber class to subscribe to its events
When an event occurs, the publisher alerts all of its subscribers
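
A minimal sketch of the publisher/subscriber relationship in Python, assuming subscribers are plain callables. The Publisher class and its method names are invented for this example.

```python
class Publisher:
    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        # Any callable that accepts one event argument can subscribe.
        self._subscribers.append(callback)

    def publish(self, event):
        # Iterate over a copy so subscribers may subscribe during a publish.
        for callback in list(self._subscribers):
            callback(event)


received = []
publisher = Publisher()
publisher.subscribe(received.append)
publisher.subscribe(lambda event: received.append(event.upper()))
publisher.publish("saved")
print(received)
```

The publisher never needs to know what its subscribers do with the event.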

Decorator

When you have many possible versions of a class that can occur in any combination, like toppings on a pizza
Make a base class, with none of the options, and derive from it for each option
Let each derived option contain an object of the base class

Now you can wrap the base with layers of derived options, to any depth
Methods can recursively drill into the layers
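
The pizza example might look like this in Python. The class names and prices are made up; the point is that each option derives from the base class and also contains an object of the base class.

```python
class Pizza:
    """Base class with none of the options."""

    def cost(self):
        return 8.0

    def description(self):
        return "pizza"


class Topping(Pizza):
    """Base for options: it is a Pizza and it wraps a Pizza."""

    def __init__(self, inner):
        self._inner = inner


class Cheese(Topping):
    def cost(self):
        return self._inner.cost() + 1.5

    def description(self):
        return self._inner.description() + " + cheese"


class Mushrooms(Topping):
    def cost(self):
        return self._inner.cost() + 1.0

    def description(self):
        return self._inner.description() + " + mushrooms"


# Wrap the base with layers of options, to any depth and in any combination.
order = Mushrooms(Cheese(Cheese(Pizza())))
print(order.description(), order.cost())
```

Each call to cost() or description() drills recursively through the layers until it reaches the plain Pizza.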

Factory

Create a static factory class that will return a variety of derived types of the same base type
The factory class can handle the details of instantiating the derived classes

Abstract Factory

An extension of the Factory pattern.

When you have a family of related objects, and there is more than one level of categorization for the objects, you can create a factory for each level and let the client use any of them.


//products
public abstract class DisplayLanguage {...}
public abstract class English : DisplayLanguage {...}
public abstract class Japanese : DisplayLanguage {...}
public class Roman : English {...} //normal English
public class Braille : English {...} //Braille-encoded English
public class Kanji : Japanese {...} //logographic Chinese characters
public class Hiragana : Japanese {...} //phonetic characters

//factories
public abstract class DisplayLanguageFactory
{
    public abstract DisplayLanguage Create();
}

public class EnglishFactory : DisplayLanguageFactory
{
    //covariant return types on overrides require C# 9 or later; otherwise return DisplayLanguage
    public override English Create()
    {
        //can return Roman or Braille object
        return new Roman();
    }
}

public class JapaneseFactory : DisplayLanguageFactory
{
    public override Japanese Create()
    {
        //can return Kanji or Hiragana object
        return new Kanji();
    }
}

Singleton

Ensures that there will only ever be one instance of a class used throughout an entire application

C# implementation:


public class Singleton
{
    private static Singleton _instance;

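    //Singleton.Instance is the only way to retrieve an instance of a Singleton
    //note: this lazy null check is not thread-safe; in multi-threaded code, consider a static Lazy<Singleton> field instead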
    public static Singleton Instance {
        get {
            if(_instance == null)
            {
                _instance = new Singleton();
            }
            return _instance;
        }
    }

    //private constructor
    private Singleton()
    {
    }
}

Command

Encapsulates a request as an object, so that you can pass, queue, log, undo, or redo it
Decouples the requestor of an action from the object that performs the action
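
A small Python sketch of a request captured as an object so that it can be queued and undone. AppendText and History are hypothetical names, and the "document" is just a list of strings.

```python
class AppendText:
    """A request captured as an object, with enough state to undo itself."""

    def __init__(self, document, text):
        self._document = document
        self._text = text

    def execute(self):
        self._document.append(self._text)

    def undo(self):
        self._document.pop()


class History:
    """The requestor: it runs commands without knowing what they do."""

    def __init__(self):
        self._done = []

    def run(self, command):
        command.execute()
        self._done.append(command)

    def undo_last(self):
        if self._done:
            self._done.pop().undo()


document = []
history = History()
history.run(AppendText(document, "hello"))
history.run(AppendText(document, "world"))
history.undo_last()
print(document)
```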

Adapter

(similar to Facade)
A layer between a client API and your own code
So your code all references the adapter, and only the adapter references 3rd party code
This protects most of your code from changes in the 3rd party code

More specifically, and to distinguish Adapter from Facade:
An Adapter is a wrapper that allows a client to use a different protocol than that understood by the implementer of the behavior.
When a client sends a message to an Adapter, it is converted to a semantically equivalent message and sent on to the "adaptee". The response is converted and passed back.

Evans: Our emphasis is on translation between two models, but I think this is consistent with the intent of Adapter.

Gamma: The emphasis is on making a wrapped object conform to a standard interface that clients expect.

Facade

(similar to Adapter)
Create a single interface between your code and all 3rd party code
Your code will see a single object in the place of a myriad of client interfaces

More specifically, and to distinguish Facade from Adapter:
A Facade is an alternative interface for (one or more) subsystems that simplifies access for the client. You can facilitate access to some features and hide the rest.
A Facade does not change the model of the underlying system. It should be written strictly in accordance with the other system's model.

Proxy

An adapter whose purpose is to either limit access to some operations of the original class, or to control when expensive operations are run

Template Method

Defines an order of operations, while letting the details of each step be variable


public abstract class TemplateExample
{
    public void TemplateMethod()
    {
        Step1();
        Step2();
        Step3();
    }
    
    public virtual void Step1() {}
    public virtual void Step2() {}
    public virtual void Step3() {}
}

Iterator

Access the elements of an aggregate object sequentially, without exposing its underlying representation

Mediator

A mediator object encapsulates how a set of objects will interact. A mediator has great control over how a program runs.

Business objects do not communicate directly with each other. They communicate through the mediator only. The mediator is like an internal Adapter or an anti-corruption layer.
- "It promotes loose coupling by keeping objects from referring to each other explicitly, and it allows their interaction to be varied independently."
- "Client classes can use the mediator to send messages to other clients, and can receive messages from other clients via an event on the mediator class."

So, like a message bus inside a single module.

Composite

An individual object and a collection of those objects are treated uniformly

For example, modeling a folder and file tree
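
A sketch of the folder and file tree in Python. File and Folder are illustrative classes; both expose the same total_size method, so callers do not care which one they hold.

```python
class File:
    def __init__(self, name, size):
        self.name = name
        self.size = size

    def total_size(self):
        return self.size


class Folder:
    def __init__(self, name, children=None):
        self.name = name
        self.children = list(children or [])

    def total_size(self):
        # A folder answers the same question by asking each child,
        # whether that child is a File or another Folder.
        return sum(child.total_size() for child in self.children)


tree = Folder("root", [
    File("a.txt", 10),
    Folder("docs", [File("b.txt", 5), File("c.txt", 7)]),
])
print(tree.total_size())
```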

Memento

Allows you to save the state of an object, perhaps to restore the state later

Specification

Allows business rules to be combined by chaining them together with boolean logic.


interface ISpecification
{
    bool IsSatisfiedBy(object candidate);
    ISpecification And(ISpecification other);
    ISpecification Or(ISpecification other);
    //other logical operators...
}

abstract class CompositeSpecification : ISpecification
{
    public abstract bool IsSatisfiedBy(object candidate);
    
    public ISpecification And(ISpecification other)
    {
        return new AndSpecification(this, other);
    }
    
    public ISpecification Or(ISpecification other)
    {
        return new OrSpecification(this, other);
    }
}

public class AndSpecification : CompositeSpecification
{
    private ISpecification leftCondition;
    private ISpecification rightCondition;
    
    public AndSpecification(ISpecification left, ISpecification right)
    {
        leftCondition = left;
        rightCondition = right;
    }
    
    public override bool IsSatisfiedBy(object candidate)
    {
        return (leftCondition.IsSatisfiedBy(candidate) && rightCondition.IsSatisfiedBy(candidate));
    }
}

//OrSpecification class...

//sample business rule
public class IsAdminSpecification : CompositeSpecification
{
    public override bool IsSatisfiedBy(object candidate)
    {
        return ((candidate is User) && (candidate as User).Type == "Admin");
    }
}

Notification

Instead of throwing an Exception when an expected error occurs, return a Result object with the error listed.

Commonly used for data validation: you can return a list of everything that is wrong, instead of returning just the first error hit.

Martin Fowler: "You should use Notification whenever validation is done by a layer of code that cannot have a direct dependency to the module that initiates the validation." For example, when the Presentation Layer needs to validate user input, and the validation logic belongs in the Domain Layer.
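
A minimal Python sketch of the idea, assuming a hypothetical sign-up form with two rules. The Notification class here is invented for illustration and simply collects error messages.

```python
from dataclasses import dataclass, field


@dataclass
class Notification:
    errors: list = field(default_factory=list)

    @property
    def ok(self):
        return not self.errors


def validate_signup(username, age):
    # Collect every problem instead of raising on the first one.
    result = Notification()
    if not username:
        result.errors.append("username is required")
    if age < 0:
        result.errors.append("age cannot be negative")
    return result


bad = validate_signup("", -1)
good = validate_signup("ann", 30)
print(bad.errors, good.ok)
```

The caller decides what to do with the list, for example showing every message next to the form at once.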

Bridge

You have two or more independent categories whose combinations multiply into a large number of possibilities. Each unique combination has a different implementation. A basic inheritance hierarchy gets out of hand fast.

The Bridge pattern divides this into an interface hierarchy and an implementation hierarchy.

Bridge is similar to Adapter pattern. The difference is that Bridge is designed upfront to allow abstraction and implementation to vary independently, while Adapter is retrofitted to let unrelated classes work together.

Ex: 2 types of thread scheduler X 3 software platforms

Basic hierarchy


class ThreadScheduler { ... }

class PreemptiveThreadScheduler : ThreadScheduler { ... }
class TimeSlicedThreadScheduler : ThreadScheduler { ... }

class UnixPTS : PreemptiveThreadScheduler { ... }
class WindowsPTS : PreemptiveThreadScheduler { ... }
class JavaVirtualMachinePTS : PreemptiveThreadScheduler { ... }

class UnixTSTS : TimeSlicedThreadScheduler { ... }
class WindowsTSTS : TimeSlicedThreadScheduler { ... }
class JavaVirtualMachineTSTS : TimeSlicedThreadScheduler { ... }

Bridge hierarchy:


interface IThreadScheduler
{
    IPlatform Platform { get; } //interfaces cannot declare fields, so expose a property
    void CallMethodA();
    void CallMethodB();
}

class PreemptiveThreadScheduler : IThreadScheduler
{
    public IPlatform Platform { get; }

    public PreemptiveThreadScheduler(IPlatform p)
    {
        Platform = p;
    }
    public void CallMethodA()
    {
        Platform.MethodA();
    }
    public void CallMethodB()
    {
        Platform.MethodB();
    }
}

//class TimeSlicedThreadScheduler ...

interface IPlatform
{
    void MethodA();
    void MethodB();
}

class UnixPlatform : IPlatform
{
    public void MethodA() { ... }
    public void MethodB() { ... }
}

//class WindowsPlatform ...

//class JavaVirtualMachinePlatform ...

Hardware example:
    IToggle can be a wall switch, a pull chain, a light sensor, etc
    IDevice can be a lamp, a radio, a tv, etc

Flyweight

You have a very large number of expensive objects. Break the objects into two parts: the expensive shared part, and the cheap variable part.

The flyweight contains the expensive part.
The lightweight parts are removed entirely; they are handled by the client.


public class Factory
{
    private List<Flyweight> repository;
    //returns a shared object
    public Flyweight GetFlyweight() { ... }
}

public class Flyweight
{
    //private expensive stuff
    
    //public methods accept the lightweight variable parts as arguments
}

The client calls GetFlyweight(). When the client needs something specific, it passes specifics to the flyweight methods. The flyweight combines its shared expensive data with the lightweight variable data from the client to perform operations.

Ex: a webpage loads images, which are memory-intensive
when an image is shown multiple times on the same page (such as with a tiled background) then you have one Flyweight for the shared image and a list of locations

Builder

Separate the construction of a complex object from its representation. The same construction process can create different representations.

This can solve at least two types of problems.

1) It's a response to the Telescoping Constructor Anti-pattern, where a growing number of optional constructor parameters leads to an exponential number of constructor overloads. Instead, the Builder receives each parameter linearly and returns one final object.

2) Used to create flat data objects (xml, sql, ...) where the final product is not easily edited. The builder keeps it in an easy to edit format until you are done, then generates the final format.

Builders are good candidates for fluent interfaces due to the linear input of options.


//problem 1
public class Director
{
    private Builder builder;
    
    public Result Construct()
    {
        //each option is supplied one at a time, in any combination
        builder.AddOptionA();
        builder.AddOptionC();
        return builder.GetResult();
    }
}

public class Builder
{
    public void AddOptionA() { ... }
    public void AddOptionB() { ... }
    public void AddOptionC() { ... }
    public Result GetResult() { ... }
}


//problem 2
public class Director
{
    private Builder builder;
    
    public void Construct()
    {
        builder.BuildPart();
    }
}

public abstract class Builder
{
    public abstract void BuildPart();
}

public class XmlBuilder : Builder
{
    public override void BuildPart() { ... }
    public Xml GetResult() { ... }
}

public class JsonBuilder : Builder
{
    public override void BuildPart() { ... }
    public Json GetResult() { ... }
}

Prototype

Instantiate an object with default values, then clone it repeatedly. The prototype could be a concrete object or an abstract object.

Chain Of Responsibility

Avoid coupling of the sender of a request to its receiver by giving more than one object a chance to handle the request. Chain the receiving objects and pass the request along the chain until an object handles it.
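
A sketch of such a chain in Python, using a hypothetical expense-approval scenario. The Approver class, titles, and limits are invented; each link either handles the request or passes it to its successor.

```python
class Approver:
    def __init__(self, title, limit, successor=None):
        self.title = title
        self.limit = limit
        self.successor = successor

    def approve(self, amount):
        if amount <= self.limit:
            # This link handles the request, so the chain stops here.
            return f"{self.title} approved {amount}"
        if self.successor is not None:
            # Otherwise pass the request along the chain.
            return self.successor.approve(amount)
        return f"nobody could approve {amount}"


chain = Approver("team lead", 100,
                 Approver("manager", 1000,
                          Approver("director", 10000)))

print(chain.approve(50))
print(chain.approve(500))
print(chain.approve(50000))
```

The sender only ever talks to the first link and does not know which receiver ends up handling the request.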

Marker Interface

Aka Tagging Interface

Requires a language with run-time type information available about objects. This is a means to associate metadata with a class in languages without that explicit support.

Create an empty interface.
The code tests for the presence of the interface on an object at decision points.


if(myObject is IEmptyInterface) { ... }

This is unnecessary in C# and Java, which both support Annotations (Custom Attributes).

Pipes And Filters

A pipeline is a chain of processing elements where the output of element N is the input of element N+1.

Connecting elements into a pipeline is analogous to Function Composition (combining simple functions into more complicated operations).

Unix tools like grep, sed, and awk are designed to be used this way.
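
The same idea can be sketched in Python with generators as filters. The helper names (pipeline, grep, upper, head) are made up for this example and only loosely imitate the shell tools.

```python
from functools import reduce
from itertools import islice


def pipeline(*filters):
    """Compose filters so the output of each one feeds the next."""
    def run(data):
        return reduce(lambda value, step: step(value), filters, data)
    return run


def grep(word):
    return lambda lines: (line for line in lines if word in line)


def upper(lines):
    return (line.upper() for line in lines)


def head(count):
    return lambda lines: islice(lines, count)


log = ["error: disk", "info: ok", "error: net", "error: cpu"]
process = pipeline(grep("error"), upper, head(2))
result = list(process(log))
print(result)
```

Each filter knows nothing about its neighbours, so they can be reordered or reused in other pipelines.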

Continuation

aka Continuation Passing Style (CPS)

Each function is given an extra function-type argument (the continuation function).
Once the function has its return value, it will invoke the continuation function with that return value, and return that result instead.

Ex: you follow a web link, you aren't logged into that site yet so you are redirected to the login page, once you login you are redirected to the page you started on.
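
A tiny Python sketch of the style, assuming two hypothetical arithmetic functions. Each one takes a continuation and hands its result to it instead of returning the value directly.

```python
def add_cps(a, b, continuation):
    # Invoke the continuation with the result, and return what it returns.
    return continuation(a + b)


def square_cps(x, continuation):
    return continuation(x * x)


# Computes (2 + 3) squared: the continuation of the addition is the squaring,
# and the final continuation simply hands the value back.
result = add_cps(2, 3, lambda total: square_cps(total, lambda squared: squared))
print(result)
```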

Larger design patterns made up of several basic patterns

Model View Controller

Separation of duties

Model: the actual data
View: a particular way of looking at the data, possibly a summary
Controller: handles user actions and updates Model and View as needed

From Martin Fowler's "Analysis Patterns" on the conceptual structure of business processes that are applicable to many domains because they share the similarities of being businesses.

These notes are a first sketch of the concepts.

Accountability

When one party or organization is responsible to another. Applicable to organizational structure, contracts, employment, etc.

Party: an individual or an organization. A group of any size that acts as a whole.
E.g. a Party can have a phone number, address, email, bank accounts, taxes, etc

Organization Hierarchy


//Business model
Sales Offices report to Divisions report to Regions report to Operating Unit

//Object model
public abstract class Organization //or interface Organization
{
    Organization parent;
    List<Organization> subsidiaries;
}

class OperatingUnit : Organization { ... } //parent must remain null
class Region : Organization { ... } //parent must be an OperatingUnit
class Division : Organization { ... } //parent must be a Region
class SalesOffice : Organization { ... } //parent must be a Division

Often there are two or more overlapping hierarchies within one organization. You can track both hierarchies in one object, or break them into entirely different objects.


public abstract class Organization
{
    Organization salesParent;
    List<Organization> salesSubsidiaries;
    
    Organization productParent;
    List<Organization> productSubsidiaries;
}

[source: Exception handling patterns]

Considerations

Throwing an exception is a control structure (like if or goto) in the sense that it causes control of the program to shift, sometimes dramatically. The method throwing the exception has no information about where control will return to.

Throwing an exception is a very slow operation, comparatively. It should be used sparingly.

In general, do not use exceptions as intentional flow control. And throw exceptions as little as possible.

Naming Exceptions

Name the exception after the problem: why is it being thrown?

Use a standard naming format. Do not create both "XNotFoundException" and "NoSuchYException" in the same project.

Do not make exceptions specific to a class.
If you see the exception hierarchy closely matching your class hierarchy, you are probably not generalizing your exceptions by behavior enough.

Example:
Consider ArrayIndexOutOfBoundsException => IndexOutOfBoundsException => RangeException. The first is specific to arrays only, the last is general to many situations.

Example:
ClassNotFoundException, MethodNotFoundException, and FieldNotFoundException could all be generalized to MissingException. When the exception occurs, the context it occurs in will fill in the rest of the details.

Example:
It does not make sense to name a field in class Person "PersonFirstName". You simply call it "FirstName" because it being within class Person provides context.
Similarly, the code that catches an exception will know a lot about the context of the exception based on what operation was just attempted.

Some major categories of exceptions are:
- logical errors in the program (for example, the program is in an impossible state)
- resource failures (for example, a file cannot be read)
- user errors (for example, the user input is poorly formatted)
- configuration errors (for example, invalid configuration)

Refine Exceptions

When you start creating exceptions types for your project, begin with general categories. For instance, you could start with a single Exception class named for your project that all other exceptions you throw will derive from.

As work continues and you find places where a more specific exception makes sense, create subclasses.

Consider how users of your service will want to group exceptions together to catch them.

When To Raise Exceptions

In general, exceptions should only be raised for exceptional (unusual) conditions. The normal operation of a method should not rely on exceptions.

When to raise exceptions:
- your method encounters unusual conditions it cannot handle
- a client service has breached its obligations (such as passing badly formatted data)
- your method cannot fulfill its obligations

Make sure to always "clean up" before throwing an exception.
For instance, release all locks on files and close all database connections.
For instance, leave the current object in a consistent state, not a partially altered state. Satisfying this condition leads you to put all validation steps first (anything that could produce an exception) and all data-altering steps last.
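
The "validate first, alter last" ordering can be sketched as follows. This is a minimal illustration, not a definitive implementation: the Account class, its fields, and the choice of exception types are all hypothetical.

```java
// Hypothetical account class; all names are illustrative assumptions.
class Account {
    private long balanceCents;

    Account(long openingBalanceCents) {
        this.balanceCents = openingBalanceCents;
    }

    long balanceCents() {
        return balanceCents;
    }

    // Every check that can throw runs before the first field is modified,
    // so a failure leaves both accounts exactly as they were.
    void transferTo(Account target, long amountCents) {
        if (target == null) {
            throw new IllegalArgumentException("target account is required");
        }
        if (amountCents <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        if (amountCents > balanceCents) {
            throw new IllegalStateException("insufficient funds");
        }
        // Data-altering steps come last.
        this.balanceCents -= amountCents;
        target.balanceCents += amountCents;
    }
}
```

If the checks were interleaved with the updates, a failed transfer could debit one account without crediting the other.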

When To Catch Exceptions

Exceptions do not have to be caught right away. Only catch the exception in methods that have enough information and options to do something about the problem.

Seeing many "try/catch" pairs throughout your code may mean that you need to refactor out some of that logic. It may be serving little purpose, while cluttering the code.

On the other hand, don't let the exception propagate so high up that the context of the exception is lost.

In general, do not write "catch(Exception e) { }", which will catch any exception type at all. Find out what specific exceptions could occur during an operation, and then catch them specifically.

Converting Exceptions

If a service you are using (or a layer of your own project that might as well be a service)
throws a variety of detailed exceptions, it often makes sense to convert those lower-level
or third-party exceptions into the exceptions of the current layer.

Information from the lower level exception can be included in the higher level exception when it is appropriate.

Example:
Lower level message IOException("Disk is full") may be converted to a higher level DataException("Cannot save customer record: Disk is full.")
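
A rough sketch of that conversion in Java, assuming a hypothetical RecordStore interface for the lower layer and a hypothetical unchecked DataException for the current layer. Passing the original exception as the cause keeps the low-level detail available for logging.

```java
import java.io.IOException;

// Hypothetical exception type belonging to the data layer.
class DataException extends RuntimeException {
    DataException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical lower-level service that can fail with an IOException.
interface RecordStore {
    void write(String record) throws IOException;
}

class CustomerRepository {
    private final RecordStore store;

    CustomerRepository(RecordStore store) {
        this.store = store;
    }

    void save(String customerRecord) {
        try {
            store.write(customerRecord);
        } catch (IOException e) {
            // Translate into this layer's vocabulary and keep the original as the cause.
            throw new DataException("Cannot save customer record: " + e.getMessage() + ".", e);
        }
    }
}
```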

Security Door Pattern

You have an operation that will be used internally and from clients.

Make one private method that does the operation. It assumes that all necessary validations, data formatting checks, etc have already been done. You can use this method internally.

Make one public method that runs validations, data formatting checks, etc and only calls the private method if everything checks out. Clients will use this method.
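
One possible shape for this pattern, using a hypothetical Inventory class. The public method is the "door" that validates; the private method trusts its caller.

```java
// Hypothetical class; the names and rules are illustrative assumptions.
class Inventory {
    private int unitsInStock;

    Inventory(int unitsInStock) {
        this.unitsInStock = unitsInStock;
    }

    int unitsInStock() {
        return unitsInStock;
    }

    // Public door: clients come through here, so every argument is checked.
    public void remove(int units) {
        if (units <= 0) {
            throw new IllegalArgumentException("units must be positive");
        }
        if (units > unitsInStock) {
            throw new IllegalArgumentException("not enough stock");
        }
        removeUnchecked(units);
    }

    // Internal caller that already knows the amount is valid, so it skips the checks.
    public void clear() {
        removeUnchecked(unitsInStock);
    }

    // Private operation: assumes validation has already been done.
    private void removeUnchecked(int units) {
        unitsInStock -= units;
    }
}
```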

Bouncer Pattern

A method that will either throw an exception or do nothing. This is the same pattern as an Assert statement, and is frequently used for validation.

Benefits are reuse of code (if it will be called from many places) and self-documentation (if called from few places). The name of the bouncer method describes its purpose.


private void ValidateFormat(string text, string detailForMessage)
{
    if(notFormattedRight(text))
        throw new FormatException("message" + detailForMessage);
}

Alternatives To Exceptions

Error Avoidance: check conditions before trying an operation.
For example, check if a value is 0 before dividing by that value.
For example, check how much money is in a customer's account before withdrawing money.
There are many reasons you should not always do this: primarily duplication of logic, and the conditions may change between checking and executing.

Null Object: an object of the expected type that contains no data. All normal operations work on the NullObject, so it will not cause "null exceptions" to be thrown.
This is usually implemented as a Singleton pattern, ensuring that all instances of the NullObject are the same object, so it is easy to check if what you have is the NullObject.
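
A small sketch of a Null Object as a singleton. Customer, RealCustomer, NullCustomer, and CustomerDirectory are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

interface Customer {
    String name();
}

final class RealCustomer implements Customer {
    private final String name;

    RealCustomer(String name) {
        this.name = name;
    }

    public String name() {
        return name;
    }
}

// The one shared "no customer" instance; it answers every call harmlessly.
final class NullCustomer implements Customer {
    static final NullCustomer INSTANCE = new NullCustomer();

    private NullCustomer() { }

    public String name() {
        return "";
    }
}

class CustomerDirectory {
    private final Map<String, Customer> byName = new HashMap<>();

    void add(Customer customer) {
        byName.put(customer.name(), customer);
    }

    // Never returns null, so callers can use the result without a null check.
    Customer findByName(String name) {
        Customer found = byName.get(name);
        return found != null ? found : NullCustomer.INSTANCE;
    }
}
```

Because there is only one instance, a caller that does care can compare the result with NullCustomer.INSTANCE using ==.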

Error Values: return a special value indicating an error occurred. This is useful in cases where the error is minor and can be ignored.
This can be much easier to unit test than exceptions.
This can also encourage nested if statements, which is bad.

Pass Error Handler: pass an error handler method into the method that might need it.
This can be much easier to unit test, because the error handling code is already in its own method.
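
As a sketch, the handler can be passed as a java.util.function.Consumer. The AgeParser class and its fallback parameter are assumptions made for this example.

```java
import java.util.function.Consumer;

class AgeParser {
    // Returns the parsed age, or the fallback after reporting the problem to onError.
    // Assumes text is not null.
    static int parseAge(String text, int fallback, Consumer<String> onError) {
        try {
            int age = Integer.parseInt(text.trim());
            if (age < 0) {
                onError.accept("age cannot be negative: " + age);
                return fallback;
            }
            return age;
        } catch (NumberFormatException e) {
            onError.accept("not a number: " + text);
            return fallback;
        }
    }
}
```

A test can pass a handler that records messages in a list, while production code might pass one that logs or shows a dialog.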

Use Assertions: cause the program to fail until the problem is solved by a person. This is for errors that the program absolutely cannot handle.

Bottom Propagation: a special value that does not cause errors when operated on, but also generally does not change value when operated on. This doesn't imply that anything is really wrong, just that an edge has been hit.
Similar to NullObject.

Refactor means to change the structure/design of code without changing the functionality.
Goal: the code is easier to understand.
Goal: the code is cheaper to modify in the future.

Before refactoring, ensure you have a comprehensive set of automated unit tests, integration tests, and end-to-end tests.
(You definitely need the unit tests. The others add more reliability.)
That way you can be sure that you aren't breaking intended functionality.
Run your tests frequently while refactoring. The fewer changes you've made when a test breaks, the easier it is to debug the error.

"Test, small change, test, small change, test, small change. It is that rhythm that allows refactoring to move quickly and safely."

Refactoring is a process that builds on itself.
Individual changes may not seem very important, but as they build up they reveal more important changes that can now be made.
See Domain Driven Design.

Refactoring is more difficult when you are bound by:
- the database design
- a public interface

There are times it is faster to start over from scratch than to refactor code.
In my experience, programmers are more comfortable with and excited by starting over from scratch, and they will choose this option more often than they should.
Therefore, I recommend erring heavily on the side of refactoring.

Problems with starting over from scratch:
- you lose business logic that was embedded in the old code, but not documented
- it will take far longer than you estimate to catch up with the functionality of the old code base
- you still have to support the old code base anyway

A lot of this terminology is taken from Martin Fowler's book "Refactoring".

Code Stinks

Things in the code base that make programmers uncomfortable, even before they can say what they don't like about it.
Terminology is taken from Martin Fowler's "Refactoring".

Vague names (variable, method, class, etc)

Duplicated code
- code is not duplicated if it is serving different layers of the architecture (ex domain object vs service response object)
- code is not duplicated if it is serving distinct stakeholders (it will diverge as change requests come in)

Long methods

Large classes

Long parameter lists

Divergent changes - a class is altered repeatedly for very different reasons

Shotgun surgery - every task involves tiny changes in many different classes

Feature envy - a method in Class X that is mostly concerned with data and operations in Class Y

Data clumps - the same few fields show up together in many different classes

Primitive obsession - the disinclination to use small classes for important concepts, just because "classes shouldn't be small"

Switch statements - consider polymorphism instead

Parallel inheritance hierarchies - every time you make a child of Class X, you also must make a child of Class Y

Lazy classes - a class that isn't worth maintaining

Speculative generality - "we might need this later, so I'll generalize it now"

Temporary field - a field that is only set/used occasionally

Message chains - class X asks class Y asks class Z asks... and eventually the response bubbles back to X

Middle man - a class that delegates too much to other classes

Inappropriate intimacy - a pair of classes that have too much knowledge about each other's private members

Alternative classes with different interfaces - classes or methods with the same purpose, but different signatures

Incomplete library class - logic that belongs in a library is implemented outside of it (applies when you control the library)

Data class - a class with data but no behavior

Refused bequest - a child class does not use all the data or operations of the parent class

Comments - comments that mark poor code, or that explain what is happening instead of letting the class/method/variable names make that clear
- "When you feel the need to write a comment, first try to refactor the code so that any comment becomes superfluous."
- comments that explain "why" are good comments

String parsing
- I've seen so many errors around string parsing. Structure your code so that strings are treated as complete units. Limit and carefully test where string parsing occurs.

Rename

Rename a variable or a method or a class such that they reveal their intention and meaning.
Avoid abbreviations (unless they are common acronyms in your domain).

It is always worth renaming to increase clarity of meaning.

"Any fool can write code that a computer can understand. Good programmers write code that humans can understand."

Exit Methods As Soon As Possible

Use as many "return" statements in a Method as you need.
If you limit yourself to 1 "return" statement, you will often end up with deeply nested conditional logic.
By exiting the Method as soon as possible, you make the paths through the Method easier to read.

Substitute Algorithm

An algorithm is unclear, and you want to restructure it.

Replace Conditional With Polymorphism

Ex Before: Class X contains conditional logic based on Property P.
Ex After : Class X is abstract and has several child Classes U, V, W. Each child class is specific to one possible value of Property P.

Instead of conditional logic, each child Class overrides just its own P-specific logic.
The "condition" is now handled automatically by which Class is instantiated.

This increases the extensibility of the code, and follows the Open/Closed Principle.
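
A minimal sketch of the "after" shape, with a hypothetical Shipment class standing in for Class X and the shipping speed standing in for Property P. The prices are made up.

```java
// Class X: abstract, with one child per possible value of the property.
abstract class Shipment {
    protected final int weightKg;

    Shipment(int weightKg) {
        this.weightKg = weightKg;
    }

    // Each child supplies only its own pricing rule.
    abstract int costCents();
}

class StandardShipment extends Shipment {
    StandardShipment(int weightKg) {
        super(weightKg);
    }

    @Override
    int costCents() {
        return 500 + 100 * weightKg;
    }
}

class ExpressShipment extends Shipment {
    ExpressShipment(int weightKg) {
        super(weightKg);
    }

    @Override
    int costCents() {
        return 1000 + 200 * weightKg;
    }
}
```

Adding a new kind of shipment means adding a new child class; the existing classes and their callers are not edited.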

Change Bidirectional Association To Unidirectional

Class A and B reference each other. You simplify this to Class A referencing Class B.

See Domain Driven Design on the topic of simplifying the domain model to focus attention on the most important relationships, instead of trying to accurately map all the details of the real world.
See Domain Driven Design Aggregates.

Bidirectional associations are difficult to maintain:
- creating the objects takes some juggling
- removing the objects from memory is prone to errors
- the association is harder to store in a database

Change Unidirectional Association To Bidirectional

Class A points to Class B. Update Class B to point back to Class A.

When to do this?
Class B requires a reference back to Class A to complete an operation.

I'm not sure about this one.
I expect this is actually a rare need.
I expect that more frequently, there is a bigger restructuring of the class diagram that needs to happen here.

Decompose Conditional

Given a complicated conditional expression, move some or all of the logic to separate Methods.
This lets you name the logic to explain it.

Example:


// before
if(date.before(SUMMER_START) || date.after(SUMMER_END))
    charge = quantity * _winterRate + _winterServiceCharge;
else
    charge = quantity * _summerRate;

// after
if(isSummer(date))
    charge = quantity * _summerRate;
else 
    charge = quantity * _winterRate + _winterServiceCharge;

Consolidate Conditional Expression

Given multiple guard clauses that return the same result, consolidate them into one Method.

Example:


// before
double disabilityAmount() {
    if (_seniority < 2) return 0;
    if (_monthsDisabled > 12) return 0;
    if (_isPartTime) return 0;
    return calculateAmount();
}

// after
double disabilityAmount() {
    if (!eligibleForDisability()) return 0;
    return calculateAmount();
}
bool eligibleForDisability() {
    // all three conditions must hold; this is the negation of the three guard clauses above
    return (_seniority >= 2
        && _monthsDisabled <= 12
        && !_isPartTime);
}

Consolidate Duplicate Conditional Fragments

Make it clear what part of the logic is conditional and what is not by keeping only conditional logic inside conditions.

Example:


// before
if (isSpecialDeal()) {
    total = price * 0.95;
    send();
}
else {
    total = price * 0.98;    
    send();
}

// after
if (isSpecialDeal()) {
    total = price * 0.95;
}
else {
    total = price * 0.98;    
}
send();

Remove Nested Conditional With Guard Clause

Don't nest conditional statements. Deeply nested code is harder to read.


// before
double getPayAmount() {
    double result;
    if(_isDead) 
        result = deadAmount();
    else {
        if(_isSeparated) 
            result = separatedAmount();
        else {
            if (_isRetired) 
                result = retiredAmount();
            else 
                result = normalPayAmount();
        }
    }
    return result;
}

// after
double getPayAmount() {
    if (_isDead) 
        return deadAmount();
    if (_isSeparated) 
        return separatedAmount();
    if (_isRetired) 
        return retiredAmount();
    return normalPayAmount();
}

Remove Control Flag

Use "break" and "return" instead of a loop control flag.
Loop control flags are a common source of errors.


// before
boolean found = false;
for (int i = 0; i < people.length; i++) {
    if (! found) {
        if (people[i].equals("Don")){
            sendAlert();
            found = true;
        }
    }
}

// after
for (int i = 0; i < people.length; i++) {
    if (people[i].equals("Don")){
        sendAlert();
        break;
    }
}

Preserve Whole Object

Instead of getting several values from Object A and passing them to a method call, just pass the whole object.

When to do this?
The method is likely to need different Fields/Properties from Object A in the future.
The method needs all, or almost all, of the Fields/Properties of Object A.

When NOT to do this?
The method's object ought not to have a code dependency on Object A.
The method receives these arguments from other sources than Object A, and you don't want a special overload for this use case.

This might be a hint that the method belongs in Object A.

Replace Error Code With Exception

Instead of returning a special error value (magic number), just throw an Exception.

See discussions of Soft Errors vs Exceptions.

Replace Exception With Test

If you can return a valid non-error value on a special case, test for that case instead of throwing an Exception.
This only applies when the calling code does not need to check for a magic number value before continuing execution.

Example: returning 0 will cause proper behavior, so return 0 instead of throwing an Exception.
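
For illustration, here is a sketch where the special case is an empty array. It assumes that an average of 0 is acceptable to callers for "no scores"; the Scores class is hypothetical.

```java
class Scores {
    // An empty array has a sensible answer here (0), so test for it up front
    // instead of letting the integer division throw an ArithmeticException.
    static int average(int[] scores) {
        if (scores.length == 0) {
            return 0;
        }
        int sum = 0;
        for (int score : scores) {
            sum += score;
        }
        return sum / scores.length;
    }
}
```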

Inline Temp

Similar to Replace Temp with Query.

Example:


// before
double basePrice = anOrder.basePrice();
return (basePrice > 1000);

// after
return (anOrder.basePrice() > 1000);

Introduce Explaining Variable

The opposite of Inline Temp.

Example:


// before
if((platform.toUpperCase().indexOf("MAC") > -1) 
    && (browser.toUpperCase().indexOf("IE") > -1)
    && wasInitialized() && resize > 0)
{ }

// after
final boolean isMacOs = platform.toUpperCase().indexOf("MAC") > -1;
final boolean isIEBrowser = browser.toUpperCase().indexOf("IE") > -1;
final boolean wasResized = resize > 0;
if(isMacOs && isIEBrowser && wasInitialized() && wasResized)
{ }

The next step here is to consider if conditionals like "browser.toUpperCase().indexOf("IE") > -1" can be moved into the browser object with a signature like "browser.IsIE". (Or into the owner of the browser string.) Then you can use Inline Temp to re-simplify the if statement.

Split Temp Variable

A variable is reused for a new purpose.

When to do this? Always. You should never reuse a variable for a different purpose.

Example:


// before
double temp = 2 * (_height + _width);
System.out.println (temp);
temp = _height * _width;
System.out.println (temp);

// after
final double perimeter = 2 * (_height + _width);
System.out.println (perimeter);
final double area = _height * _width;
System.out.println (area);

Remove Assignments To Parameters

A parameter value is written over. Use a local variable instead.

When to do this? Always. You should never write over a parameter value. Why is it there if you're overwriting it?

This does not apply to "out" parameters.
This does not apply to altering a member of an object parameter.
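
A short sketch with a hypothetical discount calculation. Marking the parameters final makes the compiler reject any later assignment to them.

```java
class Pricing {
    // The parameters are never reassigned; the running value lives in a local.
    static int discountedPriceCents(final int priceCents, final int quantity) {
        int result = priceCents;
        if (quantity > 100) {
            result -= priceCents / 10; // bulk discount (made-up rule)
        }
        if (priceCents > 10_000) {
            result -= 200; // large-order discount (made-up rule)
        }
        return result;
    }
}
```

Because priceCents keeps its original value, the second condition still tests the original price rather than a partially discounted one.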

Replace Temp With Query

The result of a method call is stored in a local variable.
The variable value is not changed again, and is just used in a calculation later.
You can remove the local variable and simply call the method within the calculation.

When NOT to do this:
When the name of the local variable clarifies the meaning of the method return value. (And you don't have access to the method to rename it.)
When there are several method results used in the calculation, and it improves legibility to collect those values ahead of time in local variables.
When the method result is used in more than one place, and you want to ensure they don't diverge.
When the method call is costly, and you need to call it more than once. (Fowler advises to not worry about performance at this stage.)

Replace Array With Object

You are using an array to hold values with different meanings, rather than to hold a collection of values with one meaning.
Applies also to lists, dictionaries, tuples, etc.

When to do this?
Always; you should never use a collection where the position of elements holds specific meanings.

Example:


// before
string[] record = new string[] { "Janice", "32", "New York City", "Interior Decorator" };

// after
public class Person
{
    public string Name { get; set; }
    public int Age { get; set; }
    public string City { get; set; }
    public string Job { get; set; }
}

Of course, the whole point of tuples is to do exactly this - avoid making a class.
I like tuples for destructuring such as "var (name, age, city, job) = person.Destructure();".
I don't like tuples for passing data around.

Encapsulate Field

In C# terms: replace a Field with a Property.
In general: use public Getter/Setter Methods to access a private Field so you know it is always accessed the same way.

When to do this:
Always, for public mutable Fields.

Encapsulate Collection

Class A uses a collection to store data. Do not reveal access to the collection publicly.
If you let other classes access the collection directly, they will rely on Class A always storing its data as this collection.
And they will muck about with your data.

When to do this?
Always, for public collections.


// before
public class Course
{
    public List<Student> Students { get; set; }
}

// after
public class Course
{
    private List<Student> _students = new List<Student>();
    // return a copy as an array so callers cannot alter the internal list
    public Student[] Students { get { return _students.ToArray(); } }
    
    public void AddStudent(Student student) {
        _students.Add(student);
    }
    
    public void RemoveStudent(Student student) {
        _students.Remove(student);
    }
}

Move Field

Move Field A from Class X to Class Y.

When to do this?
When Class Y uses the Field more than Class X does.
When the field is conceptually related to Y more than to X.

Replace Data Value With Object

Encapsulate a primitive data value in a class.
Similar to Extract Class.

When to do this?
The new class is conceptually important.
There is behavior specific to this data value that belongs in the new class.

Change Value To Reference

Replace a value object with a reference object.

When to do this?
To save memory - many objects can reference the same instance instead of each referencing a private instance.
To better reflect the domain - to show that this really is the same object that everything is referencing.

Immutability is immaterial here.
You can save a lot of memory by sharing an immutable object (use Reference).
A mutable object may need to be shared across a distributed system (use Value).

Change Reference To Value

Opposite of Change Value to Reference.

When to do this?
The unique identity of the reference object is not important in the domain.
The system is distributed such that a single object in memory cannot be shared.

Replace Magic Number With Symbolic Constant

Give a name to number (and string) literals.
This will communicate the meaning of the literal.

When to do this?
Almost always; rarely use number/string literals in your code.

When to NOT do this?
The literal's meaning is so common and obvious that naming it would be less clear.
- Ex: if(denominator == 0) throw new MathException("Cannot divide by 0");
- Ex: percent = actual / total * 100;
- I can only think of math examples, so my rule is to never use string literals, only named constants.
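
A small sketch of the naming step. The Physics class is hypothetical and 9.81 is the usual rounded approximation.

```java
class Physics {
    // Named once, so the meaning travels with the value.
    static final double GRAVITATIONAL_ACCELERATION = 9.81; // m/s^2, approximate

    static double potentialEnergy(double massKg, double heightM) {
        return massKg * GRAVITATIONAL_ACCELERATION * heightM;
    }
}
```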

Replace Type Code With Class

Class A has a primitive Field that can be set to one of several constant values.
Convert these possible values into a Class that can only be set to one of those values.

In .Net, this means use an Enum.
Note that in .Net an Enum does not limit what values can be set, but it does communicate which values should be set.

Replace Type Code With Subclasses

Class A has a primitive Field that can be set to one of several constant values. In addition, the value of the Field affects Class behavior.

Make Class A abstract and create a child Class for each possible value of the Field.
Use polymorphism instead of conditional logic to determine behavior.

Replace Type Code With State/Strategy

Class A has a primitive Field that can be set to one of several constant values. In addition, the value of the Field affects Class behavior.

Make the Field of type abstract Class B and create a child Class for each possible value of the Field.
Use polymorphism instead of conditional logic to determine behavior.

Use this (instead of Replace Type Code with Subclasses) if the value of the Field can change throughout the life of the Object A.

See Replace Conditional with Polymorphism.
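
A sketch of the State/Strategy variant, using a hypothetical Member whose membership level can change while the object lives. MembershipLevel plays the role of abstract Class B.

```java
// Class B: one child per possible value of the old type code.
abstract class MembershipLevel {
    abstract int discountPercent();
}

class BasicLevel extends MembershipLevel {
    @Override
    int discountPercent() {
        return 0;
    }
}

class GoldLevel extends MembershipLevel {
    @Override
    int discountPercent() {
        return 10;
    }
}

// Class A: holds the level as a field instead of being subclassed itself.
class Member {
    private MembershipLevel level = new BasicLevel();

    // The level can be swapped at run time, which subclassing Member would not allow.
    void upgradeToGold() {
        level = new GoldLevel();
    }

    int priceAfterDiscountCents(int priceCents) {
        return priceCents - priceCents * level.discountPercent() / 100;
    }
}
```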

Remove Setter Method

Default to not allowing Fields/Properties to be altered.

Extract Method

Take a section of code from Method A and move it into new Method B.
Method A now calls Method B.

When to do this?
When Method B is cohesive and has a single responsibility.
A good indication is that Method B has a short and communicative name.
A good indication is that several local variables have been moved to B and are local to just B.
A good indication is that B has 0-2 parameters.
A good indication is that B has 0 or 1 return values.
A good indication is that B has no side effects (nullipotent).

The length of Method A is immaterial.
There is an added bonus if you shorten a very long method, but the increase in clarity is worth it on its own.

The number of times Method B is called is immaterial.
There is an added bonus if you reduce code duplication, but the increase in clarity is worth it on its own.
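
As an illustration of those indicators, here is a hypothetical Invoice where the totalling loop has been extracted from summary() into totalCents().

```java
class Invoice {
    private final String customerName;
    private final int[] lineAmountsCents;

    Invoice(String customerName, int[] lineAmountsCents) {
        this.customerName = customerName;
        this.lineAmountsCents = lineAmountsCents.clone();
    }

    // Method A: after the extraction it reads as a one-line summary.
    String summary() {
        return customerName + " owes " + totalCents() + " cents";
    }

    // Method B: short communicative name, no parameters, one return value,
    // no side effects, and its local variable is used nowhere else.
    private int totalCents() {
        int total = 0;
        for (int amount : lineAmountsCents) {
            total += amount;
        }
        return total;
    }
}
```

The extraction is worthwhile here even though totalCents() is called from only one place.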

Inline Method

Replace a method call with the body of the method itself, and delete the method that is no longer needed.

When to do this?
The method body is as clear as the name of the method.
The method did not have high cohesion internally and low coupling externally.
The method is only called from one statement.

In .Net, do not inline a method for performance concerns.
The .Net JIT compiler inlines methods automatically when its heuristics find it worthwhile, typically when the methods are very short.
The division of methods is to serve the understanding of the programmers.

Extending the example from the book: inlining the method is not sufficient here.


// before
boolean moreThanFiveLateDeliveries() {
    return _numberOfLateDeliveries > 5;
}
int getRating() {
    return moreThanFiveLateDeliveries() ? 2 : 1;
}

This code matches "method body is as clear as the name of the method", so inlining is the first step.
But a better solution is to raise this hidden business policy up.


// after
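class LateDeliveryPolicy : IPolicy
{
    private const int _threshold = 5;
    
    // Widget stands for whichever class owns NumberOfLateDeliveries
    public bool IsViolated(Widget widget) {
        return widget.NumberOfLateDeliveries > _threshold;
    }
}
int getRating() {
    // getRating() lives in the class being rated, so it passes itself to the policy
    return new LateDeliveryPolicy().IsViolated(this) ? 2 : 1;
}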

I point this out because just inlining the method does not remove the code smell, it just changes it.

Move Method

Move Method A from Class X to Class Y.

When to do this?
When Method A uses data from Y, but not data from X.
When Method A is conceptually related to Y more than to X.

This usually involves renaming the Method to better match its new Class.

Sometimes the old Method is left in Class X and now it just calls the Method in Class Y (delegation).
This is useful if Class X has a public interface that you need to continue supporting.

Replace Method With Method Object

Move a method into an object that exists just for this operation.
- the operation can be decomposed into many private methods
- data can be shared between methods with private fields instead of passing lots of parameters around
See Services in Domain Driven Design.

When to do this?
The method is very long and the logic is complex. It uses many local variables throughout the process.
The method coordinates between multiple objects, but does not really belong inside any of them.
The method is significant to the domain, and deserves to be raised up.

Not all objects are based on Nouns.
Some operations are important enough in the domain to be raised up to object level.

Method Objects should be stateless.

Introduce Foreign Method

You want to add a Method to a Class you don't control.

In C#, this is adding an Extension Method.

When to do this?
Conceptually, this Method belongs in the Class, and you want that communicated in the code.

Introduce Assertion

In .Net, you'd use a Guard Clause instead of an Assertion.

Verify the state of the parameters before continuing with the method.

Add Parameter

As part of a larger refactoring, you need to add a Parameter to a Method signature.

Remove Parameter

As part of a larger refactoring, you need to remove a Parameter from a Method signature.

Separate Query From Modifier

Separate query/lookup logic from modifier/command logic.

This supports code reuse because the Methods are more granular.
This separates Nullipotent Methods from Non-Nullipotent Methods, which shows which Methods are safer to call.

Example:


// before
int getTotalOutstandingAndSetReadyForSummaries() { }

// after
int getTotalOutstanding() { }
void setReadyForSummaries() { }

Parameterize Method

Generalize the functionality of several methods into one method by using Parameters.

Example:


// before
fivePercentRaise();
tenPercentRaise();

// after
raise(decimal percentage);

Replace Parameter With Explicit Method

Do not run different logic based on the value of one Parameter.

This is NOT the opposite of Parameterize Method.

Example:


// before
void setValue (String name, int value) {
    if (name.equals("height"))
        _height = value;
    
    if (name.equals("width"))
        _width = value;
}

// after
void setHeight(int arg) {
    _height = arg;
}

void setWidth (int arg) {
    _width = arg;
}

Replace Parameter With Method

Instead of getting a return value from Method A and passing it to Method B, let Method B call Method A directly.

When to do this?
The value provided to Method B consistently comes from Method A.

Example:


// before
int basePrice = _quantity * _itemPrice;
int discountLevel = getDiscountLevel();
double finalPrice = discountedPrice(basePrice, discountLevel);

// after
int basePrice = _quantity * _itemPrice;
double finalPrice = discountedPrice(basePrice);

Introduce Parameter Object

You have a set of parameters that are always passed to methods as a set.

Example:


// before
int operationA(string a, int b, int c, bool d);
int operationB(string e, int f, string a, int b, int c);
int operationC(string a, Object g, int b, bool h, int c);

// after
public class Widget {
    public string a { get; set; }
    public int b { get; set; }
    public int c { get; set; }
}
int operationA(Widget w, bool d);
int operationB(string e, int f, Widget w);
int operationC(Widget w, Object g, bool h);

Hide Method

Default to methods being private.

Form Template Method

Two sibling child Classes perform the same series of steps, but what they do at each step is different.
Use a Template Method to show the similarities in the processes.

Example:


// before
public class Site {
}
public class ResidentialSite : Site {
    decimal getBillableAmount() {
        decimal baseAmount = _units * _rate;
        decimal tax = baseAmount * Site.TAX_RATE;
        return baseAmount + tax;
    }
}
public class LifelineSite : Site {
    decimal getBillableAmount() {
        decimal baseAmount = _units * _rate * 0.5m;
        decimal tax = baseAmount * Site.TAX_RATE * 0.2m;
        return baseAmount + tax;
    }
}

// after
public abstract class Site {
    // the template method: the steps are fixed here, the details are left to the children
    public decimal getBillableAmount() {
        decimal baseAmount = getBaseAmount();
        return baseAmount + getTaxAmount(baseAmount);
    }
    protected abstract decimal getBaseAmount();
    protected abstract decimal getTaxAmount(decimal baseAmount);
}
public class ResidentialSite : Site {
    protected override decimal getBaseAmount() {
        return _units * _rate;
    }
    // returns only the tax; getBillableAmount() adds it to the base amount
    protected override decimal getTaxAmount(decimal baseAmount) {
        return baseAmount * Site.TAX_RATE;
    }
}
public class LifelineSite : Site {
    protected override decimal getBaseAmount() {
        return _units * _rate * 0.5m;
    }
    protected override decimal getTaxAmount(decimal baseAmount) {
        return baseAmount * Site.TAX_RATE * 0.2m;
    }
}

Replace Constructor With Factory Method

Constructors are generally very simple. Signal complex logic by using a Factory.
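
A minimal sketch, assuming a hypothetical Temperature value class. The constructor only stores a value, while the named factory methods carry the conversion and validation logic.

```java
final class Temperature {
    private final double kelvin;

    // The constructor stays trivial and private.
    private Temperature(double kelvin) {
        this.kelvin = kelvin;
    }

    // Factory methods have names, so each one can say what it expects.
    static Temperature fromKelvin(double kelvin) {
        if (kelvin < 0) {
            throw new IllegalArgumentException("below absolute zero");
        }
        return new Temperature(kelvin);
    }

    static Temperature fromCelsius(double celsius) {
        return fromKelvin(celsius + 273.15);
    }

    double kelvin() {
        return kelvin;
    }
}
```

Two constructors that both take a single double could not be told apart by signature; named factory methods avoid that problem as well.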

Encapsulate Downcast

If a Method's callers commonly have to perform a downcast on the return value, the Method should handle that itself.
(This is mostly a problem in Java before generics, when collections could only hold Objects.)

To generalize this: If a Method's callers commonly have to perform a transformation on the return value, the Method should handle that itself.

Example:


// before
Object lastReading() {
    return readings.lastElement();
}

// after
Reading lastReading() {
    return (Reading)readings.lastElement();
}

Extract Class

One class is doing the work of multiple classes. Extract part of the data and logic into a new class.

When to do this?
The class does not have a Single Responsibility (or Single Reason to Change).
The class contains private logic that you want to be able to unit test.
The class contains a subset of data and logic that forms a cohesive concept with a clear name.

Make sure the new class is well-named (Self-Documenting Code and Intention-Revealing Interface).
Make sure the new class contains all the cohesive data and logic needed for its calculations (Single Responsibility).

There is no minimum or maximum size for objects, provided they are internally cohesive and externally decoupled.

Inline Class

The opposite of Extract Class.

When to do this?
The class is not worth maintaining separately.
The class merely delegates all its operations to one other class.
The class is not an important domain concept.

Extract Subclass

One class has data and behavior that only applies in some cases.
Pull that into a child Class. The original Class remains concrete.

Extract Superclass

Multiple classes have data and behavior in common.
Move their shared data and behavior into a new parent Class from which they will all inherit.

Only do this if the Classes have strong shared concepts.
For instance, don't combine Person and Institution just because they both have Address data and behavior.

Extract Hierarchy

A Class is doing too much work, much of it divided by conditional logic.
Divide the conditional logic into child Classes.

Extract Interface

When to do this?
Several clients use the same subset of a Class's interface, and it forms a conceptual whole.
Multiple Classes share a subset of their interface, and it forms a conceptual whole.

Extracting an interface is a good way of delineating a use case for a Class that might be used in many different ways.

Example:


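// before (method bodies omitted; return types are illustrative)
public class Employee {
    public int getRate() { ... }
    public bool hasSpecialSkill() { ... }
    public string getName() { ... }
    public string getDepartment() { ... }
}

// after
public interface IBillable {
    int getRate();
    bool hasSpecialSkill();
}
public class Employee : IBillable {
    public int getRate() { ... }
    public bool hasSpecialSkill() { ... }
    public string getName() { ... }
    public string getDepartment() { ... }
}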

Collapse Hierarchy

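Opposite of Extract Subclass. A parent Class and a child Class are no longer different enough to justify keeping both, so merge them into one.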

Replace Inheritance With Delegation

A child Class uses only part of the parent Class interface, or does not use the parent Class data.

Example:


// before
public class Vector {
}
public class Stack : Vector { 
}

// after
public class Vector {
}
public class Stack {
    private Vector _vector;
}
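
A runnable Java version of the "after" shape, as a sketch. DelegatingStack is a hypothetical name; it holds a java.util.Vector rather than extending it, so callers see only the stack operations.

```java
import java.util.Vector;

// Hypothetical sketch: the stack holds a Vector instead of extending it,
// so callers see only push/pop/isEmpty and not the rest of Vector's interface.
class DelegatingStack<T> {
    private final Vector<T> vector = new Vector<>();

    void push(T item) { vector.add(item); }

    // Removes and returns the most recently pushed item.
    T pop() {
        if (vector.isEmpty()) {
            throw new IllegalStateException("stack is empty");
        }
        return vector.remove(vector.size() - 1);
    }

    boolean isEmpty() { return vector.isEmpty(); }
}
```

With inheritance, a caller could insert into the middle of the stack through Vector's methods; with delegation that is no longer possible.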

Replace Delegation With Inheritance

When a Class is constantly delegating to its Member, maybe it ought to inherit from that Member's Class instead.

Opposite of Replace Inheritance with Delegation.

Hide Delegate

Client calls Class A and related Class B directly.
Instead, have Client call Class A only, and Class A can delegate some of the calls to Class B.

When to do this?
Class B is never called without an attached Class A - an indication that Class A is more important conceptually than B.
Class B is naturally a part of Class A's Aggregate (see Domain Driven Design) and therefore should not be called directly.
You want to present a smaller interface to the Client.

The example in the book involves a bidirectional link between Class A and Class B.
This can be extended with examples from Domain Driven Design: once Class B is accessed through Class A, that indicates that the bidirectional link can be simplified to a directional link from Class A to Class B.
See Change Bidirectional Association to Unidirectional.
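
A minimal Java sketch of Hide Delegate. TeamMember plays the role of Class A and Department the role of Class B; both names are hypothetical.

```java
// "Class B" in the notes: the delegate the client should no longer touch directly.
class Department {
    private final String manager;

    Department(String manager) { this.manager = manager; }

    String getManager() { return manager; }
}

// "Class A" in the notes: the only class the client talks to.
class TeamMember {
    private final Department department;

    TeamMember(Department department) { this.department = department; }

    // Before: clients wrote member.getDepartment().getManager().
    // After: TeamMember forwards the call, and Department stays hidden.
    String getManager() { return department.getManager(); }
}
```

The client now depends on one class, and the link runs in one direction only, from TeamMember to Department.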

Remove Middle Man

The opposite of Hide Delegate.

Introduce Local Extension

You want to add a lot of custom functionality around a library Class (a class outside your control).
Inherit from the library Class or (if it is Sealed) write a Wrapper for it.

When to do this?
There is a significant amount of custom functionality you want in this library Class.
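
A Java sketch of the Wrapper variant. java.time.LocalDate is final (the Java counterpart of a Sealed class), so it cannot be inherited from. BusinessDate and its methods are hypothetical names invented for this example.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;

// LocalDate is final, so it cannot be subclassed; this hypothetical wrapper
// holds one and adds the custom behavior around it.
class BusinessDate {
    private final LocalDate date;

    BusinessDate(LocalDate date) { this.date = date; }

    LocalDate toLocalDate() { return date; }

    boolean isWeekend() {
        DayOfWeek day = date.getDayOfWeek();
        return day == DayOfWeek.SATURDAY || day == DayOfWeek.SUNDAY;
    }

    // Returns the next date that is not a Saturday or Sunday.
    BusinessDate nextBusinessDay() {
        BusinessDate next = new BusinessDate(date.plusDays(1));
        while (next.isWeekend()) {
            next = new BusinessDate(next.date.plusDays(1));
        }
        return next;
    }
}
```

All the custom date logic lives in one place instead of being scattered across callers as helper methods.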

Moving Behavior Into The Class

Look at where your Class's getter methods are used. Is there logic outside the Class that should be moved into it?

Ex: Class B gets a collection from Class A and counts how many elements match a standard criterion. Instead, Class A should provide a Method for this.
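
A short Java sketch of that example. Course stands in for Class A; the class name and the passing threshold of 60 are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical "Class A": owns the collection and the counting logic.
class Course {
    private final List<Integer> grades = new ArrayList<>();

    void addGrade(int grade) { grades.add(grade); }

    // Before: callers fetched the grade list and counted passing grades themselves.
    // After: the standard criterion (grade >= 60, an assumed threshold) lives here.
    int passingCount() {
        int count = 0;
        for (int grade : grades) {
            if (grade >= 60) {
                count++;
            }
        }
        return count;
    }
}
```

Once no caller needs the raw list, the getter for it can be removed entirely.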

Replace Subclass With Fields

Opposite of Replace Type Code with Subclasses.

Introduce Null Object

Given that you frequently have to check if objects of Class A are null or not before operating on them, create a child Class B that is the "null object".
B can implement all operations of A such that it does not cause errors if you try to use them.
This is a form of Polymorphism - instead of checking a conditional before using behavior, you just use the behavior.

Now you never need to check for null.

As a bonus, you can make NullWidget a Singleton to save memory.

Example:


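public class Widget {
    // Returns the null object when no real Widget applies ("conditional" is a placeholder).
    public static Widget Factory() {
        if(conditional)
            return new NullWidget();
        return new Widget();
    }
    public virtual int SomeOperation() {
        return calculation();
    }
}
public class NullWidget : Widget {
    // Safe default, so callers do not need a null check.
    public override int SomeOperation() {
        return 0;
    }
}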

An example is when Widget needs to display data involving 10 possibly null Properties. If each of them has a Null Object implemented, there is very little conditional logic needed to display everything correctly.

Pull Up Field

All sibling child Classes contain the same Field.
Raise this Field to the parent Class.

Pull Up Method

All sibling child Classes contain the same Method.
Raise this Method to the parent Class.

Pull Up Constructor Body

All sibling child Classes contain the same Constructor.
Raise this Constructor to the parent Class.

This may involve the child Classes passing some parameters to the new base Constructor.
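
A Java sketch of the result, with hypothetical Staff, Manager and Contractor classes. The shared field assignments move into the parent Constructor, and each child passes the common parameters up.

```java
// Hypothetical hierarchy: both children used to assign name and id themselves.
abstract class Staff {
    private final String name;
    private final String id;

    // The shared constructor body now lives in the parent.
    protected Staff(String name, String id) {
        this.name = name;
        this.id = id;
    }

    String getName() { return name; }
    String getId() { return id; }
}

class Manager extends Staff {
    private final int grade;

    Manager(String name, String id, int grade) {
        super(name, id);     // pass the common parameters up to the base constructor
        this.grade = grade;  // keep only the child-specific assignment here
    }

    int getGrade() { return grade; }
}

class Contractor extends Staff {
    Contractor(String name, String id) {
        super(name, id);
    }
}
```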

Push Down Field

Class A's field is irrelevant to most of the child Classes.
Push the field down to the child Classes that use it.

Push Down Method

Class A's method is irrelevant to most of the child Classes.
Push the method down to the child Classes that use it.

Tease Apart Inheritance

An inheritance hierarchy is doing more than one task.
A sign of this is combinatorial names.

See Bridge Design Pattern.

Example:


// before
public class Deal { }
public class ActiveDeal : Deal { }
public class TabularActiveDeal : ActiveDeal {}

public class PassiveDeal : Deal { }
public class TabularPassiveDeal : PassiveDeal { }

// after
public class Deal {
    private PresentationStyle _style;
}
public class ActiveDeal : Deal { }
public class PassiveDeal : Deal { }

public class PresentationStyle { }
public class SinglePresentationStyle : PresentationStyle { }
public class TabularPresentationStyle : PresentationStyle { }

Convert Procedural Design To Objects

You have procedural code and want to restructure it into object-oriented code.

This is a nebulous task.

[Defunctionalization at Work by Danvy, Nielsen]
[Defunctionalization: Everybody does it, Nobody talks about it]

Definitions

Higher Order Function: a function which (A) accepts a function as a parameter AND/OR (B) returns a function.

First Order Function: a function that is not a higher order function.

Defunctionalization: the transformation of higher order functions into first order functions that perform the equivalent work but do not accept function-type parameters.

Direct Form (or Functionalized Form): the use of higher order functions instead of first order functions.

Ex: Filters

Higher Order Function: a list-filtering function that accepts the conditional filter as a parameter.


// requires: using System.Linq;
public List<T> Filter<T>(List<T> list, Func<T, bool> filter)
{
    // returns only elements that pass the filter
    return list.Where(filter).ToList();
}

First Order Function: defines several commonly-used filters as options.


public List<T> Filter<T>(List<T> list, FilterEnum filter)
{
    // runs a predefined filter based on the enum value passed in
}

To make this refactor in your own code, find every call site of Filter and move each condition into its own function, selected by the enum value.

If the conditions require passing in variables (such as a "less than X" filter), then instead of an enum, define a data type with a subtype for each condition. Filter then takes a value of that type instead of the enum.
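
A Java sketch of the data-type variant, limited to integer lists to keep it short. FilterSpec, IsEven, LessThan and Filters are hypothetical names; LessThan carries the X of a "less than X" filter as plain data.

```java
import java.util.ArrayList;
import java.util.List;

// Each subtype is plain data describing one filter; none of them holds a function.
abstract class FilterSpec { }

class IsEven extends FilterSpec { }

class LessThan extends FilterSpec {
    final int limit;

    LessThan(int limit) { this.limit = limit; }
}

class Filters {
    // The single "apply" function interprets the data type.
    static boolean matches(FilterSpec spec, int value) {
        if (spec instanceof IsEven) {
            return value % 2 == 0;
        }
        if (spec instanceof LessThan) {
            return value < ((LessThan) spec).limit;
        }
        throw new IllegalArgumentException("unknown filter: " + spec);
    }

    // First order: takes a description of the filter, not a function.
    static List<Integer> filter(List<Integer> list, FilterSpec spec) {
        List<Integer> result = new ArrayList<>();
        for (int value : list) {
            if (matches(spec, value)) {
                result.add(value);
            }
        }
        return result;
    }
}
```

A call such as Filters.filter(numbers, new LessThan(3)) passes only data. Supporting a new filter means adding a subtype and another branch in matches.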

Ex: Recursion

You can turn a recursive operation into an iterative one with an explicit stack.

For instance, to search a binary tree without recursion:
- keep a list of "pending" nodes, initialized with the root node
- while the list is not empty, process the next "pending" node
- processing means removing the node from "pending", running the search, and appending any child nodes to the "pending" list
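
The steps above can be sketched in Java as follows. TreeNode and TreeSearch are hypothetical names, the tree is a plain binary tree (not assumed to be sorted), and java.util.ArrayDeque serves as the explicit stack of "pending" nodes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TreeNode {
    final int value;
    final TreeNode left;
    final TreeNode right;

    TreeNode(int value, TreeNode left, TreeNode right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }
}

class TreeSearch {
    // Returns true if any node in the tree holds the target value.
    static boolean contains(TreeNode root, int target) {
        Deque<TreeNode> pending = new ArrayDeque<>();
        if (root != null) {
            pending.push(root);
        }
        while (!pending.isEmpty()) {
            TreeNode node = pending.pop();   // remove the node from "pending"
            if (node.value == target) {      // run the search on it
                return true;
            }
            // Add any child nodes to "pending" (ArrayDeque does not accept nulls).
            if (node.right != null) pending.push(node.right);
            if (node.left != null) pending.push(node.left);
        }
        return false;
    }
}
```

Pushing the right child before the left means the left subtree is visited first, matching the order of a typical recursive version.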

Pros

It is difficult to serialize a function and send it across a network. Defunctionalized code passes plain data instead, which helps when building distributed systems.

Cons

It is harder to add a new implementation/option.

Considerations for changing the functionality of legacy code.

Chesterton's Fence

Chesterton's fence is the principle that reforms should not be made until the reasoning behind the existing state of affairs is understood. The quotation is from G. K. Chesterton's 1929 book "The Thing", in the chapter entitled "The Drift from Domesticity":

In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, "I don't see the use of this; let us clear it away." To which the more intelligent type of reformer will do well to answer: "If you don't see the use of it, I certainly won't let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it."

Brainstorming common terminology that can be used when naming a class:

[verb forms] - a lot of business logic is process (verbs) rather than records (nouns) - use verbs for process classes

Node - in a network, or in a tree - a unit of a data structure

Record - a database record, or a very general term

LifeCycle - process management

Policy - a business policy raised to first-class level

Rule - a business rule raised to first-class level, a rule is a sub-unit of a policy

Process

Flow

Agent

Mediator - manages (brings together) work from many other code elements - contains little logic of its own

Client - manages communication with another service or module