What is TDD (and red-green-refactor)?

How will you know your program does what it’s supposed to do before it goes to production? TDD and red-green-refactor can help!

What brought you here? An upcoming meetup or class assignment, where you'll be running through a kata using TDD? Maybe you're starting a new job where they use it? Whatever the case, you might be wondering what to expect when you sit down and try out TDD for the first time.

Let’s define some terms first

So what exactly are unit testing, TDD, and red-green-refactor?

(Everyone’s experience level is different, so feel free to skip anything you’re already familiar with.)

What is a Unit Test?

When you create a program, you should be thinking about how you’ll know it’s right when it’s done. How will you make sure it’s doing what it’s supposed to do, before it goes to production?

You could test it manually. That’s what we usually do with coursework in school… run it with one input and see what happens. Run it with other inputs and see what happens. Walk through the code trying to figure out what’s going on. But as your program grows from a few methods to thousands of methods across hundreds of classes, that quickly becomes impossible.

So we put the computer to work, testing itself. These automated tests come in all kinds of flavors – from testing one piece, to testing the entire system end to end, testing just the UI, load testing, etc.

The most basic kind of automated test is a unit test. You test a single piece of functionality in a single method. You completely control the environment around that method, so that your test doesn’t inadvertently hit any external resources like the database, a network drive, or even the local disk.

If it’s possible for the database to go down, or the network to become inaccessible, or to accidentally try writing to an unauthorized location on disk, your test could fail for a reason outside of your control, and that’s no good. There’s another concept called programming to the interface that helps avoid these pitfalls, but that’s too much for this post.

Unit tests and isolation go hand-in-hand. You test one method with specific, known inputs, and check that the output is exactly what you thought it would be.

Let’s run through an example…

You’ve been tasked with making a calculator, and the first feature is a “Divide” capability. The method takes two numbers, divides them, and returns the quotient. Ground-breaking, right?

You might start with a method like the following (all my examples are in C#, but hopefully it’s obvious what this is doing even if you’re not familiar with C#), then run it and manually test some numbers. Whatever you throw at it seems to work okay, so you call it a day. Well, thanks for reading, have a good one!

public class Calculator
{
    public decimal Divide(decimal dividend, decimal divisor)
    {
        return dividend / divisor;
    }
}

…… oh wait, skip 6 months ahead, you’re on to other projects, and the users have decided they need to multiply too (so demanding). Also, in the meantime, adding and subtracting have been implemented by someone else, and now the code is intermittently spewing unexpected results and blowing up at runtime. Time to manually run numbers through again. Better yet, manually run numbers through all the methods, every time someone builds it! (That’s ridiculous. Don’t do it manually.)

The point is, the “Divide” method is now failing and no one knows exactly why, or even when it might have started. Your users like to add and subtract, but they rarely divide. All your fine work, unappreciated. :p

That’s a silly example, but on a larger scale, we run into it all the time when we’re developing large applications that dozens of programmers have contributed to over a span of years.

It’s time to start automating your tests. You could add a second class full of methods that instantiate the Calculator, run some numbers through it, and return a value indicating whether or not the expected answer and actual answer match up. The following two “test” methods return true if the answers match; false otherwise.

public class CalculatorTests
{
    Calculator c = new Calculator();
    public bool Is6DividedBy3EqualTo2()
    {
        var quotient = c.Divide(6, 3);
        if (quotient == 2)
            return true;
        else
            return false;
    }
    public bool Is9DividedBy2EqualTo4Point5()
    {
        var quotient = c.Divide(9, 2);
        if (quotient == 4.5m)
            return true;
        else
            return false;
    }
}

You could even move that class into its own project, and use the magic of reflection to run all the tests in your test class, check the return values, and display a list of tests that failed.

Here’s a console application that does just that, but it may be hard to follow if you’re unfamiliar with C#, and certainly isn’t the way I’d recommend running the tests, so don’t spend too much time on it. :)

using System;
using System.Collections.Generic;
using System.Linq;

public class Program
{
    public static void Main()
    {
        var cTests = new CalculatorTests();
 
        var failedTests = new List<string>();
 
        // using reflection, run every test method and record the names of those methods that fail
        foreach (var m in typeof (CalculatorTests).GetMethods()
                                                  .Where(m => m.DeclaringType != typeof (object)))
        {
            if (Convert.ToBoolean(m.Invoke(cTests, null)) != true)
                failedTests.Add(m.Name);
        }
 
        // display all the failed tests, or a message that everything passed
        if (failedTests.Any())
            Console.WriteLine("Failed Tests: \r\n\r\n{0}", string.Join("\r\n", failedTests));
        else
            Console.WriteLine("All tests passed!");
 
        Console.ReadLine();
    }
}

When everything passes, it prints “All tests passed!” If I change my inputs to produce invalid results, the tests fail and their names are listed instead.

So, this is better than manual, but still ridiculous.

Instead, you’d use mature testing tools like NUnit and xUnit (.NET languages like C#, F#, VB.NET), JUnit (Java), Minitest (Ruby), etc. There are frameworks for different kinds of tests in nearly every language, and they make running the tests easy (individually or all at once), and tell you loads more about what specifically may have gone wrong.

I’ll stop there for now. Hopefully that gives you at least a starting idea of what a unit test is. Just remember… small, isolated, tests one thing at a time, tightly controlled. (Find the code on .NET Fiddle)

What is TDD and Red-Green-Refactor?

In the last section, we wrote the “Divide” method first, and then wrote the tests to validate it much later. This is common in legacy code that was written without any tests originally.

TDD, or test-driven development, flips this around. We write the tests before writing the code. In essence, the tests should drive the program, stating what the code should do.

Let’s define our method again, but just enough to make the code compile. It doesn’t take the inputs into account at all, and certainly isn’t implemented correctly.

public decimal Divide(decimal dividend, decimal divisor)
{
    return 0;
}

Now we write a test that states exactly what we expect this code to do. I’m going to switch over to NUnit syntax now, which should still be easy enough to follow, but is more likely to match what you might see when you’re doing your own testing. NUnit is available in Visual Studio via NuGet.

Of course, being programmers, we obsess over every detail, including how we name our tests. There are different opinions, and you can view several here and here, but I’ll follow a naming convention that specifies what we’re testing, the expected result, and when that result should occur (aka MethodName_ExpectedBehavior_StateUnderTest in the second article linked above).

[TestFixture]
public class CalculatorTests
{
    Calculator c;
 
    [SetUp]
    public void Setup()
    {
        c = new Calculator();
    }
 
    [Test]
    public void Divide_Returns2_WhenDividing6By3()
    {
        var quotient = c.Divide(6, 3);
 
        Assert.IsTrue(quotient == 2);
    }
 
    [Test]
    public void Divide_Returns4_5_WhenDividing9By2()
    {
        var quotient = c.Divide(9, 2);
 
        Assert.IsTrue(quotient == 4.5m);
    }
}

Feel free to leave a comment below if you want clarification on anything so far.

A couple quick notes regarding the above code…

The method marked with the “SetUp” attribute runs before every single test. By creating a new instance of Calculator before each test, we isolate our tests from one another (remember, isolation is good – we don’t want one test modifying some values in the lone instance of Calculator, and then the next test failing due to those changed values).

Also, the methods aren’t returning a value anymore. The “Assert” class and its methods capture the result of the test and report it to us. Most testing libraries will have similar methods built-in.
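
To make those two notes a bit more concrete, here’s a minimal sketch of why a fresh instance per test matters. It assumes NUnit 3.x (the same classic Assert methods used in this post), and the Tally class is a hypothetical stand-in I made up for illustration, since the Calculator doesn’t hold any state that one test could spoil for another.

```csharp
using NUnit.Framework;

// Hypothetical class that keeps state between calls (not part of the calculator).
public class Tally
{
    public int Total { get; private set; }

    public void Add(int amount)
    {
        Total += amount;
    }
}

[TestFixture]
public class TallyTests
{
    private Tally tally;

    // Runs before every test, so each test starts from a Total of 0.
    [SetUp]
    public void Setup()
    {
        tally = new Tally();
    }

    [Test]
    public void Add_SetsTotalTo5_WhenAdding5ToNewTally()
    {
        tally.Add(5);

        Assert.AreEqual(5, tally.Total);
    }

    [Test]
    public void Add_SetsTotalTo3_WhenAdding3ToNewTally()
    {
        tally.Add(3);

        // Would be 8 if this test reused the instance from the test above and ran after it.
        Assert.AreEqual(3, tally.Total);
    }
}
```

If the Tally were created once and shared, one of these tests could pass or fail depending on the order they happened to run in, which is exactly the kind of surprise isolation is meant to prevent.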

The above can actually be further shortened in NUnit, using the “TestCase” attribute to combine similar tests. It’s not pertinent to a discussion of TDD, but I’ll include it here in case you’re interested. The test method has been updated to accept parameters, which we pass in when we run the tests.

[TestCase(6, 3, 2,   Description = "6 / 3 = 2")]
[TestCase(9, 2, 4.5, Description = "9 / 2 = 4.5")]
public void Divide_ReturnsExpectedQuotient(decimal dividend, decimal divisor, decimal expectedQuotient)
{
    var actualQuotient = c.Divide(dividend, divisor);
 
    Assert.AreEqual(expectedQuotient, actualQuotient);
}

The term “Red-Green-Refactor” is closely tied to TDD.

When we first run our unit tests, the tests are going to fail. The “Divide” method is returning 0 in all cases, so the initial state of our tests is red. Notice how NUnit tells us what the expected and actual values were, and provides a stack trace and some other useful info.

We can now fix the original “Divide” method, changing it to return dividend / divisor again. Now the tests pass (and are green).

Alright, now a new requirement comes in from the users. If the quotient is negative, make it positive before returning it. It’s weird, but luckily you’re the type who goes with the flow.

Let’s write another test to indicate the expected behavior (-10 / 5 should return 2, not -2) and watch the first two test cases fail:

[TestCase(-10,  5, 2, Description = "-10 / 5 = 2")]
[TestCase( 10, -5, 2, Description = "10 / -5 = 2")]
[TestCase(-10, -5, 2, Description = "-10 / -5 = 2")]
public void Divide_ReturnsPositiveQuotient_WhenInput(decimal dividend, decimal divisor, decimal expectedQuotient)
{
    var actualQuotient = c.Divide(dividend, divisor);
 
    Assert.AreEqual(expectedQuotient, actualQuotient);
}

Now we fix the code to make the tests pass (green) again.

public decimal Divide(decimal dividend, decimal divisor)
{
    // make negative dividends positive
    if (dividend < 0)
        dividend = -dividend;
    
    // make negative divisors positive
    if (divisor < 0)
        divisor = -divisor;
 
    return dividend / divisor;
}

The refactoring part can come in at any point where our tests are passing.

When we’re confident that our code is working as it should (because our tests pass), we’re free to refactor the code as we see fit.

Replace the above code with the .NET-provided Math.Abs function and run the tests again. They pass, so the changes didn’t break anything.

public decimal Divide(decimal dividend, decimal divisor)
{
    return Math.Abs(dividend / divisor);
}

Things continue this way: you write tests that state what the program should do, watch them fail (most likely), and then fix up the code to make them pass.

I’ll go through one more.

Now someone comes along and says, “Wow, look what happens if I divide by 0!” And some kittens get swallowed into a swirling vortex. Cute ones. Really unfortunate.

The users decide they don’t want to throw an exception… they want to return 0 when the divisor is 0, no matter what the dividend is. We need another test.

[TestCase(5,  Description = "5 / 0 = 0")]
[TestCase(0,  Description = "0 / 0 = 0")]
[TestCase(-5, Description = "-5 / 0 = 0")]
[Test]
public void Divide_ReturnsZero_WhenDivisorIsZero(decimal input)
{
    var actualQuotient = c.Divide(input, 0);
 
    Assert.AreEqual(0, actualQuotient);
}

The new tests fail, because the current method throws a DivideByZeroException when the divisor is 0.

Time to go green again. We could catch that particular exception, but (in the .NET world at least) exceptions are expensive and it’s better to prevent one if possible. After all, if you can anticipate a condition and code around it, it’s not all that exceptional.

public decimal Divide(decimal dividend, decimal divisor)
{
    if (divisor == 0)
        return 0;
 
    return Math.Abs(dividend / divisor);
}

Run the tests again… good to go! Now if you wanted, you could go back and change the code, maybe catch the DivideByZeroException instead, and the tests will still pass, letting you know you haven’t broken anything.
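
For the curious, that alternative might look something like the sketch below. It’s just one way to write it, not a recommendation; the behavior is meant to be the same as the version above, so the existing tests should pass against either one.

```csharp
using System;

public class Calculator
{
    public decimal Divide(decimal dividend, decimal divisor)
    {
        try
        {
            // decimal division by zero throws a DivideByZeroException
            return Math.Abs(dividend / divisor);
        }
        catch (DivideByZeroException)
        {
            // the users asked for 0 instead of an exception
            return 0;
        }
    }
}
```

Swapping implementations like this is the whole point of the “refactor” step: the tests describe the behavior, and they don’t care how the method gets there.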

Clear as mud? If you need clarification, leave a comment below! (Find the code in this section on .NET Fiddle.)

What is Pair Programming?

Pair programming is pretty much what it sounds like. Two heads are better than one.

If you’ve ever asked a fellow programmer for help and then you both sat together figuring a problem out, you’ve pair programmed. There are two ways people see this:

Some people see this as a waste of resources – two salaries with half the output
Other people see this as collaboration – instead of two people getting stuck on individual tasks or getting distracted by the latest cat videos, they can bounce ideas off each other and keep driving ahead

Some advocates take it to the extreme, and do nothing but pair programming, all day every day. I’ve never tried it, so I can’t say much about it. Except that I hope no one gets paired with a Wally.

How does this all fit with Code Katas?

When you pair up during a code kata, you learn from each other, and discuss problems as they arise.

You’ll basically “ping-pong” back and forth, in a pattern like this:

You write a unit test that fails (red), then pass control of the keyboard to your partner.
She modifies the code to make your test pass (green), then writes a unit test of her own (to indicate what the program should do next), which also fails (red again). She passes control back to you.
You write more code to make her test pass (green), then write the next (failing) test (red!). And so on…

This continues, until time is up. That’s right, there’s usually a time limit. It could be a half-hour, or maybe the length of the user group meeting.

You see, you aren’t really aiming to complete the kata, though with some shorter ones you certainly will. You’re more focused on following a disciplined process:

What’s the next thing our program should be able to do?
What kind of test can we write to reflect that requirement?
Test fails. How can we modify the code, to make the test pass? (the requirement’s been met)
Test passes. How can we refactor the code to make it cleaner or more efficient? (make sure the tests still pass)
Repeat

Can we run through a sample kata?

Sure. Glad you asked.

A popular (and short) one is the Fizz Buzz kata: (there are plenty more at cyber-dojo.org)

Write a program that prints the numbers from 1 to 100. But for multiples of three print “Fizz” instead of the number and for the multiples of five print “Buzz”. For numbers which are multiples of both three and five print “FizzBuzz”.

It helps to list out the requirements, if the kata doesn’t do so already:

Print the numbers from 1 to 100.
If the number is a multiple of 3, print “Fizz” instead.
If the number is a multiple of 5, print “Buzz” instead.
If the number is a multiple of 3 and 5, print “FizzBuzz” instead.

Now we have 4 distinct steps, and we can begin writing tests for these.

Step 1: Return the same number

We need a method that accepts a number, and (for now) spits it back out. Let’s start with a basic method that accepts a number and outputs an empty string, so we can compile.

public class FizzBuzz
{
    public string FizzyOutput(int input)
    {
        return "";
    }
}

Let’s just test 1 and 2 to start. Of course the tests fail, because we always return an empty string:

[TestFixture]
public class FizzBuzzTests
{
    private FizzBuzz fizzBuzz;
 
    [SetUp]
    public void Setup()
    {
        fizzBuzz = new FizzBuzz();
    }
 
    [Test]
    public void FizzyOutput_OutputsOne_WhenInputIsOne()
    {
        var output = fizzBuzz.FizzyOutput(1);
 
        Assert.AreEqual("1", output);
    }
 
    [Test]
    public void FizzyOutput_OutputsTwo_WhenInputIsTwo()
    {
        var output = fizzBuzz.FizzyOutput(2);
 
        Assert.AreEqual("2", output);
    }
}

Now you’d pass the keyboard to your pair to fix the code and make the test pass.

public string FizzyOutput(int input)
{
    return input.ToString();
}

Step 2: Return “Fizz” for multiples of 3

The second requirement is to print “Fizz” for multiples of 3. Your partner could create multiple tests, or with a tool like NUnit, use the TestCase attribute. Run the new test and watch it fail. After all, we’re still returning the same number no matter what.

[TestCase(3)]
[TestCase(6)]
[TestCase(9)]
[Test]
public void FizzyOutput_OutputsFizz_WhenInputIsMultipleOfThree(int input)
{
    var output = fizzBuzz.FizzyOutput(input);
 
    Assert.AreEqual("Fizz", output);
}

To get this to pass, you could do something silly, like return “Fizz” for exactly the specified inputs. Or take a more practical approach that handles any multiple of 3. Always run the tests again when you’re done, to make sure they pass.

public string FizzyOutput(int input)
{
    if (input % 3 == 0)
        return "Fizz";
 
    return input.ToString();
}
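
For contrast, the “silly” version mentioned above might look like the sketch below. The class name HardCodedFizzBuzz is made up for illustration. It turns the three test cases green, but only because it knows exactly which inputs the tests use.

```csharp
// Hypothetical "just make it pass" version, hard-coded to the inputs in the test cases.
public class HardCodedFizzBuzz
{
    public string FizzyOutput(int input)
    {
        // Only 3, 6 and 9 are handled; 12 would still come back as "12".
        if (input == 3 || input == 6 || input == 9)
            return "Fizz";

        return input.ToString();
    }
}
```

In a kata, writing something like this on purpose can be a playful way to nudge your partner into writing a better test next (say, one for 12).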

Step 3: Return “Buzz” for multiples of 5

Now you write the next test, indicating that multiples of 5 return “Buzz”, and pass the keyboard again.

[TestCase(5)]
[TestCase(10)]
public void FizzyOutput_OutputsBuzz_WhenInputIsMultipleOfFive(int input)
{
    var output = fizzBuzz.FizzyOutput(input);
 
    Assert.AreEqual("Buzz", output);
}

The test fails, and your pair fixes it in a very similar manner to the previous requirement.
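
A minimal sketch of what the method might look like at this point, assuming your pair simply adds a second check after the one for 3:

```csharp
public class FizzBuzz
{
    public string FizzyOutput(int input)
    {
        if (input % 3 == 0)
            return "Fizz";

        // new: multiples of 5
        if (input % 5 == 0)
            return "Buzz";

        return input.ToString();
    }
}
```

Note that 15 returns “Fizz” with this version, since the check for 3 comes first. No test covers that case yet, so everything is still green.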

Step 4: Return “FizzBuzz” for multiples of 15

One last requirement, and your partner writes a test for it… multiples of 3 and 5:

[TestCase(15)]
[TestCase(30)]
[TestCase(45)]
public void FizzyOutput_OutputsFizzBuzz_WhenInputIsMultipleOfThreeAndFive(int input)
{
    var output = fizzBuzz.FizzyOutput(input);

    Assert.AreEqual("FizzBuzz", output);
}

Your turn to finish up the kata, and you do it by checking for multiples of 15:

public string FizzyOutput(int input)
{
    if (input % 15 == 0)
        return "FizzBuzz";
 
    if (input % 3 == 0)
        return "Fizz";
 
    if (input % 5 == 0)
        return "Buzz";
 
    return input.ToString();
}

Run the tests one more time and verify they all pass.
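
The kata itself asks for the numbers from 1 to 100 to be printed, which FizzyOutput doesn’t do on its own. Here’s a rough sketch of how the finished method could be driven to do that. FizzBuzzRunner and its two methods are hypothetical names I’m adding for illustration; they aren’t part of the kata code above.

```csharp
using System.Collections.Generic;
using System.IO;

public class FizzBuzz
{
    public string FizzyOutput(int input)
    {
        if (input % 15 == 0)
            return "FizzBuzz";

        if (input % 3 == 0)
            return "Fizz";

        if (input % 5 == 0)
            return "Buzz";

        return input.ToString();
    }
}

// Hypothetical helper that runs a range of numbers through FizzyOutput.
public static class FizzBuzzRunner
{
    // Yields one output string per number, from "from" to "to" inclusive.
    public static IEnumerable<string> Lines(int from, int to)
    {
        var fizzBuzz = new FizzBuzz();

        for (var i = from; i <= to; i++)
            yield return fizzBuzz.FizzyOutput(i);
    }

    // Writes the outputs for 1 to 100, one per line.
    public static void PrintAll(TextWriter writer)
    {
        foreach (var line in Lines(1, 100))
            writer.WriteLine(line);
    }
}
```

Calling FizzBuzzRunner.PrintAll(Console.Out) from a console app’s Main would print the whole sequence. Keeping the loop separate from FizzyOutput means the interesting logic stays easy to unit test, one number at a time.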

Although I think the above tests are sufficient, you could test for every value just to be sure:

[TestCase(1, "1")]
[TestCase(2, "2")]
[TestCase(3, "Fizz")]
[TestCase(4, "4")]
[TestCase(5, "Buzz")]
// ....
[TestCase(14, "14")]
[TestCase(15, "FizzBuzz")]
[TestCase(16, "16")]
// ...
[TestCase(95, "Buzz")]
[TestCase(96, "Fizz")]
[TestCase(97, "97")]
[TestCase(98, "98")]
[TestCase(99, "Fizz")]
[TestCase(100, "Buzz")]
public void FizzyOutput_OutputsExpectedValues(int input, string expectedOutput)
{
    var actualOutput = fizzBuzz.FizzyOutput(input);
 
    Assert.AreEqual(expectedOutput, actualOutput);
}

You can find the code for this FizzBuzz example on .NET Fiddle.

Final Thoughts

At this point, we could refactor again, since the tests are all passing. I don’t think there’s much left to do though. Do you see anything I missed? Typos?

I used C# because I’m most familiar with it, but you could pair up with someone who knows a different language. I’ve paired up a couple times to do katas in Ruby – it never hurts to learn something new (and meet someone new!), and to see how other programmers/languages approach testing.

There’s no pressure of business requirements or deadlines… just learning from others and sharing knowledge. And maybe having a good laugh when the time is up and you realize you’ve come up with a hideous solution (not that that ever happens).

Thanks for reading this far… I hope you learned something new or interesting.

Thoughts? Do you have questions or something to add? Let me know in the comments below!