Posts by Topic


Software Sin #1: Neglecting Encapsulation

There's a very good reason that Bjarne Stroustrup pushed C++ so hard...don't forget that reason.

In my new series Software Sins, I'll be exploring different software engineering failures–feature abuse and neglect, anti-patterns, design flaws, and the like–and ways to rectify them.

I'll begin with a story of selecting the wrong programming paradigm for a project...


Growing up, I taught myself C. It was the first programming language that I learned thoroughly, so for a long time, I was used to thinking procedurally by default.

One of my first assignments at 128 Technology was to write tests for the router's command line interface to ensure that commands hadn't been broken between builds.

We didn't care about exact output validation, we just wanted to run all of the commands as quickly as possible and make sure that there were no catastrophic errors or strange regressions.

So I sat in a room for a few hours and developed a simple scheme to test each command. I opted to use a YAML file that listed each command along with some customizable test cases. It looked something like this:

"command_name <flag_arg> [keyword_arg] required_arg1 required_arg2...":
    enabled: (bool, default to True)
    test_cases:
        test_standard:
            enabled_flags:
                - flag_arg
            keyword_arguments:
                - keyword_arg: value
            arguments:
                - one
                - two
            expected_result: (success, failure/<reason>)

For a hypothetical show router-info CLI command, the command "schema" with test cases would look like this:

"show router-info <verbose> [filter] router_name":
    test_cases:
        test_router_only:
            arguments:
                - corporate_router
            expected_result: success
        test_bad_filter:
            keyword_arguments:
                - filter: location
            arguments:
                - corporate_router
            expected_result: failure/bad_argument
        ...

In hindsight, there are ton of things I would have changed about it (for example, using a real schema for the commands instead of parsing the command string for argument identification and substitution), but it worked for our use case.

Right after I had gotten the OK, my C instincts kicked in, and I wrote a test runner that looked like this:

import logging
import pathlib

import yaml


SMOKE_TEST_PATH = pathlib.Path('src') / 'tests' / 'cli_smoke.yaml'


def run_smoke_tests():
    command_list = _load_tests(SMOKE_TEST_PATH)

    failures = []

    for command_name, command_definition in command_list.items():
        command_failures = _run_command(command_name, command_definition)
        failures.extend([
            failure for failiure in command_failures
            if failure is not None
        ])

    return failures


def _load_tests(test_path):
    with open(test_path, mode='r') as test_data:
        return yaml.load(test_data)


def _run_command(command_name, command_definition):
    if not command['enabled']:
        logging.warn(f'Command "{command_name}" is disabled')
        return

    return [
        _run_test_case(test_case_name, test_case_definition, command_name)
        for test_case_name, test_case_definition
        in command_definition['test_cases'].items()
    ]


def _run_test_case(test_case_name, test_case_definition, command_name):
    # There's a lot to do here. Not only do we have to run the tests,
    # but we must validate the data, construct the command string,
    # and return the result. The code gets much sloppier from here...
    ...

The code worked, and it did everything it was supposed to, but there are a few issues here and even more in the segment I didn't rewrite:

All of the problems boiled down to the fact that I was interacting with raw YAML data instead of encapsulating and abstracting the tests into objects with their own operations.

Many months later, I refactored it to use objects, but it wasn't a trivial effort:

import pathlib
import weakref

import yaml


SMOKE_TEST_PATH = pathlib.Path('src') / 'tests' / 'cli_smoke.yaml'


def run_smoke_tests():
    tests = _construct_smoke_tests(
        _load_test_markup(SMOKE_TEST_PATH)
    )

    failures = []

    for command in tests:
        test_failures = command.run_test_cases()
        failures.extend([
            failure for failiure in test_failures
            if failure is not None
        ])

    return failures


def _load_test_markup(test_path):
    with open(test_path, mode='r') as test_data:
        return yaml.load(test_data)


def _construct_smoke_tests(test_yaml):
    return [
        Command(command_name, command_def)
        for command_name, command_def
        in test_yaml
    ]


class Command:

    def __init__(self, name, enabled, test_cases):
        self.name = None
        self._enabled = enabled
        self._test_cases = test_cases

    def run_test_cases(self):
        # Run each test case, return errors as a Failure object
        ...

    ...

    @classmethod
    def create_from_yaml(command_name, command_definition):
        test_cases = [
            TestCase.create_from_yaml(test_name, test_definition)
            for test_name, test_definition
            in command_definition['test_cases'].items()
        ]
        ...
        

class TestCase:

    def __init__(self, parent, name, ...):
        self.parent_command = weakref.WeakRef(parent)
        # Test case validation happens here

    ...

TL;DR: The difference between this example and the last is that I constructed command and test objects from the YAML from the get-go.

Encapsulation introduced the following benefits:

Most important, people want to stab me when they looked at the new code...

Moral: The minute you realize that you're passing too many of the same parameters around and/or you can't conveniently operate on data, refactor your code to use objects.

26. From Pennsylvania, USA. Software engineer at Amazon.com, travel enthusiast, scuba divemaster, amateur photographer. A bit restless.

View Comments