Lunch and Learn: Why Refactor?


This is the first (?) blog entry on why developers should refactor. I just noticed the situation today at my assignment, and I felt it was prime time to have a lunch time type fest and illustrate the why(s) of refactoring in a concrete, real world example. As I find other code samples, I will likely blog again.

I wrote the following code based on code I recently documented for some changes in a system. The code has the same basic branching logic found in the original, but I have altered the actual code and obfuscated out the variable names and even the name of the individual responsible. This is not a “look what so-and-so did” post, but a learning exercise.

Here is the code. A quick examination should show we are setting and resetting a value numerous times. The actual code sample had many more cases of branching logic throughout the routine (most of the routines in this Access module contain hundreds of lines of code). But this small snippet of one routine is enough to illustrate how code is bolted on rather than refactored.

First, the code, as it stands:

‘ Added by XXX per specs 10/30/2008
If recordset![Item1] = "1" And Item2 = "X" And Item3 = "Y" Then
        recordset![Item4] = "XX"
End If

‘ Added by XXX per specs 10/30/2008
If recordset![Item1] = "1" And Item2 = "Z" And Item3 = "A" Then
     recordset![Item4] = "XX"
End If

If recordset![Item5] = "4900" Then     ‘ Code added by XXX per specs 10/30/2008

        recordset![Item4] = "UU"

        ‘ Added by XXX per specs 10/30/2008
        If recordset![Item1] = "1" And Item2 = "Z" And Item3 = "A" Then
            recordset![Item4] = "UU"
        End If
Else
    recordset![Item4] = "XX"
End If

The code should clue in that we need some changes, as the same value (Item4) is changed over and over again. All we need to get the same answers is this:

If recordset![Item5] = "4900" Then     ‘ Code added by XXX per specs 10/30/2008 
        recordset![Item4] = "UU"
Else
    recordset![Item4] = "XX"
End If

There is one other refactor we would have to look at here, if the code actually changed the values. Item2 is a subset of Item3, set up in a hierarchy. All Xs are of parent set Y and all Zs are of parent set A. It looks something like this:

Y
   > X
   > U
   > T

A
   > B
   > Z

etc. So, if the code actually was an exclusion, as it is elsewhere in the code, we would do the following:

‘ Added by XXX per specs 10/30/2008
If recordset![Item1] = "1" And Item2 = "Z" Then
     recordset![Item4] = "XX"
End If

This covers when we are dealing with items that are a subset. The more specific item (child) will always have the same parent and you avoid a test that simply wastes cycles).

How Does Code Get Like This?

There is a very simple answer. New rules come in and they are bolted on. This generally comes from one or more of the following conditions:

  1. The person coding does not have the technical knowledge to think through the paths (inexperience)
  2. The person coding does not have the business knowledge to determine the impact of the changes (lack of business knowledge – ignorance)
  3. The person needs to ensure the path is there for possible future changes
  4. The person coding does not FEEL he has the time to refactor

Lets take these in order:

Item 1: The person coding does not have the technical knowledge to think through the paths(inexperience): Inexperience, at least with the code contained in this routine, is a poor excuse. Anyone taking a moment to run down the code path, sees the following.

IF Item2 = Z

  • Change Item4 to 29
  • If 4900 change Item4 to WJ, and then change Item4 to WJ
  • If not 4900 change Item4 to 29

End result: 4900 is the only key to determining the value of Item4.

The solution is to code out the branches that can NEVER be hit:

If recordset![Item5] = "4900" Then     ‘ Code added by XXX per specs 10/30/2008 
        recordset![Item4] = "UU"
Else
    recordset![Item4] = "XX"
End If

As a side note, it should be noted that the entire branching logic in the routine was per a single specification, so the coding here is even more insidious, as a single developer coded all of the paths. It would have been better to code in notes (comments), as in the following:

‘ Per specs 10/30/2007, if Item1 = 1 and Item2 = X, then Item4 = XX
‘ Per specs 10/30/2007, if Item1 = 1 and Item2 = Z, then Item4 = XX

If recordset![Item5] = "4900" Then     ‘ Code added by XXX per specs 10/30/2008 
    recordset![Item4] = "UU"
    ‘ Per specs 10/30/2007, if Item1 = 1 and Item2 = Z, then Item4 = UU

Else
    recordset![Item4] = "XX"
End If

Item 2: The person coding does not have the business knowledge to determine the impact of the changes (lack of business knowledge): This is not an excuse in this case, but it could be legit in some instances. Even so, if the paths are obvious, make a comment on the “changes” so someone knows. The routine change is below the next item

Item 3: The person needs to ensure the path is there for possible future changes: This can also be solved, like item 2, with comments. And, no, I will not get into the “comments are bad” argument, pro or con. Here is the solution code for Items 2 and 3, which is the same for Item1, taking in account that the developer was not adding to the code, he was creating the routine.

‘ Per specs 10/30/2007, if Item1 = 1 and Item2 = X, then Item4 = XX
‘ Per specs 10/30/2007, if Item1 = 1 and Item2 = Z, then Item4 = XX

If recordset![Item5] = "4900" Then     ‘ Code added by XXX per specs 10/30/2008 
    recordset![Item4] = "UU"
    ‘ Per specs 10/30/2007, if Item1 = 1 and Item2 = Z, then Item4 = UU

Else
    recordset![Item4] = "XX"
End If

Item 4: The person coding does not FEEL he has the time to refactor. I can sympathize with this thought pattern, as it is often hard to convince management that you need a bit more time to clean up code. After all, the general consensus is “as long as it is working, why should you take any more time developing?” And, if management forces the issue, it is better to have a job than it is to be right, right?

As a personal note, I actually disagree with the “better to have a job” statement, at least in most cases. With the economic downturn, we are at a higher unemployment rate (around 3%, compared to the current 9.35% average across all sectors) and the rates out there are lower, perhaps lower than your current job rate. So, accepting garbage coding might be a bit more acceptable, but realize there are two artifacts that come with coding garbage:

  1. You are developing bad habits: While you may downplay this, the impact of continuing to do bad work does eventually become a habit, if the studies are correct. I have personally witnessed it (anecdotal evidence) with workers in an environment that allows them to be lazy, so I assume it applies to any development/coding habit. If you disagree with me, just comment, and we can talk about it.
  2. You are leaving behind evidence of garbage coding: We already know that the guy who just left normally gets blamed for the problems in the system(s) he was working on. Unless you can’t avoid it, don’t give evidence that you are to blame for the problem.

This gets to another CYA maneuver. Whenever you are forced to leave behind bolted on code, keep an evidence trail that you advised a refactor. You may never need it, but if you do, it is better to have some form of evidence that you were trying to do the right thing and forced to compromise.

SideNote

For the actual code, I left the original code, but commented out the lines that would not run. This was more out of my desire to stop spinning cycles, but leaving things so people in the future could see the logic. It was a bit faster than a rewrite. The code ends up like this when commented out:

‘ Added by XXX per specs 10/30/2008
‘GAB NOTE: This code accomplishes nothing, so it is commented out
‘If recordset![Item1] = "1" And Item2 = "X" And Item3 = "Y" Then
‘        recordset![Item4] = "XX"
‘End If

‘ Added by XXX per specs 10/30/2008
‘GAB NOTE: This code accomplishes nothing, so it is commented out
‘If recordset![Item1] = "1" And Item2 = "Z" And Item3 = "A" Then
‘     recordset![Item4] = "XX"
‘End If

If recordset![Item5] = "4900" Then     ‘ Code added by XXX per specs 10/30/2008

        recordset![Item4] = "UU"

        ‘ Added by XXX per specs 10/30/2008
        ‘GAB NOTE: This code accomplishes nothing, so it is commented out
        ‘If recordset![Item1] = "1" And Item2 = "Z" And Item3 = "A" Then
        ‘    recordset![Item4] = "UU"
        ‘End If

Else
    recordset![Item4] = "XX"
End If

The added benefit of this change is it can easily be reversed if rules change. It is unlikely in this case, but being able to revert back and change the order of code is probably wise in this case. And “politically” it is a better option. 😉

Peace and Grace,
Greg

Twitter: @gbworld

Advertisements

One Response to Lunch and Learn: Why Refactor?

  1. Alexandru says:

    Hello,Speaking about strongly typed datasets, please see the link below: https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=105927 It is very sad that after 4 years since Visual Studio 2005 was released, Microsoft does not allow to set the field’s (NullValue) property to Null for anything else than strings ?!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: