June 12, 2012 Leave a comment
Before I get started, I need to explain that there are many constraints in this project that I cannot get rid of. For this reason, I focusing solely on refactoring out branching logic that will continue to grow if left as is.
Let’s describe the situation. A system has been put in place for other teams to put “notifications” into “queues”. The terms are in quotes, as the “notification” is an item placed in a list of work to do and the “queue” is a SharePoint list that both “queues” items and is used to update status. This is shown below:
Architecturally, I know of many better patterns, but this is what I have to work with. Technically, I have to facilitate the idea of serializing objects to strings so a windows service can read the string, serialize the correct object and do some work on it. The side of the equation being refactored in this blog entry is the serialization of multiple derived types
Object enter the system via the “Notification” service, which accepts an object of the type NotificationBase:
The actual client creates a type derived from this base, as in the following:
There are potentially infinite types of objects, as more and more people use this service to enter work into “queues”.
The current pattern to facilitate the serialization and storage of the items in the “queues” was to switch on type.
The primary reason it was done this way is familiarity with the switch type of logic. The problem, from a maintenance standpoint, is I have to update code every time a new derived notification type is created. This means software that is constantly being updated. It also goes against the Single Responsibility Principle (a SOLID software practice) and the code smell of having to change logic in multiple places to facilitate a simple change.
There are a couple of ways to refactor out this pattern. A common approach, for example, is polymorphism. But due to various constraints, I cannot head this direction. I also see a simpler way to solve the problem at hand and show how to simplify the software.
It would be nice to have this down to one line, like so:
But the serializer serializes to NotificationBase, which causes an exception. So, a look at the serializer is in order:
The issue here is typeof(T). Why? This uses the generic passed. Since the generic is implicitly stated, the compiler examines and creates instructions as if this line was fed:
What I need the compiler to do is set it up like this:
Now, your thought process, as this time, might be to dynamically supply the generic type, but the following syntax does not work:
It would be nice if Microsoft made a very easy facility to create generic types dynamically like the above, but it is not possible today (maybe in C# vNext? – not .NET 4.5, as it is in RC). Sans the syntactical sugar, let’s focus on the issue for this problem, which is needing to serialize the derived type rather not the base type. To understand this, look at the following code and comments:
Notice that the typeof(T) returns the base object, as that is how the derived object is passed through the system, as in the following method signature:
But, obj.GetType() returns the derived object, as that is its “true identity” (Clark Kent in trouble now?). So, back to our serialize method:
I could simply do the following:
The problem here is the code is already implemented. Perhaps some of the clients can only use the typeof(T) method of serialization. Since I don’t know this, I am going to use a safer refactoring. First, I add an optional parameter called useTypeOfT.
Then, I then separate the type from the intiailization of the XmlSerializer:
Finally, I add a simple branch based on useTypeOfT to either use typeof(T) or obj.GetType(), with the default being the currently used typeof(T).
Here is the entire routine with these changes.
So, now let’s go back to our branching logic and refactor it down. First, let’s look at the original:
The first change would be set the optional argument up (useTypeOfT) and set it to false, as in the following. Simple, small change, so it is easy to test.
Now, since it can determine the correct derived type, we can get rid of the recasting of the base class into child objects.
If we look at the code above, we now notice the lines in the branches are identical, so we can refactor out the entire branching logic. This leaves us with the desired end condition, as follows:
The switch statement is now gone and I can add as many different types of derived notification types as needed to implement the full system, without any code changes to my service or serializer. And, since I have left the base logic alone, I should not have any issues with clients who have implemented the library using typeof(T) to determine type.
Simplicity is the mark of good software. As software features grow, however, solutions often get more complex rather than simpler. This means the developer has to “take the bull by the horns” and force refactoring to simplicity.
In this blog entry, I focused on one type of refactoring to get rid of unnecessary branching logic. The logic was forced by a “flaw” in a serializer, but cascaded up to create a solution that required branching logic changes every time a new type was added to the system, rather than the simple addition of new derived domain objects. By the end, we had code that required no changes to facilitate new “queued” types.
There are likely to be a variety of other ways to “skin this cat”. For example, we could use polymorphism to get rid of the branch. If there were major logic branches (changes in behavior based on type, for example, rather than just new derived types), I would definitely explore this option. Another possible option is getting deeper into the covariance and contravariance features added to C# in .NET 4.0. If someone would like to link this type of example, I would love to read it.
Peace and Grace,