Adventures in Refactoring: Eliminating Switches in Code


Before I get started, I need to explain that there are many constraints in this project that I cannot get rid of. For this reason, I focusing solely on refactoring out branching logic that will continue to grow if left as is.

Let’s describe the situation. A system has been put in place for other teams to put “notifications” into “queues”. The terms are in quotes, as the “notification” is an item placed in a list of work to do and the “queue” is a SharePoint list that both “queues” items and is used to update status. This is shown below:

image

Architecturally, I know of many better patterns, but this is what I have to work with. Technically, I have to facilitate the idea of serializing objects to strings so a windows service can read the string, serialize the correct object and do some work on it. The side of the equation being refactored in this blog entry is the serialization of multiple derived types

Object enter the system via the “Notification” service, which accepts an object of the type NotificationBase:

Code Snippet
  1. public interface INotificationService
  2. {
  3.     Guid NotifyChange(NotificationBase message, string queue);
  4.     ProcessStatus GetProcessStatus(Guid notificationId, string queue);
  5. }
  6.  
  7. public class NotificationBase : IExtensibleDataObject
  8. {
  9.     public Guid NotificationId { get; set; }
  10.     public string Description { get; set; }
  11.     public ExtensionDataObject ExtensionData { get; set; }
  12. }

The actual client creates a type derived from this base, as in the following:

Code Snippet
  1. public class AdvisorNotification : NotificationBase
  2. {
  3.     public Guid AdvisorID { get; set; }
  4.     public OperationType Action { get; set; }
  5.     public string Version { get; set; }
  6.     public List<GuidMap> CloningChanges { get; set; }
  7. }

There are potentially infinite types of objects, as more and more people use this service to enter work into “queues”.

The current pattern to facilitate the serialization and storage of the items in the “queues” was to switch on type.

Code Snippet
  1. string messageString;
  2.  
  3. if (notification is AdvisorNotification)
  4. {
  5.     var advisorNotification = (AdvisorNotification)notification;
  6.     messageString = SerializationHelper.SerializeToXml(advisorNotification);
  7. }
  8. else if (notification is WorkflowNotification)
  9. {
  10.     var workflowNotification = (WorkflowNotification)notification;
  11.     messageString = SerializationHelper.SerializeToXml(workflowNotification);
  12. }

The primary reason it was done this way is familiarity with the switch type of logic. The problem, from a maintenance standpoint, is I have to update code every time a new derived notification type is created. This means software that is constantly being updated. It also goes against the Single Responsibility Principle (a SOLID software practice) and the code smell of having to change logic in multiple places to facilitate a simple change.

There are a couple of ways to refactor out this pattern. A common approach, for example, is polymorphism. But due to various constraints, I cannot head this direction. I also see a simpler way to solve the problem at hand and show how to simplify the software.

It would be nice to have this down to one line, like so:

Code Snippet
  1. string messageString = SerializationHelper.SerializeToXml(notification, false);

But the serializer serializes to NotificationBase, which causes an exception. So, a look at the serializer is in order:

Code Snippet
  1. public static string SerializeToXml<T>(T obj)
  2. {
  3.     var serializer = new XmlSerializer(typeof(T));
  4.  
  5.     using (var writer = new StringWriter())
  6.     {
  7.         serializer.Serialize(writer, obj);
  8.         return writer.ToString();
  9.     }
  10. }

The issue here is typeof(T). Why? This uses the generic passed. Since the generic is implicitly stated, the compiler examines and creates instructions as if this line was fed:

Code Snippet
  1. string messageString = SerializationHelper.SerializeToXml<NotificationBase>(notification, false);

What I need the compiler to do is set it up like this:

Code Snippet
  1. string messageString = SerializationHelper.SerializeToXml<AdvisorNotification>(notification, false);

Now, your thought process, as this time, might be to dynamically supply the generic type, but the following syntax does not work:

Code Snippet
  1. string messageString = SerializationHelper.SerializeToXml<notification.GetType()>(notification, false);

It would be nice if Microsoft made a very easy facility to create generic types dynamically like the above, but it is not possible today (maybe in C# vNext? – not .NET 4.5, as it is in RC). Sans the syntactical sugar, let’s focus on the issue for this problem, which is needing to serialize the derived type rather not the base type. To understand this, look at the following code and comments:

Code Snippet
  1. Type type1 = typeof (T); //type1 == NotificationBase;
  2. Type type2 = obj.GetType(); //type2 == AdvisorNotification

 

Notice that the typeof(T) returns the base object, as that is how the derived object is passed through the system, as in the following method signature:

Code Snippet
  1. private static Guid NotifyDataChange(NotificationBase notification, string queueName)

But, obj.GetType() returns the derived object, as that is its “true identity” (Clark Kent in trouble now?). So, back to our serialize method:

Code Snippet
  1. public static string SerializeToXml<T>(T obj)
  2. {
  3.     var serializer = new XmlSerializer(typeof(T));
  4.  
  5.     using (var writer = new StringWriter())
  6.     {
  7.         serializer.Serialize(writer, obj);
  8.         return writer.ToString();
  9.     }
  10. }

I could simply do the following:

Code Snippet
  1. public static string SerializeToXml<T>(T obj)
  2. {
  3.     var serializer = new XmlSerializer(obj.GetType());
  4.  
  5.     using (var writer = new StringWriter())
  6.     {
  7.         serializer.Serialize(writer, obj);
  8.         return writer.ToString();
  9.     }
  10. }

The problem here is the code is already implemented. Perhaps some of the clients can only use the typeof(T) method of serialization. Since I don’t know this, I am going to use a safer refactoring. First, I add an optional parameter called useTypeOfT.

Code Snippet
  1. public static string SerializeToXml<T>(T obj, bool useTypeOfT = true)

Then, I then separate the type from the intiailization of the XmlSerializer:

Code Snippet
  1. Type type = typeof(T);
  2. var serializer = new XmlSerializer(type);

Finally, I add a simple branch based on useTypeOfT to either use typeof(T) or obj.GetType(), with the default being the currently used typeof(T).

Code Snippet
  1. Type type = useTypeOf ? typeof(T) : obj.GetType();
  2. var serializer = new XmlSerializer(type);

Here is the entire routine with these changes.

Code Snippet
  1. public static string SerializeToXml<T>(T obj, bool useTypeOf = true)
  2. {
  3.     Type type = useTypeOf ? typeof(T) : obj.GetType();
  4.     var serializer = new XmlSerializer(type);
  5.  
  6.     using (var writer = new StringWriter())
  7.     {
  8.         serializer.Serialize(writer, obj);
  9.         return writer.ToString();
  10.     }
  11. }

So, now let’s go back to our branching logic and refactor it down. First, let’s look at the original:

Code Snippet
  1. string messageString;
  2.  
  3. if (notification is AdvisorNotification)
  4. {
  5.     var advisorNotification = (AdvisorNotification)notification;
  6.     messageString = SerializationHelper.SerializeToXml(advisorNotification);
  7. }
  8. else if (notification is WorkflowNotification)
  9. {
  10.     var workflowNotification = (WorkflowNotification)notification;
  11.     messageString = SerializationHelper.SerializeToXml(workflowNotification);
  12. }

The first change would be set the optional argument up (useTypeOfT) and set it to false, as in the following. Simple, small change, so it is easy to test.

Code Snippet
  1. string messageString;
  2.  
  3. if (notification is AdvisorNotification)
  4. {
  5.     var advisorNotification = (AdvisorNotification)notification;
  6.     messageString = SerializationHelper.SerializeToXml(advisorNotification, false);
  7. }
  8. else if (notification is WorkflowNotification)
  9. {
  10.     var workflowNotification = (WorkflowNotification)notification;
  11.     messageString = SerializationHelper.SerializeToXml(workflowNotification, false);
  12. }

Now, since it can determine the correct derived type, we can get rid of the recasting of the base class into child objects.

Code Snippet
  1. string messageString;
  2.  
  3. if (notification is AdvisorNotification)
  4. {
  5.     messageString = SerializationHelper.SerializeToXml(notification, false);
  6. }
  7. else if (notification is WorkflowNotification)
  8. {
  9.     messageString = SerializationHelper.SerializeToXml(notification, false);
  10. }

If we look at the code above, we now notice the lines in the branches are identical, so we can refactor out the entire branching logic. This leaves us with the desired end condition, as follows:

Code Snippet
  1. string messageString = SerializationHelper.SerializeToXml(notification, false);

The switch statement is now gone and I can add as many different types of derived notification types as needed to implement the full system, without any code changes to my service or serializer. And, since I have left the base logic alone, I should not have any issues with clients who have implemented the library using typeof(T) to determine type.

Summary

Simplicity is the mark of good software. As software features grow, however, solutions often get more complex rather than simpler. This means the developer has to “take the bull by the horns” and force refactoring to simplicity.

In this blog entry, I focused on one type of refactoring to get rid of unnecessary branching logic. The logic was forced by a “flaw” in a serializer, but cascaded up to create a solution that required branching logic changes every time a new type was added to the system, rather than the simple addition of new derived domain objects. By the end, we had code that required no changes to facilitate new “queued” types.

There are likely to be a variety of other ways to “skin this cat”. For example, we could use polymorphism to get rid of the branch. If there were major logic branches (changes in behavior based on type, for example, rather than just new derived types), I would definitely explore this option. Another possible option is getting deeper into the covariance and contravariance features added to C# in .NET 4.0. If someone would like to link this type of example, I would love to read it.

Peace and Grace,
Greg

Twitter: @gbworld

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: