The world is moving to containers! Hop on board, or the train will leave you behind! Whoa, stop right there. Take some time to analyze and think about what you are doing. Do you need to rewrite your code? Refactor your infrastructure? Recreate your environment? All of these take time, money, and experience (knowledge). Get on the track, of course, but where you get off will depend on many factors. There are several first steps to consider and several pitfalls waiting for you, so learn from those who have gone before. As with any strategy, whether business or game, you need a plan to move forward, to iterate upon, and to reach your goal. We call that an architecture in some cases and a design in others. Where are you along the tracks?
Let us consider an example. A company has a ten-year-old application that has gone through several iterations as the company has grown. Its load has grown from 100 million transactions per day to 44 billion transactions per day. The code was originally written in an interpreted language (such as PHP or Perl). The company went through several rounds of code fixes to improve performance. Then, it had a decision to make. Should it add more systems (refactor its deployment), rewrite the code in a faster, noninterpreted language, or recreate its deployment using something completely different? This company chose to rewrite its code in a compiled language to be as fast as possible. It had to weigh several factors:
- maintenance of the code (it had several people on staff who knew the compiled language)
- speed of delivery (it contacted the original author of the interpreted language version and got a quote for converting it to a compiled language)
- cost of delivery vs. cost of adding more and more systems to handle increased load
- potential loss of business if it did not grow
- ongoing costs of new infrastructure over many years
For this particular company, cost and growth were paramount. It needed to keep costs down while growing the business. It also knew that unless it did something, the business would not be able to maintain its lead in the industry, because infrastructure and its maintenance would eat through any savings from doing nothing. In the end, after weighing all the factors and getting input from all the stakeholders, including developers and operations folks, it chose to rewrite its code.
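The trade-off between a one-time rewrite and continually adding systems can be sketched as simple arithmetic. The following is a minimal illustration, not the company's actual model: the per-system throughput, yearly cost per system, and rewrite cost are all made-up assumptions, and only the 44 billion transactions per day comes from the example above. It also sizes for average load, whereas real capacity planning would size for peak.

```python
import math

SECONDS_PER_DAY = 86_400


def average_tps(transactions_per_day: int) -> float:
    """Average transactions per second (peaks will be higher)."""
    return transactions_per_day / SECONDS_PER_DAY


def systems_needed(transactions_per_day: int, tps_per_system: float) -> int:
    """Whole systems required to carry the average load."""
    return math.ceil(average_tps(transactions_per_day) / tps_per_system)


def cumulative_cost(one_time_cost: float, systems: int,
                    yearly_cost_per_system: float, years: int) -> float:
    """One-time cost plus infrastructure cost accumulated over `years`."""
    return one_time_cost + systems * yearly_cost_per_system * years


def break_even_year(scale_out_systems: int, rewrite_systems: int,
                    rewrite_cost: float, yearly_cost_per_system: float,
                    horizon: int = 10):
    """First year in which the rewrite is no more expensive, or None."""
    for year in range(1, horizon + 1):
        scale_out = cumulative_cost(0, scale_out_systems,
                                    yearly_cost_per_system, year)
        rewrite = cumulative_cost(rewrite_cost, rewrite_systems,
                                  yearly_cost_per_system, year)
        if rewrite <= scale_out:
            return year
    return None


# Hypothetical inputs: every figure except the daily load is an assumption.
DAILY_LOAD = 44_000_000_000
interpreted = systems_needed(DAILY_LOAD, tps_per_system=2_000)
compiled = systems_needed(DAILY_LOAD, tps_per_system=20_000)
year = break_even_year(interpreted, compiled,
                       rewrite_cost=5_000_000,
                       yearly_cost_per_system=10_000)
print(interpreted, compiled, year)
```

With these invented figures, the rewrite pays for itself within a few years. Different assumptions can easily flip the answer, which is why the factors above have to be weighed for each company.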
The rewrite was supposed to take six months. It took nine months, which is amazing given the complexity of the code. However, the contract was for the completed work, including all documentation and management interfaces, rather than billed hourly. Most of those management interfaces were very simple. The next stage of the rewrite was to test it and put it into play. Luckily, it was designed to be a drop-in replacement. Voilà! It worked. Quite a bit of effort went into ensuring it would work. The next part was to make the code go even faster. To do that, the company asked its lead developer to analyze the code and look for anything that could be removed to speed things up. There was quite a bit. Performance improvements occurred daily, then weekly, then monthly, and now yearly or so. There is always something that can be tweaked.
However, more functionality was added, which required refactoring how data was stored and used. The company eventually added NoSQL databases, better handling of the built-in message queues, and other improvements to meet business requirements. It just finished adding yet another performance improvement that changes how logging happens at scale.
For this company, rewriting the code was the way to go, and refactoring the code as new technologies become available has kept it from becoming stale. It has yet to recreate anything, and there is no plan to do so. It had planned to rewrite the entire codebase again to remove some complexity and reduce ongoing maintenance. However, the benefit did not justify the cost. So, while some work has been done on that code, it is not a priority.
Can all companies follow this approach? Of course, but doing so requires a number of things:
- A firm grasp of the business by IT stakeholders, not just the C-level.
- A metric or set of metrics that lets you quickly see the impact of performance and other improvements. For this company, metrics appear every minute on a dashboard everyone can view.
- Time to look at new technologies and at how they can slot in to improve performance, availability, operations, etc. This company has several people whose job it is to look into how new technologies and methodologies can improve overall application delivery.
- A well-defined architecture that is updated every six months (if there are many changes, it should be updated more often). Everyone should be able to view this documentation, and there should be formal reviews whenever new features are to be added. This cuts down on misunderstandings quite a bit.
- People who understand the code as well as the architecture.
- Good communication between teams. This company uses scrum boards, Slack, etc. Communication is paramount to success.
- The realization that using new technology often does not require rewriting everything, and that some changes are pretty minor. It is important to understand the impact of changes within the codebase.
- The understanding that testing is paramount. This company spends quite a bit of time testing changes.
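To make the metrics item above more concrete, here is a minimal sketch of rolling raw transaction samples up into per-minute figures that a dashboard could display. The sample format (epoch seconds plus latency in milliseconds) and the function name `per_minute_summary` are assumptions for illustration; the article does not say how this company collects or displays its metrics.

```python
from collections import defaultdict
from statistics import mean


def per_minute_summary(samples):
    """Group (epoch_seconds, latency_ms) samples into one-minute buckets.

    Returns a dict keyed by the bucket's starting second, holding the
    transaction count and mean latency for that minute.
    """
    buckets = defaultdict(list)
    for timestamp, latency_ms in samples:
        minute_start = int(timestamp // 60) * 60
        buckets[minute_start].append(latency_ms)
    return {
        minute: {"count": len(latencies), "mean_latency_ms": mean(latencies)}
        for minute, latencies in sorted(buckets.items())
    }


# Hypothetical samples: two transactions in the first minute, one in the next.
summary = per_minute_summary([(0, 10.0), (30, 20.0), (61, 40.0)])
for minute, stats in summary.items():
    print(minute, stats)
```

Comparing these per-minute figures before and after a change is one simple way to see whether a tweak helped.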
As you can see, this is a pretty forward-thinking company. It is using the technology as a tool to further the business while keeping the core of the business well understood by everyone. Problems are easy to solve, because everyone is on the same page.
This company has chosen to rewrite and refactor its codebase several times. How does your organization approach rewriting, refactoring, and recreating? How does it pull in new technology? Does it have teams that work with others to determine the short- vs. long-term costs and gains of adopting new technologies?