I recently gave a BrightTALK session on adding security to the Agile Cloud/DevOps development cycle. Part of that discussion covered adding security testing before, during, and even after continuous deployment. In other words, if we continually deploy, we must continually test. Our testing needs to be a multi-minded, parallel process like the one we use for modern development, not the single-minded pipeline that satisfies most DevOps or agile processes. In the past, a team of people would test, each working independently to improve our software. We need similar capabilities within our automated processes. How do we achieve this? How do we add automated, continual testing? And where can we add it to our process or pipeline?
The goal of any testing environment is to ensure that code quality measures are met before a product is released. However, with release cycles now measured in minutes rather than months, developers, products, and businesses suffer from not running enough tests. That makes our customers our beta testers. It also puts the business at risk of losing revenue when sites and services are unavailable: customers vote with their feet, or the outage simply lasts too long. Some businesses are at risk of failing if they are down for even a few minutes. Because of this, architects build redundancy into entire environments. But if the code itself is at fault, how do we recover?
Risk Management
In the end, how we recover, and how much of a hit the business can take, depends on how we manage risk. We may manage risk with A/B deployments: deploy new code to A while the old code keeps running in B. Businesses may rely on rollback capability. They may press on and fix the problem as soon as possible (say, in the next 15-minute release). These are all potential ways to manage code quality risk. If the code is bad, or the service is not available, how does the business survive? We need business continuity solutions to handle these levels of risk. That is not just a technological decision, but also a business decision.
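As a rough illustration of the first two options, a weighted A/B split with an automated rollback check might look something like the following sketch. The weights, thresholds, and function names are assumptions for illustration, not any particular product's API:

```python
import random

# Hypothetical A/B rollout: send a small slice of traffic to the new build (A)
# while the old build (B) keeps serving the rest.
NEW_BUILD_WEIGHT = 0.10               # 10% of traffic goes to the new code
ERROR_RATE_ROLLBACK_THRESHOLD = 0.02  # roll back if more than 2% of A requests fail

def route_request() -> str:
    """Pick which deployment serves this request."""
    return "A-new" if random.random() < NEW_BUILD_WEIGHT else "B-old"

def should_roll_back(a_requests: int, a_errors: int) -> bool:
    """Business-continuity check: is the new build hurting us enough to pull it?"""
    if a_requests == 0:
        return False
    return (a_errors / a_requests) > ERROR_RATE_ROLLBACK_THRESHOLD
```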
A business is forced to make decisions based on code quality issues. Too often, we compensate in infrastructure and tools for problems that originate in the code. The ultimate goal should be to find those problems before they present a risk to the business.
Code Quality
Code quality involves not just checking whether code follows specific standards and solves specific problems, but also observing how it reacts to real-world testing. This testing should be as automated as possible; there are generally not enough hours in the day for a person to run every test that must be run. When testing is done manually, some tests will be missed, and missed tests lead to code quality issues. Automated testing should give yea-or-nay results, not interpretations. We test for performance, functionality, and security. These three areas form the basis of code quality, yielding hard numbers that score how a code change impacts the application. A case in point:
An application serves up four billion queries a day. A change was requested and made: an if statement (flow logic) was to be changed to test that a value was not 0, but the change was implemented as a greater-than-0 comparison. An A/B test showed that this one change caused a problem: the application handled only a quarter of the required requests while raising the load on the server. Had the company rolled the code out everywhere, it would have needed four times the hardware to keep up with its existing workload. A simple change led to a massive performance hit. The fix was eventually found: switch to an equality evaluation, and performance went back up.
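Reconstructed purely for illustration (the real application, variable names, and expected value are not given in the story), the logic looked roughly like this:

```python
EXPECTED_VALUE = 1  # hypothetical; the actual expected value is not known

def passes_check_requested(value: int) -> bool:
    # What was asked for: test that the value is not 0.
    return value != 0

def passes_check_as_implemented(value: int) -> bool:
    # What shipped: a greater-than comparison. On this workload, the subtle
    # change in flow logic cut throughput to roughly a quarter.
    return value > 0

def passes_check_fixed(value: int) -> bool:
    # The eventual fix: an equality evaluation, which restored performance.
    return value == EXPECTED_VALUE
```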
The testing done before the code went into production was not good enough to judge its performance impact. As performance degraded, code quality degraded with it.
Continuous Testing
Had this code been deployed everywhere, the obvious solution would have been to scale up the environment by a factor of four. The business would have been forced into a cost decision: scale up and raise costs, or lose money through missed transactions. In some cases, that 4x increase would have happened automatically through unchecked use of APIs, and the cost increase might not have been noticed for days. All that from a coding issue, easily fixed!
Does this imply that we should put a human in the path of deployment just in case our deployment uses unchecked automation? No: automation is also code, and it should take cost into account using a cost model for deployment. How do we verify that it does? By using continuous testing.
We need to test for security issues, we need to test for performance issues, and we need to correlate those results to costs and risks, all in a 100% automated way. The information such correlations require comes only from continuous testing tied to continuous integration and continuous deployment. The real trick is to do enough testing to cover the most prominent attacks and performance issues in order to meet deployment goals, while also running longer tests and responding to their results by rolling back, opening issues, or taking some other action.
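A minimal sketch of such a gate, assuming a Python-based pipeline and a deliberately crude cost model (the thresholds, costs, and field names below are all hypothetical):

```python
from dataclasses import dataclass

@dataclass
class TestResults:
    """Aggregated output of the fast test suites run in the pipeline."""
    security_findings: int      # count of new high-severity findings
    requests_per_second: float  # measured throughput of the candidate build
    baseline_rps: float         # throughput of the build currently in production

HOURLY_INSTANCE_COST = 0.50     # hypothetical cost per server per hour
MAX_EXTRA_HOURLY_COST = 10.0    # business-set ceiling on added spend

def projected_extra_cost(results: TestResults, current_servers: int) -> float:
    """Translate a throughput regression into the extra servers (and dollars)
    needed to keep serving the same workload."""
    if results.requests_per_second <= 0:
        return float("inf")
    scale_factor = results.baseline_rps / results.requests_per_second
    extra_servers = max(0.0, (scale_factor - 1.0) * current_servers)
    return extra_servers * HOURLY_INSTANCE_COST

def gate_decision(results: TestResults, current_servers: int) -> str:
    """Return 'deploy', 'rollback', or 'open-issue' based on security and cost."""
    if results.security_findings > 0:
        return "rollback"
    if projected_extra_cost(results, current_servers) > MAX_EXTRA_HOURLY_COST:
        return "open-issue"
    return "deploy"
```

The point is not the specific arithmetic but that the deploy, rollback, or open-issue decision is driven by test results and a cost model rather than by a human reading dashboards.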
Our tests must run in parallel with one another to fit within our deployment timeframes. One test may take a minute, but running 6,000 tests in sequence would take far too long. We need to run them in parallel and wait for the results, or run the first hundred, then the next hundred, and so on. The batches should be prioritized based on the current threat environment, known performance issues, and other criteria.
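A minimal sketch of that batched, parallel execution, assuming a Python test harness (the runner and test names are placeholders):

```python
from concurrent.futures import ThreadPoolExecutor

def run_test(test_name: str) -> tuple[str, bool]:
    """Placeholder for invoking one automated test; returns (name, passed)."""
    # In practice this would call the real test runner or an external tool.
    return test_name, True

def run_in_batches(tests: list[str], batch_size: int = 100, workers: int = 25):
    """Run tests in parallel, one batch at a time, yielding failures as they appear."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for start in range(0, len(tests), batch_size):
            batch = tests[start:start + batch_size]
            for name, passed in pool.map(run_test, batch):
                if not passed:
                    yield name  # surface failures immediately so the pipeline can react

# Example: 6,000 tests run a hundred at a time instead of one after another.
failures = list(run_in_batches([f"test_{i}" for i in range(6000)]))
```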
Final Thoughts
Continuous testing is not a new concept; we have QA teams that should be doing this already. However, we need automation to make it easier, so that the QA teams and others simply generate the tests. We need tools to correlate results and drive workflows that help us decide what to do with those results. Those workflows could be 100% automated or, when an issue is egregious enough, could ask a human to step in and set a new course of action.
IT could solve many of its problems by ensuring code is properly tested. This is where tools from Ixia, Spirent, QualiSystems, and Netsparker all fit in.