If automation is the key to unlocking the value of SDN and NFV-based networks, then feedback is the key to automation. Automation works as a closed-loop control system: it continually generates Key Performance Indicators (KPIs) from the network, compares these indicators to reference values, and adjusts the network to compensate for differences between the reference and actual values.
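One pass through such a loop can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Kpi` class, the `control_step` function, the proportional `gain` and the `tolerance` band are hypothetical names and choices made for this example, not part of any particular orchestrator or analytics product.

```python
from dataclasses import dataclass


@dataclass
class Kpi:
    """A single KPI sample together with its reference value (hypothetical type)."""
    name: str
    actual: float
    reference: float


def control_step(kpi: Kpi, gain: float = 0.5, tolerance: float = 0.05) -> float:
    """Return the adjustment to request for one pass through the loop.

    Assumes a simple proportional correction: the adjustment is a fraction
    (gain) of the gap between the reference and the actual value.
    """
    error = kpi.reference - kpi.actual
    # Deviations inside the tolerance band are treated as noise and ignored.
    if abs(error) <= tolerance * abs(kpi.reference):
        return 0.0
    return gain * error


# Throughput is 20 below its reference, so half of the gap is requested.
print(control_step(Kpi("throughput_mbps", actual=80.0, reference=100.0)))  # 10.0
```

A gain below 1 corrects only part of the gap on each pass, which is one common way to keep a loop from over-reacting to a single measurement.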
There are five major building blocks in this closed-loop system:
As you might expect, the frequency at which this control loop operates can have a significant impact on the intended result. Imagine if this loop only operated once a day, which is essentially the way network operations work in traditional networks: gather data, generate reports, analyze the reports daily. At this rate, it may take days to detect and fix a service issue! Speeding up this loop, by gathering real-time KPIs and feeding them into an analytics engine whose recommended changes are then acted upon by the orchestrator, will clearly shorten the time to detect and correct an issue, hopefully to the point where the customer doesn’t even notice it.
However, this speed needs to be managed properly or else the system runs the risk of becoming unstable. Consider a hypothetical system where service KPIs are generated every 1 ms and fed to an analytics engine which generates recommendations every 10 ms. These recommendations are then fed to the orchestrator, which issues prioritized changes every 10 ms, and those changes might take 5 ms to be implemented. This means that the time from detecting an issue to implementing the change may be as much as 26 ms (1 + 10 + 10 + 5). In that time, our hypothetical system has generated 26 new KPI values, along with 2 more recommendations for changes and 2 more sets of changes implemented. In a worst-case scenario, the orchestrator may push up to 2 additional requests for resources to the network before seeing the impact of the first request. Further changes would then be needed to take back some of those additional resources, leading to an inefficient use of network resources.
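The arithmetic behind this hypothetical system can be checked with a short Python sketch. The periods and the implementation delay come from the example above; the function names are illustrative, and the model assumes the worst case in which every stage waits one full period before passing its result on.

```python
# Periods and delays from the hypothetical system in the text (milliseconds).
KPI_PERIOD_MS = 1
ANALYTICS_PERIOD_MS = 10
ORCHESTRATOR_PERIOD_MS = 10
IMPLEMENTATION_MS = 5


def worst_case_latency_ms() -> int:
    """Worst case: each stage waits one full period, then the change is applied."""
    return (KPI_PERIOD_MS + ANALYTICS_PERIOD_MS
            + ORCHESTRATOR_PERIOD_MS + IMPLEMENTATION_MS)


def events_in_window(window_ms: int, period_ms: int) -> int:
    """Number of periodic events that fire within the given window."""
    return window_ms // period_ms


latency = worst_case_latency_ms()
new_kpis = events_in_window(latency, KPI_PERIOD_MS)
new_recommendations = events_in_window(latency, ANALYTICS_PERIOD_MS)
new_change_sets = events_in_window(latency, ORCHESTRATOR_PERIOD_MS)

print(latency, new_kpis, new_recommendations, new_change_sets)  # 26 26 2 2
```

The two extra recommendations and change sets are produced before the first change has had any measurable effect, which is where the risk of over-correction comes from.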
But equally important is the task of understanding which information to act upon first. In our multi-dimensional analytics model, with customer, service and network KPIs, the analytics engine must be able to prioritize the KPIs from all of these dimensions to understand the true root cause and recommend the appropriate action. A failure of a key piece of network equipment may lead to degraded KPIs for a set of specific services, which may in turn lead to degraded KPIs for certain customers. While it might, on the surface, appear that the right thing to do would be to modify each of the impacted services, it would, in fact, be more appropriate to modify the network itself to deal with the outage; for example, spin up a new virtual router or move certain VNFs to another server or data center. If the analytics engine does not have real-time visibility into the customer, service and network KPIs, then it is likely to produce inappropriate recommendations to the orchestrator.
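A minimal Python sketch of this prioritization follows. It assumes a simple static dependency map from customers to services and from services to network elements. The topology, the action names and the `recommend` function are all hypothetical; a real analytics engine would work from live inventory and far richer correlation logic.

```python
# Hypothetical dependency maps (illustrative names only).
SERVICE_DEPENDS_ON = {"vpn-a": "router-1", "voip-b": "router-1", "iptv-c": "router-2"}
CUSTOMER_USES = {"acme": {"vpn-a", "voip-b"}, "globex": {"iptv-c"}}


def recommend(degraded_network, degraded_services, degraded_customers):
    """Return (action, target) pairs, fixing the lowest affected layer first."""
    actions = []
    explained_services = set()

    # 1. A degraded network element explains every service that depends on it.
    for element in sorted(degraded_network):
        actions.append(("fix_network", element))
        explained_services |= {
            svc for svc, dep in SERVICE_DEPENDS_ON.items() if dep == element
        }

    # 2. Only services NOT explained by a network fault are modified directly.
    for service in sorted(set(degraded_services) - explained_services):
        actions.append(("modify_service", service))

    # 3. A customer is handled separately only if none of their services is degraded.
    for customer in sorted(degraded_customers):
        if CUSTOMER_USES.get(customer, set()) & set(degraded_services):
            continue
        actions.append(("investigate_customer", customer))

    return actions


# One router fault degrades two services and one customer: a single network action.
print(recommend({"router-1"}, {"vpn-a", "voip-b"}, {"acme"}))
# [('fix_network', 'router-1')]
```

Without the network dimension as input, the same call would fall through to step 2 and recommend modifying both services, which is the inappropriate recommendation the paragraph above warns about.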
As Service Providers continue to be driven to reduce OPEX, become more service agile and provide their customers with the highest QoE possible, all at a scale that will drive growth and innovation, automation driven by Real-Time, Multi-Dimensional Analytics becomes the critical piece of this puzzle. Having access to the right information at the right time means that better decisions are made, fewer resources are used, and customers get what they purchased.