Scaling Industrial IoT (IIoT) solutions requires a DevOps organization that can manage increased software and hardware complexity in terms of capability, capacity and footprint. The term DevOps combines Development and Operations and has become a buzzword among ICT companies.
A DevOps organization is often formed by merging software developers from R&D with senior engineers from Operations. Startups face the challenge of quickly creating a functioning DevOps organization that can scale with rapid growth. In this article, we look at the keys to successfully scaling software solutions, using an Industrial IoT solution as an example. We will look at how DevOps should function and discuss the important principles for software development, tools and operations.
First, let’s consider what we are trying to achieve via DevOps. Why use it? The advantages of DevOps are usually a combination of the following:
- Increase the frequency and quality of feature deployments
- Increase the frequency and quality of deployments to customers
- Improve solution quality
- Reduce the severity and frequency of release failures
- Improve troubleshooting and recovery capabilities
DevOps achieves this through close cooperation between the Development and Operations functions. This requires a “DevOps culture” that is supported by methods and tools to enable solutions to be created and operated efficiently as they scale.
Whether DevOps is created as a new organization or from existing Development and Operations functions, one of the first challenges to overcome is the organizational culture. The culture has to support the ability to produce faster, more reliable solutions that can keep pace with business needs. A key advantage of a DevOps approach is that software development supports operational activities and vice versa. DevOps is part of the Lean and Agile development methodologies, which implies a certain level of autonomy and speed to “get the job done”. Not all organizations have the experience to manage this autonomy, and those that do not will need support from all stakeholders. It is important to have a culture that guides but does not impede agile development. The rotation of engineers between development and operational activities can be a useful tool for creating a DevOps culture, but that culture isn’t restricted to engineers. A true DevOps approach includes development, operations, business owners, customers, and partners communicating and working together to achieve the business objectives.
Industry 4.0 relies on the use of data from devices and other sources to increase productivity, flexibility and efficiency. The ability to scale solutions without incurring prohibitive complexity and cost is achieved through the automation of industrial and manufacturing processes. However, automation should not be limited to these. The IoT software and services that are developed to automate the industrial process must themselves be deployed and life-cycle managed with maximum efficiency. The Development part of DevOps should consider the operational aspects of new features. The Operations part of DevOps must feed requirements back to Development, as Operations is best positioned to recommend efficiencies and improvements.
Let’s look at some of the tools and techniques used by DevOps for automation.
CI/CD: Continuous Integration/Continuous Delivery
In order to improve the frequency and quality of deployments, DevOps requires a streamlined, automated development process. This implies the use of tools that offer Continuous Integration/Continuous Delivery (CI/CD). These accelerate deployment through (semi)automated pipelines with stages such as Develop, Test, Integrate and Deploy. There are numerous cloud-independent tools available, such as Jenkins, Travis and GitLab. Most cloud providers, such as Google, Microsoft and AWS, also offer complementary products to simplify CI/CD. These provide part of the solution, but with the increase in agile development there are further issues to be solved. Agile development models allow Scrum teams to reprioritize work at short notice, and this can lead to multiple teams working on the same software module. It can create conflicts and incompatibilities that Scrum masters should manage through the release strategy and the schedule of activities in the release process. A release strategy with too many releases will make the management of sprints too complex and ultimately slow the development of new features.
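The staged pipeline described above can be illustrated with a minimal sketch. It assumes each stage is a function that reports success or failure; the stage names follow the text, while `run_pipeline` and the lambda actions are hypothetical placeholders and do not represent the API of Jenkins, Travis or GitLab.

```python
from typing import Callable, List, Tuple

# A stage is a name plus an action that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failing stage.

    Returns the names of the stages that completed successfully.
    """
    completed = []
    for name, action in stages:
        if not action():
            break  # a failed stage blocks everything after it
        completed.append(name)
    return completed


# Hypothetical actions; a real pipeline would call build and test tools here.
stages = [
    ("develop", lambda: True),
    ("test", lambda: True),
    ("integrate", lambda: False),  # simulated integration failure
    ("deploy", lambda: True),
]

print(run_pipeline(stages))
```

In this sketch the simulated integration failure prevents the deploy stage from running, which is the gatekeeping behaviour a CI/CD pipeline is meant to provide.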
The majority of IoT solutions (and non-IoT) that are not using a serverless architecture will implement a container architecture with an orchestration manager on a virtualized layer. The software is designed with microservices involving one or more containers deployed on virtual machines. The advantages of this approach will be realized as services scale in terms of features, functionality or installed base.
Using a microservice architecture can be more complex to design, but it can increase the ability to isolate and troubleshoot faults, i.e. the rewards are reaped later in Operations. Microservices create smaller, manageable software modules with clear interfaces that can simplify troubleshooting and allow more complex software to be allocated across multiple development teams. This is nothing new in software development, but the advantages increase with containerized software. Multiple containerized microservices typically combine to create a complete software function. Each container includes all the libraries and dependencies needed for the software to run on different platforms. This reduces the need for migration or redesign as the footprint grows, i.e. it reduces the complexity of deploying to different environments.
One difficulty with a container architecture is that it can become complex for operations to manage as it scales. This has been addressed by container orchestration tools, the most popular being Kubernetes. Orchestration is also available through tools such as OpenShift and Docker Swarm or cloud-based tools such as Anthos, EKS or AKS. Many of these tools sit on top of Kubernetes and further reduce the complexity of managing the virtualization layer.
Kubernetes is built around a desired-state architecture. This means that the desired state of the system is defined and the orchestration control function works to maintain it. It automates many of the activities carried out by operations staff. The advantages include automated fault recovery, reduced scaling complexity, improved redundancy and increased security, as highlighted in the example below.
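The desired-state idea can be sketched as a comparison between what is declared and what is running. This is a simplified illustration and not Kubernetes code: the `reconcile` function, the service names and the replica counts are all assumptions made for the example.

```python
from typing import Dict, List


def reconcile(desired: Dict[str, int], actual: Dict[str, int]) -> List[str]:
    """Compare desired and actual instance counts and list corrective actions."""
    actions = []
    for service in sorted(set(desired) | set(actual)):
        want = desired.get(service, 0)
        have = actual.get(service, 0)
        if have < want:
            actions.append(f"start {want - have} x {service}")
        elif have > want:
            actions.append(f"stop {have - want} x {service}")
    return actions


# Hypothetical services for the example.
desired = {"assembly": 2, "calibration": 1, "test": 1}
actual = {"assembly": 2, "calibration": 0, "test": 2}

print(reconcile(desired, actual))
```

An orchestrator repeats this kind of comparison continuously, so an operator only changes the declared state and the control function works out the actions needed to reach it.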
Industry 4.0 IoT Use Case
Let’s take the example of a factory with multiple production lines that plans to automate them and move to an Industry 4.0 solution. The factory wishes to extract data at various points in the production process for use by multiple departments and users. The objective is to increase throughput and quality and to reduce costs. The software required for this has the functionality as illustrated below.
The robots assemble components that are configured/calibrated to work together, and these are tested before the final output. This isn’t a very complex scenario, but it is key to understanding how this is managed when scale is required. Managing a few manufacturing lines is not complex, but scaling across factories and countries requires advanced automation that can be provided through virtualization, containers and orchestration.
If Container 3 fails due to a hanging process or a communication issue, operations would traditionally be required to perform a manual intervention to restore the service. In a virtualized environment with an orchestrator, the fault would be detected and the control plane would start the Container 3 process on another VM, for example VM2. This is an example of automated redundancy or failover for the solution.
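The failover step can be sketched as a change to a placement map. The layout below (Containers 1 and 3 on VM1, Container 2 on VM2) is an assumption made for illustration, and `fail_over` is a hypothetical helper rather than part of any orchestrator.

```python
from typing import Dict, List


def fail_over(placement: Dict[str, str], unhealthy: List[str],
              vms: List[str]) -> Dict[str, str]:
    """Return a new placement with each unhealthy container moved to another VM."""
    new_placement = dict(placement)
    for container in unhealthy:
        current = placement[container]
        candidates = [vm for vm in vms if vm != current]
        if candidates:
            # Pick the first alternative VM; a real scheduler would also
            # weigh capacity and other constraints.
            new_placement[container] = candidates[0]
    return new_placement


# Assumed layout for the example.
placement = {"container1": "VM1", "container2": "VM2", "container3": "VM1"}

print(fail_over(placement, unhealthy=["container3"], vms=["VM1", "VM2"]))
```

The original placement is left untouched and a new one is returned, which keeps the sketch easy to reason about when comparing the state before and after the fault.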
Automated Load Balancing
Take the example of Container 2 on VM2 with buffer congestion that is impacting performance. Traditionally, we would expect operations to manually move processes and load to another VM to maintain performance in line with KPIs. If a container orchestrator is available, it can detect the performance issue and automatically move part of the load to another VM. This reduces the manual workload of operations and facilitates scaling.
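A minimal sketch of this rebalancing, assuming load is expressed as a percentage per VM and that a fixed threshold marks congestion. The figures, the threshold and the `rebalance` function are illustrative assumptions only.

```python
from typing import Dict


def rebalance(load: Dict[str, int], threshold: int) -> Dict[str, int]:
    """Shift the excess load of any VM above the threshold to the least-loaded VM."""
    balanced = dict(load)
    for vm in load:
        excess = balanced[vm] - threshold
        if excess > 0:
            target = min(balanced, key=balanced.get)
            if target != vm:
                balanced[vm] -= excess
                balanced[target] += excess
    return balanced


# Assumed load percentages; VM2 is the congested machine from the example.
load = {"VM1": 40, "VM2": 95, "VM3": 30}

print(rebalance(load, threshold=80))
```

The sketch moves only the excess above the threshold, mirroring the idea of shifting part of the load rather than the whole workload.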
Automated Fault Management
If a process in Container 1 hangs, the traditional approach would require an operator to log into the machine, manually restart the process and recover the service. If an orchestration manager is available, it will automatically start the process on another container. This ensures production continues while the fault is investigated and corrected.
Creating containers with clustering enables software processes and hardware to be isolated, offering opportunities for increased security. Security measures can be introduced between the clusters via the orchestrator to harden the solution from both a hardware and a software perspective.
Introducing new assembly lines normally requires the deployment of the software stack but that now can be handled by the orchestrator. Updating the required state of the control function to increase the number of assembly lines required will trigger a new deployment of a container. The complexity of this activity has been reduced by offering operations the ability to define how many container instances are required and the orchestrator control function looks after the rest.
Automated Software Releases
To release new software, the desired state held by the orchestrator can be updated to specify the newly required release, e.g. SW Release 4.2 instead of 4. The controller detects that the system requires an update and schedules the activity. Traditionally, Operations would have been required to deploy new versions of software and then redeploy the containers. Now, this process can be automated by the orchestrator.
The architecture above still implies a single point of failure in the orchestration controller, but there are solutions for this. In general, if the controller fails, the processes running in the other VMs should not be impacted. Features such as redundancy would become unavailable, but the service will continue functioning as defined.
A combination of the tools and processes described above will be fundamental for scaling Industrial IoT solutions. Implementing DevOps that uses virtualization and orchestration functionality can be part of the solution. However, it introduces organizational and software development complexities and is not suitable for all solutions and organizations. It may not be advisable to adopt this strategy early in the life cycle of software development with a new DevOps organization. However, it should be possible to design solutions that can have a relatively painless migration when scaling is required.
Well-designed software will be modular and layered, and if it already uses some form of virtualization, the migration shouldn’t be complex.
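One way a controller might schedule such an update is to replace instances one at a time so that the service stays available. The text does not specify the strategy, so the rolling approach, the `rolling_update` function and the three-instance starting state below are assumptions for illustration.

```python
from typing import List


def rolling_update(running: List[str], target: str) -> List[List[str]]:
    """Replace outdated instances one at a time and record the state after each step."""
    state = list(running)
    history = []
    for i, version in enumerate(state):
        if version != target:
            state[i] = target
            history.append(list(state))
    return history


# Three instances on release 4 are moved to release 4.2, as in the example.
for step in rolling_update(["4", "4", "4"], target="4.2"):
    print(step)
```

Each recorded step contains at most one newly replaced instance, so the remaining instances keep serving while the update proceeds.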
Successful DevOps organizations will have clear but evolving methods and tools that support a DevOps culture to facilitate scaling.