When cloud computing came onto the scene, one fairly popular view was that it would subsume all computing. A common analogy was the electric grid, which (at the time) had largely done away with distributed local power generation.
But there are many reasons why public clouds can’t always substitute for on-prem hardware. In the case of edge computing specifically, it’s often desirable to push compute closer to where data is collected and used.
This has become more important as machine learning is increasingly used to automate local operations in time-critical situations. Furthermore, taking action locally is often not only faster but also more resilient in the face of connectivity outages than depending on a central site.
However, implementing an edge architecture isn’t always straightforward. Here are four principles to consider as you take your business out to the edge of the network.
1. Automation and management aren't optional
It’s not that automation and management aren’t important for almost any IT infrastructure; of course they are. But the sheer number of edge devices, combined with the fact that there may be no local IT staff (or even no permanent employees on-site), makes both essential at the edge.
[ Also read Edge computing: 4 pillars for CIOs and IT leaders. ]
Automation uses software to create repeatable instructions and processes that reduce human interaction with IT systems. This, in turn, can improve IT provisioning, configuration management, patching, app orchestration, security, and compliance. It is usually implemented as workflows that run close to the edge endpoints, coordinated by a centralized control layer. Localized execution guards against high latency and connection disruptions, while centralized control provides integrated oversight of the entire edge environment.
For example, a retail chain might automate endpoints to simplify infrastructure operations, enforce security policies, and standardize device configuration across branches. Mass configuration, disaster recovery, branch migration, and responses to specific events are all roles that automation can play in an edge environment.
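To make the local-execution, central-control pattern concrete, here is a minimal sketch in Python of an edge agent that pulls its task list from a hypothetical central control plane and falls back to a locally cached copy when the link is down. The endpoint URL, cache path, and task format are assumptions for illustration, not any particular product’s API:

```python
import json
import time
import urllib.request
from pathlib import Path

# Hypothetical control-plane endpoint and cache location, for illustration only.
CONTROL_PLANE_URL = "https://control.example.com/api/v1/tasks"
CACHE_FILE = Path("/var/lib/edge-agent/tasks.json")

def fetch_tasks():
    """Pull the latest task list from the central control layer.

    Falls back to the locally cached copy if the link is down, so
    automation keeps running through a connectivity outage.
    """
    try:
        with urllib.request.urlopen(CONTROL_PLANE_URL, timeout=5) as resp:
            tasks = json.loads(resp.read())
        CACHE_FILE.parent.mkdir(parents=True, exist_ok=True)
        CACHE_FILE.write_text(json.dumps(tasks))  # refresh the local cache
        return tasks
    except OSError:
        # Central site unreachable: run against the cached task list instead.
        return json.loads(CACHE_FILE.read_text()) if CACHE_FILE.exists() else []

def run_task(task):
    """Placeholder executor; a real agent maps task types to handlers."""
    print(f"executing {task.get('name')}: {task.get('action')}")

if __name__ == "__main__":
    while True:
        for task in fetch_tasks():
            run_task(task)
        time.sleep(60)  # polling interval; tune for your environment
```

Caching the desired state locally is the design choice that keeps branch operations running when the WAN link drops, while the central endpoint remains the single place policy is defined.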
Closely related is management, which includes creating a standard operating environment. Doing so is key to scaling. As you deploy your standardized image across your environments, from development to production, you will also want to register these systems with the edge management console. Maintaining security policies is also an essential management function.
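As a sketch of that registration step, the snippet below posts a host’s identity and deployed image version to a hypothetical management console REST endpoint; the URL and payload format are assumptions, not any particular console’s API:

```python
import json
import platform
import urllib.request

# Hypothetical management console endpoint; substitute your own tooling's API.
CONSOLE_URL = "https://edge-console.example.com/api/v1/systems"

def register_system(image_version):
    """Register this host with the central edge management console.

    Reporting the deployed image version lets the console confirm the
    host is running the standard operating environment.
    """
    facts = {
        "hostname": platform.node(),
        "os": platform.platform(),
        "image_version": image_version,
    }
    request = urllib.request.Request(
        CONSOLE_URL,
        data=json.dumps(facts).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as resp:
        print(f"registered {facts['hostname']}: HTTP {resp.status}")

if __name__ == "__main__":
    register_system("1.4.2")  # illustrative image version tag
```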
2. Don't build randomness into your business processes
The edge can become a bit like the Wild West if you let it. Even with automation and management systems in place, it still takes an architectural commitment to maintain a high degree of consistency across the edge (and datacenter) environment.
One rationale for a lack of consistency is that devices at the periphery are often smaller and less powerful than servers in a data center. The reasoning goes that they therefore need to run different software.
But this isn’t necessarily the case – or at least, it isn’t the whole story. You can build system images from the same small core Linux operating system you run elsewhere and customize them to add exactly what you need in terms of drivers, extensions, and workloads. Images can then be versioned, tested, signed, and deployed as a unit, so your ops team knows exactly what is running on each device.
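For instance, a deployment step might refuse to apply an image whose digest doesn’t match its manifest. The sketch below assumes a simple JSON manifest carrying a version tag and a SHA-256 digest; verifying the signature on the manifest itself is left to your signing tooling:

```python
import hashlib
import json
from pathlib import Path

def verify_image(image_path, manifest_path):
    """Check a deployable image against its manifest before applying it.

    The manifest is assumed to carry the expected SHA-256 digest and a
    version tag, so the ops team knows exactly what ships as a unit.
    """
    manifest = json.loads(Path(manifest_path).read_text())
    digest = hashlib.sha256(Path(image_path).read_bytes()).hexdigest()
    if digest != manifest["sha256"]:
        print(f"digest mismatch for image {manifest['version']}: refusing to deploy")
        return False
    print(f"image {manifest['version']} verified; deploying as a unit")
    return True
```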
Staged and applied image updates can also be configured to occur only at the next reboot or power cycle, ensuring minimal downtime. Downtime can also be reduced by intelligent rollbacks, which can reverse an update in response to application-specific health checks.
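A rollback hook along these lines might look like the following sketch, where the health endpoint and the rollback command are placeholders for whatever your application and image tooling actually provide:

```python
import subprocess
import time
import urllib.request

HEALTH_URL = "http://localhost:8080/healthz"  # hypothetical app health endpoint

def healthy(retries=3, delay=10):
    """Application-specific health check run after an update is applied."""
    for _ in range(retries):
        try:
            with urllib.request.urlopen(HEALTH_URL, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            pass
        time.sleep(delay)
    return False

def post_update_check():
    """Reverse the update if the application fails its health checks."""
    if not healthy():
        # "rollback-image" is a stand-in for your image tool's rollback
        # command; image-based systems typically ship one.
        subprocess.run(["rollback-image", "--reboot"], check=True)

if __name__ == "__main__":
    post_update_check()
```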
While some customization is often necessary across a hybrid cloud that includes edge devices, it is still often possible to have a consistent core environment underlying any necessary customizations.
3. The edge needs Kubernetes, too
That consistency can even extend to Kubernetes.
You may be thinking, “Wait… isn’t Kubernetes just for cloud and server clusters?” Not necessarily.
For one thing, edge devices aren’t necessarily that small any longer. For example, in recent research, senior operations managers cited the ability to analyze data locally as a significant benefit of edge computing.
While training machine learning (ML) models often still happens in a centralized location, model inference is increasingly pushed out to the edge. This can significantly reduce the need for high-bandwidth connections to send all the data back home for analysis. It also means that any needed local actions (such as shutting down a machine that is malfunctioning or about to) aren’t dependent on a reliable and fast network link.
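As a toy illustration of that pattern, the control loop below runs a stand-in for on-device inference against a local sensor reading and takes its action without any network round trip; the threshold, sensor, and model are all placeholders:

```python
import random

VIBRATION_LIMIT = 0.8  # hypothetical decision threshold from a trained model

def read_sensor():
    """Stand-in for a real sensor read; returns a normalized vibration level."""
    return random.random()

def infer(sample):
    """Toy stand-in for on-device model inference.

    In practice you would load a trained model artifact and evaluate it
    locally; the point is that the decision needs no network round trip.
    """
    return sample > VIBRATION_LIMIT

def control_loop():
    if infer(read_sensor()):
        # Local action taken immediately, independent of the WAN link.
        print("anomaly detected: shutting down machine")

if __name__ == "__main__":
    control_loop()
```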
Even if today’s workload is relatively lightweight, you may want to keep your options open. Perhaps your workload will grow. Perhaps you want to add a high availability option. Perhaps you decide to be less dependent on a reliable network link.
However, it can also make sense to adopt Kubernetes for the same reason discussed in the last section: consistency. If you’re running Kubernetes in your data center, running Kubernetes at the edge helps you standardize software lifecycle management and provides consistency across your hybrid cloud environment. Toward this end, a variety of projects are underway to optimize Kubernetes for use cases with differing footprint, availability, and networking requirements.
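As a sketch of what that consistency buys you, the snippet below uses the official Kubernetes Python client to apply a single manifest to a datacenter cluster and to edge clusters by switching kubeconfig contexts. The context names and manifest file are assumptions for illustration:

```python
# Requires the official Kubernetes Python client: pip install kubernetes
from kubernetes import config, utils

# Hypothetical kubeconfig context names: one datacenter plus two edge sites.
CONTEXTS = ["datacenter", "edge-store-01", "edge-store-02"]

def deploy_everywhere(manifest_path):
    """Apply the same workload definition to every cluster, edge included.

    One manifest and one API across all sites is what standardized
    lifecycle management looks like in practice.
    """
    for ctx in CONTEXTS:
        api_client = config.new_client_from_config(context=ctx)
        utils.create_from_yaml(api_client, manifest_path)
        print(f"applied {manifest_path} to {ctx}")

if __name__ == "__main__":
    deploy_everywhere("app.yaml")  # illustrative manifest file
```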
4. There's help out there
There are many sources of information about edge computing. However, I wanted to make you aware of a couple of open source efforts that document complete edge architectures based on patterns that organizations have already implemented.
Portfolio architectures showcase successful deployments of open source software, including edge deployments, and provide architecture best practices, tools, and links to other associated resources. They include high-level abstractions of services and platforms, a schematic that describes the main nodes and services along with their interactions and network connections, and detailed looks at specific services.
They are developed with a common repeatable process, visual language and toolset, presentations, and architecture diagrams, and they focus on combinations of technologies that have proven effective in multiple deployments and solve a specific, common problem (or cluster of problems).
Validated patterns are a natural progression from reference architectures.
They contain all the code needed to build an edge software stack, making it faster to get to a proof of concept. A typical pattern includes a data center and one or more edge Kubernetes-based clusters. All steps are fully automated through GitOps, so deployments are consistent and repeatable at scale. Users can then modify the pattern for their specific application.
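The GitOps principle behind the patterns reduces to a pull-and-apply loop. In the actual patterns this job is done by a GitOps operator rather than a script, but a minimal sketch, assuming a local clone of the pattern repository laid out for kustomize, looks like this:

```python
import subprocess
import time

REPO_DIR = "/srv/gitops/edge-pattern"  # hypothetical local clone of the pattern repo

def reconcile():
    """Pull the desired state from Git, then apply it to the cluster.

    Git remains the single source of truth, so every site converges on
    the same configuration without hand-applied changes.
    """
    subprocess.run(["git", "-C", REPO_DIR, "pull", "--ff-only"], check=True)
    subprocess.run(["kubectl", "apply", "-k", REPO_DIR], check=True)

if __name__ == "__main__":
    while True:
        reconcile()
        time.sleep(300)  # re-sync every five minutes
```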
Furthermore, users can contribute improvements back – another example of the open source development model being applied to both the initial deployment and the ongoing operations of a complex, distributed software stack.
Unlike static reference architectures, the validated patterns are continuously tested against current product releases so that your deployment is kept up to date – reducing risk while using the latest capabilities.
Validated patterns also consider other aspects, such as security, that may not be part of the architecture per se but are important to any software deployment. For example, secrets and identity management are essential to most complex deployments, yet they are often left out of “marketectures” or even reference architectures in order to focus on the essential elements.
In part because edge computing deployments need to interact with a messy physical world, they will always have their complexities and unique aspects. However, if you take advantage of what others have learned and keep core principles such as automation, management, and consistency in mind, there’s real business value to be gained at the edge.
[ Discover how priorities are changing. Get the Harvard Business Review Analytic Services report: Maintaining momentum on digital transformation. ]