How to Fight the Coming Latency Wars

Increasing demand for real-time computing, combined with the power of AI, is driving the buildout of the edge. The winners of this transformation will be the companies that can minimize latency.

We certainly live in an age of wonders. We have supercomputers in our pockets, a global Internet, and applications in the cloud. In less than a lifetime, our world of four-channel television and rotary-dial telephones has transformed, turning futuristic science fiction into everyday technology reality.

AI continues to penetrate our lives as it marches toward ubiquity. The 5G rollout is well underway as consumers snap up the latest generation of 5G devices. Software infrastructure and applications are keeping pace with the rapid maturation of cloud-native computing.

Human nature being what it is, we now take the current technology state of affairs for granted, and we want more. Much more. Faster, better, and cheaper – especially faster.

The battleground of the near future, however, is not on our smartphones. It’s not even in the cloud. All these technology trends point to one nexus of exploding innovation and competition: the edge.

And on the edge, we will fight one battle in particular: the war over latency.

Understanding Latency

Latency is essentially the amount of time it takes for a request to go from its source (say, when you click a button in an app or on a website) to its destination and for the response to find its way back.

The lower the latency, the better. We’d all love to have immediate responses, but unfortunately, zero latency is an impossibility. There’s always something slowing us down.
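
To make this concrete, here's a minimal sketch in Python of how you might measure round-trip latency yourself, simply by timing a request and its response (the URL is just a placeholder for illustration):

```python
import time
import urllib.request

def measure_round_trip_ms(url: str) -> float:
    """Time one request/response round trip, in milliseconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(url) as response:
        response.read()  # wait until the full response finds its way back
    return (time.perf_counter() - start) * 1000.0

# Placeholder endpoint for illustration:
print(f"{measure_round_trip_ms('https://example.com'):.1f} ms")
```

That one number bundles together every source of delay along the way.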

In fact, slowdowns come in three basic categories (we'll put some rough numbers on them below):

Transmission speed. Just how fast the bits can move down the wire (or fiber-optic cable, as the case may be).

For any message, there's always latency due to transmission speed, because of one simple fact: the speed of light. No matter what you do, nothing can go faster.

Light, of course, is quite fast enough for most situations (on Earth, in any case) – but the physical limit it imposes on minimum latency can be a factor.

In one millisecond, for example, light travels 186 miles – the distance, say, from New York City to Baltimore. Indeed, a message from one of these cities to the other might take longer than a millisecond – but it will never take less.

Network equipment. The original survive-a-nuclear-war design of the Internet precursor ARPANET required it to establish multiple paths across numerous routers and other pieces of network equipment.

To this day, any Internet request is likely to traverse multiple pieces of network gear, each adding a modicum of latency to the interaction. The more hops, the more latency.

Processing time. Once your request reaches its destination, you expect the application there to do something for you – and that something always takes a bit of time.

Some of this time has to do with the actual bit crunching on the CPU itself – but most of it generally involves interactions with the application’s underlying database.

If your database is in the cloud, then it is likely made up of multiple pieces that must communicate with each other in order to give you the right answer – communication that adds even more to the latency.
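
To put rough numbers on these three categories, here's a back-of-the-envelope sketch in Python. The hop count, per-hop delay, and processing figures are purely illustrative assumptions, not measurements:

```python
SPEED_OF_LIGHT_MILES_PER_MS = 186  # ~186,282 miles per second in a vacuum

def latency_floor_ms(distance_miles: float,
                     hops: int = 10,
                     per_hop_ms: float = 0.1,
                     processing_ms: float = 5.0) -> float:
    """Rough round-trip latency floor: propagation out and back, plus
    per-hop equipment delay in each direction, plus processing time.
    The hop and processing figures are illustrative guesses."""
    propagation = 2 * distance_miles / SPEED_OF_LIGHT_MILES_PER_MS
    equipment = 2 * hops * per_hop_ms
    return propagation + equipment + processing_ms

# New York City to Baltimore: ~2 ms of pure propagation, plus everything else
print(f"{latency_floor_ms(186):.1f} ms")
```

In practice the floor is even higher: light in fiber travels roughly a third slower than it does in a vacuum.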

Why Low Latency Is So Important

Low latency is important whenever real-time behavior is critical. In fact, certain industries have been fighting latency wars for years.

Real-time stock trading, for example, has taken the lead in driving low-latency solutions from tech vendors. Given today’s high-speed algorithmic trading programs, every fraction of a millisecond can mean the difference between earning and losing millions of dollars.

The other industry that has long been extraordinarily latency-sensitive is multiplayer gaming, where the colloquial term for latency is ‘lag.’ If you’re playing, say, Call of Duty, the tiniest bit of lag can make the difference between you fragging your buddy or vice-versa.

While stock trading predictably involves big bucks, and thus the players involved can invest in expensive latency-killing tech, the gaming world is essentially a high-end consumer market. As a result, there has been quite a bit of innovation at the low end of the market (relatively speaking), as gamers may be willing to shell out a couple grand, say, for an ultra-low-latency cable modem or home router.

The Latency Battleground

The coming latency wars, in contrast, are about everything else – extending the low-latency priorities of stock trading and multiplayer gaming to the broader world of IT.

This low-latency focus is especially important anywhere we’d like to have real-time, AI-driven behavior. Today we may be thinking of autonomous vehicles, but we can extend such real-time behavior to automated factory equipment, virtual assistants, or any sort of AR/VR application that relies upon real-time AI.

Today, most AI apps are not fast enough to support real-time decision making, and part of the problem is latency. Given the immaturity of the market, we don’t really know the full power of AI-driven decisions – but rest assured, they will demand latencies low enough to rival the fastest stock trading systems of today.

How We’ll Fight the Latency Wars

Each generation of our computers and network equipment is faster than the previous one – a pattern that has held for decades. We can also take advantage of a new generation of blazingly fast real-time streaming infrastructure software that is reinventing the way data move through and between organizations.

Meanwhile, we can target network equipment latency via special dedicated networks that have fewer hops than the Internet – networks that often have real-time streaming technologies built in.

Faster gear, streaming software, and fewer hops, however, don’t help with that pesky speed of light.

The only way to lower latency due to transmission distance is simply to reduce the distance – in other words, move equipment closer together.
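
A quick comparison shows just how much this matters. Using the same speed-of-light arithmetic as above (with made-up distances), the propagation savings of moving from a distant data center to a nearby edge node are dramatic:

```python
SPEED_OF_LIGHT_MILES_PER_MS = 186

# Illustrative distances: a coast-to-coast data center vs. a local edge node
for label, miles in [("cross-country data center", 2500),
                     ("nearby edge node", 5)]:
    round_trip_ms = 2 * miles / SPEED_OF_LIGHT_MILES_PER_MS
    print(f"{label}: at least {round_trip_ms:.2f} ms round trip")
```

That's nearly 27 milliseconds versus about a twentieth of a millisecond, before any equipment or processing delays enter the picture.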

This principle of proximity underlies the architecture of 5G. 5G, in fact, is more than one protocol – it’s a family of protocols that work at different frequencies, and thus at different ranges as well.

As a result, some 5G equipment goes on existing cell towers, while other gear will be bolted to buildings or installed in closets – the closer to users, the better.

In fact, this strategy of placing equipment in different locations with varying proximity to its users defines the very nature of the edge. Not only does the edge deliver on scalability demands beyond what data center-based clouds can offer, it can also deliver lower latency than any traditional cloud.

Sometimes, however, messages have to go back to the cloud. After all, the whole point of the Internet is to foster distributed communication across the world. Our sneaky ‘move stuff closer’ strategy won’t help with the global nature of the Internet, will it?

It actually can – at least in part. Once again, we take a page out of the multiplayer gaming playbook.

Most of the processing for such games is local, including all the heavy graphics rendering and animation. The only information that has to go up to the cloud is a tiny message that contains the bare facts about the player’s actions – what buttons they push, perhaps, or where their gun is pointing.
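
To see just how tiny that upstream message can be, here's a hypothetical sketch using Python's struct module; the field layout is invented purely for illustration:

```python
import struct

# Hypothetical wire format: player id, button bitmask, aim yaw and pitch.
# One unsigned int, one unsigned short, two floats = 14 bytes, network byte order.
PLAYER_ACTION = struct.Struct("!IHff")

payload = PLAYER_ACTION.pack(
    42,      # player id
    0b0101,  # which buttons are pressed, as a bitmask
    173.5,   # where the gun is pointing: yaw, in degrees
    -12.25,  # where the gun is pointing: pitch, in degrees
)
print(len(payload), "bytes")  # 14 bytes up to the cloud, not megabytes of rendered frames
```

Fourteen bytes cross the network; the gigabytes of rendering work stay on the local machine.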

The same pattern holds for real-time AI-driven applications. Your autonomous vehicle certainly doesn’t need to communicate with the cloud to know how to drive. It must retain most of those skills locally.

The latency war, therefore, will be an ongoing, real-time optimization of edge communications combined with round-trip interactions with hyperscale clouds.

The Intellyx Take

Given all these relevant factors – transmission speed, network topology, processing capabilities, as well as the particular requirements of real-time AI – the coming latency wars are really battles over optimization.

Given the limitations of the speed of light, there will never be a perfect solution to any problem in this space. There will always be tradeoffs. These latency war tradeoffs are driving, and will continue to drive, the architecture of the edge.

If a particular application can reside on a 5G minitower on your street corner instead of in a cloud data center, then you’ll get lower latency, and hence better real-time performance, than with apps that must reside in today’s cloud.

A final point: why do I say ‘today’s cloud’ instead of just ‘cloud’? The answer: tomorrow’s cloud will subsume the edge (or perhaps vice-versa, depending on how you look at it).

The winners of the latency wars, therefore, will be those cloud/edge providers who can deliver the best real-time behaviors to the most customers around the world, via a complex optimization of technology, architecture, and geographic distribution.

For all of us users of this technology, all I can say is – buckle up!

© Intellyx LLC. Intellyx publishes the Intellyx Cloud-Native Computing Poster and advises business leaders and technology vendors on their digital transformation strategies. Intellyx retains editorial control over the content of this document. Image credit: Battle of Agincourt, St. Alban’s Chronicle by Thomas Walsingham, 1415 (public domain).