The Chamber of Tech Secrets has been opened for the second time. First, I’d like to thank everyone who read #1 and elected to subscribe. I am very thankful for so many $1M donations! 🙏 🤣 Just kidding. Donations are turned off.
This week, I’ll employ my extensive (zero) amount of inside information on the future of AR Glasses in order to explore some thoughts about what the infrastructure might need to look like if mainstream AR glasses do emerge. We’ll wrap with a few “Secrets from the Edge”.
Edge Meets AR?
Farewell to Google Glass Enterprise, again. Is this a technology class that is doomed to fail repeatedly, or is it just waiting for an organization that executes on consumer technology with perfection? 👀🍎
Apple is never first, but they are often best. Ideas are easy; execution is hard.
In This Week in Startups #1697, JCal spoke with Andrew McHugh of Wist Labs about his phone video-to-VR application. He mentioned that Apple’s recent developments in camera, depth sensors and APIs suggest that Apple is “preparing for something big” [paraphrase mine]. Could the Apple AR glasses rumors be for real? I have been hoping for a long time that someone would nail the execution in a way that parallels the DarkNet glasses in Daniel Suarez’s Daemon and Freedom™. I assume there are engineering challenges to solve with battery life, heat dissipation, and materials design that make this hard to execute, but maybe we’re getting close(r)? Maybe?
As a thought experiment, let’s assume Apple creates consumer-friendly AR glasses and they are 🔥. What sort of infrastructure footprint will be necessary to make the application experience work? In the vast majority of modern web and mobile applications, we’ve been able to tolerate hundreds of milliseconds of latency without too much trouble. My Twitter refreshes and full Amazon page loads take 1-2 seconds on a fiber internet connection, and that’s perfectly fine for the web user experience. For the fun, sci-fi AR use cases to succeed (metadata overlay on everything you see), every millisecond is going to start to matter.
I listened to a talk by Dr. Satya from Carnegie Mellon this week, and he said something that really resonated. Einstein was fond of saying about theories, “as simple as possible, but no simpler”. Dr. Satya applied this to modern computing and said:
“…we want to be as close to the cloud as possible, but no closer.”
Enabling AR applications certainly seems like a place for a highly distributed, low-latency infrastructure with at least regional, county or city level edge deployments connected to high-speed fiber networks to improve performance and therefore user experience. What will run on the devices and what will need to be off-loaded? What sorts of infrastructure options could emerge?
The Last Mile Owners as “Edge Cloud” Providers
Wireless Last Mile Owners: Cell providers are well positioned as they own the mobile “last mile” connectivity that is near a fiber backbone, and they have mass protocol adoption (LTE/5G). I think they realize computing and developer experience are not their sweet spot, and they are partnering with the hyperscale cloud providers on solutions like AWS Wavelength. These solutions are currently fairly limited in service capabilities (AWS offers EC2 + EBS) and are deployed at a city level in major urban areas. Microsoft and Google are following. For most teams, working with these primitives will be a lot of work, so there is definitely still opportunity to make things more dev friendly at this edge. The speed of light is fast… but are there use cases that will utilize too much bandwidth and justify something even closer, say at the 5G tower or 5G small cell? One can imagine an even more highly distributed environment being possible, but challenging to manage.
Wired Last Mile Owners: Wired last mile owners have a similar advantage in that they can place compute a single wired hop away from their users, which opens the door to low-latency wired / wifi solutions. The challenge here is the regional nature of internet service providers. To get a consistent UX, an application builder would need to ensure their applications were deployed at the ISP edge across ~3,000 ISPs in the US alone. That seems insurmountable at present. This solution seems more viable for enterprise use cases where there is consistency of ISP across physical locations (or at least a smaller number to manage). There are still obvious challenges in managing the differences in primitives and services across that fleet, until a standard emerges (K8s perhaps?).
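To put rough numbers on “the speed of light is fast”, here is a back-of-the-envelope sketch in Python. It assumes signals travel through fiber at about two-thirds of the vacuum speed of light and it ignores routing, queuing, radio, and processing delays, so real round trips will be slower. The tier names and distances are illustrative guesses of mine, not measurements.

```python
# Back-of-the-envelope propagation delay over fiber.
# Assumption: signal speed in fiber is roughly 2/3 the speed of light in vacuum.
SPEED_OF_LIGHT_KM_PER_S = 299_792.458
FIBER_VELOCITY_FACTOR = 2 / 3  # approximate; varies by fiber type


def round_trip_ms(distance_km: float) -> float:
    """Return the best-case round-trip propagation delay in milliseconds."""
    speed_km_per_s = SPEED_OF_LIGHT_KM_PER_S * FIBER_VELOCITY_FACTOR
    one_way_s = distance_km / speed_km_per_s
    return 2 * one_way_s * 1000


# Hypothetical distances from a user to each tier (illustrative only).
TIERS_KM = {
    "cell tower / small cell": 1,
    "city edge": 50,
    "regional data center": 500,
    "distant cloud region": 3000,
}

if __name__ == "__main__":
    for tier, km in TIERS_KM.items():
        print(f"{tier:>24}: {round_trip_ms(km):6.2f} ms round trip")
```

Under these assumptions, every 100 km of fiber adds about 1 ms of round-trip time, so propagation inside a city is a fraction of a millisecond. That is why the argument for going closer than the city edge probably rests on bandwidth and congestion rather than distance.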
Content Delivery Network Edge
Serverless on CDN: Providers such as Akamai, Cloudflare, AWS CloudFront and Fastly are also well-positioned for this opportunity. Many CDNs are starting to enable running Serverless “containers”. Cloudflare Workers allows writing immediate-scale-up apps in JavaScript, Rust, C, and C++ and also provides access to a managed, distributed KV store. AWS CloudFront enables running Lambda functions at its edge. Fastly Compute@Edge has a similar paradigm but has focused on enabling WASM apps.
Private Edge
Businesses w/ Physical Presences: Some solutions—especially those driving business operations—need to work with or without a WAN connection and therefore require on-site edge compute resources to ensure continuity of operations. These cases are usually crafted due to connectivity availability constraints more than latency, but the proliferation of bandwidth-hungry machine learning use cases will also drive an increasing number of on-site edge deployments w/ GPU. Commercial, industrial, and medical applications come to mind as likely consumers of this model.
Consumer Device Edge: End user devices such as cell phones and tablets have been running applications with unreliable connectivity to the cloud for a long time. These devices continue to grow in capability and enable increasing amounts of offline processing. IMO, they qualify as edge devices and are likely to run more workloads in the future, including ML workloads. I fully expect any Apple AR Glass implementation to lean heavily on a paired iPhone to offload a lot of the processing. From there, the network hops start. Companies like NimbleEdge are also attempting to enable ML training and inference on mobile devices without long, expensive round trips to the cloud and without movement of personal data off the device. And at the consumer edge, let’s not forget the car.
Something else
Compute Mesh: Public and private clouds are great and should be maximized, but the future of computing will involve the edge. My hypothesis is that we see a hybrid model (what I am calling the “compute mesh”) including hyperscale public cloud, private cloud deployments, CDN edges and POPs w/ Serverless capability, city/community edges, private edge, and end user device edge. Perhaps the future is a compute mesh that enables running workloads anywhere based on advertised compute capabilities and availability/latency SLAs, with tons of dynamic workload placement orchestration that is handled automatically.
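To make the placement idea a little more concrete, here is a toy sketch in Python of what “run anywhere based on advertised capabilities and latency SLAs” could look like. Everything in it is hypothetical: the `Location` and `Workload` types, the `place` function, the capability names, and the latency and cost figures are all invented for illustration, and a real orchestrator would also have to handle availability, capacity, and data locality.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Location:
    """A place in the hypothetical compute mesh, advertising what it offers."""
    name: str
    latency_ms: float        # advertised latency to the user
    capabilities: frozenset  # e.g. {"gpu", "kv-store"}
    cost_per_hour: float     # illustrative relative cost


@dataclass(frozen=True)
class Workload:
    name: str
    max_latency_ms: float    # latency SLA the workload needs
    required: frozenset      # capabilities the workload needs


def place(workload: Workload, mesh: list[Location]) -> Optional[Location]:
    """Pick the cheapest location that meets the SLA and capability needs."""
    candidates = [
        loc for loc in mesh
        if loc.latency_ms <= workload.max_latency_ms
        and workload.required <= loc.capabilities
    ]
    # Cheapest first; ties are broken by lower latency. None if nothing fits.
    return min(candidates, key=lambda loc: (loc.cost_per_hour, loc.latency_ms), default=None)


# Made-up mesh spanning the tiers discussed above.
MESH = [
    Location("paired-phone", 2, frozenset({"ml-inference"}), 0.0),
    Location("city-edge", 8, frozenset({"gpu", "ml-inference"}), 3.0),
    Location("cdn-pop", 15, frozenset({"kv-store"}), 0.5),
    Location("public-cloud", 60, frozenset({"gpu", "ml-inference", "kv-store"}), 1.0),
]

if __name__ == "__main__":
    overlay = Workload("ar-overlay", max_latency_ms=10, required=frozenset({"gpu"}))
    batch = Workload("nightly-training", max_latency_ms=1000, required=frozenset({"gpu"}))
    for w in (overlay, batch):
        chosen = place(w, MESH)
        print(w.name, "->", chosen.name if chosen else "no placement")
```

In this sketch the latency-sensitive overlay lands on the city edge because nothing cheaper meets its SLA, while the batch job falls back to the cheaper public cloud. The point is only that placement becomes a constraint-matching problem once locations advertise what they can do.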
For now, the art will be in piecing together all of these available components to give users the best possible experience while maintaining an architecture that is reliable, cost-reasonable and simple enough to not overwhelm humans with cognitive load.
Generally, winners emerge based on the simplicity of the developer experience that can be offered, so it will be interesting to watch and see how this industry evolves in the months and years ahead.
Secrets from the Edge
7,500 Node K8s Cluster… Say what?! What does OpenAI’s Kubernetes infrastructure have to do with the Edge? One parallel that jumped out quickly was that their success seems to be due in part to a minimalist usage strategy; just enough Kubernetes to do the job. That resonates with me as it parallels our philosophy when selecting K3s for the edge at Chick-fil-A, where we essentially use it as a job scheduler with a rich ecosystem of supporting tools.
I think Matt hit this on the head with his opinion: “the developer experience and cloud-native integrations of Kubernetes more than make up for some of the shortcomings. Developers today deploy with containers. Nodes are heterogeneous (and ephemeral). Secrets, blob storage, and volume mounts other than NFS are necessary. You have to build many of these things in HPC, but it's much easier in Kubernetes. Developer experience matters [emphasis mine].”
A familiar architecture: I haven’t tried out AKS Edge Essentials, but it looks shockingly familiar. Kubernetes on the Edge in single or multi-node clusters with a management control plane via Azure Arc. The use case example is about a machine learning model for cooking fries that is adjusted dynamically at the edge based on real-time inputs from point-of-sale. That sounds… just like what I said at QConNY in 2018 and what we shared in a blog post about Edge Computing at Chick-fil-A. It’s interesting to see this type of edge computing architecture go mainstream via the hyperscale cloud providers.
Disposable Units of Compute (DUC instead of NUC): A few weeks back I attended Edge Field Day and did some reflecting on our edge design decisions. One of the biggest challenges of the edge is the unwieldy number of places something can (and does) go wrong. While it won’t work for every use case, we have found a lot of success in our work at Chick-fil-A by treating our devices as “cattle instead of pets”, just like we do in the cloud. Our NUCs are DUCs: Disposable Units of Compute. Rather than constantly logging in to devices to attempt to resolve cluster- or node-specific issues, we built a system that enables “wiping” them back to their initial state and re-initializing. This makes ops much easier, but does require edge developers to build with an ephemeral mindset. Your app could be killed at any time with short (or zero) notice. Therefore, the pattern is to send data to the cloud whenever possible and be capable of rehydrating app state upon failure, similar to what happens when you get a new iPhone. If our attempts to re-image a troublesome node fail, we’d rather ship a replacement NUC and attempt to re-image out-of-band later (and dispose of repeat offenders). To do this, you can’t rely on humans and need an architecture that supports capabilities like Zero Touch Provisioning (plug in power and ethernet and the rest is automated). In short, automation is critical at the edge and scaling with humans is a hopeless endeavor.
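Here is a minimal sketch in Python of the “send data to the cloud and rehydrate on start” pattern described above. It is not our actual implementation. `CloudStore` is an in-memory stand-in for whatever durable cloud service you would really use, and `EdgeApp`, `site_id`, and the `orders_seen` counter are hypothetical names invented for the example.

```python
import json


class CloudStore:
    """Stand-in for a durable cloud service (hypothetical; in-memory here)."""

    def __init__(self):
        self._blobs = {}

    def put(self, key: str, value: str) -> None:
        self._blobs[key] = value

    def get(self, key: str, default=None):
        return self._blobs.get(key, default)


class EdgeApp:
    """An edge app that treats everything on the local node as disposable."""

    def __init__(self, site_id: str, cloud: CloudStore):
        # State is keyed by site, not by device, so a replacement device
        # at the same site picks up where the old one left off.
        self.site_id = site_id
        self.cloud = cloud
        # Rehydrate on every start; a wiped node just looks like a fresh start.
        saved = cloud.get(self._key())
        self.state = json.loads(saved) if saved else {"orders_seen": 0}

    def _key(self) -> str:
        return f"state/{self.site_id}"

    def handle_order(self) -> None:
        self.state["orders_seen"] += 1
        # Push state to the cloud right away: the node may be wiped at any time.
        self.cloud.put(self._key(), json.dumps(self.state))


if __name__ == "__main__":
    cloud = CloudStore()
    app = EdgeApp("site-001", cloud)
    app.handle_order()
    app.handle_order()
    del app  # simulate the node being wiped or replaced
    restored = EdgeApp("site-001", cloud)
    print(restored.state)
```

Because nothing important lives only on the device, wiping or swapping the node costs the app nothing more than a restart. The trade-off in this sketch is a cloud write per event; a real app would likely batch writes and buffer locally while the WAN is down.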
Thanks for reading. I hope you found this useful and/or fun. If you did… awesome. My purpose is to put an idea in your brain or a link in front of you that sends you off down the rabbit hole of learning. I have a ton of topics queued up for the coming weeks that I can’t wait to share. If you are interested in my take on any particular topic, please let me know. With that, it’s time to close the Chamber of Tech Secrets. Until next week.