The Chamber of Tech Secrets has been opened. The AI 🤖 hype is off the charts. We'll ignore that and focus on an area that hasn't gotten as much attention lately but that is at the foundation of most of these LLM systems: cloud ☁️.
The Cloud Chooses You
I have been involved in cloud computing since ~2014 in my role at Chick-fil-A. Since that time, we've been on a growth journey.
In extremely oversimplified form, it went something like this…
Use Redshift to get insights into our history of restaurant transactions / line items, a dataset too big for our traditional databases.
Use Elastic Beanstalk, DynamoDB, and a few other EC2-based services to build a customer-facing digital experience.
Shift to DevOps and "Cloud First". Start building engineering capability. Simple and fast with a few services.
Grow to ~100 product engineering teams (of varying sizes) across tons of cloud accounts and countless cloud services, both native to the cloud provider as well as SaaS and OSS services (Databricks, Kubernetes, ArgoCD, GitHub/Actions, etc).
Grow to utilizing services in every hyperscale public cloud and lots of other SaaS infrastructure services / tools. The cloud chose us. All of them.
Reflecting on that experience, here are 25 things I have learned:
Start simple and keep it simple as long as possible. I mean one cloud provider and as few services as necessary.
Act like your product teams are startup companies.
The cloud primitives (in AWS: EC2, S3, DynamoDB, etc.) are excellent building blocks. A huge portion of the other cloud services are built on top of them, so they have to be rock-solid. Use them.
In the beginning, product teams should have their own accounts, one per environment. Consolidate later. Speed over avoiding duplication.
Get security comfortable from the start. Our security team quickly said "we can be more secure in the public cloud than in our own co-lo datacenter," which was game on for cloud adoption. Next, build a solid security foundation that helps detect/remediate mistakes and bad practices quickly.
Build great relationships with your cloud provider's people, both Account Management / TAMs and product people. These are good people who usually want to solve problems and want to help you whenever they can.
Be careful of n+1: use a small set of cloud services exceptionally well. Build with primitives and strategically incorporate other no-brainer value-added services. Don't use every new service.
Start by augmenting with OSS tools and running them yourself. A small, focused team can do this successfully for quite a while.
Entropy will increase with time. Things are unlikely to ever be simpler than they are today. Act with intentionality.
A cloud provider will almost always have more resources to build a given service than you do, and will likely build a better service than you will (if it's not your core business).
A specialized cloud service (à la Databricks or Wiz) will almost always have more resources to build a given service than a cloud provider does, and will likely build a better service than a cloud provider will (primitives exempt).
Be aware of the "MVP factor": a lot of cloud services get stuck at MVP+. Sometimes opting for a focused SaaS solution (Cloudflare vs Cloud CDN, Datadog vs CloudWatch, CockroachDB vs cloud-managed PostgreSQL) may be the better route. No knock on cloud providers: this is a business, and it doesn't always make sense for them to go further.
"Fully Managed" means a lot of different things.
Every service has limits. For simple apps this rarely matters, but when you scale, you'll hit these limits. Get to know them early and plan to adopt OSS, buy SaaS, or build your own solution if you can't influence the cloud provider to raise them.
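When you do hit a limit at runtime, it typically surfaces as throttling errors. A common mitigation is exponential backoff with full jitter; here's a minimal, provider-agnostic sketch, where `ThrottledError` is a stand-in for whatever rate-limit exception your SDK actually raises:

```python
import random
import time


class ThrottledError(Exception):
    """Stand-in for a provider throttling error (e.g. an SDK rate-limit exception)."""


def with_backoff(call, max_retries=5, base=0.5, cap=8.0):
    """Retry a throttled call, sleeping with exponential backoff and full jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except ThrottledError:
            if attempt == max_retries - 1:
                raise  # out of retries; surface the error to the caller
            # sleep a random amount up to min(cap, base * 2^attempt) seconds
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

The jitter matters: without it, a fleet of throttled clients all retries in lockstep and re-creates the spike that got them throttled in the first place.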
The longer you utilize cloud, the more your ecosystem will grow to include non-cloud services that do a plethora of things (IDM providers, GitHub for SCM, GitHub Actions for workflows, security tools like Wiz, etc.).
At some point, you'll need to build a platform on top of your cloud services. The goal of this is to get back to making things simple like they were at the start. Product teams operating like startups.
Complexity will exist. Hide as much as possible with solid platform engineering.
Most apps donât need to be multi-cloud.
Some organizations might need to be multi-cloud. At some point, most enterprises get concerned about having "all their eggs in one basket" with a single cloud provider. If you are here, I'd suggest it's past time for a platform.
Avoid the "least common denominator" approach unless you have a really, really, really good business reason to run the same application across multiple clouds.
Every system has gravity. Some have a lot. In our org it was Microsoft O365 leading to Power Platform and some supporting Azure Services and Oracle ERP in Oracle Cloud.
Avoid "easy button" solutions, which are rarely "easy" or "buttons".
Sometimes the cloud(s) choose you. All of them might choose you.
Avoid sprawl: "box in" the usage of the "cloud chooses you" clouds as much as possible. Sprawl is costly in every way.
Use hybrid cloud as a bridge: one thing we considered (but never implemented) was a hybrid cloud account with an AWS Direct Connect link back to our co-lo datacenter. This would have let us build APIs in front of our legacy apps using a familiar software development paradigm until those apps made their way to the public cloud.
These are all based on my experience and research and may not apply to every situation, but I suspect they apply to most. Wherever you are in your journey, I think what is most important is to keep things as simple as possible and to grow (adding people and tech) as carefully as possible… which lets you focus on your product instead of your infrastructure and tools.
Secrets from the Edge
Where is Apple 🍎 in the LLM game? The on-chip + call-external-API-when-needed model could be a game changer.
Every major tech company except Apple has announced their own LLM.
Apple has spent years perfecting its on-device Neural Engine, which is capable of some absolutely insane operations: loads of computing in a small, energy-efficient form factor.
With M1, M2, and soon M3, the Neural Engine is even more powerful than in their A-series mobile chipsets.
While we currently need the cloud to run ChatGPT, and it's clunky, I think Apple is going to blow everyone out of the water here, both on desktop-class hardware and mobile.
I think Apple will launch its own secure and private LLM that runs on devices (edge computing), and when necessary offloads heavier workloads to a cloud-based LLM optimized for those tasks. So initially we will have a hybrid.
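To make that hybrid idea concrete, here's a hypothetical routing sketch. Every name here is my own illustration, not a real Apple API, and the token-budget heuristic is just one plausible way to decide what stays on device:

```python
from dataclasses import dataclass


@dataclass
class Request:
    prompt: str
    estimated_tokens: int  # rough size of the job


def run_on_device(req: Request) -> str:
    # placeholder for a small, private on-device model
    return f"device:{req.prompt}"


def run_in_cloud(req: Request) -> str:
    # placeholder for a larger cloud-hosted model
    return f"cloud:{req.prompt}"


def route(req: Request, device_budget: int = 512) -> str:
    """Keep small jobs local for privacy and latency; offload heavy ones."""
    if req.estimated_tokens <= device_budget:
        return run_on_device(req)
    return run_in_cloud(req)
```

The interesting design question is where that budget line sits, since everything below it never leaves the device.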
Databricks Dolly: a small dataset yielding human-like interactivity could be a breakthrough insight.
A new Intel NUC! Congrats to my friend Brian McCarson and his team at Intel.
I hope you enjoyed this week's Chamber of Tech Secrets. Steve Jobs does… so you should too.