Chaos is What Killed the Dinosaurs, Darling

A Heathers-inspired take on what product managers can learn from the success of Snowflake

Sep 6, 2022

These thoughts are my own and do not represent my current employer or any of my former employers. Many thanks to Jon Natkins for comments on a draft of this post.

Greetings and Salutations — Are you a database?

A few months back my good friend Natty wrote a blog post asking the question, “Is Snowflake a database?” As I read and reflected on it, I found myself pondering the significance of the question itself. (As one does when asking such deep questions, I found my ponderings along the way laced with Heathers quotes, and so this blog post shall be.)

The fact, demonstrated by Natty, that the answer is not at all straightforward illustrates something deeper, both about Snowflake as a product and the cloud as a technology. In this post I aim to explore the “question behind the question,” and what product managers and technologists can learn from studying what Snowflake is, why it’s been so successful, and how the cloud makes it all possible.

Credit — Natty’s post (I didn’t know Natty was a meme lord but now I do)

I certainly would recommend reading Natty’s post in its entirety, but if I were to summarize the conclusion, it’s that yes, Snowflake is a database (or, more precisely, a collection of databases) — but it’s not just a database. Natty argues that Snowflake is a platform that provides a single entry point to multiple types of technologies and workloads, with different interfaces that abstract their varying architectures through the prism of the larger platform. In short, Snowflake makes it such that the user interfaces with the platform, explicitly not the architecture.

I think this illustrates something fundamental about the technological world we live in. In my assessment, the secret sauce behind the success of the great cloud companies like Snowflake, AWS, and Salesforce is that they make it such that the user doesn’t need to worry about the underlying technology or architecture. They abstract away the technological complexity from the user and allow them to focus on the business value they are trying to obtain — their “job to be done” (as defined in this book). When a product does a great job of removing the friction between the technical implementation of a solution and the business value, great things happen.

How Snowflake Bought the Slushy

Shortly after the Snowflake IPO, a product manager friend of mine who works in a different technology space asked me, “So, what’s the deal with this Snowflake company?” The person asking this question was curious as to what made Snowflake so different from the rest of the bloated “big data” market they’d been hearing about for years. Being a nascent product manager at the time and wanting to sound smart, I stuck to the technology explanation known by most data professionals. I recall giving roughly the following explanation:

Snowflake separates storage and compute, so you won’t take down the database with a bad query or fight for resource contention.
Snowflake has a proprietary query engine that is pretty fast.
Snowflake has a rich ecosystem of partners and integrators.
Snowflake has a strong, experienced leadership team.
Snowflake has a go-to-market strategy that capitalizes on the pervasive belief that bloomed in the 2010s that “something isn’t right” with the status quo enterprise data stack.

When I look back on it now, I think that was a very reductive answer. Here’s why:

Grow up, Heather, data warehouses are so ‘87

Firstly, if you talk to any old-timers in the data space, they’ll all say the same thing: how, exactly, is any of this new? Oracle has been doing all of this for what seems like hundreds of years. Federated query engines have been around forever; storage and compute have been separable forever; Natty specifically mentions Vertica and the early theoretical underpinning of the read-optimized columnar database going back to 2005.

Generally, from a product point of view, I’ve found that discussing a product on its technological merits alone doesn’t get you very far. I think there are two reasons for this:

First, people more knowledgeable than me or you will almost always be able to explain why thing B is certainly just a natural extension of thing A that’s been around since Ada Lovelace. Framing a product as valuable because it is a “revolutionary technology” will leave you more likely to be wrong than right. On that note, I’d be remiss if I didn’t include two real quotes whose speakers I will leave anonymous:

“I could outperform Snowflake with my Pig scripts.”
“Snowflake is just Teradata in the cloud.”

Second, it bears repeating that customers don’t buy technology, they buy products: the total package of what the technology enables (this was a critical mantra behind the early successes at Intel, as chronicled in Marketing High Technology).

Technology is a means to an end, but not an end unto itself. Something isn’t really a product, i.e. something worth paying for, unless the value it unlocks can be framed in a business context (things like time to value, cost savings, and efficiency). It’s no accident that most people in the data world can’t tell you much detail about the technology behind Snowflake — but they can pretty quickly rattle off Snowflake’s business value proposition as a product.

SQL — to think there was a time I actually thought you were cool

Another reason why my answer to my friend was reductive is this: anyone who lived through the Hadoop age knows that no matter how much effort you spend thinking about file partitions, block size, compaction, delivery guarantees, and lambda architectures, your business stakeholders will come back with the same request — can I just query the data in SQL, please? The business user who wants to write SQL doesn’t care about the technological underpinnings of your solution. People want SQL for breakfast, lunch, and dinner. People even want SQL to query streaming workloads, for crying out loud.

Attempts to provide “SQL for big data” led to many different technical approaches, most of which, I think we can broadly agree, failed spectacularly as standalone products (but not without leaving behind the cannot-unsee elephant’s head on a bee’s body):

Try keeping this image out of your mind at 3am on a cold night. (Credit)

The Hive Metastore has certainly had a longer life; it serves as a technological component of Databricks’ platform, and was the main frame of reference from which the Apache Iceberg project was launched, and it can be argued that the technology achieved its goals. But from a product perspective, there’s hardly anything left. Most of what is left of the SQL-for-big-data efforts takes the form of modern federated query engines, most of which can yield incredible query performance but with the same big catch — you actually have to think about how you store your data.

This is a big problem for these solutions as products (not as technologies). The problem is evident to anyone who ever had to explain to management why reports still hadn’t finished loading due to a file compaction issue, or why the “Spark script” written by an expensive consultant featuring a SQLContext with a query joining 20 tables of disparate size was failing due to an out-of-memory error (perhaps you can tell, that included me). The technology was there, but the product was unusable — you had to really, deeply think about your data infrastructure to keep these types of issues from happening.

You don’t have to think about that with Snowflake.

Dear Diary, my Big Data cluster has a body count

If it’s true that (1) people want SQL and (2) there’s always been a tool for that, why was there a need for something different in the data space? Well, I think that’s where the long list of failed big data projects and companies does have something to tell us. There is new technology available to us; there are better ways of doing things. As data practitioners, we deserve something better, even if we don’t quite know what it is.

What Snowflake did that many of those companies did not was to hide the architecture, not bring it to the forefront. This is the exact opposite of what the Hadoop companies were doing around the time Snowflake began to gain traction. Snowflake goes to great lengths to hide the architecture even today; as of this writing their website is splattered with the phrase “data cloud,” a term so technically nebulous it could really mean whatever you want it to mean — and I bet that’s the point. You put the data in the cloud and Snowflake does the rest.

Yet even though Snowflake has a rich platform with expanded APIs for different workloads like streaming, pluggable interfaces for external storage, and built-in solutions for governance, these are just window dressing for the real reason to care about Snowflake: to get value from your data, you don’t have to care about any of that. If you have a highly specialized need and Snowflake doesn’t provide you a built-in technical solution, you can bet that if it has traction, either they will add it in the product directly or one of their partners will. But the expectation will be that it is a cloud-first solution, meaning you’ll get the value you seek from the upgrade in technology without having to concern yourself with the nuts and bolts of how it works. And, yes, it is good to know and understand the nuts and bolts of technical solutions, but the time to value is much longer when you do it yourself. This is a crystal-clear application of the cloud value proposition in action.

The formula for success with Snowflake is (1) put your data in Snowflake, (2) run queries, and (3) get business value. Compare that to the path to success for everything that came before Snowflake, and you will see something entirely different.
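To make the shape of that three-step formula concrete, here is a minimal sketch in Python. It uses the standard library's sqlite3 module purely as a stand-in so the example runs anywhere; it is not Snowflake, it shares none of Snowflake's internals, and the `orders` table and its values are made up for illustration. The point is only what the user does not have to write: nothing about partitions, block sizes, or compaction appears anywhere.

```python
import sqlite3

# Stand-in for a managed warehouse connection. sqlite3 is an assumption made
# so the sketch is runnable; it is not how Snowflake works under the hood.
conn = sqlite3.connect(":memory:")

# (1) Put your data in. The "orders" table and its rows are hypothetical.
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 120.0), ("west", 80.0), ("east", 30.0)],
)

# (2) Run queries, in plain SQL, with no storage-layout decisions.
rows = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()

# (3) Get business value: here, revenue per region.
revenue_by_region = dict(rows)
print(revenue_by_region)
```

Everything a Hadoop-era pipeline would have forced the user to reason about sits behind the connection object, which is roughly the experience the formula above describes.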

How Very — What can we learn from Snowflake’s success?

What would you do if you woke up with $5 million, knew the world was ending in 2 days, and could build whatever product you wanted? (Credit)

For me, studying the Snowflake story yields two very important lessons, from both a technology and a product perspective:

Relentlessly focus on the job to be done. What Snowflake did and does so well is to enable the buyers of their product to focus on meeting their business needs. Having great technology and architecture is a necessary but not sufficient condition to build a great product (credit to Raji Narayanan for taking me back to writing abstract algebra proofs in all my product thinking). The sufficient condition for a great technology product is that it gets the job done. From there, the question of architecture becomes a build-buy decision that can be addressed via accurate pricing.
The cloud is the key component to abstracting your product’s architecture and focusing on business value. You can’t abstract away architecture when the user has to deploy something (and if you try, it had better have an installer as easy to use as Zoom’s). Back when you had to install huge Oracle appliances or, later on, Hadoop clusters on your premises, architecture was of critical importance (try running a Spark job that shuffles terabyte data frames across the cluster with a tiny network switch and see what happens). Has any Snowflake customer had to think about their Snowflake network architecture, outside of perhaps how to set up a connection that doesn’t go over the public internet? Has any Snowflake customer had to think about the storage format of their data unless they wanted to? I’m convinced you can only get this type of abstraction with the cloud.

In short — make it really easy for the customer to do a really good job doing what they want to do. Easier said than done, but it’s critical in the cloud world that products take on the “making it easy” part themselves. Snowflake did a lot of hard work so you don’t have to. Product managers need to always ask themselves, is my product doing the same?