ELI5: Understanding Cosmos DB

5 min read · Mar 17, 2025

Databases have long been the backbone of modern applications, but for years, achieving high availability and global scale with relational databases was both expensive and complex. With the advent of Azure Cosmos DB, many of the challenges that once required intricate failover mechanisms, synchronized replication, and expensive infrastructure investments have been abstracted away. Cosmos DB is designed for the cloud era, providing a seamless multi-region, multi-model database experience that simplifies global application deployments.

This article is based on a lecture I gave to my LEAP intern, who had zero background in Azure, relational databases, or NoSQL databases. I used simple metaphors to break down complex technical concepts and make them easier to grasp. We’ll explore how Cosmos DB differs from traditional relational databases, its unique hierarchical structure, and how it enables active-active and active-passive replication scenarios with ease.

The Hierarchy of Cosmos DB

To understand Cosmos DB, it helps to visualize its structure using an analogy. Think of an Azure subscription as your house and a resource group as a room in that house. Inside that room you can place different objects, one of which is a Cosmos DB account. This account is the entry point for creating and managing data storage, and each account is created for one of several supported APIs, such as NoSQL, MongoDB, or Gremlin.
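The house, room, and object nesting shows up directly in the resource ID that Azure assigns to an account. The sketch below builds such an ID from its parts; the subscription GUID, resource group name, and account name are made-up placeholders, and the helper function is hypothetical rather than part of any Azure SDK.

```python
def cosmos_account_resource_id(
    subscription_id: str, resource_group: str, account_name: str
) -> str:
    """Build an Azure resource ID for a Cosmos DB account.

    Hypothetical helper for illustration only; not an SDK function.
    """
    return (
        f"/subscriptions/{subscription_id}"      # the house
        f"/resourceGroups/{resource_group}"      # a room in the house
        f"/providers/Microsoft.DocumentDB"       # the kind of object
        f"/databaseAccounts/{account_name}"      # the object itself
    )


# Placeholder values, assumed for the example.
example_id = cosmos_account_resource_id(
    "00000000-0000-0000-0000-000000000000", "rg-demo", "cosmos-demo"
)
print(example_id)
```

Reading the ID from left to right walks the same path as the analogy: subscription first, then resource group, then the account.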

Within a Cosmos DB account, we create databases, and inside each database we define containers, which serve as the primary unit for storing data. If you’re coming from a relational database background, you might think of containers as tables, but there’s a key difference: each entry (or document) in a container has a unique ID, a partition key, and stores data in JSON format. This schema-less approach makes it much more flexible than traditional relational tables, where you must define a strict schema upfront.
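To see what schema-less means in practice, here is a minimal sketch using plain Python dictionaries rather than the Cosmos DB SDK. The container contents, the `customerId` partition key, and the `group_by_partition` helper are all assumptions made up for the example. Note that the three documents share an `id` and a partition key property but otherwise carry different fields.

```python
import json
from collections import defaultdict

# Three documents for one hypothetical "orders" container.
# Only "id" and the partition key property appear in every document.
items = [
    {"id": "order-1", "customerId": "alice", "total": 42.5},
    {"id": "order-2", "customerId": "bob", "total": 10.0, "giftWrap": True},
    {"id": "order-3", "customerId": "alice", "items": ["book", "pen"]},
]

PARTITION_KEY = "customerId"  # assumed partition key property


def group_by_partition(docs, partition_key):
    """Group document IDs by partition key value (illustrative helper)."""
    partitions = defaultdict(list)
    for doc in docs:
        partitions[doc[partition_key]].append(doc["id"])
    return dict(partitions)


partitions = group_by_partition(items, PARTITION_KEY)

print(json.dumps(items[1], indent=2))  # each document is just JSON
print(partitions)
```

Documents with the same partition key value end up grouped together, which is the role the partition key plays when Cosmos DB distributes data.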

Unlike SQL Server, Cosmos DB doesn’t provision dedicated servers or VMs for users. Instead, it allocates Request Units (RUs), which can be thought of as frequent flyer miles: a defined amount of throughput that applications consume. The reason I call them frequent flyer miles is that there is no direct reference to physical infrastructure; instead, Cosmos DB introduces its own currency as a derivative metric of the underlying infrastructure. This completely abstracts the underlying provisioned infrastructure from the Cosmos DB consumer, ensuring scalability without the need for manual…
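To make the frequent flyer miles idea concrete, here is a toy model of a per-second RU budget. It is only a sketch of the concept, not how the service is implemented. The `RequestUnitBudget` class is hypothetical, and the per-operation costs are illustrative assumptions: a point read of a 1 KB item is documented as costing 1 RU, while the write cost used here is a round number chosen for the example.

```python
class RequestUnitBudget:
    """Toy per-second RU budget; a conceptual model only."""

    def __init__(self, ru_per_second: float) -> None:
        self.ru_per_second = ru_per_second
        self.remaining = ru_per_second

    def start_new_second(self) -> None:
        """Refill the budget, as if a new one-second window began."""
        self.remaining = self.ru_per_second

    def try_consume(self, cost: float) -> bool:
        """Spend RUs if enough remain; return False if the request would be throttled."""
        if cost > self.remaining:
            return False
        self.remaining -= cost
        return True


POINT_READ_RU = 1.0  # point read of a 1 KB item
WRITE_RU = 5.0       # assumed cost for a small write, for illustration

budget = RequestUnitBudget(ru_per_second=10)
results = [
    budget.try_consume(WRITE_RU),       # 10 -> 5 RUs left
    budget.try_consume(WRITE_RU),       # 5 -> 0 RUs left
    budget.try_consume(POINT_READ_RU),  # nothing left, so this is rejected
]
budget.start_new_second()
after_reset = budget.try_consume(POINT_READ_RU)

print(results, after_reset)
```

In the real service, a request that exceeds the provisioned throughput is rejected with an HTTP 429 response and can be retried later; the `False` return value stands in for that here.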

Written by Mark Tinderholt
