Kubernetes Storage Explained: Volumes, PVs, and PVCs

Tags: Kubernetes, Storage, PersistentVolume, StatefulSet, AWS
Diagram of a pod claiming storage via a PersistentVolumeClaim bound to a PersistentVolume backed by an EBS volume

Summary

Pods are disposable, so how does data survive? PersistentVolumes, PersistentVolumeClaims, StorageClasses, and StatefulSets explained, with EBS/EFS examples.

Short answer: Pods are disposable, so anything written inside a Pod disappears when it does. For data that must survive, Kubernetes uses a PersistentVolume (the actual storage) that a Pod claims with a PersistentVolumeClaim. On AWS, those volumes are typically backed by EBS or EFS. Stateful apps also use a StatefulSet for stable identity.

Part 8 of the series. Previous: Kubernetes Networking Explained.

Introduction

Kubernetes was built for stateless apps first, which is why storage feels like an add-on. But real systems have databases, ledgers, and caches. This article explains how to keep data safe when the Pods holding it can vanish at any moment.

The problem

Your Trade Ledger writes records to local disk inside its Pod. The Pod gets rescheduled to another Node during a deploy — and the data is gone, because a Pod's filesystem is as disposable as the Pod. For a financial ledger, that is unacceptable. Data needs to live independently of any single Pod.

Simple explanation

  • A Pod's own storage is ephemeral — it dies with the Pod.
  • A PersistentVolume (PV) is durable storage that exists independently of Pods.
  • A PersistentVolumeClaim (PVC) is how a Pod requests storage ("I need 20Gi"). Kubernetes binds it to a PV.
  • When a Pod is rescheduled, the replacement Pod mounts the same claim. The claim stays bound to its PV the whole time, so the data is still there.

Official Kubernetes concept

  • Volume: storage mounted into a Pod (some volume types are ephemeral, some durable).
  • PersistentVolume (PV): a cluster resource representing real storage.
  • PersistentVolumeClaim (PVC): a request for storage that binds to a PV.
  • StorageClass: defines how PVs are dynamically provisioned (for example, an EBS gp3 class).
  • StatefulSet: manages stateful Pods with stable names and stable per-Pod storage.

How it works

You usually do not create PVs by hand. You define a StorageClass once, then create a PVC that references it. Kubernetes dynamically provisions a matching PV (for example, by creating an EBS volume) and binds it to the claim. The Pod mounts the PVC as a directory. If the Pod is rescheduled, the volume is detached from the old Node and attached to the new one, so the data persists. For databases that need stable identity and storage per replica, a StatefulSet gives each Pod its own PVC and, together with a headless Service, a stable network name.
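As a minimal sketch of the "define a StorageClass once" step, the manifest below assumes the AWS EBS CSI driver is installed in the cluster. The class name `ebs-gp3` is chosen to match the name used in this article's PVC example; it is a name you pick, not a built-in.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3                 # assumed name; PVCs refer to it via storageClassName
provisioner: ebs.csi.aws.com    # the EBS CSI driver creates the volumes
parameters:
  type: gp3                     # EBS volume type
# Wait until a Pod using the claim is scheduled before creating the volume,
# so the volume is created in the same Availability Zone as that Pod.
volumeBindingMode: WaitForFirstConsumer
```

Any PVC that names this class triggers creation of a new EBS volume and a matching PV; no PV manifest is written by hand.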

Finance example

Your Trade Ledger runs as a StatefulSet with one PVC per replica, each backed by an EBS volume. When a ledger Pod is rescheduled during a node upgrade, its EBS volume is attached to the Node running the replacement Pod and every committed trade is still present. A separate read-heavy reporting cache might instead use EFS (shared, multi-reader) so several reporting Pods read the same dataset.
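The ledger setup described here could look roughly like the following. This is a sketch under assumptions: the names `trade-ledger` and `data`, the image reference, the port, and the replica count are all hypothetical, and the `ebs-gp3` StorageClass is assumed to exist already.

```yaml
# Headless Service: gives each replica a stable DNS name
# (trade-ledger-0.trade-ledger, trade-ledger-1.trade-ledger, ...).
apiVersion: v1
kind: Service
metadata:
  name: trade-ledger
spec:
  clusterIP: None
  selector:
    app: trade-ledger
  ports:
    - name: tcp
      port: 8080                # hypothetical application port
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: trade-ledger
spec:
  serviceName: trade-ledger     # must match the headless Service above
  replicas: 3
  selector:
    matchLabels:
      app: trade-ledger
  template:
    metadata:
      labels:
        app: trade-ledger
    spec:
      containers:
        - name: ledger
          image: registry.example.com/trade-ledger:1.0.0   # hypothetical image
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: data
              mountPath: /data
  # One PVC is created per replica from this template:
  # data-trade-ledger-0, data-trade-ledger-1, data-trade-ledger-2.
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: ebs-gp3
        resources:
          requests:
            storage: 20Gi
```

Because each claim name is derived from the Pod's ordinal, a recreated `trade-ledger-1` always gets `data-trade-ledger-1` back rather than a fresh, empty volume.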

C# example

Your service just reads and writes a mounted path; Kubernetes handles durability:

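using System.IO;

// The PVC is mounted at /data; treat it as durable storage.
var ledgerPath = "/data/ledger";
Directory.CreateDirectory(ledgerPath); // a fresh volume is empty; no-op if the directory exists
// `trade` is the application's trade object (Symbol, Quantity, Price).
await File.AppendAllTextAsync(
    Path.Combine(ledgerPath, "trades.log"),
    $"{trade.Symbol},{trade.Quantity},{trade.Price}\n");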

A PVC requesting durable storage:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ledger-data
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: ebs-gp3
  resources:
    requests:
      storage: 20Gi
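
To connect the two snippets above, a Pod has to reference the claim and mount it at the path the C# code writes to. A minimal sketch, assuming a hypothetical Pod name, container name, volume name, and image:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: trade-ledger                  # hypothetical
spec:
  containers:
    - name: ledger
      image: registry.example.com/trade-ledger:1.0.0   # hypothetical image
      volumeMounts:
        - name: ledger-storage
          mountPath: /data            # the path the service reads and writes
  volumes:
    - name: ledger-storage
      persistentVolumeClaim:
        claimName: ledger-data        # the PVC defined above
```

The application never sees the PVC or the EBS volume directly; it only sees the `/data` directory.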

AWS example

On EKS, the EBS CSI driver dynamically provisions EBS volumes for `ReadWriteOnce` claims (one Node at a time) — ideal for databases. The EFS CSI driver provides `ReadWriteMany` shared file storage for multiple Pods across Nodes. Choose EBS for single-writer durability and EFS for shared read/write.
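
For the shared case, a sketch of an EFS-backed class and a `ReadWriteMany` claim might look like this. It assumes the EFS CSI driver is installed and an EFS file system already exists; `efs-shared`, `reporting-cache`, and the file system ID are placeholders, not values from this article.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: efs-shared                # hypothetical name
provisioner: efs.csi.aws.com
parameters:
  provisioningMode: efs-ap        # one EFS access point per claim
  fileSystemId: fs-REPLACE_ME     # placeholder: ID of your existing EFS file system
  directoryPerms: "700"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: reporting-cache           # hypothetical name
spec:
  accessModes: ["ReadWriteMany"]  # many Pods on many Nodes can mount it
  storageClassName: efs-shared
  resources:
    requests:
      storage: 5Gi                # required by the API; EFS is elastic and does not enforce it
```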

Production reality

Stateful workloads are where teams get burned:

  • The biggest mistake is self-hosting your primary database in-cluster to "keep everything in Kubernetes." For a ledger or order store, a managed service (RDS, DynamoDB) is almost always the better call — backups, failover, and patching are solved for you.
  • EBS is single-AZ. A `ReadWriteOnce` EBS volume is tied to one Availability Zone, so a Pod using it can only be scheduled in that AZ. Plan StatefulSet topology accordingly or Pods will get stuck in `Pending`.
  • Reclaim policy bites. With a `Delete` reclaim policy, which is the default for dynamically provisioned volumes, removing a PVC also destroys the underlying EBS volume. For financial data, default to `Retain` and delete deliberately.
  • Cost: orphaned PVs and unattached EBS volumes silently accrue charges. Audit them; by default, deleting a StatefulSet leaves the PVCs it created, and their volumes, in place.
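  • Security: enable encryption at rest, which is table stakes for regulated data. For EBS, set it on the StorageClass (optionally with a KMS key); for EFS, encryption at rest is chosen when the file system is created.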
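
Several of the points above (AZ pinning, reclaim policy, encryption) are StorageClass settings. The sketch below shows one way to combine them for EBS; the class name `ebs-gp3-retain` is hypothetical, and the EBS CSI driver is assumed to be installed.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-gp3-retain            # hypothetical name
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"               # encrypt the EBS volume at rest
reclaimPolicy: Retain             # keep the EBS volume when the PVC is deleted
# Create the volume only after the Pod is scheduled, in that Pod's AZ,
# instead of picking an AZ before the scheduler has placed the Pod.
volumeBindingMode: WaitForFirstConsumer
```

With `Retain`, a deleted claim leaves its PV and EBS volume behind, so this setting trades accidental data loss for the cleanup work described in the cost bullet.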

AI Engineering connection

Most AI services are stateless and need no storage — but vector stores, model caches, and fine-tuning checkpoints do. A self-hosted vector database runs as a StatefulSet with PVCs; large model weights are often mounted from EFS so multiple model-serving Pods share one read-only copy instead of each pulling gigabytes.
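
A sketch of the shared-weights pattern: several serving replicas mount the same claim read-only. It assumes a `ReadWriteMany` PVC named `model-weights` already exists on EFS; that name, the Deployment name, the image, and the mount path are all hypothetical.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: model-server              # hypothetical
spec:
  replicas: 3
  selector:
    matchLabels:
      app: model-server
  template:
    metadata:
      labels:
        app: model-server
    spec:
      containers:
        - name: server
          image: registry.example.com/model-server:1.0.0   # hypothetical image
          volumeMounts:
            - name: weights
              mountPath: /models
              readOnly: true      # serving Pods only read the weights
      volumes:
        - name: weights
          persistentVolumeClaim:
            claimName: model-weights   # assumed existing EFS-backed PVC
            readOnly: true
```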

Interview questions

  • Why is Pod storage not durable by default? A Pod's filesystem is ephemeral and is lost when the Pod is deleted or rescheduled.
  • What is the difference between a PV and a PVC? A PV is the actual storage resource; a PVC is a Pod's request that binds to a PV.
  • What is dynamic provisioning? A StorageClass lets Kubernetes create a PV (for example, an EBS volume) automatically when a PVC is created.
  • When do you use a StatefulSet? For stateful workloads needing stable identity and stable per-replica storage, like databases.
  • EBS vs EFS on EKS? EBS is single-writer block storage (ReadWriteOnce); EFS is shared file storage (ReadWriteMany).

Key takeaways

  • Pods are disposable; durable data lives in PersistentVolumes claimed via PVCs.
  • A StorageClass enables dynamic provisioning so you rarely create PVs by hand.
  • Use StatefulSets for databases that need stable identity and storage.
  • On AWS: EBS for single-writer durability, EFS for shared access.

Next article

Next: What Is Helm and Why Kubernetes Needed a Package Manager? — taming manifest sprawl. Previous: Kubernetes Networking Explained.

Frequently asked questions

Why is Pod storage not durable by default?
A Pod's filesystem is ephemeral and is lost when the Pod is deleted or rescheduled, so durable data must live in a PersistentVolume.
What is the difference between a PV and a PVC?
A PersistentVolume is the actual storage resource. A PersistentVolumeClaim is a Pod's request that binds to a PV.
EBS or EFS on EKS?
EBS is single-writer block storage (ReadWriteOnce), ideal for databases. EFS is shared file storage (ReadWriteMany) for multiple Pods across Nodes.
