Introduction to Persistent Volumes (PVs)
In Kubernetes, managing storage is a critical aspect of running stateful applications. Unlike stateless applications, stateful applications require persistent storage to retain data beyond the lifecycle of a pod. Kubernetes provides a robust storage management system using Persistent Volumes (PVs) and Persistent Volume Claims (PVCs).
A Persistent Volume (PV) is a cluster-wide storage resource provisioned by an administrator or dynamically created using Storage Classes. PVs abstract the underlying storage infrastructure (such as NFS, AWS EBS, GCE Persistent Disk, or local storage) and provide a consistent API for pods to consume storage. PVs exist independently of pods, meaning data persists even if the pod is deleted.
Key Concepts of Persistent Volumes
1. Persistent Volume (PV)
A PV is a piece of storage in the cluster that has been provisioned by an administrator or dynamically created using a StorageClass. PVs have a lifecycle independent of any individual pod. They can be backed by various storage types, including:
- Network-attached storage (NFS, iSCSI, Ceph)
- Cloud provider block storage (AWS EBS, GCE PD, Azure Disk)
- Local storage (hostPath, local volumes)
PVs are defined using a YAML manifest with specifications such as:
- Capacity (e.g., `10Gi`)
- Access Modes (`ReadWriteOnce`, `ReadOnlyMany`, `ReadWriteMany`)
- Reclaim Policy (`Retain`, `Delete`, `Recycle`)
- Storage Class Name (for dynamic provisioning)
- Volume Mode (`Filesystem` or `Block`)
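As a minimal sketch, the fields listed above map onto a PV manifest roughly as follows. The object name, the `manual` class name, and the `hostPath` location are illustrative assumptions, and `hostPath` is only suitable for single-node test clusters.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-pv                        # hypothetical name
spec:
  capacity:
    storage: 10Gi                         # Capacity
  accessModes:
    - ReadWriteOnce                       # Access Modes
  persistentVolumeReclaimPolicy: Retain   # Reclaim Policy
  storageClassName: manual                # hypothetical class; a PVC must request the same class to bind
  volumeMode: Filesystem                  # Volume Mode (the default)
  hostPath:
    path: /mnt/data                       # hypothetical path on the node
```

Note that a PV is cluster-scoped, so the manifest has no `namespace` field.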
2. Persistent Volume Claim (PVC)
A Persistent Volume Claim (PVC) is a request for storage by a user. It allows pods to consume PV resources without needing to know the underlying storage implementation. PVCs specify:
- Storage requirements (size)
- Access modes
- StorageClass (if dynamic provisioning is needed)
When a PVC is created, Kubernetes binds it to an available PV that matches the requested criteria. If no PV is available, a new one may be dynamically provisioned (if supported by the StorageClass).
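A PVC carrying those three pieces of information might look like the sketch below. The claim name and the `manual` class are assumptions for illustration; the claim binds only if a PV with the same class, a compatible access mode, and at least the requested capacity is available.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc            # hypothetical name
  namespace: default           # PVCs are namespaced, unlike PVs
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi             # minimum size; a larger matching PV may be bound
  storageClassName: manual     # hypothetical class name
```

If the claim binds to a PV larger than the request, the claim receives the whole volume; PVs are not split between claims.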
3. StorageClass (SC)
A StorageClass enables dynamic provisioning of PVs. Instead of manually creating PVs, administrators define StorageClasses that describe the types of storage available (e.g., SSD, HDD, or cloud-specific storage). When a PVC references a StorageClass, Kubernetes automatically provisions a PV that meets the claim’s requirements.
Key attributes of a StorageClass include:
- Provisioner (e.g., `kubernetes.io/aws-ebs`, `kubernetes.io/gce-pd`)
- Parameters (e.g., `type: gp2` for AWS EBS)
- Reclaim Policy (`Delete` or `Retain`)
- Volume Binding Mode (`Immediate` or `WaitForFirstConsumer`)
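The attributes above can be put together in a manifest like this sketch. The class name `fast-ssd` is an assumption, and the provisioner shown is the in-tree name used in this article; clusters that rely on the AWS EBS CSI driver would typically use `ebs.csi.aws.com` instead.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd                            # hypothetical name
provisioner: kubernetes.io/aws-ebs          # in-tree name; CSI clusters typically use ebs.csi.aws.com
parameters:
  type: gp2                                 # provisioner-specific parameter
reclaimPolicy: Delete                       # inherited by PVs created from this class
volumeBindingMode: WaitForFirstConsumer     # provision only once a pod is scheduled
```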
4. Access Modes
PVs and PVCs support different access modes, which define how the volume can be mounted:
- ReadWriteOnce (RWO): Read-write by a single node.
- ReadOnlyMany (ROX): Read-only by multiple nodes.
- ReadWriteMany (RWX): Read-write by multiple nodes.

Not all storage backends support all access modes. For example, AWS EBS only supports `ReadWriteOnce`, while NFS supports `ReadWriteMany`.
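To illustrate a backend that supports shared access, here is a sketch of an NFS-backed PV offering `ReadWriteMany`. The server address and export path are hypothetical placeholders.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-nfs-pv              # hypothetical name
spec:
  capacity:
    storage: 20Gi
  accessModes:
    - ReadWriteMany                # pods on several nodes may mount read-write
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: nfs.example.internal   # hypothetical NFS server
    path: /exports/shared          # hypothetical export path
```

A PV may list several access modes, but it is mounted with only one of them at a time.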
5. Reclaim Policies
When a PVC is deleted, the PV’s reclaim policy determines what happens to the underlying storage:
- Retain: Keeps the PV and its data (manual cleanup required).
- Delete: Automatically deletes the PV and associated storage (supported in cloud environments).
- Recycle (deprecated): Deletes data and makes the PV available for reuse (replaced by dynamic provisioning).
6. Volume Binding Modes
StorageClasses can define how PVs are bound to PVCs:
- Immediate: Binds the PVC to a PV as soon as the PVC is created.
- WaitForFirstConsumer: Delays binding until a pod uses the PVC (useful for local storage or topology constraints).
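The local-storage case can be sketched as follows. Local volumes have no dynamic provisioner, so the class uses `kubernetes.io/no-provisioner`, and the PV must declare which node holds the disk. The names, disk path, and node name `worker-1` are assumptions.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage                       # hypothetical name
provisioner: kubernetes.io/no-provisioner   # local volumes are created manually
volumeBindingMode: WaitForFirstConsumer     # bind only after the pod is scheduled
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv                            # hypothetical name
spec:
  capacity:
    storage: 50Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /mnt/disks/ssd1                   # hypothetical disk mount on the node
  nodeAffinity:                             # required for local volumes
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - worker-1                  # hypothetical node name
```

With `Immediate` binding, the claim could be bound to this PV before the scheduler has considered whether the pod can actually run on `worker-1`; delaying the binding lets the scheduler take the volume's node affinity into account.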
7. Volume Modes
PVs can be used in two modes:
- Filesystem: Default mode, where the volume is mounted as a directory.
- Block: Treats the volume as a raw block device (useful for databases like MySQL).
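A sketch of the `Block` mode, assuming the cluster has a storage backend that supports raw block volumes: the claim sets `volumeMode: Block`, and the pod attaches it with `volumeDevices` and a `devicePath` instead of `volumeMounts` and a `mountPath`. All names, the image, and the device path are illustrative.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: block-claim              # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Block              # request a raw block device
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: block-consumer           # hypothetical name
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sleep", "3600"]
      volumeDevices:             # used instead of volumeMounts for Block mode
        - name: data
          devicePath: /dev/xvda  # device node exposed inside the container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: block-claim
```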
Static vs. Dynamic Provisioning
Static Provisioning
In static provisioning, an administrator manually creates PVs in advance. Users then create PVCs to claim these PVs. This approach is useful when:
- The storage infrastructure is fixed.
- Fine-grained control over PV properties is required.
- Dynamic provisioning is not supported by the storage backend.
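A minimal sketch of static provisioning follows: an administrator-created PV and a claim that is pinned to it. Setting `storageClassName: ""` on the claim keeps a default StorageClass from provisioning a new volume, and `volumeName` is an optional way to pre-bind to one specific PV. The names and path are assumptions.

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: static-pv                # hypothetical name
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""           # no class
  hostPath:
    path: /mnt/static-data       # hypothetical path, test clusters only
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: static-claim             # hypothetical name
spec:
  storageClassName: ""           # empty string: match only PVs without a class
  volumeName: static-pv          # optional: pre-bind to this specific PV
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```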
Dynamic Provisioning
Dynamic provisioning automates PV creation when a PVC is created. It relies on StorageClasses to define the type of storage to provision. This is the preferred method in cloud environments because:
- It eliminates the need for pre-created PVs.
- It scales automatically with demand.
- It simplifies storage management.
Feature | Static Provisioning | Dynamic Provisioning |
---|---|---|
Definition | Administrator manually creates Persistent Volumes (PVs) in advance. | PVs are automatically created when a Persistent Volume Claim (PVC) is made, using a StorageClass. |
Setup Complexity | Requires manual creation of PVs before use. | Fully automated; no need to pre-create PVs. |
Storage Backend | Works with any storage type (NFS, local, cloud). | Requires a provisioner (e.g., AWS EBS, GCE PD, Azure Disk). |
Scalability | Limited by pre-created PVs; may require manual scaling. | Scales automatically as PVCs are created. |
Reclaim Policy | Can be `Retain`, `Delete`, or `Recycle`. | Typically `Delete` (cloud storage) or `Retain` (if configured). |
Use Cases | Legacy storage systems without dynamic provisioning; fine-grained control over PV properties; on-premises environments with fixed storage. | Cloud-native environments (AWS, GCP, Azure); CI/CD pipelines needing on-demand storage; large-scale deployments requiring automation. |
Pros | Full control over PV configuration; works with any storage system. | No manual intervention required; scales effortlessly with demand. |
Cons | Manual management overhead; risk of under/over-provisioning. | Requires compatible storage backend; less control over PV properties (depends on StorageClass). |
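For the dynamic side of this comparison, the claim alone is enough to trigger provisioning. The sketch below assumes a StorageClass named `fast-ssd` exists in the cluster; both names are hypothetical.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dynamic-claim            # hypothetical name
spec:
  storageClassName: fast-ssd     # hypothetical class; its provisioner creates the PV
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
```

If `storageClassName` is omitted entirely, the cluster's default StorageClass is used when one is marked with the `storageclass.kubernetes.io/is-default-class: "true"` annotation.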
Lifecycle of a PV and PVC
- Provisioning: A PV is created statically by an admin or dynamically via a StorageClass.
- Binding: A PVC requests storage, and Kubernetes binds it to an available PV that satisfies the request.
- Using: A pod references the PVC, and the volume is mounted into the container.
- Releasing: When the PVC is deleted, the PV moves to the `Released` state. Deleting only the pod leaves the PVC and its data in place.
- Reclaiming: Based on the reclaim policy, the PV is either retained, deleted, or recycled.
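The "Using" step can be sketched with a pod that mounts a claim. This assumes a bound PVC named `example-pvc` in the same namespace as the pod; the pod name, image, and mount path are illustrative.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-pod                              # hypothetical name
spec:
  containers:
    - name: app
      image: nginx:1.25
      volumeMounts:
        - name: data                         # must match the volume name below
          mountPath: /usr/share/nginx/html   # where the volume appears in the container
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: example-pvc               # hypothetical PVC in the pod's namespace
```

The pod refers to the claim, never to the PV directly, which is what keeps the workload independent of the storage backend.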
Common Use Cases
- Databases (MySQL, PostgreSQL): Require persistent storage to retain data.
- Stateful Applications (Redis, Elasticsearch): Need durable storage for logs and data.
- File Storage (NFS, CephFS): Shared storage for multiple pods.
- Cloud-native Applications (EBS, Azure Disk): Leverage cloud provider storage.
Troubleshooting PVs and PVCs
For the CKA exam, you should be familiar with debugging storage issues:
- Check PVC status: `kubectl get pvc` (look for `Bound` or `Pending` status).
- Check PV status: `kubectl get pv` (ensure PVs are available).
- Describe PVC/PV: `kubectl describe pvc <name>` (look for errors).
- Check StorageClass: `kubectl get storageclass` (ensure default or correct SC is set).
- Check events: `kubectl get events` (look for provisioning errors).
Best Practices for CKA Exam
- Understand YAML Definitions: Be able to write PV, PVC, and StorageClass manifests.
- Know Access Modes: Recognize which storage backends support `ReadWriteMany`.
- Dynamic Provisioning: Practice creating PVCs with StorageClasses.
- Troubleshooting: Know how to diagnose `Pending` PVCs.
- StatefulSets: Understand how PVs work with StatefulSets (stable network identities and persistent storage).
Conclusion
Persistent Volumes are essential for running stateful applications in Kubernetes. The CKA exam tests your ability to configure and troubleshoot PVs, PVCs, and StorageClasses. Key takeaways:
- PVs are cluster resources, while PVCs are user requests for storage.
- StorageClasses enable dynamic provisioning.
- Access Modes and Reclaim Policies dictate how storage is used and managed.
- Troubleshooting involves checking PVC/PV status, StorageClass, and events.
Mastering these concepts will ensure you can handle storage-related tasks in the CKA exam and real-world Kubernetes deployments.
Frequently Asked Questions (FAQs) on Kubernetes Persistent Volumes (PVs)
1. What is the difference between a Persistent Volume (PV) and a Persistent Volume Claim (PVC)?
- A Persistent Volume (PV) is a cluster resource representing physical storage (e.g., NFS, EBS, or local disk). It is provisioned by an administrator or dynamically via a StorageClass.
- A Persistent Volume Claim (PVC) is a user’s request for storage. It specifies size, access modes, and optionally a StorageClass. Kubernetes binds the PVC to an available PV.
Exam Tip: PVs exist independently of pods, while PVCs are used by pods to request storage.
2. When should I use the `Retain` vs. `Delete` reclaim policy?
- Retain: Keeps the PV and data after PVC deletion (manual cleanup required). Useful for critical data (e.g., databases).
- Delete: Automatically deletes the PV and underlying storage (common in cloud environments like AWS EBS).

Use Case:
- Retain → For stateful applications where data must not be accidentally deleted.
- Delete → For temporary storage or when using dynamic provisioning in cloud environments.
3. Why is my PVC stuck in “Pending” state?
Possible reasons:
- No matching PV exists (for static provisioning).
- StorageClass not configured properly (for dynamic provisioning).
- Insufficient storage capacity (requested size exceeds available PVs).
- Access mode mismatch (PVC requests `ReadWriteMany` but PV only supports `ReadWriteOnce`).

    kubectl describe pvc <pvc-name>   # Check events for errors
    kubectl get storageclass          # Verify StorageClass exists
    kubectl get pv                    # Check available PVs

4. Can multiple pods use the same Persistent Volume?
Yes, but it depends on the access mode:
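- `ReadWriteOnce` (RWO): The volume can be mounted read-write by a single node. Several pods can share it only if they all run on that node.
- `ReadOnlyMany` (ROX): Pods on multiple nodes can mount it read-only.
- `ReadWriteMany` (RWX): Pods on multiple nodes can read and write (e.g., NFS, CephFS).

Exam Scenario: If a PVC uses `ReadWriteOnce`, every pod that uses it must run on the same node. To restrict a volume to a single pod, use the `ReadWriteOncePod` access mode, which is available for CSI volumes only.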
5. How does dynamic provisioning work with StatefulSets?
- StatefulSets require stable storage, so each replica gets a unique PVC.
- A StorageClass with `volumeBindingMode: WaitForFirstConsumer` ensures PVs are provisioned only when a pod is scheduled (useful for local storage or zone restrictions).
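The per-replica claims come from the StatefulSet's `volumeClaimTemplates`. The sketch below assumes a StorageClass named `fast-ssd` exists; the names `web` and `data`, the image, and the sizes are illustrative.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web                      # headless Service that gives pods stable DNS names
spec:
  clusterIP: None
  selector:
    app: web
  ports:
    - port: 80
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web               # must reference the headless Service
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.25
          volumeMounts:
            - name: data         # matches the claim template name
              mountPath: /usr/share/nginx/html
  volumeClaimTemplates:          # one PVC is created per replica from this template
    - metadata:
        name: data
      spec:
        accessModes:
          - ReadWriteOnce
        storageClassName: fast-ssd   # hypothetical class
        resources:
          requests:
            storage: 1Gi
```

With these names, the controller creates the claims `data-web-0`, `data-web-1`, and `data-web-2`. By default they are not deleted when the StatefulSet is scaled down or removed, so a recreated replica reattaches to its earlier data.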