Building a Resilient On-Premises S3 Storage with MinIO: Best Practices and Lessons Learned
Introduction
Modern applications often require scalable, reliable, and cost-effective storage solutions. Amazon S3 has set the standard for object storage, but what if you need similar capabilities within your own infrastructure? Enter MinIO—a high-performance, S3-compatible object storage server designed for on-premises deployments.
In this article, we’ll explore how MinIO works under the hood, share practical tips for configuring redundancy and fault tolerance, and walk through real-world troubleshooting scenarios. Whether you’re a developer, DevOps engineer, or IT enthusiast, you’ll find actionable insights to help you build a robust storage backend.
What is MinIO and Why Use It?
MinIO is an open-source object storage server that implements the S3 API over HTTP. It allows you to store and retrieve large volumes of data using familiar S3 tools and libraries. MinIO is popular for private cloud and on-premises deployments because it is:
- Simple to deploy and manage
- Well-documented and actively maintained
- Highly performant and scalable
- Compatible with the S3 API
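To give a sense of how simple a test deployment is, a single-node, single-drive instance can be started with one command (a minimal sketch; the credentials, data path, and console port here are arbitrary, and a single drive offers no fault tolerance):
MINIO_ROOT_USER=minio MINIO_ROOT_PASSWORD=minio123 \
  minio server /data --console-address ":9001"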
At RUTUBE, we chose MinIO for our internal S3 needs due to its reliability, ease of use, and strong community support.
Note: This article focuses on MinIO’s configuration and data durability features, not on storing massive video archives. For large-scale video storage, see our dedicated article.
How MinIO Stores Data: Erasure Coding and Storage Classes
The Basics
MinIO organizes your data across multiple disks and nodes using a technique called erasure coding. This approach splits each object into data chunks plus computed parity chunks and distributes them across a set of disks, providing redundancy and fault tolerance similar to RAID, but with more flexibility.
Key Concepts:
- Erasure Set: A group of 4–16 disks where data and parity (redundancy) chunks are stored.
- Data Drives (N): Disks that store actual data chunks.
- Parity Drives (M): Disks that store parity information, allowing recovery from disk failures.
- Server Pool: A collection of erasure sets managed as a unit.
- Storage Classes: Define redundancy levels. STANDARD (default) and REDUCED_REDUNDANCY (RRS) are available.
Example: 4 Nodes × 8 Disks (4×8)
Suppose you have a MinIO cluster with 4 nodes, each with 8 disks (total 32 disks). MinIO automatically groups disks into erasure sets (up to 16 disks per set). For each erasure set, you can configure the number of parity drives:
MINIO_STORAGE_CLASS_STANDARD=EC:4
MINIO_STORAGE_CLASS_RRS=EC:2
- EC: is the erasure coding prefix.
- The number after EC: is the count of parity drives for that storage class.
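The same parity settings can also be changed at runtime through the admin API instead of environment variables; a sketch, assuming an mc alias such as minio-test-a has already been configured (see the next section) and that the cluster is restarted afterwards:
mc admin config set minio-test-a storage_class standard=EC:4 rrs=EC:2
mc admin service restart minio-test-a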
Inspecting Cluster State
Use the mc (MinIO Client) tool to check your deployment:
mc alias set minio-test-a http://localhost:9000 minio minio123
mc admin info minio-test-a
You’ll see details about nodes, disks, erasure sets, and storage class settings.
To see how disks are mapped to erasure sets:
mc admin info minio-test-a --json | jq '[.info.servers|.[].drives[]|{endpoint,set_index}]'
This helps you understand which disks belong to which erasure set—a crucial detail for troubleshooting.
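The same JSON output also contains a cluster-wide backend summary with the parity counts per storage class and the number of online and offline drives; a sketch (the exact field names can vary between MinIO versions):
mc admin info minio-test-a --json | jq '.info.backend'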
Fault Tolerance: Handling Disk and Node Failures
How Many Disks Can Fail?
The number of parity drives (M) determines how many disks you can lose in an erasure set without losing data. For example, with 4 parity drives in the 4×8 setup above, each erasure set can tolerate the loss of up to 4 of its disks, so the cluster keeps serving data even if an entire node goes down.
Warning: If you lose more than M disks in an erasure set, all objects in that set become unavailable.
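As a rough back-of-the-envelope check for the 4×8 example: the 32 disks form two 16-disk erasure sets, and EC:4 means each object in a set is split into 12 data and 4 parity shards. Usable capacity is therefore about 12/16 = 75% of raw capacity, and each set tolerates the loss of any 4 of its 16 disks.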
Replacing Failed Disks
When a disk fails, MinIO can heal itself using the remaining data and parity drives. The healing process may start automatically or can be triggered manually:
mc admin heal -r minio-test-a
Monitor the healing process to ensure your cluster returns to a consistent state.
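A cautious workflow is to preview the scan before modifying anything; a sketch using the dry-run flag of mc admin heal:
mc admin heal -r --dry-run minio-test-a
mc admin heal -r minio-test-a
Recent MinIO releases also heal objects in the background after a drive is replaced, so the manual command mostly serves as verification that nothing is left to repair.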
Storage Classes: Standard vs. Reduced Redundancy
- STANDARD: Higher redundancy, more parity drives, better durability.
- REDUCED_REDUNDANCY (RRS): Fewer parity drives, saves space, but higher risk of data loss.
To use RRS for less critical data:
mc cp /tmp/ubuntu.iso minio-test-a/test/ubuntu-rrs.iso --storage-class REDUCED_REDUNDANCY
If you lose more disks than the RRS parity count, those objects are lost. Use RRS with caution and only for non-essential data.
Erasure Sets and Data Availability
Each object is stored within a single erasure set. If you lose more disks than allowed in a set, all objects in that set become inaccessible. With multiple erasure sets, it can be unclear which files will survive a major failure.
Best Practice:
- Prefer configurations with a single erasure set per server pool (e.g., 4×4) to reduce uncertainty during failures.
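For reference, a 4×4 layout that produces exactly one 16-disk erasure set can be declared with the same notation used elsewhere in this article:
MINIO_VOLUMES="https://minio-test-{1...4}:9000/data/storage{1...4}"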
Scaling MinIO Deployments
MinIO recommends scaling by adding new server pools rather than expanding existing ones. This approach is more cloud-native and avoids issues with uneven disk sizes.
Steps to Add a Server Pool:
- Prepare new servers and disks.
- Update the MINIO_VOLUMES environment variable in your config:
  - Before: MINIO_VOLUMES="https://minio-test-{1...4}:9000/data/storage{1...8}"
  - After: MINIO_VOLUMES="https://minio-test-{1...4}:9000/data/storage{1...8} https://minio-test-{5...8}:9000/data/storage{1...4}"
- Restart MinIO on all nodes.
Note: All disks in a server pool must be the same size. After adding a pool, existing objects remain in their original erasure sets unless you run a rebalance operation:
mc admin rebalance start minio-test-a
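Rebalancing moves existing objects between pools and can take a long time on large datasets; its progress can be checked, and the operation stopped if needed, with the companion subcommands:
mc admin rebalance status minio-test-a
mc admin rebalance stop minio-test-a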
Data Replication for Disaster Recovery
MinIO supports bucket replication across deployments for added resilience. There are three main modes:
- Active-Passive: One-way replication.
- Active-Active: Bi-directional replication.
- Multi-site Active-Active: Replication across multiple sites.
Setting Up Active-Active Replication
- Create versioned buckets on both deployments.
- Define replication policies (JSON policy files) and attach them to replication users.
- Create users for replication and assign policies.
- Enable replication between buckets using mc replicate add.
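The first three steps might look roughly like the sketch below, in which the second alias minio-test-b, the policy file replication-policy.json, the policy name ReplicationRemotePolicy, and the secret key are illustrative placeholders (the policy create/attach syntax is that of recent mc releases; mirror the same setup in the opposite direction for active-active):
mc mb minio-test-a/test
mc version enable minio-test-a/test
mc mb minio-test-b/test
mc version enable minio-test-b/test
mc admin user add minio-test-b ReplicationRemoteUser 'StrongSecretKey'
mc admin policy create minio-test-b ReplicationRemotePolicy replication-policy.json
mc admin policy attach minio-test-b ReplicationRemotePolicy --user ReplicationRemoteUser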
Example command:
mc replicate add minio-test-a/test \
  --remote-bucket 'https://ReplicationRemoteUser:@/test' \
  --replicate "delete,delete-marker,existing-objects"
Replication is asynchronous by default but can be made synchronous with --sync enable.
Check replication status:
mc replicate status minio-test-a/test
Proxying Requests for Unavailable Objects
If an object is unavailable in one deployment (e.g., due to disk failure), MinIO can proxy the request to the replica, ensuring continued access. This is a powerful feature for high-availability setups.
Key Takeaways and Best Practices
- Choose parity drive count carefully: Balance between fault tolerance and usable storage.
- Avoid multiple erasure sets per server pool: This reduces uncertainty during failures. A 4×4 configuration is often optimal.
- Plan for replication: If possible, deploy two MinIO clusters with Active-Active replication for maximum resilience.
- Monitor and test healing: Regularly check your cluster’s health and practice recovery procedures.
- Use RRS only for non-critical data: Understand the risks before sacrificing redundancy for space.
Conclusion
MinIO is a versatile and robust solution for on-premises S3-compatible storage. By understanding its architecture and following best practices for redundancy, scaling, and replication, you can build a storage backend that meets your reliability and performance needs.
Tip: Always test your disaster recovery plan before you need it. Simulate failures and ensure your data can be restored as expected.
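One low-risk rehearsal on a test cluster, assuming MinIO runs as a systemd service and reusing the aliases and hostnames from this article (the object name is a placeholder for something previously uploaded with the STANDARD class), is to take a node down and confirm the cluster behaves as expected:
ssh minio-test-2 'systemctl stop minio'
mc admin info minio-test-a
mc cp minio-test-a/test/ubuntu-std.iso /tmp/check.iso
ssh minio-test-2 'systemctl start minio'
mc admin info should report the stopped node's drives as offline while the copy still succeeds; after restarting the node, wait for healing to finish before declaring the exercise done.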
For more insights on infrastructure and development at RUTUBE, join our Telegram channel!