mirror of
https://github.com/K-Dense-AI/claude-scientific-skills.git
synced 2026-03-27 07:09:27 +08:00
Modal Volumes
Table of Contents
- Overview
- Creating Volumes
- Mounting Volumes
- Reading and Writing Files
- CLI Access
- Commits and Reloads
- Concurrent Access
- Volumes v2
- Common Patterns
Overview
Volumes are Modal's distributed file system, optimized for write-once, read-many workloads like storing model weights and distributing them across containers.
Key characteristics:
- Persistent across function invocations and deployments
- Mountable by multiple functions simultaneously
- Background auto-commits every few seconds
- Final commit on container shutdown
Creating Volumes
In Code (Lazy Creation)
vol = modal.Volume.from_name("my-volume", create_if_missing=True)
Via CLI
modal volume create my-volume
# v2 volume (beta)
modal volume create my-volume --version=2
Programmatic v2
vol = modal.Volume.from_name("my-volume", create_if_missing=True, version=2)
Mounting Volumes
Mount volumes to functions via the volumes parameter:
vol = modal.Volume.from_name("model-store", create_if_missing=True)
@app.function(volumes={"/models": vol})
def use_model():
    import json

    # Files on the volume are available under /models/
    with open("/models/config.json") as f:
        config = json.load(f)
Mount multiple volumes:
weights_vol = modal.Volume.from_name("weights")
data_vol = modal.Volume.from_name("datasets")
@app.function(volumes={"/weights": weights_vol, "/data": data_vol})
def train():
    ...
Reading and Writing Files
Writing
@app.function(volumes={"/data": vol})
def save_results(results):
    import json
    import os

    os.makedirs("/data/outputs", exist_ok=True)
    with open("/data/outputs/results.json", "w") as f:
        json.dump(results, f)
Reading
@app.function(volumes={"/data": vol})
def load_results():
    import json

    with open("/data/outputs/results.json") as f:
        return json.load(f)
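Because a mounted Volume appears inside the container as an ordinary directory, the read and write logic can be factored into plain helpers and tried locally before it runs on Modal. The following is a minimal sketch under that assumption: `save_results` and `load_results` here are hypothetical helpers that take the mount root as a parameter (it would be `/data` in the functions above), and a temporary directory stands in for the mount point. Nothing in it is part of Modal's API.

```python
import json
import os
import tempfile


def save_results(results, root):
    """Write results to <root>/outputs/results.json and return the file path."""
    out_dir = os.path.join(root, "outputs")
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, "results.json")
    with open(path, "w") as f:
        json.dump(results, f)
    return path


def load_results(root):
    """Read back the JSON document written by save_results."""
    with open(os.path.join(root, "outputs", "results.json")) as f:
        return json.load(f)


# Local round trip; the temporary directory plays the role of the /data mount.
with tempfile.TemporaryDirectory() as root:
    save_results({"accuracy": 0.93}, root)
    roundtrip = load_results(root)
```

Inside a Modal function the same helpers would simply be called with the mount path used in the `volumes` mapping.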
Large Files (Model Weights)
@app.function(volumes={"/models": vol}, gpu="L40S")
def save_model():
    import torch

    model = train_model()
    torch.save(model.state_dict(), "/models/checkpoint.pt")
@app.function(volumes={"/models": vol}, gpu="L40S")
def load_model():
    import torch

    model = MyModel()
    model.load_state_dict(torch.load("/models/checkpoint.pt"))
    return model
CLI Access
# List files
modal volume ls my-volume
modal volume ls my-volume /subdir/
# Upload files
modal volume put my-volume local_file.txt
modal volume put my-volume local_file.txt /remote/path/file.txt
# Download files
modal volume get my-volume /remote/file.txt local_file.txt
# Delete a volume
modal volume delete my-volume
Commits and Reloads
Modal auto-commits volume changes in the background every few seconds and on container shutdown.
Explicit Commit
Force an immediate commit:
@app.function(volumes={"/data": vol})
def writer():
    with open("/data/file.txt", "w") as f:
        f.write("hello")
    vol.commit()  # Persist now; running containers see the change after vol.reload()
Reload
See changes from other containers:
@app.function(volumes={"/data": vol})
def reader():
    vol.reload()  # Refresh to see the latest committed writes
    with open("/data/file.txt") as f:
        return f.read()
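The visibility rule described in this section (a write becomes visible elsewhere only after the writer commits and the reader reloads) can be illustrated with a toy in-memory model. This is only a sketch for reasoning about ordering, not Modal's implementation: `ToyVolume` and `ToyMount` are hypothetical classes, and the model covers just one writer and one reader. It does not model background commits, conflicts, or what happens to a container's uncommitted writes on reload.

```python
class ToyVolume:
    """Committed state shared by all mounts (stands in for the Volume's storage)."""

    def __init__(self):
        self.committed = {}


class ToyMount:
    """One container's view of the volume."""

    def __init__(self, volume):
        self.volume = volume
        self.local = dict(volume.committed)  # snapshot taken at mount time

    def write(self, path, data):
        self.local[path] = data  # visible only inside this container for now

    def exists(self, path):
        return path in self.local

    def read(self, path):
        return self.local[path]

    def commit(self):
        # Publish this container's files to the shared committed state.
        self.volume.committed.update(self.local)

    def reload(self):
        # Replace the local view with the latest committed state.
        self.local = dict(self.volume.committed)


vol = ToyVolume()
writer, reader = ToyMount(vol), ToyMount(vol)

writer.write("/data/file.txt", "hello")
seen_before_commit = reader.exists("/data/file.txt")  # False: nothing committed yet

writer.commit()
seen_before_reload = reader.exists("/data/file.txt")  # False: reader has a stale view

reader.reload()
seen_after_reload = reader.read("/data/file.txt")  # "hello"
```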
Concurrent Access
v1 Volumes
- Recommended max 5 concurrent commits
- Last write wins for concurrent modifications of the same file
- Avoid concurrent modification of identical files
- Max 500,000 files (inodes)
v2 Volumes
- Hundreds of concurrent writers (distinct files)
- No file count limit
- Improved random access performance
- Up to 1 TiB per file, 262,144 files per directory
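One way to follow the advice about not modifying the same file concurrently is to give every writer its own output file and merge the pieces at read time. The sketch below assumes a hypothetical `part-<id>.json` naming scheme and two hypothetical helpers, `write_part` and `read_parts`. It uses only the standard library, with a temporary directory standing in for a directory on the mounted Volume.

```python
import json
import os
import tempfile
import uuid


def write_part(out_dir, records, writer_id=None):
    """Write records to a file that no other writer touches; return its path."""
    writer_id = writer_id or uuid.uuid4().hex  # random ID when none is supplied
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"part-{writer_id}.json")
    with open(path, "w") as f:
        json.dump(records, f)
    return path


def read_parts(out_dir):
    """Merge the records from all part files, in sorted filename order."""
    merged = []
    for name in sorted(os.listdir(out_dir)):
        if name.startswith("part-") and name.endswith(".json"):
            with open(os.path.join(out_dir, name)) as f:
                merged.extend(json.load(f))
    return merged


# Two writers, two distinct files, one merged read.
with tempfile.TemporaryDirectory() as out_dir:
    write_part(out_dir, [1, 2], writer_id="a")
    write_part(out_dir, [3], writer_id="b")
    merged_records = read_parts(out_dir)
```

Because no two writers share a path, the last-write-wins rule never comes into play, whichever Volume version is in use.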
Volumes v2
v2 Volumes (beta) offer significant improvements:
| Feature | v1 | v2 |
|---|---|---|
| Max files | 500,000 | Unlimited |
| Concurrent writes | ~5 | Hundreds |
| Max file size | No limit | 1 TiB |
| Random access | Limited | Full support |
| HIPAA compliance | No | Yes |
| Hard links | No | Yes |
Enable v2:
vol = modal.Volume.from_name("my-vol-v2", create_if_missing=True, version=2)
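The comparison table in this section can be turned into a rough rule of thumb for choosing a version. The helper below is a sketch only: `suggest_volume_version` is a hypothetical function, and its thresholds are taken from the table (500,000 files and roughly 5 concurrent writers for v1). It deliberately ignores the other rows, such as HIPAA compliance and the v2 per-file size cap.

```python
V1_MAX_FILES = 500_000      # v1 file limit from the table
V1_COMFORTABLE_WRITERS = 5  # "~5" concurrent writes for v1


def suggest_volume_version(expected_files, concurrent_writers, needs_hard_links=False):
    """Return 1 or 2 based on the limits listed in the comparison table."""
    if expected_files > V1_MAX_FILES:
        return 2
    if concurrent_writers > V1_COMFORTABLE_WRITERS:
        return 2
    if needs_hard_links:
        return 2
    return 1


small_job = suggest_volume_version(expected_files=1_000, concurrent_writers=2)
wide_job = suggest_volume_version(expected_files=1_000, concurrent_writers=200)
```

The returned number could then be passed as the `version` argument shown above, bearing in mind that v2 is described here as beta.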
Common Patterns
Model Weight Storage
vol = modal.Volume.from_name("model-weights", create_if_missing=True)
# Download once during image build
def download_weights():
    from huggingface_hub import snapshot_download

    snapshot_download("meta-llama/Llama-3-8B", local_dir="/models/llama3")


image = (
    modal.Image.debian_slim()
    .uv_pip_install("huggingface_hub")
    .run_function(download_weights, volumes={"/models": vol})
)
Training Checkpoints
@app.function(volumes={"/checkpoints": vol}, gpu="H100", timeout=86400)
def train():
    import torch

    # model and train_one_epoch() are assumed to be defined elsewhere
    for epoch in range(100):
        train_one_epoch()
        torch.save(model.state_dict(), f"/checkpoints/epoch_{epoch}.pt")
        vol.commit()  # Persist each checkpoint right away
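With checkpoints named `epoch_<n>.pt` as above, a restarted job can resume from the most recent one instead of starting at epoch 0. The sketch below shows one way to locate it. `latest_checkpoint` is a hypothetical helper, not a Modal or PyTorch API, and a temporary directory with empty files stands in for the `/checkpoints` mount.

```python
import re
import tempfile
from pathlib import Path


def latest_checkpoint(ckpt_dir):
    """Return (epoch, path) for the highest-numbered epoch_<n>.pt, or None."""
    ckpt_dir = Path(ckpt_dir)
    if not ckpt_dir.is_dir():
        return None
    best = None
    for path in ckpt_dir.iterdir():
        match = re.fullmatch(r"epoch_(\d+)\.pt", path.name)
        if match is None:
            continue  # skip files that are not checkpoints
        epoch = int(match.group(1))  # compare numerically, so 10 > 2
        if best is None or epoch > best[0]:
            best = (epoch, path)
    return best


# Empty placeholder files are enough to exercise the lookup.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("epoch_2.pt", "epoch_10.pt", "notes.txt"):
        (Path(tmp) / name).touch()
    found_epoch, found_path = latest_checkpoint(tmp)
    found_name = found_path.name
```

In a training function the lookup would typically come after a `vol.reload()`, and the loop would then start at `found_epoch + 1`.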
Shared Data Between Functions
data_vol = modal.Volume.from_name("shared-data", create_if_missing=True)
@app.function(volumes={"/data": data_vol})
def preprocess():
    # Write processed data (df is assumed to be a pandas DataFrame built earlier)
    df.to_parquet("/data/processed.parquet")
@app.function(volumes={"/data": data_vol})
def analyze():
    import pandas as pd

    data_vol.reload()  # Ensure we see the latest committed data
    df = pd.read_parquet("/data/processed.parquet")
    return df.describe()
Performance Tips
- Volumes are optimized for large files, not many small files
- Keep under 50,000 files and directories for best v1 performance
- Use Parquet or other columnar formats instead of many small CSVs
- For truly temporary data, use ephemeral_disk instead of Volumes
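The 50,000-entry guideline for v1 can be checked against a directory tree before uploading it, or from inside a function against the mount path. The sketch below uses only the standard library. `count_entries` and `over_v1_recommendation` are hypothetical helpers, and the threshold simply restates the figure from the list above.

```python
import os
import tempfile

V1_RECOMMENDED_ENTRIES = 50_000  # guideline from the performance tips


def count_entries(root):
    """Count files and directories below root (root itself is not counted)."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        total += len(dirnames) + len(filenames)
    return total


def over_v1_recommendation(root, limit=V1_RECOMMENDED_ENTRIES):
    """True when the tree holds more entries than the v1 guideline."""
    return count_entries(root) > limit


# Small tree: one top-level file, one directory, two files inside it.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "a"))
    for rel in ("top.txt", os.path.join("a", "x.bin"), os.path.join("a", "y.bin")):
        open(os.path.join(root, rel), "w").close()
    entry_count = count_entries(root)
    too_many = over_v1_recommendation(root)
```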