Files
claude-scientific-skills/scientific-skills/modal/references/volumes.md
2026-03-23 16:21:31 -07:00

5.7 KiB

Modal Volumes

Table of Contents

Overview

Volumes are Modal's distributed file system, optimized for write-once, read-many workloads like storing model weights and distributing them across containers.

Key characteristics:

  • Persistent across function invocations and deployments
  • Mountable by multiple functions simultaneously
  • Background auto-commits every few seconds
  • Final commit on container shutdown

Creating Volumes

In Code (Lazy Creation)

vol = modal.Volume.from_name("my-volume", create_if_missing=True)

Via CLI

modal volume create my-volume

# v2 volume (beta)
modal volume create my-volume --version=2

Programmatic v2

vol = modal.Volume.from_name("my-volume", create_if_missing=True, version=2)

Mounting Volumes

Mount volumes to functions via the volumes parameter:

vol = modal.Volume.from_name("model-store", create_if_missing=True)

@app.function(volumes={"/models": vol})
def use_model():
    # Access files at /models/
    with open("/models/config.json") as f:
        config = json.load(f)

Mount multiple volumes:

weights_vol = modal.Volume.from_name("weights")
data_vol = modal.Volume.from_name("datasets")

@app.function(volumes={"/weights": weights_vol, "/data": data_vol})
def train():
    ...

Reading and Writing Files

Writing

@app.function(volumes={"/data": vol})
def save_results(results):
    import json
    import os

    os.makedirs("/data/outputs", exist_ok=True)
    with open("/data/outputs/results.json", "w") as f:
        json.dump(results, f)

Reading

@app.function(volumes={"/data": vol})
def load_results():
    with open("/data/outputs/results.json") as f:
        return json.load(f)

Large Files (Model Weights)

@app.function(volumes={"/models": vol}, gpu="L40S")
def save_model():
    import torch
    model = train_model()
    torch.save(model.state_dict(), "/models/checkpoint.pt")

@app.function(volumes={"/models": vol}, gpu="L40S")
def load_model():
    import torch
    model = MyModel()
    model.load_state_dict(torch.load("/models/checkpoint.pt"))
    return model

CLI Access

# List files
modal volume ls my-volume
modal volume ls my-volume /subdir/

# Upload files
modal volume put my-volume local_file.txt
modal volume put my-volume local_file.txt /remote/path/file.txt

# Download files
modal volume get my-volume /remote/file.txt local_file.txt

# Delete a volume
modal volume delete my-volume

Commits and Reloads

Modal auto-commits volume changes in the background every few seconds and on container shutdown.

Explicit Commit

Force an immediate commit:

@app.function(volumes={"/data": vol})
def writer():
    with open("/data/file.txt", "w") as f:
        f.write("hello")
    vol.commit()  # Make immediately visible to other containers

Reload

See changes from other containers:

@app.function(volumes={"/data": vol})
def reader():
    vol.reload()  # Refresh to see latest writes
    with open("/data/file.txt") as f:
        return f.read()

Concurrent Access

v1 Volumes

  • Recommended max 5 concurrent commits
  • Last write wins for concurrent modifications of the same file
  • Avoid concurrent modification of identical files
  • Max 500,000 files (inodes)

v2 Volumes

  • Hundreds of concurrent writers (distinct files)
  • No file count limit
  • Improved random access performance
  • Up to 1 TiB per file, 262,144 files per directory

Volumes v2

v2 Volumes (beta) offer significant improvements:

Feature v1 v2
Max files 500,000 Unlimited
Concurrent writes ~5 Hundreds
Max file size No limit 1 TiB
Random access Limited Full support
HIPAA compliance No Yes
Hard links No Yes

Enable v2:

vol = modal.Volume.from_name("my-vol-v2", create_if_missing=True, version=2)

Common Patterns

Model Weight Storage

vol = modal.Volume.from_name("model-weights", create_if_missing=True)

# Download once during image build
def download_weights():
    from huggingface_hub import snapshot_download
    snapshot_download("meta-llama/Llama-3-8B", local_dir="/models/llama3")

image = (
    modal.Image.debian_slim()
    .uv_pip_install("huggingface_hub")
    .run_function(download_weights, volumes={"/models": vol})
)

Training Checkpoints

@app.function(volumes={"/checkpoints": vol}, gpu="H100", timeout=86400)
def train():
    for epoch in range(100):
        train_one_epoch()
        torch.save(model.state_dict(), f"/checkpoints/epoch_{epoch}.pt")
        vol.commit()  # Save checkpoint immediately

Shared Data Between Functions

data_vol = modal.Volume.from_name("shared-data", create_if_missing=True)

@app.function(volumes={"/data": data_vol})
def preprocess():
    # Write processed data
    df.to_parquet("/data/processed.parquet")

@app.function(volumes={"/data": data_vol})
def analyze():
    data_vol.reload()  # Ensure we see latest data
    df = pd.read_parquet("/data/processed.parquet")
    return df.describe()

Performance Tips

  • Volumes are optimized for large files, not many small files
  • Keep under 50,000 files and directories for best v1 performance
  • Use Parquet or other columnar formats instead of many small CSVs
  • For truly temporary data, use ephemeral_disk instead of Volumes