claude-scientific-skills/scientific-skills/modal/references/resources.md
2026-03-23 16:21:31 -07:00

Modal Resource Configuration

CPU

Requesting CPU

@app.function(cpu=4.0)
def compute():
    ...
  • Values are physical cores, not vCPUs
  • Default: 0.125 cores
  • Modal auto-sets OPENBLAS_NUM_THREADS, OMP_NUM_THREADS, MKL_NUM_THREADS based on your CPU request
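Code that sizes its own worker pool can read the same thread variables instead of hard-coding a count. The following is a minimal sketch, not Modal's API: `thread_budget` is a hypothetical helper, and it assumes the variables hold a numeric string (the exact values written for fractional CPU requests are not stated here, so anything below one is clamped to one thread).

```python
import os

# Thread-count variables named in the list above.
THREAD_VARS = ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")


def thread_budget(env=None, default=1):
    """Return the thread count implied by the first usable thread variable."""
    env = os.environ if env is None else env
    for name in THREAD_VARS:
        raw = env.get(name)
        if raw is None:
            continue
        try:
            # Accept "4" as well as "4.0"; never report fewer than one thread.
            return max(1, int(float(raw)))
        except (ValueError, OverflowError):
            continue  # Unparseable value: fall through to the next variable.
    return default
```

Passing a plain dict as `env` makes the helper easy to check locally, outside any container.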

CPU Limits

  • Default soft limit: 16 physical cores above the CPU request
  • Set explicit limits to prevent noisy-neighbor effects:
@app.function(cpu=(4.0, 8.0))  # (request, limit): request 4 cores, cap usage at 8
def bounded_compute():
    ...
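
To make the default soft limit concrete, the arithmetic can be sketched as follows. `default_cpu_soft_limit` is a hypothetical helper for illustration only; it simply applies the "16 physical cores above the request" rule stated above.

```python
# Headroom above the request, in physical cores, per the default described above.
DEFAULT_SOFT_LIMIT_HEADROOM = 16.0


def default_cpu_soft_limit(cpu_request: float) -> float:
    """Return the default soft CPU limit for a given CPU request."""
    if cpu_request <= 0:
        raise ValueError("cpu_request must be positive")
    return cpu_request + DEFAULT_SOFT_LIMIT_HEADROOM
```

Under this rule, a 4-core request may burst to 20 cores unless a tighter limit is set.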

Memory

Requesting Memory

@app.function(memory=16384)  # 16 GiB in MiB
def large_data():
    ...
  • Value in MiB (mebibytes; 1 GiB = 1024 MiB)
  • Default: 128 MiB
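
Because the value is given in MiB, a small conversion helper avoids off-by-1024 mistakes. This is a sketch only: `gib_to_mib` is a hypothetical name, and it assumes the request must come out as a whole number of MiB.

```python
MIB_PER_GIB = 1024


def gib_to_mib(gib: float) -> int:
    """Convert GiB to a whole number of MiB, rejecting fractional results."""
    mib = gib * MIB_PER_GIB
    if mib != int(mib):
        raise ValueError(f"{gib} GiB is not a whole number of MiB")
    return int(mib)
```

For example, `gib_to_mib(16)` yields the `16384` used in the snippet above, and `gib_to_mib(0.125)` yields the 128 MiB default.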

Memory Limits

Set hard memory limits to OOM-kill containers that exceed them:

@app.function(memory=(8192, 8192))  # (request, limit): 8 GiB request and 8 GiB hard limit
def bounded_memory():
    ...

This prevents paying for runaway memory leaks.

Ephemeral Disk

For temporary storage within a container's lifetime:

@app.function(ephemeral_disk=102400)  # 100 GiB in MiB
def process_dataset():
    # Temporary files at /tmp or anywhere in the container filesystem
    ...
  • Value in MiB
  • Default: 512 GiB quota per container
  • Maximum: 3,145,728 MiB (3 TiB)
  • Data is lost when the container shuts down
  • Use Volumes for persistent storage

Larger disk requests increase the memory request at a 20:1 ratio for billing purposes.
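
One way to read the 20:1 rule is that every 20 MiB of requested disk implies at least 1 MiB of billed memory request. The sketch below works through that reading. It is an assumption, not a documented formula: the helper names are hypothetical, and rounding up and taking the maximum with the explicit memory request are choices made here for illustration.

```python
DISK_TO_MEMORY_RATIO = 20  # 20 MiB of disk per 1 MiB of memory (assumed reading)


def memory_floor_from_disk(ephemeral_disk_mib: int) -> int:
    """Memory request (MiB) implied by a disk request, rounded up."""
    return -(-ephemeral_disk_mib // DISK_TO_MEMORY_RATIO)  # ceiling division


def billed_memory_request(memory_mib: int, ephemeral_disk_mib: int) -> int:
    """Larger of the explicit memory request and the disk-implied floor."""
    return max(memory_mib, memory_floor_from_disk(ephemeral_disk_mib))
```

Under this reading, a 100 GiB disk request (102400 MiB) paired with the default 128 MiB of memory would be billed as a 5120 MiB memory request, while the same disk paired with 32768 MiB of memory would change nothing.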

Timeout

@app.function(timeout=3600)  # 1 hour in seconds
def long_running():
    ...
  • Default: 300 seconds (5 minutes)
  • Maximum: 86,400 seconds (24 hours)
  • Function is killed when timeout expires
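
Since the timeout is expressed in seconds and capped at 24 hours, a small helper can build and validate the value before it reaches the decorator. This is a sketch with a hypothetical name (`timeout_seconds`); the bounds are the ones listed above.

```python
DEFAULT_TIMEOUT_S = 300   # 5 minutes, per the list above
MAX_TIMEOUT_S = 86_400    # 24 hours, per the list above


def timeout_seconds(hours: float = 0, minutes: float = 0, seconds: float = 0) -> int:
    """Convert a duration to whole seconds and check it against the maximum."""
    total = int(hours * 3600 + minutes * 60 + seconds)
    if not 0 < total <= MAX_TIMEOUT_S:
        raise ValueError(f"timeout must be in 1..{MAX_TIMEOUT_S} seconds, got {total}")
    return total
```

The result can be passed directly, for example `timeout=timeout_seconds(hours=2)`.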

Billing

You are charged based on whichever is higher: your resource request or actual usage.

| Resource | Billing Basis |
|----------|---------------|
| CPU | max(requested, used) |
| Memory | max(requested, used) |
| GPU | Time GPU is allocated |
| Disk | Increases memory billing at 20:1 ratio |
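
The max(requested, used) rule can be illustrated with a short calculation over usage samples. This is a sketch under stated assumptions: `billable_totals` is a hypothetical helper, the rule is assumed to apply per sampling interval, and no prices are included because none are given here.

```python
def billable_totals(cpu_request, memory_request_mib, samples):
    """Sum billable core-seconds and MiB-seconds.

    samples: iterable of (duration_s, cpu_used, memory_used_mib) tuples.
    Each interval is charged at max(requested, used) for that resource.
    """
    core_seconds = 0.0
    mib_seconds = 0.0
    for duration_s, cpu_used, memory_used_mib in samples:
        core_seconds += duration_s * max(cpu_request, cpu_used)
        mib_seconds += duration_s * max(memory_request_mib, memory_used_mib)
    return core_seconds, mib_seconds


# 10 s below the request, then 5 s above it.
usage = [(10, 1.0, 4096), (5, 6.0, 10000)]
totals = billable_totals(cpu_request=4.0, memory_request_mib=8192, samples=usage)
```

In the first interval the request sets the charge even though usage is lower; in the second, actual usage does. This is why over-requesting costs money even when the function sits idle.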

Cost Optimization Tips

  • Request only what you need
  • Use appropriate GPU tiers (L40S over H100 for inference)
  • Set scaledown_window to minimize idle time
  • Use min_containers=0 when cold starts are acceptable
  • Batch inputs with .map() instead of individual .remote() calls

Complete Example

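import modal

app = modal.App("resource-example")  # hypothetical app name

@app.function(
    cpu=8.0,                # 8 physical cores
    memory=32768,           # 32 GiB (in MiB)
    gpu="L40S",             # L40S GPU
    ephemeral_disk=204800,  # 200 GiB temp disk (in MiB)
    timeout=7200,           # 2 hours (in seconds)
    max_containers=50,      # upper bound on concurrent containers
    min_containers=1,       # keep one container warm
)
def full_pipeline(data_path: str):
    ...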