claude-scientific-skills/scientific-skills/modal/references/resources.md at b75f4e8d08b10720d637d423f8cf3d7357c29f4b

mirror of https://github.com/K-Dense-AI/claude-scientific-skills.git synced 2026-03-27 07:09:27 +08:00

Timothy Kassis b75f4e8d08 Update Modal skill

2026-03-23 16:21:31 -07:00

Modal Resource Configuration

CPU

Requesting CPU

@app.function(cpu=4.0)
def compute():
    ...

Values are physical cores, not vCPUs
Default: 0.125 cores
Modal auto-sets OPENBLAS_NUM_THREADS, OMP_NUM_THREADS, MKL_NUM_THREADS based on your CPU request
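To check which thread counts a container actually received, the function body can read these variables back. The following is a minimal sketch using only the standard library; `thread_env_settings` is a hypothetical helper name, not part of Modal, and the variables will simply be unset when the code runs outside a Modal container.

```python
import os

# Environment variables that, per the notes above, Modal sets from the CPU request.
THREAD_ENV_VARS = ("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS", "MKL_NUM_THREADS")


def thread_env_settings(environ=None):
    """Return the thread-count variables as a dict; unset ones map to None."""
    env = os.environ if environ is None else environ
    return {name: env.get(name) for name in THREAD_ENV_VARS}


if __name__ == "__main__":
    # Print whatever the current environment provides.
    for name, value in thread_env_settings().items():
        print(f"{name}={value}")
```

Calling `thread_env_settings()` at the top of a function body is a cheap way to see whether a numerical library will be constrained to the cores that were requested.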

CPU Limits

Default soft limit: 16 physical cores above the CPU request
Set explicit limits to prevent noisy-neighbor effects:

@app.function(cpu=(4.0, 8.0))  # (request, limit): request 4 cores, hard limit of 8
def bounded_compute():
    ...

Memory

Requesting Memory

@app.function(memory=16384)  # 16 GiB in MiB
def large_data():
    ...

Value in MiB (mebibytes; 1 GiB = 1024 MiB)
Default: 128 MiB
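Because the `memory` value is given in MiB, it is easy to be off by a factor of 1024 or to confuse it with decimal megabytes. A small conversion helper can make the intent explicit. This is only a sketch; `gib_to_mib` is a hypothetical name, not a Modal function.

```python
import math

MIB_PER_GIB = 1024


def gib_to_mib(gib: float) -> int:
    """Convert a size in GiB to whole MiB, rounding up."""
    return math.ceil(gib * MIB_PER_GIB)


if __name__ == "__main__":
    print(gib_to_mib(16))   # value to pass as memory= for 16 GiB
    print(gib_to_mib(0.5))  # fractional GiB also work
```

The result can be passed directly, for example `memory=gib_to_mib(16)`.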

Memory Limits

Set hard memory limits to OOM-kill containers that exceed them:

@app.function(memory=(8192, 8192))  # (request, limit) in MiB: 8 GiB request and 8 GiB hard limit
def bounded_memory():
    ...

This prevents paying for runaway memory leaks.

Ephemeral Disk

For temporary storage within a container's lifetime:

@app.function(ephemeral_disk=102400)  # 100 GiB in MiB
def process_dataset():
    # Temporary files at /tmp or anywhere in the container filesystem
    ...

Value in MiB
Default: 512 GiB quota per container
Maximum: 3,145,728 MiB (3 TiB)
Data is lost when the container shuts down
Use Volumes for persistent storage

Larger disk requests increase the memory request at a 20:1 ratio for billing purposes.
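One way to read the 20:1 rule is that the billed memory request is never less than one twentieth of the disk request. The sketch below works through that arithmetic under this assumption; the function name is hypothetical and the exact rounding Modal applies is not stated in this document.

```python
DISK_TO_MEMORY_RATIO = 20  # 20 MiB of disk implies 1 MiB of memory (assumed reading)


def effective_memory_request_mib(memory_mib: int, ephemeral_disk_mib: int) -> int:
    """Memory request used for billing, assuming a floor of disk / 20."""
    # Ceiling division so a partial MiB is rounded up (rounding is an assumption).
    implied_floor = -(-ephemeral_disk_mib // DISK_TO_MEMORY_RATIO)
    return max(memory_mib, implied_floor)


if __name__ == "__main__":
    # 100 GiB of disk with the default 128 MiB of memory.
    print(effective_memory_request_mib(128, 102400))
    # 200 GiB of disk with 32 GiB of memory already requested.
    print(effective_memory_request_mib(32768, 204800))
```

Under this reading, a large disk request only changes the bill when the memory request is small relative to it.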

Timeout

@app.function(timeout=3600)  # 1 hour in seconds
def long_running():
    ...

Default: 300 seconds (5 minutes)
Maximum: 86,400 seconds (24 hours)
Function is killed when timeout expires
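Since `timeout` is given in seconds and capped at 24 hours, it can help to compute and validate the value before deploying. A minimal sketch follows; `timeout_seconds` is a hypothetical helper, and the bounds are the ones listed above.

```python
DEFAULT_TIMEOUT_S = 300    # 5 minutes, the default listed above
MAX_TIMEOUT_S = 86_400     # 24 hours, the maximum listed above


def timeout_seconds(hours: float = 0, minutes: float = 0) -> int:
    """Convert hours and minutes to seconds, rejecting out-of-range values."""
    total = int(hours * 3600 + minutes * 60)
    if total <= 0:
        raise ValueError("timeout must be positive")
    if total > MAX_TIMEOUT_S:
        raise ValueError(f"timeout {total}s exceeds the {MAX_TIMEOUT_S}s maximum")
    return total


if __name__ == "__main__":
    print(timeout_seconds(hours=1))     # suitable for timeout= on a 1 hour job
    print(timeout_seconds(minutes=90))
```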

Billing

You are charged based on whichever is higher: your resource request or actual usage.

Resource	Billing Basis
CPU	max(requested, used)
Memory	max(requested, used)
GPU	Time GPU is allocated
Disk	Increases memory billing at 20:1 ratio
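The max(requested, used) rule in the table can be illustrated with a small calculation. This sketch deliberately contains no prices, treats usage as constant over the interval, and uses a hypothetical function name; it only shows which quantity drives the bill.

```python
def billable_core_seconds(requested_cores: float, used_cores: float, seconds: float) -> float:
    """Core-seconds billed when the charge is based on max(requested, used)."""
    return max(requested_cores, used_cores) * seconds


if __name__ == "__main__":
    # Over-provisioned: 4 cores requested, 1.5 used, so the request is billed.
    print(billable_core_seconds(4.0, 1.5, 60))
    # Under-provisioned: default 0.125 cores requested, 2 used, so usage is billed.
    print(billable_core_seconds(0.125, 2.0, 60))
```

The same comparison applies to memory, with MiB in place of cores.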

Cost Optimization Tips

Request only what you need
Use the smallest GPU tier that fits the workload (for example, an L40S rather than an H100 for inference)
Set scaledown_window to minimize idle time
Use min_containers=0 when cold starts are acceptable
Batch inputs with .map() instead of individual .remote() calls

Complete Example

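import modal

app = modal.App("resource-config-example")  # hypothetical app name

@app.function(
    cpu=8.0,               # 8 physical cores
    memory=32768,          # 32 GiB, in MiB
    gpu="L40S",            # one L40S GPU
    ephemeral_disk=204800, # 200 GiB temp disk, in MiB
    timeout=7200,          # 2 hours, in seconds
    max_containers=50,     # upper bound on concurrent containers
    min_containers=1,      # keep one container warm to avoid cold starts
)
def full_pipeline(data_path: str):
    ...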
