mirror of
https://github.com/K-Dense-AI/claude-scientific-skills.git
synced 2026-03-27 07:09:27 +08:00
Modal Resource Configuration
CPU
Requesting CPU
@app.function(cpu=4.0)  # Request 4 physical cores
def compute():
    ...
- Values are physical cores, not vCPUs
- Default: 0.125 cores
- Modal auto-sets `OPENBLAS_NUM_THREADS`, `OMP_NUM_THREADS`, and `MKL_NUM_THREADS` based on your CPU request
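To check which thread counts a container actually received, the function body can read those variables back. The following is a minimal sketch using only the standard library; the helper name `blas_thread_settings` is hypothetical, and the values it reports depend on the CPU request.

```python
import os

# The three variables the list above says are derived from the CPU request.
THREAD_VARS = ("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS", "MKL_NUM_THREADS")


def blas_thread_settings(environ=None):
    """Return {variable: int or None} for each thread-count variable.

    `environ` defaults to the process environment; passing a dict makes
    the helper easy to exercise outside a container.
    """
    env = os.environ if environ is None else environ
    settings = {}
    for name in THREAD_VARS:
        raw = env.get(name)
        # Treat unset or non-numeric values as "not configured".
        settings[name] = int(raw) if raw and raw.isdigit() else None
    return settings


if __name__ == "__main__":
    for name, value in blas_thread_settings().items():
        print(f"{name} = {value if value is not None else 'unset'}")
```

Calling this at the top of a function is a quick way to confirm that numerical libraries will use the parallelism you requested.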
CPU Limits
- Default soft limit: 16 physical cores above the CPU request
- Set explicit limits to prevent noisy-neighbor effects:
@app.function(cpu=(4.0, 8.0))  # Request 4 cores, limit of 8 cores
def bounded_compute():
    ...
Memory
Requesting Memory
@app.function(memory=16384)  # 16 GiB in MiB
def large_data():
    ...
- Value in MiB (mebibytes)
- Default: 128 MiB
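Because the value is given in MiB, sizes expressed in GiB need to be multiplied by 1024. A small sketch of that conversion follows; the helper name `gib_to_mib` is hypothetical and not part of Modal.

```python
MIB_PER_GIB = 1024


def gib_to_mib(gib: float) -> int:
    """Convert a size in GiB to whole MiB, the unit used by the memory argument."""
    if gib <= 0:
        raise ValueError("memory size must be positive")
    return int(round(gib * MIB_PER_GIB))


print(gib_to_mib(16))   # 16384, the value used in the example above
print(gib_to_mib(0.5))  # 512
```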
Memory Limits
Set hard memory limits to OOM-kill containers that exceed them:
@app.function(memory=(8192, 8192))  # 8 GiB request and 8 GiB limit
def bounded_memory():
    ...
This prevents paying for runaway memory leaks.
Ephemeral Disk
For temporary storage within a container's lifetime:
@app.function(ephemeral_disk=102400)  # 100 GiB in MiB
def process_dataset():
    # Temporary files at /tmp or anywhere in the container filesystem
    ...
- Value in MiB
- Default: 512 GiB quota per container
- Maximum: 3,145,728 MiB (3 TiB)
- Data is lost when the container shuts down
- Use Volumes for persistent storage
Larger disk requests increase the memory request at a 20:1 ratio for billing purposes.
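The 20:1 ratio can be worked through numerically. The sketch below assumes the rule means the billed memory request is raised to at least one twentieth of the disk request; the function name and the round-up behaviour are illustrative assumptions, not taken from Modal.

```python
DISK_TO_MEMORY_RATIO = 20  # 20 MiB of disk per 1 MiB of billed memory


def billed_memory_request_mib(memory_mib: int, ephemeral_disk_mib: int) -> int:
    """Return the memory request (MiB) used for billing under the assumed rule.

    The result is the larger of the explicit memory request and
    ephemeral_disk / 20. Rounding up is an assumption of this sketch.
    """
    disk_floor = -(-ephemeral_disk_mib // DISK_TO_MEMORY_RATIO)  # ceiling division
    return max(memory_mib, disk_floor)


# 100 GiB of disk with the default 128 MiB memory request: billed as 5 GiB.
print(billed_memory_request_mib(128, 102400))    # 5120
# 200 GiB of disk with a 32 GiB memory request: the memory request already dominates.
print(billed_memory_request_mib(32768, 204800))  # 32768
```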
Timeout
@app.function(timeout=3600)  # 1 hour in seconds
def long_running():
    ...
- Default: 300 seconds (5 minutes)
- Maximum: 86,400 seconds (24 hours)
- Function is killed when timeout expires
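Since the function is killed once the timeout expires, long-running work may want to stop early and hand back whatever is unfinished. One possible pattern is sketched below in plain Python; `TimeBudget`, `process_until_deadline`, and the 30-second safety margin are hypothetical choices for illustration, not Modal features.

```python
import time


class TimeBudget:
    """Tracks how much of a fixed time allowance is left."""

    def __init__(self, timeout_s: float, safety_margin_s: float = 30.0,
                 clock=time.monotonic):
        self._clock = clock
        # Stop a little before the real timeout so cleanup can finish.
        self._deadline = clock() + timeout_s - safety_margin_s

    def remaining(self) -> float:
        return max(0.0, self._deadline - self._clock())

    def expired(self) -> bool:
        return self.remaining() == 0.0


def process_until_deadline(items, budget, handle):
    """Handle items in order; return the ones left over when the budget runs out."""
    items = list(items)
    for index, item in enumerate(items):
        if budget.expired():
            return items[index:]
        handle(item)
    return []
```

A caller could create `TimeBudget(3600)` at the start of a function configured with `timeout=3600` and resubmit the leftover items in a later call.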
Billing
You are charged based on whichever is higher: your resource request or actual usage.
| Resource | Billing Basis |
|---|---|
| CPU | max(requested, used) |
| Memory | max(requested, used) |
| GPU | Time GPU is allocated |
| Disk | Increases memory billing at 20:1 ratio |
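The max(requested, used) rule in the table can be illustrated with a short calculation. This is a sketch only: the function name is hypothetical, and the per-second sampling is an assumption made for the example rather than a statement about how Modal meters usage.

```python
def billable_core_seconds(requested_cores: float, used_cores_per_second) -> float:
    """Sum max(requested, used) over one usage sample per second."""
    return sum(max(requested_cores, used) for used in used_cores_per_second)


# Requesting 4 cores while using 1.0, 6.0, and 3.5 cores over three seconds:
# the idle seconds are billed at the request, the busy second at actual usage.
print(billable_core_seconds(4.0, [1.0, 6.0, 3.5]))  # 14.0
```

The same shape applies to memory: over-requesting is paid for even when usage is low, which is why the first tip below is to request only what you need.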
Cost Optimization Tips
- Request only what you need
- Use appropriate GPU tiers (L40S over H100 for inference)
- Set `scaledown_window` to minimize idle time
- Use `min_containers=0` when cold starts are acceptable
- Batch inputs with `.map()` instead of individual `.remote()` calls
Complete Example
@app.function(
    cpu=8.0,                # 8 physical cores
    memory=32768,           # 32 GiB
    gpu="L40S",             # L40S GPU
    ephemeral_disk=204800,  # 200 GiB temp disk
    timeout=7200,           # 2 hours
    max_containers=50,
    min_containers=1,
)
def full_pipeline(data_path: str):
    ...