# Modal Resource Configuration

The snippets below assume `import modal` and an existing `app = modal.App(...)` object.

## CPU

### Requesting CPU

```python
@app.function(cpu=4.0)
def compute():
    ...
```

- Values are **physical cores**, not vCPUs
- Default: 0.125 cores
- Modal auto-sets `OPENBLAS_NUM_THREADS`, `OMP_NUM_THREADS`, `MKL_NUM_THREADS` based on your CPU request

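To see which thread counts the numeric libraries will pick up inside a container, the variables listed above can be read directly. This is a minimal sketch: the helper name `thread_settings` and the fallback value of 1 are assumptions for illustration, not part of Modal's API.

```python
import os

# Variable names are taken from the list above.
THREAD_ENV_VARS = ("OPENBLAS_NUM_THREADS", "OMP_NUM_THREADS", "MKL_NUM_THREADS")


def thread_settings(env=None, fallback=1):
    """Return the thread count each variable resolves to (hypothetical helper)."""
    env = os.environ if env is None else env
    settings = {}
    for name in THREAD_ENV_VARS:
        raw = env.get(name, "")
        # Use the fallback when the variable is unset or not a positive integer.
        settings[name] = int(raw) if raw.isdigit() and int(raw) > 0 else fallback
    return settings
```

Calling `thread_settings()` at the top of a function body is a quick way to confirm that the values match the CPU request you made.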
### CPU Limits

- Default soft limit: 16 physical cores above the CPU request
- Pass a `(request, limit)` tuple to set an explicit limit and prevent noisy-neighbor effects:

```python
@app.function(cpu=(4.0, 8.0))  # Request 4 cores, limit of 8 cores
def bounded_compute():
    ...
```

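The default soft limit is simple arithmetic on the request. The sketch below only restates the rule above; the function name is hypothetical and the constants are the figures quoted in this document.

```python
SOFT_LIMIT_HEADROOM_CORES = 16.0  # headroom above the request, per the rule above
DEFAULT_CPU_REQUEST = 0.125       # default request from "Requesting CPU"


def default_soft_cpu_limit(cpu_request=DEFAULT_CPU_REQUEST):
    """Return the soft CPU limit implied by a given request (hypothetical helper)."""
    if cpu_request <= 0:
        raise ValueError("cpu_request must be positive")
    return cpu_request + SOFT_LIMIT_HEADROOM_CORES
```

For example, a request of 4 cores implies a soft limit of 20 cores, and the default request implies 16.125 cores.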
## Memory

### Requesting Memory

```python
@app.function(memory=16384)  # 16 GiB in MiB
def large_data():
    ...
```

- Value in **MiB** (mebibytes)
- Default: 128 MiB

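Because the value is given in MiB, a small conversion helper avoids hand-computed magic numbers. This is an illustrative sketch; `gib_to_mib` is a hypothetical name, not a Modal function.

```python
MIB_PER_GIB = 1024


def gib_to_mib(gib):
    """Convert GiB to a whole number of MiB (hypothetical helper)."""
    if gib < 0:
        raise ValueError("size must be non-negative")
    return round(gib * MIB_PER_GIB)
```

With this, `memory=gib_to_mib(16)` is equivalent to writing `memory=16384`.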
### Memory Limits

Pass a `(request, limit)` tuple to set a hard memory limit. Containers that exceed the limit are OOM-killed:

```python
@app.function(memory=(8192, 16384))  # 8 GiB request, 16 GiB hard limit
def bounded_memory():
    ...
```

This prevents paying for runaway memory leaks.

## Ephemeral Disk

For temporary storage within a container's lifetime:

```python
@app.function(ephemeral_disk=102400)  # 100 GiB in MiB
def process_dataset():
    # Temporary files at /tmp or anywhere in the container filesystem
    ...
```

- Value in **MiB**
- Default: 512 GiB quota per container
- Maximum: 3,145,728 MiB (3 TiB)
- Data is lost when the container shuts down
- Use Volumes for persistent storage

Larger disk requests increase the memory request at a 20:1 ratio for billing purposes.

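The 20:1 rule can be worked through numerically. The sketch below assumes the rule means the billed memory request is at least one twentieth of the disk request; that reading and the function name are assumptions, so check current Modal pricing before relying on the numbers.

```python
DISK_TO_MEMORY_RATIO = 20  # from the 20:1 rule above


def billed_memory_request_mib(memory_mib, ephemeral_disk_mib):
    """Estimate the memory request used for billing (hypothetical helper).

    Assumes a disk request raises the memory request to disk / 20
    when the explicit memory request is lower than that.
    """
    disk_floor_mib = ephemeral_disk_mib / DISK_TO_MEMORY_RATIO
    return max(memory_mib, disk_floor_mib)
```

Under this assumption, a 100 GiB disk request (102400 MiB) with the default 128 MiB of memory is billed as 5120 MiB of memory, while the same disk with a 16 GiB memory request is unaffected.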
## Timeout

```python
@app.function(timeout=3600)  # 1 hour in seconds
def long_running():
    ...
```

- Default: 300 seconds (5 minutes)
- Maximum: 86,400 seconds (24 hours)
- Function is killed when timeout expires

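Since the timeout is expressed in seconds and capped, it can help to build the value from hours and minutes and validate it locally. This is a sketch only; `timeout_seconds` is a hypothetical helper and the bounds are the figures listed above.

```python
DEFAULT_TIMEOUT_S = 300   # 5 minutes, from the list above
MAX_TIMEOUT_S = 86_400    # 24 hours, from the list above


def timeout_seconds(hours=0, minutes=0, seconds=0):
    """Build a timeout in seconds and check it against the stated maximum."""
    total = int(hours * 3600 + minutes * 60 + seconds)
    if total <= 0:
        raise ValueError("timeout must be positive")
    if total > MAX_TIMEOUT_S:
        raise ValueError(f"timeout {total}s exceeds the {MAX_TIMEOUT_S}s maximum")
    return total
```

For instance, `timeout=timeout_seconds(hours=2)` yields the same value as `timeout=7200`, and a 25-hour request fails before anything is deployed.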
## Billing

You are charged based on **whichever is higher**: your resource request or actual usage.

| Resource | Billing Basis |
|----------|---------------|
| CPU | max(requested, used) |
| Memory | max(requested, used) |
| GPU | Time GPU is allocated |
| Disk | Increases memory billing at 20:1 ratio |

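The `max(requested, used)` rule for CPU and memory can be illustrated with a few lines of arithmetic. The helper below is a hypothetical sketch that computes billable quantities only; it does not model prices, GPU time, or the disk rule.

```python
def billable_resources(requested, used):
    """Apply max(requested, used) per resource (hypothetical helper).

    Both arguments are dicts keyed by resource name; a resource missing
    from `used` is treated as zero usage.
    """
    return {
        name: max(amount, used.get(name, 0))
        for name, amount in requested.items()
    }
```

A container that requests 4 cores and 8192 MiB but uses 1.5 cores and 10000 MiB would be billed for 4 cores (the request) and 10000 MiB (the usage).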
### Cost Optimization Tips

- Request only what you need
- Use appropriate GPU tiers (L40S over H100 for inference)
- Set `scaledown_window` to minimize idle time
- Use `min_containers=0` when cold starts are acceptable
- Batch inputs with `.map()` instead of individual `.remote()` calls

## Complete Example

```python
import modal

app = modal.App("resource-config-example")  # app name is illustrative


@app.function(
    cpu=8.0,                # 8 physical cores
    memory=32768,           # 32 GiB
    gpu="L40S",             # L40S GPU
    ephemeral_disk=204800,  # 200 GiB temp disk
    timeout=7200,           # 2 hours
    max_containers=50,
    min_containers=1,
)
def full_pipeline(data_path: str):
    ...
```