# Modal Functions and Classes

## Table of Contents

- Functions
- Remote Execution
- Classes with Lifecycle Hooks
- Parallel Execution
- Async Functions
- Local Entrypoints
- Generators
- Retries
- Timeouts
## Functions

### Basic Function

```python
import modal

app = modal.App("my-app")


@app.function()
def compute(x: int, y: int) -> int:
    return x + y
```
### Function Parameters

The `@app.function()` decorator accepts:

| Parameter | Type | Description |
|---|---|---|
| `image` | `Image` | Container image |
| `gpu` | `str` | GPU type (e.g., `"H100"`, `"A100:2"`) |
| `cpu` | `float` | CPU cores |
| `memory` | `int` | Memory in MiB |
| `timeout` | `int` | Max execution time in seconds |
| `secrets` | `list[Secret]` | Secrets to inject |
| `volumes` | `dict[str, Volume]` | Volumes to mount |
| `schedule` | `Schedule` | Cron or periodic schedule |
| `max_containers` | `int` | Max container count |
| `min_containers` | `int` | Minimum warm containers |
| `retries` | `int` | Retry count on failure |
| `concurrency_limit` | `int` | Max concurrent inputs |
| `ephemeral_disk` | `int` | Disk in MiB |
## Remote Execution

### `.remote()` — Synchronous Call

```python
result = compute.remote(3, 4)  # Runs in the cloud, blocks until done
```

### `.local()` — Local Execution

```python
result = compute.local(3, 4)  # Runs locally (for testing)
```

### `.spawn()` — Async Fire-and-Forget

```python
call = compute.spawn(3, 4)  # Returns immediately
# ... do other work ...
result = call.get()  # Retrieve result later
```

`.spawn()` supports up to 1 million pending inputs.
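The spawn/get pattern mirrors Python's own future-based APIs. As a local analogy (plain stdlib, not Modal), the same submit-now, collect-later flow looks like this:

```python
from concurrent.futures import ThreadPoolExecutor


def compute(x: int, y: int) -> int:
    return x + y


with ThreadPoolExecutor() as pool:
    future = pool.submit(compute, 3, 4)  # returns immediately, like .spawn()
    # ... do other work ...
    result = future.result()  # block and retrieve, like .get()

print(result)  # 7
```

The difference in Modal is that the pending work runs in a cloud container rather than a local thread, and the handle survives beyond the current process.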
## Classes with Lifecycle Hooks

Use `@app.cls()` for stateful workloads where you want to load resources once:

```python
@app.cls(gpu="L40S", image=image)
class Model:
    @modal.enter()
    def setup(self):
        """Runs once when the container starts."""
        import torch

        self.model = torch.load("/weights/model.pt")
        self.model.eval()

    @modal.method()
    def predict(self, text: str) -> dict:
        """Callable remotely."""
        return self.model(text)

    @modal.exit()
    def teardown(self):
        """Runs when the container shuts down."""
        cleanup_resources()
```
### Lifecycle Decorators

| Decorator | When It Runs |
|---|---|
| `@modal.enter()` | Once on container startup, before any inputs |
| `@modal.method()` | For each remote call |
| `@modal.exit()` | On container shutdown |
### Calling Class Methods

```python
# Create instance and call method
model = Model()
result = model.predict.remote("Hello world")

# Parallel calls
results = list(model.predict.map(["text1", "text2", "text3"]))
```
### Parameterized Classes

```python
@app.cls()
class Worker:
    model_name: str = modal.parameter()

    @modal.enter()
    def load(self):
        self.model = load_model(self.model_name)

    @modal.method()
    def run(self, data):
        return self.model(data)


# Different model instances autoscale independently
gpt = Worker(model_name="gpt-4")
llama = Worker(model_name="llama-3")
```
## Parallel Execution

### `.map()` — Parallel Processing

Process multiple inputs across containers:

```python
@app.function()
def process(item):
    return heavy_computation(item)


@app.local_entrypoint()
def main():
    items = list(range(1000))
    results = list(process.map(items))
    print(f"Processed {len(results)} items")
```

- Results are returned in the same order as inputs
- Modal autoscales containers to handle the workload
- Use `return_exceptions=True` to collect errors instead of raising
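The effect of `return_exceptions=True` is that failed inputs yield the exception object in place of a result, so one bad item does not abort the whole map. A local sketch of that behavior (plain Python, not Modal):

```python
def process(item: int) -> int:
    if item < 0:
        raise ValueError(f"bad item: {item}")
    return item * 2


# Mimics map(..., return_exceptions=True): exceptions become values in the
# output stream instead of propagating and stopping iteration.
results = []
for item in [1, -2, 3]:
    try:
        results.append(process(item))
    except Exception as exc:
        results.append(exc)

# Separate successes from failures afterwards
errors = [r for r in results if isinstance(r, Exception)]
successes = [r for r in results if not isinstance(r, Exception)]
print(successes)  # [2, 6]
```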
### `.starmap()` — Multi-Argument Parallel

```python
@app.function()
def add(x, y):
    return x + y


results = list(add.starmap([(1, 2), (3, 4), (5, 6)]))
# [3, 7, 11]
```
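The unpacking semantics are the same as the stdlib's `itertools.starmap`: each tuple in the input sequence is splatted into the function's positional arguments. Run locally and serially, the equivalent is:

```python
from itertools import starmap


def add(x: int, y: int) -> int:
    return x + y


# Each (x, y) tuple is unpacked into add's arguments, as with .starmap()
results = list(starmap(add, [(1, 2), (3, 4), (5, 6)]))
print(results)  # [3, 7, 11]
```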
### `.map()` with `order_outputs=False`

For faster throughput when order doesn't matter:

```python
for result in process.map(items, order_outputs=False):
    handle(result)  # Results arrive as they complete
```
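This is the same trade-off as `concurrent.futures.as_completed` versus an ordered `map`: consuming results in completion order avoids head-of-line blocking on slow inputs. A local sketch of the pattern:

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def process(item: int) -> int:
    time.sleep(0.01 * (5 - item))  # later items finish sooner
    return item * 10


with ThreadPoolExecutor() as pool:
    futures = [pool.submit(process, i) for i in range(5)]
    # Yields each result as soon as it is ready, not in submission order
    completed = [f.result() for f in as_completed(futures)]
```

All results still arrive; only the ordering guarantee is given up.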
## Async Functions

Modal supports async/await natively:

```python
@app.function()
async def fetch_data(url: str) -> str:
    import httpx

    async with httpx.AsyncClient() as client:
        response = await client.get(url)
        return response.text
```

Async functions are especially useful with `@modal.concurrent()` for handling multiple requests per container.
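The reason async pairs well with per-container concurrency is that awaits on I/O let other requests make progress in the meantime. A minimal local illustration of overlapping coroutines (stdlib `asyncio` only; the `fetch` function is a hypothetical stand-in for network I/O):

```python
import asyncio


async def fetch(i: int) -> int:
    await asyncio.sleep(0.01)  # stand-in for a network round trip
    return i * 2


async def main() -> list[int]:
    # All five "requests" overlap in one event loop; total time is roughly
    # one sleep, not five
    return await asyncio.gather(*(fetch(i) for i in range(5)))


results = asyncio.run(main())
print(results)  # [0, 2, 4, 6, 8]
```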
## Local Entrypoints

A function decorated with `@app.local_entrypoint()` runs on your machine and orchestrates remote calls:

```python
@app.local_entrypoint()
def main():
    # This code runs locally
    data = load_local_data()

    # These calls run in the cloud
    results = list(process.map(data))

    # Back to local
    save_results(results)
```

You can also define multiple entrypoints and select one by function name:

```shell
modal run script.py::train
modal run script.py::evaluate
```
## Generators

Functions can yield results as they're produced:

```python
@app.function()
def generate_data():
    for i in range(100):
        yield process(i)


@app.local_entrypoint()
def main():
    for result in generate_data.remote_gen():
        print(result)
```
## Retries

Configure automatic retries on failure:

```python
@app.function(retries=3)
def flaky_operation():
    ...
```

For more control, use `modal.Retries`:

```python
@app.function(retries=modal.Retries(max_retries=3, backoff_coefficient=2.0))
def api_call():
    ...
```
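The delay schedule implied by `backoff_coefficient` can be sketched locally: each retry waits `coefficient` times longer than the previous one. The 1-second initial delay below is purely illustrative, not a documented Modal default:

```python
def backoff_delays(max_retries: int, initial_delay: float, coefficient: float) -> list[float]:
    """Delay (in seconds) before each retry under exponential backoff."""
    return [initial_delay * coefficient**i for i in range(max_retries)]


# With backoff_coefficient=2.0, each retry waits twice as long as the last
print(backoff_delays(3, 1.0, 2.0))  # [1.0, 2.0, 4.0]
```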
## Timeouts

Set maximum execution time:

```python
@app.function(timeout=3600)  # 1 hour
def long_training():
    ...
```

The default timeout is 300 seconds (5 minutes); the maximum is 86400 seconds (24 hours).