Deploying Optimization as an API
Optimization models are powerful, but they are only useful if people can access them. This chapter teaches you to deploy ILP models as web services.
Why Deploy Optimization as an API?
Running optimization from a Jupyter notebook is fine for development. In production, applications need programmatic access:
| Scenario | Without API | With API |
|----------|:-----------:|:--------:|
| Supply chain team needs daily re-optimization | Open notebook, change parameters, run | POST request with parameters, get JSON response |
| Website needs real-time pricing optimization | Impossible | API call in milliseconds |
| Mobile app needs route optimization | Impossible | API call from any device |
What Is an Optimization API?
An optimization API wraps the ILP model behind a RESTful endpoint. The client sends problem parameters as JSON, the server solves the model, and returns the solution.
How to Build an Optimization API
Step 1: Define the Model as a Function
# scheduling_model.py
import pulp


def solve_scheduling(demand: dict, employees: list, preferences: dict) -> dict:
    """Solve the staff scheduling problem and return the assignment.

    demand maps (day, shift) to the required head count; missing pairs count as 0.
    preferences maps (employee, shift) to a score; higher means more preferred.
    """
    days = list(range(1, 8))
    shifts = ["Morning", "Afternoon", "Evening"]

    prob = pulp.LpProblem("Staff_Scheduling", pulp.LpMinimize)

    # x[(e, d, s)] = 1 if employee e works shift s on day d
    x = {}
    for e in employees:
        for d in days:
            for s in shifts:
                x[(e, d, s)] = pulp.LpVariable(f"x_{e}_{d}_{s}", cat="Binary")

    # Objective: maximize total preference score (minimize its negative)
    prob += pulp.lpSum(
        -preferences[(e, s)] * x[(e, d, s)]
        for e in employees for d in days for s in shifts
        if (e, s) in preferences
    )

    # Each employee works at most one shift per day
    for e in employees:
        for d in days:
            prob += pulp.lpSum(x[(e, d, s)] for s in shifts) <= 1

    # Each employee works at most 5 shifts per week
    for e in employees:
        prob += pulp.lpSum(x[(e, d, s)] for d in days for s in shifts) <= 5

    # Every shift is staffed to at least its demand
    for d in days:
        for s in shifts:
            prob += pulp.lpSum(x[(e, d, s)] for e in employees) >= demand.get((d, s), 0)

    prob.solve(pulp.PULP_CBC_CMD(msg=False))  # bundled CBC solver, log output suppressed

    status = pulp.LpStatus[prob.status]
    if status != "Optimal":
        return {"status": status.lower()}  # e.g. "infeasible"

    # Build result: employee -> {"Day_<d>": shift}
    schedule = {}
    for e in employees:
        schedule[e] = {}
        for d in days:
            for s in shifts:
                if pulp.value(x[(e, d, s)]) > 0.5:  # tolerate float round-off
                    schedule[e][f"Day_{d}"] = s

    return {
        "status": "optimal",
        "objective": pulp.value(prob.objective) or 0.0,
        "schedule": schedule,
    }
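Some infeasible requests can be rejected before the solver runs, because the constraints above imply simple necessary conditions: a day's total demand cannot exceed the head count (one shift per employee per day), and the week's total demand cannot exceed five shifts per employee. The sketch below illustrates that idea in plain Python. `quick_infeasibility_reasons` is a hypothetical helper, not part of the original model, and passing these checks does not guarantee that a feasible schedule exists.

```python
from collections import defaultdict


def quick_infeasibility_reasons(demand: dict, employees: list,
                                max_shifts_per_employee: int = 5) -> list:
    """Return reasons the instance cannot be feasible (empty list if none found).

    demand maps (day, shift) -> required head count, as in solve_scheduling.
    These are necessary conditions only; the solver has the final word.
    """
    reasons = []
    n = len(employees)

    # One shift per employee per day: a day's total demand cannot exceed head count.
    per_day = defaultdict(int)
    for (day, _shift), required in demand.items():
        per_day[day] += required
    for day in sorted(per_day):
        if per_day[day] > n:
            reasons.append(
                f"day {day} needs {per_day[day]} staff but only {n} are available"
            )

    # Weekly cap: total demand cannot exceed employees * max shifts each.
    total = sum(demand.values())
    capacity = n * max_shifts_per_employee
    if total > capacity:
        reasons.append(f"total demand {total} exceeds weekly capacity {capacity}")

    return reasons


demand = {(1, "Morning"): 2, (1, "Afternoon"): 3, (1, "Evening"): 1}
print(quick_infeasibility_reasons(demand, ["Alice", "Bob", "Charlie"]))
# → ['day 1 needs 6 staff but only 3 are available']
```

A check like this costs microseconds and gives the caller a more specific message than a bare "infeasible" status.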
Step 2: Wrap with FastAPI
# api.py
from typing import Dict, List

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

from scheduling_model import solve_scheduling

app = FastAPI(title="Optimization API", version="1.0.0")


class ScheduleRequest(BaseModel):
    demand: Dict[str, int]  # keys are "<day>_<shift>", e.g. "1_Morning": 2
    employees: List[str]
    min_days: int = 4  # accepted, but not yet passed to the model
    max_days: int = 5  # accepted, but not yet passed to the model


class ScheduleResponse(BaseModel):
    status: str
    objective: float = 0
    schedule: Dict = {}


@app.post("/schedule", response_model=ScheduleResponse)
def create_schedule(req: ScheduleRequest):
    """
    Solve staff scheduling problem.

    Send demand requirements and the employee list,
    receive the optimal schedule back.
    """
    # Convert demand keys from "1_Morning" to (1, "Morning")
    demand = {}
    for key, val in req.demand.items():
        try:
            day_str, shift = key.split("_", 1)
            demand[(int(day_str), shift)] = val
        except ValueError:
            raise HTTPException(status_code=422, detail=f"Bad demand key: {key!r}")

    # Default preferences (neutral)
    preferences = {}

    # Declared with plain `def` so FastAPI runs the blocking solve in its thread pool
    return solve_scheduling(demand, req.employees, preferences)


@app.get("/health")
async def health():
    return {"status": "healthy", "solver": "PuLP", "version": "1.0.0"}
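The endpoint relies on every demand key having the form `<day>_<shift>`. A stricter parser can reject malformed input with a clear message before any model is built. The following is a minimal sketch under stated assumptions: `parse_demand`, `VALID_DAYS`, and `VALID_SHIFTS` are hypothetical names, and the allowed days (1 to 7) and shift names are taken from the scheduling model in Step 1.

```python
VALID_DAYS = range(1, 8)
VALID_SHIFTS = ("Morning", "Afternoon", "Evening")


def parse_demand(raw: dict) -> dict:
    """Convert {"1_Morning": 2} into {(1, "Morning"): 2}, validating each entry."""
    demand = {}
    for key, required in raw.items():
        # Split on the first underscore only: "1_Morning" -> ("1", "_", "Morning")
        day_str, sep, shift = key.partition("_")
        if not sep or not day_str.isdecimal():
            raise ValueError(f"demand key {key!r} must look like '<day>_<shift>'")
        day = int(day_str)
        if day not in VALID_DAYS:
            raise ValueError(f"demand key {key!r}: day must be between 1 and 7")
        if shift not in VALID_SHIFTS:
            raise ValueError(f"demand key {key!r}: unknown shift {shift!r}")
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(required, bool) or not isinstance(required, int) or required < 0:
            raise ValueError(f"demand for {key!r} must be a non-negative integer")
        demand[(day, shift)] = required
    return demand


print(parse_demand({"1_Morning": 2, "2_Evening": 1}))
# → {(1, 'Morning'): 2, (2, 'Evening'): 1}
```

Inside a FastAPI handler, one option is to catch the `ValueError` and re-raise it as an `HTTPException` with status 422, so the client sees the message instead of a generic server error.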
Step 3: Run the Server
uvicorn api:app --host 0.0.0.0 --port 8000
Step 4: Call from Any Client
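import requests

response = requests.post(
    "http://localhost:8000/schedule",
    json={
        "demand": {
            "1_Morning": 2, "1_Afternoon": 3, "1_Evening": 1,
            "2_Morning": 2, "2_Afternoon": 2, "2_Evening": 1,
        },
        # Day 1 needs 6 people and each employee works at most one shift per day,
        # so at least six employees are needed for a feasible schedule.
        "employees": ["Alice", "Bob", "Charlie", "Dana", "Eve", "Frank"],
    },
    timeout=60,  # do not wait forever on a slow solve
)
response.raise_for_status()  # surface HTTP errors instead of parsing an error body
print(response.json())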
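JSON object keys must be strings, which is why demand travels as `"1_Morning"` rather than as a `(day, shift)` pair. A client that keeps demand in tuple-keyed form can encode it just before sending. The helper below is a small sketch; `build_payload` is a hypothetical name, and the payload shape follows the `/schedule` request shown above.

```python
import json


def build_payload(demand: dict, employees: list) -> dict:
    """Encode {(day, shift): n} as {"<day>_<shift>": n} for the /schedule endpoint."""
    return {
        "demand": {f"{day}_{shift}": n for (day, shift), n in demand.items()},
        "employees": list(employees),
    }


payload = build_payload(
    {(1, "Morning"): 2, (1, "Evening"): 1},
    ["Alice", "Bob", "Charlie"],
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be passed directly as the `json=` argument of `requests.post`.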
Advanced: Async Optimization
For long-running models (supply chain, production planning), start the solve in the background, return a job ID immediately, and let the client poll a status endpoint:
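import asyncio
from uuid import uuid4

# Assumes this code lives in api.py next to app, ScheduleRequest and solve_scheduling.
# In-memory job store: lost on restart and not shared between worker processes
jobs = {}
background_tasks = set()  # holds task references so they are not garbage-collected

async def run_optimization(job_id: str, req: ScheduleRequest):
    """Run the blocking solve in a worker thread (Python 3.9+) and record the outcome."""
    try:
        demand = {}
        for key, val in req.demand.items():
            day_str, shift = key.split("_", 1)
            demand[(int(day_str), shift)] = val
        result = await asyncio.to_thread(solve_scheduling, demand, req.employees, {})
        jobs[job_id] = {"status": "done", "result": result}
    except Exception as exc:
        jobs[job_id] = {"status": "failed", "error": str(exc)}

@app.post("/optimize/async")
async def start_optimization(req: ScheduleRequest):
    """Start optimization in background, return job ID"""
    job_id = str(uuid4())
    jobs[job_id] = {"status": "running"}
    # Start background task
    task = asyncio.create_task(run_optimization(job_id, req))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    return {"job_id": job_id, "status": "running"}

@app.get("/optimize/status/{job_id}")
async def get_status(job_id: str):
    """Check optimization status"""
    job = jobs.get(job_id)
    if job is None:
        return {"status": "not_found"}
    return job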
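On the client side, the job-ID pattern implies a polling loop. The sketch below shows one way to write it. `wait_for_job` is a hypothetical helper, and it treats any status other than `"running"` as terminal, which is an assumption about how the server reports finished jobs. The HTTP call is injected as a function so the loop can be shown without a live server.

```python
import time


def wait_for_job(fetch_status, job_id: str, interval: float = 1.0,
                 timeout: float = 300.0) -> dict:
    """Poll fetch_status(job_id) until the job leaves the "running" state.

    fetch_status is any callable returning the status dict, for example a thin
    wrapper around GET /optimize/status/{job_id}.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status(job_id)
        if job.get("status") != "running":
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still running after {timeout}s")
        time.sleep(interval)


# Stand-in for the HTTP call: reports "running" twice, then a finished job.
_responses = iter([
    {"status": "running"},
    {"status": "running"},
    {"status": "done", "result": {"status": "optimal"}},
])
print(wait_for_job(lambda job_id: next(_responses), "demo", interval=0.01))
# → {'status': 'done', 'result': {'status': 'optimal'}}
```

With `requests`, `fetch_status` could be `lambda job_id: requests.get(f"{base_url}/optimize/status/{job_id}", timeout=10).json()`, where `base_url` is whatever address the server runs on.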
Deployment Considerations
| Aspect | Recommendation |
|--------|---------------|
| Server | At least 2 CPU cores, 4 GB RAM |
| Timeout | Set reasonable API timeout (30-60s) |
| Caching | Cache common problem instances |
| Validation | Validate inputs before solving |
| Error handling | Return clear error messages |
| Monitoring | Track solve times and failure rates |
| Scaling | Multiple workers for concurrent requests |
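Caching only helps if two requests describing the same problem map to the same key. One possible approach is to hash a canonical form of the request. The sketch below assumes the `/schedule` payload shape used in this chapter; `cache_key`, `solve_cached`, and the in-memory `_cache` dictionary are hypothetical names, and a production cache would also need a size limit or expiry.

```python
import hashlib
import json


def cache_key(payload: dict) -> str:
    """Stable key: the same problem hashes the same regardless of key or employee order."""
    canonical = {
        "demand": dict(sorted(payload["demand"].items())),
        "employees": sorted(payload["employees"]),
    }
    blob = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


_cache = {}  # unbounded in-memory store, for illustration only


def solve_cached(payload: dict, solve) -> dict:
    """Return a stored result for a repeated instance, otherwise call solve(payload)."""
    key = cache_key(payload)
    if key not in _cache:
        _cache[key] = solve(payload)
    return _cache[key]


calls = []

def fake_solve(payload):  # stand-in for the real solver call
    calls.append(payload)
    return {"status": "optimal"}

a = {"demand": {"1_Morning": 2, "1_Evening": 1}, "employees": ["Alice", "Bob"]}
b = {"demand": {"1_Evening": 1, "1_Morning": 2}, "employees": ["Bob", "Alice"]}
solve_cached(a, fake_solve)
solve_cached(b, fake_solve)
print(len(calls))  # → 1, the second request was served from the cache
```

Sorting the employee list treats reordered requests as the same instance. The cached schedule is still optimal for the reordered request, although a fresh solve might have returned a different schedule of equal value.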
Performance Benchmarks
| Problem Type | Variables | Solve Time | API Response |
|:-------------|:---------:|:----------:|:------------:|
| Small scheduling | 100 | < 0.5s | < 100ms after model warmup |
| Medium supply chain | 1000 | 2-5s | ~3s |
| Large production | 10000 | 30-60s | ~45s |
| Enterprise | 50000+ | minutes | Use async endpoint |
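Figures like these depend on hardware, solver, and model structure, so it is worth collecting your own numbers. The sketch below shows one way to record solve times and failure rates in memory. `SolveStats` is a hypothetical helper, and counting any result whose status is not `"optimal"` as a failure is an assumption based on the return value of `solve_scheduling`.

```python
import time
from dataclasses import dataclass, field


@dataclass
class SolveStats:
    """Accumulates solve durations and failure counts in memory."""
    durations: list = field(default_factory=list)
    failures: int = 0

    def record(self, solve, *args, **kwargs):
        """Call solve(...), timing it and counting non-optimal results and exceptions."""
        start = time.perf_counter()
        try:
            result = solve(*args, **kwargs)
        except Exception:
            self.failures += 1
            raise
        finally:
            # Runs on success and on error, so every attempt is timed
            self.durations.append(time.perf_counter() - start)
        if result.get("status") != "optimal":
            self.failures += 1
        return result

    @property
    def failure_rate(self) -> float:
        return self.failures / len(self.durations) if self.durations else 0.0


stats = SolveStats()
stats.record(lambda: {"status": "optimal"})      # stand-ins for real solver calls
stats.record(lambda: {"status": "infeasible"})
print(len(stats.durations), stats.failure_rate)  # → 2 0.5
```

In the API, the handler could call `stats.record(solve_scheduling, demand, employees, preferences)` and expose the aggregates through the health endpoint or a metrics system.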
The Vibe Coding Approach
"Create a FastAPI endpoint that solves staff scheduling. Accept demand requirements and employee lists as JSON, return optimal schedule. Wrap the PuLP model from Chapter 2."
Given a prompt like this, an AI coding assistant can generate the API code together with its request and response models.
Summary
Deploying optimization models as APIs makes them accessible to applications, users, and automated systems.
Key takeaways:
- Package the ILP model as a Python function
- Wrap with FastAPI for automatic OpenAPI documentation
- Use Pydantic models for request/response validation
- Use async endpoints for long-running models
- Add a health check endpoint for monitoring
- Set appropriate timeouts based on problem size
- Cache common problem instances
- Monitor solve times and failure rates
What's Next: Course Summary
This constraint programming course covered ILP basics with PuLP, staff scheduling, multi-project resource allocation, supply chain optimization, and deployment as APIs. You can now model, solve, and deploy a wide range of optimization problems.