Deploying Optimization as an API

Optimization models are powerful, but they are only useful if people can access them. This chapter teaches you to deploy ILP models as web services.

Why Deploy Optimization as an API?

Running optimization from a Jupyter notebook is fine for development. In production, applications need programmatic access:

| Scenario | Without API | With API |
|----------|:-----------:|:--------:|
| Supply chain team needs daily re-optimization | Open notebook, change parameters, run | POST request with parameters, get JSON response |
| Website needs real-time pricing optimization | Impossible | API call in milliseconds |
| Mobile app needs route optimization | Impossible | API call from any device |

What Is an Optimization API?

An optimization API wraps the ILP model behind a RESTful endpoint. The client sends the problem parameters as JSON; the server builds and solves the model, then returns the solution as JSON.

How to Build an Optimization API

Step 1: Define the Model as a Function

# scheduling_model.py
import pulp

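def solve_scheduling(demand: dict, employees: list, preferences: dict):
    """Solve the weekly staff scheduling problem and return the assignment.

    demand maps (day, shift) to required headcount, with days numbered 1 to 7.
    preferences maps (employee, shift) to a score (higher is better).
    """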
    days = list(range(1, 8))
    shifts = ["Morning", "Afternoon", "Evening"]

    prob = pulp.LpProblem("Staff_Scheduling", pulp.LpMinimize)

    x = {}
    for e in employees:
        for d in days:
            for s in shifts:
                x[(e, d, s)] = pulp.LpVariable(f"x_{e}_{d}_{s}", cat="Binary")

    # Objective: maximize total preference score by minimizing its negative
    prob += pulp.lpSum(
        -preferences.get((e, s), 0) * x[(e, d, s)]
        for e in employees for d in days for s in shifts
        if (e, s) in preferences
    )

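    # Constraint 1: each employee works at most one shift per day
    for e in employees:
        for d in days:
            prob += pulp.lpSum(x[(e, d, s)] for s in shifts) <= 1

    # Constraint 2: each employee works at most 5 shifts per week
    for e in employees:
        prob += pulp.lpSum(x[(e, d, s)] for d in days for s in shifts) <= 5

    # Constraint 3: every shift is staffed to its demand.
    # Day/shift pairs missing from demand default to 0 instead of raising KeyError.
    for d in days:
        for s in shifts:
            prob += pulp.lpSum(x[(e, d, s)] for e in employees) >= demand.get((d, s), 0)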

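    # Use the bundled CBC solver with log output suppressed
    prob.solve(pulp.PULP_CBC_CMD(msg=False))

    # Report the actual solver status (e.g. "infeasible", "unbounded", "not solved")
    if pulp.LpStatus[prob.status] != "Optimal":
        return {"status": pulp.LpStatus[prob.status].lower()}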

    # Build result
    schedule = {}
    for e in employees:
        schedule[e] = {}
        for d in days:
            for s in shifts:
                # Compare with a threshold: solvers may return 0.9999999 for a binary
                if pulp.value(x[(e, d, s)]) > 0.5:
                    schedule[e][f"Day_{d}"] = s

    return {
        "status": "optimal",
        "objective": pulp.value(prob.objective),
        "schedule": schedule,
    }
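
Before calling the solver, it can help to check whether the demand can possibly be met. The following is a minimal sketch, not part of the original model: it derives two necessary (not sufficient) conditions from the constraints above, namely at most one shift per employee per day and at most 5 shifts per employee per week. The function name `precheck_demand` and the constant `MAX_SHIFTS_PER_WEEK` are hypothetical.

```python
from collections import defaultdict

# Assumption: mirrors the "<= 5" weekly cap hardcoded in solve_scheduling
MAX_SHIFTS_PER_WEEK = 5


def precheck_demand(demand: dict, employees: list) -> list:
    """Return a list of readable problems; an empty list means no obvious conflict.

    demand maps (day, shift) tuples to required headcount, as in solve_scheduling.
    """
    problems = []

    # Sum the demand of all shifts on each day
    per_day = defaultdict(int)
    for (day, _shift), required in demand.items():
        per_day[day] += required

    # One shift per employee per day: a day's total demand
    # cannot exceed the number of employees.
    for day, total in sorted(per_day.items()):
        if total > len(employees):
            problems.append(
                f"day {day} needs {total} people but only {len(employees)} are available"
            )

    # Weekly cap: total demand cannot exceed employees * MAX_SHIFTS_PER_WEEK.
    weekly_total = sum(demand.values())
    capacity = MAX_SHIFTS_PER_WEEK * len(employees)
    if weekly_total > capacity:
        problems.append(f"week needs {weekly_total} shifts but capacity is {capacity}")

    return problems


if __name__ == "__main__":
    demand = {(1, "Morning"): 2, (1, "Afternoon"): 1, (1, "Evening"): 1}
    print(precheck_demand(demand, ["Alice", "Bob", "Charlie"]))
```

An empty result does not guarantee feasibility; it only rules out the two cheapest-to-detect causes, which lets the API return a specific message instead of a bare "infeasible".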

Step 2: Wrap with FastAPI

# api.py
from fastapi import FastAPI
from pydantic import BaseModel
from typing import Dict, List

from scheduling_model import solve_scheduling

app = FastAPI(title="Optimization API", version="1.0.0")


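class ScheduleRequest(BaseModel):
    demand: Dict[str, int]  # keyed as "<day>_<shift>", e.g. {"1_Morning": 2}
    employees: List[str]
    # Accepted for forward compatibility; solve_scheduling does not use them yet
    min_days: int = 4
    max_days: int = 5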


class ScheduleResponse(BaseModel):
    status: str
    objective: float = 0
    schedule: Dict = {}


@app.post("/schedule", response_model=ScheduleResponse)
def create_schedule(req: ScheduleRequest):
    """
    Solve staff scheduling problem.

    Send demand requirements and the employee list,
    receive the optimal schedule back.
    """
    # Declared with plain `def` so FastAPI runs the blocking solve in its
    # thread pool instead of stalling the event loop.
    # Convert demand format: "1_Morning" -> (1, "Morning")
    demand = {}
    for key, val in req.demand.items():
        day_str, shift = key.split("_", 1)
        demand[(int(day_str), shift)] = val

    # Default preferences (neutral)
    preferences = {}

    result = solve_scheduling(demand, req.employees, preferences)
    return result


@app.get("/health")
async def health():
    return {"status": "healthy", "solver": "PuLP", "version": "1.0.0"}
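
The demand conversion in the endpoint assumes every key is well formed. As a hedged sketch of stricter parsing, the helper below rejects malformed keys with a readable message. The name `parse_demand` and the constants `VALID_SHIFTS` and `VALID_DAYS` are hypothetical; the allowed values are taken from the days and shifts used in `scheduling_model.py`.

```python
# Assumption: same shifts and day numbering as scheduling_model.py
VALID_SHIFTS = {"Morning", "Afternoon", "Evening"}
VALID_DAYS = range(1, 8)


def parse_demand(raw: dict) -> dict:
    """Convert {"1_Morning": 2} into {(1, "Morning"): 2}.

    Raises ValueError with a descriptive message on malformed input.
    """
    parsed = {}
    for key, required in raw.items():
        # Split on the first underscore only: "<day>_<shift>"
        day_str, _sep, shift = key.partition("_")
        try:
            day = int(day_str)
        except ValueError:
            raise ValueError(f"demand key {key!r} must look like '<day>_<shift>'") from None
        if day not in VALID_DAYS:
            raise ValueError(f"demand key {key!r}: day must be between 1 and 7")
        if shift not in VALID_SHIFTS:
            raise ValueError(f"demand key {key!r}: unknown shift {shift!r}")
        if required < 0:
            raise ValueError(f"demand key {key!r}: headcount must not be negative")
        parsed[(day, shift)] = required
    return parsed


if __name__ == "__main__":
    print(parse_demand({"1_Morning": 2, "2_Evening": 1}))
```

Inside the endpoint, one option is to catch the `ValueError` and raise FastAPI's `HTTPException` with status code 422 so the client receives the message rather than a generic 500 error.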

Step 3: Run the Server

uvicorn api:app --host 0.0.0.0 --port 8000

Step 4: Call from Any Client

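import requests

shifts = ["Morning", "Afternoon", "Evening"]

# Start from zero demand for every day (1-7) and shift, then set the busy days
demand = {f"{d}_{s}": 0 for d in range(1, 8) for s in shifts}
demand.update({
    "1_Morning": 2, "1_Afternoon": 3, "1_Evening": 1,
    "2_Morning": 2, "2_Afternoon": 2, "2_Evening": 1,
})

response = requests.post(
    "http://localhost:8000/schedule",
    json={
        "demand": demand,
        # Day 1 needs 6 people and each employee works at most one shift
        # per day, so at least 6 employees are required.
        "employees": ["Alice", "Bob", "Charlie", "Dana", "Eve", "Frank"],
    },
    timeout=60,  # do not wait forever on a slow solve
)
response.raise_for_status()

print(response.json())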

Advanced: Async Optimization

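For long-running models (supply chain, production planning), do not make the client wait on an open connection. Start the solve in the background, return a job ID immediately, and let the client poll a status endpoint for the result: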

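import asyncio
from uuid import uuid4

from fastapi import HTTPException

# In-memory task store; jobs are lost on restart and not shared across workers
jobs = {}
# Keep references so pending tasks are not garbage-collected
background_tasks = set()


async def run_optimization(job_id: str, req: ScheduleRequest):
    """One possible worker: solve in a thread so the event loop stays responsive"""
    try:
        demand = {}
        for key, val in req.demand.items():
            day_str, shift = key.split("_", 1)
            demand[(int(day_str), shift)] = val
        # asyncio.to_thread requires Python 3.9+
        jobs[job_id] = await asyncio.to_thread(
            solve_scheduling, demand, req.employees, {}
        )
    except Exception as exc:
        jobs[job_id] = {"status": "failed", "error": str(exc)}


@app.post("/optimize/async")
async def start_optimization(req: ScheduleRequest):
    """Start optimization in background, return job ID"""
    job_id = str(uuid4())
    jobs[job_id] = {"status": "running"}

    # Start background task
    task = asyncio.create_task(run_optimization(job_id, req))
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)

    return {"job_id": job_id, "status": "running"}


@app.get("/optimize/status/{job_id}")
async def get_status(job_id: str):
    """Check optimization status"""
    job = jobs.get(job_id)
    if job is None:
        raise HTTPException(status_code=404, detail="job not found")
    return job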
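
On the client side, the job-based endpoints imply a polling loop. The sketch below is one way to write it, under stated assumptions: `wait_for_job` is a hypothetical helper, and the HTTP call is passed in as `fetch_status` so the loop can be tried without a running server.

```python
import time
from typing import Callable


def wait_for_job(
    fetch_status: Callable[[str], dict],
    job_id: str,
    interval: float = 1.0,
    timeout: float = 300.0,
) -> dict:
    """Poll fetch_status(job_id) until the job is no longer 'running'.

    Raises TimeoutError if the job is still running after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        job = fetch_status(job_id)
        if job.get("status") != "running":
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still running after {timeout}s")
        time.sleep(interval)


if __name__ == "__main__":
    # Stand-in for the real status endpoint: running twice, then finished
    replies = iter([{"status": "running"}, {"status": "running"}, {"status": "optimal"}])
    print(wait_for_job(lambda _job_id: next(replies), "demo", interval=0.01))
```

Against the real server, `fetch_status` could be a small function that calls `requests.get` on `/optimize/status/{job_id}` and returns `response.json()`.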

Deployment Considerations

| Aspect | Recommendation |
|--------|---------------|
| Server | At least 2 CPU cores, 4 GB RAM |
| Timeout | Set reasonable API timeout (30-60s) |
| Caching | Cache common problem instances |
| Validation | Validate inputs before solving |
| Error handling | Return clear error messages |
| Monitoring | Track solve times and failure rates |
| Scaling | Multiple workers for concurrent requests |
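
The caching recommendation can be sketched with the standard library alone. This is an illustrative assumption rather than a prescribed design: `cache_key` and `solve_with_cache` are hypothetical names, the cache is an unbounded in-process dict, and `solve` stands for any function that takes the request payload and returns a result.

```python
import hashlib
import json

# Assumption: a simple per-process cache; it grows without bound
_cache = {}


def cache_key(payload: dict) -> str:
    """Hash a canonical JSON form so equal payloads map to the same key."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def solve_with_cache(payload: dict, solve) -> dict:
    """Return the cached result for payload, calling solve(payload) only on a miss."""
    key = cache_key(payload)
    if key not in _cache:
        _cache[key] = solve(payload)
    return _cache[key]


if __name__ == "__main__":
    def fake_solve(payload):
        print("solving...")
        return {"status": "optimal", "employees": len(payload["employees"])}

    request = {"demand": {"1_Morning": 1}, "employees": ["Alice", "Bob"]}
    print(solve_with_cache(request, fake_solve))  # miss: calls fake_solve
    print(solve_with_cache(request, fake_solve))  # hit: served from the cache
```

`sort_keys=True` makes the key independent of dictionary key order, but list order still matters, so `["Alice", "Bob"]` and `["Bob", "Alice"]` are cached separately. With multiple workers, each process holds its own cache unless a shared store is used.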

Performance Benchmarks

| Problem Type | Variables | Solve Time | API Response |
|:-------------|:---------:|:----------:|:------------:|
| Small scheduling | 100 | < 0.5s | < 100ms after model warmup |
| Medium supply chain | 1000 | 2-5s | ~3s |
| Large production | 10000 | 30-60s | ~45s |
| Enterprise | 50000+ | minutes | Use async endpoint |
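
Solve times depend on hardware, solver, and problem structure, so it is worth measuring them on your own deployment. A minimal sketch, assuming an in-memory list is an acceptable metric store for a single process (the names `timed` and `solve_times` are hypothetical):

```python
import time
from functools import wraps

# Assumption: durations in seconds, kept in memory for this process only
solve_times = []


def timed(func):
    """Decorator that records how long each call to func takes."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Recorded even when func raises, so failures are timed too
            solve_times.append(time.perf_counter() - start)
    return wrapper
```

Applying it as `solve_scheduling = timed(solve_scheduling)` in `api.py` would record one duration per request; a production setup would more likely export these values to a monitoring system than keep them in a list.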

The Vibe Coding Approach

"Create a FastAPI endpoint that solves staff scheduling. Accept demand requirements and employee lists as JSON, return optimal schedule. Wrap the PuLP model from Chapter 2."

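An AI coding assistant can generate a first draft of the API code with request/response models. Review it for input validation, error handling, and timeouts before deploying.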

Summary

Deploying optimization models as APIs makes them accessible to applications, users, and automated systems.

Key takeaways:

- Package the ILP model as a Python function
- Wrap with FastAPI for automatic OpenAPI documentation
- Use Pydantic models for request/response validation
- Async endpoints for long-running models
- Add health check endpoint for monitoring
- Set appropriate timeouts based on problem size
- Cache common problem instances
- Monitor solve times and failure rates

What's Next: Course Summary

This course covered ILP basics with PuLP, staff scheduling, multi-project resource allocation, supply chain optimization, and deployment as APIs. You can now model, solve, and deploy a wide range of optimization problems.