Model Deployment: Integrating ML into Applications

A high-accuracy model that only lives in a Jupyter notebook delivers no business value. The value comes from integrating the model into a real application, where its predictions can drive decisions and generate revenue.

Saving and Loading Models

Training a model can take minutes or hours, so retraining it on every web request is not an option. Instead, save the trained model to disk once and load it when needed.

Saving Models with Joblib

import joblib

# Train the model first (rf_model, X_train, y_train are assumed to exist already)
rf_model.fit(X_train, y_train)

# Save model to file
joblib.dump(rf_model, 'churn_model.pkl')

# Also save the scaler: new inputs must be scaled the same way as the training data
joblib.dump(scaler, 'churn_scaler.pkl')

print("Model saved as churn_model.pkl")
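Because the scaler and the model always have to travel together, one variation is to bundle them into a single scikit-learn Pipeline and save one file instead of two. The sketch below is illustrative rather than the chapter's method: it trains on synthetic data, and the file name churn_pipeline.pkl and all variable names are assumptions.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: 200 customers, 7 features (not real churn data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 7))
y = (X[:, 0] + X[:, 3] > 0).astype(int)

# The pipeline applies scaling and then the classifier in one object.
pipeline = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(n_estimators=20, random_state=0),
)
pipeline.fit(X, y)

# Save and reload a single artifact (hypothetical file name).
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "churn_pipeline.pkl"
    joblib.dump(pipeline, path)
    restored = joblib.load(path)

# The restored pipeline should reproduce the original predictions.
same_predictions = bool(np.array_equal(pipeline.predict(X), restored.predict(X)))
print("Round trip preserved predictions:", same_predictions)
```

With this layout, prediction code would pass raw feature values straight to restored.predict, and the separate scaler.transform step would no longer be needed.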

Loading the Model

# Load saved model and scaler
loaded_model = joblib.load('churn_model.pkl')
loaded_scaler = joblib.load('churn_scaler.pkl')

# Predict using the loaded model
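# Feature values, in training order: tenure_months, monthly_charges, total_charges,
# num_support_tickets, has_contract, avg_order_value, num_complaints
new_customer = [[12, 89.5, 1074, 2, 0, 45.2, 1]]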
new_customer_scaled = loaded_scaler.transform(new_customer)
prediction = loaded_model.predict(new_customer_scaled)
probability = loaded_model.predict_proba(new_customer_scaled)[:, 1]

print(f"Prediction: {'Will Churn' if prediction[0] == 1 else 'Will Not Churn'}")
print(f"Churn probability: {probability[0]:.2%}")
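The positional list above only works if the values arrive in exactly the order the model was trained on. One way to make that explicit is to name the order once and build rows from dictionaries. This is a minimal sketch: FEATURE_ORDER and to_feature_row are hypothetical names, and the order is assumed to match training (it mirrors the request fields used by the API later in this chapter).

```python
# Hypothetical helper: keep the feature order in one place.
FEATURE_ORDER = [
    "tenure_months",
    "monthly_charges",
    "total_charges",
    "num_support_tickets",
    "has_contract",
    "avg_order_value",
    "num_complaints",
]


def to_feature_row(customer: dict) -> list:
    """Return a single-row 2D list ordered like FEATURE_ORDER."""
    missing = [name for name in FEATURE_ORDER if name not in customer]
    if missing:
        raise ValueError(f"Missing features: {missing}")
    return [[customer[name] for name in FEATURE_ORDER]]


# Dictionary keys can come in any order; the helper restores the expected one.
new_customer = to_feature_row({
    "monthly_charges": 89.5,
    "tenure_months": 12,
    "total_charges": 1074,
    "num_support_tickets": 2,
    "has_contract": 0,
    "avg_order_value": 45.2,
    "num_complaints": 1,
})
print(new_customer)
```

The resulting row can be passed to loaded_scaler.transform exactly like the hand-written list.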

Building a Prediction API with FastAPI

Next, we will build a prediction API using FastAPI so the frontend can call it to predict churn in real time.

Installing FastAPI

pip install fastapi uvicorn

Creating the API Server

Create a file named predict_api.py:

from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import numpy as np

# Load saved model and scaler
model = joblib.load('churn_model.pkl')
scaler = joblib.load('churn_scaler.pkl')

# Create FastAPI app
app = FastAPI(title="Customer Churn Prediction API")

# Define request data format
class CustomerData(BaseModel):
    tenure_months: int           # Months of tenure
    monthly_charges: float       # Monthly charge
    total_charges: float         # Total charges
    num_support_tickets: int     # Support tickets
    has_contract: int            # Has contract (0/1)
    avg_order_value: float       # Avg order value
    num_complaints: int          # Complaints

# Define response format
class PredictionResult(BaseModel):
    will_churn: bool
    churn_probability: float
    churn_risk_level: str

@app.get("/health")
def health_check():
    return {"status": "ok", "model": "churn_prediction"}

@app.post("/predict", response_model=PredictionResult)
def predict_churn(customer: CustomerData):
    # Convert input data to model format
    features = np.array([[
        customer.tenure_months,
        customer.monthly_charges,
        customer.total_charges,
        customer.num_support_tickets,
        customer.has_contract,
        customer.avg_order_value,
        customer.num_complaints
    ]])
    
    # Apply scaling
    features_scaled = scaler.transform(features)
    
    # Make prediction
    prediction = model.predict(features_scaled)[0]
    probability = model.predict_proba(features_scaled)[0, 1]
    
    # Determine risk level
    if probability >= 0.7:
        risk_level = "High Risk"
    elif probability >= 0.4:
        risk_level = "Medium Risk"
    else:
        risk_level = "Low Risk"
    
    return PredictionResult(
        will_churn=bool(prediction),
        churn_probability=round(float(probability), 4),
        churn_risk_level=risk_level
    )

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
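The risk thresholds are easier to unit-test when they are pulled out of the endpoint into a plain function. A minimal sketch, assuming the same 0.7 and 0.4 cut-offs as in predict_churn; the function name classify_risk is hypothetical.

```python
def classify_risk(probability: float) -> str:
    """Map a churn probability in [0, 1] to a risk label."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {probability}")
    if probability >= 0.7:
        return "High Risk"
    if probability >= 0.4:
        return "Medium Risk"
    return "Low Risk"


# Quick check across the three bands.
for p in (0.8732, 0.55, 0.1):
    print(p, classify_risk(p))
```

Inside the endpoint, the if/elif/else block would then reduce to a single call such as risk_level = classify_risk(probability).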

Starting the API Server

python predict_api.py
# Or, during development (--reload restarts the server when the code changes)
uvicorn predict_api:app --host 0.0.0.0 --port 8000 --reload

Testing the API

Open your browser to http://localhost:8000/docs to see the interactive Swagger API documentation!

You can also test with curl:

curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "tenure_months": 3,
    "monthly_charges": 89.5,
    "total_charges": 268.5,
    "num_support_tickets": 5,
    "has_contract": 0,
    "avg_order_value": 35.0,
    "num_complaints": 3
  }'

Example response:

{
  "will_churn": true,
  "churn_probability": 0.8732,
  "churn_risk_level": "High Risk"
}
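The same request can also be sent from Python using only the standard library, which is handy for scripts and smoke tests. This is a hedged sketch: build_predict_request and predict_churn are hypothetical helper names, and the URL assumes the server from this chapter is running locally on port 8000.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/predict"  # assumes the local server above


def build_predict_request(customer: dict, url: str = API_URL) -> urllib.request.Request:
    """Build a JSON POST request for the /predict endpoint."""
    body = json.dumps(customer).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict_churn(customer: dict) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_predict_request(customer), timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


sample = {
    "tenure_months": 3,
    "monthly_charges": 89.5,
    "total_charges": 268.5,
    "num_support_tickets": 5,
    "has_contract": 0,
    "avg_order_value": 35.0,
    "num_complaints": 3,
}

# Building the request does not touch the network; predict_churn(sample) does.
request = build_predict_request(sample)
print(request.get_method(), request.full_url)
```

Calling predict_churn(sample) while the server is up should return a dictionary with the will_churn, churn_probability, and churn_risk_level keys.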

Integrating with the Frontend

In your frontend (Next.js / React), call the API using fetch:

// React / Next.js calling the ML API
// showAlert and sendCoupon are app-specific helpers defined elsewhere
async function predictChurn(customerData) {
  const response = await fetch('http://localhost:8000/predict', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(customerData)
  });
  if (!response.ok) {
    // e.g. 422 when the payload fails validation
    throw new Error(`Prediction request failed with status ${response.status}`);
  }
  const result = await response.json();

  if (result.will_churn) {
    showAlert(`⚠️ High risk customer! Churn probability: ${(result.churn_probability * 100).toFixed(0)}%`);
    sendCoupon(customerData.userId); // Auto-send discount coupon
  }
  
  return result;
}

Dockerizing the ML API

Finally, let us containerize the ML API with Docker for cloud deployment:

FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so this layer is cached when only the code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code and model artifacts
COPY predict_api.py .
COPY churn_model.pkl .
COPY churn_scaler.pkl .

# Expose port
EXPOSE 8000

# Start the API
CMD ["uvicorn", "predict_api:app", "--host", "0.0.0.0", "--port", "8000"]

requirements.txt:

fastapi
uvicorn
joblib
numpy
pydantic
scikit-learn
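Unpinned requirements carry a risk here: scikit-learn does not guarantee that a model pickled with one version loads correctly under another. One hedged approach is to record the library versions at training time and compare them when the container starts. The helper name find_version_mismatches and the version numbers below are placeholders for illustration, not recommendations.

```python
def find_version_mismatches(recorded: dict, installed: dict) -> list:
    """Return readable mismatches between two {package: version} maps."""
    problems = []
    for package, expected in sorted(recorded.items()):
        actual = installed.get(package)
        if actual is None:
            problems.append(f"{package}: expected {expected}, not installed")
        elif actual != expected:
            problems.append(f"{package}: expected {expected}, found {actual}")
    return problems


# Placeholder versions for illustration only.
recorded_at_training = {"scikit-learn": "1.4.2", "numpy": "1.26.4"}
installed_in_container = {"scikit-learn": "1.5.0", "numpy": "1.26.4"}

for problem in find_version_mismatches(recorded_at_training, installed_in_container):
    print(problem)
```

At runtime, the installed map could be filled in with importlib.metadata.version(name) for each package, and the recorded versions could be pinned in requirements.txt with the == syntax.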

Summary

In this chapter, you learned:

✅ Model Persistence: Save and load trained models with Joblib
✅ FastAPI: Build a machine learning prediction API
✅ Frontend Integration: Call the ML API from React/Next.js
✅ Docker Deployment: Containerize the ML API for cloud deployment
✅ Complete Pipeline: From training -> saving -> API -> frontend integration

What Is Next: Time Series Forecasting with Prophet

The final chapter covers time series forecasting — predicting future values based on historical data. You will learn how Facebook's Prophet library can forecast revenue, traffic, and stock trends with just a few lines of code.