Model Deployment: Integrating ML into Applications
A high-accuracy model that lives only in a Jupyter notebook delivers little business value. The value comes from integrating the model into an application where it can inform decisions and generate revenue.
Saving and Loading Models
Training a model can take minutes or hours, so you cannot retrain it on every web request. The correct approach is to save the trained model to disk once and load it when the application needs it.
Saving Models with Joblib
import joblib
# After the model is trained...
rf_model.fit(X_train, y_train)
# Save model to file
joblib.dump(rf_model, 'churn_model.pkl')
# Also save the scaler (needed for prediction too!)
joblib.dump(scaler, 'churn_scaler.pkl')
print("Model saved as churn_model.pkl")
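Keeping the model and the scaler in two files means they can drift apart, for example when one is retrained and the other is not. One way to avoid that is to wrap both in a scikit-learn `Pipeline` and persist a single artifact. The sketch below assumes the scaler is a `StandardScaler` and the model a `RandomForestClassifier`, which the code above does not show; the synthetic data and the file name `churn_pipeline.pkl` are illustrative only.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the churn training data (7 features, binary label).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 7))
y_train = (X_train[:, 0] + X_train[:, 3] > 0).astype(int)

# Scaling and classification travel together as one estimator.
churn_pipeline = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(n_estimators=50, random_state=0),
)
churn_pipeline.fit(X_train, y_train)

# Persist a single file instead of separate model and scaler files.
artifact_path = os.path.join(tempfile.mkdtemp(), "churn_pipeline.pkl")
joblib.dump(churn_pipeline, artifact_path)

# Loading restores both steps, so raw (unscaled) features go straight in.
restored = joblib.load(artifact_path)
sample = X_train[:5]
same_predictions = bool(
    (restored.predict(sample) == churn_pipeline.predict(sample)).all()
)
print("Round trip preserved predictions:", same_predictions)
```

With this layout the prediction code calls `restored.predict(...)` directly and no separate `transform` step is needed.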
Loading the Model
# Load saved model and scaler
loaded_model = joblib.load('churn_model.pkl')
loaded_scaler = joblib.load('churn_scaler.pkl')
# Predict using the loaded model
new_customer = [[12, 89.5, 1074, 2, 0, 45.2, 1]] # Feature values, in the same column order used for training
new_customer_scaled = loaded_scaler.transform(new_customer)
prediction = loaded_model.predict(new_customer_scaled)
probability = loaded_model.predict_proba(new_customer_scaled)[:, 1]
print(f"Prediction: {'Will Churn' if prediction[0] == 1 else 'Will Not Churn'}")
print(f"Churn probability: {probability[0]:.2%}")
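The model only sees positions, not names, so the values in `new_customer` must appear in exactly the order used during training. A small helper can make that order explicit. This is a sketch: the feature names are taken from the API schema defined later in this chapter, and `FEATURE_ORDER` and `to_feature_row` are hypothetical names.

```python
# Assumed training-time column order (hypothetical constant).
FEATURE_ORDER = [
    "tenure_months",
    "monthly_charges",
    "total_charges",
    "num_support_tickets",
    "has_contract",
    "avg_order_value",
    "num_complaints",
]


def to_feature_row(customer: dict) -> list:
    """Return a single-row 2D list in training order, rejecting bad keys."""
    missing = [name for name in FEATURE_ORDER if name not in customer]
    extra = [name for name in customer if name not in FEATURE_ORDER]
    if missing or extra:
        raise ValueError(f"missing={missing}, unexpected={extra}")
    return [[customer[name] for name in FEATURE_ORDER]]


# Keys may arrive in any order; the helper restores the training order.
customer = {
    "num_complaints": 1,
    "tenure_months": 12,
    "has_contract": 0,
    "monthly_charges": 89.5,
    "avg_order_value": 45.2,
    "total_charges": 1074,
    "num_support_tickets": 2,
}
new_customer = to_feature_row(customer)
print(new_customer)  # [[12, 89.5, 1074, 2, 0, 45.2, 1]]
```

The returned row can be passed to `loaded_scaler.transform` exactly like the hand-written list.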
Building a Prediction API with FastAPI
Next, we will build a prediction API using FastAPI so the frontend can call it to predict churn in real time.
Installing FastAPI
pip install fastapi uvicorn
Creating the API Server
Create a file named predict_api.py:
from fastapi import FastAPI
from pydantic import BaseModel
import joblib
import numpy as np
# Load saved model and scaler
model = joblib.load('churn_model.pkl')
scaler = joblib.load('churn_scaler.pkl')
# Create FastAPI app
app = FastAPI(title="Customer Churn Prediction API")
# Define request data format
class CustomerData(BaseModel):
    tenure_months: int        # Months of tenure
    monthly_charges: float    # Monthly charge
    total_charges: float      # Total charges
    num_support_tickets: int  # Support tickets
    has_contract: int         # Has contract (0/1)
    avg_order_value: float    # Avg order value
    num_complaints: int       # Complaints
# Define response format
class PredictionResult(BaseModel):
    will_churn: bool
    churn_probability: float
    churn_risk_level: str
@app.get("/health")
def health_check():
    return {"status": "ok", "model": "churn_prediction"}
@app.post("/predict", response_model=PredictionResult)
def predict_churn(customer: CustomerData):
    # Convert input data to model format
    features = np.array([[
        customer.tenure_months,
        customer.monthly_charges,
        customer.total_charges,
        customer.num_support_tickets,
        customer.has_contract,
        customer.avg_order_value,
        customer.num_complaints
    ]])
    # Apply scaling
    features_scaled = scaler.transform(features)
    # Make prediction
    prediction = model.predict(features_scaled)[0]
    probability = model.predict_proba(features_scaled)[0, 1]
    # Determine risk level
    if probability >= 0.7:
        risk_level = "High Risk"
    elif probability >= 0.4:
        risk_level = "Medium Risk"
    else:
        risk_level = "Low Risk"
    return PredictionResult(
        will_churn=bool(prediction),
        churn_probability=round(float(probability), 4),
        churn_risk_level=risk_level
    )
if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
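The risk thresholds are easier to test and tune when they live outside the request handler. A minimal sketch, using the same 0.7 and 0.4 cut-offs as the endpoint above; the function name `risk_level_for` is hypothetical.

```python
def risk_level_for(probability: float) -> str:
    """Map a churn probability in [0, 1] to the chapter's three risk labels."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError(f"probability must be in [0, 1], got {probability}")
    if probability >= 0.7:
        return "High Risk"
    if probability >= 0.4:
        return "Medium Risk"
    return "Low Risk"


# Boundary values fall into the higher bucket because the checks use >=.
for p in (0.8732, 0.7, 0.4, 0.39):
    print(p, risk_level_for(p))
```

Inside `predict_churn`, the if/elif chain would then reduce to `risk_level = risk_level_for(probability)`.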
Starting the API Server
python predict_api.py
# Or
uvicorn predict_api:app --host 0.0.0.0 --port 8000 --reload
Testing the API
Open your browser to http://localhost:8000/docs to see the interactive Swagger API documentation!
You can also test with curl:
curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{
    "tenure_months": 3,
    "monthly_charges": 89.5,
    "total_charges": 268.5,
    "num_support_tickets": 5,
    "has_contract": 0,
    "avg_order_value": 35.0,
    "num_complaints": 3
  }'
Example response:
{
  "will_churn": true,
  "churn_probability": 0.8732,
  "churn_risk_level": "High Risk"
}
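The same request can be sent from Python, which is handy for smoke tests or batch scripts. The sketch below uses only the standard library; `build_predict_request` and `predict_remote` are hypothetical helpers, and the default base URL assumes the server from this chapter is running locally on port 8000.

```python
import json
import urllib.request


def build_predict_request(customer: dict, base_url: str = "http://localhost:8000"):
    """Build the POST request for /predict without sending it."""
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=json.dumps(customer).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict_remote(customer: dict, base_url: str = "http://localhost:8000",
                   timeout: float = 5.0) -> dict:
    """Send one customer record to the API and return the decoded JSON body."""
    request = build_predict_request(customer, base_url)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses,
    # e.g. 422 when FastAPI rejects a payload that fails validation.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Same payload as the curl example.
example_payload = {
    "tenure_months": 3,
    "monthly_charges": 89.5,
    "total_charges": 268.5,
    "num_support_tickets": 5,
    "has_contract": 0,
    "avg_order_value": 35.0,
    "num_complaints": 3,
}
# With the server running: result = predict_remote(example_payload)
```

Building the request separately from sending it keeps the payload logic testable without a live server.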
Integrating with the Frontend
In your frontend (Next.js / React), call the API using fetch:
// React / Next.js calling the ML API
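async function predictChurn(customerData) {
  const response = await fetch('http://localhost:8000/predict', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(customerData)
  });
  // fetch does not reject on HTTP errors such as 422 or 500, so check explicitly
  if (!response.ok) {
    throw new Error(`Prediction request failed with status ${response.status}`);
  }
  const result = await response.json();
  if (result.will_churn) {
    // showAlert and sendCoupon are app-specific helpers defined elsewhere
    showAlert(`⚠️ High risk customer! Churn probability: ${(result.churn_probability * 100).toFixed(0)}%`);
    sendCoupon(customerData.userId); // Auto-send discount coupon
  }
  return result;
}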
Dockerizing the ML API
Finally, let us containerize the ML API with Docker for cloud deployment:
FROM python:3.11-slim
WORKDIR /app
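# Copy the dependency list first so the install layer is cached between code changes
COPY requirements.txt .
# Install packages
RUN pip install --no-cache-dir -r requirements.txt
# Copy the application and model artifacts
COPY predict_api.py .
COPY churn_model.pkl .
COPY churn_scaler.pkl .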
# Expose port
EXPOSE 8000
# Start the API
CMD ["uvicorn", "predict_api:app", "--host", "0.0.0.0", "--port", "8000"]
requirements.txt:
fastapi
uvicorn
joblib
numpy
pydantic
scikit-learn
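The requirements file above lists no versions, so the image may install a newer scikit-learn than the one that produced `churn_model.pkl`, and scikit-learn does not guarantee that pickled models load correctly across versions. A hedged sketch of generating pinned lines from the training environment follows; `pinned_requirements` is a hypothetical helper, and its `lookup` parameter exists only to make it testable.

```python
from importlib import metadata

PACKAGES = ["fastapi", "uvicorn", "joblib", "numpy", "pydantic", "scikit-learn"]


def pinned_requirements(packages, lookup=metadata.version):
    """Return 'name==version' lines, leaving uninstalled packages unpinned."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={lookup(name)}")
        except metadata.PackageNotFoundError:
            lines.append(name)  # not installed here, so no version to pin
    return lines


# Run this in the environment that trained the model, then write the
# result to requirements.txt before building the image.
print("\n".join(pinned_requirements(PACKAGES)))
```

Pinning this way means the container loads the model with the same library versions that saved it.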
Summary
In this chapter, you learned:
- ✅ Model Persistence: Save and load trained models with Joblib
- ✅ FastAPI: Build a machine learning prediction API
- ✅ Frontend Integration: Call the ML API from React/Next.js
- ✅ Docker Deployment: Containerize the ML API for cloud deployment
- ✅ Complete Pipeline: From training -> saving -> API -> frontend integration
What Is Next: Time Series Forecasting with Prophet
The final chapter covers time series forecasting — predicting future values based on historical data. You will learn how Facebook's Prophet library can forecast revenue, traffic, and stock trends with just a few lines of code.