MLOps: Machine Learning Operations Guide

DodaTech Updated Jun 20, 2026 7 min read

MLOps (Machine Learning Operations) is the practice of applying DevOps principles to machine learning — automating the pipeline from data preparation through model deployment and monitoring.

What You’ll Learn

By the end of this tutorial, you’ll understand the complete ML lifecycle, experiment tracking with MLflow and Weights & Biases, feature stores, model versioning, CI/CD pipelines for ML, data/model validation, and monitoring strategies. Prerequisites: Python, Machine Learning basics, and familiarity with Model Deployment.

Why It Matters

Without MLOps, ML projects are chaotic — untracked experiments, manual deployments, broken pipelines, and models that silently degrade in production. MLOps brings engineering rigor to ML.

Real-World Use

Spotify runs thousands of ML models in production — recommendation, search, playlist generation. MLOps ensures each model is versioned, monitored, and automatically retrained when performance drops.

MLOps Pipeline


flowchart LR
  A[Data Ingestion] --> B[Data Validation]
  B --> C[Feature Engineering]
  C --> D[Model Training]
  D --> E[Model Evaluation]
  E --> F[Model Registry]
  F --> G[Deployment]
  G --> H[Monitoring]
  H -->|Drift| A
  H -->|Performance Drop| D
  D -->|Experiment Tracking| I[MLflow/W&B]
  C -->|Feature Store| J[Feast/Tecton]

Experiment Tracking

Experiment tracking logs every training run so you can compare results and reproduce the best model.

MLflow

import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
import numpy as np

# Generate random synthetic data (labels are pure noise, so accuracy stays near 0.5)
X_train = np.random.rand(100, 5)
y_train = np.random.randint(0, 2, 100)
X_test = np.random.rand(20, 5)
y_test = np.random.randint(0, 2, 20)

mlflow.set_experiment("classification-demo")

with mlflow.start_run():
    # Log parameters
    n_estimators = 100
    max_depth = 10
    mlflow.log_param("n_estimators", n_estimators)
    mlflow.log_param("max_depth", max_depth)

    # Train
    model = RandomForestClassifier(
        n_estimators=n_estimators,
        max_depth=max_depth,
        random_state=42
    )
    model.fit(X_train, y_train)

    # Log metrics
    preds = model.predict(X_test)
    acc = accuracy_score(y_test, preds)
    mlflow.log_metric("accuracy", acc)

    # Log model
    mlflow.sklearn.log_model(model, "model")

    print(f"Run ID: {mlflow.active_run().info.run_id}")
    print(f"Accuracy: {acc:.3f}")

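Example output (the run ID shown is a placeholder, and the accuracy changes from run to run because the data is random):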

Run ID: a1b2c3d4e5f6g7h8
Accuracy: 0.550
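
Reproducing a run also depends on knowing which data it was trained on. The sketch below shows one way to derive a short fingerprint from a dataset so it can be stored next to the logged parameters. `dataset_fingerprint` is a hypothetical helper, not part of MLflow, and the tag name in the final comment is an assumption.

```python
import hashlib
import json

def dataset_fingerprint(rows, labels):
    """Return a short, stable SHA-256 fingerprint of features and labels."""
    # Serialize deterministically so identical data always hashes the same.
    payload = json.dumps({"rows": rows, "labels": labels}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]

rows = [[0.1, 0.2], [0.3, 0.4]]
labels = [0, 1]

fingerprint = dataset_fingerprint(rows, labels)
changed = dataset_fingerprint(rows, [1, 1])  # one label flipped

print(f"dataset fingerprint: {fingerprint}")
print(f"changes when data changes: {fingerprint != changed}")

# Inside an active MLflow run, the value could be attached as a tag:
# mlflow.set_tag("dataset_sha256", fingerprint)
```

Two runs that report the same fingerprint were trained on identical data, which makes metric comparisons between them meaningful.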

Weights & Biases

import wandb

wandb.init(project="classification-demo", config={
    "n_estimators": 100,
    "max_depth": 10,
    "learning_rate": 0.01
})

# Simulated training loop: log placeholder metrics once per epoch
for epoch in range(10):
    loss = 1.0 / (epoch + 1)
    wandb.log({"epoch": epoch, "loss": loss, "accuracy": 0.5 + epoch * 0.05})

wandb.finish()
print("Run logged to Weights & Biases")

Feature Stores

A feature store is a centralized repository for ML features. Instead of each team re-engineering the same features, they share and reuse them.

# Conceptual example modeled on Feast (open-source feature store); no Feast calls are made
from datetime import datetime
import pandas as pd

# Precomputed feature values, standing in for a materialized feature table
feature_data = pd.DataFrame({
    "user_id": [1, 2, 3],
    "avg_session_duration": [120.5, 45.2, 300.1],
    "num_logins_7d": [15, 3, 42],
    "event_timestamp": [datetime.now()] * 3
})

# In production, features are served via the Feast API
def get_online_features(user_id):
    # Stand-in for an online lookup: read the precomputed row for this user
    row = feature_data.loc[feature_data["user_id"] == user_id].iloc[0]
    return {
        "avg_session_duration": float(row["avg_session_duration"]),
        "num_logins_7d": int(row["num_logins_7d"])
    }

features = get_online_features(1)
print(f"Online features for user 1: {features}")

Expected output:

Online features for user 1: {'avg_session_duration': 120.5, 'num_logins_7d': 15}

Model Versioning and Registry

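import mlflow
from mlflow.tracking import MlflowClient

client = MlflowClient()

# Placeholder run ID: substitute the ID printed by your own training run
model_uri = "runs:/a1b2c3d4e5f6g7h8/model"
model_name = "classification-model"

# Register the logged model as a new version
result = mlflow.register_model(model_uri, model_name)
print(f"Registered model: {result.name} version {result.version}")

# List all versions of this model
# (get_latest_versions returns only the newest version per stage
# and is deprecated in recent MLflow releases)
versions = client.search_model_versions(f"name='{model_name}'")
for v in versions:
    print(f"  Version {v.version}: stage={v.current_stage}, run_id={v.run_id}")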

Expected output:

Registered model: classification-model version 1
  Version 1: stage=None, run_id=a1b2c3d4e5f6g7h8

CI/CD for ML

# .github/workflows/ml-pipeline.yml (conceptual)
name: ML Pipeline
on: [push]

jobs:
  train:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - run: pip install -r requirements.txt
      - run: python train.py  # Trains and logs to MLflow
      - run: python evaluate.py  # Evaluates, fails if metrics below threshold
      - run: python deploy.py  # Promotes to staging/production
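
The workflow above describes `evaluate.py` only through its comment. One possible shape for that script, sketched under assumptions: it compares metrics against minimum values and returns a non-zero exit code, which makes the CI step fail so `deploy.py` never runs. The metric names, threshold values, and `example_metrics` dictionary are hypothetical.

```python
import sys

# Hypothetical minimum values; real thresholds depend on the project.
THRESHOLDS = {"accuracy": 0.80, "recall": 0.70}

def find_failures(metrics, thresholds):
    """Return a message for every metric that is missing or below its minimum."""
    failures = []
    for name, minimum in thresholds.items():
        value = metrics.get(name)
        if value is None:
            failures.append(f"{name}: missing")
        elif value < minimum:
            failures.append(f"{name}: {value:.3f} < {minimum:.3f}")
    return failures

def main(metrics):
    """Return a process exit code: 0 if all checks pass, 1 otherwise."""
    failures = find_failures(metrics, THRESHOLDS)
    for message in failures:
        print(f"FAILED {message}", file=sys.stderr)
    return 1 if failures else 0

# Stand-in for metrics produced by the training step.
example_metrics = {"accuracy": 0.83, "recall": 0.65}
exit_code = main(example_metrics)
print(f"exit code: {exit_code}")

# In the real script the last line would be: sys.exit(main(metrics))
```

Treating a missing metric as a failure is a deliberate choice here: a training step that silently stops reporting a metric should block deployment rather than pass by default.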

Data and Model Validation

# Note: PandasDataset belongs to the legacy Great Expectations API;
# recent releases replaced it with a context-based workflow
from great_expectations.dataset import PandasDataset
import pandas as pd

# Sample incoming data with deliberate problems
df = pd.DataFrame({
    "age": [25, -5, 30, 200],
    "income": [50000, 60000, None, 70000]
})

ds = PandasDataset(df)

# Run expectations; each call returns a result with a "success" flag
expectations = {
    "age in [0, 120]": ds.expect_column_values_to_be_between("age", 0, 120),
    "income not null": ds.expect_column_values_to_not_be_null("income"),
}

for check, result in expectations.items():
    status = "PASS" if result["success"] else "FAIL"
    print(f"{check}: {status}")

Expected output:

age in [0, 120]: FAIL
income not null: FAIL

Monitoring

import time
import random

class ModelMonitor:
    def __init__(self, threshold=0.7):
        self.threshold = threshold
        self.predictions = []

    def log_prediction(self, features, pred, actual=None):
        self.predictions.append({
            "timestamp": time.time(),
            "features": features,
            "prediction": pred,
            "actual": actual
        })

    def calculate_accuracy(self):
        recent = [p for p in self.predictions if p["actual"] is not None]
        if not recent:
            return None
        correct = sum(1 for p in recent if p["prediction"] == p["actual"])
        return correct / len(recent)

    def alert_if_needed(self):
        acc = self.calculate_accuracy()
        if acc is not None and acc < self.threshold:
            print(f"ALERT: Accuracy dropped to {acc:.3f} (threshold: {self.threshold:.3f})")

monitor = ModelMonitor()
for i in range(100):
    monitor.log_prediction(
        features=[random.random() for _ in range(5)],
        pred=random.randint(0, 1),
        actual=random.randint(0, 1)
    )
monitor.alert_if_needed()
print(f"Current accuracy: {monitor.calculate_accuracy():.3f}")

Example output (the accuracy value varies between runs because predictions and labels are random):

ALERT: Accuracy dropped to 0.510 (threshold: 0.700)
Current accuracy: 0.510

Common MLOps Errors

1. No Experiment Tracking

Running 50 experiments without logging parameters or metrics. You won’t know which model to deploy. Always use MLflow, W&B, or similar.

2. Manual Model Promotion

Copying model files to production servers manually. Use a model registry with versioning, staging, and approval workflows.

3. Training-Serving Skew

The preprocessing in training differs from serving. Package your preprocessor with the model or use a feature store.
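
A minimal sketch of packaging the preprocessor with the model, using only the standard library. The scaling statistics and a toy threshold model are stored in one artifact, so the serving side receives raw values and cannot apply different preprocessing. `fit_pipeline`, `predict`, and the artifact layout are hypothetical names for illustration.

```python
import json
from statistics import mean, pstdev

def fit_pipeline(train_values):
    """Learn scaling statistics and return one artifact holding every step."""
    return {
        "scaler": {"mean": mean(train_values), "std": pstdev(train_values) or 1.0},
        "model": {"threshold": 0.0},  # toy model: predict 1 above the threshold
    }

def predict(artifact, raw_values):
    """Apply the stored preprocessing, then the stored model, to raw inputs."""
    scaler, model = artifact["scaler"], artifact["model"]
    scaled = [(v - scaler["mean"]) / scaler["std"] for v in raw_values]
    return [1 if s > model["threshold"] else 0 for s in scaled]

artifact = fit_pipeline([10.0, 20.0, 30.0, 40.0])

# Training side: write a single artifact. Serving side: load it and pass raw values.
serialized = json.dumps(artifact)
served = json.loads(serialized)

print(predict(served, [5.0, 35.0]))  # prints [0, 1]
```

With scikit-learn, `sklearn.pipeline.Pipeline` serves a similar purpose: the fitted preprocessing steps and the estimator are saved and loaded as one object.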

4. No Data Validation

Bad data enters the pipeline silently — null values, out-of-range values, schema changes. Validate data at every stage.
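
The three failure types named above (nulls, out-of-range values, schema changes) can each be checked with a few lines of plain Python. This is a sketch under assumptions: the `SCHEMA` layout and `validate_rows` helper are hypothetical, and a dedicated validation library would normally replace them in a real pipeline.

```python
# Hypothetical schema: column name -> (expected type, minimum, maximum).
SCHEMA = {
    "age": (int, 0, 120),
    "income": (float, 0.0, 1_000_000.0),
}

def validate_rows(rows, schema):
    """Return a list of human-readable problems found in the rows."""
    problems = []
    for index, row in enumerate(rows):
        # Schema change: columns were added, removed, or renamed upstream.
        if set(row) != set(schema):
            problems.append(f"row {index}: columns {sorted(row)} do not match schema")
            continue
        for column, (expected_type, low, high) in schema.items():
            value = row[column]
            if value is None:
                problems.append(f"row {index}: {column} is null")
            elif not isinstance(value, expected_type):
                problems.append(f"row {index}: {column} has type {type(value).__name__}")
            elif not low <= value <= high:
                problems.append(f"row {index}: {column}={value} outside [{low}, {high}]")
    return problems

rows = [
    {"age": 25, "income": 50000.0},
    {"age": -5, "income": 60000.0},   # out of range
    {"age": 30, "income": None},      # null
    {"age": 41, "salary": 70000.0},   # renamed column
]

problems = validate_rows(rows, SCHEMA)
for problem in problems:
    print(problem)
```

Running the same function at ingestion, before training, and before serving gives every stage the same definition of valid data.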

5. Ignoring Model Degradation

Models deployed and forgotten. Set up automated monitoring for drift and performance metrics with alerting.
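
One common way to quantify drift in a single feature is the Population Stability Index (PSI), which compares how a reference sample and a live sample fall into the same bins. The tutorial does not prescribe a drift metric, so the choice of PSI, the bin count, and the `psi` helper below are assumptions made for illustration.

```python
import math

def psi(expected, actual, bins=5):
    """Population Stability Index between a reference and a live sample."""
    low, high = min(expected), max(expected)
    width = (high - low) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp so live values outside the reference range land in an edge bin.
            index = min(max(int((v - low) / width), 0), bins - 1)
            counts[index] += 1
        # A small floor keeps empty bins from producing log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

reference = [i / 100 for i in range(100)]      # training-time feature values
same = [i / 100 for i in range(100)]           # live data, unchanged
shifted = [0.5 + i / 200 for i in range(100)]  # live data, moved upward

psi_same = psi(reference, same)
psi_shifted = psi(reference, shifted)
print(f"PSI unchanged: {psi_same:.3f}")
print(f"PSI shifted:   {psi_shifted:.3f}")
```

A frequently quoted rule of thumb treats PSI above roughly 0.2 as a shift worth investigating, but the alert threshold should be tuned per feature.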

6. No Rollback Plan

New model performs worse than old one. Always keep the previous model version available and have a rollback script ready.
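
A rollback is only possible if the system remembers what was in production before. The sketch below is a toy in-memory registry that records promotion history; `ModelRegistry` and its methods are hypothetical and do not correspond to the API of any real registry.

```python
class ModelRegistry:
    """Toy in-memory registry that remembers which version served before."""

    def __init__(self):
        self.versions = {}  # version number -> model artifact
        self.history = []   # promoted versions, oldest first

    def register(self, artifact):
        version = len(self.versions) + 1
        self.versions[version] = artifact
        return version

    def promote(self, version):
        if version not in self.versions:
            raise KeyError(f"unknown version {version}")
        self.history.append(version)

    def rollback(self):
        """Return to the previously promoted version and report it."""
        if len(self.history) < 2:
            raise RuntimeError("no earlier production version to roll back to")
        self.history.pop()
        return self.history[-1]

    @property
    def production(self):
        return self.history[-1] if self.history else None

registry = ModelRegistry()
v1 = registry.register("model-v1.pkl")
v2 = registry.register("model-v2.pkl")
registry.promote(v1)
registry.promote(v2)            # new model goes live
restored = registry.rollback()  # new model underperforms, so revert
print(f"production version after rollback: {restored}")
```

The old artifact is never deleted on promotion, which is what makes the rollback a lookup instead of a retraining job.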

Practice Questions

1. What is MLOps and why is it important? MLOps applies DevOps principles to ML: automated pipelines, experiment tracking, model versioning, monitoring. It prevents chaos in production ML systems.

2. What does an experiment tracker log? Parameters (hyperparameters), metrics (accuracy, loss), artifacts (model files, plots), and metadata (code version, dataset hash).

3. What’s the purpose of a feature store? Centralized repository for ML features that enables reuse, consistency, and online/offline serving. Prevents teams from re-engineering the same features.

4. How do you detect training-serving skew? Compare statistics of training data vs serving data. Use data validation libraries (Great Expectations, TensorFlow Data Validation) to catch discrepancies.

5. Challenge: Build an end-to-end MLOps pipeline. Create a GitHub repo with experiment tracking (MLflow), automated training (GitHub Actions), a model registry, and a monitoring dashboard. Train, deploy, and monitor a model.

FAQ

What's the difference between MLOps and DevOps?

DevOps focuses on code deployment. MLOps adds data pipelines, experiment tracking, model versioning, and monitoring for model-specific issues like drift.

Do I need MLOps for small projects?

For small teams or prototypes, simple scripts suffice. Add MLOps practices when you have multiple models, multiple team members, or production deployments.

What's the best MLOps tool?

There’s no single best tool. MLflow for tracking, Feast for feature store, Great Expectations for validation, and Kubernetes for deployment are a common stack.

How long does it take to implement MLOps?

Basic MLOps (experiment tracking + model registry) can be set up in a day. Full MLOps (CI/CD, monitoring, feature store) takes weeks to months depending on complexity.

Try It Yourself

Mini Project: ML Pipeline Automation

Build a GitHub Actions workflow that automatically trains a model when new data is pushed, evaluates it against a threshold, and promotes it to a model registry if it passes. Security angle: Durga Antivirus Pro uses MLOps pipelines to continuously retrain threat detection models as new malware samples are discovered — ensuring detection rates stay above 99%.

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.

What’s Next

Congratulations on completing this MLOps tutorial! Here’s where to go from here:

Practice daily — Set up MLflow tracking for your next ML project
Build a project — Create a full CI/CD pipeline for a model
Explore related topics — Check out Hyperparameter Tuning and Model Evaluation

Remember: every expert was once a beginner. Keep coding!
