MLOps Basics — Versioning, Pipelines and Monitoring
This tutorial introduces the basics of MLOps through key concepts, practical examples, and best practices you can apply to your own projects.
MLOps applies DevOps principles to machine learning workflows, enabling teams to version datasets, automate training pipelines, track experiments, and monitor models in production reliably and at scale.
What You'll Learn
How to version control data and models with DVC, track experiments with MLflow, build reproducible training pipelines, and monitor production models for data drift and performance degradation.
Why It Matters
Without MLOps practices, ML projects often fail in production: models degrade silently, experiments cannot be reproduced, and deployments become manual, fragile processes. Companies that adopt MLOps deploy models 5x faster with fewer production incidents.
Real-World Use
Durga Antivirus Pro uses an MLflow-backed MLOps pipeline where each model version is tracked from training data checksum to deployment timestamp, enabling instant rollback if a new model version causes false positive surges.
MLOps Workflow
flowchart LR
    A[Data Versioning] --> B[Experiment Tracking]
    B --> C[Model Registry]
    C --> D[CI/CD Pipeline]
    D --> E[Canary Deployment]
    E --> F[Monitoring]
    F --> G[Drift Detection]
    G --> H[Retrain Trigger]
    H --> A
Experiment Tracking with MLflow
import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Synthetic binary classification data; fixed seeds keep the split repeatable
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

mlflow.set_experiment("model-comparison")

with mlflow.start_run(run_name="random-forest-v1"):
    params = {'n_estimators': 100, 'max_depth': 10}
    model = RandomForestClassifier(**params, random_state=42)
    model.fit(X_train, y_train)
    preds = model.predict(X_test)

    metrics = {
        'accuracy': accuracy_score(y_test, preds),
        'precision': precision_score(y_test, preds),
        'recall': recall_score(y_test, preds)
    }

    # Log hyperparameters, evaluation metrics, and the fitted model artifact
    mlflow.log_params(params)
    mlflow.log_metrics(metrics)
    mlflow.sklearn.log_model(model, "model")

    # active_run() is only set inside the "with" block
    print(f"Run ID: {mlflow.active_run().info.run_id}")
    print(f"Metrics logged: {metrics}")
Example output (the run ID and exact metric values will differ on your machine):
Run ID: a1b2c3d4e5f6g7h8i9j0
Metrics logged: {'accuracy': 0.945, 'precision': 0.938, 'recall': 0.952}
MLflow logs every parameter, metric, and the model artifact itself. You can compare runs in the MLflow UI or query them programmatically.
Data Versioning with DVC
import os
import hashlib

import numpy as np
import pandas as pd

def create_dataset_v1():
    # Random, unseeded data: the checksum changes on every run
    df = pd.DataFrame({
        'feature_1': np.random.randn(1000),
        'feature_2': np.random.randn(1000),
        'target': np.random.randint(0, 2, 1000)
    })
    os.makedirs('data', exist_ok=True)  # to_csv fails if the directory is missing
    df.to_csv('data/dataset.csv', index=False)

    # Hash the file bytes to fingerprint this exact dataset version
    with open('data/dataset.csv', 'rb') as f:
        checksum = hashlib.md5(f.read()).hexdigest()

    print("Dataset v1 saved to data/dataset.csv")
    print(f"MD5 checksum: {checksum}")
    print(f"Shape: {df.shape}")

create_dataset_v1()
Example output (the checksum will differ because the data is random):
Dataset v1 saved to data/dataset.csv
MD5 checksum: 4e8b2f1a9c3d7e6f5b0a2c8d1e3f4a5b
Shape: (1000, 3)
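The script computes the checksum by hand only to show the idea. In practice, running dvc add data/dataset.csv records the MD5 checksum in a small .dvc pointer file and copies the data into DVC's local cache, and dvc push uploads it to remote storage such as S3 or GCS. The Git repo holds only the pointer files, not the data itself.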
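To make the pointer idea concrete, here is a minimal sketch of content-addressed storage in plain Python. It is not DVC's implementation or API: the track and restore helpers, the .pointer.json format, and the temporary workspace are all hypothetical, chosen only to show how a small pointer file plus a checksum-named cache can bring back an earlier dataset version.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def file_md5(path: Path) -> str:
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def track(data_file: Path, cache_dir: Path) -> Path:
    """Copy data_file into a checksum-named cache slot and write a pointer file."""
    checksum = file_md5(data_file)
    cache_dir.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(data_file, cache_dir / checksum)
    # Hypothetical pointer format; real .dvc files are YAML with more fields.
    pointer = data_file.parent / (data_file.name + ".pointer.json")
    pointer.write_text(json.dumps({"md5": checksum, "path": data_file.name}))
    return pointer


def restore(pointer: Path, cache_dir: Path) -> Path:
    """Recreate the data file named in the pointer from the cache."""
    meta = json.loads(pointer.read_text())
    target = pointer.parent / meta["path"]
    shutil.copyfile(cache_dir / meta["md5"], target)
    return target


workspace = Path(tempfile.mkdtemp())
cache = workspace / "cache"
data = workspace / "dataset.csv"

# Version 1 of the dataset.
data.write_text("feature_1,target\n0.5,1\n")
pointer = track(data, cache)
v1_md5 = json.loads(pointer.read_text())["md5"]

# Changing the data changes the checksum, so the cache keeps both versions.
data.write_text("feature_1,target\n0.5,1\n0.7,0\n")
pointer = track(data, cache)
v2_md5 = json.loads(pointer.read_text())["md5"]

# Roll back: put the v1 pointer in place (the role Git plays) and restore.
pointer.write_text(json.dumps({"md5": v1_md5, "path": data.name}))
restored = restore(pointer, cache)
print(v1_md5 != v2_md5, file_md5(restored) == v1_md5)  # prints: True True
```

Only the tiny pointer file would need to live in version control; the cache directory can sit on any storage backend.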
Reproducible Pipeline
import json
import hashlib

PIPELINE_CONFIG = {
    'steps': [
        {'name': 'ingest', 'input': 'data/dataset.csv', 'output': 'data/processed.csv'},
        {'name': 'train', 'input': 'data/processed.csv', 'output': 'models/model.pkl'},
        {'name': 'evaluate', 'input': 'models/model.pkl', 'output': 'metrics.json'}
    ]
}

def run_pipeline():
    # Placeholder steps: a real pipeline would run each stage here
    print("Running ML pipeline...")
    print("Step 1/3: Data ingestion")
    print("Step 2/3: Model training")
    print("Step 3/3: Evaluation")

    # Hash the key-sorted config so identical configurations share a version
    version_hash = hashlib.md5(json.dumps(PIPELINE_CONFIG, sort_keys=True).encode()).hexdigest()
    print(f"\nPipeline version: {version_hash}")

    # Hard-coded example metrics, tagged with the pipeline version
    metrics = {'accuracy': 0.945, 'f1': 0.941, 'pipeline_version': version_hash}
    print(f"\nPipeline complete. Metrics: {metrics}")
    return metrics

result = run_pipeline()
Example output (the version hash shown is illustrative):
Running ML pipeline...
Step 1/3: Data ingestion
Step 2/3: Model training
Step 3/3: Evaluation
Pipeline version: a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2
Pipeline complete. Metrics: {'accuracy': 0.945, 'f1': 0.941, 'pipeline_version': 'a7b8c9d0e1f2a3b4c5d6e7f8a9b0c1d2'}
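Each pipeline configuration gets its own version hash. Note that this example hashes only PIPELINE_CONFIG, so a change to the data or the code alone does not change the hash. To make every result traceable, the hash also needs to cover checksums of the input data and the code.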
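A version hash that reacts to data and code changes, not just configuration, could be sketched as follows. The pipeline_version helper, the file names, and the sample config are assumptions made for illustration; the technique is simply to hash the config together with the checksum of each tracked file.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_md5(path: Path) -> str:
    """MD5 hex digest of a file's bytes."""
    return hashlib.md5(path.read_bytes()).hexdigest()


def pipeline_version(config: dict, tracked_files: list[Path]) -> str:
    """Hash the config together with the checksum of every tracked file."""
    fingerprint = {
        "config": config,
        # Keyed by file name so the order of tracked_files does not matter.
        "files": {p.name: file_md5(p) for p in tracked_files},
    }
    payload = json.dumps(fingerprint, sort_keys=True).encode()
    return hashlib.md5(payload).hexdigest()


# Hypothetical data and code files in a temporary directory.
workdir = Path(tempfile.mkdtemp())
data_file = workdir / "dataset.csv"
code_file = workdir / "train.py"
data_file.write_text("feature_1,target\n0.5,1\n")
code_file.write_text("print('training')\n")
config = {"steps": ["ingest", "train", "evaluate"], "max_depth": 10}

v_base = pipeline_version(config, [data_file, code_file])
v_same = pipeline_version(config, [code_file, data_file])  # same inputs, other order

# New data with an unchanged config produces a new version.
data_file.write_text("feature_1,target\n0.5,1\n0.7,0\n")
v_new_data = pipeline_version(config, [data_file, code_file])

# A config change on top of that produces yet another version.
v_new_config = pipeline_version({**config, "max_depth": 12}, [data_file, code_file])

print(v_base == v_same, v_base != v_new_data, v_new_data != v_new_config)
# prints: True True True
```

Storing this hash next to the logged metrics lets you trace any result back to the exact data, code, and configuration that produced it.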
Model Monitoring for Drift
import numpy as np
from scipy.stats import ks_2samp

def detect_data_drift(reference_data, production_data, threshold=0.05):
    print(f"Reference mean: {np.mean(reference_data):.4f}, std: {np.std(reference_data):.4f}")
    print(f"Production mean: {np.mean(production_data):.4f}, std: {np.std(production_data):.4f}")

    # Two-sample KS test: a small p-value means the distributions differ
    statistic, p_value = ks_2samp(reference_data, production_data)
    drift_detected = p_value < threshold

    print(f"\nKS statistic: {statistic:.4f}")
    print(f"P-value: {p_value:.6f}")
    print(f"Drift detected: {drift_detected}")
    return drift_detected

np.random.seed(42)
reference = np.random.normal(0, 1, 1000)
production_normal = np.random.normal(0.1, 1.1, 1000)   # small shift
production_drifted = np.random.normal(2.0, 1.5, 1000)  # large shift

print("=== Normal conditions ===")
detect_data_drift(reference, production_normal)

print("\n=== Drifted conditions ===")
detect_data_drift(reference, production_drifted)
Example output (values are illustrative; rerun the script for exact figures):
=== Normal conditions ===
Reference mean: -0.0307, std: 0.9831
Production mean: 0.0821, std: 1.0947
KS statistic: 0.0420
P-value: 0.328415
Drift detected: False
=== Drifted conditions ===
Reference mean: -0.0307, std: 0.9831
Production mean: 2.0218, std: 1.4812
KS statistic: 0.7120
P-value: 0.000000
Drift detected: True
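The two-sample Kolmogorov-Smirnov test compares the distribution of production data with the reference (training) data. A p-value below the threshold (0.05 here) flags a statistically significant difference, and that drift flag can then be used to trigger model retraining.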
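The sketch below shows what the KS statistic measures and how a drift flag might feed a retraining decision. It is a hand-rolled, standard-library stand-in for scipy's ks_2samp: instead of a p-value it compares the statistic with the usual large-sample critical value. The should_retrain helper, the feature names, and the 30% drifted-feature cutoff are assumptions made for illustration.

```python
import bisect
import math
import random


def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical CDFs (the two-sample KS statistic)."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    # The gap can only change at observed values, so checking those is enough.
    for x in a + b:
        cdf_a = bisect.bisect_right(a, x) / len(a)
        cdf_b = bisect.bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def critical_value(n, m, alpha=0.05):
    """Large-sample rejection threshold for the KS statistic at level alpha."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    return c_alpha * math.sqrt((n + m) / (n * m))


def should_retrain(reference, production, alpha=0.05, min_drifted_fraction=0.3):
    """Return (retrain flag, drifted feature names) for dict-of-lists inputs."""
    drifted = []
    for name, ref_values in reference.items():
        prod_values = production[name]
        d = ks_statistic(ref_values, prod_values)
        if d > critical_value(len(ref_values), len(prod_values), alpha):
            drifted.append(name)
    # Assumed policy: retrain once enough features have drifted.
    return len(drifted) / len(reference) >= min_drifted_fraction, drifted


rng = random.Random(42)
reference = {
    "feature_1": [rng.gauss(0, 1) for _ in range(1000)],
    "feature_2": [rng.gauss(5, 2) for _ in range(1000)],
}
production = {
    "feature_1": [rng.gauss(2.0, 1.5) for _ in range(1000)],  # strongly shifted
    "feature_2": list(reference["feature_2"]),  # identical, so the statistic is 0
}

retrain, drifted = should_retrain(reference, production)
print(retrain, drifted)  # prints: True ['feature_1']
```

In a real system the retrain flag would start a pipeline run rather than print a value, and checking many features at once usually calls for a stricter alpha to limit false alarms.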
Practice Questions
- What is the difference between model versioning in MLflow and data versioning in DVC?
- Why is experiment tracking important for ML teams?
- How would you set up an automated retraining trigger based on drift detection?
Related Topics
- Python — core language for the pipeline
- Docker for Beginners — containers for reproducible environments
- What is Machine Learning — foundational concepts
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.