November 23, 2025
15 min read

Junior Machine Learning Engineer Interview Questions: Complete Guide

Tags: interview, career-advice, job-search, entry-level

By Milad Bonakdar

Master ML engineering fundamentals with essential interview questions covering Python, ML algorithms, model training, deployment basics, and MLOps for junior machine learning engineers.


Introduction

Machine Learning Engineers build, deploy, and maintain ML systems in production. Junior ML engineers are expected to have strong programming skills, a solid understanding of ML algorithms, hands-on experience with ML frameworks, and familiarity with deployment practices.

This guide covers essential interview questions for Junior Machine Learning Engineers. We explore Python programming, ML algorithms, model training and evaluation, deployment basics, and MLOps fundamentals to help you prepare for your first ML engineering role.


Python & Programming (5 Questions)

1. How do you handle large datasets that don't fit in memory?

Answer: Several techniques handle data larger than available RAM:

  • Batch Processing: Process data in chunks
  • Generators: Yield data on-demand
  • Dask/Ray: Distributed computing frameworks
  • Database Queries: Load only needed data
  • Memory-Mapped Files: Access disk as if in memory
  • Data Streaming: Process data as it arrives
import pandas as pd
import numpy as np

# Bad: Load entire dataset into memory
# df = pd.read_csv('large_file.csv')  # May crash

# Good: Read in chunks
chunk_size = 10000
for chunk in pd.read_csv('large_file.csv', chunksize=chunk_size):
    # Process each chunk
    processed = chunk[chunk['value'] > 0]
    # Save or aggregate results
    processed.to_csv('output.csv', mode='a', header=False)

# Using generators (process_line is a placeholder for your own parsing logic)
def data_generator(filename, batch_size=32):
    while True:  # loop forever so the generator can feed multiple training epochs
        batch = []
        with open(filename, 'r') as f:
            for line in f:
                batch.append(process_line(line))
                if len(batch) == batch_size:
                    yield np.array(batch)
                    batch = []

# Dask for distributed computing
import dask.dataframe as dd
ddf = dd.read_csv('large_file.csv')
result = ddf.groupby('category').mean().compute()
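
The bullet list above also mentions memory-mapped files; a minimal sketch with numpy.memmap (the filename, dtype, and shape are illustrative and must match how the file was written):
big_array = np.memmap('features.dat', dtype='float32', mode='r', shape=(10_000_000, 50))
batch = big_array[0:10000]  # only this slice is actually read from disk
print(batch.mean(axis=0))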

Rarity: Very Common | Difficulty: Medium


2. Explain decorators in Python and give an ML use case.

Answer: Decorators modify or enhance functions without changing their code.

  • Use Cases in ML:
    • Timing function execution
    • Logging predictions
    • Caching results
    • Input validation
import time
import functools
import numpy as np  # used by the logging decorator below

# Timing decorator
def timer(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        print(f"{func.__name__} took {end - start:.2f} seconds")
        return result
    return wrapper

# Logging decorator
def log_predictions(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        predictions = func(*args, **kwargs)
        print(f"Made {len(predictions)} predictions")
        print(f"Prediction distribution: {np.bincount(predictions)}")
        return predictions
    return wrapper

# Usage
@timer
@log_predictions
def predict_batch(model, X):
    return model.predict(X)

# Caching decorator (memoization)
def cache_results(func):
    cache = {}
    @functools.wraps(func)
    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return wrapper

@cache_results
def expensive_feature_engineering(data_id):
    # Expensive computation
    return processed_features
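
The bullets above also list input validation; a small sketch of a validation decorator (the expected_features value is an assumption for illustration):
def validate_input(expected_features):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(model, X, *args, **kwargs):
            X = np.asarray(X)
            if X.ndim != 2 or X.shape[1] != expected_features:
                raise ValueError(f"Expected 2D input with {expected_features} features, got {X.shape}")
            return func(model, X, *args, **kwargs)
        return wrapper
    return decorator

@validate_input(expected_features=4)
def predict_validated(model, X):
    return model.predict(X)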

Rarity: Common | Difficulty: Medium


3. What is the difference between @staticmethod and @classmethod?

Answer: Both define methods that can be called on the class without creating an instance, but they differ in what they can access.

  • @staticmethod: No access to class or instance
  • @classmethod: Receives class as first argument
class MLModel:
    model_type = "classifier"
    
    def __init__(self, name):
        self.name = name
        self.model = None  # a trained estimator would be attached here
    
    # Regular method - needs instance
    def predict(self, X):
        return self.model.predict(X)
    
    # Static method - utility function
    @staticmethod
    def preprocess_data(X):
        # No access to self or cls
        return (X - X.mean()) / X.std()
    
    # Class method - factory pattern
    @classmethod
    def create_default(cls):
        # Has access to cls
        return cls(name=f"default_{cls.model_type}")
    
    @classmethod
    def from_config(cls, config):
        return cls(name=config['name'])

# Usage
# Static method - no instance needed
processed = MLModel.preprocess_data(X_train)

# Class method - creates instance
model = MLModel.create_default()
model2 = MLModel.from_config({'name': 'my_model'})

Rarity: Medium | Difficulty: Medium


4. How do you handle exceptions in ML pipelines?

Answer: Proper error handling prevents pipeline failures and aids debugging.

import logging
from typing import Optional

# Setup logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class ModelTrainingError(Exception):
    """Custom exception for model training failures"""
    pass

def train_model(X, y, model_type='random_forest'):
    try:
        logger.info(f"Starting training with {model_type}")
        
        # Validate inputs
        if X.shape[0] != y.shape[0]:
            raise ValueError("X and y must have same number of samples")
        
        if X.shape[0] < 100:
            raise ModelTrainingError("Insufficient training data")
        
        # Train model
        if model_type == 'random_forest':
            from sklearn.ensemble import RandomForestClassifier
            model = RandomForestClassifier()
        else:
            raise ValueError(f"Unknown model type: {model_type}")
        
        model.fit(X, y)
        logger.info("Training completed successfully")
        return model
        
    except ValueError as e:
        logger.error(f"Validation error: {e}")
        raise
    except ModelTrainingError as e:
        logger.error(f"Training error: {e}")
        # Could fall back to a simpler model (train_fallback_model is a placeholder)
        return train_fallback_model(X, y)
    except Exception as e:
        logger.error(f"Unexpected error: {e}")
        raise
    finally:
        logger.info("Training attempt finished")

# Context manager for resource management
class ModelLoader:
    def __init__(self, model_path):
        self.model_path = model_path
        self.model = None
    
    def __enter__(self):
        logger.info(f"Loading model from {self.model_path}")
        self.model = load_model(self.model_path)
        return self.model
    
    def __exit__(self, exc_type, exc_val, exc_tb):
        logger.info("Cleaning up resources")
        if self.model:
            del self.model
        return False  # Don't suppress exceptions

# Usage
with ModelLoader('model.pkl') as model:
    predictions = model.predict(X_test)

Rarity: Common | Difficulty: Medium


5. What are Python generators and why are they useful in ML?

Answer: Generators yield values one at a time, saving memory.

  • Benefits:
    • Memory efficient
    • Lazy evaluation
    • Infinite sequences
  • ML Use Cases:
    • Data loading
    • Batch processing
    • Data augmentation
import numpy as np

# Generator for batch processing
def batch_generator(X, y, batch_size=32):
    n_samples = len(X)
    indices = np.arange(n_samples)
    np.random.shuffle(indices)
    
    for start_idx in range(0, n_samples, batch_size):
        end_idx = min(start_idx + batch_size, n_samples)
        batch_indices = indices[start_idx:end_idx]
        yield X[batch_indices], y[batch_indices]

# Usage in training
for epoch in range(10):
    for X_batch, y_batch in batch_generator(X_train, y_train):
        model.train_on_batch(X_batch, y_batch)

# Data augmentation generator
def augment_images(images, labels):
    for img, label in zip(images, labels):
        # Original
        yield img, label
        # Flipped
        yield np.fliplr(img), label
        # Rotated
        yield np.rot90(img), label

# Infinite generator for training
def infinite_batch_generator(X, y, batch_size=32):
    while True:
        indices = np.random.choice(len(X), batch_size)
        yield X[indices], y[indices]

# Use with steps_per_epoch
gen = infinite_batch_generator(X_train, y_train)
# model.fit(gen, steps_per_epoch=100, epochs=10)

Rarity: Common | Difficulty: Medium


ML Algorithms & Theory (5 Questions)

6. Explain the difference between bagging and boosting.

Answer: Both are ensemble methods but work differently:

  • Bagging (Bootstrap Aggregating):
    • Parallel training on random subsets
    • Reduces variance
    • Example: Random Forest
  • Boosting:
    • Sequential training, each model corrects previous errors
    • Reduces bias
    • Examples: AdaBoost, Gradient Boosting, XGBoost
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

# Generate data
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)

# Bagging - Random Forest
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf_scores = cross_val_score(rf, X, y, cv=5)
print(f"Random Forest CV: {rf_scores.mean():.3f} (+/- {rf_scores.std():.3f})")

# Boosting - Gradient Boosting
gb = GradientBoostingClassifier(n_estimators=100, random_state=42)
gb_scores = cross_val_score(gb, X, y, cv=5)
print(f"Gradient Boosting CV: {gb_scores.mean():.3f} (+/- {gb_scores.std():.3f})")

# XGBoost (advanced boosting)
import xgboost as xgb
xgb_model = xgb.XGBClassifier(n_estimators=100, random_state=42)
xgb_scores = cross_val_score(xgb_model, X, y, cv=5)
print(f"XGBoost CV: {xgb_scores.mean():.3f} (+/- {xgb_scores.std():.3f})")

Rarity: Very Common | Difficulty: Medium


7. How do you handle imbalanced datasets?

Answer: Imbalanced data can bias models toward the majority class.

  • Techniques:
    • Resampling: SMOTE, undersampling
    • Class weights: Penalize misclassification
    • Ensemble methods: Balanced Random Forest
    • Evaluation: Use F1, precision, recall (not accuracy)
    • Threshold adjustment: Optimize decision threshold
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from imblearn.over_sampling import SMOTE
from imblearn.under_sampling import RandomUnderSampler
from collections import Counter

# Create imbalanced dataset
X, y = make_classification(
    n_samples=1000, n_features=20,
    weights=[0.95, 0.05],  # 95% class 0, 5% class 1
    random_state=42
)

print(f"Original distribution: {Counter(y)}")

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y, random_state=42)

# 1. Class weights
model_weighted = RandomForestClassifier(class_weight='balanced', random_state=42)
model_weighted.fit(X_train, y_train)
print("\nWith class weights:")
print(classification_report(y_test, model_weighted.predict(X_test)))

# 2. SMOTE (oversampling minority class)
smote = SMOTE(random_state=42)
X_train_smote, y_train_smote = smote.fit_resample(X_train, y_train)
print(f"After SMOTE: {Counter(y_train_smote)}")

model_smote = RandomForestClassifier(random_state=42)
model_smote.fit(X_train_smote, y_train_smote)
print("\nWith SMOTE:")
print(classification_report(y_test, model_smote.predict(X_test)))
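
# 2b. Random undersampling of the majority class (uses RandomUnderSampler imported above)
rus = RandomUnderSampler(random_state=42)
X_train_rus, y_train_rus = rus.fit_resample(X_train, y_train)
print(f"After undersampling: {Counter(y_train_rus)}")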

# 3. Threshold adjustment
y_proba = model_weighted.predict_proba(X_test)[:, 1]
threshold = 0.3  # Lower threshold favors minority class
y_pred_adjusted = (y_proba >= threshold).astype(int)
print("\nWith adjusted threshold:")
print(classification_report(y_test, y_pred_adjusted))

Rarity: Very Common | Difficulty: Medium


8. What is cross-validation and why is it important?

Answer: Cross-validation evaluates model performance more reliably than a single train-test split.

  • Types:
    • K-Fold: Split into k folds
    • Stratified K-Fold: Preserves class distribution
    • Time Series Split: Respects temporal order
  • Benefits:
    • More robust performance estimate
    • Uses all data for training and validation
    • Detects overfitting
from sklearn.model_selection import (
    cross_val_score, KFold, StratifiedKFold,
    TimeSeriesSplit, cross_validate
)
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

# Load data
data = load_iris()
X, y = data.data, data.target

model = RandomForestClassifier(random_state=42)

# Standard K-Fold
kfold = KFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=kfold)
print(f"K-Fold scores: {scores}")
print(f"Mean: {scores.mean():.3f} (+/- {scores.std():.3f})")

# Stratified K-Fold (preserves class distribution)
stratified_kfold = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
stratified_scores = cross_val_score(model, X, y, cv=stratified_kfold)
print(f"\nStratified K-Fold: {stratified_scores.mean():.3f}")

# Time Series Split (for temporal data) - shown here only to illustrate the API;
# iris is not time-ordered, so these scores are not meaningful
tscv = TimeSeriesSplit(n_splits=5)
ts_scores = cross_val_score(model, X, y, cv=tscv)
print(f"Time Series CV: {ts_scores.mean():.3f}")

# Multiple metrics
cv_results = cross_validate(
    model, X, y, cv=5,
    scoring=['accuracy', 'precision_macro', 'recall_macro', 'f1_macro'],
    return_train_score=True
)

print(f"\nAccuracy: {cv_results['test_accuracy'].mean():.3f}")
print(f"Precision: {cv_results['test_precision_macro'].mean():.3f}")
print(f"Recall: {cv_results['test_recall_macro'].mean():.3f}")
print(f"F1: {cv_results['test_f1_macro'].mean():.3f}")

Rarity: Very Common | Difficulty: Easy


9. Explain precision, recall, and F1-score.

Answer: Classification metrics for evaluating model performance:

  • Precision: Of predicted positives, how many are correct
    • Formula: TP / (TP + FP)
    • Use when: False positives are costly
  • Recall: Of actual positives, how many were found
    • Formula: TP / (TP + FN)
    • Use when: False negatives are costly
  • F1-Score: Harmonic mean of precision and recall
    • Formula: 2 × (Precision × Recall) / (Precision + Recall)
    • Use when: Need balance between precision and recall
from sklearn.metrics import (
    precision_score, recall_score, f1_score,
    classification_report, confusion_matrix
)
import numpy as np

# Example predictions
y_true = np.array([0, 1, 1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([0, 1, 0, 0, 1, 1, 0, 1, 1, 0])

# Calculate metrics
precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = f1_score(y_true, y_pred)

print(f"Precision: {precision:.3f}")  # 0.800
print(f"Recall: {recall:.3f}")        # 0.800
print(f"F1-Score: {f1:.3f}")          # 0.800

# Confusion matrix
cm = confusion_matrix(y_true, y_pred)
print(f"\nConfusion Matrix:\n{cm}")
# [[4 1]
#  [1 4]]

# Classification report (all metrics)
print("\nClassification Report:")
print(classification_report(y_true, y_pred))

# Trade-off example (assumes a fitted classifier `model` and held-out X_test, y_test)
from sklearn.metrics import precision_recall_curve
import matplotlib.pyplot as plt

# Get probability predictions
y_proba = model.predict_proba(X_test)[:, 1]

# Calculate precision-recall at different thresholds
precisions, recalls, thresholds = precision_recall_curve(y_test, y_proba)

# Find optimal threshold (maximize F1)
f1_scores = 2 * (precisions * recalls) / (precisions + recalls + 1e-10)
optimal_idx = np.argmax(f1_scores)
optimal_threshold = thresholds[optimal_idx]
print(f"Optimal threshold: {optimal_threshold:.3f}")

Rarity: Very Common | Difficulty: Easy


10. What is regularization and when would you use it?

Answer: Regularization prevents overfitting by penalizing model complexity.

  • Types:
    • L1 (Lasso): Adds the sum of absolute coefficient values to the loss; drives some coefficients to exactly zero (built-in feature selection)
    • L2 (Ridge): Adds the sum of squared coefficients to the loss; shrinks coefficients toward zero without eliminating them
    • Elastic Net: Combines L1 and L2 penalties
  • When to use:
    • High variance (overfitting)
    • Many features relative to the number of samples
    • Multicollinearity among features
from sklearn.linear_model import Ridge, Lasso, ElasticNet, LinearRegression
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
import numpy as np

# Generate data with many features
X, y = make_regression(
    n_samples=100, n_features=50,
    n_informative=10, noise=10, random_state=42
)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

# No regularization
lr = LinearRegression()
lr.fit(X_train, y_train)
lr_score = r2_score(y_test, lr.predict(X_test))
print(f"Linear Regression R²: {lr_score:.3f}")

# L2 Regularization (Ridge)
ridge = Ridge(alpha=1.0)
ridge.fit(X_train, y_train)
ridge_score = r2_score(y_test, ridge.predict(X_test))
print(f"Ridge R²: {ridge_score:.3f}")

# L1 Regularization (Lasso)
lasso = Lasso(alpha=0.1)
lasso.fit(X_train, y_train)
lasso_score = r2_score(y_test, lasso.predict(X_test))
print(f"Lasso R²: {lasso_score:.3f}")
print(f"Lasso non-zero coefficients: {np.sum(lasso.coef_ != 0)}")

# Elastic Net (L1 + L2)
elastic = ElasticNet(alpha=0.1, l1_ratio=0.5)
elastic.fit(X_train, y_train)
elastic_score = r2_score(y_test, elastic.predict(X_test))
print(f"Elastic Net R²: {elastic_score:.3f}")

# Hyperparameter tuning for alpha
from sklearn.model_selection import GridSearchCV

param_grid = {'alpha': [0.001, 0.01, 0.1, 1.0, 10.0]}
grid = GridSearchCV(Ridge(), param_grid, cv=5)
grid.fit(X_train, y_train)
print(f"\nBest alpha: {grid.best_params_['alpha']}")
print(f"Best CV score: {grid.best_score_:.3f}")

Rarity: Very Common | Difficulty: Medium


Model Training & Deployment (5 Questions)

11. How do you save and load models in production?

Answer: Model serialization enables deployment and reuse.

import joblib
import pickle
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris

# Train model
X, y = load_iris(return_X_y=True)
model = RandomForestClassifier()
model.fit(X, y)

# Method 1: Joblib (recommended for scikit-learn)
joblib.dump(model, 'model.joblib')
loaded_model = joblib.load('model.joblib')

# Method 2: Pickle (only unpickle files from trusted sources - loading can execute arbitrary code)
with open('model.pkl', 'wb') as f:
    pickle.dump(model, f)

with open('model.pkl', 'rb') as f:
    loaded_model = pickle.load(f)

# For TensorFlow/Keras
import tensorflow as tf

# Save entire model
# keras_model.save('model.h5')
# loaded_model = tf.keras.models.load_model('model.h5')

# Save weights only
# keras_model.save_weights('model_weights.h5')
# new_model.load_weights('model_weights.h5')

# For PyTorch
import torch

# Save model state dict
# torch.save(model.state_dict(), 'model.pth')
# model.load_state_dict(torch.load('model.pth'))

# Save entire model
# torch.save(model, 'model_complete.pth')
# model = torch.load('model_complete.pth')

# Version control for models
import datetime

model_version = datetime.datetime.now().strftime("%Y%m%d_%H%M%S")
model_path = f'models/model_{model_version}.joblib'
joblib.dump(model, model_path)
print(f"Model saved to {model_path}")

Rarity: Very Common | Difficulty: Easy


12. How do you create a REST API for model serving?

Answer: REST APIs make models accessible to applications.

from flask import Flask, request, jsonify
import joblib
import numpy as np

app = Flask(__name__)

# Load model at startup
model = joblib.load('model.joblib')

@app.route('/predict', methods=['POST'])
def predict():
    try:
        # Get data from request
        data = request.get_json()
        features = np.array(data['features']).reshape(1, -1)
        
        # Make prediction
        prediction = model.predict(features)
        probability = model.predict_proba(features)
        
        # Return response
        return jsonify({
            'prediction': int(prediction[0]),
            'probability': probability[0].tolist(),
            'status': 'success'
        })
    
    except Exception as e:
        return jsonify({
            'error': str(e),
            'status': 'error'
        }), 400

@app.route('/health', methods=['GET'])
def health():
    return jsonify({'status': 'healthy'})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

# FastAPI alternative (modern, faster) - a separate implementation,
# not meant to live in the same file as the Flask app above
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

class PredictionRequest(BaseModel):
    features: list

class PredictionResponse(BaseModel):
    prediction: int
    probability: list
    status: str

@app.post("/predict", response_model=PredictionResponse)
async def predict(request: PredictionRequest):
    try:
        features = np.array(request.features).reshape(1, -1)
        prediction = model.predict(features)
        probability = model.predict_proba(features)
        
        return PredictionResponse(
            prediction=int(prediction[0]),
            probability=probability[0].tolist(),
            status="success"
        )
    except Exception as e:
        raise HTTPException(status_code=400, detail=str(e))

# Usage:
# curl -X POST "http://localhost:5000/predict" \
#      -H "Content-Type: application/json" \
#      -d '{"features": [5.1, 3.5, 1.4, 0.2]}'

Rarity: Very Common | Difficulty: Medium


13. What is Docker and why is it useful for ML deployment?

Answer: Docker containers package applications with all dependencies.

  • Benefits:
    • Reproducibility
    • Consistency across environments
    • Easy deployment
    • Isolation
# Dockerfile for ML model
FROM python:3.9-slim

WORKDIR /app

# Copy requirements
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy model and code
COPY model.joblib .
COPY app.py .

# Expose port
EXPOSE 5000

# Run application
CMD ["python", "app.py"]
# Build Docker image
docker build -t ml-model:v1 .

# Run container
docker run -p 5000:5000 ml-model:v1

# Docker Compose for multi-container setup
# docker-compose.yml
version: '3.8'
services:
  model:
    build: .
    ports:
      - "5000:5000"
    environment:
      - MODEL_PATH=/app/model.joblib
    volumes:
      - ./models:/app/models
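
Reproducibility also depends on pinning dependencies in requirements.txt; the versions below are purely illustrative - pin whatever you actually develop and test against:
numpy==1.24.4
scikit-learn==1.3.0
joblib==1.3.2
flask==2.3.3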

Rarity: Common | Difficulty: Medium


14. How do you monitor model performance in production?

Answer: Monitoring detects model degradation and ensures reliability.

  • What to Monitor:
    • Prediction metrics: Accuracy, latency
    • Data drift: Input distribution changes
    • Model drift: Performance degradation
    • System metrics: CPU, memory, errors
import logging
import time
from datetime import datetime
import numpy as np

class ModelMonitor:
    def __init__(self, model):
        self.model = model
        self.predictions = []
        self.actuals = []
        self.latencies = []
        self.input_stats = []
        
        # Setup logging
        logging.basicConfig(
            filename='model_monitor.log',
            level=logging.INFO,
            format='%(asctime)s - %(message)s'
        )
    
    def predict(self, X):
        # Track input statistics
        self.input_stats.append({
            'mean': X.mean(),
            'std': X.std(),
            'min': X.min(),
            'max': X.max()
        })
        
        # Measure latency
        start = time.time()
        prediction = self.model.predict(X)
        latency = time.time() - start
        
        self.predictions.append(prediction)
        self.latencies.append(latency)
        
        # Log prediction
        logging.info(f"Prediction: {prediction}, Latency: {latency:.3f}s")
        
        # Alert if latency too high
        if latency > 1.0:
            logging.warning(f"High latency detected: {latency:.3f}s")
        
        return prediction
    
    def log_actual(self, y_true):
        self.actuals.append(y_true)
        
        # Calculate accuracy if we have enough data
        if len(self.actuals) >= 100:
            accuracy = np.mean(
                np.array(self.predictions[-100:]) == np.array(self.actuals[-100:])
            )
            logging.info(f"Rolling accuracy (last 100): {accuracy:.3f}")
            
            if accuracy < 0.8:
                logging.error(f"Model performance degraded: {accuracy:.3f}")
    
    def check_data_drift(self, reference_stats):
        if not self.input_stats:
            return
        
        current_stats = self.input_stats[-1]
        
        # Simple drift detection (compare means)
        mean_diff = abs(current_stats['mean'] - reference_stats['mean'])
        if mean_diff > 2 * reference_stats['std']:
            logging.warning(f"Data drift detected: mean diff = {mean_diff:.3f}")

# Usage
monitor = ModelMonitor(model)
prediction = monitor.predict(X_new)
# Later, when actual label is available
monitor.log_actual(y_true)
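
A slightly stronger drift check than comparing means is a two-sample Kolmogorov-Smirnov test per feature; a minimal sketch using scipy (the 0.05 significance level is a common but arbitrary choice):
from scipy.stats import ks_2samp

def detect_feature_drift(reference_values, current_values, alpha=0.05):
    # Compare the distribution of one feature at training time vs. in production
    statistic, p_value = ks_2samp(reference_values, current_values)
    if p_value < alpha:
        logging.warning(f"Drift detected: KS={statistic:.3f}, p={p_value:.4f}")
        return True
    return False

# Example: compare feature 0 in training data vs. recent production inputs
# detect_feature_drift(X_train[:, 0], X_recent[:, 0])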

Rarity: Common | Difficulty: Medium


15. What is CI/CD for machine learning?

Answer: CI/CD automates testing and deployment of ML models.

  • Continuous Integration:
    • Automated testing
    • Code quality checks
    • Model validation
  • Continuous Deployment:
    • Automated deployment
    • Rollback capabilities
    • A/B testing
# .github/workflows/ml-pipeline.yml
name: ML Pipeline

on:
  push:
    branches: [ main ]
  pull_request:
    branches: [ main ]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      
      - name: Set up Python
        uses: actions/setup-python@v2
        with:
          python-version: 3.9
      
      - name: Install dependencies
        run: |
          pip install -r requirements.txt
          pip install pytest
      
      - name: Run tests
        run: pytest tests/
      
      - name: Train model
        run: python train.py
      
      - name: Validate model
        run: python validate_model.py
  
  deploy:
    needs: test
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
      - name: Deploy to production
        run: |
          docker build -t ml-model:latest .
          docker push ml-model:latest
# tests/test_model.py
import pytest
import joblib
import numpy as np
from sklearn.datasets import load_iris

def test_model_accuracy():
    from train import train_model  # train.py: the project's training script
    X, y = load_iris(return_X_y=True)
    model, accuracy = train_model(X, y)
    assert accuracy > 0.9, f"Model accuracy {accuracy} below threshold"

def test_model_prediction_shape():
    model = joblib.load('model.joblib')
    X_test = np.random.rand(10, 4)
    predictions = model.predict(X_test)
    assert predictions.shape == (10,), "Unexpected prediction shape"

def test_model_prediction_range():
    model = joblib.load('model.joblib')
    X_test = np.random.rand(10, 4)
    predictions = model.predict(X_test)
    assert all(p in [0, 1, 2] for p in predictions), "Invalid predictions"
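
The validate_model.py step in the workflow acts as a quality gate before deployment; a minimal sketch (the file name, data source, and 0.9 threshold are assumptions for illustration):
# validate_model.py
import sys
import joblib
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

model = joblib.load('model.joblib')
X, y = load_iris(return_X_y=True)
_, X_val, _, y_val = train_test_split(X, y, test_size=0.3, random_state=42)

accuracy = accuracy_score(y_val, model.predict(X_val))
print(f"Validation accuracy: {accuracy:.3f}")

if accuracy < 0.9:
    print("Model failed validation - blocking deployment")
    sys.exit(1)  # non-zero exit code fails the CI job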

Rarity: Medium | Difficulty: Hard

