Mani Khandan
MLOps/LLMOps Engineer at Wipro Technologies - Lloyds Banking Group
Chennai

Work Experience
Wipro Technologies - Lloyds Banking Group
Aug 2021 - Present (4 yrs)
- Designed and implemented scalable ML pipelines using Git, GitHub Actions, DVC, and Apache Airflow, enabling automated, reproducible, and maintainable workflows through GitOps practices.
- Deployed and managed production ML models using Docker, FastAPI, and Nginx on AWS EC2, supporting secure, real-time inference and scalable deployment (illustrative sketch below).
- Orchestrated ML workflows across cloud-native environments with Kubernetes and Airflow, improving pipeline reliability and reducing operational overhead.
- Automated experiment tracking, model versioning, and model registry integration using MLflow and DagsHub to ensure traceability, reproducibility, and audit readiness.
- Built MLOps infrastructure on AWS and Azure, leveraging Fargate, ECS, ECR, and the Boto3 SDK for efficient, serverless ML pipeline orchestration.
- Developed and deployed models on AWS SageMaker with continuous training, enabling rapid training, evaluation, and deployment with integrated model monitoring.
- Monitored ML systems and pipelines using Prometheus, Grafana, and WhyLabs to deliver proactive anomaly detection and track model/data drift in real time.
- Configured secure deployment pipelines using Nginx as a reverse proxy, improving performance and safeguarding APIs in production environments.
- Implemented full GitOps automation using the Argo suite (Events, Workflows, CD, Rollouts), enabling reliable CI/CD, canary releases, and automatic rollbacks for ML deployments.
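A minimal sketch of the kind of FastAPI inference service described in this role, assuming a model registered in MLflow; the model name "credit-risk-classifier" and the input fields are hypothetical placeholders, and in a production setup the service would run in a Docker container behind Nginx on EC2:

```python
# Sketch only (not the production code): FastAPI service that loads a model
# from an MLflow model registry and serves JSON predictions.
# The registered model name and feature schema below are hypothetical.
import mlflow.pyfunc
import pandas as pd
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="ml-inference-service")

# Load the registered model once at startup; "Production" is an MLflow stage label.
model = mlflow.pyfunc.load_model("models:/credit-risk-classifier/Production")

class Features(BaseModel):
    # Hypothetical input schema for illustration only.
    age: float
    income: float
    balance: float

@app.get("/health")
def health():
    return {"status": "ok"}

@app.post("/predict")
def predict(payload: Features):
    # Convert the validated request body into the tabular form the model expects.
    frame = pd.DataFrame([payload.dict()])
    prediction = model.predict(frame)
    return {"prediction": prediction.tolist()}
```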
Wipro Technologies - Lloyds Banking Group
Aug 2019 - Aug 2021 (2 yrs)
- Developed and deployed machine learning models to production, focusing on predictive analytics, classification, and real-time inference use cases.
- Preprocessed large-scale datasets with efficient feature engineering, integrating feature store concepts to ensure consistency across training and serving.
- Designed and automated scalable data pipelines for training and evaluation using Python, Scikit-learn, and TensorFlow, improving workflow reproducibility.
- Built end-to-end ML workflows covering data preprocessing, training, and model registry integration for production-ready deployment and model tracking.
- Conducted A/B testing and model evaluation using metrics such as accuracy, precision, recall, F1-score, and ROC-AUC to ensure high model performance in production.
- Tuned hyperparameters and applied continuous training strategies to improve accuracy, reduce model drift, and maintain robust inference over time.
- Leveraged MLflow for experiment tracking, model versioning, and the model registry, ensuring reproducibility and traceability across the ML lifecycle (illustrative sketch below).
- Implemented monitoring and logging using Prometheus and Grafana, enabling live performance tracking, anomaly detection, and metric-based model promotion.
- Integrated CI/CD pipelines using GitHub Actions and Docker for automated model testing, containerization, and deployment across cloud environments.
- Collaborated cross-functionally with data engineers and product teams to align ML solutions with business objectives, reducing model deployment time by 30%.
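A minimal sketch of the MLflow experiment-tracking workflow referenced above, assuming scikit-learn; the experiment name, dataset, and hyperparameters are illustrative only, not the original project values:

```python
# Sketch only: a training run that logs parameters, evaluation metrics
# (accuracy, F1, ROC-AUC), and the fitted model to an MLflow experiment.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

mlflow.set_experiment("churn-classifier")  # hypothetical experiment name

# Synthetic data stands in for the real feature-engineered dataset.
X, y = make_classification(n_samples=5_000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

params = {"n_estimators": 200, "max_depth": 8}

with mlflow.start_run():
    model = RandomForestClassifier(**params, random_state=42).fit(X_train, y_train)
    preds = model.predict(X_test)
    proba = model.predict_proba(X_test)[:, 1]

    mlflow.log_params(params)
    mlflow.log_metrics({
        "accuracy": accuracy_score(y_test, preds),
        "f1": f1_score(y_test, preds),
        "roc_auc": roc_auc_score(y_test, proba),
    })
    # Store the fitted model as a run artifact so it can be promoted to the registry.
    mlflow.sklearn.log_model(model, "model")
```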
Skills
- Version Control & Collaboration: Git, GitHub, GitHub Actions, DVC, DagsHub
- Model Experimentation & Tracking: MLflow, DVC, DagsHub, TensorBoard
- Programming Languages & Frameworks: Python, Scikit-learn, PyTorch, TensorFlow, Bash, Linux
- Containerization & Orchestration: Docker, Kubernetes, AWS Fargate, Nginx
- Cloud Platforms & SDKs: AWS (EC2, S3, ECR, ECS, SageMaker, Fargate), Azure, Boto3 SDK
- Monitoring & Observability: WhyLabs, Prometheus, Grafana
- Deployment & Automation: Terraform, FastAPI, Nginx (as reverse proxy), GitHub Actions (CI/CD pipelines)
- Data Handling & Visualization: Pandas, NumPy, Matplotlib, Seaborn
- Machine Learning Tools & Libraries: Scikit-learn, PyTorch, TensorFlow, XGBoost
- Large Language Models (LLMs): Hands-on with OpenAI GPT, Llama, Gemini, Claude; fine-tuning and prompt engineering for domain-specific tasks