Manager, ML Ops Infrastructure

WTS Paradigm LLC
Dallas, TX

Paradigm is a software company transforming the way that the residential, construction & building product industries operate across the globe. We are looking for a Manager, ML Ops Infrastructure to help revolutionize these industries.

We're looking for a hands-on technical leader to build and scale the ML Ops infrastructure that powers our AI capabilities in production. You'll oversee the end-to-end platform for deploying, serving, and operating ML models and AI agents, which we call our "agent factory": a repeatable framework that makes shipping AI-powered features as reliable and routine as deploying any other service.

This role sits at the intersection of ML operations, platform engineering, and cloud infrastructure. You'll build the compute, orchestration, and deployment pipelines that take ML experiments from notebooks to production, creating self-service tooling so data scientists and ML engineers can deploy with confidence and speed.

What You Will Do:

  • Build and lead a team of ML Ops engineers focused on production deployment frameworks for AI/ML systems, including hiring, mentoring, and technical guidance.
  • Design and operate Kubernetes-based infrastructure for ML workloads, including model training, real-time inference, LLM serving, and agent orchestration.
  • Create the core ML Ops platform: model versioning, deployment automation, registries, serving infrastructure, and CI/CD pipelines purpose-built for ML and AI agent workflows.
  • Architect and manage GPU-accelerated compute for training and inference, optimizing for both performance and cost through spot instances, auto-scaling, and efficient resource allocation.
  • Build self-service deployment tooling that enables data scientists and ML engineers to push models and agents to production without manual infrastructure work.
  • Build the infrastructure for agentic AI: tool-calling, multi-step workflows, orchestration frameworks, multi-agent systems, and agent lifecycle management.
  • Implement production-grade deployment strategies (canary, blue/green) with rollback capabilities, observability, drift detection, and performance monitoring.
  • Partner with data science, ML engineering, and SRE teams to align infrastructure with deployment requirements and reliability SLOs.
  • Drive continuous improvement in deployment velocity, cost efficiency, and operational maturity across the ML platform, including evaluating and integrating tools like MLflow, Kubeflow, and emerging agent frameworks.

What You Need to Succeed:

  • Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent experience.
  • 7+ years in infrastructure engineering, DevOps, or platform engineering, with at least 3 years focused on ML/AI infrastructure.
  • 1+ years of experience building and leading teams that operate production ML systems, or demonstrated tech lead experience with direct influence over team processes and career growth.
  • Track record deploying and managing ML models in production. You understand the full lifecycle from training to serving to monitoring.
  • Hands-on experience with GPU computing, model optimization, and ML-specific infrastructure patterns.
  • Hands-on experience with Kubernetes and container orchestration for ML workloads (Kubeflow, KServe, Ray, or similar).
  • Experience working with Azure cloud services such as Azure ML, Azure OpenAI, Azure Databricks, and GPU-accelerated compute (GPU VMs, AKS with GPU node pools).
  • Experience using infrastructure as code tools (Terraform or equivalent) with ML infrastructure patterns.
  • Python programming experience with fluency in ML frameworks (PyTorch, TensorFlow) and LLM APIs (OpenAI, Anthropic, Azure OpenAI).
  • Experience with the modern AI/ML toolchain including model serving (vLLM, Triton, TorchServe), ML Ops platforms (MLflow, Kubeflow, W&B), vector databases (pgvector, Azure AI Search), and agent orchestration frameworks. Familiarity with RAG architectures, fine-tuning workflows, and embedding pipelines at scale.
  • You are a bridge-builder who translates fluently between ML practitioners and infrastructure teams.
  • You are a systems thinker who balances performance, cost, and reliability while building for scale.
  • You are collaborative, curious, and driven to enable teams to ship AI capabilities faster than they thought possible.

Ready to Join? Apply now at myparadigm.com/careers/
#Paradigm

Posted 2026-02-27
