# Lead AI Platform

> Integrant · Cairo, Egypt (Hybrid) · Full-time · Posted 2026-04-08

**Workplace:** hybrid

**Department:** Software Development

## Description

Integrant is looking for game changers to join our team as "Lead AI Platform".

The Lead AI Platform Engineer is responsible for bridging AI workloads with production-grade infrastructure, with a strong focus on the NVIDIA AI stack, enabling high-performance, scalable, and optimized AI systems.

This role focuses on model optimization, runtime efficiency, and GPU utilization, ensuring that AI workloads are production-ready, cost-efficient, and performant across enterprise environments.

### Roles and Responsibilities:

-   Translate AI/ML workloads into optimized infrastructure and deployment strategies
-   Optimize model performance across GPU environments (latency, throughput, memory utilization)
-   Design and implement inference and training pipelines using NVIDIA stack tools (TensorRT, Triton, NIM)
-   Convert and optimize models across frameworks (PyTorch → ONNX → TensorRT)
-   Analyze and resolve performance bottlenecks using profiling tools (GPU, memory, network)
-   Improve GPU utilization and scheduling efficiency across clusters
-   Design scalable distributed training and inference architectures
-   Work closely with customers to define AI infrastructure strategies and deployment models
-   Support production deployments including monitoring, rollback, and performance validation
-   Conduct applied research to improve model efficiency and infrastructure utilization
-   Mentor team members on AI infrastructure, optimization, and GPU systems
-   Track experiments with tools such as MLflow, W&B, or Neptune, logging parameters, metrics, and artifacts for comparison
-   Detect and diagnose post-deployment model degradation caused by concept drift, data pipeline changes, or traffic pattern shifts
-   Perform root cause analysis (RCA) on ML systems by isolating variables and reproducing issues

## Requirements

-   8+ years of experience in AI/ML systems, HPC, and AI infrastructure
-   Strong proficiency in Python
-   Strong experience with GPU-based AI workloads and performance optimization
-   Deep understanding of model optimization techniques (quantization, pruning, batching)
-   Hands-on experience with:

    1.  PyTorch
    2.  ONNX / ONNX Runtime
    3.  TensorRT / TensorRT-LLM
    4.  Triton Inference Server

-   Knowledge of CUDA, cuDNN, and GPU architecture fundamentals
-   Experience with distributed systems (multi-GPU / multi-node)
-   Familiarity with:

    1.  NCCL communication
    2.  NVLink / InfiniBand
    3.  Kubernetes or Slurm for orchestration

-   Experience deploying AI models into production environments
-   Ability to analyze system bottlenecks (compute, memory, network)
-   Experience with profiling tools (Nsight, TensorRT profiler, etc.)
-   Knowledge of cost optimization strategies for GPU workloads
-   Experience with experiment tracking tools (MLflow, W&B, Neptune) for logging and comparing parameters, metrics, and artifacts
-   Ability to identify post-deployment model degradation: concept drift, data pipeline changes, traffic pattern shifts
-   Ability to apply root cause analysis (RCA) to ML systems: isolating variables, reproducing issues

### Nice to Have

-   Experience with NVIDIA NIM and NGC ecosystem
-   Exposure to Megatron-LM, NeMo, or large-scale LLM training/inference
-   Experience with LLM optimization techniques (KV cache, batching strategies)
-   Familiarity with MLOps practices and CI/CD for AI systems
-   Experience in customer-facing architecture or consulting roles
-   Familiarity with hybrid cloud / on-prem HPC environments

## Benefits

-   Salary paid in USD
-   Career advancement opportunities every six months
-   Supportive and friendly work environment
-   Premium medical insurance (employee + family)
-   English language development courses
-   Interest-free loans paid over 2.5 years
-   Technical development courses
-   Planned overtime program (POP)
-   Employment referral program
-   Premium location in Maadi
-   Social insurance

## Apply

[Apply at Integrant](https://apply.workable.com/integrant/j/C096ADF78E/apply)

