# CUDA Developer (AI/LLM & GPU Optimization)

> Gramian Consulting Group · Egypt (Remote) · Contract · Posted 2026-05-22

**Workplace:** Remote

**Department:** Partnerships

## Description

Gramian Consultancy is a boutique consultancy specializing in IT professional services and engineering talent solutions. With a strong background in software engineering and leadership, we help companies build high-performing teams by matching them with professionals who truly fit their needs.

**Role Overview**

We are looking for experienced **CUDA Developers** to work on advanced AI and machine learning initiatives focused on improving the capabilities of large language models (LLMs). In this role, you will solve complex GPU programming challenges, optimize high-performance CUDA workloads, review AI-generated code, and contribute to the development of more capable AI systems.

**Duration:** 3 months

**Commitment:** 40 hours/week, with 4 hours/day of overlap with PST

**Model:** Contract, time and materials

**Location:** 100% remote (Bangladesh, Brazil, Colombia, Egypt, Ghana, India, Pakistan, Indonesia, Kenya, Nigeria, Turkey, Vietnam)

**Interview:** 1 technical interview

**Key Responsibilities**

-   Solve advanced CUDA and GPU programming problems involving parallel computing and performance optimization
-   Review, evaluate, and improve AI-generated CUDA, C++, and Python code
-   Optimize GPU kernels for throughput, latency, memory efficiency, and resource utilization
-   Work with CUDA libraries and frameworks such as Thrust, cuBLAS, and cuDNN
-   Debug and resolve issues related to CUDA kernels, synchronization, and memory management
-   Develop high-quality technical prompts, solutions, explanations, and evaluations for AI model training
-   Collaborate with AI researchers, engineers, and evaluation teams
-   Stay up to date with the latest developments in CUDA, GPU architectures, and performance optimization techniques

## Requirements

-   5+ years of professional software development experience with a strong focus on CUDA development
-   Strong proficiency in C/C++
-   Strong hands-on experience with Python and scientific computing ecosystems
-   Experience working with PyTorch and NumPy
-   Experience with CUDA 12.3 or newer
-   Strong understanding of GPU programming, parallel computing, and performance optimization
-   Experience optimizing workloads for high-performance execution and efficient resource utilization
-   Experience with CUDA libraries such as Thrust, cuBLAS, and cuDNN

## Apply

[Apply at Gramian Consulting Group](https://apply.workable.com/gramian/j/06C87EBC4F/apply)
