# AI Engineer (ARIP - Agent Development) - CP Axtra

> Makro PRO · Nawamin Road, Thailand · Full-time · Posted 2026-05-19

**Workplace:** On-site

**Department:** Technology

## Description

Senior individual contributors (ICs) who build ARIP's 15 named agents (A-11 to A-25) end-to-end on LangGraph / CrewAI: prompt design, tool definitions, multi-step workflows, eval harnesses (golden sets, regression gates, LLM-as-judge, multi-step replay), HITL gate integration, Trust Gate progression, and per-agent cost optimisation. The role is distinct from DEAA's Senior AI Engineer, who owns the LLM Gateway; ARIP AI Engineers are platform consumers and agent builders.

**Remote candidates outside of Thailand are welcome to apply.**

### **Key Responsibilities:**

-   Build agents on Layer 4 runtime end-to-end — each ships with eval harness, HITL gate config, observability instrumentation, per-agent cost meter, and runbook.
-   Design and own golden-set test cases per agent; build regression gates in CI (no agent ships without eval-pass); implement multi-step conversation replay and LLM-as-judge patterns.
-   Configure per-agent HITL gates and collect gate-progression evidence (Shadow 60d → Recommender 90d → Executor); co-own Trust Gate Framework for Suite 3 financial-threshold ladder (G0–G4).
-   Tune model routing per agent (LLM provider / model tier): balance cost, latency, quality; implement semantic caching where appropriate.
-   Consume DEAA's LLM Gateway via standard SDK; provide per-agent cost data to DEAA's GenAI Cost Dashboard; partner with DEAA Senior AI Engineer on embedding model selection and retrieval relevance.
-   Author agent-engineering playbook alongside DEAA's AI Best Practices Playbook; mentor PACE-seeded engineers on agent engineering discipline.

## Requirements

-   5+ years software engineering; 2+ years shipping LLM-based / agentic systems to production (not just RAG demos or notebooks).
-   Expert in production multi-agent orchestration: LangGraph / CrewAI / AutoGen / DSPy or equivalent with HITL gates by default, not autonomous-by-default.
-   Eval-driven LLM development in production: golden sets, LLM-as-judge, multi-step replay, regression gates in CI.
-   HITL gate and agent guardrail design: has designed and tested prompt-injection, PII, and output-filtering defences in production.
-   Strong Python (async, observability, testing); production experience with a major LLM provider (Azure OpenAI / Anthropic / Bedrock / Vertex); Langfuse or equivalent for LLM tracing.
-   Calibre: Senior AI Engineer from agentic-AI startups (Anthropic-adjacent ecosystem), Agoda, LINE MAN Wongnai, Grab, SCBX with multi-agent production experience.

## Apply

[Apply at Makro PRO](https://apply.workable.com/joinmakropro/j/2CFFE27C25/apply)

