# AI Engineer

> Cube · Bangkok, Thailand (Hybrid) · Full-time · Posted 2026-06-24

**Workplace:** hybrid

**Department:** Data, Product & Technology

## Description

As one of our early AI Engineering hires, you'll help define what AI at Cube looks like. You'll build the AI features people actually use, from our self-hosted chat interface and MCP server to retrieval pipelines, prompts, evaluations, and integrations with internal systems. You'll work closely with our Infrastructure and Data Engineering teams to design architecture, connect systems, and turn emerging AI capabilities into practical products and tools that solve real problems every day.

-   Maintain and tune our self-hosted chat interface, including model connections, MCP integration, and RAG/knowledge base setup
-   Build the RAG pipeline: ingestion, chunking, embeddings, vector store, retrieval, reranking, and evaluation
-   Integrate LiteLLM or OpenRouter as the gateway; handle routing, fallbacks, rate limits, and cost tracking
-   Maintain and configure our MCP server and the tools it exposes to the model
-   Write prompts and evaluations, and iterate on them based on real usage and failure cases
-   Monitor the logging, tracing, and guardrails of our AI platforms and models
-   Nice to have: exposure to working with an MLOps/Platform team to deploy self-hosted models (vLLM, TGI, Ollama) and keep them healthy
-   Ship features end-to-end: API, retrieval, prompt, evaluation, and rollout

## Requirements

-   4+ years of software engineering experience
-   Familiarity with containerized technologies and orchestration platforms such as Kubernetes
-   Strong interest in AI, LLMs, and the rapidly evolving model ecosystem
-   1+ years of experience building, deploying, or supporting production LLM systems (RAG, agents, or fine-tuned models)
-   Experience deploying and configuring self-hosted LLM chat interfaces (Open WebUI preferred; similar platforms are acceptable)
-   Hands-on experience with retrieval and RAG systems, including embeddings, vector databases, chunking strategies, hybrid search, and evaluation methodologies
-   Experience working with LLM gateways or routing layers such as LiteLLM, OpenRouter, Portkey, or similar solutions
-   Experience serving open-weight models using tools such as vLLM, TGI, or SGLang
-   Experience designing and implementing secure integrations between LLMs and internal business systems
-   Nice to have: Experience with or understanding of MCP servers, agent frameworks, or tool-calling architectures
-   Nice to have: Experience with or understanding of LLM observability and monitoring platforms such as LangSmith, Langfuse, or similar tools

## Apply

[Apply at Cube](https://apply.workable.com/cube-asia/j/B70DBB4314/apply)
