# RL Environment Software Engineer

> talentpluto · San Francisco, United States (Hybrid) · Full-time · Posted 2026-07-01

**Salary:** USD 180,000–400,000

**Workplace:** hybrid

## Description

**Location:** San Francisco, CA

**Work Model:** Hybrid

**Industry:** Applied AI / AI research data

**Compensation:** $180K-$220K base, ~$400K+ OTE (uncapped profit share)

### About the Company

Our partner is a fast-growing applied AI research lab that builds high-quality reinforcement-learning environments and agents sold to the world's leading AI labs. In under two years they have scaled to a nine-figure revenue run rate and grown their team severalfold in a matter of months, backed by leading venture investors. Quality is their core differentiator, and they are rapidly expanding into new domains.

### The Opportunity

As an RL Environment Software Engineer, you will sit at the intersection of research engineering and traditional software engineering, building the environments that simulate real-world workflows and the agents that automate them. This is forward-looking work: you will help research and predict which high-quality environments the frontier will need next, then build them from the ground up.

You will join a brand-new RL team being assembled from exceptional talent, with a clear path to grow alongside it as the function scales into industry pods.

### Responsibilities

-   Design and build high-quality RL environments that simulate real working environments end to end.
-   Develop agents for the tasks within those environments and iterate until they are efficient and production-ready.
-   Partner with the research team to scope which environments to build and why, staying ahead of future demand rather than only meeting present needs.
-   Own the backend and infrastructure layers that make environments reliable and scalable.
-   Help set engineering standards for a zero-to-one team as the RL function grows.

### Requirements

-   Strong machine-learning engineering skills: you code heavily, build systems from scratch, and have strong intuition for reinforcement learning.
-   Proficiency across a modern stack: Node.js and Python on the backend, React/TypeScript on the frontend, and strong Kubernetes and Docker skills.
-   Comfort operating in a fast-paced startup environment with high ownership and long hours.
-   A track record of meaningful tenure and impact at previous companies.
-   Reinforcement-learning experience or an RL research background is a strong plus, though not required.
-   Bachelor's degree in computer science or a related technical field, or equivalent practical experience.

## Apply

[Apply at talentpluto](https://apply.workable.com/talentpluto/j/3189BDCE25/apply)

