aRTi AI Roleplay — practice hard conversations with an LLM partner

+8% AI feature usage in 24h · +29% adoption since launch

Live coaches charge $200–400 an hour and don't know your team. So we built one that does.

The problem

Managers regularly need to rehearse hard conversations: performance feedback, terminations, conflict resolution, difficult peer dynamics. Live coaching is expensive (often $200–400/hr), rarely available in the moment, and still feels artificial because the coach doesn't know the team.

Rising Team's customers — leaders at Adobe, Meta, Google Cloud, Microsoft, Bank of Hawaii, and others — wanted a way to practice on demand, from anywhere, with realistic resistance and actionable feedback.

The approach

I led the research, design, and end-to-end implementation of aRTi AI Roleplay, built on top of the platform's core services.

Team-aware partner. Pinecone retrieval pulls in each leader's team context, declared development goals, and prior session insights, so the LLM partner stays in character and raises realistic objections specific to that manager's team.
Pause / reflect / retry interaction model. A manager can stop mid-conversation, ask for phrasing suggestions, replay any line, and try again. Practice is non-linear by design.
Voice and text modes. ElevenLabs powers natural voice for high-realism rehearsal. A text mode keeps latency low when managers want to drill quickly.
Multi-model orchestration. OpenAI for generation in some flows, Anthropic for reflection and feedback, Gemini for cost-sensitive paths — all coordinated through LangChain with cache-aware prompt design.
Real-time scoring. A second-pass evaluator critiques tone and clarity, surfacing one phrasing improvement per turn instead of generic encouragement.
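
To make the multi-model split concrete, here is a minimal sketch of task-based routing. It is an illustration under assumptions, not the production orchestration: the `Task` names, the `Route` type, the `route_for` helper, and the model labels are all hypothetical placeholders, and the real system coordinates providers through LangChain rather than a plain dictionary.

```python
from dataclasses import dataclass
from enum import Enum


class Task(Enum):
    """Hypothetical task categories, named after the split described above."""
    GENERATION = "generation"          # in-character partner replies
    REFLECTION = "reflection"          # reflection and feedback on a turn
    COST_SENSITIVE = "cost_sensitive"  # high-volume, lower-stakes calls


@dataclass(frozen=True)
class Route:
    provider: str
    model: str  # placeholder label, not a real model ID


# Assumed routing table: one provider per task category.
ROUTES = {
    Task.GENERATION: Route("openai", "generation-model"),
    Task.REFLECTION: Route("anthropic", "reflection-model"),
    Task.COST_SENSITIVE: Route("gemini", "budget-model"),
}


def route_for(task) -> Route:
    """Return the route for a Task member or its string value.

    Raises ValueError for an unknown task name, so a typo fails loudly
    instead of silently falling back to a default provider.
    """
    return ROUTES[Task(task)]


if __name__ == "__main__":
    for task in Task:
        route = route_for(task)
        print(f"{task.value}: {route.provider} ({route.model})")
```

Keeping the table in one place means a flow can be moved to a different provider without touching the call sites that request a task.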

Stack

Python · Django · React · TypeScript · LangChain · OpenAI · Anthropic · Gemini · Pinecone · ElevenLabs · PostgreSQL · AWS

Outcomes

+8% platform-wide AI feature usage in the first 24 hours of launch.
+29% adoption growth since launch.
Helped power the platform's customer-cohort gains: up to 33% retention lift, 60–200% eNPS improvement, and 22–75% engagement-score lifts.
At Bank of Hawaii alone, the broader Rising Team platform delivered 400+ sessions across 1,950 employees in 40 branches, contributing to an 88% retention lift over an 18-month period.

What I learned

Dual-LLM evaluation beats encouragement. A separate model scoring the partner's response — not the user's — catches "supportive but wrong" feedback that a single prompt loop hides.
Voice tripled session length only when latency stayed under ~800 ms. Above that, managers dropped to text mode and never came back to voice.
Retrieval narrows the prompt — it doesn't replace it. Team context grounds the partner; the system prompt still has to encode the kind of pushback that's productive vs. cruel.
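
The last point can be sketched as a prompt-assembly step. This is a simplified illustration, not the shipped prompt: `PUSHBACK_RULES`, `build_partner_prompt`, and the `max_snippets` cap are hypothetical names and values, and the snippets stand in for whatever the Pinecone query returns.

```python
# Illustrative wording only; the production system prompt is not shown here.
PUSHBACK_RULES = (
    "You are playing a team member in a practice conversation. "
    "Push back with realistic objections. "
    "Never insult or demean the manager."
)


def build_partner_prompt(rules: str, snippets: list[str], max_snippets: int = 3) -> str:
    """Compose the partner's system prompt.

    The behavior rules always come first and are never dropped. Retrieved
    team context is appended as grounding, capped at max_snippets entries.
    """
    kept = [s.strip() for s in snippets if s.strip()][:max_snippets]
    if not kept:
        return rules
    context = "\n".join(f"- {s}" for s in kept)
    return f"{rules}\n\nTeam context (for grounding only):\n{context}"


if __name__ == "__main__":
    retrieved = ["Sam has missed two deadlines this quarter", "Team of six, fully remote"]
    print(build_partner_prompt(PUSHBACK_RULES, retrieved))
```

Even with no retrieved context, the function still returns the full rule set, which mirrors the lesson: retrieval adds grounding, but the rules about productive pushback live in the prompt itself.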

Public

risingteam.com/ai-roleplay-partner
