aRTi AI Roleplay — practice hard conversations with an LLM partner
+8% AI feature usage in 24h · +29% adoption since launch
Live coaches charge $200–400 an hour and don't know your team. So we built one that does.
The problem
Managers regularly need to rehearse hard conversations: performance feedback, terminations, conflict resolution, difficult peer dynamics. Live coaching is expensive (often $200–400/hr), rarely available in the moment, and still feels artificial because the coach doesn't know the team.
Rising Team's customers — leaders at Adobe, Meta, Google Cloud, Microsoft, Bank of Hawaii, and others — wanted a way to practice on demand, from anywhere, with realistic resistance and actionable feedback.
The approach
I led the research, design, and end-to-end implementation of aRTi AI Roleplay, built on top of the platform's core services.
- Team-aware partner. Pinecone retrieval over each leader's team context, declared development goals, and prior session insights. The LLM partner stays in character with realistic objections specific to that manager's team.
- Pause / reflect / retry interaction model. A manager can stop mid-conversation, ask for phrasing suggestions, replay any line, and try again. Practice is non-linear by design.
- Voice and text modes. ElevenLabs powers natural voice for high-realism rehearsal. A text mode keeps latency low when managers want to drill quickly.
- Multi-model orchestration. OpenAI for generation in some flows, Anthropic for reflection and feedback, Gemini for cost-sensitive paths — all coordinated through LangChain with cache-aware prompt design.
- Real-time scoring. A second-pass evaluator critiques tone and clarity, surfacing one phrasing improvement per turn instead of generic encouragement.
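The pause / reflect / retry model above can be pictured as a transcript that a manager is allowed to rewind. The following is a minimal sketch under stated assumptions, not the production implementation: `Turn`, `RoleplaySession`, and `retry_from` are hypothetical names, and the real system presumably persists turns and calls the LLM partner rather than holding strings in memory.

```python
from dataclasses import dataclass, field


@dataclass
class Turn:
    speaker: str  # "manager" or "partner"
    text: str


@dataclass
class RoleplaySession:
    """Hypothetical in-memory transcript that supports pausing and rewinding."""

    turns: list[Turn] = field(default_factory=list)
    paused: bool = False

    def say(self, speaker: str, text: str) -> int:
        """Append a turn and return its index; refuses while paused."""
        if self.paused:
            raise RuntimeError("session is paused; resume or retry first")
        self.turns.append(Turn(speaker, text))
        return len(self.turns) - 1

    def pause(self) -> None:
        """Stop the conversation so the manager can reflect."""
        self.paused = True

    def resume(self) -> None:
        """Continue from where the conversation stopped."""
        self.paused = False

    def retry_from(self, index: int) -> list[Turn]:
        """Drop the turn at `index` and everything after it, then unpause.

        Returns the dropped turns so they can still be shown for comparison.
        """
        if not 0 <= index < len(self.turns):
            raise IndexError(f"no turn at index {index}")
        dropped = self.turns[index:]
        self.turns = self.turns[:index]
        self.paused = False
        return dropped


# Usage: the manager opens badly, pauses, rewinds to the first line, and tries again.
session = RoleplaySession()
session.say("manager", "Your numbers are bad.")
session.say("partner", "That feels unfair. Nobody told me the targets changed.")
session.pause()
dropped = session.retry_from(0)
session.say("manager", "I'd like to talk through last quarter's targets with you.")
```

Returning the dropped turns instead of discarding them is a design choice in this sketch: it keeps the earlier attempt available for a side-by-side comparison with the retry.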
Stack
Python · Django · React · TypeScript · LangChain · OpenAI · Anthropic · Gemini · Pinecone · ElevenLabs · PostgreSQL · AWS
Outcomes
- +8% platform-wide AI feature usage in the first 24 hours of launch.
- +29% adoption growth since launch.
- Helped power the platform's customer-cohort gains: up to 33% retention lift, 60–200% eNPS improvement, and 22–75% engagement-score lifts.
- At Bank of Hawaii alone, the broader Rising Team platform delivered 400+ sessions across 1,950 employees in 40 branches, contributing to an 88% retention lift over an 18-month period.
What I learned
- Dual-LLM evaluation beats encouragement. A separate model scoring the partner's response — not the user's — catches "supportive but wrong" feedback that a single prompt loop hides.
- Voice tripled session length only when latency stayed under ~800 ms. Above that, managers dropped to text mode and never came back to voice.
- Retrieval narrows the prompt — it doesn't replace it. Team context grounds the partner; the system prompt still has to encode the kind of pushback that's productive vs. cruel.
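The dual-LLM lesson above can be sketched as a second pass that scores a drafted response and regenerates when the score is too low. This is an illustrative sketch only: `respond_with_second_pass`, the two callables, the 0.7 threshold, and the retry limit are all assumptions, and the stubs below stand in for real model clients.

```python
from typing import Callable, Tuple

# Hypothetical signatures; real model clients would sit behind these callables.
Generate = Callable[[str], str]
Evaluate = Callable[[str, str], float]  # (manager_line, draft) -> score in [0, 1]


def respond_with_second_pass(
    manager_line: str,
    generate: Generate,
    evaluate: Evaluate,
    threshold: float = 0.7,  # assumed cutoff, not a figure from the project
    max_attempts: int = 3,
) -> Tuple[str, float]:
    """Draft with one model, score with another, and keep the best draft.

    Stops early once a draft meets the threshold; otherwise returns the
    highest-scoring draft seen within max_attempts.
    """
    best_draft, best_score = "", -1.0
    for _ in range(max_attempts):
        draft = generate(manager_line)
        score = evaluate(manager_line, draft)
        if score > best_score:
            best_draft, best_score = draft, score
        if score >= threshold:
            break
    return best_draft, best_score


# Stub models: the first draft is "supportive but wrong", the second is concrete.
_drafts = iter([
    "Great job, that sounded very supportive!",
    "Try: 'The last two deadlines slipped. What got in the way?'",
])


def stub_generate(manager_line: str) -> str:
    return next(_drafts)


def stub_evaluate(manager_line: str, draft: str) -> float:
    # Toy rule: reward drafts that offer a concrete rephrasing.
    return 0.9 if draft.startswith("Try:") else 0.2


feedback, score = respond_with_second_pass(
    "You're always late.", stub_generate, stub_evaluate
)
```

Keeping the evaluator behind its own callable means it can be backed by a different provider than the generator, which is the separation the lesson describes.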