All work

aRTi AI Roleplay — practice hard conversations with an LLM partner

+8% AI feature usage in 24h · +29% adoption since launch

Live coaches charge $200–400 an hour and don't know your team. So we built one that does.

The problem

Managers regularly need to rehearse hard conversations — performance feedback, terminations, conflict resolution, difficult peer dynamics. Live coaching is expensive (often $200–400/hr), often unavailable in the moment, and still feels artificial because the coach doesn't know the team.

Rising Team's customers — leaders at Adobe, Meta, Google Cloud, Microsoft, Bank of Hawaii, and others — wanted a way to practice on demand, from anywhere, with realistic resistance and actionable feedback.

The approach

I led the research, design, and end-to-end implementation of aRTi AI Roleplay against the platform's core services.

  • Team-aware partner. Pinecone retrieval over each leader's team context, declared development goals, and prior session insights. The LLM partner stays in character with realistic objections specific to that manager's team.
  • Pause / reflect / retry interaction model. A manager can stop mid-conversation, ask for phrasing suggestions, replay any line, and try again. Practice is non-linear by design.
  • Voice and text modes. ElevenLabs powers natural voice for high-realism rehearsal. A text mode keeps latency low when managers want to drill quickly.
  • Multi-model orchestration. OpenAI for generation in some flows, Anthropic for reflection and feedback, Gemini for cost-sensitive paths — all coordinated through LangChain with cache-aware prompt design.
  • Real-time scoring. A second-pass evaluator critiques tone and clarity, surfacing one phrasing improvement per turn instead of generic encouragement.

Stack

Python · Django · React · TypeScript · LangChain · OpenAI · Anthropic · Gemini · Pinecone · ElevenLabs · PostgreSQL · AWS

Outcomes

  • +8% platform-wide AI feature usage in the first 24 hours of launch.
  • +29% adoption growth since launch.
  • Helped power the platform's customer-cohort gains: up to 33% retention lift, 60–200% eNPS improvement, and 22–75% engagement-score lifts.
  • At Bank of Hawaii alone, the broader Rising Team platform delivered 400+ sessions across 1,950 employees in 40 branches, contributing to an 88% retention lift over an 18-month period.

What I learned

  1. Dual-LLM evaluation beats encouragement. A separate model scoring the partner's response — not the user's — catches "supportive but wrong" feedback that a single prompt loop hides.
  2. Voice tripled session length only when latency stayed under ~800 ms. Above that, managers dropped to text mode and never came back to voice.
  3. Retrieval narrows the prompt — it doesn't replace it. Team context grounds the partner; the system prompt still has to encode the kind of pushback that's productive vs. cruel.

Public

risingteam.com/ai-roleplay-partner →