
PersonaPlex-7B: Full-Duplex Speech-to-Speech Model Guide

Complete guide to PersonaPlex-7B, the open source speech-to-speech model for building conversational AI. Learn deployment, optimization, and best practices.



PersonaPlex-7B is an open source speech-to-speech model designed for real-time conversational AI. Unlike traditional cascaded pipelines that chain speech recognition, a language model, and text-to-speech, it processes audio end-to-end, enabling natural full-duplex conversations with interruption handling.

Overview

PersonaPlex-7B represents a new approach to voice AI:

  • Speech-to-speech: Direct audio processing without intermediate text
  • Full-duplex: Listen and respond simultaneously
  • Interruption handling: Natural turn-taking and barge-in support
  • Context awareness: Maintains conversation history and emotional state
  • 7B parameters: Optimized for real-time inference

Architecture

┌─────────────┐     ┌──────────────────┐     ┌─────────────┐
│ User Speech │────►│  PersonaPlex-7B  │────►│ AI Response │
│   (Audio)   │◄────│ Speech-to-Speech │◄────│   (Audio)   │
└─────────────┘     └──────────────────┘     └─────────────┘
        ▲                    │                      │
        │                    ▼                      │
        │           ┌────────────────┐              │
        └───────────│ Full-Duplex    │◄─────────────┘
                    │ Stream Manager │
                    └────────────────┘
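The key idea in the diagram is that the stream manager keeps the listening and speaking paths running concurrently. A minimal, purely illustrative asyncio sketch of that flow (all names here are hypothetical; PersonaPlex-7B handles this internally):

```python
import asyncio

async def listen(incoming: asyncio.Queue, heard: list) -> None:
    # Consume user audio chunks as they arrive.
    while True:
        chunk = await incoming.get()
        if chunk is None:  # end-of-stream sentinel
            break
        heard.append(chunk)

async def speak(spoken: list) -> None:
    # Emit response chunks on an independent schedule.
    for chunk in ("resp-1", "resp-2"):
        spoken.append(chunk)
        await asyncio.sleep(0)  # yield so listening continues in parallel

async def duplex() -> tuple[list, list]:
    incoming: asyncio.Queue = asyncio.Queue()
    heard: list = []
    spoken: list = []
    for c in ("user-1", "user-2", None):
        incoming.put_nowait(c)
    # Both directions run concurrently: the essence of full duplex.
    await asyncio.gather(listen(incoming, heard), speak(spoken))
    return heard, spoken

heard, spoken = asyncio.run(duplex())
```

Because neither coroutine blocks the other, input can keep arriving while a response is being produced, which is what makes barge-in possible.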

Hardware Requirements

| Configuration | VRAM  | Latency | Concurrent Users |
|---------------|-------|---------|------------------|
| Minimum       | 16GB  | ~500ms  | 1                |
| Recommended   | 24GB  | ~300ms  | 2-4              |
| Production    | 40GB+ | ~200ms  | 8+               |
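As a quick sanity check before provisioning, the table above can be encoded as a small helper that maps available GPU memory to the tier it supports (the tier names and thresholds come straight from the table; the function itself is just for illustration):

```python
def deployment_tier(vram_gb: float) -> str:
    """Map available GPU memory to the deployment tier from the table above."""
    if vram_gb >= 40:
        return "Production"   # ~200ms latency, 8+ concurrent users
    if vram_gb >= 24:
        return "Recommended"  # ~300ms latency, 2-4 concurrent users
    if vram_gb >= 16:
        return "Minimum"      # ~500ms latency, single user
    return "Insufficient"     # below the documented minimum

print(deployment_tier(24))  # Recommended
```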

Quick Start

Installation

pip install personaplex-7b

Basic Usage

from personaplex import PersonaPlex7B
 
# Initialize the model
model = PersonaPlex7B.from_pretrained("personaplex/personaplex-7b")
 
# Create a conversation session
session = model.create_session()
 
# Process audio input and get response
response_audio = session.process(input_audio)

Streaming Conversation

import asyncio
from personaplex import PersonaPlex7B, AudioStream
 
async def conversation():
    model = PersonaPlex7B.from_pretrained("personaplex/personaplex-7b")
    session = model.create_session()
 
    # Create bidirectional audio streams
    input_stream = AudioStream.from_microphone()
    output_stream = AudioStream.to_speaker()
 
    # Run full-duplex conversation
    async for response_chunk in session.stream(input_stream):
        await output_stream.write(response_chunk)
 
asyncio.run(conversation())

Full-Duplex Features

Interruption Handling

PersonaPlex-7B naturally handles interruptions:

session = model.create_session(
    interruption_threshold=0.3,  # Sensitivity (0-1)
    fade_on_interrupt=True,      # Gracefully fade out
    interrupt_response="adaptive" # or "immediate", "delayed"
)

Turn-Taking

Configure natural conversation flow:

session = model.create_session(
    end_of_turn_detection="auto",  # Automatic pause detection
    min_response_delay=100,        # ms before responding
    backchanneling=True            # Enable "uh-huh", "I see" etc.
)

Voice Configuration

Built-in Voices

# List available voices
voices = model.list_voices()
# ['aria', 'marcus', 'elena', 'kai', ...]
 
# Use a specific voice
session = model.create_session(voice="aria")

Voice Cloning

Clone a voice for personalized agents:

custom_voice = model.clone_voice(
    reference_audio="agent_voice.wav",
    name="my_agent"
)
 
session = model.create_session(voice=custom_voice)

System Prompts

Guide the AI's behavior with system prompts:

session = model.create_session(
    system_prompt="""You are a helpful customer support agent for TechCorp.
    Be concise, friendly, and solution-oriented.
    If you don't know something, offer to connect the user with a human agent."""
)

WebSocket Server

Deploy as a real-time WebSocket server:

from personaplex import PersonaPlex7B, WebSocketServer
 
model = PersonaPlex7B.from_pretrained("personaplex/personaplex-7b")
server = WebSocketServer(model, port=8765)
 
# Start serving
server.run()

Client connection:

const ws = new WebSocket('ws://localhost:8765');
const mediaRecorder = new MediaRecorder(audioStream);
 
mediaRecorder.ondataavailable = (e) => ws.send(e.data);
ws.onmessage = (e) => playAudio(e.data);

Production Deployment

Docker Compose

version: '3.8'
services:
  personaplex:
    image: personaplex/personaplex-7b:latest
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - MODEL_CACHE=/models
    volumes:
      - ./models:/models
    ports:
      - "8765:8765"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]

Load Balancing

For high-traffic deployments:

services:
  personaplex:
    deploy:
      replicas: 4
    # ... rest of config
 
  nginx:
    image: nginx
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf
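The mounted ./nginx.conf must be WebSocket-aware, since plain HTTP proxying will drop the upgraded connection. A minimal sketch (the upstream service name and port are assumptions matching the compose file above; Docker Compose's DNS round-robins across the replicas):

```nginx
events {}

http {
    upstream personaplex_backend {
        server personaplex:8765;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://personaplex_backend;
            # Headers required to upgrade the connection to WebSocket
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            # Keep long-lived audio streams open
            proxy_read_timeout 3600s;
        }
    }
}
```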

Performance Optimization

Quantization

Reduce memory usage with quantization:

model = PersonaPlex7B.from_pretrained(
    "personaplex/personaplex-7b",
    quantization="int8"  # ~50% memory reduction
)

Speculative Decoding

Enable faster inference:

model = PersonaPlex7B.from_pretrained(
    "personaplex/personaplex-7b",
    speculative_decoding=True,
    draft_model="personaplex/personaplex-draft"
)

Comparison with Alternatives

| Feature       | PersonaPlex-7B | Moshi   | GPT-4o Realtime |
|---------------|----------------|---------|-----------------|
| Open Source   | Yes            | Yes     | No              |
| Full-Duplex   | Yes            | Yes     | Yes             |
| Self-hostable | Yes            | Yes     | No              |
| Voice Cloning | Yes            | Limited | No              |
| Languages     | English        | EN/FR   | 50+             |

For more options, see our Open Source Voice AI Models comparison.

When to Use PersonaPlex-7B

Choose PersonaPlex-7B when you need:

  • Full-duplex conversational AI
  • Self-hosted deployment with data privacy
  • Custom voice cloning
  • Natural interruption handling

Consider alternatives when you need:

  • Multilingual support beyond English (GPT-4o Realtime offers 50+ languages; Moshi covers English and French)
  • A fully managed, hosted service rather than self-hosted infrastructure

Conclusion

PersonaPlex-7B brings state-of-the-art conversational AI capabilities to the open source community. Its full-duplex architecture and natural conversation handling make it ideal for voice agents, AI companions, and interactive applications.


This article is part of our Open Source Voice AI Models series.
