Named Entity Recognition (NER)

Extract entities like names, locations, organizations, and dates from conversations.

Overview

Named Entity Recognition helps you:

Extract customer information
Identify organizations and locations
Recognize dates and times
Track entities mentioned in calls
Enable information enrichment

Supported Entity Types

Entity Type	Examples	Use Case
PERSON	"John Smith", "Dr. Johnson"	Customer/agent identification
ORGANIZATION	"Acme Corp", "Microsoft"	Company identification
LOCATION	"New York", "California"	Geographic references
DATE	"January 15", "next Monday"	Event/appointment dates
TIME	"3:00 PM", "14:30"	Meeting times, deadlines
PHONE_NUMBER	"+1-555-0123"	Contact information
EMAIL	"john@example.com"	Email addresses
MONEY	"$50", "€200"	Monetary amounts
PERCENTAGE	"25%", "0.5%"	Percentages and rates
URL	"www.example.com"	Website addresses

Getting Entity Data

From Transcript Status

curl -X GET "https://api.example.com/api/v1/transcripts/123456789/status" \
  -H "Authorization: Bearer YOUR_TOKEN"

The response includes a keywords array in which each item carries its entity type, score, and channel:

{
  "keywords": [
    {
      "name": "John Smith",
      "type": "PERSON",
      "score": 95,
      "channelMatch": "left"
    },
    {
      "name": "Acme Corporation",
      "type": "ORGANIZATION",
      "score": 92,
      "channelMatch": "right"
    }
  ]
}

Entity Extraction Example

Python Implementation

import requests

class EntityExtractor:
    def __init__(self, api_base, token):
        self.api_base = api_base
        self.headers = {"Authorization": f"Bearer {token}"}
    
    def extract_entities(self, transcript_id):
        """Extract all entities from a transcript, grouped by type"""
        response = requests.get(
            f"{self.api_base}/api/v1/transcripts/{transcript_id}/status",
            headers=self.headers,
            timeout=30
        )
        # Fail early on HTTP errors instead of parsing an error body
        response.raise_for_status()
        
        transcript = response.json()
        keywords = transcript.get('keywords', [])
        
        # Organize entities by type
        entities = {
            'PERSON': [],
            'ORGANIZATION': [],
            'LOCATION': [],
            'DATE': [],
            'TIME': [],
            'PHONE_NUMBER': [],
            'EMAIL': [],
            'MONEY': [],
            'PERCENTAGE': [],
            'URL': []
        }
        
        for keyword in keywords:
            # Keywords without a supported entity type are skipped
            entity_type = keyword.get('type', 'OTHER')
            if entity_type in entities:
                entities[entity_type].append({
                    'name': keyword.get('name'),
                    'score': keyword.get('score'),
                    'channel': keyword.get('channelMatch')
                })
        
        return entities
    
    def get_persons(self, transcript_id):
        """Get all person entities"""
        entities = self.extract_entities(transcript_id)
        return entities['PERSON']
    
    def get_organizations(self, transcript_id):
        """Get all organization entities"""
        entities = self.extract_entities(transcript_id)
        return entities['ORGANIZATION']
    
    def get_contact_info(self, transcript_id):
        """Extract contact information"""
        entities = self.extract_entities(transcript_id)
        return {
            'phone_numbers': entities['PHONE_NUMBER'],
            'emails': entities['EMAIL'],
            'urls': entities['URL']
        }
    
    def get_financial_info(self, transcript_id):
        """Extract financial information"""
        entities = self.extract_entities(transcript_id)
        return {
            'amounts': entities['MONEY'],
            'percentages': entities['PERCENTAGE']
        }

# Usage
extractor = EntityExtractor("https://api.example.com", "your_token")

# Extract entities from transcript
entities = extractor.extract_entities(123456789)
print("Extracted entities:", entities)

# Get specific entity types
persons = extractor.get_persons(123456789)
print("Persons mentioned:", persons)

contact_info = extractor.get_contact_info(123456789)
print("Contact information:", contact_info)

Entity Confidence Scoring

Each entity has a confidence score from 0 to 100:

{
  "name": "John Smith",
  "score": 95,
  "confidence_level": "Very High"
}

Confidence Levels:

90-100: Very High
75-89: High
50-74: Medium
<50: Low (needs verification)
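The bands above can be applied in client code with a small helper. This is a minimal sketch: the function names and the default review threshold of 50 are illustrative assumptions, not part of the API.

```python
def confidence_level(score):
    """Map a 0-100 entity score to the confidence bands listed above."""
    if score >= 90:
        return "Very High"
    if score >= 75:
        return "High"
    if score >= 50:
        return "Medium"
    return "Low"


def split_by_confidence(entities, threshold=50):
    """Separate entities into accepted ones and ones that need verification.

    The threshold is a hypothetical default; tune it to your own data.
    """
    accepted = [e for e in entities if e.get("score", 0) >= threshold]
    needs_review = [e for e in entities if e.get("score", 0) < threshold]
    return accepted, needs_review


sample = [
    {"name": "John Smith", "score": 95},
    {"name": "Jon Smyth", "score": 42},
]
accepted, needs_review = split_by_confidence(sample)
print("Accepted:", [e["name"] for e in accepted])
print("Needs verification:", [e["name"] for e in needs_review])
```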

Entity Linking

Combine the entities extracted from a transcript into a customer profile that can be matched against other conversations:

def create_customer_profile(transcript_id, token):
    """Create customer profile from extracted entities"""
    extractor = EntityExtractor("https://api.example.com", token)
    
    # Fetch once; calling get_contact_info() as well would repeat the request
    entities = extractor.extract_entities(transcript_id)
    
    profile = {
        'names': entities['PERSON'],
        'organizations': entities['ORGANIZATION'],
        'locations': entities['LOCATION'],
        'phone_numbers': entities['PHONE_NUMBER'],
        'emails': entities['EMAIL'],
        'related_dates': entities['DATE'],
        'financial_info': {
            'amounts': entities['MONEY'],
            'percentages': entities['PERCENTAGE']
        }
    }
    
    return profile
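To associate entities across several conversations, the per-transcript results can be grouped by name. The sketch below assumes that two mentions refer to the same entity when their lowercased, whitespace-normalized names match; that rule and the `link_entities` helper are illustrative assumptions, and real matching usually needs something more robust.

```python
from collections import defaultdict


def link_entities(entities_by_transcript, entity_type="PERSON"):
    """Group mentions of the same entity across several transcripts.

    entities_by_transcript maps a transcript ID to a dict shaped like the
    one returned by EntityExtractor.extract_entities().
    """
    linked = defaultdict(lambda: {"transcripts": [], "best_score": 0})
    for transcript_id, entities in entities_by_transcript.items():
        for entity in entities.get(entity_type, []):
            # Assumed matching rule: case- and whitespace-insensitive name
            key = " ".join(entity["name"].lower().split())
            record = linked[key]
            if transcript_id not in record["transcripts"]:
                record["transcripts"].append(transcript_id)
            record["best_score"] = max(record["best_score"], entity.get("score", 0))
    return dict(linked)


# Hand-written sample data standing in for two API results
sample = {
    101: {"PERSON": [{"name": "John Smith", "score": 95}]},
    102: {"PERSON": [{"name": "john  smith", "score": 80},
                     {"name": "Jane Doe", "score": 88}]},
}
linked = link_entities(sample)
print(linked)
```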

Use Cases

1. Customer Data Enrichment

Extract customer names and contact info
Identify customers mentioned in calls
Update CRM with call-extracted data
Track customer interactions

2. Compliance & Audit

Extract sensitive information (PII)
Track PCI compliance
Audit data mentions
Generate compliance reports

3. Knowledge Management

Extract product names and versions
Identify mentioned competitors
Track topics and dates
Build knowledge base

4. Analysis & Reporting

Track mentioned organizations
Analyze geographic distribution
Extract financial information
Generate business intelligence

5. Automated Actions

Create calendar events from dates
Add contacts from phone numbers
Track follow-ups needed
Trigger workflows based on entities
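A workflow trigger can be as simple as a lookup from entity type to an action name. In the sketch below, the action names (`create_calendar_event`, `add_contact`), the `build_actions` helper, and the minimum score of 75 are all hypothetical; the input is a dict shaped like the one returned by `extract_entities`.

```python
def build_actions(entities, min_score=75):
    """Turn extracted entities into a list of suggested follow-up actions."""
    # Hypothetical mapping from entity type to a downstream action name
    action_for_type = {
        "DATE": "create_calendar_event",
        "PHONE_NUMBER": "add_contact",
        "EMAIL": "add_contact",
    }
    actions = []
    for entity_type, action in action_for_type.items():
        for entity in entities.get(entity_type, []):
            # Skip low-confidence entities so they do not trigger automation
            if entity.get("score", 0) >= min_score:
                actions.append({
                    "action": action,
                    "type": entity_type,
                    "value": entity["name"],
                })
    return actions


sample = {
    "DATE": [{"name": "next Monday", "score": 91}],
    "PHONE_NUMBER": [{"name": "+1-555-0123", "score": 60}],
    "EMAIL": [{"name": "john@example.com", "score": 88}],
}
actions = build_actions(sample)
for item in actions:
    print(item)
```

Relative dates such as "next Monday" still need to be resolved against the call date before a calendar event can be created.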

Entity Extraction Patterns

Contact Information Pattern

def extract_call_contact_info(transcript_id, token):
    """Extract contact information to update CRM"""
    extractor = EntityExtractor("https://api.example.com", token)
    entities = extractor.extract_entities(transcript_id)
    
    return {
        'customer_name': entities['PERSON'][0] if entities['PERSON'] else None,
        'phone': entities['PHONE_NUMBER'][0] if entities['PHONE_NUMBER'] else None,
        'email': entities['EMAIL'][0] if entities['EMAIL'] else None,
        'company': entities['ORGANIZATION'][0] if entities['ORGANIZATION'] else None,
        'location': entities['LOCATION'][0] if entities['LOCATION'] else None
    }

Financial Transaction Pattern

def extract_financial_data(transcript_id, token):
    """Extract financial transaction details"""
    extractor = EntityExtractor("https://api.example.com", token)
    entities = extractor.extract_entities(transcript_id)
    
    return {
        'amounts': entities['MONEY'],
        'percentages': entities['PERCENTAGE'],
        'dates': entities['DATE'],
        'persons': entities['PERSON']
    }

Event Scheduling Pattern

def extract_appointment_info(transcript_id, token):
    """Extract appointment scheduling information"""
    extractor = EntityExtractor("https://api.example.com", token)
    entities = extractor.extract_entities(transcript_id)
    
    return {
        'date': entities['DATE'][0] if entities['DATE'] else None,
        'time': entities['TIME'][0] if entities['TIME'] else None,
        'person': entities['PERSON'][0] if entities['PERSON'] else None,
        'location': entities['LOCATION'][0] if entities['LOCATION'] else None
    }

PII (Personally Identifiable Information) Handling

Sensitive Entity Types

{
  "sensitive_entities": [
    {
      "type": "PERSON",
      "is_pii": true,
      "encryption_required": true
    },
    {
      "type": "PHONE_NUMBER",
      "is_pii": true,
      "encryption_required": true
    },
    {
      "type": "EMAIL",
      "is_pii": true,
      "encryption_required": true
    }
  ]
}
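Before extracted entities are logged or stored, the sensitive types listed above can be masked on the client side. This is a minimal sketch and not the platform's own redaction feature; `PII_TYPES`, `mask_value`, and `mask_pii` are hypothetical names, and keeping the last two characters visible is an arbitrary choice.

```python
# Entity types treated as PII, taken from the list above
PII_TYPES = {"PERSON", "PHONE_NUMBER", "EMAIL"}


def mask_value(value, visible=2):
    """Replace all but the last `visible` characters with asterisks."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def mask_pii(entities):
    """Return a copy of the entities dict with PII names masked."""
    masked = {}
    for entity_type, items in entities.items():
        if entity_type in PII_TYPES:
            masked[entity_type] = [
                {**item, "name": mask_value(item["name"])} for item in items
            ]
        else:
            # Non-PII entities are copied unchanged
            masked[entity_type] = [dict(item) for item in items]
    return masked


sample = {
    "PERSON": [{"name": "John Smith", "score": 95}],
    "PHONE_NUMBER": [{"name": "+1-555-0123", "score": 90}],
    "MONEY": [{"name": "$50", "score": 90}],
}
masked = mask_pii(sample)
print(masked)
```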

Best Practices for PII

Data Protection
- Encrypt sensitive entities at rest
- Use HTTPS for transmission
- Implement access controls
- Audit access logs

Retention
- Define retention policies
- Automatically delete data after the retention period
- Comply with privacy regulations
- Document data handling

Compliance
- GDPR: Right to be forgotten
- CCPA: Consumer privacy rights
- HIPAA: Healthcare data protection
- Industry-specific regulations

Entity Statistics

Generate Entity Report

def generate_entity_report(date_from, date_to, token):
    """Generate report on extracted entities"""
    api_base = "https://api.example.com"
    headers = {"Authorization": f"Bearer {token}"}
    
    # Get all transcripts in date range
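    response = requests.get(
        f"{api_base}/api/v1/transcripts",
        headers=headers,
        params={
            'DateFrom': date_from,
            'DateTo': date_to,
            'Rows': 1000
        },
        timeout=30
    )
    # Fail early on HTTP errors instead of aggregating an error body
    response.raise_for_status()
    
    transcripts = response.json().get('data', [])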
    
    # Aggregate entity statistics
    entity_stats = {
        'PERSON': {},
        'ORGANIZATION': {},
        'LOCATION': {},
        'PHONE_NUMBER': 0,
        'EMAIL': 0,
        'MONEY': [],
        'total_entities': 0
    }
    
    for transcript in transcripts:
        keywords = transcript.get('keywords', [])
        for keyword in keywords:
            entity_type = keyword.get('type', 'OTHER')
            entity_name = keyword.get('name')
            
            if entity_type in ('PERSON', 'ORGANIZATION', 'LOCATION'):
                # Count mentions per distinct name
                counts = entity_stats[entity_type]
                counts[entity_name] = counts.get(entity_name, 0) + 1
            elif entity_type in ('PHONE_NUMBER', 'EMAIL'):
                # Keep only a total so contact details stay out of the report
                entity_stats[entity_type] += 1
            elif entity_type == 'MONEY':
                entity_stats['MONEY'].append(entity_name)
            
            entity_stats['total_entities'] += 1
    
    return entity_stats

Visualization

Entity Frequency

Apple Inc:           ████████░ 45 mentions
John Smith:          ██████░░░ 32 mentions
San Francisco:       █████░░░░ 28 mentions
Microsoft:           ████░░░░░ 22 mentions
Jane Doe:            ██░░░░░░░ 12 mentions

Entity Type Distribution

PERSON:         ███████░░ 35%
ORGANIZATION:   ██████░░░ 30%
LOCATION:       ████░░░░░ 20%
MONEY:          ██░░░░░░░ 10%
PHONE_NUMBER:   ░░░░░░░░░ 2%
OTHER:          ░░░░░░░░░ 3%
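Charts like the ones above can be produced from plain counts, for example the per-name counts in the entity report. The following is a rough sketch: `render_bars` is a hypothetical helper, and it scales each bar relative to the largest count, which may differ from how the sample charts were scaled.

```python
def render_bars(counts, width=9, unit="mentions"):
    """Render a text bar chart from a {label: count} mapping."""
    if not counts:
        return ""
    peak = max(counts.values())
    label_width = max(len(name) for name in counts) + 1
    lines = []
    # Largest counts first
    for name, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        # Integer scaling relative to the largest count (assumed convention)
        filled = (width * count) // peak if peak else 0
        bar = "█" * filled + "░" * (width - filled)
        lines.append(f"{(name + ':').ljust(label_width)} {bar} {count} {unit}")
    return "\n".join(lines)


chart = render_bars({"Acme": 10, "Bob": 5})
print(chart)
```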

Troubleshooting

Entities Not Extracted

Insufficient speech content
Poor audio quality affecting transcription
Entity type not supported
Low confidence scores

Incorrect Extraction

Similar sounding names
Multiple entity references
Acronyms and abbreviations
Specialized terminology

PII Not Masked

Entity type may not be classified as PII
Requires explicit configuration
Check redaction settings

Next Steps

Sentiment Analysis - Emotional analysis
Topic Detection - Topic identification
Summarization - Call summaries
Translation - Language translation
