Live perception

Watch your camera and audio become structured events

No audio/video is sent to external LLM providers
[Live demo widget: an Emotion Analysis panel scores eight emotions (anger, contempt, disgust, fear, happy, neutral, sadness, surprise) in real time, and a Content Scanning panel streams live video captions.]
Backed by Y Combinator

What is Pinch?

Pinch is a perception stack that extracts meaning from raw audio and video.

From chaos to structure

Raw sensor data (pixels, waveforms, noise) becomes structured events: emotion, speech, sound understanding, person engagement, environment analysis, and more.
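To make "structured events" concrete, the sketch below shows one way such a payload could be typed in TypeScript. The PerceptionEvent type, its field names, and the event kinds are illustrative assumptions, not Pinch's published schema.

```typescript
// Illustrative sketch only: this type and these field names are assumptions,
// not Pinch's published event schema.
type PerceptionEvent =
  | {
      kind: "emotion";
      timestampMs: number;
      label:
        | "anger" | "contempt" | "disgust" | "fear"
        | "happy" | "neutral" | "sadness" | "surprise";
      confidence: number; // 0..1
    }
  | { kind: "speech"; timestampMs: number; text: string; speaker?: string }
  | { kind: "engagement"; timestampMs: number; score: number }; // 0..1

// Consumers can switch on `kind` to react to each event type.
function handleEvent(event: PerceptionEvent): void {
  switch (event.kind) {
    case "emotion":
      console.log(`${event.label} (${Math.round(event.confidence * 100)}%) at ${event.timestampMs} ms`);
      break;
    case "speech":
      console.log(`"${event.text}" at ${event.timestampMs} ms`);
      break;
    case "engagement":
      console.log(`engagement ${event.score.toFixed(2)} at ${event.timestampMs} ms`);
      break;
  }
}
```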

Built for real-time or post-analysis

Use Pinch to give your agents awareness. Let them see when someone smiles, hear when they're confused, know when to respond. Or use Pinch to analyze your media library after the fact.

How it works

Power real-time agents or analyze your entire media library

Real-time

Give your agents awareness

Perfect for live calls, customer interactions, and agent-driven experiences

1. Stream live audio/video

Connect your call, meeting, or camera feed through our SDK

2. Get instant perception events

Receive emotion, engagement, and speech as structured events over WebSocket

3. React in the moment

Let your agent detect when a customer hesitates, when a student is confused, or when tone shifts, as in the sketch below
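A minimal sketch of that loop, assuming a WebSocket endpoint that pushes JSON events: the URL, the token query parameter, and the payload shape are hypothetical placeholders, not the actual Pinch SDK.

```typescript
// Sketch under assumptions: the endpoint URL, token parameter, and JSON payload
// shape are hypothetical placeholders; the real Pinch SDK may differ.
const socket = new WebSocket("wss://example-pinch-endpoint/perception?token=YOUR_API_KEY");

socket.addEventListener("open", () => {
  // In a real integration, the SDK streams your call, meeting, or camera feed here.
  console.log("perception session open");
});

socket.addEventListener("message", (message: MessageEvent<string>) => {
  // e.g. { kind: "emotion", label: "happy", confidence: 0.92, timestampMs: 1234 }
  const event = JSON.parse(message.data);

  // React in the moment: surface a strong signal to the agent as soon as it arrives.
  if (event.kind === "emotion" && event.confidence > 0.8) {
    console.log(`detected ${event.label}; let the agent decide how to respond`);
  }
});
```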

Async

Index and search your library

Perfect for post-production, compliance review, and training analysis

1. Send your media files

Upload recordings, archives, or raw footage via API

2. We extract everything

Every visual, every sound, every emotional beat becomes searchable

3. Search by meaning, not metadata

"Show clips where the rep sounded confident" — get exact timestamps back

Ready to build?

Get started with Pinch today

Talk to an engineer