Speak to every visitor in their first language.
Multilingual delivery is where AI changes the math most. Legacy studio production multiplies cost and time with every language added. Convo generates up to nine more languages from one approved English source: same script, regenerated and re-voiced in about a minute. The museum can finally serve non-English-speaking visitors at scale without re-recording.
Book a demo
Forty-plus languages,
from one approved source.
Convo supports more than forty languages; each institution activates up to ten at a time and can swap the set whenever it likes. The cap isn’t a capability limit — it exists because no institution can realistically vet forty languages, and we’d rather you publish ten you’ve checked than forty you haven’t. You write and approve the tour once, in English. The rest come from that source — same stops, same edits, re-voiced in roughly a minute whenever the English changes. Multilingual stops being a budget line item and starts being the default.
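The activation rule described above can be sketched as a small validation step. This is an illustrative sketch only, not Convo's actual API: the function name, the language codes, and the choice to count English toward the ten are all assumptions.

```python
MAX_ACTIVE = 10  # the cap stated in the text; whether English counts toward it is assumed here


def activate_languages(requested, supported):
    """Return the validated active set, or raise ValueError.

    `requested` is the list of language codes an institution wants live;
    `supported` is the full catalogue (forty-plus in the text).
    Both names are hypothetical.
    """
    # Drop duplicate codes while keeping the order the institution chose.
    unique = list(dict.fromkeys(requested))

    unknown = [code for code in unique if code not in supported]
    if unknown:
        raise ValueError(f"unsupported languages: {unknown}")

    if len(unique) > MAX_ACTIVE:
        raise ValueError(
            f"at most {MAX_ACTIVE} languages can be active, got {len(unique)}"
        )
    return unique
```

Swapping the set is then just another call with a different list; nothing about the previous selection needs to be undone first.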
Translation that respects
the cultural register.
Current models handle literal accuracy, register, and the parallel structure of an audio script well. What still needs human judgment is sacred and religious terminology, named-entity transliteration, and region-specific cultural framing — decisions, not errors. A reviewer who reads the target language edits the output directly. The platform makes review cheap; it doesn’t pretend the review is unnecessary.
Re-voice in seconds,
not weeks.
A curator finishes a round of edits on the English script at 10:14. They press regenerate. By 10:15 the same stop exists in ten languages, voiced in the institution’s chosen voice for each. No studio is booked. No talent is contracted. No translator is emailed a copy of the new file. The change propagates because the source did — and the same loop handles seasonal updates, traveling exhibitions, and pieces rotating off the wall.
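One way to picture "the change propagates because the source did" is to fingerprint the English script and refresh any language whose stored fingerprint no longer matches. The sketch below is a guess at the shape of such a loop, not a description of how Convo is built: `Stop`, `regenerate`, and the `translate` callable are hypothetical names, and voice synthesis is left out for brevity.

```python
import hashlib
from dataclasses import dataclass, field


def fingerprint(text: str) -> str:
    """Stable hash of the source script, used to detect edits."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class Stop:
    english: str
    # Maps language code -> (fingerprint of the source used, translated text).
    renditions: dict = field(default_factory=dict)


def regenerate(stop, languages, translate):
    """Refresh every rendition that is missing or built from an older source.

    `translate(text, lang)` is a placeholder for whatever produces the
    target-language script; it is passed in so the sketch stays runnable.
    Returns the list of languages that were refreshed.
    """
    current = fingerprint(stop.english)
    refreshed = []
    for lang in languages:
        cached = stop.renditions.get(lang)
        if cached is None or cached[0] != current:
            stop.renditions[lang] = (current, translate(stop.english, lang))
            refreshed.append(lang)
    return refreshed
```

With this shape, pressing regenerate twice in a row does no extra work: the second call finds every fingerprint already current and returns an empty list.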
The visitor’s phone,
in their language.
The language picker lives in the visitor web app and follows them across stops. A family with a Korean-speaking grandparent and an English-speaking grandchild can hand the phone back and forth and the tour state persists. Answers from the visitor guide come back in whatever language the question was asked in. Multilingual narration is the first interpretation layer that meets a non-English-reading visitor where they are.
No per-language fees.
Ever.
Studio production charges per language per minute of finished audio — adding Spanish doubles the bill, adding ten multiplies it. Convo doesn’t charge per language. The ten languages come bundled, and adding or updating them after launch doesn’t trigger a new production cycle. The cost lives in the platform, not the language count. You no longer choose which exhibits get translated. They all do.
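The difference between the two pricing shapes is simple arithmetic. The sketch below uses hypothetical function names and placeholder figures; no real rates are implied for either studios or Convo.

```python
def studio_cost(rate_per_minute, minutes, languages):
    """Per-language, per-minute pricing: the bill scales with language count."""
    return rate_per_minute * minutes * languages


def platform_cost(flat_fee, languages):
    """Flat platform pricing: the language count does not enter the total."""
    return flat_fee


# Placeholder numbers, chosen only to show the shape of each model.
rate, minutes, flat_fee = 100, 30, 5000

english_only = studio_cost(rate, minutes, languages=1)
with_spanish = studio_cost(rate, minutes, languages=2)
assert with_spanish == 2 * english_only  # adding Spanish doubles the bill

# The flat model returns the same figure for one language or ten.
assert platform_cost(flat_fee, languages=1) == platform_cost(flat_fee, languages=10)
```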