AUTHORING

How a Convo tour
actually gets made.

The curator’s seat. What you bring, what the draft looks like, what the editor is for, and how a tour gets from idea to live — multilingual, on the wall — in two to four weeks. Without an IT ticket, a studio booking, or a producer in the loop.

THE WORKFLOW

From your materials to a published tour,
in days not quarters.

The shape of authoring on Convo is the same shape every institution settles into within a week of starting. You gather the reference materials you already have (wall text, catalog notes, exhibition essays) and upload them. The platform drafts each stop in the voice you have asked for, grounded in those sources. A curator reads through and edits where the script is wrong or where it just doesn’t sound like the institution. You hit publish. The platform voices every stop in ten languages and pushes the tour to the QR card on the wall.

Three things matter about that loop. First, no part of it requires external production. There is no studio booking, no audio engineer, no translation vendor in the chain. Second, the curator never leaves the seat — the AI drafts, but the editorial decision lives with a person who knows the collection. Third, the loop runs in days, and it keeps running after launch. Edits are not a project; they are a workflow.

Most institutions get their first gallery live within two weeks of kickoff. Larger rollouts — twelve galleries, three voices, simultaneous launch — push to about four. The boring observation underneath that timeline is that the meeting cadence is the bottleneck, not the technology. If your curator can read a twelve-stop draft and give notes inside of three days, you are ready. The full operational picture is on the implementation page.

WHAT YOU BRING

What you bring.

The honest answer is: less than you would think, and probably less than you have already written. A Convo tour needs source material the institution can stand behind — the same evidence base a curator would use to write a wall text or a catalog entry. Concretely, that usually looks like one or more of: the wall texts and gallery panels you have already produced, the catalog entries from your collection database (a CSV export is fine), the exhibition essay or the curator’s working notes, any scholarly articles or press materials you regularly cite, and the voice sample we use to match cadence — often a thirty-minute clip of an existing tour or a curator reading aloud.

Some material makes the draft come out cleaner. Anything written for a public reader (wall text, exhibition essay, audio guide scripts for previous tours) translates directly into the cadence of an audio stop. Anything written for an academic reader (full footnotes, dense paragraphs, internal taxonomy codes) works as grounding but produces a draft that needs a heavier curator pass. Image-only material does not yet feed authoring at all; the tour script is written from text. (Visitors can point a phone camera at an object and ask the guide about it. That is a separate visitor-facing capability, covered on the visitor Q&A page, and it does not feed authoring.)

What you do not need: a finished script, a clean folder structure, a translation vendor, or a sound studio. The most common preparation mistake is for a team to delay kickoff while they tidy up their reference library. Don’t. Hand us the messy folder. The first day is partly us walking through what you have and deciding what to use. That conversation is faster than the tidying would have been.

THE DRAFT AND THE EDITOR

How the draft happens,
and why it’s a draft.

The mechanics are not mysterious. The model writes each stop using a retrieval pass over the materials you uploaded — it pulls the relevant passages, drafts a stop in the voice you chose, and shows its working. It is instructed to write only from those passages, and to decline when it cannot ground a claim rather than fill in. That grounding rule is the most important sentence in this paragraph. The draft is not the model’s opinion of your collection; it is the model’s rendering of your own materials, in audio cadence.
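For readers who think in code, the grounding rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Convo's implementation: the `Passage` type, the `retrieve` and `draft_stop` functions, the word-overlap matching, and the file name are all hypothetical, and a real system would hand the retrieved passages to a language model rather than joining them.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str  # hypothetical file name, e.g. "wall_text.pdf"
    text: str


def retrieve(query: str, library: list[Passage], min_overlap: int = 2) -> list[Passage]:
    """Toy retrieval: keep passages that share at least `min_overlap` words with the query."""
    query_words = set(query.lower().split())
    hits = []
    for passage in library:
        overlap = query_words & set(passage.text.lower().split())
        if len(overlap) >= min_overlap:
            hits.append(passage)
    return hits


def draft_stop(topic: str, library: list[Passage]) -> dict:
    """Build a draft only from retrieved passages; decline when nothing grounds the topic."""
    passages = retrieve(topic, library)
    if not passages:
        # Nothing in the uploaded materials supports this topic, so decline rather than fill in.
        return {"status": "declined", "script": None, "sources": []}
    # Assumption: stand-in for the model call. Here the "draft" is just the joined passages.
    script = " ".join(p.text for p in passages)
    return {"status": "drafted", "script": script, "sources": [p.source for p in passages]}
```

Asking this sketch for a topic the library covers returns a draft with its sources listed; asking for a topic it does not cover returns a decline with no script. That is the behavior the grounding rule describes.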

Then the editor opens. The script lives in a per-stop editor inside the admin — you can rewrite any line by hand, ask the AI for a specific change (shorter, softer, lead with the back of the sculpture, fold in this catalog quote), or sit between the two. When the AI proposes a rewrite, you see a diff against your current script before accepting it. Nothing edits itself. The stop has a three-status workflow — draft, in review, published — so a curator and an editor can move stops down a pipeline rather than emailing Word docs back and forth. A stop drawer on the side surfaces the cover image, any pre-staged image overrides, the script, and the metadata that drives the player.
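The two editor rules above, a diff before any AI rewrite is accepted and a three-status pipeline, can be sketched as follows. This is a minimal sketch, not the admin's actual code: the `Status` enum, the `ALLOWED` transition table, and the function names are hypothetical, and the specific transitions (for example, sending a published stop back to draft) are assumptions rather than documented behavior. The diff uses Python's standard `difflib` module.

```python
import difflib
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    IN_REVIEW = "in review"
    PUBLISHED = "published"


# Assumed transition table: which status a stop may move to from each status.
ALLOWED = {
    Status.DRAFT: {Status.IN_REVIEW},
    Status.IN_REVIEW: {Status.DRAFT, Status.PUBLISHED},
    Status.PUBLISHED: {Status.DRAFT},
}


def advance(current: Status, target: Status) -> Status:
    """Move a stop to `target`, refusing any move the table does not allow."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


def propose_diff(current_script: str, proposed_script: str) -> list[str]:
    """Return a unified diff of the proposed rewrite against the current script."""
    return list(
        difflib.unified_diff(
            current_script.splitlines(),
            proposed_script.splitlines(),
            fromfile="current",
            tofile="proposed",
            lineterm="",
        )
    )


def review(current_script: str, proposed_script: str, accept: bool) -> str:
    """Nothing edits itself: the script changes only when a person accepts the proposal."""
    return proposed_script if accept else current_script
```

The design point is that `review` takes an explicit `accept` decision, so a proposed rewrite never replaces the current script on its own, and `advance` makes skipping review (draft straight to published) an error rather than a shortcut.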

The discipline this enforces is the one that matters editorially: every line a visitor hears was approved by a curator who knows the material. The AI saves the time you would have spent at the blank page. It does not replace the judgment about what to say. When a stop reads off, the fix is upstream: you correct the source material the draft was grounded in, regenerate, and the same correction propagates the next time anyone asks about that object. On the visitor side, the same grounding rule applies to Q&A, which is built on the same source library and has its own page: what a visitor can ask the guide.

AFTER LAUNCH

How updates work after launch.

This is the section that surprises curators who have lived under the studio production model. After launch, an edit to a stop — a corrected date, a softened sentence, a new line about a piece that just came in — takes about a minute to propagate. You open the stop, edit the line, hit publish. The platform re-voices that stop in all ten languages and pushes the change to the QR card on the wall. A visitor who scans the code five minutes later hears the corrected line.

Compare that to what the same correction looks like in a studio workflow. A factual error in a permanent collection means a script change, a translator pass for each language, a re-record in each voice booth, a re-master, and a redeployment to the handsets. The published cost analyses put a single corrected line on a legacy audio guide in the four-figure range and the calendar in weeks. On Convo, the same correction costs a curator’s attention for a minute. That single asymmetry, one minute against weeks, is the entire operational argument for moving authoring in-house.

The three-status workflow holds after launch too. Most institutions give one curator publish authority and route proposed changes through review before they go to visitors. If a change is sensitive (a contested attribution, a politically loaded label), that pause is exactly where you want it. The same workflow handles seasonal updates, traveling exhibitions, and pieces rotating off the wall; the operational pattern is on the operations pillar. And because the translation pass is one of the things the platform does for you, the language layer keeps pace, as covered in detail on the multilingual page.

WHERE THIS DOESN’T FIT

Where this workflow doesn’t fit.

I should be honest about the cases where this is not the right answer. The first is the permanent collection that genuinely is not going to change for a decade. If you are a small institution with one tour you wrote in 2018 that you fully expect to leave alone until the next director, the case for an editable, multilingual authoring platform is weaker — you are paying for an update cycle you are not going to use. A printed transcript and a hardware handset rental will probably serve you. (We will say so on the call.)

The second is the production where a named voice is itself the curatorial point. If you have commissioned an artist or a poet to record a tour, or if a specific actor’s voice is the reason a visitor would put the headphones on, the authoring model here is the wrong instrument. The AI voices we use are good — good enough that most visitors do not notice — but they are not a named person. Some institutions run both: a marquee artist-voiced tour in the headline exhibition, and Convo across the rest of the building. That is fine, and we will tell you when we think it’s the right call.

The third is the operational case: you are mid-contract on a hardware handset fleet with two years still on the lease. Switching now means writing off the depreciation. The realistic move is to pilot Convo somewhere that is not on the handset fleet (a temporary gallery, a traveling show, a kids’ program) and let the contract age out. The pilot timeline on the implementation page is built for exactly that kind of sideways start. If any of these cases describes you and you’d still like a second opinion, the call is free and I will be straight about it. Pricing is on the pricing page and the category context is on the AI audio guides resource hub.

What curators see after the tour is on the wall — the visitor questions, the themes, the gaps in the source material that show up only once real people start asking — is its own page: insights from a published tour.

COMMON QUESTIONS

What curators ask about authoring.

Two to four weeks for a single gallery, depending on how organised your reference materials are and how many reviewers you want in the loop. The first week is drafting and your read-through; the second is voicing, translation, and a soft launch on the floor. If your materials are scattered and you want three curators to weigh in, push it to four. That is still measured against the studio-production baseline of four to nine months.
No, but it adds a few days. The first thing we do on kickoff is sit with you and pull together what you already have — wall text PDFs, exhibition essays, the curator’s working notes, the catalog entries you wrote two years ago. You don’t need to clean any of this up first. Messy is fine. We are looking for source material, not a finished script.
Two safeguards. First, the draft is grounded only in materials you uploaded — the model is not pulling from the open internet, and it is instructed to decline when it cannot ground an answer. Second, every line a visitor hears was approved by a curator in the editor. The script is not a prediction; it is an artifact your curator signed off on. If a visitor question later surfaces something off in the source, you correct the source, regenerate, and republish.
Yes. The script is a living document. Edit a stop, re-voice it across all ten languages, and republish — the change is on the visitor’s phone within a minute. There is no studio queue. This is the part of the workflow that most surprises curators who have lived under the legacy production model.
Yes. Multiple staff can be invited to the admin and work on the same tour at the same time. Convo does not yet route specific stops to specific reviewers inside the admin; most institutions handle that by assigning galleries to curators on their side and using the three-status workflow (draft, in review, published) as the handoff. If you need formal per-language reviewer routing, treat that as a process step for now and let us know. It is on the roadmap.
You do. Your reference materials, your scripts, your audio, your visitor analytics. We don’t train models on your content, and we don’t share it with other institutions. You can export scripts and per-stop audio in plain formats at any time.
START WITH ONE GALLERY

Bring the messy folder.
We’ll take it from there.