Building and running audio tours without a studio.

KEY TAKEAWAYS

The work an AI platform removes is studio production: drafting drudgery, voice talent scheduling, translation re-recording, and device deployment. The work it does not remove — and in some ways intensifies — is curator review attention and source-material discipline.
The lifecycle of an AI-narrated tour is short and repeatable: source materials in, draft, review, voice, publish, update, measure. The same cycle covers a permanent gallery, a temporary exhibition, and a multi-site rollout, at different scales.
The single biggest predictor of tour quality on an AI platform is the quality and organization of the reference materials the curator hands the platform. Messy inputs produce drafts that need more editing than they save.
A curator's day on a working program looks less like "writing audio scripts" and more like "editing, approving, and answering visitor-question signal." The interpretation team's labor doesn't disappear; it shifts toward judgment and away from production.
Multi-tour and multi-site management gets meaningfully easier when production cost per tour collapses — but governance gets harder. Who owns voice, who approves edits, and how cross-site standards stay coherent are the questions that replace "can we afford this."
This pillar exists because we kept hearing the same operational questions on calls: how do tours actually get made, how often do they get updated, what does the curator's week look like, how do you run a portfolio of tours across sites. This hub answers them; the spokes go deeper on each.

The category-level argument for AI-narrated audio guides — collapsed production cost, multilingual by default, same-day updates — is, by now, well-rehearsed in vendor decks. The harder question, the one I get from curators and visitor-experience directors after the demo is over, is operational. What does it actually look like to run a program like this? What does the curator do on a Tuesday? How often do tours change? What breaks at the seams between exhibitions, between sites, between languages?

This is the hub for the operations pillar. The pieces in this pillar are about the work that doesn't go away when you move off studio production — and the work that does. I've tried to be specific about both, because the institutions that have run aground with AI audio platforms have done so for operational reasons, not technical ones.

What does the lifecycle of an AI-narrated tour actually look like?

Source materials in, draft, review, voice, publish, update, measure — and then the same cycle, again, for the next tour or the next correction. The shape is short and repeatable, and once an interpretation team has run it twice they tend to internalize it as one motion rather than seven steps.

The seven beats, with the realistic time each takes on a working platform:

1. Source materials in. A curator uploads reference files: catalog notes, wall text, exhibition essays, label copy, a CSV from the collections management system. Minutes.
2. Draft. The platform generates a first-pass script for each stop, grounded in those materials. Seconds to a minute or two per stop.
3. Review. The curator reads, edits, and approves. This is where the time goes. Realistic budgets are five to fifteen minutes per stop for a substantive review pass, longer for new exhibitions.
4. Voice. Approved text is voiced by neural TTS. Seconds per stop.
5. Publish. Status flips from draft to published; visitors hear the new version on the next QR scan.
6. Update. A correction, a reframing, or an added stop re-enters at step 3 and re-renders.
7. Measure. Listen to what visitors actually engaged with, where they dropped off, and what they asked about. Feed it back into the next round of edits.

The first five steps used to take months. Now they take a curator's afternoon for a small tour, or a couple of weeks of part-time review work for a full permanent collection. The bottleneck has moved from studio scheduling to curator review — which is exactly where it should be.
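To make the review budget concrete, here is a back-of-envelope sketch in Python. It only multiplies a stop count by the five-to-fifteen-minute range quoted above; the function name and the twelve-stop example are illustrative assumptions, not a feature of any platform.

```python
def review_hours(stops: int, min_per_stop: float = 5.0,
                 max_per_stop: float = 15.0) -> tuple[float, float]:
    """Return (low, high) curator review time in hours for one review pass."""
    if stops < 0:
        raise ValueError("stops must be non-negative")
    return (stops * min_per_stop / 60.0, stops * max_per_stop / 60.0)


# A small tour of twelve stops, using the per-stop range from the list above.
low, high = review_hours(12)
print(f"12 stops: {low:.1f} to {high:.1f} hours of review")
# prints "12 stops: 1.0 to 3.0 hours of review"
```

The estimate covers a single substantive pass; new exhibitions and second-reader review would push it higher.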

For the deeper read on what changes when steps two and four go from months to seconds, the Pillar 1 spoke on AI audio guide vs traditional audio guide covers the dimension-by-dimension comparison.

What does a curator's week actually look like on this kind of program?

Less writing from scratch, more editing and approving. More attention to source organization at the start of a project, and to visitor-question signal at the end. The labor doesn't disappear; it shifts upstream and downstream of the part the platform now does.

On a working program, a curator's recurring work on the audio tour typically falls into four pots:

Editing the draft. The largest single time sink. The first AI draft is a starting point; turning it into the institution's voice is real work. Curators who run this well treat the draft the way an editor treats a first manuscript — line by line, keeping what works, rewriting what doesn't.
Approving and publishing. Short, but ritualized. A clear approval step keeps the institution honest about what's reaching visitors. On Convo this is a status change from review to published; on every serious platform there's an equivalent.
Responding to visitor signal. Once tours are live, a steady trickle of feedback: visitor questions the guide didn't answer well, drop-off points, suggested prompts that did or didn't resonate. The curator decides what to feed back into the script.
Same-day corrections. Attribution fixes, deaccessions, updated dates, reframed introductions. On the legacy model these queued indefinitely; on AI platforms they belong to a Tuesday. See the spoke on same-day museum tour updates for the operational discipline this requires.

The shift, summarized: less time on "producing the audio," more time on the curatorial work that was always supposed to be the point. Most curators I've talked with describe this as a relief; a few describe it as exposing.

How should reference materials be prepared?

Organized, current, and labeled at the object level — not a folder of PDFs labeled "tour stuff." The single biggest predictor of how good the first AI draft will be is how well-structured the source materials are when they go in. This is the operational discipline most teams underestimate going in, and the one most often cited as a regret after the first tour.

A reference set that produces good drafts looks like this:

One object, one record. Each stop in the tour corresponds to a clear object (or grouping) with a defined identifier. When the platform asks "what do you have on this object?", the answer should be a coherent set of materials, not a search through a shared drive.
A few short documents beat one long one. Catalog entry, wall text, exhibition essay excerpt, a paragraph of curator notes: each in its own file or field. You'll feel the difference in draft quality.
Current, not archival. Out-of-date attributions, old deaccessions, superseded labels — these get drafted into tours if they're in the source set. An audit before the first draft saves a lot of editing on the back end.
Voice samples. A paragraph or two of writing in the institution's intended voice — a recent wall card, a past audio script, a director's framing — lets the draft target the right register.
A note on what's off-limits. Living artists' wishes, contested provenance language, donor sensitivities. Spell out the constraints up front.

Teams that take a week to clean up reference materials before generating the first draft routinely save themselves a month of editing later. This is the cheapest leverage in the whole workflow.
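One way to picture "one object, one record" is as a small data structure with a pre-draft audit. The sketch below is a hypothetical schema, not any platform's import format: the field names, the `audit` helper, and the 365-day staleness threshold are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class ObjectRecord:
    """Source set for one tour stop (hypothetical schema)."""
    object_id: str
    documents: dict[str, str] = field(default_factory=dict)  # e.g. {"wall_text": "..."}
    last_reviewed: Optional[date] = None                     # when sources were last checked
    off_limits: list[str] = field(default_factory=list)      # constraints to honor


def audit(record: ObjectRecord, today: date, max_age_days: int = 365) -> list[str]:
    """Return problems worth fixing before generating a first draft."""
    problems = []
    if not record.object_id.strip():
        problems.append("missing object identifier")
    if not record.documents:
        problems.append("no source documents attached")
    if record.last_reviewed is None:
        problems.append("never reviewed for currency")
    elif (today - record.last_reviewed).days > max_age_days:
        problems.append("source set may be out of date")
    return problems


rec = ObjectRecord("1987.12.3", {"wall_text": "..."}, last_reviewed=date(2020, 1, 15))
print(audit(rec, today=date(2024, 6, 1)))
# prints ['source set may be out of date']
```

Running a check like this across every record before the first draft is the kind of audit the list above describes; the real thresholds are an editorial decision.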

What does the review workflow actually need to look like?

Three statuses — draft, review, published — with a defined owner at each stage, and an audit trail of who changed what when. Anything less collapses into "whoever opened it last." Anything more becomes ceremonial and people start routing around it.

The minimum viable review workflow for a working program:

Draft. A curator (or the platform on their behalf) produces a first pass. Visible only inside the institution. Edits don't need approval at this stage; iteration is cheap.
Review. A second pair of eyes — an interpretation lead, an editor, a head of curatorial — reads the draft against the source materials and against the institutional voice. Comments, requested changes, sign-off.
Published. Live to visitors via QR. Every change after this point is logged with who, when, and (ideally) why.

The platform's job is to make these statuses real — not folders, but states with permissions and an audit log. The institution's job is to decide which roles trigger which transitions, and to honor that decision under deadline pressure. The temptation, especially on small teams, is to skip the review step for "small" edits. That's how typos and accidental factual changes ship.
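"States with permissions and an audit log" can be sketched as a small state machine. This is a minimal illustration under stated assumptions: the role names, the transition table, and the `Stop` class are hypothetical, and a real platform will model this differently. Note that the table has no direct path from draft to published, which is the rule the paragraph above argues for.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Allowed transitions and the role permitted to trigger each one.
# Role names are placeholders; each institution decides its own mapping.
TRANSITIONS = {
    ("draft", "review"): "curator",
    ("review", "draft"): "reviewer",      # changes requested
    ("review", "published"): "reviewer",  # sign-off
    ("published", "draft"): "curator",    # reopen for an edit
}


@dataclass
class Stop:
    title: str
    status: str = "draft"
    audit_log: list[dict] = field(default_factory=list)

    def move(self, to: str, user: str, role: str, why: str = "") -> None:
        """Change status if the role is allowed; record who, when, and why."""
        required = TRANSITIONS.get((self.status, to))
        if required is None:
            raise ValueError(f"{self.status} -> {to} is not an allowed transition")
        if role != required:
            raise PermissionError(f"{role} cannot move {self.status} -> {to}")
        self.audit_log.append({
            "from": self.status, "to": to, "user": user, "why": why,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        self.status = to


stop = Stop("Gallery 3: entrance")
stop.move("review", user="ana", role="curator")
stop.move("published", user="lee", role="reviewer", why="sign-off")
print(stop.status, len(stop.audit_log))
# prints "published 2"
```

Because even a one-word fix has to pass through review, the "small edit" shortcut fails loudly instead of shipping silently.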

For more on the operational discipline that comes with same-day editing, the spoke on same-day museum tour updates goes deeper on version history, audit logs, and rollback.

How often should tours get updated?

Often enough that nothing wrong stays live for long, rarely enough that the canonical tour doesn't drift week to week. The cadence most working institutions converge on is "weekly drift, daily fixes" — rolling improvements to long-running tours plus same-day correction of anything factually wrong.

A useful way to think about update cadence is to separate three classes of change:

Factual corrections. Wrong date, wrong attribution, deaccessioned object, updated label. These should ship the day they're identified. There's no editorial argument for letting a known-wrong line stay live.
Editorial improvements. A better introduction, a smoother transition, a clearer phrasing. These are valuable but not urgent. Batching them weekly or biweekly avoids visitor whiplash on tours people are coming back to.
Structural changes. New stops, reordered tours, reframed narratives. These warrant the full review cycle — draft, review, publish — and should be treated as small projects, not edits.

The thing the platform makes possible is that any of these can happen without re-booking a studio. The thing the institution still owns is whether it should. On a healthy program, the curator's editorial judgment about what's worth changing is the binding constraint — not the production schedule, not the budget, not the calendar.
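The three classes of change map naturally onto a triage table. The sketch below is illustrative only: the enum, the policy strings (which paraphrase the list above), and the `triage` helper are assumptions, and classifying a given change remains the curator's judgment call rather than something code decides.

```python
from enum import Enum


class ChangeClass(Enum):
    FACTUAL = "factual correction"
    EDITORIAL = "editorial improvement"
    STRUCTURAL = "structural change"


# Shipping policy per class, paraphrasing the three classes described above.
POLICY = {
    ChangeClass.FACTUAL: "ship the same day",
    ChangeClass.EDITORIAL: "batch weekly or biweekly",
    ChangeClass.STRUCTURAL: "run the full draft, review, publish cycle",
}


def triage(changes: list[tuple[str, ChangeClass]]) -> dict[str, list[str]]:
    """Group pending change descriptions by the policy that applies to them."""
    grouped: dict[str, list[str]] = {policy: [] for policy in POLICY.values()}
    for description, change_class in changes:
        grouped[POLICY[change_class]].append(description)
    return grouped


pending = [
    ("fix attribution date on stop 4", ChangeClass.FACTUAL),
    ("smoother intro for stop 1", ChangeClass.EDITORIAL),
    ("add two stops in the east gallery", ChangeClass.STRUCTURAL),
]
for policy, items in triage(pending).items():
    print(policy, "->", items)
```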

How do you manage multiple tours at once?

The mental model that works is "tour as object, not project." A tour is a thing the institution maintains, with a defined owner, an editorial state, an analytics profile, and a place in the visitor's wayfinding. Once you have three or four tours running — a permanent collection, a current special exhibition, a children's track, a Spanish-language adaptation — the management discipline starts to matter as much as the production discipline.

The pieces that tend to scale poorly without intention:

Owners. Every live tour should have one named curator-of-record. Without that, "everyone's job" becomes "nobody's job" within a quarter.
Status visibility. Which tours are draft, which are in review, which are live. A dashboard, not a spreadsheet.
Voice consistency. As more tours go live under more curators, the institutional voice drifts unless someone is paying attention. A short style guide and a designated voice owner — usually the interpretation lead — keeps it coherent.
Naming and wayfinding. What visitors see on the wall and in the player should be legible. "Permanent Collection — North Wing," not "Tour A v3."
Decommissioning. Tours for past exhibitions, deprecated narratives, old language tracks. A clear process for taking things down keeps the visitor-facing program honest.

Most platforms in the category, ours included, expose this as a tour list with statuses and per-tour analytics. The discipline isn't the tooling; it's the editorial governance the tooling enables.

What changes when you're running tours across multiple sites?

The cost math collapses across sites the same way it collapses across languages — but the governance question gets harder. A historic-house network, a regional museum system, a university museum with branch sites: in the legacy model, per-site audio production cost was the binding constraint. In the AI model, the binding constraint is whether the program holds together across sites that operate semi-independently.

The questions that surface in every multi-site rollout I've seen:

Shared scripts or per-site? Some content travels (the institutional history, the common interpretive frame). Some doesn't (this gallery's specific objects). Deciding which is which is the work.
Central editorial control or distributed? Either model can work; the failure mode is having neither.
Per-site branding inside one institutional shell? Visitors should know which site they're at, while feeling the program is part of one institution.
Cross-site analytics. What's working at one location that should travel to another. What's failing at all sites and points to a frame that needs to be rethought centrally.

The dedicated spoke — multi-site museum tour management — walks through governance models, what should be shared and what shouldn't, and where the multi-site case breaks down. It's the next read if your institution has more than one address.

What about temporary and traveling exhibitions?

This is where the AI platform's economics most obviously dominate the legacy model. Traditional audio production timelines run on the order of months end-to-end for a custom mobile guide, which is simply not compatible with exhibitions that run twelve to sixteen weeks. The result, historically, was that temporary exhibitions either shipped without audio or borrowed a half-hearted track from a previous show.

On AI platforms, the temporary-exhibition workflow is the same lifecycle compressed:

Source materials from the exhibition team's existing labels and essays
Draft, review, voice, publish — typically inside two weeks of opening, sometimes less
Updates ship the same day if a curator catches something during install
When the show closes, the curator takes the tour down; nothing to recover from the floor

Traveling exhibitions add the wrinkle of multiple venue stops with different floor plans, slightly different selections, and venue-specific addenda (director welcomes, sponsor walls, gallery wayfinding). The shared-scripts/site-addenda pattern that works for multi-site institutions applies cleanly to traveling shows too. The deeper treatment is the spoke on audio guides for traveling exhibitions.
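The shared-scripts/site-addenda pattern can be sketched as a merge: one set of scripts maintained for the whole show, plus a per-venue stop order and venue-only text. Everything here is a hypothetical illustration, including the dictionary shapes, the stop identifiers, and the rule that venue text wins when both exist.

```python
def build_venue_tour(shared: dict[str, str], venue: dict) -> list[tuple[str, str]]:
    """Assemble one venue's ordered tour from shared scripts plus local addenda.

    shared: stop_id -> script text maintained once for the whole show.
    venue:  {"order": [stop ids on view here], "addenda": {stop_id: text}}.
    """
    tour = []
    for stop_id in venue["order"]:
        if stop_id in venue["addenda"]:
            text = venue["addenda"][stop_id]  # assumption: venue-specific text wins
        elif stop_id in shared:
            text = shared[stop_id]
        else:
            raise KeyError(f"no script for stop {stop_id!r}")
        tour.append((stop_id, text))
    return tour


shared = {
    "intro": "Shared introduction.",
    "obj-1": "Shared text for object 1.",
    "obj-2": "Shared text for object 2.",
}
# Venue B shows a slightly different selection and adds a director's welcome.
venue_b = {
    "order": ["welcome", "intro", "obj-2"],
    "addenda": {"welcome": "Director's welcome for venue B."},
}
tour = build_venue_tour(shared, venue_b)
print([stop_id for stop_id, _ in tour])
# prints ['welcome', 'intro', 'obj-2']
```

A stop missing from both sources raises an error, so a gap in a venue's selection surfaces before publish instead of on the gallery floor.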

The one operational caution worth naming: the speed of production tempts institutions to skip the curator-review step on temporary shows because "it's only twelve weeks anyway." That's the wrong inference. The standard for what reaches a visitor should be the same regardless of how long the show is up.

How do you handle outdoor, walking, and campus tours?

The lifecycle is the same; the wayfinding and trigger model change. Sources in, draft, review, voice, publish, update: all of it applies cleanly to a campus tour, a walking heritage trail, or a city tour. What changes is how visitors reach each stop: GPS triggers, posted QR codes, or numbered self-paced wayfinding. GPS works for outdoor stops with clear sky and broad spacing; it fails for closely spaced urban stops. QR-per-stop is more work to install but more reliable. The curator-side workflow is identical regardless of trigger. The spoke on audio tours for walking, campus, and city contexts walks through weather, group sync, and how the specialist outdoor-tour apps compare.

What does this make harder, not easier?

Honest section. Most of the category's marketing focuses on what AI makes easier; the operational reality is that it shifts work rather than eliminating it. Three things consistently get harder, or at least more intense, on AI-narrated programs:

Curator review attention. When production is cheap, more drafts land in the curator's inbox. The temptation is to push more tours through more quickly — and the editorial standard slips. Maintaining quality requires consciously not producing every tour the platform makes possible.
Source-material organization. Bad inputs were forgiven on the legacy model because a human scriptwriter would shape them anyway. On AI platforms, messy inputs produce messy drafts. The organizational work that used to be optional is now load-bearing.
Visitor-question signal triage. On conversational platforms (Convo is one), visitors ask questions, and that signal is genuinely useful — but only if someone is reading and acting on it. Aggregated questions across a museum are insight; ignored, they're just data.

There's also a quieter cost worth naming: the fluency of an AI draft can mask thin source material. A curator reading a smooth draft about an under-researched object can mistake the draft's confidence for the institution's. The fix is the same fix the field has always relied on — careful editing against primary sources — but the failure mode is new.

Where this operational model doesn't fit

A few cases where the operational shape of the AI-narrated model is the wrong fit, even if the production economics look good:

Institutions with no curatorial capacity to review. The model assumes a curator-as-editor in the loop. If there's no one to play that role, the platform's speed becomes a liability, not an asset.
Programs where the audio guide is the artifact. A named-voice tour, an artist-commissioned audio piece, an oral-history project. The lifecycle above is the wrong lifecycle for these.
Institutions with policy prohibitions on generative tooling. Funder agreements, state contracts, and some grant terms explicitly bar generative AI in published material. Read your agreements.
Very small, never-changing collections. Twelve objects, fixed wall text, no rotation. The operational benefits of "easy to update" don't apply when there's nothing to update.

For everything else — most working museums, most institutions running multiple tours, most multi-site programs, most temporary-exhibition calendars — the operational model holds.

Frequently asked questions

How long does it take to launch a first tour?

A small single-language tour — a dozen stops, well-organized source materials, one curator on it — can ship in days. A full permanent-collection rollout in one language usually takes one to three months of part-time curator review. Multilingual versions add review time per language but no additional production time on the platform side. The bottleneck is the editorial review pass, not the production tooling.

How many people does the interpretation team need to run a program like this?

The minimum viable team is one curator with edit-and-approve authority. Working programs usually have at least two people (a curator-of-record and a reviewer), plus access to a language-checker per non-English track. Larger institutions assign a coordination role across tours once they have more than three or four live. None of these are roles you have to hire for; they are roles the existing interpretation team takes on.

How is this different from a vendor agency producing tours for us?

A traditional vendor relationship typically prices per tour, owns the production pipeline, and ships an artifact you license. The AI platform model prices as SaaS, hands the production controls to your team, and keeps the institution as the editor of record on every change. The legacy model is "we make tours for you." The AI model is "you make tours, faster, with our tooling."

Who owns the visitor data?

In a reasonable contract, the institution does. Tour-completion data, popular-stop data, visitor-question themes, and language-distribution analytics belong to the museum. The platform processes them on the institution's behalf. Get this in writing before signing — vendors who hedge here are the wrong partners.

What happens to the program if we change platforms?

Ask in the demo. Reasonable vendors export your scripts, your audio files, and your analytics on request. Vendors who make this hard are betting on lock-in. Keep the source materials — catalog notes, wall text, exhibition essays — under your control and treat the platform as the production layer on top, not the system of record.

Continue reading

The four spokes in this pillar go deeper than the hub does:

Same-day museum tour updates — what changes about content cadence, and the operational discipline same-day editing requires.
Audio guides for traveling exhibitions — how short-run shows fit the AI model, and how shared content across venues works.
Multi-site museum tour management — governance, per-site branding, and cross-site analytics for institutions with more than one address.
Audio tours for walking, campus, and city contexts — what changes operationally when the tour leaves the building.

For the category-level argument these operational pieces sit on top of, start with the AI audio guides for museums hub. For the curator-facing product story, the curators page is the shortest version. For the founder's case for why "the audio guide is not the product" anymore, see the note on that subject. For recent product shifts, the changelog is the running record.

About the author

Eric Duffy is the founder of Convo, a platform that lets museums and cultural institutions publish multilingual audio tours their visitors can have a conversation with. He writes about how museums could afford to be more ambitious with interpretation, drawing on discovery conversations with curators, directors, and education leads at small and mid-size US museums. Reach him at eric@convo.app or on LinkedIn.

Pick one gallery.
Give us two weeks.
