PILLAR 06 · OPERATIONS

Audio guides for temporary and traveling exhibitions.

Why the studio-production math never penciled out for a ten-week show, how AI-narrated guides change the economics, and how to handle shared content, venue addenda, rights, and language regions on a touring exhibition.

ERIC DUFFY · FOUNDER · 11 MIN READ · UPDATED 2026-05-29

For most of the last fifty years, audio guides have been built for permanent collections. The studio model — write, cast, record, master, press to handsets — assumed a tour that would justify the six-figure production line by running for a decade. Temporary and traveling exhibitions sit at the opposite end of that math: a twelve-week run, three to eight host venues, two or three languages required at each stop, and a curatorial team that closes the show before the studio bill is paid down.

The result, until very recently, was that most temporary and traveling shows shipped without audio at all. Wall text and a docent program, if you were lucky. A printed catalog and a pamphlet, if you weren't. The visitor who wanted to listen — the visitor who could have heard the curator's voice on the work in front of them — got silence.

That's the gap the AI-narrated model fills almost by accident. It isn't that the technology is impressive (it is, but that's not the point). It's that the production math now works for shows the studio model could never underwrite. This piece is for exhibition designers, curators, and registrars at organizing institutions and host venues sizing up how to handle audio on a traveling show.

I'm Eric Duffy, founder of Convo. I have a horse in the race. The patterns below come from the touring-exhibition conversations I've had with curators on both sides of the loan — organizing institution and host venue — and from the production math we publish in full on our pricing page.

Why the traditional audio-guide model fails for short runs

The simplest framing: the traditional model needs a tour to run for a decade to pay for itself, and most temporary exhibitions run for about a quarter. A studio-produced tour for a special exhibition typically lands in the same $30,000–$150,000 production window as a permanent-collection tour, because the line items are the same: scriptwriting, voice casting, studio time, edit, master, multilingual versioning. The cost doesn't drop just because the show is short.

What does drop is the denominator. A permanent-collection tour amortizes across ten years of visitors. A temporary exhibition amortizes across ten to sixteen weeks. A $100,000 studio production on a ten-week show that draws 60,000 visitors costs about $1.67 per visitor, before handset rental, before translation, before any updates. That's the operating math, and it's why most temporary exhibitions historically didn't get audio at all.
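The denominator effect can be checked with a few lines of arithmetic. This is a minimal sketch using the $100,000 and 60,000-visitor figures from the paragraph above; the ten-year comparison assumes, purely for illustration, that the same weekly attendance held for a decade, which is not a figure from any real institution.

```python
def cost_per_visitor(production_cost: float, visitors: int) -> float:
    """Spread a fixed production cost across every visitor to the show."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return production_cost / visitors


PRODUCTION_COST = 100_000   # studio production, from the example above
SHOW_VISITORS = 60_000      # ten-week attendance, from the example above
SHOW_WEEKS = 10

short_run = cost_per_visitor(PRODUCTION_COST, SHOW_VISITORS)

# Illustrative assumption: the same weekly attendance sustained for ten years.
weekly_visitors = SHOW_VISITORS // SHOW_WEEKS
decade_visitors = weekly_visitors * 52 * 10
long_run = cost_per_visitor(PRODUCTION_COST, decade_visitors)

print(f"Ten-week show: ${short_run:.2f} per visitor")   # $1.67
print(f"Ten-year tour: ${long_run:.2f} per visitor")    # $0.03
```

The production cost is identical in both cases; only the number of visitors it is divided by changes.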

The traveling case is worse. A loaned show with five host venues splits the production cost across the participating institutions, but the production cycle still has to complete before the show opens at venue one. Six to eighteen months of studio work, then a twelve-week run at each stop, then the show closes and the audio retires with it. The math reads more like a film production than a museum operating line — which is roughly how the legacy touring-exhibition producers have always priced it.

The AI-narrated model collapses both halves of this. The production cycle is measured in weeks, not quarters. The cost is a software subscription (Convo's Institution tier at $3,500/month) that covers unlimited tours, including the temporary ones. The denominator question stops mattering: if a show only runs for ten weeks, the audio layer costs the same as it would for a permanent tour. For a fuller treatment of the production-math comparison, see our spoke on AI audio guides vs traditional audio guides.

What's a realistic tour-stop length and venue count for a traveling show?

Most traveling exhibitions run twelve weeks per stop and visit three to eight host venues over eighteen to thirty-six months. The Smithsonian American Art Museum's traveling program — which has been touring shows for more than seventy years — publishes twelve weeks as the standard scheduling block per venue, "allowing a few weeks between venues for packing and shipping." Smaller programs like Mid-America Arts Alliance and Museum on Main Street run shorter rotations of six to eight weeks per stop.

The total tour arc is usually two to three years from first opening to final closing. That's the window your audio guide has to be live, updatable, and supported across multiple host venues, each with its own gallery layout, sponsor list, and local rights environment. A short-run, single-venue special exhibition is the easy case. A traveling exhibition is the case that genuinely tests an audio-guide platform's operational model.

The pattern we hear most often on calls: an organizing institution wants the same core narration to ship to every venue, but each host museum wants to customize the opening, the sponsor acknowledgment, and a small number of stops where its own collection contextualizes the loan. That hybrid, a shared core plus per-venue addenda, is the architecture worth designing for from day one.

How do you share content across multiple venue stops?

Think of the tour as a single canonical script with venue-specific overlays, not as eight separate productions. The organizing institution authors the core tour: the curator's introduction to the show, the stop-by-stop narration on each loaned work, the closing reflection. That core is the artifact that travels.

Each host venue then layers in three kinds of overlay:

  • A welcome. Usually thirty to ninety seconds from the host director or curator, framing why the museum is hosting the show and what local connection the visitor should bring to it.
  • A sponsor acknowledgment. Required by most exhibition agreements when the host institution has its own underwriters. Often a single audio stop at the entrance to the show.
  • A small number of replacement or supplementary stops — usually one to four — where the host's own collection sits adjacent to the touring work and benefits from being framed inside the same tour.

On a software platform, the overlay model is straightforward: the organizing institution authors the master tour once, and each host venue's edits to the welcome, sponsor stops, and supplementary content layer on top without disturbing the canonical script.
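One way to picture the overlay model is as a small data structure: an immutable core tour plus a per-venue overlay that is merged at publish time. The sketch below is hypothetical and is not Convo's data model or API; the names `Stop`, `VenueOverlay`, and `build_venue_tour` are invented for illustration, and appending supplementary stops at the end is a simplification of placing them in adjacent galleries.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Stop:
    """One audio stop. Frozen so host venues cannot mutate core content."""
    stop_id: str
    title: str
    script: str


@dataclass
class VenueOverlay:
    """Everything a single host venue layers on top of the core tour."""
    venue: str
    welcome: Optional[Stop] = None
    sponsor: Optional[Stop] = None
    supplementary: List[Stop] = field(default_factory=list)


def build_venue_tour(core: List[Stop], overlay: VenueOverlay) -> List[Stop]:
    """Return the tour for one venue without modifying the core list."""
    opening = [s for s in (overlay.welcome, overlay.sponsor) if s is not None]
    return opening + list(core) + list(overlay.supplementary)


core_tour = [
    Stop("core-01", "Curator introduction", "..."),
    Stop("core-02", "First loaned work", "..."),
]
host_overlay = VenueOverlay(
    venue="Host venue A",
    welcome=Stop("host-welcome", "Director welcome", "..."),
    sponsor=Stop("host-sponsor", "Sponsor acknowledgment", "..."),
)

venue_tour = build_venue_tour(core_tour, host_overlay)
print([stop.stop_id for stop in venue_tour])
# ['host-welcome', 'host-sponsor', 'core-01', 'core-02']
```

Because the core list is copied rather than edited, every venue's tour is derived from the same canonical script, and a correction to the core reaches all venues on the next build.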

On a traditional handset system, the same operation requires re-pressing the audio files, re-uploading to each host venue's hardware, and managing version control across a fleet of devices that may or may not be the same model at each stop. The reason most traveling shows shipped without audio is that this operational overhead was never worth it for a twelve-week run.

How do you handle venue-specific addenda — director welcomes, sponsor walls, gallery wayfinding?

Treat the addenda as first-class stops on the tour, not as afterthoughts. The host venue's welcome is often the first audio a visitor hears, and it's the place where the host's institutional voice gets to set the frame for the loaned content. It deserves the same drafting and review process as the core tour stops, not a last-week scramble.

A useful pattern: the organizing institution provides a one-page brief for the host venue that defines what's required, what's optional, and what's prohibited.

  • Required. Sponsor acknowledgment (where contractually obligated), accessibility-compliant transcript, languages matched to the core tour.
  • Optional. Director or curator welcome, supplementary stops on adjacent permanent-collection works, additional language layers beyond the contractually required minimum, host-specific accessibility additions.
  • Prohibited. Edits to the core tour scripts themselves, voice changes on the core narration, removal of curator attributions, changes to image rights credits.

The point of writing this down is that traveling exhibitions tend to develop their own informal conventions per venue, and without a brief the third stop on the tour ends up with a fundamentally different visitor experience than the first. The brief is also where the licensing and rights questions get answered before each opening, which matters for the next section.
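A brief like this is simple enough to express as data and check mechanically before each opening. The following is a minimal sketch under stated assumptions: the item names are hypothetical labels for the required, optional, and prohibited categories listed above, and `check_submission` is an invented helper, not a feature of any particular platform.

```python
from typing import List, Set

# Hypothetical labels for the categories in the brief.
ALWAYS_REQUIRED = {"accessible_transcript", "core_languages"}
OPTIONAL = {
    "welcome",
    "supplementary_stops",
    "extra_languages",
    "accessibility_additions",
}
PROHIBITED = {
    "core_script_edit",
    "core_voice_change",
    "curator_attribution_removal",
    "image_credit_change",
}


def check_submission(items: Set[str], sponsor_obligated: bool) -> List[str]:
    """Return a list of problems; an empty list means the submission fits the brief."""
    required = set(ALWAYS_REQUIRED)
    if sponsor_obligated:
        # The sponsor acknowledgment is required only where the contract obliges it.
        required.add("sponsor_acknowledgment")
    allowed = required | OPTIONAL | {"sponsor_acknowledgment"}

    problems = [f"missing required item: {name}" for name in sorted(required - items)]
    problems += [f"prohibited change: {name}" for name in sorted(items & PROHIBITED)]
    problems += [
        f"not covered by the brief: {name}"
        for name in sorted(items - allowed - PROHIBITED)
    ]
    return problems


submission = {"accessible_transcript", "core_languages", "welcome", "core_voice_change"}
for problem in check_submission(submission, sponsor_obligated=True):
    print(problem)
# missing required item: sponsor_acknowledgment
# prohibited change: core_voice_change
```

Running the same check at every stop is one way to keep the third venue's visitor experience from drifting away from the first.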

What licensing and rights questions surface when content travels?

The short answer: every right that applies to the show on its home turf has to travel with it, and the audio guide is usually the layer where the gaps show. Touring exhibitions are layered rights instruments: the organizing institution holds licenses for the loaned works, the audio scripts, the translated versions, the recorded voices, and (where applicable) any music or third-party media used in the experience. Each of those rights has to be cleared for each host venue's geography and use.

The categories that most often trip up touring audio:

  • Image and content rights for stop subjects. The audio guide identifies a work, frames it, and may reproduce supporting images in a digital companion. Rights granted at the originating museum don't automatically extend to every host venue, especially in different jurisdictions.
  • Music synchronization and performance rights. If the audio guide uses recorded music — even thirty seconds under a curator's intro — the synchronization and public performance rights have to be cleared per venue. PRS, ASCAP, BMI, and equivalent foreign PROs all run their own per-territory licensing.
  • Translation rights. A translated script is itself a copyrightable derivative work. If the organizing institution commissioned a French translation for the home opening, the show can ship the same French version with the tour only if the translator's contract says so. We've seen tours stuck mid-rotation because the translation rights ended at the originating venue.
  • Recorded voice rights. A studio recording from a named voice actor typically licenses for a specific tour run, often by venue or by region. A generated voice on a platform you control sidesteps that line entirely — the voice rights are part of the platform license, not a separate clearance per stop.
  • Sponsor and credit obligations. Each host venue's contract with its sponsors may require specific audio acknowledgments; the organizing institution's contract with its lenders may require specific curator and donor credits. Both have to ship.

The practical move on a traveling show is a rights checklist that runs alongside the exhibition checklist. Every audio stop gets a row: subject rights, music rights, translation rights, voice rights, sponsor or credit obligations. If any cell is unresolved, the stop doesn't ship.
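The checklist rule, "if any cell is unresolved, the stop doesn't ship," translates directly into a small check. This is a sketch only: the column names mirror the five categories above, while `RightsRow`, `unresolved`, and `can_ship` are hypothetical names. A missing cell is treated as unresolved, and a right that does not apply to a stop (for example, no music cue) is assumed to be marked cleared by the rights officer.

```python
from dataclasses import dataclass, field
from typing import Dict, List

RIGHTS_COLUMNS = ("subject", "music", "translation", "voice", "sponsor_or_credit")


@dataclass
class RightsRow:
    """One row of the checklist: a single audio stop at a single venue."""
    stop_id: str
    venue: str
    cleared: Dict[str, bool] = field(default_factory=dict)


def unresolved(row: RightsRow) -> List[str]:
    """List the columns that are not cleared; a missing cell counts as unresolved."""
    return [col for col in RIGHTS_COLUMNS if not row.cleared.get(col, False)]


def can_ship(row: RightsRow) -> bool:
    """A stop ships only when every rights column is cleared."""
    return not unresolved(row)


rows = [
    RightsRow("stop-01", "Venue A", {col: True for col in RIGHTS_COLUMNS}),
    RightsRow(
        "stop-02",
        "Venue A",
        {"subject": True, "music": False, "translation": True, "voice": True},
    ),
]

for row in rows:
    status = "ships" if can_ship(row) else "held: " + ", ".join(unresolved(row))
    print(row.stop_id, row.venue, status)
# stop-01 Venue A ships
# stop-02 Venue A held: music, sponsor_or_credit
```

Keeping one row per stop per venue means the same stop can be cleared at one host and held at the next, which matches how territorial rights actually behave.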

What changes when the exhibition crosses language regions?

The audio guide is almost always the cheapest layer of a traveling exhibition to localize, and on an AI platform the marginal cost of a new language is close to zero. A traveling show that opened in Berlin and lands at the Met in October has to add at least English; the same show landing at the Pinacoteca in São Paulo a year later has to add Portuguese; the leg at Mori Art Museum in Tokyo adds Japanese.

On the studio model, each new language is a line item: translation at roughly twelve to thirty cents per word, voice casting at native rates per language, studio booking, edit and master. Conservatively, $5,000 to $15,000 per language per tour, and several weeks of lead time before the venue opens. For a three-language home tour expanding to seven languages across the tour arc, the four added languages alone cost $20,000 to $60,000 on top of the original production, and the full seven-language bill reaches six figures at the high end.
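Using the $5,000 to $15,000 per-language range above, the arithmetic for a tour growing from three to seven languages can be checked directly. Whether the home tour's three languages are counted changes the total, so this sketch reports both; the function name is illustrative.

```python
from typing import Tuple

PER_LANGUAGE_LOW = 5_000    # conservative studio cost per language, from the text
PER_LANGUAGE_HIGH = 15_000


def localization_range(languages: int) -> Tuple[int, int]:
    """Return the (low, high) studio localization cost for a number of languages."""
    if languages < 0:
        raise ValueError("languages cannot be negative")
    return languages * PER_LANGUAGE_LOW, languages * PER_LANGUAGE_HIGH


HOME_LANGUAGES = 3
TOUR_LANGUAGES = 7

added_low, added_high = localization_range(TOUR_LANGUAGES - HOME_LANGUAGES)
total_low, total_high = localization_range(TOUR_LANGUAGES)

print(f"Four added languages: ${added_low:,} to ${added_high:,}")
# Four added languages: $20,000 to $60,000
print(f"All seven languages:  ${total_low:,} to ${total_high:,}")
# All seven languages:  $35,000 to $105,000
```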

On an AI platform, the same operation is a software task. Convo regenerates and revoices across all ten of our supported languages (English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Arabic) from a single approved English script in roughly sixty seconds. The host venue in any language region adds the languages it needs, edits the welcome and sponsor stops in those languages, and opens. There is no per-language translation budget on the host's side; there is no per-language studio lead time.

The strategic implication: a traveling show on an AI platform can credibly market itself as a multilingual program at every host venue, in the languages each host's audience actually speaks. The American Alliance of Museums has written about why this matters — roughly 20% of the US population speaks a language other than English at home, with the share substantially higher in major museum cities. For a discussion of how multi-site institutions handle the same coordination problem across permanent sites rather than touring shows, see our spoke on multi-site museum tour management.

Where this pattern doesn't fit

The honest section. Cases where I'd tell an exhibition team to do something else:

Shows where the named voice is part of the production. If the organizing curator has commissioned a specific actor, scholar, or artist to narrate, that voice is the curatorial choice and travels with the show as a fixed audio asset. The AI platform isn't the production tool; it might be the delivery layer, but the narration itself stays as recorded.

Very small shows where the docent program already covers the content. If a host venue is bringing five docents through a single-gallery loan exhibition twice a day and the visitor flow is already covered, the audio guide may be operational overhead the show doesn't need. Use the time on a stronger printed brochure instead.

Shows landing at venues with existing handset contracts. If a host venue is mid-contract with a legacy hardware-and-content vendor and that contract requires their hardware for any exhibition audio, the practical move is to ship through that channel for the duration of that stop. The traveling exhibition shouldn't be the renegotiation event.

Shows where the rights chain genuinely won't clear. If the organizing institution holds image, music, or translation rights that are explicitly non-transferable to host venues in other jurisdictions, the audio either doesn't travel or has to be re-cleared per stop. Sometimes the rights answer is no, and the show ships visual-only.

For everything else — which is most temporary and traveling shows — the audio layer is now feasible at a budget level the studio model never reached. The constraint that determined which shows got audio at all has loosened; the curatorial question is now what to put on it.

What's the right next step?

If you're an organizing institution planning a touring show, the most useful early move is to decide whether the audio guide ships as part of the loan package or as an optional add-on host venues can subscribe to themselves. Either model works on a software platform; the right answer depends on your control posture and your contract with the host venues. If you're a host venue picking up a touring show, the question is whether to layer the show's audio onto your existing platform — most platforms support a per-tour subscription — or to publish through the organizing institution's instance.

Either way, the time horizon for getting audio live on a traveling show is now weeks rather than quarters. That changes the planning conversation. Convo's pricing covers unlimited tours including temporary ones, and the pilot tier is free for the first show. If you want the broader operations view, the operations hub is the parent guide.

FAQ

Can several host venues share the same core tour?

Yes, and that's the usual pattern. The organizing institution authors a canonical tour, each host venue clones it, customizes the welcome and sponsor stops, and prints its own QR codes. The visitor at venue three scans the host's code; the content underneath is the same core tour plus the host's overlays. The QR code is a venue-specific entry point; the content underneath is a shared instrument.

Who owns the content, the organizing institution or the host venue?

Both, in defined slices. The organizing institution owns the core tour content and the voice and register of the show. The host venue owns the welcome, the sponsor acknowledgment, and any supplementary stops that contextualize the loan with the host's permanent collection. A one-page brief written by the organizing institution defines what's editable, what isn't, and what's required at each stop.

How many languages should a traveling show offer?

On a studio-produced tour, the answer is usually two or three because that's all the budget will reach. On an AI platform, the realistic answer is the languages each host venue's audience actually speaks: typically English plus the host country's primary language, plus one or two tourist languages. The marginal cost of an additional language is close to zero, so the constraint moves from "what can we afford" to "what's the curatorial register we want to localize into."

How do music rights work when the tour crosses territories?

Each music cue or third-party media segment needs synchronization and public performance rights cleared per host venue's territory. PRS in the UK, ASCAP and BMI in the US, JASRAC in Japan, GEMA in Germany. The practical implication is that the audio guide team should build a music cue list during production and have the organizing institution's rights officer clear each cue per host territory before each opening.

What happens to the audio guide after the tour closes?

A studio-produced tour usually retires with the show, and the recordings are archived. A software-produced tour can be archived the same way. The post-tour life is a conversation with the organizing institution about archival rights, and platforms differ in how they support archived versus active tours.

Can a host venue add its own stops?

Usually yes, provided the organizing institution's brief allows it and the host's additions are clearly marked as host-curated. The pattern is a small number of supplementary stops, one to four, placed at galleries adjacent to the touring show where the host's own works contextualize the loaned content. Most organizing curators welcome this when it serves the show; some specifically prohibit it. Get the answer in writing during contracting.

Is the audio guide part of the loan package?

It varies. Increasingly, organizing institutions include the audio guide in the loan package as a fixed asset that ships with the show, in the same way the wall-text design and graphic identity do. Some institutions still treat it as an optional add-on each host venue contracts for separately. The first model is becoming standard for shows where audio is part of the curatorial intent; the second is common for shows where the host venues have heterogeneous audio platforms.

The shorter version

Temporary and traveling exhibitions are the use case AI-narrated audio guides serve almost without trying. The studio-production economics never worked for a twelve-week run, and the operational overhead of pressing audio to handsets across multiple host venues was rarely worth the math. A software-produced guide on the visitor's own phone changes both halves: the production cost amortizes across the platform subscription rather than per show, and a shared core tour with venue-specific overlays takes minutes to publish at each new stop.

The remaining work is the work that doesn't compress: curatorial judgment, editorial review, rights clearance per host territory, sponsor and credit obligations per venue. None of that goes away. What goes away is the reason most traveling shows shipped without audio at all.

If you're planning a touring show, pricing is published in full and the pilot tier is free. If you want the broader operations context, the operations hub is the parent guide; if you want the production-math case for AI versus studio production, the AI vs. traditional spoke is the deeper dive.


About the author

Eric Duffy is the founder of Convo, a platform that lets museums and cultural institutions publish multilingual audio tours their visitors can have a conversation with. He writes about the economics of museum interpretation from inside the category — drawing on RFP data, discovery calls with curators and directors, and the production economics of both the studio-and-handset model and the AI-narrated model. Reach him at eric@convo.app or on LinkedIn.
