How much does a museum audio guide actually cost in 2026?

KEY TAKEAWAYS

For a phone-based AI audio guide platform, the realistic 2026 range is $6,000–$50,000 a year all-in depending on institution size — subscription plus light staff time, no hardware, no per-language production line. Convo's published Studio tier sits at $7,200/year ($6,000 billed annually); Institution is quoted to the institution's size by annual attendance.
For a traditional studio-and-handset program, expect $30,000–$150,000 per tour before translation — the range we see consistently in RFPs and studio quotes from the legacy studio-and-handset vendors — with another meaningful five-figure line per additional language at the rates voice-over and dubbing studios charge in 2025.
Hardware is where the legacy model bleeds quietly. Rented handset fleets carry low five figures a year in cleaning, batteries, and breakage — operating expense that doesn't show up on the original quote.
The most common framing error in procurement: comparing one studio-produced English tour against one AI platform subscription. The honest comparison is one English tour against ten languages of broad permanent coverage. Different products at the same price point.
Three line items hide most of the surprise cost in legacy procurement: per-language production, content refresh cycles, and hardware logistics. AI platforms collapse all three into a flat subscription. Studio vendors keep them itemized, often in different statements of work.
A defensible 2026 audio guide budget for a small or mid-size US museum is closer to $20,000–$60,000 a year than the six-figure number that's been quoted for decades. If a vendor is quoting more than that and isn't doing custom native-app work, ask what you're paying for.

If you're putting together a 2026 audio guide budget, the number you should walk in with is lower than the number you walked in with in 2020, by roughly an order of magnitude. The studio-produced, handset-distributed model that defined museum audio interpretation for fifty years is no longer the only credible option, and the pricing of the newer option has settled enough that you can plan against it.

This piece is the hub for our Buying & cost pillar. I've tried to give you the procurement-grade version — actual numbers, where they come from, and where vendors (mine included) bury costs that don't make it into the headline quote. If you're sizing this up before an RFP, this is the piece I'd want to read first.

What does a museum audio guide actually cost in 2026?

The honest 2026 range, all-in, is roughly $15,000 to $200,000 a year depending on which model you choose. The bottom of the range is a phone-based AI platform subscription with no hardware; the top is a studio-produced tour with a rented handset fleet and three languages. The midpoint depends almost entirely on three decisions: studio or AI, hardware or BYOD, and how many languages you ship.

A defensible budget for a small or mid-size US museum looking to launch in 2026 is closer to $20,000–$60,000 a year than to the six-figure quotes that dominated the category for the last two decades. That budget assumes a phone-based AI platform, multilingual default, ten to a hundred tour stops, no rented hardware, and roughly 20–60 hours of curator review time over the launch period. Push the budget up if you need a named voice actor, custom hardware, or a white-labeled native app; push it down if you're piloting a single tour in one language before committing.

The rest of this piece breaks down where the money actually goes — and where it used to go — line by line.

What goes into a traditional studio-produced audio guide quote?

A 2026 quote from a legacy studio-and-handset vendor usually breaks down into five line items: scripting, voice talent, studio time, post-production, and hardware. Across the RFPs and studio quotes we have reviewed, the all-in production cost of a traditional museum audio tour lands at $30,000–$150,000 per tour before hardware — and it matches what curators consistently tell us they were quoted in their own procurement processes. Custom native-app development on top of that adds another order of magnitude.

A rough decomposition of where that money goes:

Scriptwriting. $150–$300 per finished minute, often outsourced to a specialist writer or in-house curatorial team time.
Voice talent. Audiobook union and non-union floor rates run $200–$275 per finished hour per the Voice Over Resource Guide, with experienced SAG-AFTRA narrators at session rates of $300–$700 per hour. Museum tours sit in this category. (SAG-AFTRA audiobook agreements; GVAA Rate Guide.)
Studio time. $150–$250 per hour with at least a 2:1 studio-to-finished ratio. A 60-minute tour takes 120+ studio minutes, or at least two hours.
Post-production. Editing, sound design, mastering, and quality control add 30–50% on top of recording costs.
Hardware (if applicable). A rented handset fleet for a mid-size institution typically runs tens of thousands of dollars in Year 1 capital expenditure plus a low-five-figure operating line in Year 1, dominated by cleaning labor.

What's missing from the quote, and almost always added later: per-language reproduction, content refresh cycles every 18–36 months, and the handset upkeep nobody likes to model. Which is why the legacy number you remember (a $30,000 tour, say) almost always became a $60,000 tour by Year 2.

What does a phone-based AI audio guide platform cost?

The phone-based AI model collapses scripting, voicing, translation, and updates into a software subscription. The shape is now standardized across newer platforms: a monthly or annual fee, scaled by institutional size or visitor volume, with unlimited tours, languages, and edits included on most paid tiers.

Some published 2026 reference points:

Convo — Studio at $600/month (self-serve); Institution quoted to the institution's size. Every plan starts with a free 30-day pilot running the full feature set. All plans include up to ten active languages (from 40+) and unlimited edits.
Other AI-narrated platforms — most publish subscription pricing in a similar shape, with floor tiers from tens of dollars per month for small institutions and institutional tiers in the low-thousands per month. Custom-content services, app store submission, and concierge work are typically billed separately.
Philanthropy-funded platforms (Bloomberg Connects) — free to qualifying museums and cultural organizations. The model is philanthropic; the trade-offs are editorial review by the platform and a shared-app distribution model rather than your own branded experience. We cover the trade-offs in detail in Convo vs Bloomberg Connects.

The all-in 2026 number for a phone-based AI program at a small or mid-size institution typically lands between $15,000 and $50,000 a year, including subscription, curator review time (the real labor line), QR signage, and a small loaner-phone fleet if you choose to offer one. There's no studio bill, no per-language recording line, and no handset CapEx.

For the dimension-by-dimension comparison against the legacy model — voice quality, control, accessibility, hardware — see the spoke on AI audio guide vs traditional audio guide. The cost story is one chapter of a larger comparison.

What do additional languages actually add to the cost?

On the legacy model, every additional language is roughly 50–80% of the cost of the original English production. A new translation has to be commissioned, a native voice has to be cast, a studio has to be booked, and the audio has to be edited and mastered. The studio side of the math is consistent across vendor disclosures; the translation side is the part with public benchmarks.

Translation alone, before voicing, is its own line item. The Slator 2024 market report on translation pricing bands the relevant rates clearly: commodity content runs $0.03–$0.08 per source word, general business content $0.06–$0.12, and specialized content requiring subject-matter expertise (which is where museum and cultural interpretation sits) lands at $0.12–$0.30 per source word. The American Translators Association Compensation Survey (6th ed.) is the canonical industry source for per-language-pair rate tables within that band. A 60-minute audio tour at typical narration pace is roughly 8,000–10,000 words, so at the specialized rate each language costs roughly $1,000–$3,000 to translate. Translating the tour into Spanish, French, German, Mandarin, Japanese, and Korean therefore adds roughly $6,000–$18,000 across the six in translation cost alone, before any voice or studio time.
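The translation arithmetic is simple enough to check directly. The sketch below is illustrative only: it multiplies the script length by the specialized per-word band quoted from Slator, and the constant and function names are hypothetical, not part of any vendor's tooling.

```python
# Illustrative check of the translation figures quoted above.
# All names here are hypothetical; the inputs are the article's own numbers.

SPECIALIZED_RATE_USD = (0.12, 0.30)   # per source word, specialized content band
TOUR_WORDS = (8_000, 10_000)          # 60-minute tour at typical narration pace
LANGUAGES = ["Spanish", "French", "German", "Mandarin", "Japanese", "Korean"]


def translation_cost_range(words, rate):
    """Return the (low, high) translation cost in USD for one language."""
    low = round(words[0] * rate[0])
    high = round(words[1] * rate[1])
    return low, high


per_language = translation_cost_range(TOUR_WORDS, SPECIALIZED_RATE_USD)
all_languages = tuple(cost * len(LANGUAGES) for cost in per_language)

print(f"Per language: ${per_language[0]:,} to ${per_language[1]:,}")
print(f"All {len(LANGUAGES)} languages: ${all_languages[0]:,} to ${all_languages[1]:,}")
```

Voice talent, studio time, and mastering are deliberately left out; this covers the translation line only.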

On AI platforms, the marginal cost of additional languages is effectively zero. Convo, as one example, includes all ten languages — English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, Korean, Arabic — on every paid tier from the same approved English source, re-voiced across all ten in roughly 60 seconds. The constraint shifts from production budget to editorial review: you still need a reviewer who can read the Mandarin track before it ships. But the studio-and-talent line is gone.

This is the dimension that has changed the most in the last three years, and it's the one that most often explains the gap between a $40,000 AI subscription and a $200,000 legacy program. For a deeper treatment, see Pillar 3, Multilingual interpretation, when that hub publishes.

What about hardware — handsets, headphones, racks?

The 2025–2026 consensus across vendors is that rented handsets are now the accommodation, not the default. The visitor's own phone has won the BYOD argument, and the operational costs of a handset fleet are difficult to defend in a procurement review when most visitors are now actively choosing to use their own device.

The legacy handset Year-1 number for a mid-size museum stacks up to a tens-of-thousands-of-dollars capital expenditure plus a low-five-figure operating line — dominated by cleaning and turnaround labor (the equivalent of roughly a quarter of an FTE on a busy fleet), with batteries and consumables, breakage and loss at low-double-digit-percent of the fleet, and charger amortization stacked on top.

A BYOD-plus-cloud program produces a Year-1 cost roughly 80–90% lower than the legacy handset comparable. The shape is consistent with what museums report on procurement calls: the hardware line item is where legacy programs get expensive quietly, year after year.

A reasonable 2026 hardware budget for a phone-based program is $0 to $5,000 a year: zero if you commit fully to BYOD, $2,000–$5,000 if you keep a small loaner fleet of accessible phones at the front desk for visitors without smartphones or with accessibility needs that benefit from a dedicated device. The remaining cases where a full handset fleet still makes sense are narrow: high-security environments where personal devices aren't allowed, accessibility-first programs where a tuned device is part of the offer, and existing hardware contracts that haven't expired.

For the broader BYOD-versus-handset trade-off, see Pillar 5 — Visitor experience — when that hub publishes.

Where do hidden costs live in audio guide procurement?

Most of the surprise in audio guide budgets isn't the headline quote — it's the line items that aren't on the headline quote. The four places to look hardest:

1. Content refresh. A studio-produced tour is effectively frozen on the day it ships. Updating a single attribution, fixing a factual error, or adding a stop for a new acquisition means re-booking the same talent (often impossible 18 months later), re-engineering the audio to match the original mix, and re-deploying. Most museums underbudget this line because they assume the tour will be edited; in practice, most legacy tours are not.

2. Per-language reproduction. Quoted as an add-on, almost always after the English production is approved. Vendors who quote a $40,000 English tour and a $25,000 Spanish version are quoting the same labor cost twice — translation aside, the studio booking and voice talent are both per-language.

3. Handset upkeep. The line that almost never appears on Year 1 quotes and appears on every Year 2 invoice. Batteries, cleaning labor, breakage, replacement units, charging racks. For a meaningful fleet this lands in the low five figures a year, dominated by cleaning labor.

4. Platform development. If you're going down the custom-native-app path, expect well into the six figures to launch and a meaningful five-figure annual line to maintain — separate from any audio production cost. Most museums don't need this; the SaaS platforms have eaten the case for custom development unless you have very specific brand or integration requirements.

The corollary: on a flat-fee SaaS platform, most of these lines disappear or fold into the subscription. The exposed risk shifts from quote-creep to vendor lock-in, which is a different problem and a more manageable one — covered in the spoke on audio guide pricing models when that publishes.

What's the five-year total cost of ownership?

The interesting number isn't Year 1 — it's Year 5, where content refreshes, hardware replacement, and per-language adds compound. A rough five-year model for a mid-size museum producing one 30-stop tour in English plus three additional languages:

Year	Traditional (studio + handset)	AI-narrated (Convo Studio, $600/mo)
Year 1	$80,000 production + $20,000 handsets	$7,200
Year 3	+ $25,000 (re-record one language, content update)	$7,200
Year 5	+ $40,000 (refresh + handset replacement)	$7,200
5-year total	~$165,000	~$36,000

The honest caveat: that AI column is the published Studio tier, which fits this exact scenario (a few tours, ten active languages). A larger institution running a bigger program with visitor voice Q&A belongs on a quoted Institution plan — sized to its attendance — which lands higher, though still well under the traditional column.

The deeper point is what the five-year number is buying. The legacy column buys four languages, one frozen tour, and a depreciating handset fleet. The AI column buys ten active languages, edits whenever you need them, and a platform that ships updates the same day. Different products at very different price points.
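The totals in the five-year table can be reproduced in a few lines. This is a rough sketch under two assumptions the table leaves implicit: the traditional column spends nothing in Years 2 and 4, and the AI column counts the subscription only, with no curator time or signage. The names are hypothetical.

```python
# Illustrative reproduction of the five-year table above.
# Names are hypothetical; dollar figures are the table's own.

TRADITIONAL_SPEND_BY_YEAR = {
    1: 80_000 + 20_000,  # production + handsets
    3: 25_000,           # re-record one language, content update
    5: 40_000,           # refresh + handset replacement
}
AI_MONTHLY_FEE = 600     # Convo Studio, per the table header
YEARS = 5


def traditional_total(spend_by_year):
    """Sum the itemized legacy spend; years not listed are assumed to cost $0."""
    return sum(spend_by_year.values())


def subscription_total(monthly_fee, years):
    """Flat subscription: twelve payments a year, no other line items."""
    return monthly_fee * 12 * years


legacy = traditional_total(TRADITIONAL_SPEND_BY_YEAR)
ai = subscription_total(AI_MONTHLY_FEE, YEARS)

print(f"Traditional, 5 years: ${legacy:,}")
print(f"AI-narrated, 5 years: ${ai:,}")
print(f"Ratio: {legacy / ai:.1f}x")
```

Swapping in your own quote for any year, or a quoted Institution fee for the monthly figure, is the quickest way to see how sensitive the comparison is to a single line.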

For the procurement-grade TCO model — with the assumptions, sensitivity ranges, and worked examples — see the spoke on museum audio guide total cost of ownership when that publishes.

How do you build a defensible audio guide budget?

The version of this answer I'd give a director sizing up a procurement: don't budget against what the category used to cost; budget against what your visitors now expect.

A defensible 2026 budget for a small or mid-size US museum has roughly this shape:

Platform. $15,000–$45,000/year for a phone-based AI platform with multilingual default and unlimited tours.
Curator review time. 20–60 hours of internal time over a launch period, depending on collection size. If billed at a fully-loaded $75–$150/hour, that's $1,500–$9,000.
Signage and QR design. $1,000–$5,000 for label-card redesign and printing.
Optional loaner-phone fleet. $2,000–$5,000 for a small front-desk fleet for accessibility and visitors without smartphones.
Optional translation review. $500–$3,000 per language if you want a native speaker reviewing each track before it ships. (The translation itself is in the platform; the review is what you're buying.)

That puts a realistic launch budget for a serious multilingual program at $20,000–$60,000 in Year 1, falling to $15,000–$45,000 in Year 2 and beyond once signage and review cycles are amortized. If a vendor is quoting meaningfully more than that and isn't doing custom native-app or hardware work, the question to ask is what you're paying for.

The opposite also holds: if a vendor is quoting meaningfully less than that with a serious feature set, the question to ask is what they're not telling you. Free or near-free platforms typically trade away editorial control over the visitor experience or the long-term ability to migrate off the platform. Both are recoverable but worth being explicit about up front.
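The budget line items listed in this section can be summed into a low and high Year 1 figure. The sketch below is one way to do that, not a pricing tool: the function and parameter names are hypothetical, and the choice of three reviewed languages in the second call is an assumption made purely for illustration.

```python
# Illustrative Year 1 budget from the line items listed above.
# Function and parameter names are hypothetical; ranges are the article's own.

PLATFORM = (15_000, 45_000)        # annual platform subscription
REVIEW_HOURS = (20, 60)            # curator review time over the launch period
REVIEW_RATE = (75, 150)            # fully loaded USD per hour
SIGNAGE = (1_000, 5_000)           # label-card redesign and printing
LOANER_FLEET = (2_000, 5_000)      # optional front-desk phones
TRANSLATION_REVIEW = (500, 3_000)  # optional, per language reviewed


def year_one_budget(reviewed_languages=0, loaner_fleet=False):
    """Return the (low, high) Year 1 budget in USD for the chosen options."""
    low = PLATFORM[0] + REVIEW_HOURS[0] * REVIEW_RATE[0] + SIGNAGE[0]
    high = PLATFORM[1] + REVIEW_HOURS[1] * REVIEW_RATE[1] + SIGNAGE[1]
    if loaner_fleet:
        low += LOANER_FLEET[0]
        high += LOANER_FLEET[1]
    low += reviewed_languages * TRANSLATION_REVIEW[0]
    high += reviewed_languages * TRANSLATION_REVIEW[1]
    return low, high


core = year_one_budget()
with_options = year_one_budget(reviewed_languages=3, loaner_fleet=True)

print(f"Required lines only: ${core[0]:,} to ${core[1]:,}")
print(f"With loaner fleet and 3 reviewed languages: "
      f"${with_options[0]:,} to ${with_options[1]:,}")
```

The required lines alone come to $17,500–$59,000, which is where the rounded $20,000–$60,000 figure comes from; the optional lines are what push a budget past the top of that range.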

Where this doesn't apply

The procurement framing in this piece assumes you're sizing a real program — multiple tours, multilingual ambitions, ongoing updates. There are cases where it doesn't apply:

A single one-off tour with a named voice as the headline. If you've cast a celebrity ambassador or are using a specific donor or curator as the narrator, the production is the product. AI subscription pricing is the wrong frame; commission a studio production and pay for the voice.
A program that genuinely will not change for a decade. A permanent installation at a national monument, a fixed historical site whose interpretation is settled. Amortizing a one-time studio cost across ten years can still pencil out, and the update agility of a SaaS platform is a feature you'll never use.
An existing hardware contract that hasn't expired. Switching mid-contract usually doesn't pencil out. Ride out the contract, capture analytics on what's working, plan the migration for renewal.
A program where the production process is part of the institution's brand. Some institutions — the Met's audio guide, for example — have a production identity that is part of why the tour matters. For those institutions, the AI model is a complement to specific exhibits, not a replacement for the program.

For most institutions outside those four cases, the 2026 budget conversation is no longer "studio versus AI." It's "which AI platform, on what terms." That's a different procurement, and the rest of this pillar is built to help you run it.

Frequently asked questions

For a serious phone-based program at a small institution, $15,000–$20,000 a year covers a platform subscription, light staff time, and basic signage. Convo's pilot is free for thirty days with the full feature set, which is enough to ship one gallery end-to-end — and put real visitor data in front of you — before committing budget. Philanthropy-funded options like Bloomberg Connects are free for qualifying cultural organizations and a reasonable starting point if you're willing to trade branded control for zero cost — see our comparison page for the full trade-off.

On a legacy studio-produced tour, expect $5,000–$15,000 per additional language for translation, voice talent, studio time, editing, and mastering. On a phone-based AI platform, additional languages are typically included in the subscription — Convo ships ten from one source at no marginal cost. The editorial review of each language is the labor line you still need to budget, usually $500–$3,000 per language for a native-speaker review pass.

Yes, it's genuinely free to qualifying museums and cultural organizations, as a Bloomberg Philanthropies initiative. The trade-offs are that you publish inside the Bloomberg Connects app rather than under your own brand, your guide lives alongside 1,250+ others in a shared marketplace, and the editorial standards are set by Bloomberg Connects. For institutions that want a free, supported channel with no procurement, it's a legitimate option. For institutions that want their own branded visitor experience and ownership of visitor data, it isn't.

Most don't, in 2026. Convo's pricing tiers use visitor count as guidance for which tier fits, not as a hard meter — there's no per-visit charge once you're on a tier. Most other modern AI-narrated platforms use similar subscription models. The shift away from per-visit pricing is one of the structural changes from the per-tour studio era.

The real one is curator review time. Drafting is fast; reviewing and approving every line a visitor will hear still takes 20–60 hours of curator time for a meaningful tour, and that labor is the actual launch cost. The rest of the surprise costs — translation, voice, studio, hardware — are the ones AI platforms have actually solved.

On a studio model, expect $1,000–$5,000 per update cycle for a single stop revision, and $10,000–$30,000 for a meaningful wing refresh, because the studio and talent have to be rebooked. On a SaaS AI platform, updates are typically included — a curator edits the script in the admin and the new audio ships in seconds. The labor line shifts from production scheduling to curatorial decision-making, which is where it should have been all along.

Roughly $20,000–$40,000 all-in for a phone-based AI platform with multilingual default, one or two tours of the permanent collection, label-card redesign, QR signage, and a small loaner-phone fleet for accessibility. The same scope on a legacy studio model would run $80,000–$150,000 in Year 1 and continue to cost meaningful money every year after. That roughly fourfold gap is the structural shift.

Different unit cost, different unit of value. A docent program costs $25–$75/hour in staff time and serves the share of visits that book a tour (typically 1–5%). Wall text is effectively free per visit but capped at what can fit on a label. An AI audio guide is a per-year platform cost that scales across every visitor who scans, in their own language, with optional follow-up questions. Most institutions run all three; the budgets don't substitute for each other.

Continue reading

For the dimension-by-dimension comparison of AI versus traditional audio guides — including the five-year cost model, voice quality discussion, and the cases where studio production still wins — see AI audio guide vs traditional audio guide.

For the category-level primer on what an AI audio guide is, how it's produced, and where it fits, see the AI audio guides pillar guide.

For a deeper take on why the audio guide isn't actually the product museums are buying — and what that means for procurement — see the note on the audio guide is not the product.

When the spokes in this pillar publish, the next reads will be audio guide pricing models (subscription vs per-tour vs hybrid) and museum audio guide total cost of ownership (the full five-year procurement model). Both are in progress.

If you want the numbers for our own platform, Convo's pricing is on a single page, including the free thirty-day pilot. The pilot is genuinely free — full feature set, one published tour, real visitor data — and it's the fastest way to put real numbers in front of your board against your own collection.

Sources and references

The numbers in this piece are anchored to public, non-competitor sources wherever possible. The main references:

Translation rates. Slator 2024 market report on translation pricing for the specialized-content band ($0.12–$0.30/word). American Translators Association Compensation Survey, 6th edition as the canonical industry source for per-language-pair rate tables.
Voice talent rates. Voice Over Resource Guide for the audiobook PFH band ($200–$275). SAG-AFTRA audiobook agreements for the union narration rate ranges. GVAA Rate Guide as the non-union industry-standard rate sheet.
US museum landscape. IMLS — 35,144 active US museums as the canonical national count. IMLS Museum Data Files for analytical disaggregation by discipline.
Visitor engagement and wall-text data. Beverly Serrell (1997), "Paying Attention: The Duration and Allocation of Visitors' Time in Museum Exhibitions," Curator: The Museum Journal 40(2): 108–125, and Serrell (2015), Exhibit Labels: An Interpretive Approach (2nd ed.). Stephen Bitgood (2013), Attention and Value: Keys to Understanding Museum Visitors. Falk & Dierking (2013/2016), The Museum Experience Revisited.
Audience demolinguistics. US Census Bureau ACS Language Use Tables (2017–2021 5-year). National Travel and Tourism Office 2024 international travel volumes for inbound tourism markets.

Convo-specific numbers (pricing tiers, ten-language coverage, the ~$30k–$150k per-tour traditional cost range) come from convo.app/pricing and convo.app/about.

About the author

Eric Duffy is the founder of Convo, a platform that lets museums and cultural institutions publish multilingual audio tours their visitors can have a conversation with. He writes about the economics of museum interpretation from inside the category — drawing on RFP data, discovery calls with curators and directors at small and mid-size US museums, and the production economics of both the studio-and-handset model and the AI-narrated model. Reach him at eric@convo.app or on LinkedIn.

More in buying & cost.

Audio guide pricing models compared: per-visitor, subscription, hardware, and free.

What to put in a museum audio guide RFP.

The hidden costs of traditional audio guide production.

How to choose museum audio guide software: an evaluation checklist.

Total cost of ownership: hardware vs phone-based museum audio guides.

Pick one gallery.
Give us two weeks.
