PILLAR 02 · BUYING & COST

Audio guide pricing models compared: per-visitor, subscription, hardware, and free.

The four pricing shapes museums actually see in the audio-guide market — per-visitor, flat subscription, hardware rental plus content, and philanthropy-funded free — with honest tradeoffs and which one fits which institution.

ERIC DUFFY·FOUNDER·10 MIN READ·UPDATED 2026-05-29

If you've sat through more than two audio-guide vendor pitches, you've probably noticed that almost no two vendors price the same way. One charges per visitor. One charges a flat monthly subscription. One quotes hardware plus content as a single bundle. One is free.

This isn't accidental, and it isn't noise. Each pricing shape encodes a different theory of how an audio guide creates value. Each fits one kind of institution well and fits badly when forced onto another. This piece walks through all four, with real numbers where vendors publish them, and an honest take on which institution each model actually serves.

I'm Eric Duffy, founder of Convo. We price as a flat subscription. I'll say where I think that's the right call and where it isn't.

At a glance: the four pricing shapes

| Model | Shape | What you pay | Best for |
|---|---|---|---|
| Per-visitor | Revenue-share or pay-per-use offered by some AI-narrated platforms | $2–$4 per visitor who uses the guide, or a revenue share on a visitor-paid price | Low-volume venues, paid-tour operators, museums uncertain about adoption |
| Flat subscription | The dominant shape across modern AI-narrated platforms, including Convo | Tens to low thousands of dollars per month for unlimited use, often with tour and language caps at lower tiers | Mid- and high-volume museums; multi-site institutions; anyone who wants the marginal visitor to be free |
| Hardware rental + content | The legacy studio-and-handset vendors | Per-device daily rate plus a one-time content production fee in the $30,000–$150,000 range (Convo) | Institutions where a specific named voice is the curatorial point; mid-contract handset deployments |
| Free (philanthropy-funded) | Bloomberg Connects | $0 to the museum, $0 to the visitor | Institutions willing to publish inside a third-party app, give up branded distribution, and accept whatever cadence the platform sets |

The rest of this piece unpacks each row.

How does per-visitor (usage-based) pricing work?

Per-visitor pricing charges the museum, the visitor, or both each time a visitor actually uses the guide — usually in the range of $2–$4 per use, or as a revenue share on a visitor-paid ticket. A handful of AI-narrated platforms publish this model explicitly: museums can choose a revenue-share arrangement where the museum sets the visitor price and keeps the majority, a prepaid bulk-credit purchase for groups, or a pay-per-session arrangement scaled to AI inference cost. The shape shows up most often where a vendor's primary motion is paid-tour operators or low-volume venues rather than mid-size institutions.

The math is simple and self-balancing on the low end. A museum with 25,000 annual visitors and 9% adoption runs about 2,250 guide sessions a year, which at $2–$4 per use comes to roughly $4,500–$9,000 in usage fees, well below most subscription platforms' floors. For a venue that genuinely doesn't know whether visitors will adopt the guide, that's the right risk profile: you don't pay for a tour nobody uses.

What it rewards: small venues, paid tours where the visitor underwrites the fee, and pilots where adoption is the unknown. What it punishes: success. A free museum at 500,000 annual visits with 40% guide adoption at $3 per use is looking at $600,000 a year — many times what a flat-subscription platform would charge. The very visitors a guide is supposed to serve become a cost line.
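To run this arithmetic against your own traffic, here is a minimal Python sketch. The function name and parameters are mine, not any vendor's billing API, and it assumes the simplest case: every guide session is billed at one flat per-use fee.

```python
def annual_usage_fee(annual_visits: int, adoption_rate: float, fee_per_use: float) -> float:
    """Yearly per-visitor bill: guide sessions multiplied by the per-use fee.

    adoption_rate is the fraction of visitors who use the guide (0.40 = 40%).
    """
    if not 0.0 <= adoption_rate <= 1.0:
        raise ValueError("adoption_rate must be between 0 and 1")
    sessions = annual_visits * adoption_rate
    return sessions * fee_per_use


# The high-traffic example from the text: 500,000 visits, 40% adoption, $3 per use.
fee = annual_usage_fee(500_000, 0.40, 3.00)
print(f"${fee:,.0f} per year")  # $600,000 per year
```

Swapping in the $2 and $4 ends of the per-use range for the same museum gives $400,000 and $800,000, which is why the bill grows with exactly the adoption a museum is trying to encourage.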

The honest read: per-visitor is a great way to start and a difficult way to scale. Most institutions that grow past 100,000 annual guide sessions eventually try to renegotiate to flat.

How does flat subscription pricing work?

A flat subscription charges a single monthly or annual fee — usually scaled to institution size, not visitor count — in exchange for unlimited tours, languages, and visitor sessions. This is now the dominant shape in the AI-narrated audio-guide category. Convo's pricing sits here: Pilot is free, Studio is $1,200/month, Institution is $3,500/month, Enterprise is custom, and at every paid tier all ten languages, unlimited tours, and unlimited visitor Q&A are included. Other modern platforms in the category publish similar shapes, ranging from low-tens-of-dollars-per-month at the small-institution floor to several thousand at the institutional tier.

The principle behind flat subscription is that the marginal visitor should be free. Once a museum has paid for the platform, every additional scan, every additional language, every additional tour stop costs the museum nothing incremental. That's the right incentive: the museum's interest (more visitors using the guide more deeply) and the platform's interest (a long-renewing subscription) line up.

What it rewards: high adoption, deep tours, multilingual reach, frequent updates. The institution that "uses" the subscription most pays the same as the institution that uses it least, which means the engaged customer wins the trade. What it punishes: low-volume venues where the subscription floor is larger than usage would justify. For a museum doing 5,000 annual visits, $14,400 a year at the Studio tier is real money against a usage pattern that a per-visitor model could service for a quarter of that.

The honest read: this is the right model for the median museum in our category — broadly accessible, not extractive at scale — but a small venue with low traffic should run the per-visitor math before signing.
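One way to run that per-visitor math is to ask how many guide sessions a year it takes before per-use fees match the subscription. The sketch below is illustrative only: the helper names are hypothetical, the $1,200/month figure is the Studio tier quoted above, and the $2–$4 rates come from the per-visitor section.

```python
def annual_subscription(monthly_fee: float) -> float:
    """Yearly cost of a flat subscription billed monthly."""
    return monthly_fee * 12


def break_even_sessions(monthly_fee: float, fee_per_use: float) -> float:
    """Guide sessions per year at which per-use fees equal the flat subscription."""
    if fee_per_use <= 0:
        raise ValueError("fee_per_use must be positive")
    return annual_subscription(monthly_fee) / fee_per_use


STUDIO_MONTHLY = 1_200  # Studio tier price from the text

for rate in (2.0, 3.0, 4.0):
    sessions = break_even_sessions(STUDIO_MONTHLY, rate)
    print(f"${rate:.0f} per use -> break-even at {sessions:,.0f} sessions per year")
```

Below the break-even count, the per-visitor bill is the smaller one; above it, the flat fee is. These are sessions, not visits, so converting to attendance still depends on your own adoption assumption.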

How does hardware rental plus content pricing work?

The legacy handset model bundles two costs into one vendor relationship: a per-device daily or monthly rental fee for the physical guide hardware, plus a one-time content production budget in the $30,000–$150,000 range per tour before translation (Convo). The legacy studio-and-handset vendors all sit here. The published cost shape across the category is roughly low-hundreds-of-dollars per device at fleet purchase, $30,000–$150,000 for studio production of the content, then a per-language production multiplier on top.

The real cost most procurement teams underestimate is operational. Once you add cleaning labor (the equivalent of roughly a quarter of an FTE on a busy fleet), batteries and consumables, breakage and loss, and charger amortization, a mid-size handset fleet absorbs the low five figures a year in operating expense before any content updates. That's the line item that doesn't appear in the original RFP and shows up in the operating budget for the next decade.
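To see how that operating line compounds, here is a rough multi-year sketch. Treat every input as an assumption to replace with your own quotes: the production range is the one cited above, but the $10,000 annual operating figure is only a placeholder for "low five figures", and the language multiplier and fleet purchase default to zero because the text gives no numbers for them.

```python
def handset_total_cost(
    production_fee: float,
    annual_opex: float,
    years: int,
    extra_languages: int = 0,
    per_language_multiplier: float = 0.0,
    fleet_purchase: float = 0.0,
) -> float:
    """Total spend over `years`: hardware, studio production, translations, operations.

    per_language_multiplier is the share of the production fee charged again
    for each additional language (hypothetical; ask the vendor for theirs).
    """
    content = production_fee * (1 + per_language_multiplier * extra_languages)
    return fleet_purchase + content + annual_opex * years


# Production range from the text; opex is a placeholder, ten years stands in
# for "the next decade".
for production_fee in (30_000, 150_000):
    total = handset_total_cost(production_fee, annual_opex=10_000, years=10)
    print(f"production ${production_fee:,} -> ${total:,.0f} over 10 years")
```

Even with the placeholder inputs, the point of the exercise holds: over a long contract, the operating line can rival or exceed the production fee that dominated the original quote.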

What it rewards: the case where the production is the product. If your tour features a named voice (a curator, an artist, a celebrity ambassador) and that voice is the curatorial offer, studio production is the right answer. The legacy model is also defensible for institutions mid-contract on a handset fleet: switching before renewal usually doesn't pencil out.

What it punishes: everything that has changed about museum visitors in the last decade. Multilingual reach is unaffordable because every language is another production line. Updates are essentially frozen because re-recording requires the original talent. Hardware logistics — charging, sanitization, theft, repair — are an ongoing operational burden that BYOD eliminates. For most institutions, the model now solves a problem they don't have at a cost they no longer need to absorb.

How does "free" (philanthropy-funded) pricing work?

Bloomberg Connects is the canonical example: a free-to-the-museum, free-to-the-visitor audio guide platform underwritten by Bloomberg Philanthropies as part of its arts-funding work. The platform currently hosts guides for over 1,400 cultural organizations in 47 countries, available in 57 languages. There is no subscription fee, no per-visitor charge, and no production budget extracted from the institution. Bloomberg pays for the platform; participating museums supply the content.

This is a serious offer and worth taking seriously. For a small museum with no audio-guide budget and no realistic path to one, Bloomberg Connects is often the best move available. The content production support is real, the distribution is real, and the price is what's printed on the tin.

The trade-offs are also real, and most institutions weighing the decision underestimate them:

  • You publish inside Bloomberg's app, not your own. Visitors download "Connects: Arts+Culture" from the App Store, not your branded experience. The museum's logo lives inside someone else's frame.
  • You publish on Bloomberg's editorial cadence. Updates, new tours, language priorities, feature roadmap — these are set by the platform's strategy, not yours. For an institution with frequent exhibition turnover, that latency matters.
  • You're tied to a single philanthropic funder. Bloomberg's commitment to the program is real and well-resourced, but it's not contractual to your institution. A future change in philanthropy strategy is a category risk that doesn't apply to a subscription vendor.

The honest read: for institutions where brand-on-glass and editorial control aren't priorities, Bloomberg Connects can be the best line in the budget. For institutions where the visitor experience is part of the curatorial identity, the trade-offs usually push toward a paid platform.

We've written a dedicated comparison on this trade-off in the Convo vs Bloomberg Connects piece if you want the side-by-side.

How should an institution choose between these four?

The decision falls out of three questions, in order: how many visitors do you expect to use the guide, what kind of voice and editorial control do you want, and what does your operations team have capacity to run?

  • If guide adoption is genuinely unknown and you're under 50,000 annual visits, per-visitor pricing matches the risk profile. Pay for sessions you actually serve.
  • If you're a mid-size or high-traffic museum with 50,000+ annual visits and you want the marginal visitor to be free, flat subscription is the right shape. The math improves the more visitors engage.
  • If a specific named voice is the curatorial offer — an artist-narrated tour, a celebrity ambassador, a board-prominent curator — studio production with hardware delivery is still the defensible answer for that specific tour, even if your other tours move to a phone-based platform.
  • If your audio-guide budget is genuinely zero and you're comfortable publishing inside a third-party app, Bloomberg Connects is a serious option. Read the editorial-control trade-offs carefully.

Most institutions we talk to land in the second bucket — flat subscription — and the question becomes which subscription platform, not which pricing model.
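The four rules of thumb above can be written down as a small checklist. This is a sketch of my own framing, not a procurement tool: the `Institution` fields and the `candidate_models` helper are hypothetical names, and the function deliberately returns every model whose condition matches rather than picking a winner, since the rules can overlap.

```python
from dataclasses import dataclass


@dataclass
class Institution:
    annual_visits: int
    adoption_unknown: bool          # no reliable estimate of guide uptake yet
    named_voice_is_the_offer: bool  # e.g. an artist-narrated tour
    budget_is_zero: bool
    accepts_third_party_app: bool   # comfortable publishing inside someone else's app


def candidate_models(inst: Institution) -> list[str]:
    """Return the pricing models whose rule of thumb applies to this institution."""
    candidates = []
    if inst.adoption_unknown and inst.annual_visits < 50_000:
        candidates.append("per-visitor")
    if inst.annual_visits >= 50_000:
        candidates.append("flat subscription")
    if inst.named_voice_is_the_offer:
        candidates.append("hardware rental + content (for that tour)")
    if inst.budget_is_zero and inst.accepts_third_party_app:
        candidates.append("free (philanthropy-funded)")
    return candidates


mid_size = Institution(
    annual_visits=120_000,
    adoption_unknown=False,
    named_voice_is_the_offer=False,
    budget_is_zero=False,
    accepts_third_party_app=False,
)
print(candidate_models(mid_size))  # ['flat subscription']
```

An empty list or more than one match is a prompt to run the numbers for each candidate, not a verdict.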

Where does Convo sit, and why?

We chose flat subscription for two reasons. First, the unit economics of AI-narrated tours don't behave like the unit economics of studio production. Once a museum has uploaded its source materials and approved its scripts, the marginal cost of serving an additional visitor — or generating an additional language — is small enough that charging per visitor would distort the relationship. A platform that bills per scan has a structural incentive to not help the institution drive adoption.

Second, the museums we talk to plan in operating-budget terms. A flat line that doesn't move when the museum has a busy quarter is easier to defend to a board than a variable bill that spikes the month a popular exhibition opens. "Visitor counts are guidance, not gates — we don't meter on visits" is on our pricing page for a reason; it's the structural choice we want our customers to feel.

Where we're honest about the trade-off: a museum doing 3,000 annual guide sessions could get the same outcome on a per-visitor platform for less. Our Pilot tier is free in part because we want low-volume institutions to use the platform without paying for capacity they don't need; the subscription kicks in when the institution outgrows that.

For a category-level comparison of how the AI-narrated platforms compare to the studio-and-handset model, see AI audio guide vs traditional audio guide. For the full set of buying-side resources, the pillar guide on buying and cost is the index.

FAQ

Usually, yes, up to a crossover point that's typically somewhere between 30,000 and 80,000 annual guide sessions depending on the per-use rate. Below that, per-visitor pricing wins on absolute dollars. Above it, flat subscription wins. The right move is to run the math against your actual traffic and adoption assumptions, not against a vendor's pitch.

Because the marginal cost of serving an extra visitor on a phone-based platform is low, and because flat subscription aligns the platform with the museum. A platform that wins when more visitors engage and a museum that wins when more visitors engage are easy partners. A platform that bills per scan has a structural incentive that points in a different direction.

It is genuinely free in cash terms — no subscription, no per-visitor fee, no production budget extracted from the institution. The costs are non-cash: you publish inside Bloomberg's app instead of your own, you publish on the platform's editorial cadence, and you're tied to a single philanthropic funder for the program's continuity. Whether those costs matter depends on the institution.

Most legacy vendors quote bundled — hardware plus content as a single multi-year contract — rather than publishing a per-device daily rate. The all-in operational cost of running a mid-size handset fleet typically lands in the low five figures a year before content updates (dominated by cleaning labor, with batteries, breakage, and charging amortization stacked on top), and that's the number procurement teams should pressure-test against any quoted bundle.

Yes, and a number of museums do exactly this. A common pattern is a subscription platform as the primary visitor experience and Bloomberg Connects as a distribution channel for select tours that benefit from the in-app reach. The platforms aren't mutually exclusive at the institution level; they're choices per tour.

These exist and they're a fifth shape, though usually as a feature inside a broader visitor-experience platform rather than as an audio-guide vendor in their own right. The trade-off is platform consolidation versus best-of-breed: a bundled solution is easier to manage; a specialized audio platform is usually deeper in the audio-specific features that matter (multilingual coverage, Q&A grounding, update latency).

The verdict

The pricing model a vendor picks isn't a billing detail — it's a signal about who they're built for. Per-visitor platforms are built for low-volume venues and paid-tour operators. Flat-subscription platforms are built for mid- and high-volume museums that want the marginal visitor to be free. Hardware-and-content vendors are built for the dwindling set of cases where a named voice and a permanent installation justify the production investment. And Bloomberg Connects is built for institutions willing to trade brand for genuine zero-dollar distribution.

There isn't a single right answer across the category. There is usually a single right answer for any given institution, and it falls out of the three questions above: expected adoption, editorial control, and operational capacity. Run those before the demo, and the vendor shortlist is shorter than it looked at the start of the procurement cycle.

If you want our take on the buying decision more broadly — RFP shape, total cost of ownership, evaluation criteria — the pillar guide on buying and cost is the place to start. If you want to see our pricing in full, it's published.


About the author

Eric Duffy is the founder of Convo, a platform that lets museums and cultural institutions publish multilingual audio tours their visitors can have a conversation with. He writes about the economics of museum interpretation from inside the category — drawing on RFP data, discovery calls with curators and directors, and the production economics of both the studio-and-handset model and the AI-narrated model. Reach him at eric@convo.app or on LinkedIn.
