ESSAY

Authenticity and AI: addressing the elephant in the room.

When you say "AI-generated audio" to a museum professional, "authenticity" comes up within five minutes. The concern is real, and the answer is more interesting than the dismissal.

ERIC DUFFY · FOUNDER · APR 28, 2026 · 9 MIN READ

[Lead image: a curator's gloved hands carefully holding a painted ceramic vessel over a felt-lined worktable, conservation tools beside it.]

Let's address this directly. When you mention AI-generated audio tours to museum professionals, "authenticity" comes up within the first five minutes.

Is the narration authentic? Can a machine capture the institutional voice? What about the human connection that makes museum visits meaningful? Doesn't this undermine everything museums stand for?

These are legitimate questions. They deserve serious answers, not dismissal. So let's work through what authenticity actually means in museum interpretation — and where assisted production fits.

What makes interpretation "authentic"?

Before we can assess whether assisted production threatens authenticity, we need to define what we're protecting.

When museum professionals talk about authentic interpretation, they typically mean some combination of:

Scholarly accuracy. The information is correct, based on research, vetted by experts. It reflects current understanding and acknowledges uncertainty where it exists.
Institutional voice. The interpretation sounds like it comes from this museum — its values, its perspective, its way of engaging with visitors. Not generic, not interchangeable with any other institution.
Curatorial perspective. The choices about what to highlight, what connections to draw, what stories to tell — these reflect human judgment and expertise. Someone decided this matters, and here's why.
Mission alignment. The interpretation serves the museum's educational mission. It's not entertainment for its own sake, not marketing, not propaganda. It helps visitors understand and connect with the collection.
Genuine engagement. The interpretation feels like it comes from people who care about this material, not from a content mill producing filler.

Notice what's not on this list: the specific tools used to deliver the interpretation.

A wall card is authentic if it meets these criteria. A docent tour is authentic if it meets these criteria. An audio tour — whether the voicing is recorded in a booth or generated from a script — is authentic if it meets these criteria.

Authenticity is about the substance, not the delivery mechanism.

The false dichotomy

Much of the anxiety about machine-assisted production in museums rests on a false dichotomy: human versus machine, authentic versus artificial, real versus fake.

But that's not actually the choice.

The choice isn't between machine-produced interpretation and human-produced interpretation. It's between:

Option A: Interpretation created by your curators, reflecting your scholarship, in your institutional voice, produced with tooling that gets it to your visitors in days rather than months.
Option B: No interpretation at all for the visitors your current resources can't reach.

For the Spanish-speaking family visiting on Sunday, the choice isn't "machine-voiced audio tour or human docent." The choice is "a tour in their language or nothing."

For the small museum that can't afford a full traditional production, the choice isn't "machine-voiced audio or studio-voiced audio." The choice is "an audio layer or no audio layer at all."

For the temporary exhibition that opens in six weeks, the choice isn't one production method versus another. The choice is "a tour ready for opening day, or no tour until after the show closes."

When we frame assisted production as threatening authenticity, we're often comparing it to an ideal that doesn't exist for most museums. The relevant comparison is to the actual alternative — which is frequently nothing.

Where human expertise is irreplaceable

Let's be clear about what assisted production cannot and should not do.

Curatorial judgment. Deciding what matters, what stories to tell, what connections to draw — these are human decisions. Production tooling can help execute those decisions, but it can't make them. The curator who decides to highlight the provenance story rather than the technical analysis is exercising judgment that no tool replicates.
Scholarly research. The underlying knowledge that informs interpretation comes from human scholarship. Tooling can help communicate that research, but it can't conduct it. The years of study, the archival work, the careful analysis — these remain human endeavors.
Institutional voice. What makes your museum sound like your museum comes from the humans who work there. The tooling can be tuned to reflect that voice, but the voice itself emerges from human culture and values.
Ethical judgment. How to discuss sensitive topics, when to acknowledge contested histories, how to represent marginalized perspectives — these require human wisdom and institutional values. Tooling can follow guidelines, but humans must set them.
Quality control. Every piece of machine-assisted output should be reviewed by humans who can catch errors, assess tone, and ensure accuracy. The tooling is an instrument, not an autonomous agent.

The museum professionals who worry about authenticity are right to insist that human expertise remains central. The question is whether that expertise is best deployed in recording studios and editing suites, or in the curatorial and educational decisions that shape what visitors experience.

Where assisted production genuinely helps

If human expertise is irreplaceable for the substance of interpretation, what does the tooling actually contribute?

Production access. The cost barrier that kept most museums from professional audio tours is collapsing (see the math on how long an audio tour actually takes to produce). The expertise to create great interpretation exists in museums of all sizes. What was missing was affordable production.
Language access. Generating audio in multiple languages from the same source means the Spanish-speaking family, the Mandarin-speaking tourist, and the French-speaking scholar can all reach your interpretation. Human expertise created the content; the tooling makes it available to more humans.
Update agility. When research changes, assisted production lets you update immediately. The authenticity problem isn't the tooling. It's the frozen, outdated content that traditional production creates because updates are too expensive.
Depth on demand. Different visitors have different questions. A conversational tour can respond to each of them, drawing on the same curatorial knowledge base. The expertise is human; the adaptation is mechanical.
Availability. A conversational layer is available when docents aren't — at 4pm on a Tuesday, after hours, during peak times when staff are overwhelmed. It extends human expertise rather than replacing it.

In each case, the tooling handles production and delivery while humans handle substance and judgment. The division of labor makes sense: let the tooling do the work that's mechanical (generating audio from a finished script, regenerating ten languages from one source, responding to a specific question) and let humans do the work that's curatorial (scholarship, judgment, voice).

The authenticity of access

Here's a question worth sitting with. What's more authentic to a museum's mission — interpretation that's "purely" human-produced but reaches only some visitors, or interpretation that uses assisted production but reaches everyone?

If your mission statement promises to serve diverse communities, is it authentic to serve only English speakers because multilingual production was too expensive?

If your mission is education, is it authentic to leave most visitors without interpretation because professional audio tours were out of reach?

If your mission is access, is it authentic to offer rich content only to those who can afford docent tours or visit during limited hours?

There's a version of "authenticity" that prioritizes purity of production method over actual service to visitors. And there's a version that prioritizes fulfilling the museum's mission — reaching more people, serving diverse needs, making knowledge accessible.

The second version seems more authentic to what museums are actually for.

A framework for thinking about this

When evaluating whether assisted interpretation is "authentic," consider:

Who created the content? If curators and educators developed the interpretation (the stories, the connections, the perspective), the content is authentic regardless of how it's delivered. The tooling didn't write your interpretation. Your experts did; the tooling helped produce and deliver it.
Who controls the output? If museum staff review, approve, and can modify everything visitors hear, authenticity is maintained. The tooling is an instrument under human control, not an autonomous agent making decisions.
Does it reflect your institutional voice? If the interpretation sounds like your museum — your values, your perspective, your way of engaging — it's authentic. The tooling can be tuned on your materials to reflect your voice, not a generic one.
Is it accurate? If the information is correct, based on your scholarship, and updated when knowledge changes, it's authentic. Assisted production actually helps here by making same-day corrections feasible.
Does it serve visitors? If visitors learn, connect, and have their curiosity met, the interpretation is doing its job. The delivery mechanism matters far less than the outcome.

The real threat to authenticity

Here's what actually threatens authentic museum interpretation.

Outdated content. The audio tour from 2019 that still references the wrong attribution because updating is too expensive. That's inauthentic — it doesn't reflect current knowledge.
Inaccessible interpretation. The brilliant curatorial insights that never reach visitors because production costs are prohibitive. Knowledge locked away isn't serving anyone.
One-size-fits-none. The compromise content that tries to serve everyone and fully serves no one. Generic interpretation isn't authentic to any visitor's actual needs.
Silence. The galleries with no interpretation at all because the museum couldn't afford it. The absence of your voice means visitors fill the gap with whatever a phone search hands them — which is definitely not your authentic perspective.

The tooling doesn't threaten authenticity. Inaccessibility threatens authenticity. Staleness threatens authenticity. Silence threatens authenticity.

Assisted production is one way to address those threats — if we use it thoughtfully, with human expertise at the center.

The question to ask

Instead of "is this authentic?" try asking: does this interpretation reflect our scholarship, our voice, and our mission? Does it serve our visitors well? Is it accurate and current?

If the answer is yes, the production method is a detail.

If the answer is no, no production method will fix it.

Authenticity isn't about how interpretation is produced. It's about what it contains, who shaped it, and whether it serves the people who experience it.

That's always been true. The new tooling doesn't change it.

This connects to the broader shifts in visitor expectations we've been exploring, and to the synthesis of how those changes come together in what today's museum visitors expect.

Visitors can sometimes tell a generated voice from a human one, depending on the voice. Most museums comparing the two find that visitors notice script quality and curatorial accuracy more than voice timbre. In the engagement numbers we've seen reported, a great script in a competent generated voice outperforms a mediocre script in a great human voice.

The work being shifted is almost always work that wasn't budgeted in the first place. Museums that produced a single English track in a studio still can; the new tooling shows up where there was no budget for the second, third, or fourth language. The pie is growing, not being redistributed.

Same way they always have, with a partner — a translator-of-record, a community advisor, or a curator at a peer institution. The platform doesn't change the review obligation; it makes the iteration cycle short enough that review actually happens before launch instead of after.

Disclosure can be a short line on the tour page or in the credits, naming what the tooling did and what the curators did. The field is still settling on a standard; most US institutions we've talked to are comfortable with a one-sentence note crediting the curatorial team as authors and naming the platform as the production tool.

About the author

Eric Duffy is the founder of Convo, a platform that lets museums and cultural institutions publish multilingual audio tours their visitors can have a conversation with. He writes about how museums could afford to be more ambitious with interpretation, drawing on discovery conversations with curators, directors, and education leads at small and mid-size US museums. Reach him at eric@convo.app or on LinkedIn.