Semantic Data New York 2025

How taxonomies, ontologies, and knowledge graphs both unlock and ground generative AI

I had the honor of presenting at Semantic Data New York 2025: Taxonomy, Ontology, and Knowledge Graphs, on October 14. This event, co-located with DAM New York 2025 and now in its second year, showcased semantic data as a powerful strategic complement (or “gateway drug,” as Madi Weland Solomon put it) to generative AI. Speakers explored how it can unlock the potential hidden in Pandora’s generative AI black box while managing the risks it carries and upholding truth, trust, and transparency in information. We also demystified the pathway to semantic maturity and reassured those just getting started with taxonomies that they can start small and leave plenty of room to grow by leaning on core semantic standards. Here are my top four takeaways from the conference.

1. Semantic data opens up the potential of generative AI

  • Ashleigh Faith’s How Asset Tags help make Better AI offered practical guidance on using semantic data to enhance generative AI outputs.
  • Tracy Forzaglia shared a case study of a publishing company in which standards (which vary from U.S. state to U.S. state), AI, and human editors converge to optimize a tagging workflow.

2. Semantic data mitigates risk by giving generative AI guardrails

  • Faith also reminded us that LLMs are designed to hallucinate and fill gaps with plausible content, prioritizing fluidity over facts, and called attention to accurate but vague outputs as an underestimated problem.
  • Ahren Lehnert’s Language, Semantics, and the Shaping of “Truth” cautioned that, where use cases carry high risk (for example, legal liability or reputation damage), we should tread lightly with experimentation and heavily with stakeholder involvement.
  • In Creating Taxonomy Governance in a Platform Environment, Laura Rodriguez also stressed the importance of clear, consistent, and persistent communication with stakeholders as a critical tool for maintaining semantic integrity.
  • In Information Quality is Information Ethics, Gary Carlson introduced a practical tool to mitigate risk: an information quality and ethics framework for identifying threats to information quality (i.e., risks of entropy or noise) at every stage in the information life cycle.

3. Semantic data is a spectrum, not a big bang

  • Jessica Talisman’s The Ontology Pipeline Explained included a breakdown of her Ontology Pipeline model, which framed semantic maturity as a step-by-step process, as well as her back-to-basics messages that “metadata is the love language of the enterprise” and it’s perfectly fine to start with a spreadsheet.
  • In Just Enough Semantics, I gave practical tips for organizations to start (or continue) moving along the semantic data spectrum, root their semantic efforts in standards (namely, ANSI/NISO Z39.19 and SKOS), and conscientiously work around common system constraints without compromising semantic integrity.

4. Semantic data’s future: predictions and guesses

  • Lehnert’s talk also anticipated a movement towards gaining “semantic literacy” and proving generative AI’s ROI with metrics that weigh the positives, such as time saved, against the negatives, such as time wasted on “data slop.”
  • In Yale’s Cultural Heritage Knowledge Graph – Lessons Learned, Robert Sanderson predicted that “AI will never create good ontologies because it doesn’t have a worldview.”

Semantic Data New York 2025 was an excellent precursor to Taxonomy Boot Camp, which will be celebrating its 20th birthday this November. I look forward to continuing the conversation around taxonomies, ontologies, and knowledge graphs, including generative AI’s unfolding impacts on how information is organized and the perennial importance of building trust, fostering transparency, upholding ethics, mitigating risk, and proving value.

 

Taxonomy, Tech, and Transformation: Highlights from Taxonomy Boot Camp 2024

Dovecot was thrilled to have five representatives attend this year’s Taxonomy Boot Camp conference to share and learn about the latest developments in taxonomy, ontology, and semantics. 

  • Stephanie Lemieux, President, served as conference chair, MC, and facilitator for a number of remarks and panels
  • Michele Ann Jenkins, Senior Consultant, facilitated the Taxonomy 101 Workshop, presented “Aligning AI Approaches for Taxonomy & Tagging” and, in the Enterprise Search conference (co-located with Taxonomy Boot Camp), presented “The Role Taxonomies Can Play in Enterprise Search”
  • Katherine Black, Senior Consultant, facilitated the Taxonomy 101 Workshop
  • Melissa Knudtson Monsalve, Taxonomy Consultant
  • Stephanie Duncan, Taxonomy Consultant, presented “Future-Proofing Your Organization’s Taxonomy With a Governance Plan”

The event provided an incredible opportunity to dive deep into the latest developments in organizing and structuring information for a wide range of use cases. Now that the dust has settled, we have come together to share some key takeaways and insights.

Artificial intelligence (of course)

AI was, as expected, a big topic of discussion!

Katherine: Shannon Moore and Max Gaibort’s presentation “LLMs & ‘Human-in-the-Loop’ Taxonomy Development at EA Games” was a masterclass in how to integrate AI into taxonomy work. They demonstrated the painstaking process of manually taxonomizing user language and retagging content over two years, followed by using AI methods. It was an excellent real-world example of how AI requires significant manual effort for cleanup and validation to work effectively.

Michele: I agree. That presentation and Rebekah Randle and Andy Fitzgerald’s “Using AI and ML to Do Taxonomy Heavy Lifting” were great because they showed actual in-production use of AI tools for classification. Like you, Katherine, I especially appreciated that they did not downplay the huge amount of manual effort needed to get these projects off the ground and keep a close eye on everything.

Katherine: Michele, you and Erik Lee gave a great presentation on this topic, “Aligning AI Approaches for Taxonomy and Tagging.” It was a pragmatic reality check on the effective use of AI. You and Erik really showed where and how it fits into our broader toolkit alongside less glamorous but reliable methods and tools.

Semantic debt

The idea of “semantic debt” came up more than once. It extends “technical debt,” a concept first introduced by computer scientist Ward Cunningham in the early 1990s to explain the long-term costs and trade-offs of quick, suboptimal software development decisions, to semantic structures.

Katherine: Ahren Lehnert’s “Semantic Layers and the Ghost in the Machine” was great. Ahren highlighted the role of taxonomists as interpreters, and addressed issues like bias in taxonomies and semantic debt.

Michele: We’re always making hard decisions and sometimes you need to move forward with a less-than-ideal solution. That’s when you start to accrue “semantic debt.” Long-term planning needs to include time and resources for resolving semantic debt—going back and fixing things.

Melissa: Agreed, Michele. Ahren’s other presentation, “Stand Still Like the Hummingbird: Enterprise Taxonomy Strategy When Nothing Stands Still,” touched on exactly why that’s important: taxonomies are records of business continuity. As he put it, “Businesses change, but taxonomies are forever.”

Creative inspiration

A great part of every Taxonomy Boot Camp is the range of real-world solutions showcased in its diverse case studies.

Melissa: This was a great year to gather inspiration for creative solutions and innovative ideas. For example, in “Extensible Taxonomies for Sustainability,” Marjorie Hlava shared a case study featuring a voting system for users to weigh in on tagging quality in open-access scientific journals. What an interesting way to get user input at the point of use.

Katherine: Richard Huffine’s presentation “Mastering Metadata with a Data Catalog” offered a fascinating look into the FDIC’s vast data and human resources, and into the innovative metadata models and data catalog that the library team developed in collaboration with Enterprise Knowledge to wrangle them.

Melissa: I also thoroughly enjoyed Laura Rodriguez and Melissa Casey’s “Untangling Credentialing: A Healthcare Use Case for Data and Metadata” with its behind-the-scenes look at how HealthStream engineered contextual display labels as a starting point for delivering more personalized experiences to healthcare professionals.

Michele: More broadly, I found the panels bringing together some of our brightest minds for informal discussions to be a real highlight. The discussions were both practical and inspiring. 

Taxonomists at work

At the heart of every taxonomy are the people who make it happen!

Melissa: I like to think about Dovecot as a small company powered by relationships, so I appreciated Thomas Stilling’s keynote “Be the Change: Your Taxonomy Expertise Can Help Drive Organizational Transformation” and its “four types of communicators”: promoters/persuaders, analysts, supporters, and controllers. People sometimes wear different hats and play different communication roles to keep a taxonomy effort moving forward.

Stephanie Duncan: Speaking of roles, Joyce van Aalten’s “Journey from a Minimal Viable Taxonomy to a Full Taxonomy” made a good case for limiting the role of SMEs in the early stages of development of a minimum viable taxonomy in favor of input from content creators/owners, then bringing SMEs on board to shift from an “MVT” to a full taxonomy.

Melissa: That’s smart, Stephanie. More isn’t always more.

Stephanie Duncan: I also appreciated Bonnie Griffin’s “Consulting from Within: Best Practices for the Solo Taxonomist.” As she pointed out, a substantial amount of work is introducing and re-introducing people to the taxonomy in various mediums.

Katherine: Yes. A standout insight from Duane Degler’s “Enabling Exploratory Discovery Through Taxonomy” was his advice for “selling taxonomy up”—finding emotionally resonant phrases and repeating them until they resonate across the organization.

Stephanie Duncan: Absolutely. According to Bonnie, it’s best to set aside talk of industry standards and best practices and focus on how taxonomy can ease specific pain points across the organization. We can amplify our efforts to “sell taxonomy up” by “cloning ourselves”: identifying allies who can introduce taxonomy as accurately as we could.

Beyond taxonomies: knowledge graphs and enterprise search

More and more we see taxonomists needing to branch out into more advanced techniques and technologies.

Katherine: In “Enabling Exploratory Discovery Through Taxonomy,” Duane also explored how taxonomies and knowledge graphs shape user experiences, drawing on inspirational and immersive projects from the Georgia O’Keeffe Museum and the Texas Coastal Bend Collection.

Stephanie Duncan: I enjoyed hearing Duane’s thoughts on how knowledge graphs can be used to broaden a user’s view of content by linking related content and creating relationships (“horizontal navigation”), as well as creating exploratory and immersive experiences and motivating people to learn.

Michele: Over in the Enterprise Search & Discovery conference, I heard that we don’t need metadata because we can just understand all the content, we don’t need content because we can just have metadata and data in a graph, and we definitely don’t need users because they make everything difficult. (laughing) I’m glad we have that all sorted out! But, more seriously, there’s a clear need for taxonomists to continue to educate about and communicate the value of taxonomies even in very advanced technical ecosystems. There will always be a need for our expertise to bridge the gap between humans and machines.