Taxonomy, Tech, and Transformation: Highlights from Taxonomy Boot Camp 2024

Dovecot was thrilled to have five representatives attend this year’s Taxonomy Boot Camp conference to share and learn about the latest developments in taxonomy, ontology, and semantics. 

  • Stephanie Lemieux, President, Conference chair, MC, and facilitator for a number of remarks and panels.
  • Michele Ann Jenkins, Senior Consultant, facilitated the Taxonomy 101 Workshop, presented “Aligning AI Approaches for Taxonomy & Tagging” and, in the Enterprise Search conference (co-located with Taxonomy Boot Camp), presented “The Role Taxonomies Can Play in Enterprise Search”
  • Katherine Black, Senior Consultant, facilitated the Taxonomy 101 Workshop
  • Melissa Knudtson Monsalve, Taxonomy Consultant
  • Stephanie Duncan, Taxonomy Consultant, presented “Future-Proofing Your Organization’s Taxonomy With a Governance Plan”

The event provided an incredible opportunity to dive deep into the latest developments in organizing and structuring information for a wide range of use cases. Now that the dust has settled, we got together to share some key takeaways and insights.

Artificial intelligence (of course)

AI was, as expected, a big topic of discussion!

Katherine: Shannon Moore and Max Gaibort’s presentation “LLMs & ‘Human-in-the-Loop’ Taxonomy Development at EA Games” was a masterclass in how to integrate AI into taxonomy work. They demonstrated the painstaking process of manually taxonomizing user language and retagging content over two years, followed by using AI methods. It was an excellent real-world example of how AI requires significant manual effort for cleanup and validation to work effectively.

Michele: I agree. That and Rebekah Randle and Andy Fitzgerald’s “Using AI and ML to Do Taxonomy Heavy Lifting” were great because they showed actual in-production use of AI tools for classification. Like you, Katherine, I especially appreciated that they did not hedge about the huge amount of manual effort needed to get these projects off the ground and keep a close eye on everything. 

Katherine: Michele, you and Erik Lee gave a great presentation on this topic, “Aligning AI Approaches for Taxonomy and Tagging.” It was a pragmatic reality check on the effective use of AI. You and Erik really showed where and how it fits into our broader toolkit alongside less glamorous but reliable methods and tools.

Semantic debt

The idea of “semantic debt” came up more than once. It extends “technical debt,” a term first introduced by computer scientist Ward Cunningham in the early 1990s to explain the long-term costs and trade-offs associated with quick, suboptimal software development decisions, to the semantic structure.

Katherine: Ahren Lehnert’s “Semantic Layers and the Ghost in the Machine” was great. Ahren highlighted the role of taxonomists as interpreters, and addressed issues like bias in taxonomies and semantic debt.

Michele: We’re always making hard decisions and sometimes you need to move forward with a less-than-ideal solution. That’s when you start to accrue “semantic debt.” Long-term planning needs to include time and resources for resolving semantic debt—going back and fixing things.

Melissa: Agreed, Michele. Ahren’s other presentation, “Stand Still Like the Hummingbird: Enterprise Taxonomy Strategy When Nothing Stands Still,” touched on exactly why that’s important: taxonomies are records of business continuity. As he put it, “Businesses change, but taxonomies are forever.”

Creative inspiration

A great part of every Taxonomy Boot Camp is the real-world solutions showcased in the diverse range of case studies.

Melissa: This was a great year to gather inspiration for creative solutions and innovative ideas. For example, in “Extensible Taxonomies for Sustainability,” Marjorie Hlava shared a case study featuring a voting system for users to weigh in on tagging quality in open-access scientific journals. What an interesting way to get user input at the point of use.

Katherine: Richard Huffine’s presentation “Mastering Metadata with a Data Catalog” offered a fascinating look into the FDIC’s vast data resources and human resources and the innovative metadata models and data catalog developed by the library team in collaboration with Enterprise Knowledge to wrangle them.

Melissa: I also thoroughly enjoyed Laura Rodriguez and Melissa Casey’s “Untangling Credentialing: A Healthcare Use Case for Data and Metadata” with its behind-the-scenes look at how HealthStream engineered contextual display labels as a starting point for delivering more personalized experiences to healthcare professionals.

Michele: More broadly, I found the panels bringing together some of our brightest minds for informal discussions to be a real highlight. The discussions were both practical and inspiring. 

Taxonomists at work

At the heart of every taxonomy are the people who make it happen!

Melissa: I like to think about Dovecot as a small company powered by relationships, so I appreciated Thomas Stilling’s keynote “Be the Change: Your Taxonomy Expertise Can Help Drive Organizational Transformation” and its “four types of communicators”: promoters/persuaders, analysts, supporters, and controllers. People sometimes wear different hats and play different communication roles to keep a taxonomy effort moving forward.

Stephanie Duncan: Speaking of roles, Joyce van Aalten’s “Journey from a Minimal Viable Taxonomy to a Full Taxonomy” made a good case for limiting the role of SMEs in the early stages of development of a minimum viable taxonomy in favor of input from content creators/owners, then bringing SMEs on board to shift from an “MVT” to a full taxonomy.

Melissa: That’s smart, Stephanie. More isn’t always more.

Stephanie Duncan: I also appreciated Bonnie Griffin’s “Consulting from Within: Best Practices for the Solo Taxonomist.” As she pointed out, a substantial amount of work is introducing and re-introducing people to the taxonomy in various mediums.

Katherine: Yes. A standout insight from Duane Degler’s “Enabling Exploratory Discovery Through Taxonomy” was his advice for “selling taxonomy up”—finding emotionally resonant phrases and repeating them until they resonate across the organization.

Stephanie Duncan: Absolutely. According to Bonnie, it’s best to set aside talk of industry standards and best practices and focus on how taxonomy can ease specific pain points across the organization. We can amplify our efforts to “sell taxonomy up” by “cloning ourselves”: identifying allies who can introduce taxonomy as accurately as we could.

Beyond taxonomies: knowledge graphs and enterprise search

More and more we see taxonomists needing to branch out into more advanced techniques and technologies.

Katherine: In “Enabling Exploratory Discovery Through Taxonomy”, Duane also explored how taxonomies and knowledge graphs shape user experiences, drawing on inspirational and immersive projects from the Georgia O’Keeffe Museum and the Texas Coastal Bend Collection.

Stephanie Duncan: I enjoyed hearing Duane’s thoughts on how knowledge graphs can be used to broaden a user’s view of content by linking related content and creating relationships (“horizontal navigation”), as well as creating exploratory and immersive experiences and motivating people to learn.

Michele: Over in the Enterprise Search & Discovery conference, I heard that we don’t need metadata because we can just understand all the content, we don’t need content because we can just have metadata and data in a graph, and we definitely don’t need users because they make everything difficult. (laughing) I’m glad we have that all sorted out! But, more seriously, there’s a clear need for taxonomists to continue to educate about and communicate the value of taxonomies even in very advanced technical ecosystems. There will always be a need for our expertise to bridge the gap between humans and machines.

Come Join Us: Dovecot at Taxonomy Boot Camp and Enterprise Search & Discovery Conferences

KMWorld

KMWorld, the leading knowledge management conference, takes place in Washington, DC this November 18 – 21, and Dovecot Studio will be there with four different presentations, including the half-day Taxonomy Boot Camp workshop. KMWorld features four co-located events: Taxonomy Boot Camp, Enterprise Search & Discovery, Text Analytics Forum, and Enterprise AI World.

Taxonomy Boot Camp

Taxonomies are all about creating structures that bring data and information to life. Taxonomy Boot Camp is the only conference dedicated to taxonomy building and management. Join us as we explore the successes, challenges, and methods of putting taxonomies to work for your organization.

Taxonomy Boot Camp is designed for Taxonomists and Ontologists, Information Architects, Content Managers, Knowledge Engineers, Intranet Professionals, Content Classification Specialists, Information Professionals, Information Scientists, and anyone responsible for classifying, organizing, or managing content. 

Taxonomy 101 Workshop

Monday, November 18, 2024

2:00 p.m. – 5:00 p.m.

Michele Ann Jenkins, Senior Consultant – Dovecot Studio, Canada

Katherine Black, Senior Consultant – Dovecot Studio

Lauren Clark Hill, Client Solutions Specialist – Synaptica

Whether you are brand new to the world of taxonomy or are looking to solidify your foundational knowledge, this workshop equips you with the key concepts to help you hit the ground running on your own taxonomy work. Starting with an accessible, practical examination of what taxonomies are, learn how they fit into the information and content management landscape and the most common use cases, including dynamic content, search and discovery, and reporting. Explore the three pillars of what makes a good taxonomy good: strategy and style (term selection and form, relationships, synonyms, and other properties), governance (roles and responsibilities, processes, and documentation), and technology (technical standards, taxonomy tools, metrics, and analytics needed to implement, integrate, and monitor a taxonomy across platforms). Hear about the more advanced approaches such as knowledge graphs, ontologies, and AI tools. Clark also gives a special deep dive on practical taxonomy change management, including policies, approval workflows, and various methods of versioning and tracking.

Future-Proofing Your Organization’s Taxonomy With a Governance Plan

Monday, November 18, 2024

3:15 p.m. – 4:15 p.m.

Paula Little, Lead Senior Information Architect & Taxonomist – Factor Firm

Connor Cantrell, Information Architect – Factor

Kristen C. Ratanatharathorn, Assistant Director – Grant Systems and Data – The Andrew W. Mellon Foundation (AWMF)

Stephanie Duncan, Taxonomy Consultant – Dovecot Studio

Change is inevitable, but designing (and following) a governance framework is much easier said than done. Little & Cantrell explore some of the key pillars of a good governance plan, including business drivers for taxonomy changes, guidelines for balancing proactive and reactive workflows, and communication and training plans using case studies of recently implemented plans. Learn the critical role documentation plays in change management and the types of documentation needed for success. Ratanatharathorn and Duncan describe the governance and maintenance strategies for the Grant Classification Taxonomy, which has been in use since September 2021 at the AWMF. Hear their processes and best practices for understanding and documenting use cases, vetting them, and balancing the perspectives of different user groups in order to cultivate a taxonomy that suits the needs of many. 

Aligning AI Approaches for Taxonomy & Tagging

Tuesday, November 19, 2024

1:45 p.m. – 2:30 p.m.

Michele Ann Jenkins, Senior Consultant – Dovecot Studio, Canada

Erik Lee, Taxonomist – Factor

As the AI rush began, companies created directives to integrate AI into their products to avoid getting left behind. The result of this “AI for AI’s sake” mindset has been a slew of poor implementations and worse outcomes. However, it is possible to know if, when, and how to  integrate AI intentionally into a project by aligning integration with your methodology. Lee explores the spectrum of available tools, ranging from manual effort to advanced techniques leveraging multiple AI techniques. Spoiler: It’s not just LLMs! Jenkins dives deeper into the key use case around using different approaches to validate and enhance metadata tagging workflows to reduce the burden on content creators and improve quality. Hear caveats, considerations, and risks involved in adding AI automations to tagging workflows. Learn the practical applications of AI in taxonomy and tagging illustrated with real-world examples that can be implemented today, as well as insights into what’s on the horizon for tomorrow.

 

Enterprise Search & Discovery

The Enterprise Search & Discovery conference is the only conference dedicated to exploring this critical business and technical challenge and opportunity. Discover how to design, build, and manage better search and discovery to help extract critical knowledge and business value from your organizational data. Join us as we explore how to provide transformative enterprise search and information discovery across your organization.

Enterprise Search & Discovery is designed for Search Managers, Line-of-Business Departmental Managers, IT Managers, Information & Knowledge Architects, Compliance and Legal Officers, and anyone responsible for organizing, managing, and retrieving internal and/or external information. 

 

The Role Taxonomies Can Play in Enterprise Search

Thursday, November 21, 2024

1:00 p.m. – 1:45 p.m.

Michele Ann Jenkins, Senior Consultant – Dovecot Studio, Canada

Marjorie Hlava, Chief Scientist – Access Innovations Data Harmony

Large organizations often turn to enterprise search to solve the challenges of siloed content management systems and fragmented search experiences, but the outcome depends on the quality and consistency of the associated metadata and taxonomy. Believing that a rising tide lifts all repositories, Dovecot’s Jenkins discusses how to align and enhance metadata and taxonomy ahead of enterprise search. Developing a semantic layer, including GenAI, auto-classification, mapping, and other business logic, can support a processing layer to harmonize and enhance metadata beyond the capabilities of the individual source repositories. Access Innovations’ Hlava provides a case study on search recommendations using taxonomy tags. The McGraw-Hill Access Engineering implementation of search depends on the weighted taxonomy tags applied to the individual pieces of content (the information objects) rather than on relevance and co-occurrence. She outlines the process of taxonomy tagging and the search parameters to achieve amazingly high accuracy and consistency.

 

We hope to see you there!

Bite-Sized Taxonomy Boot Camp

Join Dovecot Studio’s Michele Ann Jenkins for the October Bite-Sized Taxonomy Boot Camp.

More taxonomies in action

Wednesday 9 October: 16.00 – 16.45 GMT

Tips for taxonomy hierarchies

Making the most of SKOS relationships for mapping

This very practical session covers two dimensions of an expressive and interoperable taxonomy.

Heather covers all aspects of hierarchies, including faceting, polyhierarchy, and some common pitfalls to watch out for.

Michele takes a deep dive into the various SKOS relationships that are useful for mapping and matching taxonomy terms from other sources.

 

Moderator:

Helen Lippell, Taxonomy, Metadata & Search Consultant, Chair, Bite-Sized Taxonomy Boot Camp London

Speakers:

Heather Hedden, Taxonomy Consultant, Hedden Information Management, USA and Author, The Accidental Taxonomist

Michele Ann Jenkins, Senior Consultant, Dovecot Studio, Canada

Register for the online presentation

Michele Ann Jenkins: Taxonomy as the Foundation of Semantic Architecture – Episode 200

Dovecot’s Michele Ann Jenkins got to sit down with Larry Swanson from Content Strategy Insights for an in-depth discussion about taxonomy, ontology, maturity models, and strategies to level up semantic structures in organizations. They also discussed the role of governance, processes, and making sure to keep humans in the loop while offloading the heavy lifting to knowledge graphs and other tools in the semantic layer toolbox.

Available in audio, video, and text transcript.

https://ellessmedia.com/csi/michele-ann-jenkins/

New Taxonomy Book, with a Chapter on Search by Dovecot’s Michele Ann Jenkins

Right in the middle of all the craziness that was 2020, Helen Lippell reached out to me about writing a chapter for the taxonomy book she would be editing. I was excited to hear that she was looking for practical, accessible guidance and real-world examples. I was also eager for a distracting project that I could sink my teeth into. I had had the opportunity to work with Helen on a long-term project with a couple of onsite/in-person (those were the days!) work sessions, and I knew she would bring her deep expertise and wonderful approachability to the book.

In my chapter, I delve into considerations for leveraging taxonomies in search and provide a detailed case study touching on the most common use cases. Taxonomists frequently say that “taxonomy can help search,” but just how and why is often glossed over.

Taxonomies: Practical Approaches to Developing and Managing Vocabularies for Digital Information contains chapters from leading minds in the taxonomy domain covering everything in the taxonomy development lifecycle, from business buy-in and scoping to implementation and governance.

Taxonomies is available for purchase now in physical and digital formats through the UK publisher and will be available for North American sales later this summer.
https://www.alastore.ala.org/taxonomies
https://www.facetpublishing.co.uk/page/detail/taxonomies/?K=9781783304813

OHCHR Topic Page

International Non-Profit Launches a New Taxonomy-powered Website

Dovecot is pleased to announce that, after an enormous technical and editorial effort, a large-scale, international non-profit has launched their new, taxonomy-powered website. This Drupal site features over 20,000 pages of HTML content and tens of thousands of digital assets supporting the crucial and sensitive work of the organization across the globe.

In this site, taxonomy drives complex content aggregation and dynamic placement as well as search and filtering.

Dovecot will be presenting “Optimizing the haystack: Improving findability in content-heavy websites” with partners Bluestate and Axelerant at DrupalCon 2022 in Portland, Oregon on April 26. Be sure to say hi if you are able to attend!

Read our case study for more on how we helped this large international non-profit with taxonomy harmonization and development.

Word frequency code example

DIY Text Analysis for Taxonomists

Taxonomy, at its heart, is about making connections between concepts and labels. On the conceptual side, taxonomy design requires analyzing and understanding users’ needs and mental models. On the label side, there is the body of content (or “corpus” in info science speak), which may be quite large, running to millions of words (or more!). Getting a handle on that much text can be challenging for a human mind, but luckily we live in a time with technology that doesn’t break a sweat running millions of processes.

Text analysis and processing can be useful for a number of common taxonomy development tasks including:

  • Text mining for candidate terms & synonyms
  • Search log analysis
  • Statistical analysis of current metadata use (e.g. from a CMS database export)
  • Term extraction (e.g. from product names or article titles)
  • Data clean up or transformations
  • Aggregation or separation of values based on different criteria
  • Mapping free text to new controlled taxonomy terms
  • Summarizing labels used in a folder structure
  • Replacing a subset of terms
  • Frequency analysis (seeing how many times any term from a list appears in a corpus)
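As an illustration of the last item, frequency analysis, here is a minimal Python sketch. The sample corpus and term list are made up for the example, and the whole-word, case-insensitive matching is an assumption about what counts as "appearing"; it also assumes each term starts and ends with a letter or digit.

```python
import re
from collections import Counter
from typing import Iterable


def term_frequencies(corpus: str, terms: Iterable[str]) -> Counter:
    """Count case-insensitive, whole-word occurrences of each term in the corpus."""
    counts = Counter()
    for term in terms:
        # \b keeps "tag" from matching inside "tagging"; re.escape protects
        # terms that contain punctuation such as "." or "+".
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        counts[term] = len(pattern.findall(corpus))
    return counts


# Hypothetical sample data, for illustration only.
corpus = (
    "Taxonomy governance matters. A knowledge graph builds on taxonomy, "
    "and good metadata supports both the taxonomy and the knowledge graph."
)
terms = ["taxonomy", "knowledge graph", "metadata", "ontology"]

freqs = term_frequencies(corpus, terms)
for term, count in freqs.most_common():
    print(f"{term}: {count}")
```

Terms that never appear (here, "ontology") still show up with a count of zero, which can be handy for spotting taxonomy terms that the content does not actually use.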

There are a number of high-end, enterprise-grade applications available for purchase or as a service that advertise advanced analysis, complex machine learning algorithms, and dazzling visualizations. But not everyone has the resources, or the need, for that level of support. Luckily, there are many approaches that can do a lot of the heavy lifting and provide very useful results using readily available tools that you probably already have on hand.

Excel, OpenOffice, and Google Sheets are different spreadsheet applications with the same core functionality, including formulas, pivot tables, and the ability to extend them with more complex programming or plug-ins (sometimes requiring additional purchase).

Command line tools are available natively on all Linux and macOS computers and can be added to Windows (free!). Many of these commands take only a few minutes to learn and have the added advantage that they can be applied to multiple files (or an entire directory). Commands can be combined or chained together to form more complex processes. For example, “sort | uniq -dc | sort -rn” first sorts the lines of the file so that identical lines sit next to each other (uniq only compares adjacent lines), then returns every line that occurs more than once, along with a count, and finally passes that to the sort function, which orders the results by count, most frequent first.
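The same idea, counting the lines that occur more than once, can also be sketched in a few lines of Python. This is only an illustration: the sample list stands in for a real file, and the function name is made up for the example.

```python
from collections import Counter


def duplicated_lines(lines):
    """Return (line, count) pairs for lines seen more than once, most frequent first."""
    # Surrounding whitespace is stripped and blank lines are skipped; this is a
    # choice made for this sketch and is slightly more forgiving than uniq.
    counts = Counter(line.strip() for line in lines if line.strip())
    return [(line, n) for line, n in counts.most_common() if n > 1]


# Hypothetical search-log entries, used here in place of a real file.
sample = ["taxonomy", "ontology", "taxonomy", "metadata", "ontology", "taxonomy"]
for line, n in duplicated_lines(sample):
    print(n, line)
```

Because an open text file can be iterated line by line, the same function should also work on a file object opened with Python's built-in open().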

Scripting (simple programming) may seem daunting, but a very basic introduction to the overall approach (i.e. how to create and run code) goes a long way. So many examples are available with a quick Google search that there is rarely a need to write code from scratch. One of the most common programming languages for simple text manipulation is Python. Just search “normalize text python” and then copy and paste the results:

    # convert to lower case
    string = "Taxonomy Boot Camp"  # example input
    lower_string = string.lower()
    print(lower_string)
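Lowercasing is usually only the first step. A slightly fuller normalization pass might look like the sketch below, which also strips accents and punctuation and collapses extra spaces. The function name and the exact rules (for example, turning hyphens into spaces) are assumptions for illustration; the right rules depend on your content.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """Lowercase, strip accents and punctuation, and collapse whitespace."""
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.lower()
    # Replace anything that is not a word character (letter, digit,
    # underscore) or whitespace with a space.
    text = re.sub(r"[^\w\s]", " ", text)
    # Collapse runs of whitespace into a single space and trim the ends.
    return re.sub(r"\s+", " ", text).strip()


print(normalize("  Café  Menus & E-Commerce!  "))  # prints: cafe menus e commerce
```

Normalizing both the corpus and the candidate terms the same way before comparing them tends to make counts and matches more consistent.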

To see examples of each of these approaches and learn more about DIY Text Analysis, check out my presentation from Taxonomy Boot Camp. You can use this as a cheat sheet for all the most useful operations to use in your taxonomy work.