Shawna Applequist

Dark Data and Semantic Search

What is dark data, and why should you care? Dark data is data that an organization acquires through a variety of computer operations but does not currently use to derive insights or aid decision making. In other words, it is information generated within an organization that serves no purpose. IBM estimated that approximately 90% of data generated by sensors and analog-to-digital conversions never gets used. But this dark data does not have to remain unused.


Organizations today store vast amounts of knowledge within this dark data, and semantic search could unlock that knowledge, making them more effective in their fields and more efficient in their work. In an article published by OpenText earlier this month, Marc St-Pierre discusses the competitive advantage of employing Artificial Intelligence to remove barriers to information discovery, content reuse, and data retrieval.


"Semantic search engines like Magellan Search+ can drill down and surface dark data, locked away in various silos, through understanding what users are intending to find."*

*Marc St-Pierre, December 1, 2021, https://blogs.opentext.com/unlock-insight-from-dark-content-using-semantic-search/


Knowledge workers across numerous industries can benefit from tools like Magellan Search+, which help sort through data that would otherwise go unused. Filters within advanced search engines help knowledge workers express their intent through concepts, terminology, and similarity to other documents or data.


But search engines cannot operate in a vacuum: they need something or someone to train them and guide how the AI analyzes content. This is where taxonomies come into play. A taxonomy organizes human knowledge in a form that Artificial Intelligence engines can access. Leveraging prebuilt taxonomies will accelerate search implementation projects and help organizations begin to use that dark data sooner.
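
To make the role of a taxonomy more concrete, here is a minimal Python sketch of one way a taxonomy can inform search: expanding a query with its narrower terms. The taxonomy, the documents, and the function names `expand_query` and `search` are all hypothetical and invented for illustration. This is not how Magellan Search+ or any particular product works.

```python
# Hypothetical toy taxonomy: each concept maps to its narrower terms.
TAXONOMY = {
    "vehicle": ["car", "truck"],
    "car": ["sedan", "hatchback"],
}

# Hypothetical "dark data": documents nobody has tagged or indexed by topic.
DOCUMENTS = [
    "Quarterly sedan maintenance log",
    "Truck sensor readings from the depot",
    "Office lunch menu",
]


def expand_query(term, taxonomy):
    """Return the term plus every narrower term reachable in the taxonomy."""
    expanded, stack = [], [term]
    while stack:
        current = stack.pop()
        if current not in expanded:
            expanded.append(current)
            stack.extend(taxonomy.get(current, []))
    return expanded


def search(query, documents, taxonomy):
    """Return documents containing the query term or any narrower term."""
    terms = set(expand_query(query.lower(), taxonomy))
    return [doc for doc in documents if terms & set(doc.lower().split())]


if __name__ == "__main__":
    for hit in search("vehicle", DOCUMENTS, TAXONOMY):
        print(hit)
```

In this sketch, a search for "vehicle" surfaces the sedan and truck documents even though neither contains the word "vehicle". A production semantic search engine would rely on far richer models, but the taxonomy plays the same role: it supplies relationships between concepts that the raw text does not state.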


A growing number of companies worldwide are implementing Artificial Intelligence and semantic search engines to create and maintain a unified index of the entire organization's knowledge. This not only speeds up the work of knowledge workers looking for specific information, but also allows knowledge to be shared easily across the company.