
Data governance can be a powerful agent in scaling the use and distribution of trusted data throughout the company. However, more often than not, it conjures up the idea of a central authority strictly guarding against such access. In this 3-part series, we’ll cover the critical and often misunderstood components of data governance and offer perspective on how to implement data governance strategies that deliver trusted data at the speed of business. If you missed it, make sure to catch up on Part 1 – Data Timeliness.
A taxonomy, very broadly, is a system of organized information that allows the user to classify things and show the relationships between them. A common example of a taxonomy is the Dewey Decimal System of library classification, in which numbers form a code that correlates to topics, subtopics, and sub-subtopics. Wikipedia illustrates the way this hierarchy is set up:
500 Natural sciences and mathematics
  510 Mathematics
    516 Geometry
      516.3 Analytic geometries
        516.37 Metric differential geometries
          516.375 Finsler geometry
In the Dewey classification system, each number is associated unambiguously with a single entry in the hierarchy. A number such as 516.375 above identifies a book or other resource specifically as dealing with Finsler geometry. That number also shows how that book relates to others above and below it in the hierarchy: every shorter form of the number (516.37, 516.3, 516, and so on) names a broader topic that contains it.
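To make the idea concrete, here is a minimal Python sketch of how a Dewey number encodes its own place in the hierarchy. It is a simplification that assumes plain prefix rules; the real Dewey system has exceptions this does not handle, and the function name dewey_ancestors is made up for illustration.

```python
def dewey_ancestors(code: str) -> list[str]:
    """Return the broader classes that contain a Dewey number, nearest first.

    Simplified assumption: a broader class is found by dropping the last
    decimal digit, then by zeroing trailing digits of the three-digit class.
    """
    whole, _, fraction = code.partition(".")
    chain = []

    # Drop decimal digits one at a time: 516.375 -> 516.37 -> 516.3
    for i in range(len(fraction) - 1, 0, -1):
        chain.append(f"{whole}.{fraction[:i]}")
    if fraction:
        chain.append(whole)  # 516.3 -> 516

    # Zero out trailing digits of the class: 516 -> 510 -> 500
    for i in (2, 1):
        broader = whole[:i].ljust(3, "0")
        if broader != whole and broader not in chain:
            chain.append(broader)
    return chain


if __name__ == "__main__":
    print(dewey_ancestors("516.375"))
```

Called on 516.375, the function walks up through the same entries shown in the hierarchy above, which is what lets a single number place a book among its neighbors.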
A data taxonomy uses a system of unambiguous metadata terms (such as a filename or tags attached to a file) that allows an enterprise to classify a file or dataset into important business categories. Categories can be configured in any way that meets the needs of the organization, but some common ones include the date of creation, the date last modified, the account name of the creator or modifier, required access privileges, the presence of personally identifiable information (PII), the department that owns the dataset, and the primary business use of the dataset.
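As a rough illustration, the categories listed above could be captured in a small metadata record that is checked against a controlled vocabulary. This is only a sketch: the field names, the allowed values, and the validate function are hypothetical choices made for the example, not a prescribed schema.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical controlled vocabularies; a real taxonomy would define its own.
ALLOWED_ACCESS_LEVELS = {"public", "internal", "restricted"}
ALLOWED_DEPARTMENTS = {"marketing", "sales", "security"}


@dataclass(frozen=True)
class DatasetMetadata:
    name: str
    created: date
    last_modified: date
    modified_by: str        # account name of the creator or modifier
    access_level: str       # required access privileges
    contains_pii: bool
    owning_department: str
    business_use: str


def validate(meta: DatasetMetadata) -> list[str]:
    """Return a list of problems; an empty list means the record is consistent."""
    problems = []
    if meta.access_level not in ALLOWED_ACCESS_LEVELS:
        problems.append(f"unknown access level: {meta.access_level}")
    if meta.owning_department not in ALLOWED_DEPARTMENTS:
        problems.append(f"unknown department: {meta.owning_department}")
    if meta.last_modified < meta.created:
        problems.append("last_modified is earlier than created")
    if meta.contains_pii and meta.access_level == "public":
        problems.append("PII datasets must not be public")
    return problems
```

Because every field draws on a fixed set of terms, two people classifying the same dataset should arrive at the same record, which is the point of keeping the terms unambiguous.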
Properly designed and developed, a data taxonomy improves discoverability, observability, and security for your data. Data that is properly classified, catalogued, and tagged is usually well-governed data.
A proper data taxonomy addresses many problems in your data and metadata, including:
A data catalog is the first and most important step toward data discoverability, and building one starts with tagging data in business vocabulary so users can easily find the data they need. A data taxonomy makes cataloging much more powerful, improving data quality and discoverability. DataOS® can automate tagging and indexing to add incoming data to your catalog immediately.
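The link between tagging and discoverability can be sketched in a few lines of Python. This is a toy in-memory index, not how DataOS or any particular catalog product works; the dataset names, tags, and function names are invented for the example.

```python
from collections import defaultdict


def build_tag_index(datasets: dict[str, set[str]]) -> dict[str, set[str]]:
    """Invert a {dataset: tags} mapping into a {tag: datasets} index."""
    index = defaultdict(set)
    for name, tags in datasets.items():
        for tag in tags:
            index[tag.lower()].add(name)  # normalize case so searches match
    return dict(index)


def find_datasets(index: dict[str, set[str]], *tags: str) -> set[str]:
    """Return the datasets that carry every one of the requested tags."""
    matches = [index.get(tag.lower(), set()) for tag in tags]
    return set.intersection(*matches) if matches else set()


if __name__ == "__main__":
    # Hypothetical datasets tagged with business vocabulary.
    datasets = {
        "q3_pos_export": {"Sales", "Retail"},
        "campaign_clicks": {"marketing"},
        "store_returns": {"sales"},
    }
    index = build_tag_index(datasets)
    print(find_datasets(index, "sales", "retail"))
```

A dataset that was never tagged never enters the index, so no search can surface it. That is why tagging comes first.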
The two keys to building a usable data taxonomy from scratch are making focused changes and using the language of your users as much as possible.
Focus your taxonomy on one business area at a time. Balance your choice of area by beginning with high-priority targets, while keeping your scope manageable. For example, don’t begin with something like compliance with HIPAA or GDPR. Those are too large and too sweeping to start with. Save those to address after you build the taxonomies for a few smaller areas, such as marketing, sales, or security. Not only will this give you more practice with the methods of taxonomy, but much of what you build there will be needed for something like GDPR, so you’re whittling the scope of that project down as you go.
Use your narrow focus to plan and keep milestones as your taxonomy progresses from one target to the next.
More than many other data projects, a data taxonomy is a team effort. Your IT team or data steward can’t do it on their own. A data taxonomy needs to use the language of your business users, which means a polling process and meetings with users to learn how they think of their data.
You may add a hierarchy to your taxonomy to address the variety of terms that users may have for the same thing. If users have terms like “POS revenues,” “sales,” and “revenues,” then you can set up the taxonomy so all of those searches point back to “sales,” which is the tag that appears in your metadata. This is one of the primary ways in which a taxonomy enforces consistency and aids discoverability.
The focus of your taxonomy efforts can also help users see the value of the taxonomy to their particular projects, increasing enthusiasm and interest in developing the vocabulary for their area.
Most modern businesses spend a lot of money on collecting their data. The ROI on that effort depends on deriving business insights from the data. A data taxonomy makes data easier to find and easier to use while improving data governance and data quality. It makes your data more valuable to your business.