Here are the reading notes for the rest of "Building and Managing Metadata Repository."
There are two ways to discover the metadata in an organization: top-down and bottom-up. The top-down approach is more likely to be used when a project team has the opportunity to organize and summarize all kinds of metadata in the organization, regardless of any existing tools and systems. The bottom-up approach is used when the team is mainly going to fulfill metadata needs using certain existing software and repositories.
Normally a metadata tool should have the same administrative facilities that a database system has: for example, security, concurrent access, change management, integrity and consistency validation, and error recovery.
One should also think of how the metadata tool can accommodate existing standards.
There are two categories of metadata tools: the data integration and repository tool, and the data access tool. They correspond to the back end and front end of a typical system.
The “-ilities” of a metadata system architecture are quite similar to those of a data warehouse. The additional “-ilities” are “customizable” and “open.”
Customizable means that the metadata tool can be customized to meet specific business needs. This is especially important for those who use prepackaged metadata solutions: the tool must provide facilities for customization.
Open means the metadata tool must allow sharing of metadata.
It is very difficult to define database naming standards and apply them throughout a whole enterprise, even for Global 2000 companies.
There are many sources of metadata, for example, ETL tool/process, data modeling tools, documents, employees, reporting and OLAP tools, vendor applications, and data quality tools.
One should bear in mind that a metadata repository is just like a data warehouse. So there can be multiple versions of metadata, or slowly changing dimensions.
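To make the versioning idea concrete, here is a minimal sketch of how metadata could be kept as a history of versions with validity dates, in the spirit of a type 2 slowly changing dimension. The class names, field names, and the example entry are all hypothetical and do not come from the book.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class MetadataVersion:
    """One version of a metadata entry (hypothetical structure)."""
    name: str
    definition: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


class VersionedMetadata:
    """Keeps every version of every entry instead of overwriting it."""

    def __init__(self):
        self.history = []

    def update(self, name, definition, effective):
        # Close the currently open version of this entry, if any.
        for version in self.history:
            if version.name == name and version.valid_to is None:
                version.valid_to = effective
        self.history.append(MetadataVersion(name, definition, effective))

    def as_of(self, name, when):
        # Return the definition that was valid on the given day, or None.
        for version in self.history:
            if (version.name == name and version.valid_from <= when
                    and (version.valid_to is None or when < version.valid_to)):
                return version.definition
        return None


repo = VersionedMetadata()
repo.update("customer_status", "A = active, I = inactive", date(2020, 1, 1))
repo.update("customer_status", "A = active, I = inactive, P = prospect", date(2022, 6, 1))

old_definition = repo.as_of("customer_status", date(2021, 3, 15))
new_definition = repo.as_of("customer_status", date(2023, 1, 1))
```

With this layout, a report built in 2021 can still be interpreted with the definition that was valid at that time.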
If a metadata repository is complete, it should include logic, rules, tests, and even requirements, and this metadata can be of great help to data quality applications. It can be organized into a “data quality dictionary.”
Sometimes it also makes sense to include the most basic technical metadata, for example, the names of the production servers, how many CPUs each one has, etc.
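As a rough illustration of what a data quality dictionary might look like, the sketch below stores each rule with a description and an executable check, so that a data quality application can look the rules up and run them. The rule names, fields, and thresholds are invented for the example.

```python
# Hypothetical data quality dictionary: rule name -> (description, check).
data_quality_dictionary = {
    "age_in_range": (
        "age must be between 0 and 120",
        lambda row: 0 <= row["age"] <= 120,
    ),
    "email_present": (
        "email must not be empty",
        lambda row: bool(row.get("email")),
    ),
}


def failed_rules(row):
    """Return the names of all rules that the given record violates."""
    return [
        name
        for name, (_description, check) in data_quality_dictionary.items()
        if not check(row)
    ]


good = failed_rules({"age": 34, "email": "a@example.com"})
bad = failed_rules({"age": 150, "email": ""})
```

Because the rules live in the repository rather than in each application, every consumer of the data applies the same checks.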
What is a meta model? It is the physical data model for the metadata. There are two types of meta model. The first is based on a generic object model, similar to what you can see in the system tables of SQL Server. A very generic model takes little effort to extend when new elements need to be added (why? Because it is generic!). The second is a normal entity-relationship model. The metadata team lists all the kinds of metadata and then treats them as different entities or relationships. What normally happens is that most teams start with the ER way of modeling and find out later that the object model makes more sense in the end.
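The difference is easier to see in a small example. The sketch below builds a generic object model in an in-memory SQLite database: three tables hold every kind of metadata, so a new kind of element becomes a new row rather than a new table. The table names, column names, and sample objects are assumptions made for illustration, not the schema from the book.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Generic object model: three tables hold every kind of metadata.
    CREATE TABLE object_type (
        type_id   INTEGER PRIMARY KEY,
        type_name TEXT UNIQUE
    );
    CREATE TABLE object (
        object_id INTEGER PRIMARY KEY,
        type_id   INTEGER REFERENCES object_type(type_id),
        name      TEXT
    );
    CREATE TABLE object_attribute (
        object_id  INTEGER REFERENCES object(object_id),
        attr_name  TEXT,
        attr_value TEXT
    );
""")


def add_object(type_name, name, **attrs):
    """Store one metadata object; unknown types are registered on the fly."""
    conn.execute(
        "INSERT OR IGNORE INTO object_type (type_name) VALUES (?)", (type_name,)
    )
    type_id = conn.execute(
        "SELECT type_id FROM object_type WHERE type_name = ?", (type_name,)
    ).fetchone()[0]
    object_id = conn.execute(
        "INSERT INTO object (type_id, name) VALUES (?, ?)", (type_id, name)
    ).lastrowid
    conn.executemany(
        "INSERT INTO object_attribute VALUES (?, ?, ?)",
        [(object_id, key, str(value)) for key, value in attrs.items()],
    )
    return object_id


add_object("table", "dim_customer", owner="sales")
# A new kind of metadata needs no schema change, only new rows.
add_object("etl_job", "load_dim_customer", schedule="daily")

type_names = [
    row[0]
    for row in conn.execute("SELECT type_name FROM object_type ORDER BY type_name")
]
attribute_count = conn.execute("SELECT COUNT(*) FROM object_attribute").fetchone()[0]
```

In an ER-style meta model, by contrast, "table" and "etl_job" would each get their own table with typed columns, which is easier to query but requires a schema change for every new kind of metadata.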
One should also think about what kinds of metadata delivery to include when developing a metadata system. It is not just the metadata that comes with batch data from the source systems. There are many sources. The architecture of the metadata delivery is also important.
One interesting and very useful direction is to consider whether the metadata repository can be made bi-directional. That means updates can be entered through any available entry point and are then propagated to the other relevant parties within a very short time. In addition, if every business user and IT user in the company starts their work by thinking about and using metadata, the company's data management will be in excellent shape.
As a final hint, metadata is now considered a highly valuable asset for an enterprise, but not everybody knows how to make that value real. In fact, just think of it as yet another data warehouse for your enterprise. The suggestion I would give, if I had only one sentence left, is to let it be designed by people who have a broad overview and a solid grounding in data management theory.
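One possible way to picture a bi-directional repository is a small publish/subscribe hub: any tool can write an update, and every other tool that subscribed to that entry is notified. This is only a sketch under that assumption. The class, the tool names, and the entry key are hypothetical.

```python
from collections import defaultdict


class MetadataHub:
    """Hypothetical hub: any tool may write, and the other tools are notified."""

    def __init__(self):
        self.entries = {}
        self.subscribers = defaultdict(list)

    def subscribe(self, key, tool, callback):
        # Register a tool that wants to hear about changes to this entry.
        self.subscribers[key].append((tool, callback))

    def update(self, key, value, source):
        self.entries[key] = value
        # Push the change to every subscriber except the tool that made it.
        for tool, callback in self.subscribers[key]:
            if tool != source:
                callback(key, value)


received = []
hub = MetadataHub()
hub.subscribe(
    "dim_customer.description", "modeling_tool",
    lambda key, value: received.append(("modeling_tool", value)),
)
hub.subscribe(
    "dim_customer.description", "report_tool",
    lambda key, value: received.append(("report_tool", value)),
)

# The reporting tool edits the description; the modeling tool sees the change.
hub.update("dim_customer.description", "One row per customer", source="report_tool")
```

A real system would also need conflict handling and persistence, which this sketch leaves out.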