Wednesday, April 15, 2009

Enumerating reporting services metadata with C# and SQL Server 2005

Who should read this article Report writers, database developers, data modelers, anyone familiar with SQL Server 2005 & 2008 Reporting Services (familiarity with Report Models, Report Designer and Report Builder) and lastly with some knowledge of programming with C# and XML.

Introduction

Why should report model metadata be stored (in a metadata repository)? This could be a question you are asking yourself or you already know the answer and are looking for additional silver bullets to help explain it to your manager or project sponsor.
One of the more compelling reasons for capturing metadata about entities and attributes (and to an extent – roles & relationships) in reporting services report models in any organization is ‘documentation’. May that be client documentation or in-house documentation to serve as a reference guide for business users, or as a technical reference for ETL developers, or for that new developer who just joined you organization and wanted to know the definition or description of a business term, or more importantly as a quick reference for report writers (who may not be familiar with the field names used in the report model). Whichever the use; the bottom line lies in the term ‘Metadata Management’ (not to be confused with Master Data Management – an entirely different topic). The objective of this article is to extract from the smdl file using C#, meaningful and necessary metadata and store it in a SQL Server 2005 database.

Report Authors
I separated the report writers based on method of developing reports:
1. Using Report Designer (from Visual Studio or BIDS) for canned\ known reports - Technical Report writers
Technical Report writers are usually report developers who are familiar with the data model and may\may not know much about business processes.
2. Using Report Builder (spawning from the browser) for ad-hoc reports - Ad-hoc report writers.
Ad-hoc report writers are usually business users (end users working on the application or management folks).

This blog is more inclined towards the ad-hoc report writer community.

If you are interested in knowing more about metadata and metadata driven architectures you will have to wait until my next article or just google ‘Metadata driven architecture’ and ‘Microsoft Project Real’. Microsoft’s Project Real has a good implementation of metadata (open source) for enumerating and storing metadata in the database for SSIS, SSAS and for database structures, but lacks a module for enumerating Reporting Services models.

Report Models
A Report model is a slice of information of the database structure (views and\or tables) that relates to a specific business scope. For example, in an automobile dealership data warehouse one of the report models would be a Sales model which links back to an inventory table and a date time dimension. Report models are created with a ‘data source view’ reference. Think of a ‘Data Source View’ as a visual representation of a database diagram with tables or views having keys and relationships. Report models are created with the end user perspective as they use report models as their source to build ad-hoc reports.

Note: Report models have a file extension of smdl. The content of these ‘smdl’ files is a set of XML nodes describing the model (They contain entities, attributes, rules, roles, relationships, etc).


Creating the metadata repository for Report Models
The report model repository can be created as a separate schema with a set of tables or be a part of your own custom metadata solution; or feel free to use the scripts and schema I provide in this article. Once your schema is created; two stored procedures need to be created for inserting into the Entity table and the Attribute table. You could create simple CRUD type procedures that do a one row insert at a time, but a more efficient solution would be to create OPENXML stored procedures to insert all the data in one call. In my example I used the latter. By using the OPENXML stored procedure that I reduced the number of database inserts calls from hundreds to one. I have provided the code for the OPENXML stored procedures at the end of the article.








Figure 1: Report Model schema.

Now that your schema and stored procedures (“AddAttribute” and “AddEntity”) are created (and tested), you can begin with the application code.
At this point - Make a copy of the Report Model file from which metadata needs to be extracted from.

Extracting data from the Report Model
(.smdl): There are various ways of retrieving this report model data from the smdl file; like using XSLT and XPATH in C# to get to the desired nodes; or store XML chunks of the model file as XML data type in SQL Server and retrieve it based on XQUERY. Any way can be adopted; i will demonstrate one 'simple' way of doing it. You could apply a similar extraction and storage technique for report files (rdl files).


For my solution i used a C# windows application solution in Visual Studio. You could even create a console app instead of windows. I created a windows app so that i could have the feature of browsing for .smdl files without having to type it in.
Basically, the extract from smdl (XML) files was done by using XMLTextReader class provided in the System.XML namespace. I parsed the entire XML file for tags (XMLNodeType), in a specific format hierarchically based on the XML node structure (Entity --> Attribute --> Attribute Properties).

Each line in the XML supports these tags to identify the item associated with it. Once a type (XMLNodeType) name "Attribute" is found, i look for additional properties of the nodes with tags "Name", "DataType", "Nullable", "Format", "SortDirection", "ColumnName", "Description" etc. and start building my XML string structure.
A few functionalities provided by the XMLNodeType (and how they were used):
1. XmlNodeType.Element: Used for element level parsing.

2. XmlNodeType.Text: Text content of a node to retrieve description of a column, etc.
3. XmlNodeType.EndElement: To find the exit point of an element (entity, attribute, or anything else)


An important thing to note here is that i am NOT retrieving values based on keys and inserting to the metadata store. I am actually building a well formatted valid XML string that will contain the key-value mapping structure so that it can be passed to the SaveAttribute and SaveEntity methods (OPENXML stored procedures).
Once this XML string is created it is passed to the methods SaveAttribute (executes stored proc AddAttribute) and SaveEntity (executes stored proc AddEntity). Both these methods execute with one call each to the database. The relationship between an attribute and its parent entity is also saved to the database table based on GUIDs. Following is a sample of the data from the metadata repository.






Figure 2: Sample data view.

Figure 3: Sample OPENXML stored procedure content (where @AttributeXML is the XML string created by the C# code):























Summary

The idea behind this article is to give the community another way of retrieving metadata from report models. This would be a good addition to the existing BI metadata repositories that companies currently support. One really has to understand and harness the power of metadata. There are people who do not support the idea behind metadata and its use. Think of a simple scenario (a real world project) of metadata use - A project for migrating around 1000 tables with millions of rows transactional data from AS400 i-series to SQL Server 2008 using SSIS, so, were you going to create source, staging and warehouse structures manually? I don't think so, Ideally you would retrieve all the metadata from the system catalog of AS400 and write dynamic SQL scripts to create all the structures.
Long story short - utilize metadata to the fullest extent, maintain a store of the metadata for business, technical and ETL process meta data and incorporate it during the earlier phases of your BI projects. Hope this was helpful. Do leave feedback.

Download code: Download Code here

Thanks,

Vishal Gamji

MCITP - Database Developer

admin@expertsqltraining.com

3 comments:

  1. Nice post on extracting data, simple and too the point :), I use python for simple extracting data, data extraction can be a time consuming process but for other projects that include documents, files, or the web i tried "extracting data from the web" which worked great, they build quick custom screen scrapers, extracting data, and data parsing programs

    ReplyDelete
  2. Nice post.We are having metadata based saas application and planning to have report model based SSRS reports as reporting solutions.Since it is metadata based appliction and we don't have any relationship in database.So using our own datalayer, i need to create/sync reportmodel dynamically.I am looking for the cod esample that you specified.But it was not available.It would be great if you provide code sample

    ReplyDelete
  3. Thank you for this post. Can you please provide source code to download. Thank you in advance.

    ReplyDelete