The data marts and the tool enable prompted reporting
NORTHWESTERN�UNIVERSITY
Data�Warehouse�and�
Jason Schober, Director of Project Cafe
I. Executive Summary
In writing this paper, we are guided by the principle that our tools, processes and systems should enable a culture of analytics and data‐driven decision making on campus by promoting ease of use, self‐service and role‐based data access.
Many recommendations are presented throughout the document around topics of staffing, self‐service, data integration, data governance and information management, the highlights of which include:
While some of the data marts were developed in‐house, other data marts were purchased from external vendors and customized in‐house (e.g., the financial data mart was purchased from Oracle, and the student data mart was purchased from Blackboard). Decisions on how best to approach integration and technical effort required in integrating these data marts will need to be prioritized against other needs.
Similarly, disparate security rules in different data marts need to be matched and integrated. As the build out of individual data marts reaches maturity, and the focus turns to cross‐functional analytics, having a group work through data access policies becomes increasingly important.
The topic of data access, analysis, and integration is consistently cited as a top area of need with regard to our information systems. Reporting and analytics has been treated as a high‐priority area, with a great deal of time and effort invested since the selection and purchase of Cognos BI in 2006 and many efforts prior to that point within each of the University’s enterprise systems. Yet, given this investment to‐date, the areas of reporting and analytics remain one of the top areas of as‐yet‐unmet demand. We believe this demand can be summarized along two main themes: 1) preparing our institution to make use of the available features and capabilities and 2) making real investments, both in terms of effort and priority, into integrated reporting and data access capabilities.
As it relates to delivering enhanced reporting and data access capabilities from our enterprise information systems, we have accomplished quite a bit. The number of function‐specific warehouses and marts that have been produced thus far and the volume of reporting that is conducted from these databases are significant. Even more capacities along these lines are being developed this fiscal year, both broadening and deepening the information available to the community. This development has enabled better decision‐making, bringing speed and depth of analysis to processes and functional areas that previously required a great deal of manual effort to analyze. We now need to take what has been built and maximize its usage. This requires awareness of the available capabilities, the continued development of power users in the community, and an increase in our report developer population. Training and support are also required to sustain this resource base.
III. Background and Strategy
As the University approached the replacement of its legacy financial systems several years ago, a workgroup was established to articulate our requirements and expectations of our future data access and reporting capabilities. At that time, like many of our peer institutions, the Northwestern community had historically met its reporting needs by deploying a variety of solutions. Developed by different units and users, many of these solutions addressed similar reporting needs and ultimately they offered a disjointed set of reporting alternatives. As a whole, these solutions were not integrated and thus not easy to use, requiring multiple log‐ins and permissions, and were not providing all the functionality that business users needed.
As a result of the recalibration, a dedicated team within Project Café would continue to develop the data warehouse and reporting capabilities for NUFinancials, FAMIS and InfoEd while the new BI team would provide common infrastructure for enterprise BI capabilities and focus on warehousing the remaining administrative data (SES, FASIS, AIMS/CATracks and other research administration systems).
The central BI team was established in November 2007, and in a joint report on BI strategy presented by the Office of Change Management (OCM) and NUIT to the Board of Trustees in 2008 noted its goals as below:
10.Providing a central point of contact for the BI vendor and managing the relationships.
A BI Advisory Group formed subsequently in 2008 guided the strategies and priorities of the BI team. In order to deliver on the priorities, an incremental approach was taken by the BI team in building the data warehouse infrastructure, with projects with highest business value and breadth of usage determining the order in which data was populated in the data warehouse. A lot of attention was also given to building foundational elements of the data warehouse while concurrently meeting reporting needs on campus.
The BI System currently has approximately 2,600 active users on campus. The users represent all the schools and central units of the University. A majority of the users utilize standard prompted reports for
5
The University has taken an evolutionary approach to building the
data warehouse, meaning additional subject areas are incrementally built
every fiscal year to address the highest priority business
requirements. A great deal of reporting and data access capability is
already in place. Still more is in progress, planned for delivery in
FY13, or proposed and under consideration for delivery in FY14. Please
refer to Appendix 1 for a more detailed data warehouse gap analysis
The last few years of development priorities have been in establishing many of the foundational data elements in the data warehouse (or data marts for student, financials, faculty and staff, development, research, etc.) to provide self‐service reporting, whereas the newest set of priorities have evolved towards more integrated reporting capabilities. Some of the requested integrated reporting and analytics capabilities include:
6
We would like to articulate the following future vision for the use of BI in the University.
There will be a distributed set of advanced reporting skills that facilitate the delivery of advanced reports and dashboards within a functional area. The distributed reporting resources will conform to University BI standards and best practices established by a central unit.
All data marts will share common identifiers, dimensions and data definitions to support
integrated reporting and analytics.
These concepts are illustrated in the diagram below and described more fully on pages 15 – 17:
7
8
Power Users in Schools and Central Units:
The power users in schools and central units are the analytics resources who use data directly for decision making or conduct analysis and create reports to support decision makers. These resources need access to ad hoc reporting and analysis tools so that they can analyze data and create reports without having to rely on IT staff. These resources should be able to provide local support and expertise pertaining to appropriate usage of data and tools.As illustrated in the chart below, the integration layer can be built as either physical databases or virtual BI models using different integration techniques. This approach ensures that the investments made to establish the various data marts are minimally disrupted. Architectural best practices such as common dimensions or ‘conformed dimensions’ should be applied to the foundation layer. However, additional work is needed to make the data marts conform to each other so that they can be linked through common dimensions such as person, organization and time. Any new data mart should conform to this best practice so that new data marts do not exist as silos of information.
9
The BI Solutions and other enterprise applications teams have been steadily at work on a prioritized list of initiatives aimed at improving data access. While a lot of the foundational data elements are in place as illustrated by the list of completed and ongoing projects, there are still a few key gaps that, if filled, could facilitate the majority of the reporting requirements from our core enterprise systems and accelerate building specialized applications that integrate data across multiple areas. Even if we are to remain on our current trajectory (using our existing approaches and resources), there are several key hurdles that lie in our immediate path. If we aspire to accelerate our pace or increase our ambitions, additional challenges must be tackled. These challenges are described below along with some of the
10
A key component of integrated reporting is the existence of common data elements across multiple source systems. Data quality and consistency across all enterprise systems accelerate data integration. For example, in order to successfully track multiple aspects of a faculty across all enterprise systems through the data warehouse (e.g., finance, HR, student), the same employee identifier should exist for all faculty members across the three systems. However, the employee identifier is not consistent in some cases in our enterprise systems.
Also of particular note are the disparate organizational unit values/rollup hierarchies present in many of our enterprise systems. The primary management unit (PMU) project has made some solid headway towards using a crosswalk approach to this potential data integration challenge. It is unclear at this stage, however, if we can state with confidence that this approach will enable the cross‐system analyses that the institution wishes to perform, or, if we need to plan to approach a standardized organizational data dimension in another fashion (e.g., change the organizational unit data structures at their source in the enterprise systems).
Action plans will need to be developed to implement the chosen solutions. These will be projects in their own sense, posing a potential opportunity cost to build out this “infrastructure”. All enterprise systems that utilize the identifiers or the business processes that generate the common identifiers could be impacted. Therefore, addressing this would require sound enterprise architecture design, and broad support and commitment across the various data stewards, systems owners and business owners.
b.Security and Data Access Policy
If we could develop a security framework for data access that incorporates either role based or primary management unit (PMU) based security (or both) across our enterprise data, it would significantly improve efficiency in developing and maintaining the BI applications. Although this approach would incur a larger initial cost and work would still be required to link each BI application to this data security framework, the return on investment over the longer term in time savings/ efficiency could be significant. The key point is that an approach of custom security provisioning incurs added cost for each new application that is built in the BI tool (and is even more costly if multiple approaches are taken within an application) and limits the way the University can produce reporting/analytics on its data.
12
c.Integration of disparate data marts
Many of the foundational data marts were built independently of each other to meet reporting requirements within the functional areas. Some data marts were purchased and customized in house, such as Oracle EPM warehouse for Financials and Blackboard Analytics for Student data. The other data marts such as for Faculty and Staff (FASIS) and Alumni and Development (CATracks) are built in‐house. Although data elements needed to meet the integrated reporting requirements may exist in these data marts, data integration across disparately governed and managed data marts will require effort.
As the data warehouse becomes the central repository of integrated administrative information with common definitions and institutional data access policies, it can be viewed as the central repository of validated and reliable source of institutional data that other applications may want to consume. Rather than developing one‐off data feeds to disparate areas within campus or administering a multitude of data base accounts, we envision a data as a service model, where another web portal or application could consume the information real time or nightly from the data warehouse. Because the IBM architecture for Cognos BI is based on Service Oriented Architecture (SOA), we anticipate applications to connect with the data warehouse via web service protocols.
Implications
ii. Human Resources and Prioritization:
As we continue to add more applications (data marts, reporting packages, ad hoc tools) to the production system, the demand for enhancements, support and maintenance continues to increase. At the same time, we are taking on more new projects annually. The demand for more analytics/reporting is certainly more than our capacity to deliver as is evident by the list of outstanding requests. Balancing these two demands within the constraint of fixed number of BI resources is a challenge that will become more evident in the coming fiscal years as we add new solutions to the production environment.
We need to creatively explore resourcing and funding models to support these types of initiatives. Otherwise we run the risk of units pursuing one off solutions inconsistent with the University’s BI strategy.
14
Although ad hoc, self‐service capabilities continue to expand, the need for standardized, authored reports continues to exist. We have learned that the work involved to create data repositories to support data access via the Cognos BI tool is substantial. Because the central BI resources are expected to provide report development and training, we are already, in effect, constrained in our capacity to tackle additional projects without additional resources in some of these highly specialized areas. Also, there remain limited resources within the FASIS, SES, and CATracks teams trained in and devoted to using Cognos Report Studio, raising questions about who will construct, implement, and maintain these reporting capabilities during the next 1‐2 years and beyond. Opportunity may exist to broaden the population of staff trained in skills such as report writing to alleviate, at least in part, this resource constraint. For example, report development may be a task that could be undertaken by many of the more data‐savvy business analysts on the various enterprise system teams.
Implications
A more distributed staffing structure for operational reporting will increase report development activities and will certainly put pressures on central resources supporting and coordinating the distributed efforts in order to ensure common standards and best practices are adopted. Capacity for administration, meta‐data model (business views of data) or data mart changes (ETL development), training, support, mentoring and coordination of distributed efforts need to be planned appropriately.
It needs to be underscored that taking a 5 day “Report Studio” training is not adequate to convert a SQL expert or a crystal report developer to Cognos report developer. Ongoing mentoring and co‐development efforts are needed to ensure long term success. New projects where distributed resources are included as project team members may provide such opportunities.
As each solution is deployed, users expect to receive basic training in the various tools they will need for reporting. Furthermore, they expect quick access to online resources and
documentation to provide help as necessary. The reporting community also expects the solution to include some level of help desk support. At a minimum, the help desk should assist users in trouble‐shooting any problems they encounter. Beyond this, reporting users (especially analysts and report authors) would profit tremendously from guidance provided by a core team of reporting tool specialists and data experts. To meet this overall expectation, this solution must include a dedicated group of resources who will provide the administrative, training and help desk services required to support this reporting solution.Most BI users benefit most from one‐on‐one or small group training sessions, especially those catered more specifically to users’ individual business needs. Training is time intensive and the central BI team does not have dedicated training staff to support this. Although the team has taken on training responsibilities for many of the project rollouts, it would be ideal to train local trainers who are closer to the business units they serve.
Training may need capacity to be dedicated that would otherwise be available for business analysis, etc. The portfolio of projects approved may be thus affected.
In order to develop local power BI users within schools or units who can facilitate their use of self‐service ad hoc analysis and ad hoc reporting tools we need:
|
---|
VIII. Planning for the Future
a.Data Warehouse Where Appropriate
As described in the “data as a service model” in this document, we recognize that there are business needs to access the data within the warehouse in formats other than on‐screen, print‐ready, or flat‐file reports. To the extent that system‐to‐system access is merited, this should be done via a service call rather than a read‐only database account. Coupled with sound data management practices, this approach will help to promote reusability in our solutions and to safeguard against
misinterpretation or misuse of data in the data warehouse.b.Expect Ongoing Change