Nine key data warehousing trends CIOs need to know about
February 28, 2011
Gartner analysts said the data warehouse is set to remain a key component of the IT infrastructure and believe that, as the demand for business intelligence and the wider category of business analytics increases, optimization, flexible designs, and alternative strategies will become more important.
“The data warehouse remains one of the largest—if not the largest—information repository in the enterprise,” said Mark Beyer, research vice-president at Gartner. “Only by being aware of the key market trends and how emerging technology solutions will blend with proven practices can the CIO avoid budget waste through 'misdirection' by the data warehouse management and delivery team.”
Gartner identified these trends in the data warehousing market for 2011 through 2012:
• Optimization and performance
Advanced functionality for hardware management of input/output, disk storage, and CPU/memory balancing is now included almost as a matter of course in data-warehouse-capable platforms. Some new entrants are focusing on optimization as a differentiator, and nearly every data warehouse vendor is now addressing storage optimization for the warehouse through compression and usage-based data placement strategies. Vendors are also expending great effort differentiating their products on performance claims and technology, in ways that are not necessarily significant to the use case.
• Data warehouse appliances
Although there are many reasons why organizations consider buying an appliance, the main reason is simplicity. The vendor builds and certifies the configuration, balancing hardware, software, and services for predictable performance. The appliance is delivered complete and installs rapidly. If there are any problems, a single call to the appliance vendor is the first course of action. There is a secondary effect as well: appliances can speed delivery by avoiding time-consuming hardware balancing.
• The intensive POC
Proofs of concept should use as much real data extracted from the operational source systems as possible and involve as many users as possible, creating a data warehouse workload that approaches that of the production environment.
• Data warehouse mixed workloads
The data warehouse platform delivers six workloads: bulk/batch load, basic reporting, basic online analytical processing, real-time/continuous load, data mining, and operational business intelligence. Warehouses delivering all six workloads need to be assessed for predictability of mixed workload performance.
• The resurgence of data marts
A data mart is defined as an application-specific analytic repository of any size, normally with a specific, smaller group of users than a data warehouse. Data marts can be used to optimize the data warehouse by offloading part of the workload to the data mart, returning greater performance to the warehousing environment.
• Column-store database management systems
Column-store database management systems generally exhibit faster query response than traditional, row-based systems and can serve as excellent data mart platforms, and even as a main data warehouse platform.
• In-memory database management systems
In-memory DBMS technologies exhibit extremely fast query response and data commit times and introduce a higher probability that analytics and transactional systems can share the same database. Analytic data models, master data approaches, and data services within a middle tier will begin to emerge as the dominant approach, forcing more traditional row-based vendors to adapt to column-store and in-memory approaches simultaneously.
• Data warehouse as a service and cloud
In 2011, data warehouse as a service comes in two types: software as a service and outsourced data warehouses. Data warehouse in the cloud is primarily an infrastructure design option, as a data model must still be developed, an integration strategy must be deployed, and business intelligence user access must be enabled and managed. Private clouds are an emerging infrastructure design choice for some organizations in supporting their data warehouse and analytics.
• Using an open-source database management system to deploy the data warehouse
Open-source DBMSs are still being used in both experimental and more formalized approaches. At this point, open-source warehouses are rare, are usually smaller than traditional ones, and generally require a more manual level of support. However, some solutions are optimized specifically for data warehousing.
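As a rough illustration of the column-store trend noted above, the following minimal Python sketch contrasts a row layout with a column layout for a single aggregate query. The records, field names, and helper functions are hypothetical and not drawn from the Gartner research; real column-store DBMSs add compression, vectorized execution, and disk-level layout that this toy example does not model.

```python
# Hypothetical sales records; the field names are illustrative only.
rows = [
    {"order_id": 1, "region": "east", "amount": 120.0},
    {"order_id": 2, "region": "west", "amount": 75.5},
    {"order_id": 3, "region": "east", "amount": 210.0},
]


def to_columns(records):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_amount_row_store(records):
    # Row layout: every full record is visited even though only one field is needed.
    return sum(record["amount"] for record in records)


def total_amount_column_store(columns):
    # Column layout: only the "amount" column is scanned.
    return sum(columns["amount"])


columns = to_columns(rows)
row_total = total_amount_row_store(rows)
column_total = total_amount_column_store(columns)
```

Both functions return the same total; the difference is that the column layout reads only the values the query needs, which is one commonly cited reason analytic queries over a few columns can run faster on column stores.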