Research Activities


Data Allocation in Parallel Database Systems
- Warehouse allocation
 

Multi-dimensional, hierarchical Fragmentation
We have developed a multi-dimensional, hierarchical fragmentation and allocation strategy for parallel data warehouses (MDHF) that significantly reduces the work required for star joins and supports an appropriate degree of processing and I/O parallelism.

Our framework incorporates the multi-dimensional, hierarchical structure inherent to data warehouses, bitmap indexing, fragment-wise processing, and an adapted physical data placement.
Our VLDB 2000 paper outlines the approach and presents a simulation study that confirms its benefits and demonstrates its scalability.
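
To illustrate the idea of fragmenting a fact table along dimension hierarchies, the following minimal sketch counts the fragments produced by a given fragmentation and estimates how many of them a star query has to read. The hierarchies, cardinalities, and function names are hypothetical and chosen only for illustration; the sketch assumes uniform hierarchies and does not reproduce the MDHF implementation described in the paper.

```python
# Hypothetical dimension hierarchies, ordered from coarse to fine.
# Each level maps to its (assumed) number of distinct values.
HIERARCHIES = {
    "time": {"year": 2, "quarter": 8, "month": 24},
    "product": {"group": 5, "class": 50},
}


def fragment_count(fragmentation):
    """Number of fact fragments if each listed dimension is split at one level."""
    count = 1
    for dim, level in fragmentation.items():
        count *= HIERARCHIES[dim][level]
    return count


def fragments_touched(fragmentation, query):
    """Estimate the number of fragments a star query must read.

    `query` maps a dimension to the hierarchy level at which the query
    selects exactly one value. Dimensions missing from `query` are
    unrestricted.
    """
    touched = 1
    for dim, frag_level in fragmentation.items():
        n_frag = HIERARCHIES[dim][frag_level]
        q_level = query.get(dim)
        if q_level is None:
            # Unrestricted dimension: every fragment along it is relevant.
            touched *= n_frag
        else:
            n_query = HIERARCHIES[dim][q_level]
            # A value at a coarser level covers n_frag / n_query fragments
            # (uniform hierarchy assumed); a value at the same or a finer
            # level falls into exactly one fragment.
            touched *= max(1, n_frag // n_query)
    return touched


fragmentation = {"time": "month", "product": "group"}
print(fragment_count(fragmentation))
print(fragments_touched(fragmentation, {"time": "quarter"}))
```

In this toy setting, a query restricted to one quarter reads only a fraction of the fragments, which is the kind of work reduction for star joins that the fragmentation aims at.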
 

Analytical determination of a star schema allocation
We study the semi-automatic determination of an allocation for parallel relational star schemas. The algorithm proposes suitable fragmentations according to the needs of a weighted star query mix. We developed a combined metric that takes into account both the I/O response time and the I/O overhead of queries. Query costs are approximated by an analytical model (BTW 2001 paper, in German).
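
As a rough illustration of how a weighted query mix could be scored, the sketch below combines per-query response time and I/O overhead into a single value and ranks candidate fragmentations by it. The linear combination, the `alpha` trade-off parameter, and all names and numbers are assumptions made for this example; the actual metric and cost model are defined in the BTW 2001 paper.

```python
def mix_cost(queries, alpha=0.5):
    """Weighted cost of a query mix under one candidate fragmentation.

    queries: list of (weight, response_time, io_overhead) tuples, where
             the two cost figures would come from an analytical model.
    alpha:   hypothetical trade-off between response time and overhead.
    """
    total_weight = sum(w for w, _, _ in queries)
    if total_weight <= 0:
        raise ValueError("query weights must sum to a positive value")
    weighted = sum(w * (alpha * rt + (1 - alpha) * ov) for w, rt, ov in queries)
    return weighted / total_weight


def rank_fragmentations(candidates, alpha=0.5):
    """Return candidate names sorted by ascending mix cost."""
    return sorted(candidates, key=lambda name: mix_cost(candidates[name], alpha))


# Made-up cost figures for two candidate fragmentations "A" and "B".
candidates = {
    "A": [(1, 10.0, 2.0), (3, 4.0, 6.0)],
    "B": [(1, 8.0, 8.0), (3, 6.0, 6.0)],
}
print(rank_fragmentations(candidates))
```

Normalizing by the total weight keeps scores comparable across query mixes of different sizes; that is a design choice of this sketch, not a claim about the published metric.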
 

Warlock: A data allocation tool
To put this into practice, we have implemented a tool, Warlock (Warehouse allocation to disk), that assists a DBA with the complex allocation task. We demonstrated Warlock at the VLDB 2001 Conference in Rome, Italy (paper, poster, demo).
It comprises three layers (see figure):
 
Input layer
The DBA specifies the database and disk parameters as well as a representative mix of weighted star queries.

Prediction layer
Based on our cost model, this layer predicts I/O costs and query response times and determines fragmentation candidates. Heuristics exclude unreasonable fragmentations as early as possible.

Output layer
Detailed results are presented to the user. In particular, a ranked list of suitable fragmentations and a detailed query cost evaluation are provided. A physical data allocation scheme is proposed and visualized.
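
The early exclusion of candidates and the proposal of a physical allocation can be pictured with the following minimal sketch. Both the pruning rule (keep only fragmentations with at least one fragment per disk and no more than a chosen maximum) and the round-robin placement are assumptions made for illustration; they are not taken from Warlock, whose heuristics and allocation scheme are described in the VLDB 2001 paper.

```python
def prune(candidates, n_disks, max_fragments):
    """Drop candidates that are unreasonable under a simple, assumed rule.

    candidates:    dict mapping a fragmentation name to its fragment count.
    n_disks:       number of disks; fewer fragments cannot use all disks.
    max_fragments: hypothetical upper bound to limit management overhead.
    """
    return {
        name: n for name, n in candidates.items()
        if n_disks <= n <= max_fragments
    }


def round_robin_allocation(n_fragments, n_disks):
    """Assign fragment i to disk i mod n_disks and return disk -> fragments."""
    allocation = {disk: [] for disk in range(n_disks)}
    for frag in range(n_fragments):
        allocation[frag % n_disks].append(frag)
    return allocation


candidates = {"too_coarse": 2, "reasonable": 120, "too_fine": 1_000_000}
kept = prune(candidates, n_disks=4, max_fragments=10_000)
print(kept)
print(round_robin_allocation(6, 4))
```

Round-robin placement is used here only because it is the simplest scheme that spreads fragments evenly over the disks; a tool could equally balance by fragment size or access frequency.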



Related publications

T. Stöhr, E. Rahm: Warlock: A Data Allocation Tool for Parallel Warehouses
Proc. 27th Intl. Conference on Very Large Databases (VLDB 2001), Rome, Italy, Sep. 2001 (software demonstration)

T. Stöhr: Analytische Bestimmung einer Datenallokation für Parallele Data Warehouses
Proc. 9. Fachtagung Datenbanken für Büro, Technik und Wissenschaft (BTW 2001), Oldenburg, March 2001. Springer-Verlag, Berlin, 2001
[slides, zipped jpg, in German]

T. Stöhr, H. Märtens, E. Rahm: Multi-Dimensional Database Allocation for Parallel Data Warehouses
Proc. 26th Intl. Conference on Very Large Databases (VLDB 2000), Cairo, Egypt, Sep. 2000
[slides, pdf]




last update: November 2002