Database Microbenchmarking --- Create a
Small but Real World
Faculty: Anastassia Ailamaki
Student: Minglong Shao
Performance evaluation of database systems from an architectural
perspective has become an active topic in database research.
It aims at characterizing database behavior on modern computer
architectures and correlating bottlenecks to the underlying hardware
components. Database benchmarks are synthetic database workloads
consisting of datasets and queries, which offer rich environments
representative of typical database applications. Although current
database benchmarks are well designed to simulate real-world
applications, they are impractical for performance evaluation, for the
following reasons. First, it is hard to set up the experimental
environment: a large hardware configuration is usually beyond the
research budget; a complex query may take years to execute on a
simulator; and researchers have to set hundreds of parameters correctly
to match the requirements of the intended experiments.
Second, it is difficult to analyze the results. Full-scale benchmarks
test all aspects of the system, and intensive interactions between
different components make it difficult to map each bottleneck to the
component that causes it. Third, full-scale benchmarks behave
unpredictably. For instance, the query plan may vary
dramatically with different system configurations, which complicates
the analysis unnecessarily. For these reasons, studies of
database performance evaluation are mostly restricted to small-scale
benchmarks and a subset of simple queries. Although previous research
has successfully characterized database behavior at small scale
and predicted some behavior trends of database systems, it lacks
rigorous analysis and proof based on sufficient experiments with
database systems of different scales. The questions "Is the behavior on
small-scale benchmarks representative?", "When does scaling down
matter, in what way, and to what extent?", and "What is the best way to
scale down database workloads for analysis purposes?" are still open.
This project aims to answer these questions. We want to provide a
methodology for shrinking and simplifying database workloads correctly,
which can be used to evaluate computer architecture innovations and to
help decide design trade-offs quickly.