Data Mining Meets Traffic Modeling
Faculty: Anastassia Ailamaki, Christos Faloutsos, Greg Ganger (CMU/SCS), Anthony Brockwell (CMU/Stat), Tara Madhyastha (UC Santa Cruz, CS)
Students: Mengzhi Wang, Spiros Papadimitriou, Kinman Au
Traffic modeling is extremely helpful in evaluating system designs.
The work involves two aspects. The first is to discover and quantify
the most important features of the traffic data; two example features
are temporal burstiness and spatial locality. Harder still is
determining how these features affect the performance of the real
systems that serve the traffic. The second is to build an efficient
statistical model that generates synthetic workloads whose behavior is
similar to that of the real ones. Traditional models such as Poisson
are inadequate for generating timestamps for strongly bursty traffic,
let alone for generating multi-dimensional traffic.
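As a rough illustration of why a Poisson model falls short here (a sketch written for this page, not code from the project), the snippet below generates Poisson timestamps and measures the index of dispersion (variance over mean) of the per-bin request counts. For a Poisson process this ratio stays close to 1 at any bin width, whereas bursty traces typically show much larger values. The function names, the rate, and the bin width are all illustrative assumptions.

```python
import random


def poisson_timestamps(rate, duration, rng):
    """Arrival times of a Poisson process: i.i.d. exponential gaps."""
    timestamps = []
    t = rng.expovariate(rate)
    while t < duration:
        timestamps.append(t)
        t += rng.expovariate(rate)
    return timestamps


def counts_per_bin(timestamps, duration, bin_width):
    """Number of arrivals falling in each fixed-width time bin."""
    n_bins = int(duration / bin_width)
    counts = [0] * n_bins
    for t in timestamps:
        idx = int(t / bin_width)
        if idx < n_bins:
            counts[idx] += 1
    return counts


def index_of_dispersion(counts):
    """Variance-to-mean ratio of the bin counts (about 1 for Poisson)."""
    mean = sum(counts) / len(counts)
    var = sum((c - mean) ** 2 for c in counts) / len(counts)
    return var / mean


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    ts = poisson_timestamps(rate=50.0, duration=2000.0, rng=rng)
    counts = counts_per_bin(ts, duration=2000.0, bin_width=1.0)
    print("index of dispersion:", index_of_dispersion(counts))
```

Running the same `index_of_dispersion` on the bin counts of a real trace gives a quick, if crude, check of how far that trace departs from Poisson behavior.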
This project addresses both problems. Our previous work has focused on
the spatio-temporal behavior of traffic data, more specifically the
temporal burstiness and spatial locality of I/O workloads. Our
proposed tool, the entropy plot, quantifies the temporal burstiness
and spatial locality in traffic data. The B-model generates the
timestamps for synthetic traffic so that it imitates the temporal
burstiness of real traffic data. The PQRS model goes one step further
by generating both the timestamps and the request locations for
synthetic traces. Ongoing work augments the model to handle more
dimensions.
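The following is a minimal sketch of the two ideas named above, under stated assumptions rather than the project's actual code. It assumes the B-model is a recursive biased split: the volume of an interval is divided between its two halves in proportions b and 1 - b, with the heavier side chosen at random. It also assumes the entropy plot is the Shannon entropy of the traffic distribution at each aggregation level. The sketch produces per-bin volumes rather than individual timestamps, and all function and parameter names are illustrative.

```python
import math
import random


def b_model(total, levels, bias, rng):
    """Split `total` into 2**levels bins by recursive biased halving."""
    volumes = [float(total)]
    for _ in range(levels):
        next_volumes = []
        for v in volumes:
            # A random half receives the fraction `bias`, the other the rest.
            left = v * bias if rng.random() < 0.5 else v * (1.0 - bias)
            next_volumes.extend((left, v - left))
        volumes = next_volumes
    return volumes


def entropy_at_level(volumes, level):
    """Shannon entropy in bits after aggregating into 2**level equal bins."""
    group = len(volumes) // (2 ** level)
    total = sum(volumes)
    entropy = 0.0
    for i in range(0, len(volumes), group):
        p = sum(volumes[i:i + group]) / total
        if p > 0:
            entropy -= p * math.log2(p)
    return entropy


def entropy_plot(volumes):
    """Entropy at every resolution from 1 bin up to len(volumes) bins.

    Assumes len(volumes) is a power of two.
    """
    levels = int(math.log2(len(volumes)))
    return [entropy_at_level(volumes, k) for k in range(levels + 1)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    volumes = b_model(total=1000.0, levels=10, bias=0.8, rng=rng)
    for level, h in enumerate(entropy_plot(volumes)):
        print(level, round(h, 4))
```

Under these assumptions the entropy grows linearly with the level, with slope -b log2(b) - (1 - b) log2(1 - b). A bias of 0.5 gives slope 1 (uniform traffic), and the slope falls toward 0 as the bias approaches 1 (all traffic in one burst), so a slope fitted to a real trace suggests a bias value for the generator.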
ACKNOWLEDGEMENTS: This material is based upon work supported by the
National Science Foundation under Grant No. IIS-0083148, which was a
collaborative award with Prof. Tara Madhyastha of UC Santa Cruz (NSF
grant number 0083130).
This work is also supported in part by the Pennsylvania
Infrastructure Technology Alliance (PITA),
a partnership of Carnegie Mellon, Lehigh University
and the Commonwealth of Pennsylvania's Department
of Community and Economic Development (DCED).
Additional funding was provided by donations from Intel and NTT.
Any opinions, findings, and conclusions or recommendations expressed in this
material are those of the author(s) and do not necessarily reflect the views
of the National Science Foundation, or other funding parties.
Publications
* Yasushi Sakurai, Spiros Papadimitriou, Christos Faloutsos, "BRAID: Stream Mining through Group Lag Correlations", SIGMOD, 2005
* Edoardo Airoldi and Christos Faloutsos, "Recovering Latent Time-Series from their Observed Sums: Network Tomography with Particle Filters", Proceedings of the 10th ACM SIGKDD Conference, 2004
* Mengzhi Wang, Kinman Au, Anastassia Ailamaki, Anthony Brockwell, Christos Faloutsos and Greg Ganger, "Storage Device Performance Prediction with CART Models" (extended abstract), SIGMETRICS - Performance, 2004
* Spiros Papadimitriou and Christos Faloutsos, "Cross-Outlier Detection", SSTD, 2003
* Spiros Papadimitriou, Anthony Brockwell and Christos Faloutsos, "Adaptive, Hands-Off Stream Mining", VLDB, 2003
* Deepayan Chakrabarti and Christos Faloutsos, "F4: Large-Scale Automated Forecasting Using Fractals", 11th ACM International Conference on Information and Knowledge Management (CIKM 2002), McLean, Virginia, 2002
* M. Wang, T. Madhyastha, N.H. Chan, S. Papadimitriou, C. Faloutsos, "Data Mining Meets Performance Evaluation: Fast Algorithms for Modeling Bursty Traffic", 18th International Conference on Data Engineering, 2002
* M. Wang, A. Ailamaki, C. Faloutsos, "Capturing the Spatio-Temporal Behavior of Real Traffic Data", Performance 2002 (IFIP Intl. Symp. on Computer Performance Modeling, Measurement, and Evaluation), Rome, Italy