Distance Function Design / Data Scaling

Faculty: Christos Faloutsos

Student: Leejay Wu

    We are working on a multi-stage approach to automated data mining.  The "grand vision" consists of three stages:
    1. Nonlinear transformation through probability distributions in order to (a) summarize the data and (b) render it easier to model.
    2. Attribute selection to identify likely correlations and dependencies.  This results in a reduced-dimensionality, transformed space; or, equivalently, a new distance function.
    3. Modeling of the dropped attributes in terms of the selected attributes.
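
    The three stages can be illustrated with a small sketch.  It is not the project's implementation; every function name and the toy data are hypothetical, and each stage uses a simple stand-in technique: an empirical-CDF transform for stage 1, a greedy Pearson-correlation filter for stage 2 (not the fractal-dimension method named in the publications below), and a nearest-neighbor lookup under the resulting distance function for stage 3.

```python
from bisect import bisect_right
from math import sqrt


def ecdf_transform(column):
    """Stage 1 (assumed stand-in): map each value to its empirical CDF value in (0, 1]."""
    ordered = sorted(column)
    n = len(ordered)
    return [bisect_right(ordered, v) / n for v in column]


def pearson(xs, ys):
    """Pearson correlation of two equal-length columns; 0.0 if either is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) * (x - mx) for x in xs)
    vy = sum((y - my) * (y - my) for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def select_attributes(columns, threshold=0.95):
    """Stage 2 (assumed stand-in): keep a column unless it is strongly
    correlated with a column that has already been kept."""
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(j)
    return kept


def make_distance(kept):
    """Euclidean distance restricted to the kept attributes of transformed rows."""
    def distance(p, q):
        return sqrt(sum((p[j] - q[j]) ** 2 for j in kept))
    return distance


def predict_dropped(query, rows, kept, target):
    """Stage 3 (assumed stand-in): read the dropped attribute off the nearest
    row, where nearness uses only the kept attributes."""
    distance = make_distance(kept)
    nearest = min(rows, key=lambda r: distance(query, r))
    return nearest[target]


# Toy data: attribute 1 is the square of attribute 0; attribute 2 is unrelated.
x = [1, 2, 4, 8, 16, 32]
raw_columns = [x, [v * v for v in x], [5, 3, 6, 1, 4, 2]]

columns = [ecdf_transform(c) for c in raw_columns]   # stage 1
kept = select_attributes(columns)                    # stage 2
rows = [list(r) for r in zip(*columns)]
dropped = [j for j in range(len(columns)) if j not in kept]
guess = predict_dropped(rows[3], rows, kept, dropped[0])  # stage 3
```

    In this toy data the squared attribute is a nonlinear function of the first one, so their raw Pearson correlation is below 1; after the rank-like transform of stage 1 the two columns coincide, and even this linear filter drops one of them.  That is the sense in which the transformation makes the data easier to model.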

Publications

      * Leejay Wu and Christos Faloutsos. Making Every Bit Count: Fast Nonlinear Axis Scaling. In Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, Alberta, July 2002.

      * Caetano Traina, Agma Traina, Leejay Wu and Christos Faloutsos. Fast Feature Selection Using the Fractal Dimension. In Proceedings of the XV Brazilian Symposium on Databases, Paraiba, Brazil, October 2000.