Multimedia and Stream Data Mining

Faculty: Christos Faloutsos

Student: Jia-Yu Pan

    We provide tools for the following questions:
  1. Given a video stream, find interesting visual, auditory, or text-level patterns that support applications such as segmentation, clustering, and rule discovery.
  2. Given a general stream (e.g., motion capture data), find patterns such as correlated attributes that benefit segmentation and rule discovery.
    Several research topics are related to our work. Below is a partial list, together with our currently proposed methods and results for each.
  1. Dimensionality reduction: ICA (or FastMap).
  2. Similarity search (distance function): Dimensionality reduction as the first step.
  3. Multimodal classification: Combine classification decisions from multiple classifiers.
  4. Rule discovery: For example,
    1. Using "Geoplot" we find rules such as: Words describing "storms that develop over the ocean": "Typhoon", used in the West Pacific; "Monsoon", used in the Indian Ocean; "Hurricane", used in the North Atlantic.
    2. Using "VideoCube" we find multimodal rules such as: "A white ceiling is associated with human speech; blue sky is associated with soft music."
  5. Efficiency considerations: Ideally, computation is fast and the algorithm runs online over streaming data (O(1) storage space, O(1) amortized time complexity).
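    To make the O(1) storage and O(1) time goal concrete, here is a minimal sketch of an online computation over a two-attribute stream: a running Pearson correlation maintained with Welford-style updates. It is an illustration only, not one of the tools described on this page; the class name `OnlineCorrelation` and its methods are hypothetical.

```python
import math


class OnlineCorrelation:
    """Running Pearson correlation of two stream attributes.

    Hypothetical illustration: keeps six numbers regardless of stream
    length (O(1) storage) and does constant work per item (O(1) time).
    """

    def __init__(self):
        self.n = 0          # items seen so far
        self.mean_x = 0.0   # running mean of x
        self.mean_y = 0.0   # running mean of y
        self.m2_x = 0.0     # sum of squared deviations of x
        self.m2_y = 0.0     # sum of squared deviations of y
        self.c_xy = 0.0     # sum of co-deviations of x and y

    def update(self, x, y):
        """Fold one (x, y) item into the running statistics."""
        self.n += 1
        dx = x - self.mean_x            # deviation from the old mean
        dy = y - self.mean_y
        self.mean_x += dx / self.n
        self.mean_y += dy / self.n
        # Welford-style updates: old deviation times new deviation.
        self.m2_x += dx * (x - self.mean_x)
        self.m2_y += dy * (y - self.mean_y)
        self.c_xy += dx * (y - self.mean_y)

    def correlation(self):
        """Return the correlation so far, or 0.0 if it is undefined."""
        denom = math.sqrt(self.m2_x * self.m2_y)
        return self.c_xy / denom if denom > 0 else 0.0
```

    Each arriving item is passed to `update` and then discarded; `correlation` can be queried at any point in the stream.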
    So far, we have proposed several tools toward the goals outlined above:
  1. AutoSplit: Sparse coding and blind separation of hidden variables. Attributes are clustered using the weights specified by the bases; data items are clustered based on their similar values on the hidden variables.
  2. VideoCube: Automatic and "natural" visual/auditory feature extraction, considering both spatial and temporal information.
  3. VideoGraph: Dimensionality reduction using FastMap. Segmentation by thresholding a marginal cost measurement.
  4. Geoplot: Similarity function between geo-footprints, and rule discovery. Involves named-entity extraction and a gazetteer.
  5. FastCARS: A temporal-correlation-aware sampling method for data streams.
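
    Since FastMap appears above as a dimensionality-reduction step, the following is a minimal sketch of the general FastMap projection: pick two distant pivot objects, project every object onto the line between them using only pairwise distances, then repeat on the residual distances. It is not the VideoGraph implementation; the function name `fastmap`, its parameters, and the sample points are assumptions made for illustration.

```python
import math


def fastmap(objects, dist, k):
    """Embed `objects` into k dimensions using only a distance function.

    Hypothetical sketch of the FastMap idea; `dist(a, b)` is assumed to
    be a symmetric, non-negative distance between two objects.
    """
    n = len(objects)
    coords = [[0.0] * k for _ in range(n)]

    def d2(i, j, col):
        # Squared distance left over after removing the first `col`
        # coordinates already assigned to objects i and j.
        r = dist(objects[i], objects[j]) ** 2
        for c in range(col):
            r -= (coords[i][c] - coords[j][c]) ** 2
        return max(r, 0.0)  # guard against small negative round-off

    for col in range(k):
        # Pivot heuristic: start anywhere, jump to the farthest object,
        # then to the object farthest from that one.
        b = 0
        a = max(range(n), key=lambda i: d2(i, b, col))
        b = max(range(n), key=lambda i: d2(i, a, col))
        dab2 = d2(a, b, col)
        if dab2 == 0.0:
            break  # no residual distance left to explain
        dab = math.sqrt(dab2)
        for i in range(n):
            # Cosine-law projection of object i onto the pivot line a-b.
            coords[i][col] = (d2(a, i, col) + dab2 - d2(b, i, col)) / (2 * dab)
    return coords


# Usage: four 2-D points embedded from their Euclidean distances alone.
points = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (3.0, 4.0)]
embedded = fastmap(points, math.dist, 2)
```

    For data that is truly Euclidean, as in this small example, the embedded coordinates reproduce the original pairwise distances; for general distance functions the embedding is only an approximation.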

Publications

      * Jia-Yu Pan, Srinivasan Seshan, and Christos Faloutsos. FastCARS: Fast, Correlation-Aware Sampling for Network Data Mining. In Proceedings of IEEE GlobeCOM 2002 - Global Internet Symposium, 2002.

      * Jia-Yu Pan and Christos Faloutsos. "Geoplot": Spatial Data Mining on Video Libraries. In Proceedings of the Eleventh International Conference on Information and Knowledge Management (CIKM 2002), 2002.

      * Jia-Yu Pan and Christos Faloutsos. VideoCube: a novel tool for video mining and classification. In Proceedings of the Fifth International Conference on Asian Digital Libraries (ICADL 2002), 2002.

      * Jia-Yu Pan and Christos Faloutsos. VideoGraph: A New Tool for Video Mining and Visualization. In Proceedings of the First ACM+IEEE Joint Conference on Digital Libraries (JCDL 2001), 2001.