PANDA - A Unified and Flexible Framework for Comparing Arbitrarily Complex Patterns

PANDA overview

PANDA is a generic framework for comparing both simple patterns, defined over raw data, and arbitrarily complex patterns, defined over other patterns. It is generic because it supports patterns of arbitrary complexity, and flexible because the dissimilarity assessment can be easily adapted to specific user or application requirements.

In PANDA, patterns are modeled as entities composed of two parts: the structure component, which identifies "interesting" regions in the attribute space (e.g., the head and the body of an association rule), and the measure component, which describes how the pattern relates to the underlying raw data (e.g., the support and the confidence of the rule).

When comparing two simple patterns, the dissimilarity of their structure components and the dissimilarity of their measure components are combined (through some combining function) in order to derive the total dissimilarity score ...
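
As an illustration, the comparison of two simple patterns might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not PANDA's actual implementation: the Rule class, the Jaccard-based structure dissimilarity, the absolute-difference measure dissimilarity, and the averaging combining function are all hypothetical choices made for the example. In PANDA each of these would be chosen to fit the user or application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical simple pattern: an association rule body -> head."""
    body: frozenset      # structure component
    head: frozenset      # structure component
    support: float       # measure component
    confidence: float    # measure component


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def structure_dissimilarity(p, q):
    # Assumption: average the Jaccard distances of the bodies and the heads.
    return 0.5 * (jaccard_distance(p.body, q.body)
                  + jaccard_distance(p.head, q.head))


def measure_dissimilarity(p, q):
    # Assumption: average absolute difference of support and confidence.
    return 0.5 * (abs(p.support - q.support)
                  + abs(p.confidence - q.confidence))


def combine(d_struct, d_meas):
    # Assumption: the combining function is a plain average.
    return 0.5 * (d_struct + d_meas)


def simple_dissimilarity(p, q):
    """Total dissimilarity score of two simple patterns."""
    return combine(structure_dissimilarity(p, q),
                   measure_dissimilarity(p, q))


r1 = Rule(frozenset({"bread"}), frozenset({"butter"}), 0.40, 0.80)
r2 = Rule(frozenset({"bread", "milk"}), frozenset({"butter"}), 0.20, 0.60)
print(round(simple_dissimilarity(r1, r2), 3))
```

Swapping in a different combining function (for example, one that weights structure more heavily than measure) only requires replacing combine; the rest of the sketch stays unchanged.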

The problem of comparing complex patterns is reduced to the problem of comparing the corresponding sets (or lists, arrays, etc.) of component (simple) patterns. Component patterns are first paired (using a specific matching type), and the scores of the resulting pairs are then aggregated (through some aggregation function) to obtain the overall dissimilarity score. This recursive definition of the dissimilarity allows PANDA to handle patterns of arbitrary complexity...
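
The recursive scheme can be sketched in Python as below. Again, this is only an illustrative sketch with hypothetical names: simple patterns are assumed to be itemsets (frozensets) compared by Jaccard distance, complex patterns are assumed to be tuples of component patterns, the matching type pairs every component with its closest counterpart on the other side, and the aggregation function is the mean. The scores assigned to empty complex patterns are also assumptions of the example.

```python
from statistics import mean


def jaccard_distance(a, b):
    """Assumed dissimilarity of two simple patterns (itemsets)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def closest_match(ps, qs, dis):
    """Assumed matching type: pair each component with its closest
    counterpart on the other side, in both directions, and return
    the scores of the resulting pairs."""
    scores = [min(dis(p, q) for q in qs) for p in ps]
    scores += [min(dis(p, q) for p in ps) for q in qs]
    return scores


def dissimilarity(p, q):
    """Recursive dissimilarity: tuples are complex patterns,
    anything else is treated as a simple pattern."""
    p_complex = isinstance(p, tuple)
    q_complex = isinstance(q, tuple)
    if p_complex != q_complex:
        raise TypeError("cannot compare a simple with a complex pattern")
    if not p_complex:
        return jaccard_distance(p, q)
    if not p and not q:
        return 0.0   # assumption: two empty complex patterns are identical
    if not p or not q:
        return 1.0   # assumption: empty vs. non-empty is maximally dissimilar
    # Pair the components, then aggregate the pair scores (here: the mean).
    return mean(closest_match(p, q, dissimilarity))


A = (frozenset("ab"), frozenset("cd"))
B = (frozenset("ab"), frozenset("ce"))

print(dissimilarity(A, B))        # sets of itemsets
print(dissimilarity((A,), (B,)))  # one level deeper: sets of sets of itemsets
```

Because dissimilarity passes itself to the matching step, the same function handles any nesting depth, which mirrors the recursive definition described above.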