Information gain-based selection of sequential patterns extracted from partial unimodal probabilistic bases of sequences
Abstract
The uncertainty of symbolic data can be represented by probability mass functions. Numerous works adopt this approach to characterize the uncertainty of the events forming a probabilistic base of sequences and to extract sequential patterns under the possible worlds semantics. To our knowledge, there is no method for selecting sequential patterns from probabilistic bases of sequences whose probability mass functions are unimodal and for which only the probabilities of the modes are available. Since this situation arises for several kinds of data, this paper proposes a method for selecting sequential patterns extracted from such partial unimodal probabilistic bases of sequences. Using an information gain approach, it outputs informative patterns whose occurrences tend to describe the dataset in a complementary way. Experiments on synthetic and real datasets show that the method is scalable and that the selected patterns, besides being informative and complementary, help end-users complete their knowledge.
Origin: Files produced by the author(s)