Reservoir Pattern Sampling in Data Streams - Laboratoire LI, équipe BDTLN
Conference paper, Year: 2021

Reservoir Pattern Sampling in Data Streams

Abstract

Many applications generate data streams that must be analyzed online. In this context, pattern mining is a complex task because it requires access to all data observations. To overcome this problem, state-of-the-art methods maintain a data sample or a compact data structure that retains only recent information on the main patterns. This paper addresses online pattern discovery in data streams based on pattern sampling techniques. Building on reservoir sampling, we propose a generic algorithm, named ResPat, that uses limited memory and integrates a wide spectrum of temporal biases simulating a landmark window, a sliding window, or an exponentially damped window. For these three window models, we provide fast damping optimizations and study their time complexity. Experiments show that the ResPat algorithms perform particularly well. Finally, we illustrate the value of our approach with online outlier detection in data streams.
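
For readers unfamiliar with the reservoir sampling that the abstract builds on, here is a minimal sketch of the classic uniform version (often called Algorithm R). It is not the ResPat algorithm: it samples whole observations rather than patterns and applies no temporal bias, so it corresponds at most to the landmark-window setting. The function name `reservoir_sample`, the parameter `k`, and the toy stream of itemsets are assumptions made for illustration only.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], k: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Keep a uniform sample of at most k items from a stream of unknown length.

    Illustrative sketch of classic reservoir sampling (Algorithm R),
    not the ResPat algorithm described in the paper.
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for n, item in enumerate(stream, start=1):
        if n <= k:
            # Fill phase: the first k items always enter the reservoir.
            reservoir.append(item)
        else:
            # The n-th item replaces a random slot with probability k / n.
            j = rng.randrange(n)  # uniform integer in [0, n)
            if j < k:
                reservoir[j] = item
    return reservoir


# Hypothetical stream of transactions (itemsets), consumed in one pass.
transactions = (frozenset({i, i + 1}) for i in range(1000))
sample = reservoir_sample(transactions, k=10, rng=random.Random(0))
print(len(sample))
```

Memory use stays bounded by `k` regardless of the stream length, which is the property the abstract relies on when it mentions limited memory space.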
Main file
streamsamp.pdf (1.06 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03467864, version 1 (06-12-2021)

Identifiers

HAL Id: hal-03467864
DOI: 10.1007/978-3-030-86486-6_21

Cite

Arnaud Giacometti, Arnaud Soulet. Reservoir Pattern Sampling in Data Streams. European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2021), Sep 2021, Bilbao (virtual), Spain. pp.337-352, ⟨10.1007/978-3-030-86486-6_21⟩. ⟨hal-03467864⟩