Pattern Sampling in Distributed Databases - Laboratoire LI, BDTLN team
Book chapter, Year: 2020

Pattern Sampling in Distributed Databases

Lamine Diop
  • Role: Author
  • PersonId: 1081731
Cheikh Talibouya Diop
  • Role: Author
Arnaud Giacometti

Abstract

Many applications rely on distributed databases. However, only a few discovery methods exist to extract patterns without centralizing the data. In fact, this centralization is often less expensive than communicating the extracted patterns from the different nodes. To circumvent this difficulty, this paper revisits the problem of pattern mining in distributed databases by leveraging pattern sampling. Specifically, we propose the algorithm DDSampling, which randomly draws a pattern from a distributed database with a probability proportional to its interest. We demonstrate the soundness of DDSampling and analyze its time complexity. Finally, experiments on benchmark datasets highlight its low communication cost and its robustness. We also illustrate its usefulness on real-world data from the Semantic Web by detecting outlier entities in DBpedia and Wikidata.
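To make the idea of drawing a pattern "with a probability proportional to its interest" concrete, here is a minimal Python sketch. It is not the DDSampling algorithm from the paper. It is the generic two-step sampling procedure, shown under simplifying assumptions: interest is taken to be plain frequency, patterns are itemsets, and each transaction is stored entirely on one node. The function name `sample_pattern` and the data layout (a list of nodes, each a list of item sets) are hypothetical choices made for this illustration.

```python
import random


def sample_pattern(nodes, rng=random):
    """Draw one itemset with probability proportional to its frequency
    in the union of all nodes' transactions (the empty itemset included).

    Assumption for this sketch: `nodes` is a list of nodes, each node being
    a list of transactions, each transaction a set of comparable items.
    """
    # A transaction t contains 2**len(t) itemsets, so that is its weight.
    node_weights = [sum(2 ** len(t) for t in transactions) for transactions in nodes]
    # Step 1: pick a node proportionally to its total weight.
    # Only one number per node is needed for this step, not the data itself.
    (transactions,) = rng.choices(nodes, weights=node_weights, k=1)
    # Step 2: inside that node, pick a transaction proportionally to 2**len(t).
    (t,) = rng.choices(transactions, weights=[2 ** len(t) for t in transactions], k=1)
    # Step 3: return a uniform subset of t (keep each item with probability 1/2).
    # Items are sorted only to make seeded runs reproducible.
    return frozenset(item for item in sorted(t) if rng.random() < 0.5)


if __name__ == "__main__":
    # Two hypothetical nodes holding one transaction each.
    demo_nodes = [[{"a", "b"}], [{"a"}]]
    print(sample_pattern(demo_nodes, random.Random(42)))
```

Under these assumptions, an itemset X is returned whenever a transaction t containing X is chosen (probability proportional to 2^|t|) and X is the subset drawn from it (probability 1/2^|t|), so the two factors cancel and X is drawn with probability proportional to the number of transactions that contain it.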
Main file
adbis20.pdf (589.79 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03009021, version 1 (17-11-2020)

Cite

Lamine Diop, Cheikh Talibouya Diop, Arnaud Giacometti, Arnaud Soulet. Pattern Sampling in Distributed Databases. Advances in Databases and Information Systems - 24th European Conference, ADBIS 2020, Lyon, France, August 25-27, 2020, Proceedings, pp.60-74, 2020, ⟨10.1007/978-3-030-54832-2_7⟩. ⟨hal-03009021⟩