Cardinality estimation of an approximate substring query is an important problem in database systems. Traditional approaches build a summary from the text data and estimate the cardinality using the summary with some statistical assumptions. Since deep learning models can learn underlying complex data patterns effectively, they have been successfully applied and shown to outperform traditional methods for cardinality estimations of queries in database systems. However, since they are not yet applied to approximate substring queries, we investigate a deep learning approach for cardinality estimation of such queries. Although the accuracy of deep learning models tends to improve as the train data size increases, producing a large train data is computationally expensive for cardinality estimation of approximate substring queries. Thus, we develop efficient train data generation algorithms by avoiding unnecessary computations and sharing common computations. We also propose a deep learning model as well as a novel learning method to quickly obtain an accurate deep learning-based estimator. Extensive experiments confirm the superiority of our data generation algorithms and deep learning model with the novel learning method. [Go to the full record in the library's catalogue]
This video is presented here with the permission of the speakers.
Any downloading, storage, reproduction, and redistribution are strictly prohibited
without the prior permission of the respective speakers.
Go to Full Disclaimer.
Full Disclaimer
This video is archived and disseminated for educational purposes only. It is presented here with the permission of the speakers, who have mandated the means of dissemination.
Statements of fact and opinions expressed are those of the inditextual participants. The HKBU and its Library assume no responsibility for the accuracy, validity, or completeness of the information presented.
Any downloading, storage, reproduction, and redistribution, in part or in whole, are strictly prohibited without the prior permission of the respective speakers. Please strictly observe the copyright law.