Please use this identifier to cite or link to this item:
http://hdl.handle.net/2067/46606
Title: | Topic Modeling by Community Detection Algorithms |
Authors: | Amati, Giambattista; Angelini, Simone; Cruciani, Antonio; Fusco, Gianmarco; Gaudino, Giancarlo; Pasquini, Daniele; Vocca, Paola |
Issue Date: | 2021 |
Abstract: | We first estimate the number of Italian users active on Twitter in the last year by filtering the Italian flow of Twitter. We show that our filter misses about 6.86% of the Italian flow, while 86.80% of the selected tweets are in Italian. Given this accuracy of the Italian Twitter Firehose filter, we are able to assess the actual number of Italian active users (AUs) of the platform. We then introduce a massive text document clustering algorithm that is easily applicable and scalable to the Twitter social network. Instead of a topic modeling approach based on feature selection and a conventional clustering algorithm, such as LDA, we apply community detection algorithms to the weighted hashtag graph. To scale with the graph size, we apply two linear community detection algorithms, CoDA and Louvain. Once the hashtags have been assigned to clusters, both the largest clusters and the most frequent hashtags are associated with topics of general interest, such as sports, politics, and health. In this way we are able to provide significant statistics on the topics covered on Twitter in the past year. |
URI: | http://hdl.handle.net/2067/46606 |
ISBN: | 9781450386326 |
DOI: | 10.1145/3472720.3483622 |
Appears in Collections: | D1. Contributo in Atti di convegno |
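
The abstract's central idea, clustering hashtags by running community detection on a weighted hashtag graph, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the record does not say how edges are weighted, so the sketch counts the tweets in which two hashtags co-occur, and it uses a simple deterministic label propagation as a stand-in for the CoDA and Louvain algorithms named in the abstract. The function names and sample tweets are hypothetical.

```python
import re
from collections import Counter, defaultdict
from itertools import combinations

HASHTAG = re.compile(r"#(\w+)")


def hashtag_graph(tweets):
    """Return {(a, b): weight} with a < b.

    Assumption: weight = number of tweets containing both hashtags.
    """
    weights = Counter()
    for text in tweets:
        tags = sorted({tag.lower() for tag in HASHTAG.findall(text)})
        for a, b in combinations(tags, 2):
            weights[(a, b)] += 1
    return weights


def label_propagation(weights, max_rounds=20):
    """Deterministic weighted label propagation.

    A stand-in for Louvain or CoDA, not the paper's method.
    """
    neighbours = defaultdict(dict)
    for (a, b), w in weights.items():
        neighbours[a][b] = w
        neighbours[b][a] = w
    labels = {node: node for node in neighbours}
    for _ in range(max_rounds):
        changed = False
        for node in sorted(neighbours):
            score = Counter()
            for other, w in neighbours[node].items():
                score[labels[other]] += w
            # Heaviest neighbouring label wins; ties are broken alphabetically.
            best = min(score, key=lambda lab: (-score[lab], lab))
            if best != labels[node]:
                labels[node] = best
                changed = True
        if not changed:
            break
    clusters = defaultdict(set)
    for node, lab in labels.items():
        clusters[lab].add(node)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical sample tweets, not data from the paper.
tweets = [
    "#SerieA #calcio tonight",
    "#calcio #SerieA #Juventus",
    "#Juventus #SerieA",
    "#covid #vaccino update",
    "#vaccino #covid #salute",
    "#salute #covid",
    "#SerieA postponed over #covid",
]
graph = hashtag_graph(tweets)
clusters = label_propagation(graph)
print(clusters)
```

On this toy input the two groups of hashtags (football and health) end up in separate clusters despite the single tweet that links them. At Twitter scale, a modularity-based method such as Louvain would replace the stand-in step.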
Files in This Item:
File | Description | Size | Format
---|---|---|---
HT'21.pdf | | 428.22 kB | Adobe PDF
All documents in the "Unitus Open Access" community are published as open access.
All documents in the community "Prodotti della Ricerca" are restricted access unless otherwise indicated for specific documents