Data to topics

Tourist guides

Data sources

https://theculturetrip.com/europe/cyprus

https://www.visitcyprus.com/index.php/en

https://www.fodors.com/world/europe/cyprus

https://www.lonelyplanet.com/cyprus

https://www.roughguides.com/cyprus

Source code

https://github.com/hovjdev/CyprusVitalSigns/blob/main/wordembedings.py

Algorithm

  1. A corpus of text referring to tourism in Cyprus is collected and used to train a Latent Dirichlet Allocation (LDA) model.

  2. The most important topics and their associated keywords are extracted from the model.

  3. The topics and keywords are displayed as word clouds.
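
The first two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in the linked script: the corpus `DOCS` is a hypothetical toy stand-in for the collected guide text, the functions `fit_lda` and `top_keywords` are invented names, and the model is fitted with a small hand-written collapsed Gibbs sampler so that the example runs with the standard library alone. The hyperparameters `alpha`, `beta`, `n_topics` and `n_iter` are arbitrary choices for the toy data.

```python
import random
from collections import Counter

# Hypothetical toy corpus standing in for the collected tourism text.
DOCS = [
    "beach sea sun swimming beach resort sea",
    "sea beach sand sun resort swimming",
    "monastery church ruins history museum church",
    "history ruins museum monastery archaeology church",
    "beach sun sea resort sand",
    "museum history ruins archaeology monastery",
]


def fit_lda(docs, n_topics=2, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling; return per-topic word counts."""
    rng = random.Random(seed)
    tokens = [doc.split() for doc in docs]
    vocab_size = len({w for doc in tokens for w in doc})

    topic_word = [Counter() for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics                       # token count per topic
    doc_topic = [[0] * n_topics for _ in tokens]       # topic counts per doc
    assign = []                                        # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(tokens):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            topic_word[k][w] += 1
            topic_total[k] += 1
            doc_topic[d][k] += 1
        assign.append(z_d)

    # Resample each token's topic conditioned on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(tokens):
            for i, w in enumerate(doc):
                k = assign[d][i]
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                doc_topic[d][k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                topic_word[k][w] += 1
                topic_total[k] += 1
                doc_topic[d][k] += 1
    return topic_word


def top_keywords(topic_word, n=5):
    """Return the n most frequent (word, count) pairs of each topic."""
    return [
        [(w, c) for w, c in counts.most_common(n) if c > 0]
        for counts in topic_word
    ]


topic_word = fit_lda(DOCS)
keywords = top_keywords(topic_word)
for k, pairs in enumerate(keywords):
    print(f"Topic {k}: " + ", ".join(w for w, _ in pairs))
```

The `(word, count)` pairs returned for each topic are the kind of frequency data a word cloud renderer takes as input, which corresponds to the third step. On a real corpus, a library implementation of LDA would normally be preferred over a hand-written sampler, together with stop-word removal and other preprocessing that this sketch omits.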

Topics extracted from a corpus of text referring to tourism in Cyprus

Various topics extracted from the LDA model