On Bi-gram Graph Attributes
Autor: | Konstantinovsky, Thomas, Mizrachi, Matan |
---|---|
Rok vydání: | 2021 |
Předmět: | |
Druh dokumentu: | Working Paper |
DOI: | 10.5539/cis.v14n3p78 |
Popis: | We propose a new approach to text semantic analysis and general corpus analysis using, as termed in this article, a "bi-gram graph" representation of a corpus. The different attributes derived from graph theory are measured and analyzed as unique insights or against other corpus graphs. We observe a vast domain of tools and algorithms that can be developed on top of the graph representation; creating such a graph proves to be computationally cheap, and much of the heavy lifting is achieved via basic graph calculations. Furthermore, we showcase the different use-cases for the bi-gram graphs and how scalable it proves to be when dealing with large datasets. Comment: 7 pages,8 figures |
Databáze: | arXiv |
Externí odkaz: |