Filtering Instagram Hashtags Through Crowdtagging and the HITS Algorithm
Authors: | Nicolas Tsapatsoulis, Stamatios Giannoulakis |
---|---|
Publication year: | 2019 |
Subject: | Information retrieval; Computer science; business.industry; Rank (computer programming); Collective intelligence; 020207 software engineering; Context (language use); 02 engineering and technology; HITS algorithm; Crowdsourcing; Human-Computer Interaction; Automatic image annotation; Modeling and Simulation; 0202 electrical engineering, electronic engineering, information engineering; Selection (linguistics); 020201 artificial intelligence & image processing; business; Image retrieval; Social Sciences (miscellaneous) |
Source: | IEEE Transactions on Computational Social Systems. 6:592-603 |
ISSN: | 2373-7476 |
DOI: | 10.1109/tcss.2019.2914080 |
Description: | Instagram is a rich source for mining descriptive tags for images and multimedia in general. The tag-image pairs can be used to train automatic image annotation (AIA) systems in accordance with the learning-by-example paradigm. In previous studies, we concluded that, on average, 20% of Instagram hashtags are related to the actual visual content of the image they accompany, i.e., they are descriptive hashtags, while many others are irrelevant stop-hashtags that are used across totally different images just to gather clicks and enhance searchability. In this paper, we present a novel methodology, based on the principles of collective intelligence, that helps in locating the descriptive hashtags. In particular, we show that applying a modified version of the well-known hyperlink-induced topic search (HITS) algorithm in a crowdtagging context provides an effective and consistent way of finding pairs of Instagram images and hashtags that lead to representative and noise-free training sets for content-based image retrieval. As a proof of concept, we used the crowdsourcing platform Figure-eight to gather collective intelligence in the form of tag selection (crowdtagging) for Instagram hashtags. The crowdtagging data from Figure-eight are used to form bipartite graphs in which the first type of node corresponds to the annotators and the second type to the hashtags they selected. The HITS algorithm is first used to rank the annotators in terms of their effectiveness in the crowdtagging task and then to identify the right hashtags per image. |
Database: | OpenAIRE |
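
The abstract describes bipartite graphs linking annotators to the hashtags they selected, with HITS used to score both sides. As a rough illustration of that idea, the sketch below runs the standard, unmodified HITS iteration on such a graph. It is not the authors' modified version, which the record does not specify. The function name `hits_bipartite`, the annotator IDs, and the example hashtags are all hypothetical.

```python
import math


def _normalize(scores):
    """Scale a score dict to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(v * v for v in scores.values()))
    if norm == 0:
        return dict(scores)
    return {k: v / norm for k, v in scores.items()}


def hits_bipartite(selections, iterations=50, tol=1e-9):
    """Standard HITS on an annotator -> hashtag bipartite graph.

    `selections` maps each annotator to the set of hashtags they chose.
    Returns (hub, authority): annotator scores and hashtag scores.
    This is plain HITS, an assumption for illustration, not the paper's variant.
    """
    hashtags = {tag for tags in selections.values() for tag in tags}
    hub = {a: 1.0 for a in selections}
    authority = {t: 1.0 for t in hashtags}

    for _ in range(iterations):
        # Authority update: a hashtag scores highly when chosen by strong annotators.
        new_auth = {t: 0.0 for t in hashtags}
        for annotator, tags in selections.items():
            for tag in tags:
                new_auth[tag] += hub[annotator]

        # Hub update: an annotator scores highly when choosing strong hashtags.
        new_hub = {a: sum(new_auth[t] for t in tags)
                   for a, tags in selections.items()}

        new_auth = _normalize(new_auth)
        new_hub = _normalize(new_hub)

        # Stop early once both score vectors have stabilized.
        delta = (sum(abs(new_hub[a] - hub[a]) for a in hub)
                 + sum(abs(new_auth[t] - authority[t]) for t in authority))
        hub, authority = new_hub, new_auth
        if delta < tol:
            break

    return hub, authority


# Hypothetical crowdtagging data for a single image.
selections = {
    "ann1": {"#dog", "#puppy"},
    "ann2": {"#dog", "#puppy", "#cute"},
    "ann3": {"#dog", "#followme"},
    "ann4": {"#likeforlike"},
}

hub, authority = hits_bipartite(selections)

# Rank hashtags by authority and annotators by hub score, highest first.
ranked_hashtags = sorted(authority, key=authority.get, reverse=True)
ranked_annotators = sorted(hub, key=hub.get, reverse=True)
print(ranked_hashtags)
print(ranked_annotators)
```

In this toy input, the hashtag chosen by the most mutually agreeing annotators (`#dog`) ranks first, and a tag picked only by an isolated annotator (`#likeforlike`) sinks toward zero. A threshold on the authority scores could then separate candidate descriptive hashtags from the rest, though where to set it is a design choice this sketch leaves open.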