Popis: |
Service search in the large-scale, heterogeneous, multi-domain service space of the IoT is a challenging task. It can take longer than many IoT applications can tolerate and require resources that many IoT devices lack. Categorising these services by application domain can reduce the search space and enable efficient, scalable service search. Generative probabilistic models, such as topic models, have recently been applied in many areas, including short text messages and the categorisation of IoT service specifications. IoT service descriptions are generally short and sparse. Existing work on IoT service categorisation is based on Latent Dirichlet Allocation (LDA), which does not perform well on short, sparse texts. IoT service categorisation also raises a few specific issues that existing short-text-specific topic modelling approaches do not address well. In this paper, we identify these issues and evaluate, both quantitatively and qualitatively, how well a selected set of short-text-specific topic modelling approaches perform as IoT service categorisers with respect to them. The results show that these approaches do not perform well on a corpus of noisy API descriptions and heterogeneous service descriptions. They also do not support identifying the domains of services, which is essential for domain-based service search. We conclude that integrating an appropriate and comprehensive knowledge base (i.e., a domain ontology) could minimise noise and address the heterogeneity of IoT API and service descriptions. More importantly, it could identify the domains of those APIs and services. |