Popis: |
Far from representing a clear-cut entity, the humanitarian space encompasses a wide variety of actors that often do not conform to country and language borders. In a context where people and ideas are in perpetual movement, terms are bound to end up having different meanings to different members of the community (Collinson & Elhawary, 2012). In fact, surveys among humanitarian practitioners show that, when it comes to defining key humanitarian concepts, they fail to reach unanimous agreement and adopt markedly contrasting practices (Egger & Schopper, 2018). This insight led to the compilation of the Humanitarian Encyclopedia corpus, which contains over 70 million occurrences and allows for the study of semantic variation in this field. The discipline of terminology offers the remaining tools needed for such a study. The corpus can be divided into eleven subcorpora for different types of humanitarian organizations, so we specifically set out to examine diastratic variation. From a terminological perspective, we understand this phenomenon as the coexistence of different language uses within different groups of experts in the same field (Picton & Dury, 2017). To reveal whether certain humanitarian terms present in the corpus are subject to semantic variation, we propose a methodology that combines quantitative and qualitative methods, accompanied by several case studies. Given that our corpus is unusually large for terminology research, we take advantage of innovative machine-learning techniques and use them as a stepping-stone for a detailed inspection (Hilpert & Gries, 2011; Condamines & Picton, in press). We therefore proceed in two steps. First, we use a series of R scripts to produce vectors that capture the meanings of the terms in each subcorpus, situating them as more or less distant points that we can plot and compare.
To do so, we build on the idea that semantic similarity correlates with distributional similarity (Harris, 1954; Firth, 1957), relying on the word2vec family of algorithms to infer the meaning of terms from their contexts in the corpus (Mikolov et al., 2013). Second, to substantiate the observations that we make in the resulting visualizations, we turn to more established practices in terminology studies. In particular, we use concordance software to take a closer look at specific contexts of the terms in our corpus, and we contrast our results with the perspective of humanitarian experts (Picton & Dury, 2015). |