HYPERNETWORK OF SCIENTIFIC CO-AUTHORSHIP. ANALYSIS OF DATA FROM THE REPEC DATABASE

Year of publication: 2022
Subject:
DOI: 10.24412/2073-0667-2022-4-70-83
Description: The paper addresses the modeling of a complex network of scientific co-authorship represented as a hypergraph, in contrast to the traditional approach to this phenomenon, which is based on constructing a weighted or unweighted graph. The formal background needed to describe multiple relations between groups of coauthors is given, and two models of the object under analysis are presented. Based on real data extracted from a bibliographic database, a hypergraph of the co-authorship network is constructed, its parameters are measured, and its main properties are formulated. An illustrative example is given. As a result, the phenomenon of scientific co-authorship is considered from a new point of view.
The article deals with the modeling of a complex co-authorship network represented as a hypergraph. The formal background needed to describe multiple relationships between coauthors is given, and two models of the analyzed object are presented. Based on real data extracted from the bibliographic database RePEc, a hypergraph of the co-authorship network is constructed.

Most previous studies treat co-authorship as a relation between two authors. The network is then represented as a simple graph in which a link connects only a pair of authors who are coauthors of at least one scientific paper (SP). These pairwise networks have been studied from many angles, such as degree distribution analysis, community extraction, and author ranking; see, for example, [4-8]. Such networks do not provide a complete description of the collaboration: we know only whether two scientists have collaborated, not whether a group of authors linked together in the network were coauthors of the same paper. One representation that accounts for n-ary relations between authors is a bipartite graph in which one partite set represents the authors and the other represents the SPs prepared by these authors. This makes it possible to use the apparatus of graph theory, but the heterogeneity of the nodes complicates the study of topological properties such as connectivity and clustering. Therefore, [10] proposes using a generalization of a graph, the hypergraph [11], to represent a complex system, and calls it a hyper-network. Edges of a hypergraph can relate groups of more than two nodes.

An (undirected) hypergraph H = (V, E) on a finite set V = {v_1, v_2, ..., v_n} is defined by the family E = (E_1, E_2, ..., E_m) of subsets of V. An element v_i ∈ V is called a node, and an element E_i ∈ E is called a (hyper)edge [17]. Let P = {p_1, p_2, ..., p_m} be the set of SPs, and S = {s_1, s_2, ..., s_n} be the set of their authors. We assume that P contains only those SPs that have two or more authors, so the constructed hypergraph has no edges consisting of a single vertex.

Let us define a hypergraph H_1 = (V, E_1) such that the set S is mapped to the set of vertices V and the set P is mapped to the set of edges E_1: if the SP p_i is prepared precisely by the authors v_1, v_2, ..., v_k, then E_i = {v_1, v_2, ..., v_k} is an edge, E_i ∈ E_1. The number of edges m_1 = |E_1| is the number of publications |P| [10]. We can also define a hypergraph H_2 = (V, E_2, ω) in which nodes represent authors and hyperedges represent the groups of authors that have published papers together. Here E_i = {v_1, v_2, ..., v_k} ∈ E_2 if there is at least one SP jointly published by the authors v_1, v_2, ..., v_k. The edge weight ω is the number of SPs published jointly by these k authors. The number of edges m_2 = |E_2| is the number of groups of authors [6].

In our work, we consider the set of SPs indexed in the RePEc database at the time of extraction. The procedure for filtering the "raw" data is presented in [15]. As a result, with 91113 co-authored SPs and 32434 authors, we construct the hypergraph H_ca = (V, E) by analogy with H_1 above. At this stage we use the bipartite incidence graph K(H_ca) = (V, V', E_K) to calculate a number of parameters of H_ca. The graph K(H_ca), which is isomorphic to H_ca, is obtained by associating with each hyperedge E_j ∈ E an additional vertex v_ej and defining the set V' = {v_ej : E_j ∈ E}, such that an edge between v ∈ V and v_ej ∈ V' exists if v ∈ E_j [24]. It is shown that the hypergraph H_ca is neither simple nor conformal. Parameter values are given in Tab. 2. As an example, we consider a hypergraph component consisting of 12 nodes and 27 edges (Fig. 1, Tab. 1). We note that the co-authorship networks considered in [15, 16] can be built from the hypergraph, but the reverse is not true.
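The constructions described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`build_h1`, `build_h2`, `incidence_graph`, `pairwise_projection`) and the toy paper lists are hypothetical and chosen only for illustration. The sketch builds H_1 (one hyperedge per co-authored paper), H_2 (one weighted hyperedge per distinct author group) and the bipartite incidence graph, and then shows on a small case why a pairwise co-authorship graph cannot be converted back into the hypergraph.

```python
from collections import Counter
from itertools import combinations


def build_h1(papers):
    """H_1: one hyperedge per paper; papers with fewer than two authors are skipped."""
    return {
        pid: frozenset(authors)
        for pid, authors in papers.items()
        if len(set(authors)) >= 2
    }


def build_h2(h1_edges):
    """H_2: one hyperedge per distinct author group, weighted by the number of joint papers."""
    return Counter(h1_edges.values())


def incidence_graph(h1_edges):
    """Bipartite incidence graph as a set of (author, paper_id) pairs."""
    return {(author, pid) for pid, edge in h1_edges.items() for author in edge}


def pairwise_projection(h1_edges):
    """Unweighted pairwise co-authorship graph: one link per pair of coauthors."""
    return {
        frozenset(pair)
        for edge in h1_edges.values()
        for pair in combinations(sorted(edge), 2)
    }


# Toy data (hypothetical): paper id -> list of author ids.
papers = {
    "p1": ["a", "b", "c"],
    "p2": ["a", "b", "c"],
    "p3": ["c", "d"],
    "p4": ["e"],  # single-author paper, excluded from the hypergraph
}

h1 = build_h1(papers)
h2 = build_h2(h1)
inc = incidence_graph(h1)

print("m_1 =", len(h1))          # number of co-authored papers
print("m_2 =", len(h2))          # number of distinct author groups
print("incidence edges =", len(inc))

# Two different hypergraphs with the same pairwise projection:
one_paper = build_h1({"q1": ["a", "b", "c"]})
three_papers = build_h1({"r1": ["a", "b"], "r2": ["b", "c"], "r3": ["a", "c"]})
same_projection = pairwise_projection(one_paper) == pairwise_projection(three_papers)
print("same pairwise graph:", same_projection)
```

In the last example, a single three-author paper and three two-author papers yield the same pairwise graph, so the pairwise graph alone does not determine which groups actually wrote a paper together.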
Database: OpenAIRE