Semantic-based robust text watermarking algorithm

Author: ZHANG Kun, LI Bo, CHEN Xi, YANG Xiaoyi, WU Le, HONG Richang
Language: Chinese
Year of publication: 2024
Subject:
Source: 大数据, Vol 10, Pp 49-61 (2024)
Document type: article
ISSN: 2096-0271
DOI: 10.11959/j.issn.2096-0271.2024068
Description: Text watermarking can establish the copyright ownership of text data, facilitating its secure circulation and sharing. Existing text watermarking algorithms typically pre-mark words and embed watermarks by word substitution. However, these algorithms mark candidate words based only on the hash value of the previous word, which limits their robustness. To address this issue, the semantic-based robust text watermarking (SRTW) algorithm is proposed. Specifically, semantic embeddings of the text are obtained using existing embedding models. These embeddings are then converted into word markers (-1 or 1) by a trained word-marking model. Finally, words marked as 1 are selected to replace the original words, constructing the watermark. Compared with the more advanced existing baseline algorithms, SRTW improves the AUC metric by 2.08%, 5.17%, and 3.09% in three different attack scenarios, respectively, demonstrating its effectiveness.
Database: Directory of Open Access Journals
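
The pre-mark-and-substitute scheme summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `hash_mark`, `embed_watermark`, and `detection_score`, the key `"demo-key"`, the synonym table, and the choice of a keyed SHA-256 hash over the previous word and the candidate are all assumptions made for illustration. The marking function is pluggable: the hash-based version stands in for the baseline behaviour the abstract criticizes, whereas SRTW would instead derive the -1/1 markers from semantic embeddings through a trained word-marking model, which is not reproduced here.

```python
import hashlib
from typing import Callable, Dict, List

# A marking function maps (previous word, candidate word) to 1 or -1.
MarkFn = Callable[[str, str], int]


def hash_mark(prev_word: str, candidate: str, key: str = "demo-key") -> int:
    """Hypothetical baseline marker: keyed hash of the previous word and the candidate."""
    digest = hashlib.sha256(f"{key}|{prev_word}|{candidate}".encode("utf-8")).digest()
    return 1 if digest[0] % 2 == 0 else -1


def embed_watermark(words: List[str], synonyms: Dict[str, List[str]], mark_fn: MarkFn) -> List[str]:
    """Replace each word with the first candidate marked 1; keep it if none qualifies."""
    out: List[str] = []
    for word in words:
        prev = out[-1] if out else ""  # the detector only sees the watermarked text
        candidates = [word] + synonyms.get(word, [])
        chosen = next((c for c in candidates if mark_fn(prev, c) == 1), word)
        out.append(chosen)
    return out


def detection_score(words: List[str], mark_fn: MarkFn) -> float:
    """Fraction of words marked 1 (an assumed detection statistic)."""
    if not words:
        return 0.0
    hits = sum(
        1 for i, w in enumerate(words)
        if mark_fn(words[i - 1] if i else "", w) == 1
    )
    return hits / len(words)


# Usage with a toy synonym table (hypothetical data).
words = "the quick fox jumps".split()
synonyms = {"quick": ["fast", "rapid"], "jumps": ["leaps", "hops"]}
marked = embed_watermark(words, synonyms, hash_mark)
print(marked, detection_score(marked, hash_mark))
```

Because each mark in this sketch depends on the preceding word, an edit to one word can also flip the mark of its successor, which is one way to read the robustness limitation the abstract describes.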