Clipping the Page – Automatic Article Detection and Marking Software in Production of Newspaper Clippings of a Digitized Historical Journalistic Collection

Authors: Kimmo Kettunen, Erno Liukkonen, Tuula Pääkkönen
Contributors: Doucet, Antoine; Isaac, Antoine; Golub, Koraljka; Aalberg, Trond; Jatowt, Adam; The National Library of Finland, Research Library
Year of publication: 2019
Subject:
Source: Digital Libraries for Open Knowledge ISBN: 9783030307592
TPDL
DOI: 10.1007/978-3-030-30760-8_33
Description: This paper describes the use of article detection and extraction on the Finnish Digi (https://digi.kansalliskirjasto.fi/etusivu?set_language=en) newspaper material of the National Library of Finland (NLF), using data from one newspaper, Uusi Suometar 1869–1918. We use PIVAJ software [1] to detect and mark articles in our collection. From the separated articles we can produce automatic clippings for the user. The user can collect clippings for their own use both as images and as OCRed text. Together these functionalities improve the usability of the digitized journalistic collection by providing structured access to the contents of a page.
Database: OpenAIRE