Decoding the Cauzin Softstrip: a case study in extracting information from old media
Authors: | Michael Reimsbach, John Aycock |
---|---|
Language: | English |
Year of publication: | 2021 |
Subject: | History; Computer science; Digital data; Information needs; Convolutional neural network; Library and Information Sciences; computer.software_genre; Barcode; law.invention; Set (abstract data type); Software; law; Encoding (memory); Digital curation; Original Paper; Multimedia; business.industry; Deep learning; 05 social sciences; Cauzin Softstrip; Optical recognition; Artificial intelligence; 0509 other social sciences; 050904 information & library sciences; business; computer |
Source: | Archival Science |
ISSN: | 1573-7500 1389-0166 |
Description: | Having content in an archive is of limited value if it cannot be read and used. As a case study of extricating information from obsolete media, making it readable once again through deep learning techniques, we examine the Cauzin Softstrip: one of the first two-dimensional bar codes, released in 1985 by Cauzin Systems, which could be used for encoding all manner of digital data. Softstrips occupy a curious middle ground, as they were both physical and digital. The bar codes were printed on paper, and in that sense are no different, archivally, from any other printed material. Softstrips can be found in old computer magazines, computer books, and booklets of software Cauzin produced. However, managing the digital nature of these physical artifacts falls within the scope of digital curation. To make the information on them readable and useful, the digital information needs to be extracted, which originally would have occurred using a physical Cauzin Softstrip reader. Obtaining a working Softstrip reader is already extremely difficult and will most likely be impossible in the coming years. In order to extract the encoded data, we created a digital Softstrip reader, making Softstrip data accessible without needing a physical reader. Our decoding strategy is able to decode over 91% of the 1229 Softstrips in our Softstrip corpus; this rises to 99% if we only consider Softstrip images produced under controlled conditions. Furthermore, we later acquired another set of 117 Softstrips and were able to decode nearly 95% of them with no adjustments to the decoder. These excellent results underscore the fact that technology like deep learning is readily accessible to non-experts; we obtained these results using a convolutional neural network, even though neither of the authors is an expert in the area. |
Database: | OpenAIRE |