Authors: | William DuMouchel, Christian Posse, Nandini Raghavan, Martha Nason, David Madigan, Greg Ridgeway |
---|---|
Year of publication: | 2002 |
Subject: | Computer Networks and Communications; Computer science; Statistical model; ComputerSystemsOrganization_PROCESSORARCHITECTURES; Lossy compression; computer.software_genre; Computer Science Applications; Statistical analyses; Statistical analysis; Data mining; Hardware_CONTROLSTRUCTURESANDMICROPROGRAMMING; Hardware_REGISTER-TRANSFER-LEVELIMPLEMENTATION; computer; Information Systems; Data compression |
Source: | Data Mining and Knowledge Discovery. 6:173-190 |
ISSN: | 1384-5810 |
DOI: | 10.1023/a:1014095614948 |
Description: | Squashing is a lossy data compression technique that preserves statistical information. Specifically, squashing compresses a massive dataset to a much smaller one so that outputs from statistical analyses carried out on the smaller (squashed) dataset reproduce outputs from the same statistical analyses carried out on the original dataset. Likelihood-based data squashing (LDS) differs from a previously published squashing algorithm insofar as it uses a statistical model to squash the data. The results show that LDS provides excellent squashing performance even when the target statistical analysis departs from the model used to squash the data. |
Database: | OpenAIRE |