Popis: |
Statistical data are often stored in proprietary file formats such as SAS, Stata, SPSS, and others. While useful for processing and analysis, these formats make the data difficult to access without the right software or utility, which often requires a commercial license. Although statistical packages are not particularly metadata aware, their files hold a significant amount of variable-level information. The ability to extract this information in a DDI-friendly XML format, complemented by summary statistics computed from the data, is highly desirable. Building on previous efforts, Metadata Technology North America has enhanced and developed new Java-based packages for reading Stata and SPSS files that can export the data in ASCII text format and extract variable-level DDI-Codebook and DDI-Lifecycle metadata (data dictionary and summary statistics). Various options are available for ASCII flavors and metadata generation, providing features beyond the typical export capabilities of statistical packages or utilities. This enables the conversion of data files into an open format combining ASCII+DDI, fit for long-term preservation, dissemination, or further processing by DDI-aware tools. Our presentation will provide an overview of these utilities, describe use cases, share lessons learned, and discuss future development. |