Scalable Reading of Structured Data

Max Odsbjerg Pedersen, Josephine Møller Jensen, Victor Harbo Johnston, Alexander Ulrich Thygesen, Helle Strandgaard Jensen

Research output: Contribution to journal/Conference contribution in journal/Contribution to newspaperJournal articleResearchpeer-review

Abstract

In this lesson, we introduce a workflow for scalable reading of structured data, combining close interpretation of individual data points and statistical analysis of the entire dataset. The lesson is structured in two parallel tracks:

A general track, suggesting a way to work analytically with structured data where distant reading of a large dataset is used as context for a close reading of distinctive datapoints.
An example track, in which we use simple functions in the programming language R to analyze Twitter data. Combining these two tracks, we show how scalable reading can be used to analyze a wide variety of structured data. Our suggested scalable reading workflow includes two distant reading approaches that will help researchers to explore and analyze overall features in large data sets (chronologically and in relation to binary structures), plus a way of using distant reading to select individual data points for close reading in a systematic and reproducible manner.
Original languageEnglish
JournalProgramming Historian
Volume11
ISSN2397-2068
DOIs
Publication statusPublished - Nov 2022

Fingerprint

Dive into the research topics of 'Scalable Reading of Structured Data'. Together they form a unique fingerprint.

Cite this