From Bias to Insight: Computational Challenges and Opportunities in the Humanities

Research output: Contribution to book/anthology/report/proceeding; Article in proceedings; Research; peer-review

Abstract

Thursday, 06/Mar/2025 9:00am - 9:30am
ID: 198 / Session LP 05: 1
Long paper (abstract) | 20-minute presentation with a 10-minute Q&A
Keywords: Bias in Historical Data; Digital Scholarly Editions (DSE); Computational Humanities; Data Preservation; Cultural
Canonization
1. Introduction: Bias as a Critical Pathway to Historical Insight
The integration of computational methods into traditional humanities disciplines has transitioned from a field of potential to one
central to the transformation of concrete research practices. These advancements have significantly reshaped how scholars
interpret cultural data. However, with these advancements come challenges. In this paper, we address the challenge of
transforming bias—often regarded as a flaw—into a valuable analytical tool. In historical datasets, bias is not simply a problem
to be eliminated but a reflection of the cultural, political, and social forces that shaped the material. This understanding shifts
bias from an obstacle to a critical pathway for uncovering deeper meanings and insights within given data.
In other words, this paper explores how computational methods can harness the biases present in historical corpora to produce
more nuanced interpretations. Moreover, it addresses the role of data preservation strategies in ensuring that these biases are
not lost to future scholarship. Using the Digital Scholarly Edition (DSE) of Grundtvig’s Works as a case study, we seek to
demonstrate how bias can be both a challenge and an interpretative opportunity in the development and sustainability of digital
archives. Within the framework of the DSE, the emphasis in this case is on a single authorship, reflecting both the inherent bias
of one individual from a particular era and the bias inherent in the DSE itself.
2. Preserving Bias for Future Insight: Data Quality and Longevity
The preservation of historical data is not merely a technical concern but a philosophical one that directly impacts how future
scholars will interpret cultural data. Traditional humanities research, which often centers on close readings of carefully curated
texts, allows for a high degree of control over the sources. In contrast, large-scale computational projects must contend with the
digitization of incomplete, inconsistent, or biased texts, introducing challenges in maintaining the quality of the data while
preserving its interpretative richness.
The Grundtvig’s Works DSE exemplifies the effort needed to produce high-quality digital editions that acknowledge, rather than
erase, the biases inherent in historical data (cf. Oltmanns et al., 2019; Pierazzo, 2015). While rigorous data cleaning and curation
are essential, questions of long-term preservation loom large. Without sustainable infrastructures and open data standards,
digital scholarly editions risk becoming obsolete or inaccessible, leading to what some scholars have called a potential ‘digital
wasteland’ (Baunvig et al., 2023). Preservation is crucial not just for maintaining access to the content, but for safeguarding the
biases within the data that reflect the ideological and socio-political conditions of the time.
Rather than prioritizing rendition—where texts are cleaned and ‘corrected’ for readability—digital humanists must emphasize
storage and preservation to ensure that future scholars can continue to engage with the historical biases embedded within the
texts. This allows for a more reflective approach to the past, where bias is seen not as an error to be erased, but as a feature to
be explored.
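The priority of storage over rendition can be sketched in a few lines. The following is a minimal illustration, not the Grundtvig's Works DSE pipeline: the record layout, the field names, the helper names make_record and verify, and the sample string are all assumptions made up for this example. The untouched transcription is stored with a checksum, and any cleaned version is kept only as a derived layer beside it.

```python
import hashlib
import json
import unicodedata


def make_record(raw_text, source_id):
    """Keep the untouched transcription; store any cleaned form as a derived layer."""
    # The rendition collapses whitespace and normalizes Unicode. It never replaces "raw".
    rendition = unicodedata.normalize("NFC", " ".join(raw_text.split()))
    return {
        "source_id": source_id,  # hypothetical identifier scheme
        "raw": raw_text,
        "raw_sha256": hashlib.sha256(raw_text.encode("utf-8")).hexdigest(),
        "rendition": rendition,
    }


def verify(record):
    """Check that the stored raw text still matches its recorded checksum."""
    digest = hashlib.sha256(record["raw"].encode("utf-8")).hexdigest()
    return digest == record["raw_sha256"]


# Made-up sample string, not a quotation from any edition.
record = make_record("An  olde   worde,\nuncorrected.", "example-001")
stored = json.dumps(record, ensure_ascii=False)  # plain, open serialization
restored = json.loads(stored)
print(verify(restored), repr(restored["rendition"]))
```

Because the raw layer and its checksum travel together in an open format, a later scholar can discard or regenerate the rendition without losing the irregularities of the source.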
3. Bias as an Interpretative Lens in Historical Datasets
Bias is an inherent concern in any humanities project, but it takes on greater complexity in large-scale computational work.
Traditional humanistic methods give scholars a degree of control and intentionality over text selection, but computational projects
often involve datasets digitized at scale, where biases are more deeply embedded and sometimes amplified. The digitization
process itself can introduce errors, but more critically, it can reflect and perpetuate historical biases from the time when the texts
were created. These biases may concern the ideological, religious, or nationalist frameworks of the authors and societies in
question. For example, in the Grundtvig’s Works DSE, the digitized materials reflect nationalist and religious assumptions of
19th-century Denmark, giving scholars insight into the cultural forces shaping those works (Rasmussen et al., 2022). Bias, in
this context, becomes a rich source of information, providing pathways for understanding the dominant narratives of the past and for exploring
exclusion and marginalization.
Computational approaches provide opportunities for identifying and analyzing these biases at scale, while humanistic reflection
remains essential. Digital humanists must balance computational rigor with interpretive depth, recognizing that bias detection
algorithms, often used to ensure fairness in contemporary datasets, can be repurposed to trace historical ideologies in ways that
enhance our understanding of the past.
4. Refining Bias Detection Algorithms for Historical Data
Bias detection algorithms, initially designed to ensure fairness in modern datasets, can be refined to address the unique
challenges posed by historical corpora, allowing digital humanists to analyze historical data without flattening its interpretive
complexities. One refinement involves making the algorithms context-sensitive. Rather than
applying contemporary standards of fairness, algorithms tailored for historical datasets would flag specific biases related to
gender, race, or religion as elements to explore, not to remove. This allows scholars to critically engage with the biases in their
historical context, providing a more nuanced understanding of the data.
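The idea of flagging rather than removing can be illustrated with a small sketch. It assumes a simple term lexicon per bias category; the lexicon contents, the Flag structure, and the function name flag_passages are hypothetical, and a real tool would rely on lexicons curated by domain experts for the period and language. The sample sentence is invented for the example.

```python
import re
from dataclasses import dataclass

# Hypothetical, illustrative lexicon: category -> terms of interest.
LEXICON = {
    "religion": {"heathen", "christian"},
    "nation": {"fatherland", "foreign"},
}


@dataclass
class Flag:
    category: str
    term: str
    start: int
    end: int
    context: str


def flag_passages(text, lexicon=LEXICON, window=30):
    """Return annotations for lexicon hits; the text itself is never altered."""
    flags = []
    for category, terms in lexicon.items():
        for term in sorted(terms):
            pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
            for m in pattern.finditer(text):
                lo = max(0, m.start() - window)
                hi = min(len(text), m.end() + window)
                flags.append(Flag(category, term, m.start(), m.end(), text[lo:hi]))
    return sorted(flags, key=lambda f: f.start)


# Invented sample sentence, not taken from any corpus.
sample = "The fatherland must guard its Christian heritage against foreign customs."
flags = flag_passages(sample)
for f in flags:
    print(f.category, f.term, f.start, f.end)
```

The output is a list of stand-off annotations with surrounding context, so the scholar decides what each flagged passage means; nothing is filtered or rewritten.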
Additionally, biases in historical texts are often multilayered, involving not only individual authors but also broader institutional
and societal forces. Algorithms refined to detect these multiple layers would help scholars uncover both explicit biases and the
silences or omissions that reveal underlying patterns of exclusion. Biases also evolve over time, reflecting changing cultural
norms. Algorithms capable of tracking these temporal dynamics would allow scholars to explore how specific biases shifted
across different historical periods, offering new insights into the cultural transformations of the time. Likewise, cultural specificity
is key—what constitutes bias in one historical or regional context may not apply in another. By training algorithms on region-
specific datasets, scholars can enhance their ability to detect biases unique to the cultures they are studying.
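Tracking how a bias shifts over time can be approximated, in its simplest form, by measuring how often a set of terms appears per period. The sketch below is only one possible operationalization and rests on assumptions: documents are given as (year, text) pairs, the term list is hypothetical, and the three toy documents are invented for illustration rather than drawn from any edition.

```python
from collections import defaultdict


def rate_by_period(docs, terms, period=10):
    """docs: iterable of (year, text). Returns {period_start: hits per 1000 tokens}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    terms = {t.lower() for t in terms}
    for year, text in docs:
        start = year - year % period  # e.g. 1812 -> 1810
        tokens = [tok.strip(".,;:!?\"'()").lower() for tok in text.split()]
        tokens = [t for t in tokens if t]
        totals[start] += len(tokens)
        hits[start] += sum(1 for t in tokens if t in terms)
    return {p: 1000 * hits[p] / totals[p] for p in sorted(totals) if totals[p]}


# Invented toy documents; years and wording are placeholders.
docs = [
    (1812, "the fatherland and the people"),
    (1815, "a sermon on faith"),
    (1848, "fatherland fatherland and freedom"),
]
rates = rate_by_period(docs, terms={"fatherland"})
for decade, rate in rates.items():
    print(decade, round(rate, 1))
```

Normalizing by token count keeps periods with more surviving text from dominating the comparison, although it does not correct for what was never preserved or digitized in the first place.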
Another refinement lies in the use of bias as a tool for reflection. Algorithms can be designed to not only detect bias but also help
scholars interpret how these biases relate to the broader socio-political structures of the time. This allows for a deeper exploration
of power dynamics embedded within the texts. User-directed features, where researchers can specify the types of biases they
wish to explore, would further enhance the utility of these tools for historical analysis. Finally, advanced visualization tools can
be integrated into bias detection algorithms, offering visual maps of where and how biases appear within a corpus. These
visualizations allow for a more interpretive engagement with the data, highlighting intersections of different biases and their shifts
over time, making the complexity of historical corpora more accessible to scholars.
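User-directed selection and a simple visual map can be combined in a short sketch. Everything here is an assumption made for illustration: the lexicon, the function names bias_map, render, and intersections, and the invented sample segments. The researcher chooses which categories to inspect, and the result is a plain-text grid showing where each category appears and where categories intersect.

```python
import re

# Hypothetical lexicon; in practice scholars would curate this per period and region.
LEXICON = {
    "religion": {"church", "faith"},
    "nation": {"fatherland", "danish"},
    "gender": {"mother", "wife"},
}


def bias_map(segments, categories, lexicon=LEXICON):
    """Count lexicon hits per segment for the categories the user selected."""
    grid = []
    for seg in segments:
        tokens = re.findall(r"\w+", seg.lower())
        grid.append({c: sum(tok in lexicon[c] for tok in tokens) for c in categories})
    return grid


def render(grid, categories):
    """Render the grid as text: one row per segment, one column per category."""
    lines = ["seg  " + "  ".join(f"{c:>8}" for c in categories)]
    for i, row in enumerate(grid):
        marks = [("#" * row[c]) or "." for c in categories]
        lines.append(f"{i:>3}  " + "  ".join(f"{m:>8}" for m in marks))
    return "\n".join(lines)


def intersections(grid):
    """Indices of segments where at least two selected categories co-occur."""
    return [i for i, row in enumerate(grid) if sum(v > 0 for v in row.values()) >= 2]


# Invented sample segments, not taken from any corpus.
segments = [
    "The Danish church kept the faith.",
    "A mother taught her children.",
    "Songs for the fatherland.",
]
selected = ["religion", "nation"]  # the researcher's choice
grid = bias_map(segments, selected)
print(render(grid, selected))
print("intersections:", intersections(grid))
```

A graphical heat map would serve the same purpose; the text grid is used here only to keep the example free of external dependencies.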
5. Ethical Considerations: A Reflective Approach to Bias
As computational methods become more prevalent, ethical considerations regarding the treatment of bias in historical datasets
become critical. Biases rooted in cultural, political, and ideological contexts are not simply flaws to be corrected but essential
features for understanding the conditions under which historical texts were created.
Rather than erasing these biases in the pursuit of neutrality, scholars should aim to understand how they shaped the historical
record. By embedding ethical reflection into every stage of the research process, digital humanists can ensure that computational
methods do not merely amplify dominant narratives but also reveal marginalized voices, offering new insights into the power
dynamics of the past. Ethical frameworks specific to the digital humanities are needed to engage more deeply with these
complexities and to ensure that computational tools are used in a way that enriches, rather than diminishes, the interpretative
potential of historical data (Kleinberg et al., 2016).
6. Canonical Bias in the Grundtvig’s Works DSE: A Paradox of Cultural Preservation
Finally, in the context of bias as both challenge and opportunity, the DSE of Grundtvig’s Works presents a compelling paradox.
On one hand, it provides scholars with high-quality, meticulously annotated texts that enhance our understanding of Grundtvig’s
substantial impact on Danish national identity, religious thought, and cultural history. On the other hand, the very creation of the
DSE is shaped by a significant form of bias—one embedded in the processes of cultural canonization and national self-
conception.
Grundtvig’s prominence in Danish history has ensured that he is not just an important figure, but a cultural icon, occasionally
referred to as a ‘cultural saint’, for his role in shaping the nation’s democracy, educational system, and church life. It is this
canonization that has enabled the DSE project to attract significant funding, with (currently) well over 150 million DKK invested
in making Grundtvig’s extensive writings available to the public and scholars alike. This level of financial and institutional support
is not distributed equally across all figures or texts from Danish history. Instead, it is directly tied to Grundtvig’s symbolic status
as a central figure in the collective Danish imagination.
Here, we encounter a form of institutional bias that, while making Grundtvig’s works more accessible, simultaneously perpetuates
the focus on canonical figures. The DSE was made possible not simply because of the scholarly value of Grundtvig’s writings,
but because of the cultural weight his name carries. This raises critical questions about whose works are considered worthy of
preservation and extensive study, and which voices remain marginalized or overlooked. In this sense, the bias embedded in the
DSE reflects broader patterns of historical selection, where resources are allocated based on cultural prominence rather than
equitable representation.
However, this paradoxical bias also opens up an opportunity for reflection. The DSE’s existence, made possible by Grundtvig’s
cultural stature, allows us to interrogate the very processes of canonization and the role of bias in shaping scholarly endeavors.
The creation of such a comprehensive, well-funded project demonstrates how bias operates not only within the content of
historical texts but also within the systems of support that determine which texts are preserved, studied, and celebrated.
Moreover, this bias is not static. Just as computational tools can reveal and analyze the biases within historical data, so too can
the systems of cultural preservation evolve. The attention paid to canonical figures like Grundtvig is beginning to shift, with
increasing efforts to broaden the scope of cultural heritage projects to include a more diverse range of voices and materials.
While the DSE represents a pinnacle of canon-focused scholarship, it also exemplifies how bias—paradoxically—serves both to
sustain and to challenge the boundaries of cultural memory. By critically engaging with this bias, we can better understand the
socio-political forces that shape the preservation of historical data. In the case of the DSE, this awareness prompts us to reflect
on how the mechanisms that prioritize certain figures might, in the future, be reconfigured to offer a more inclusive, socially
sustainable approach to cultural heritage.
7. Conclusion: Embracing Bias as Insight in Digital Scholarship
The integration of computational methods into the humanities marks the beginning of a new era—one where technology
amplifies, rather than replaces, traditional humanistic inquiry. The Grundtvig’s Works DSE serves as a powerful example of how
computational methods can open new interpretive avenues by addressing the challenges of bias, data quality, and preservation.
Computation should be viewed not merely as a tool for efficiency but as a key to deeper inquiry, particularly in its ability to expose biases
in historical datasets that might otherwise remain hidden. Bias, in this context, becomes not a flaw to be eliminated but a critical
aspect of historical interpretation. Digital humanists must leverage these biases to illuminate the socio-political forces that shaped
historical texts, gaining deeper insights into the past.
The future of digital scholarship lies in maintaining a balance between the interpretive complexity of humanistic traditions and
the power of computational tools. By fostering interdisciplinary collaboration, developing robust ethical frameworks, and building
sustainable infrastructures, the humanities are poised to thrive in the digital age, producing richer and more nuanced
understandings of our cultural heritage.
Bibliography
Baunvig, K.F., Rasmussen, K.S.G., Møldrup-Dalum, P., & Vad, K., 2023, Storage Over Rendition. Towards a Sustainable
Infrastructure in the Digital Textual Heritage Sector. Digital Humanities in the Nordic and Baltic Countries Publications 5(1):
24047. https://doi.org/10.5617/dhnbpub.10667.
Friedman, B., & Nissenbaum, H., 1996, “Bias in computer systems”, ACM Transactions on Information Systems (TOIS), 14(3),
330-347.
Kleinberg, J., Mullainathan, S., & Raghavan, M., 2016, “Inherent trade-offs in the fair determination of risk scores”, arXiv
preprint arXiv:1609.05807.
Oltmanns, E., Hasler, T., Peters-Kottig, W., & Kuper, H.-G., 2019, Different Preservation Levels: The Case of Scholarly Digital
Editions. Data Science Journal, 18(1), 51.
Pierazzo, E., 2015, Digital Scholarly Editing: Theories, Models and Methods, Ashgate.
Rasmussen, K.S.G., Tafdrup, J., Ravn, K.S., & Baunvig, K.F., 2022, The Case for Scholarly Editions. Proceedings of the 6th
Digital Humanities in the Nordic and Baltic Countries Conference (DHNB 2022), pp. 401-405. CEUR Workshop Proceedings,
Vol. 3232. https://ceur-ws.org/Vol-3232/paper39.pdf.
Original language: Danish
Title of host publication: Digital Humanities in the Nordic and Baltic Countries Book of Abstracts
Editors: Mari Väina
Number of pages: 3
Publication date: Mar 2025
Pages: 33-35
Publication status: Published - Mar 2025
Series: Digital Humanities in the Nordic and Baltic Countries Publications
ISSN: 2704-1441
