Department of Political Science

Can the online crowd match real expert judgments? How task complexity and coder location affect the validity of crowd-coded data

Research output: Contribution to journal/Conference contribution in journal/Contribution to newspaper › Journal article › Research › peer-review


Crowdcoding is a novel technique that allows for fast, affordable, and reproducible online categorization of large numbers of statements. It combines judgements by multiple paid, non-expert coders to avoid miscoding. Benoit et al. (2016) argue that crowdcoding could replace expert judgements, using the coding of political texts as an example in which both strategies produce similar results. Since crowdcoding holds the potential to extend the replication standard to data production and to “scale” coding schemes based on a modest number of carefully devised test questions and answers, it is important that we better understand its possibilities and limitations. While previous results for low-complexity coding tasks are encouraging, we assess whether and under what conditions simple and complex coding tasks can be outsourced to the crowd without sacrificing content validity in return for scalability. The simple task is to decide whether a party statement counts as a positive reference to a concept – in our case, equality. The complex task is to distinguish between five concepts of equality. To account for the crowdcoders’ contextual knowledge, we vary the IP restrictions. The basis for our comparisons is 1,404 party statements, coded by experts and the crowd (resulting in 30,000 online judgements). We compare the expert-crowd match at the statement and party levels and find that the (aggregated) results are substantively similar even for the complex task, suggesting that complex category schemes can be scaled via crowdcoding. The match is only slightly higher when IP restrictions are used as an approximation of coder expertise.
Original language: English
Journal: European Journal of Political Research
Pages (from-to): 236-247
Publication status: Published - 2019


ID: 124034795