Multilingual historical narratives on Wikipedia

Version
1
Resource Type
Dataset
Creator
  • Samoilenko, Anna (GESIS)
Publication Date
2017
Contributor
  • Strohmaier, Markus (GESIS, University of Koblenz-Landau) (Supervisor)
  • Weller, Katrin (GESIS) (Project Member)
  • Zens, Maria (GESIS) (Project Member)
  • Lemmerich, Florian (GESIS, University of Koblenz-Landau) (Project Member)
  • Samoilenko, Anna (GESIS, University of Koblenz-Landau) (Contact Person)
Classification
  • ZA:
    • Society, Culture
    • Historical Social Research
    • Communication, Public Opinion, Media
    • Historical Studies Data
Description
  • Abstract

    Portrayals of history are never complete, and each description inherently exhibits a specific viewpoint and emphasis. In this work, we automatically identified such differences by computing timelines and detecting temporal focal points of written history across languages on Wikipedia. In particular, we studied articles related to the history of all UN member states and compared them in 30 language editions. We developed a computational approach that makes it possible to identify focal points quantitatively, and found that Wikipedia narratives about national histories (i) are skewed towards more recent events (recency bias) and (ii) are distributed unevenly across the continents, with a significant focus on the history of European countries (Eurocentric bias). Thus, our work explored how colonial ties shape popular historiography on Wikipedia. We also established that national historical timelines vary across language editions, although average interlingual consensus is rather high. We hope that this work provides a starting point for a broader computational analysis of written history on Wikipedia and elsewhere.
Temporal Coverage
  • 2016-07 / 2016-07
Sampled Universe
Main text of Wikipedia articles on the history of the 193 UN member states (and their outlinks) in 30 language editions, collected in July 2016
Sampling
Live-crawling of Wikipedia pages
Collection Mode
    • Other
    • Content Analysis
Data and File Information
    • File Name: collected_dates_per_decade_by_country_matrices.zip
      File Format: application/zip
      File Size: 513525
      Data Fingerprint: a29f8c0a5409da6a36151151018019ca
      Method Fingerprint: MD5
    • File Name: jensen_shannon_divergence_years_matrices.zip
      File Format: application/zip
      File Size: 268229
      Data Fingerprint: c26f67e5888abea94aff3d69f1700aae
      Method Fingerprint: MD5
    • File Name: z-scores.zip
      File Format: application/zip
      File Size: 161462
      Data Fingerprint: d9d1fc4dab7e352e69885e727d1e952b
      Method Fingerprint: MD5
    • File Name: evaluation_final_error_rates.csv
      File Format: application/octet-stream
      File Size: 1752
      Data Fingerprint: c07f040dad5c3376ce35388998bfba5a
      Method Fingerprint: MD5
    • File Name: dates_extraction.py
      File Format: application/octet-stream
      File Size: 3005
      Data Fingerprint: 5e791009507f15ce825348d76c7095bc
      Method Fingerprint: MD5
    • File Name: README.txt
      File Format: text/plain
      File Size: 3506
      Data Fingerprint: 585949120c656c8b6a2182bf2f3325d5
      Method Fingerprint: MD5
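Verifying downloads: the file sizes and MD5 fingerprints listed above can be used to check that downloaded files are intact. The following is a minimal Python sketch, not part of the dataset. It assumes that the listed File Size values are in bytes (the record does not state a unit) and that the files have been saved under their listed names in one local directory. The names EXPECTED, md5_of and verify are illustrative only.

```python
import hashlib
from pathlib import Path

# (size, MD5 fingerprint) per file, copied from the file list in this record.
# Assumption: the sizes are byte counts; the record does not state a unit.
EXPECTED = {
    "collected_dates_per_decade_by_country_matrices.zip":
        (513525, "a29f8c0a5409da6a36151151018019ca"),
    "jensen_shannon_divergence_years_matrices.zip":
        (268229, "c26f67e5888abea94aff3d69f1700aae"),
    "z-scores.zip": (161462, "d9d1fc4dab7e352e69885e727d1e952b"),
    "evaluation_final_error_rates.csv":
        (1752, "c07f040dad5c3376ce35388998bfba5a"),
    "dates_extraction.py": (3005, "5e791009507f15ce825348d76c7095bc"),
    "README.txt": (3506, "585949120c656c8b6a2182bf2f3325d5"),
}


def md5_of(path, chunk_size=65536):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory="."):
    """Map each expected file name to 'ok', 'missing',
    'size mismatch' or 'checksum mismatch'."""
    results = {}
    for name, (size, fingerprint) in EXPECTED.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif path.stat().st_size != size:
            results[name] = "size mismatch"
        elif md5_of(path) != fingerprint:
            results[name] = "checksum mismatch"
        else:
            results[name] = "ok"
    return results


if __name__ == "__main__":
    # Check the current working directory and print one line per file.
    for name, status in verify().items():
        print(f"{status:>17}  {name}")
```

MD5 is adequate here for detecting accidental corruption during download. It should not be relied on as protection against deliberate tampering.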
Note
Source: the free encyclopedia Wikipedia
Availability
Download
Free Access (without Registration)
Rights
CC BY-NC 4.0
Publications
  • Samoilenko, Anna and Lemmerich, Florian and Zens, Maria and Weller, Katrin and Strohmaier, Markus: "Analysing Timelines of National Histories across Wikipedia Editions: A Comparative Computational Approach". To appear in the ICWSM'17 volume as a full paper.

Update Metadata: 2019-10-01 | Issue Number: 2 | Registration Date: 2017-03-06

Samoilenko, Anna (2017): Multilingual historical narratives on Wikipedia. Version: 1. GESIS Datenarchiv. Dataset. https://doi.org/10.7802/1411