Text Analysis


1. A dataset of more than 18,000 speeches by EU elites, such as prime ministers, EU commissioners, ECB officials and European Parliament leaders (Schumacher et al. 2016).


Work in progress

1. A paper reviewing text analysis, bridging political science and psychology approaches and critically evaluating some of our assumptions (with Martijn Schoonvelde and Bert Bakker).

2. A paper analyzing the complexity of politicians' speeches (with Martijn Schoonvelde, Anna Brosius and Bert Bakker). Click here for the slides of my presentation at the EPSA Annual Conference in Milan (2017).

3. A paper demonstrating that Google Translate can be used on non-English speeches for automated text analysis (with Martijn Schoonvelde and Erik de Vries).



Gijs Schumacher, Martijn Schoonvelde, Denise Traber, Tanushree Dahiya, and Erik de Vries (2016). EUSpeech: A New Dataset of EU Elite Speeches. In Proceedings of the International Conference on the Advances in Computational Analysis of Political Text (PolText 2016), Dubrovnik, 75–80.

Read Proceedings

Gijs Schumacher, Martijn Schoonvelde, Tanushree Dahiya, and Erik de Vries (2016). EUSpeech. A Dataset of EU Elite Speeches 2007-2015. Version 2.0.

Access data

Zac Greene, Andrea Ceron, Gijs Schumacher, and Zoltan Fazekas (2016). The Nuts and Bolts of Automated Text Analysis. Comparing Different Document Pre-Processing Techniques in Four Countries. Open Science Framework.

Read article