R code for investigating the standardized type-token ratio in mediated texts using ANOVA.
Scripts for extracting a corpus of comments and articles from an Italian newspaper.
A semi-automatic approach to in-domain parallel corpora extraction from Wikipedia.