From: An open source library to parse and analyze online collaborative knowledge-building portals
Data Extraction | |
Task 1 | Extracting 5 Wikipedia articles from each quality category, namely FA, GA, B, C, Start and Stub. |
Task 2 | Extracting 10,000 random questions, together with their answers and comments, from a Stack Exchange site, e.g., anime.stackexchange.com
Data Parsing | |
Task 3 | Finding the number of words, sentences and Wikilinks added/deleted in each revision of an article (United States).
Task 4 | Extracting all questions that have an accepted answer from anime.stackexchange.com
Analysis Methods | |
Task 5 | Finding the correlation between monthly pageviews and the number of revisions of an article (United States).
Task 6 | Finding the correlation between the Gini coefficient (a measure of inequality of contribution) and the answer-to-question ratio for various Stack Exchange portals.
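
The two statistics named in Tasks 5 and 6 can be illustrated with a minimal, standard-library-only sketch. It does not use the library's own API; the function names `gini` and `pearson` and all input values are assumptions introduced here for illustration. The Gini coefficient is computed with the usual sorted-data formula, and the correlation is assumed to be Pearson's r, which the table does not specify.

```python
from math import sqrt


def gini(contributions):
    """Gini coefficient of non-negative contribution counts.

    0 means every contributor contributed equally; values approaching 1
    mean that a few contributors account for most of the contributions.
    """
    xs = sorted(contributions)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Sorted-data formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical inputs (not data from the paper): per-user post counts for
# three portals and each portal's answer-to-question ratio.
posts_per_user = [[5, 5, 5, 5], [1, 2, 3, 14], [0, 0, 1, 19]]
answer_question_ratio = [1.2, 1.6, 2.1]

gini_per_portal = [gini(p) for p in posts_per_user]
r = pearson(gini_per_portal, answer_question_ratio)
```

For Task 5 the same `pearson` function would be applied to two monthly series for one article, pageviews and revision counts, once both have been aggregated over the same months.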