
Stylometry Authentication Datasets

ISOT Twitter Dataset

Dataset Description

The dataset consists of a file named "data.zip", which contains 100 files, one for each of the 100 authors.

The file "TweetCrawling.zip" contains Java source code that retrieves the JSON structure for a given Tweet ID.

ISOT Forgery Dataset

Dataset Description

Click here to download the ISOT Forgery dataset

Data Preprocessing

The file "Canonicizers_.java" contains some filters (e.g., stopwords) used in our research.
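
As a rough illustration of what a stopword filter of this kind might look like, here is a minimal sketch in Java. It is not taken from "Canonicizers_.java": the class name StopwordFilterSketch, the method removeStopwords, and the short stopword list are all hypothetical, and the tokenization (splitting on whitespace) is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopwordFilterSketch {
    // Tiny illustrative stopword list (an assumption for this sketch);
    // it is NOT the list used in the dataset authors' research.
    private static final Set<String> STOPWORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "the", "of", "to", "is", "in"));

    /** Splits text on whitespace and drops tokens whose lowercase form is a stopword. */
    public static List<String> removeStopwords(String text) {
        List<String> kept = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // splitting an empty string yields one empty token
            }
            if (!STOPWORDS.contains(token.toLowerCase(Locale.ROOT))) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Prints [style, author, details]
        System.out.println(removeStopwords("The style of an author is in the details"));
    }
}
```

Matching is case-insensitive, but the surviving tokens keep their original casing, since capitalization can itself be a stylometric feature. Whether the original filters behave this way is not stated here.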
