Datasets
The ISOT Lab has collected various datasets through different projects, some of which are available for public sharing. The following datasets are available:
- ISOT Botnet Dataset;
- ISOT Mouse Dynamics Dataset.
Botnet Dataset
The ISOT Botnet dataset is the combination of several existing publicly available malicious and non-malicious datasets.
- Dataset Description
- Click here to download the (2010) ISOT Botnet dataset. Alternate link for the same dataset.
- The UNB ISCX Intrusion Detection Evaluation DataSet is available at http://www.iscx.ca/dataset
Mouse Dynamics Dataset
The ISOT mouse dynamics dataset consists of mouse dynamics data for 48 users collected over several months.
The dataset cannot be downloaded directly. You must first fill out an agreement describing how the data will be used; the agreement must be signed by a supervisor. Please send the signed agreement to Dr. Issa Traore <itraore at ece.uvic.ca>.
Once the request is approved, you will be sent a link to download the dataset, along with instructions on how to use it.
Mouse Dynamics Dataset Agreement Template.
Stylometry
The ISOT stylometry datasets include the following:
ISOT Twitter dataset
- Dataset description
- The file “data.zip” contains 100 files, each corresponding to one of the 100 authors
- The file “TweetCrawling.zip” contains Java source code to retrieve a JSON structure for a specific Tweet ID
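As a rough illustration of working with the per-author layout described above, the following Python sketch reads every file in an archive such as “data.zip” into memory. The function name load_author_texts is hypothetical, and the sketch assumes that each entry's file name (without its extension) identifies the author and that the contents are UTF-8 text; neither assumption is confirmed by the dataset description, so adjust it to the actual layout.

```python
import zipfile
from pathlib import PurePosixPath


def load_author_texts(zip_path):
    """Return {entry_stem: text} for every regular file in the archive.

    Assumption (not confirmed by the dataset description): the file name
    without its extension identifies the author, and contents are UTF-8.
    """
    texts = {}
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries, keep only files
            author = PurePosixPath(info.filename).stem
            raw = archive.read(info)
            # Replace undecodable bytes instead of failing on odd characters.
            texts[author] = raw.decode("utf-8", errors="replace")
    return texts
```

Calling load_author_texts("data.zip") would then be expected to return one entry per author file, provided the archive is laid out as assumed.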
ISOT Forgery dataset
- Dataset description
- The file “forgeryDataset.zip” contains our dataset in plain text
Data Preprocessing
- The file Canonicizers_.java contains some filters (e.g., stopwords) used in our research
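For readers unfamiliar with this kind of filter, stopword removal can be sketched in a few lines of Python. This is not a port of Canonicizers_.java: the stopword list, the tokenization rule, and the function name remove_stopwords are all assumptions made for illustration only.

```python
import re

# Hypothetical, deliberately tiny stopword list for illustration;
# it is not the list used in Canonicizers_.java.
STOPWORDS = frozenset({"a", "an", "and", "the", "of", "to", "in", "is"})


def remove_stopwords(text, stopwords=STOPWORDS):
    """Lowercase the text, split it into word tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [t for t in tokens if t not in stopwords]
```

For example, remove_stopwords("The cat is in the hat") keeps only the tokens "cat" and "hat".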