The BeiHang Keystroke Dynamics Database

The BeiHang Keystroke Dynamics Database includes 2057 test samples and 556 training samples from 117 subjects. The database is divided into two subsets, Dataset A and Dataset B, collected in two different environments. Dataset A was collected in a cybercafe environment; the commercial system was embedded into the login system of an online application. Dataset B was collected online; the commercial system is open to the public. For each subject, the database contains registration data from the genuine user, which serves as training samples, together with log-in data from the genuine user and log-in data from intruders. All data are stored in text format; the whole data set (971 KB) is available for download.

Data Description

All the data in this database are as originally collected, without any manual modification. Each set of data begins with a "0" and ends with an "enter" operation.

Generally, a password is represented by the following vector:
P1, R1, P2, R2, ..., Pn, Rn,

where Pi and Ri represent the press and release times of the i-th keystroke of a password.
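As an illustration of how such a vector might be used, the sketch below splits a flat P1, R1, ..., Pn, Rn sequence into press and release times and derives two timing features often used in keystroke dynamics: hold time (Ri - Pi) and flight time (P(i+1) - Ri). The function names and the sample values are hypothetical, and the time unit is not stated in this description, so the numbers are purely illustrative.

```python
from typing import List, Sequence, Tuple


def split_press_release(vector: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Split a flat [P1, R1, P2, R2, ...] vector into press and release lists."""
    if len(vector) % 2 != 0:
        raise ValueError("expected an even number of values (press/release pairs)")
    return list(vector[0::2]), list(vector[1::2])


def hold_and_flight(vector: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Return (hold times, flight times) for one password sample.

    hold[i]   = R_i - P_i        (how long key i stayed down)
    flight[i] = P_(i+1) - R_i    (gap between releasing key i and pressing key i+1)
    """
    press, release = split_press_release(vector)
    hold = [r - p for p, r in zip(press, release)]
    flight = [press[i + 1] - release[i] for i in range(len(press) - 1)]
    return hold, flight


# Hypothetical three-keystroke sample; values are made up for illustration.
sample = [0, 80, 150, 230, 300, 390]
hold, flight = hold_and_flight(sample)
```

A flight time can be negative when the next key is pressed before the previous one is released, so the sketch does not clamp or reject such values.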

The meaning of each file is indicated by its file name. All the files (samples) are organized into separate folders (subjects) according to login ID. Each folder contains exactly one training file and several testing files. Details are given below:

Training Files

In each folder, the single training file contains 4-5 registration samples and is named in the form:

[]ID(-regPSW).txt

For example, "[]12345(-regliaoxiaoying).txt" means this is the training file for ID="12345", PSW="liaoxiaoying".

Testing Files

In each folder, all the testing files are named in the same format:

[Year-Month-Day Hour.Min.Sec]ID(-loginPSW)_IsGenuine_IsPositive.txt, where

IsGenuine: 0 represents data from the genuine user, while 1 represents data from intruders.

IsPositive: y represents positive data from users, while n represents negative data.

For example, the testing file "[2009-12-30 14.07.01]12345(-loginliaoxiaoying)_1_n.txt" indicates that the login time is "2009-12-30 14.07.01", the ID is "12345", the PSW is "liaoxiaoying", and the file contains negative data from an intruder.
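The naming scheme can be decoded programmatically. The following is a minimal sketch, assuming file names follow exactly the patterns of the two examples given on this page (the training example "[]12345(-regliaoxiaoying).txt" and the testing example above). The regular expression, the SampleName class, and parse_sample_name are hypothetical helpers, not part of the database distribution.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Assumed pattern: [timestamp]ID(-regPSW).txt or
# [timestamp]ID(-loginPSW)_IsGenuine_IsPositive.txt
_NAME = re.compile(
    r"^\[(?P<stamp>[^\]]*)\]"                      # login time, empty for training files
    r"(?P<id>[^(]+)"                               # login ID
    r"\(-(?P<kind>reg|login)(?P<psw>.*)\)"         # registration or login, then password
    r"(?:_\s*(?P<genuine>[01])_(?P<positive>[yn]))?"  # flags, testing files only
    r"\.txt$"
)


@dataclass
class SampleName:
    login_id: str
    password: str
    is_training: bool
    login_time: Optional[datetime]   # None for training files
    from_intruder: Optional[bool]    # IsGenuine flag: 0 = genuine user, 1 = intruder
    is_positive: Optional[bool]      # y = positive data, n = negative data


def parse_sample_name(name: str) -> SampleName:
    """Decode one training or testing file name into its fields."""
    m = _NAME.match(name)
    if m is None:
        raise ValueError(f"unrecognized file name: {name!r}")
    stamp = m.group("stamp")
    genuine = m.group("genuine")
    positive = m.group("positive")
    return SampleName(
        login_id=m.group("id"),
        password=m.group("psw"),
        is_training=m.group("kind") == "reg",
        login_time=datetime.strptime(stamp, "%Y-%m-%d %H.%M.%S") if stamp else None,
        from_intruder=None if genuine is None else genuine == "1",
        is_positive=None if positive is None else positive == "y",
    )


test_info = parse_sample_name("[2009-12-30 14.07.01]12345(-loginliaoxiaoying)_1_n.txt")
train_info = parse_sample_name("[]12345(-regliaoxiaoying).txt")
```

Names that do not match the assumed pattern raise a ValueError rather than being guessed at, so any files in the archive that follow a different convention surface immediately.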

Related Publication:

Yilin Li, Baochang Zhang, Yao Cao, Sanqiang Zhao, Yongsheng Gao, Jianzhuang Liu. "Study on the BeiHang Keystroke Dynamics Database". International Joint Conference on Biometrics (IJCB), pp. 1-5, 2011.

