Dataset Repository

The purpose of the CHIBI dataset repository is to enable development, validation and benchmarking of new and existing informatics methods before they are applied in real-life projects. Datasets are developed locally or have been acquired from the public domain or through private relationships with research groups.
The datasets come in three main varieties:

  1. Simulated data (to examine specific theoretical properties of informatics algorithms and tools)
  2. Real data (from a variety of domains and sources)
  3. Re-simulated data (i.e., data simulated from a generative model fitted to real data, so that it combines the complexity and behavior of real data with the experimental control afforded by simulations).
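
As a rough illustration of the re-simulation idea, the sketch below fits a very simple generative model to a small table of values and then samples new rows from it. It is only a minimal example under stated assumptions: the model (an independent Gaussian per column), the function names `fit_gaussian_model` and `resimulate`, and the toy values in `real_data` are all hypothetical and are not taken from the repository, whose datasets may rely on much richer generative models.

```python
import random
import statistics


def fit_gaussian_model(rows):
    """Fit an independent Gaussian (mean, stdev) to each column of `rows`.

    Assumption: columns are treated as independent, so correlations
    between features in the real data are NOT preserved.
    """
    columns = list(zip(*rows))
    return [(statistics.mean(col), statistics.stdev(col)) for col in columns]


def resimulate(model, n_samples, seed=0):
    """Draw `n_samples` synthetic rows from the fitted per-column Gaussians."""
    rng = random.Random(seed)  # fixed seed gives reproducible samples
    return [[rng.gauss(mu, sigma) for mu, sigma in model]
            for _ in range(n_samples)]


# Toy stand-in for a real dataset (hypothetical values, two features).
real_data = [[5.1, 3.5], [4.9, 3.0], [6.2, 2.9], [5.9, 3.2], [5.5, 2.6]]

model = fit_gaussian_model(real_data)          # step 1: fit on real data
synthetic = resimulate(model, n_samples=1000)  # step 2: sample new data
```

Because the fitted parameters are known exactly, a method can be benchmarked against a known ground truth on `synthetic` while the data still reflect the scale and spread of the real measurements.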

Except for datasets that are part of online supplements to published papers (which are disseminated separately), access to the data repository is restricted to CHIBI faculty/staff/trainees and collaborators.

To request the current contents and access privileges, please contact Alexander Statnikov.