Databases¶
Introduction¶
A Database handles the raw data, annotations and how the datasets
(learn, validation or test) should be built.
N2D2 integrates pre-defined modules for several well-known databases
used in the deep learning community, such as MNIST, GTSRB, CIFAR10 and
so on. That way, no extra step is necessary to directly build
a network and train it on these databases.
All the database modules inherit from a base Database, which contains some
generic configuration options:
| Option [default value] | Description |
|---|---|
| DefaultLabel | Default label for composite images (for areas outside the ROIs). If empty, no default label is created and the default label ID is -1 |
| ROIsMargin | Margin around the ROIs, in pixels, with no label (label ID = -1) |
| RandomPartitioning | If true (1), the partitioning in the learn, validation and test sets is random, otherwise partitioning is in the order |
| DataFileLabel | If true (1), load pixel-wise image labels, if they exist |
| CompositeLabel | See the following CompositeLabel parameter section |
| TargetDataPath | Data path to target data, to be used in conjunction with the ... |
| MultiChannelMatch | See the following multi-channel handling section |
| MultiChannelReplace | See the following multi-channel handling section |
CompositeLabel parameter¶
A label is said to be composite when it is not a single labelID for the
stimulus (the stimulus label is a matrix of size > 1).
For the same stimulus, different types of labels can be specified,
i.e. the labelID, pixel-wise data and/or ROIs.
The way these different label types are handled is configured with the
CompositeLabel parameter, whose possible values are listed below (an example
setting follows the list):
None: only the labelID is used; pixel-wise data are ignored and ROIs are loaded but ignored as well by loadStimulusLabelsData().
Auto: the label is only composite when pixel-wise data are present or the stimulus labelID is -1 (in which case the DefaultLabel is used for the whole label matrix). If the label is composite, ROIs, if present, are applied. Otherwise, a single ROI is allowed and is automatically extracted when fetching the stimulus.
Default: the label is always composite. The labelID is ignored. If there is no pixel-wise data, the DefaultLabel is used. ROIs, if present, are applied.
Disjoint: the label is always composite. If there is no pixel-wise data, the labelID is used if there is no ROI; the DefaultLabel is used if there is any ROI. ROIs, if present, are applied.
Combine: the label is always composite. If there is no pixel-wise data, the labelID is used. ROIs, if present, are applied.
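For illustration, here is a minimal sketch of a database section setting these generic options (the values are illustrative; the DefaultLabel option name follows the parameter table above):
[database]
Type=DIR_Database
DataPath=${N2D2_DATA}/GST
; Always build a composite label; the labelID is used when there is no pixel-wise data
CompositeLabel=Combine
; Illustrative default label name for areas outside the ROIs
DefaultLabel=background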
Multi-channel handling¶
Multi-channel images are automatically handled and the default image format in N2D2 is BGR.
Any Database can also handle multi-channel data, where each channel is stored
in a different file. In order to be able to interpret a series of files as an
additional data channel to a first series of files, the file names must follow
a simple yet arbitrary naming convention. A first parameter,
MultiChannelMatch, is used to match the files constituting a single
channel. Then, a second parameter, MultiChannelReplace, is used to indicate
how the file names of the other channels are obtained. See the example below,
with the DIR_Database:
[database]
Type=DIR_Database
...
; Multi-channel handling:
; MultiChannelMatch is a regular expression for matching a single channel (for example the first one).
; Here we match anything followed by "_0", followed by "." and anything except
; ".", so we match "_0" before the file extension.
MultiChannelMatch=(.*)_0(\.[^.]+)
; Replace what we matched to obtain the file name of the different channels.
; For the first channel, replace "_0" by "_0", so the name doesn't change.
; For the second channel, replace "_0" by "_1" in the file name.
; To disable the second channel, replace $1_1$2 by ""
MultiChannelReplace=$1_0$2 $1_1$2
Note that when MultiChannelMatch is not empty, only files matching this
parameter's regexp pattern (and the associated channels obtained with
MultiChannelReplace, when they exist) will be loaded. Other files in the
dataset not matching the MultiChannelMatch filter will be ignored.
Stimuli are loaded even if some channels are missing (in which case, “Notice” messages are issued for the missing channel(s) during database loading). Missing channel values are set to 0.
Annotations are common to all channels. If annotations exist for a specific channel,
they are fused with the annotations of the other channels (for geometric annotations).
Pixel-wise annotations, obtained when DataFileLabel is 1 (true), through
the Database::readLabel() virtual method, are only read for the matched
(MultiChannelMatch) channel.
MNIST¶
MNIST [LBBH98] is already partitioned into a learning set and a testing set, with:
60,000 digits in the learning set;
10,000 digits in the testing set.
Example:
[database]
Type=MNIST_IDX_Database
Validation=0.2 ; Fraction of learning stimuli used for the validation [default: 0.0]
| Option [default value] | Description |
|---|---|
| Validation [0.0] | Fraction of the learning set used for validation |
| DataPath | Path to the database |
GTSRB¶
GTSRB [SSSI12] is already partitioned into a learning set and a testing set, with:
39,209 images in the learning set;
12,630 images in the testing set.
Example:
[database]
Type=GTSRB_DIR_Database
Validation=0.2 ; Fraction of learning stimuli used for the validation [default: 0.0]
| Option [default value] | Description |
|---|---|
| Validation [0.0] | Fraction of the learning set used for validation |
| DataPath | Path to the database |
Directory¶
Hand-made databases stored in file directories are directly supported
with the DIR_Database module. For example, suppose your database is
organized as follows (in the path specified in the N2D2_DATA
environment variable):
GST/airplanes: 800 images
GST/car_side: 123 images
GST/Faces: 435 images
GST/Motorbikes: 798 images
You can then instantiate this database as input of your neural network using the following parameters:
[database]
Type=DIR_Database
DataPath=${N2D2_DATA}/GST
Learn=0.4 ; 40% of images of the smallest category = 49 (0.4x123) images for each category will be used for learning
Validation=0.2 ; 20% of images of the smallest category = 25 (0.2x123) images for each category will be used for validation
; the remaining images will be used for testing
Each subdirectory will be treated as a different label, so there will be 4 different labels, named after the directory name.
The stimuli are equi-partitioned for the learning set and the validation set, meaning that the same number of stimuli for each category is used. If the learn fraction is 0.4 and the validation fraction is 0.2, as in the example above, the partitioning will be the following:
| Label ID | Label name | Learn set | Validation set | Test set |
|---|---|---|---|---|
| 0 | airplanes | 49 | 25 | 726 |
| 1 | car_side | 49 | 25 | 49 |
| 2 | Faces | 49 | 25 | 361 |
| 3 | Motorbikes | 49 | 25 | 724 |
| Total: | | 196 | 100 | 1860 |
Mandatory option
| Option [default value] | Description |
|---|---|
| DataPath | Path to the root stimuli directory |
| IgnoreMasks | Space-separated list of mask strings to ignore. If any is present in a file path, the file gets ignored. The usual * and + wildcards are allowed. |
| | If ... |
| LoadInMemory | Load the whole database into memory |
| Depth | Number of sub-directory levels to include |
| LabelName | Base stimuli label name |
| LabelDepth | Number of sub-directory name levels used to form the stimuli labels |
| | If true (1), the ... |
| EquivLabelPartitioning | If true (1), the stimuli are equi-partitioned in the learn and validation sets, meaning that the same number of stimuli for each label is used (only when ...) |
| | If ... |
| | If ... |
| ValidExtensions | List of space-separated valid stimulus file extensions (if left empty, any file extension is considered a valid stimulus) |
| LoadMore | Name of another section with the same options to load a different DataPath |
| ROIFile | File containing the stimuli ROIs. If a ROI file is specified, ... |
| DefaultLabel | Label name for pixels outside any ROI (default is no label, pixels are ignored) |
| ROIsMargin | Number of pixels around ROIs that are ignored (and not considered as ...) |
Note
If EquivLabelPartitioning
is 1 (default setting), the number of stimuli
per label that will be partitioned in the learn and validation sets will
correspond to the number of stimuli from the label with the fewest stimuli.
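For illustration, a hedged sketch of a section that disables equi-partitioning, so that the partitioning of each label is no longer capped by the label with the fewest stimuli (the exact behavior should be checked against your N2D2 version):
[database]
Type=DIR_Database
DataPath=${N2D2_DATA}/GST
Learn=0.4
Validation=0.2
; Assumption: with equi-partitioning disabled, each label is partitioned
; according to its own stimulus count instead of the smallest label's count
EquivLabelPartitioning=0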
To load and partition more than one DataPath, one can use the LoadMore option:
[database]
Type=DIR_Database
DataPath=${N2D2_DATA}/GST
Learn=0.6
Validation=0.4
LoadMore=database.test
; Load stimuli from the "GST_Test" path in the test dataset
[database.test]
DataPath=${N2D2_DATA}/GST_Test
Learn=0.0
Test=1.0
; The LoadMore option is recursive:
; LoadMore=database.more
; [database.more]
; Load even more data here
Speech Commands Dataset¶
Use with the Speech Commands Data Set, released by Google [Warden18].
[database]
Type=DIR_Database
DataPath=${N2D2_DATA}/speech_commands_v0.02
ValidExtensions=wav
IgnoreMasks=*/_background_noise_
Learn=0.6
Validation=0.2
CSV data files¶
CSV_Database is a generic driver for handling CSV data files. It can be used
to load one or several CSV files where each line is a different stimulus and one
column contains the label.
The parameters are the following:
| Option [default value] | Description |
|---|---|
| DataPath | Path to the database |
| Learn | Fraction of data used for the learning |
| Validation | Fraction of data used for the validation |
| | If true (1), the ... |
| EquivLabelPartitioning | If true (1), the stimuli are equi-partitioned in the learn and validation sets, meaning that the same number of stimuli for each label is used (only when ...) |
| LabelColumn | Index of the column containing the label (if < 0, from the end of the row) |
| NbHeaderLines | Number of header lines to skip |
| | If ... |
| LoadMore | Name of another section with the same options to load a different DataPath |
Note
If EquivLabelPartitioning
is 1 (default setting), the number of stimuli
per label that will be partitioned in the learn and validation sets will
correspond to the number of stimuli from the label with the fewest stimuli.
Usage example¶
In this example, we load the Electrical Grid Stability Simulated Data Data Set (https://archive.ics.uci.edu/ml/datasets/Electrical+Grid+Stability+Simulated+Data+).
The CSV data file (Data_for_UCI_named.csv
) is the following:
"tau1","tau2","tau3","tau4","p1","p2","p3","p4","g1","g2","g3","g4","stab","stabf"
2.95906002455997,3.07988520422811,8.38102539191882,9.78075443222607,3.76308477206316,-0.782603630987543,-1.25739482958732,-1.7230863114883,0.650456460887227,0.859578105752345,0.887444920638513,0.958033987602737,0.0553474891727752,"unstable"
9.3040972346785,4.90252411201167,3.04754072762177,1.36935735529605,5.06781210427845,-1.94005842705193,-1.87274168559721,-1.25501199162931,0.41344056837935,0.862414076352903,0.562139050527675,0.781759910653126,-0.00595746432603695,"stable"
8.97170690932022,8.84842842134833,3.04647874898866,1.21451813833956,3.40515818001095,-1.20745559234302,-1.27721014673295,-0.92049244093498,0.163041039311334,0.766688656526962,0.839444015400588,0.109853244952427,0.00347087904838871,"unstable"
0.716414776295121,7.66959964406565,4.48664083058949,2.34056298396795,3.96379106326633,-1.02747330413905,-1.9389441526466,-0.997373606480681,0.446208906537321,0.976744082924302,0.929380522872661,0.36271777426931,0.028870543444887,"unstable"
3.13411155161342,7.60877161603408,4.94375930178099,9.85757326996638,3.52581081652096,-1.12553095451115,-1.84597485447561,-0.554305007534195,0.797109525792467,0.455449947148291,0.656946658473716,0.820923486481631,0.0498603734837059,"unstable"
...
There is one header line and the last column is the label, which is the default.
This file is loaded and the data is split between the learning set and the validation set with a 0.7/0.3 ratio, using the following INI section:
[database]
Type=CSV_Database
Learn=0.7
Validation=0.3
DataPath=Data_for_UCI_named.csv
NbHeaderLines=1
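If the label were stored in a column other than the last one, the column index could be set explicitly. This is a hypothetical sketch (the file name is made up and the LabelColumn option name is the one reconstructed in the parameter table above):
[database]
Type=CSV_Database
; Hypothetical CSV file with the label in the first column
DataPath=my_data.csv
Learn=0.7
Validation=0.3
NbHeaderLines=1
; Label in the first column (index 0) instead of the last column (the default)
LabelColumn=0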
Other built-in databases¶
Actitracker_Database¶
Actitracker database, released by the WISDM Lab [LWX+11].
| Option [default value] | Description |
|---|---|
| Learn | Fraction of data used for the learning |
| Validation | Fraction of data used for the validation |
| | If true, use the unlabeled dataset for the test |
| DataPath | Path to the database |
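A minimal INI sketch for this driver (the fractions are illustrative):
[database]
Type=Actitracker_Database
Learn=0.6
Validation=0.2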
CIFAR10_Database¶
CIFAR10 database [Kri09].
| Option [default value] | Description |
|---|---|
| Validation | Fraction of the learning set used for validation |
| DataPath | Path to the database |
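A minimal INI sketch (the validation fraction is illustrative):
[database]
Type=CIFAR10_Database
; Fraction of the learning set held out for validation
Validation=0.1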
CIFAR100_Database¶
CIFAR100 database [Kri09].
| Option [default value] | Description |
|---|---|
| Validation | Fraction of the learning set used for validation |
| | If true, use the coarse labeling (20 labels instead of 100) |
| DataPath | Path to the database |
CKP_Database¶
The Extended Cohn-Kanade (CK+) database for expression recognition [LuceyCohnKanade+10].
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
| DataPath | Path to the database |
Caltech101_DIR_Database¶
Caltech 101 database [FFFP04].
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
| | If true, includes the BACKGROUND_Google directory of the database |
| DataPath [.../101_ObjectCategories] | Path to the database |
Caltech256_DIR_Database¶
Caltech 256 database [GHP07].
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
| | If true, includes the BACKGROUND_Google directory of the database |
| DataPath [.../256_ObjectCategories] | Path to the database |
CaltechPedestrian_Database¶
Caltech Pedestrian database [DollarWSP09].
Note that the images and annotations must first be extracted from the
seq video data located in the videos directory using the dbExtract.m
Matlab tool provided in the “Matlab evaluation/labeling code”
downloadable on the dataset website.
Assuming the following directory structure (in the path specified in the
N2D2_DATA environment variable):
CaltechPedestrians/data-USA/videos/... (from the setxx.tar files)
CaltechPedestrians/data-USA/annotations/... (from the setxx.tar files)
CaltechPedestrians/tools/piotr_toolbox/toolbox (from the Piotr’s Matlab Toolbox archive)
CaltechPedestrians/*.m including dbExtract.m (from the Matlab evaluation/labeling code)
Use the following command in Matlab to generate the images and annotations:
cd([getenv('N2D2_DATA') '/CaltechPedestrians'])
addpath(genpath('tools/piotr_toolbox/toolbox')) % add the Piotr's Matlab Toolbox in the Matlab path
dbInfo('USA')
dbExtract()
| Option [default value] | Description |
|---|---|
| Validation | Fraction of the learning set used for validation |
| | Use the same label for “person” and “people” bounding boxes |
| | Include ambiguous bounding boxes labeled “person?” using the same label as “person” |
| DataPath [.../CaltechPedestrians/data-USA/images] | Path to the database images |
| [.../CaltechPedestrians/data-USA/annotations] | Path to the database annotations |
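Once the images and annotations have been extracted with the Matlab tool above, a minimal INI sketch could look like this (the validation fraction is illustrative; the default paths are those listed in the table):
[database]
Type=CaltechPedestrian_Database
Validation=0.1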
Cityscapes_Database¶
Cityscapes database [COR+16].
| Option [default value] | Description |
|---|---|
| | If true, includes the left 8-bit images - trainextra set (19,998 images) |
| | If true, only use coarse annotations (which are the only annotations available for the trainextra set) |
| | If true, convert group labels to single instance labels (for example, ...) |
| DataPath [.../Cityscapes/leftImg8bit] or [...] | Path to the database images |
| | Path to the database annotations (deduced from ...) |
Warning
Don’t forget to install the libjsoncpp-dev package on your device if you wish to use this database.
# To install JSON for C++ library on Ubuntu
sudo apt-get install libjsoncpp-dev
Daimler_Database¶
Daimler Monocular Pedestrian Detection Benchmark (Daimler Pedestrian).
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
| Test | Fraction of images used for the test |
| | When activated, the test dataset is used for learning. Use only in fully-CNN mode |
DOTA_Database¶
DOTA database [XBD+17].
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| DataPath | Path to the database |
| [] | Path to the database labels list file |
FDDB_Database¶
Face Detection Data Set and Benchmark (FDDB) [JLM10].
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
| DataPath | Path to the images (decompressed originalPics.tar.gz) |
| | Path to the annotations (decompressed FDDB-folds.tgz) |
GTSDB_DIR_Database¶
GTSDB database [HSS+13].
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
| DataPath | Path to the database |
ILSVRC2012_Database¶
ILSVRC2012 database [RDS+15].
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| DataPath | Path to the database |
| | Path to the database labels list file |
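A minimal INI sketch (the learning fraction is illustrative):
[database]
Type=ILSVRC2012_Database
Learn=0.9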
KITTI_Database¶
The KITTI Database provides ROIs which can be used for autonomous driving
and environment perception. The database provides 8 different labeled
classes. Utilization of the KITTI Database is subject to licensing
conditions and requires an email registration. To install it, follow this
link: http://www.cvlibs.net/datasets/kitti/eval_tracking.php and
download the left color images (15 GB) and the training labels of the
tracking data set (9 MB). Extract the downloaded archives in your
$N2D2_DATA/KITTI folder.
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
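After extracting the archives into $N2D2_DATA/KITTI, a minimal INI sketch could be (the fractions are illustrative):
[database]
Type=KITTI_Database
Learn=0.8
Validation=0.1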
KITTI_Road_Database¶
The KITTI Road Database provides ROIs which can be used for road
segmentation. The dataset provides 1 labeled class (road) on 289 training
images. The 290 test images are not labeled. Utilization of the KITTI
Road Database is subject to licensing conditions and requires an email
registration. To install it, follow this link:
http://www.cvlibs.net/datasets/kitti/eval_road.php and download the
“base kit” (0.5 GB) with left color images, calibration and training
labels. Extract the downloaded archive in your $N2D2_DATA/KITTI
folder.
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
KITTI_Object_Database¶
The KITTI Object Database provides ROIs which can be used for autonomous
driving and environment perception. The database provides 8 different
labeled classes on 7481 training images. The 7518 test images are not
labeled. The whole database provides 80256 labeled objects. Utilization
of the KITTI Object Database is subject to licensing conditions and
requires an email registration. To install it, follow this link:
http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark and
download the “left color images” (12 GB) and the training labels of the
object data set (5 MB). Extract the downloaded archives in your
$N2D2_DATA/KITTI_Object folder.
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
LITISRouen_Database¶
LITIS Rouen audio scene dataset [RG14].
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
| DataPath | Path to the database |
Dataset images slicing¶
It is possible to automatically slice images from a dataset, with a
given slice size and stride, using the .slicing
attribute. This
effectively increases the number of stimuli in the set.
[database.slicing]
ApplyTo=NoLearn
Width=2048
Height=1024
StrideX=2048
StrideY=1024
RandomShuffle=1 ; 1 is the default value
The RandomShuffle option, enabled by default, randomly shuffles the
dataset after slicing. If disabled, the slices are added in order at the
end of the dataset.