Databases¶
Introduction¶
A Database handles the raw data, annotations and how the datasets
(learn, validation or test) should be built.
N2D2 integrates pre-defined modules for several well-known databases
used in the deep learning community, such as MNIST, GTSRB, CIFAR10 and
so on. That way, no extra step is necessary to directly build
a network and train it on these databases.
All the database modules inherit from a base Database, which contains some
generic configuration options:
| Option [default value] | Description |
|---|---|
| DefaultLabel | Default label for composite images (for areas outside the ROIs). If empty, no default label is created and the default label ID is -1 |
| ROIsMargin | Margin around the ROIs, in pixels, with no label (label ID = -1) |
| RandomPartitioning | If true (1), the partitioning in the learn, validation and test sets is random, otherwise partitioning is in the order |
| DataFileLabel | If true (1), load pixel-wise image labels, if they exist |
| CompositeLabel | See the following CompositeLabel parameter section |
| TargetDataPath | Data path to target data, to be used in conjunction with the ... |
| MultiChannelMatch | See the following multi-channel handling section |
| MultiChannelReplace | See the following multi-channel handling section |
CompositeLabel parameter¶
A label is said to be composite when it is not a single labelID for the
stimulus (the stimulus label is a matrix of size > 1).
For the same stimulus, different types of labels can be specified,
i.e. the labelID, pixel-wise data and/or ROIs.
The way these different label types are handled is configured with the
CompositeLabel parameter, whose possible values are listed below (an example
setting follows the list):
None: only the labelID is used; pixel-wise data are ignored and ROIs are loaded but ignored as well by loadStimulusLabelsData().
Auto: the label is only composite when pixel-wise data are present or the stimulus labelID is -1 (in which case the DefaultLabel is used for the whole label matrix). If the label is composite, ROIs, if present, are applied. Otherwise, a single ROI is allowed and is automatically extracted when fetching the stimulus.
Default: the label is always composite. The labelID is ignored. If there is no pixel-wise data, the DefaultLabel is used. ROIs, if present, are applied.
Disjoint: the label is always composite. If there is no pixel-wise data, the labelID is used if there is no ROI; the DefaultLabel is used if there is any ROI. ROIs, if present, are applied.
Combine: the label is always composite. If there is no pixel-wise data, the labelID is used. ROIs, if present, are applied.
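For illustration, here is a minimal sketch of a database section setting these generic options (the values are illustrative; the DefaultLabel option name follows the parameter table above):
[database]
Type=DIR_Database
DataPath=${N2D2_DATA}/GST
; Always build a composite label; the labelID is used when there is no pixel-wise data
CompositeLabel=Combine
; Illustrative default label name for areas outside the ROIs
DefaultLabel=background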
Multi-channel handling¶
Multi-channel images are automatically handled and the default image format in N2D2 is BGR.
Any Database can also handle multi-channel data, where each channel is stored
in a different file. In order to be able to interpret a series of files as an
additional data channel to a first series of files, the file names must follow
a simple yet arbitrary naming convention. A first parameter,
MultiChannelMatch, is used to match the files constituting a single
channel. Then, a second parameter, MultiChannelReplace, is used to indicate
how the file names of the other channels are obtained. See the example below,
with the DIR_Database:
[database]
Type=DIR_Database
...
; Multi-channel handling:
; MultiChannelMatch is a regular expression for matching a single channel (for example the first one).
; Here we match anything followed by "_0", followed by "." and anything except
; ".", so we match "_0" before the file extension.
MultiChannelMatch=(.*)_0(\.[^.]+)
; Replace what we matched to obtain the file name of the different channels.
; For the first channel, replace "_0" by "_0", so the name doesn't change.
; For the second channel, replace "_0" by "_1" in the file name.
; To disable the second channel, replace $1_1$2 by ""
MultiChannelReplace=$1_0$2 $1_1$2
Note that when MultiChannelMatch is not empty, only files matching this
parameter's regexp pattern (and the associated channels obtained with
MultiChannelReplace, when they exist) will be loaded. Other files in the
dataset not matching the MultiChannelMatch filter will be ignored.
Stimuli are loaded even if some channels are missing (in which case, “Notice” messages are issued for the missing channel(s) during database loading). Missing channel values are set to 0.
Annotations are common to all channels. If annotations exist for a specific channel,
they are fused with the annotations of the other channels (for geometric annotations).
Pixel-wise annotations, obtained when DataFileLabel is 1 (true), through
the Database::readLabel() virtual method, are only read for the matched
(MultiChannelMatch) channel.
MNIST¶
MNIST [LBBH98] is already partitioned into a learning set and a testing set, with:
60,000 digits in the learning set;
10,000 digits in the testing set.
Example:
[database]
Type=MNIST_IDX_Database
Validation=0.2 ; Fraction of learning stimuli used for the validation [default: 0.0]
| Option [default value] | Description |
|---|---|
| Validation [0.0] | Fraction of the learning set used for validation |
| DataPath | Path to the database |
GTSRB¶
GTSRB [SSSI12] is already partitioned into a learning set and a testing set, with:
39,209 images in the learning set;
12,630 images in the testing set.
Example:
[database]
Type=GTSRB_DIR_Database
Validation=0.2 ; Fraction of learning stimuli used for the validation [default: 0.0]
| Option [default value] | Description |
|---|---|
| Validation [0.0] | Fraction of the learning set used for validation |
| DataPath | Path to the database |
Directory¶
Hand-made databases stored in file directories are directly supported
with the DIR_Database module. For example, suppose your database is
organized as follows (in the path specified in the N2D2_DATA
environment variable):
GST/airplanes: 800 images
GST/car_side: 123 images
GST/Faces: 435 images
GST/Motorbikes: 798 images
You can then instantiate this database as input of your neural network using the following parameters:
[database]
Type=DIR_Database
DataPath=${N2D2_DATA}/GST
Learn=0.4 ; 40% of images of the smallest category = 49 (0.4x123) images for each category will be used for learning
Validation=0.2 ; 20% of images of the smallest category = 25 (0.2x123) images for each category will be used for validation
; the remaining images will be used for testing
Each subdirectory will be treated as a different label, so there will be 4 different labels, named after the directory name.
The stimuli are equi-partitioned for the learning set and the validation set, meaning that the same number of stimuli for each category is used. If the learn fraction is 0.4 and the validation fraction is 0.2, as in the example above, the partitioning will be the following:
| Label ID | Label name | Learn set | Validation set | Test set |
|---|---|---|---|---|
| 0 | airplanes | 49 | 25 | 726 |
| 1 | car_side | 49 | 25 | 49 |
| 2 | Faces | 49 | 25 | 361 |
| 3 | Motorbikes | 49 | 25 | 724 |
| Total: | | 196 | 100 | 1860 |
Mandatory option
| Option [default value] | Description |
|---|---|
| DataPath | Path to the root stimuli directory |
| IgnoreMasks | Space-separated list of mask strings to ignore. If any is present in a file path, the file gets ignored. The usual * and + wildcards are allowed. |
| | If ... |
| LoadInMemory | Load the whole database into memory |
| Depth | Number of sub-directory levels to include |
| LabelName | Base stimuli label name |
| LabelDepth | Number of sub-directory name levels used to form the stimuli labels |
| | If true (1), the ... |
| EquivLabelPartitioning | If true (1), the stimuli are equi-partitioned in the learn and validation sets, meaning that the same number of stimuli for each label is used (only when ...) |
| | If ... |
| | If ... |
| ValidExtensions | List of space-separated valid stimulus file extensions (if left empty, any file extension is considered a valid stimulus) |
| LoadMore | Name of another section with the same options to load a different DataPath |
| ROIFile | File containing the stimuli ROIs. If a ROI file is specified, ... |
| DefaultLabel | Label name for pixels outside any ROI (default is no label, pixels are ignored) |
| ROIsMargin | Number of pixels around ROIs that are ignored (and not considered as ...) |
Note
If EquivLabelPartitioning
is 1 (default setting), the number of stimuli
per label that will be partitioned in the learn and validation sets will
correspond to the number of stimuli from the label with the fewest stimuli.
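For illustration, a hedged sketch of a section that disables equi-partitioning, so that the partitioning of each label is no longer capped by the label with the fewest stimuli (the exact behavior should be checked against your N2D2 version):
[database]
Type=DIR_Database
DataPath=${N2D2_DATA}/GST
Learn=0.4
Validation=0.2
; Assumption: with equi-partitioning disabled, each label is partitioned
; according to its own stimulus count instead of the smallest label's count
EquivLabelPartitioning=0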
To load and partition more than one DataPath, one can use the LoadMore option:
[database]
Type=DIR_Database
DataPath=${N2D2_DATA}/GST
Learn=0.6
Validation=0.4
LoadMore=database.test
; Load stimuli from the "GST_Test" path in the test dataset
[database.test]
DataPath=${N2D2_DATA}/GST_Test
Learn=0.0
Test=1.0
; The LoadMore option is recursive:
; LoadMore=database.more
; [database.more]
; Load even more data here
Speech Commands Dataset¶
Use with the Speech Commands Data Set, released by Google [Warden18].
[database]
Type=DIR_Database
DataPath=${N2D2_DATA}/speech_commands_v0.02
ValidExtensions=wav
IgnoreMasks=*/_background_noise_
Learn=0.6
Validation=0.2
CSV data files¶
CSV_Database is a generic driver for handling CSV data files. It can be used
to load one or several CSV files where each line is a different stimulus and one
column contains the label.
The parameters are the following:
| Option [default value] | Description |
|---|---|
| DataPath | Path to the database |
| Learn | Fraction of data used for the learning |
| Validation | Fraction of data used for the validation |
| | If true (1), the ... |
| EquivLabelPartitioning | If true (1), the stimuli are equi-partitioned in the learn and validation sets, meaning that the same number of stimuli for each label is used (only when ...) |
| LabelColumn | Index of the column containing the label (if < 0, from the end of the row) |
| NbHeaderLines | Number of header lines to skip |
| | If ... |
| LoadMore | Name of another section with the same options to load a different DataPath |
Note
If EquivLabelPartitioning
is 1 (default setting), the number of stimuli
per label that will be partitioned in the learn and validation sets will
correspond to the number of stimuli from the label with the fewest stimuli.
Usage example¶
In this example, we load the Electrical Grid Stability Simulated Data Data Set (https://archive.ics.uci.edu/ml/datasets/Electrical+Grid+Stability+Simulated+Data+).
The CSV data file (Data_for_UCI_named.csv
) is the following:
"tau1","tau2","tau3","tau4","p1","p2","p3","p4","g1","g2","g3","g4","stab","stabf"
2.95906002455997,3.07988520422811,8.38102539191882,9.78075443222607,3.76308477206316,-0.782603630987543,-1.25739482958732,-1.7230863114883,0.650456460887227,0.859578105752345,0.887444920638513,0.958033987602737,0.0553474891727752,"unstable"
9.3040972346785,4.90252411201167,3.04754072762177,1.36935735529605,5.06781210427845,-1.94005842705193,-1.87274168559721,-1.25501199162931,0.41344056837935,0.862414076352903,0.562139050527675,0.781759910653126,-0.00595746432603695,"stable"
8.97170690932022,8.84842842134833,3.04647874898866,1.21451813833956,3.40515818001095,-1.20745559234302,-1.27721014673295,-0.92049244093498,0.163041039311334,0.766688656526962,0.839444015400588,0.109853244952427,0.00347087904838871,"unstable"
0.716414776295121,7.66959964406565,4.48664083058949,2.34056298396795,3.96379106326633,-1.02747330413905,-1.9389441526466,-0.997373606480681,0.446208906537321,0.976744082924302,0.929380522872661,0.36271777426931,0.028870543444887,"unstable"
3.13411155161342,7.60877161603408,4.94375930178099,9.85757326996638,3.52581081652096,-1.12553095451115,-1.84597485447561,-0.554305007534195,0.797109525792467,0.455449947148291,0.656946658473716,0.820923486481631,0.0498603734837059,"unstable"
...
There is one header line and the last column is the label, which is the default.
This file is loaded and the data is split between the learning set and the validation set with a 0.7/0.3 ratio, using the following INI section:
[database]
Type=CSV_Database
Learn=0.7
Validation=0.3
DataPath=Data_for_UCI_named.csv
NbHeaderLines=1
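If the label were stored in a column other than the last one, the column index could be set explicitly. This is a hypothetical sketch (the file name is made up and the LabelColumn option name is the one reconstructed in the parameter table above):
[database]
Type=CSV_Database
; Hypothetical CSV file with the label in the first column
DataPath=my_data.csv
Learn=0.7
Validation=0.3
NbHeaderLines=1
; Label in the first column (index 0) instead of the last column (the default)
LabelColumn=0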
Other built-in databases¶
Actitracker_Database¶
Actitracker database, released by the WISDM Lab [LWX+11].
| Option [default value] | Description |
|---|---|
| Learn | Fraction of data used for the learning |
| Validation | Fraction of data used for the validation |
| | If true, use the unlabeled dataset for the test |
| DataPath | Path to the database |
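A minimal INI sketch for this driver (the fractions are illustrative):
[database]
Type=Actitracker_Database
Learn=0.6
Validation=0.2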
CIFAR10_Database¶
CIFAR10 database [Kri09].
| Option [default value] | Description |
|---|---|
| Validation | Fraction of the learning set used for validation |
| DataPath | Path to the database |
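A minimal INI sketch (the validation fraction is illustrative):
[database]
Type=CIFAR10_Database
; Fraction of the learning set held out for validation
Validation=0.1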
CIFAR100_Database¶
CIFAR100 database [Kri09].
| Option [default value] | Description |
|---|---|
| Validation | Fraction of the learning set used for validation |
| | If true, use the coarse labeling (20 labels instead of 100) |
| DataPath | Path to the database |
CKP_Database¶
The Extended Cohn-Kanade (CK+) database for expression recognition [LuceyCohnKanade+10].
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
| DataPath | Path to the database |
Caltech101_DIR_Database¶
Caltech 101 database [FFFP04].
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
| | If true, includes the BACKGROUND_Google directory of the database |
| DataPath [.../101_ObjectCategories] | Path to the database |
Caltech256_DIR_Database¶
Caltech 256 database [GHP07].
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
| | If true, includes the BACKGROUND_Google directory of the database |
| DataPath [.../256_ObjectCategories] | Path to the database |
CaltechPedestrian_Database¶
Caltech Pedestrian database [DollarWSP09].
Note that the images and annotations must first be extracted from the
seq video data located in the videos directory using the dbExtract.m
Matlab tool provided in the “Matlab evaluation/labeling code”
downloadable on the dataset website.
Assuming the following directory structure (in the path specified in the
N2D2_DATA environment variable):
CaltechPedestrians/data-USA/videos/... (from the setxx.tar files)
CaltechPedestrians/data-USA/annotations/... (from the setxx.tar files)
CaltechPedestrians/tools/piotr_toolbox/toolbox (from the Piotr’s Matlab Toolbox archive)
CaltechPedestrians/*.m including dbExtract.m (from the Matlab evaluation/labeling code)
Use the following command in Matlab to generate the images and annotations:
cd([getenv('N2D2_DATA') '/CaltechPedestrians'])
addpath(genpath('tools/piotr_toolbox/toolbox')) % add the Piotr's Matlab Toolbox in the Matlab path
dbInfo('USA')
dbExtract()
| Option [default value] | Description |
|---|---|
| Validation | Fraction of the learning set used for validation |
| | Use the same label for “person” and “people” bounding boxes |
| | Include ambiguous bounding boxes labeled “person?” using the same label as “person” |
| DataPath [.../CaltechPedestrians/data-USA/images] | Path to the database images |
| [.../CaltechPedestrians/data-USA/annotations] | Path to the database annotations |
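Once the images and annotations have been extracted with the Matlab tool above, a minimal INI sketch could look like this (the validation fraction is illustrative; the default paths are those listed in the table):
[database]
Type=CaltechPedestrian_Database
Validation=0.1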
Cityscapes_Database¶
Cityscapes database [COR+16].
| Option [default value] | Description |
|---|---|
| | If true, includes the left 8-bit images - trainextra set (19,998 images) |
| | If true, only use coarse annotations (which are the only annotations available for the trainextra set) |
| | If true, convert group labels to single instance labels (for example, ...) |
| DataPath [.../Cityscapes/leftImg8bit] or [...] | Path to the database images |
| | Path to the database annotations (deduced from ...) |
Warning
Don’t forget to install the libjsoncpp-dev package on your device if you wish to use this database.
# To install JSON for C++ library on Ubuntu
sudo apt-get install libjsoncpp-dev
Daimler_Database¶
Daimler Monocular Pedestrian Detection Benchmark (Daimler Pedestrian).
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
| Test | Fraction of images used for the test |
| | When activated, the test dataset is used for learning. Use only in fully-CNN mode |
DOTA_Database¶
DOTA database [XBD+17].
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| DataPath | Path to the database |
| [] | Path to the database labels list file |
FDDB_Database¶
Face Detection Data Set and Benchmark (FDDB) [JLM10].
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
| DataPath | Path to the images (decompressed originalPics.tar.gz) |
| | Path to the annotations (decompressed FDDB-folds.tgz) |
GTSDB_DIR_Database¶
GTSDB database [HSS+13].
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
| DataPath | Path to the database |
ILSVRC2012_Database¶
ILSVRC2012 database [RDS+15].
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| DataPath | Path to the database |
| | Path to the database labels list file |
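A minimal INI sketch (the learning fraction is illustrative):
[database]
Type=ILSVRC2012_Database
Learn=0.9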
KITTI_Database¶
The KITTI Database provides ROIs which can be used for autonomous driving
and environment perception. The database provides 8 different labeled
classes. Utilization of the KITTI Database is subject to licensing
conditions and requires an email registration. To install it, follow this
link: http://www.cvlibs.net/datasets/kitti/eval_tracking.php and
download the left color images (15 GB) and the training labels of the
tracking data set (9 MB). Extract the downloaded archives in your
$N2D2_DATA/KITTI folder.
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
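After extracting the archives into $N2D2_DATA/KITTI, a minimal INI sketch could be (the fractions are illustrative):
[database]
Type=KITTI_Database
Learn=0.8
Validation=0.1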
KITTI_Road_Database¶
The KITTI Road Database provides ROIs which can be used for road
segmentation. The dataset provides 1 labeled class (road) on 289 training
images. The 290 test images are not labeled. Utilization of the KITTI
Road Database is subject to licensing conditions and requires an email
registration. To install it, follow this link:
http://www.cvlibs.net/datasets/kitti/eval_road.php and download the
“base kit” (0.5 GB) with left color images, calibration and training
labels. Extract the downloaded archive in your $N2D2_DATA/KITTI
folder.
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
KITTI_Object_Database¶
The KITTI Object Database provides ROIs which can be used for autonomous
driving and environment perception. The database provides 8 different
labeled classes on 7481 training images. The 7518 test images are not
labeled. The whole database provides 80256 labeled objects. Utilization
of the KITTI Object Database is subject to licensing conditions and
requires an email registration. To install it, follow this link:
http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark and
download the “left color images” (12 GB) and the training labels of the
object data set (5 MB). Extract the downloaded archives in your
$N2D2_DATA/KITTI_Object folder.
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
LITISRouen_Database¶
LITIS Rouen audio scene dataset [RG14].
| Option [default value] | Description |
|---|---|
| Learn | Fraction of images used for the learning |
| Validation | Fraction of images used for the validation |
| DataPath | Path to the database |
Dataset images slicing¶
It is possible to automatically slice images from a dataset, with a
given slice size and stride, using the .slicing
attribute. This
effectively increases the number of stimuli in the set.
[database.slicing]
ApplyTo=NoLearn
Width=2048
Height=1024
StrideX=2048
StrideY=1024
RandomShuffle=1 ; 1 is the default value
The RandomShuffle option, enabled by default, randomly shuffles the
dataset after slicing. If disabled, the slices are added in order at the
end of the dataset.