Export: ONNX
- Export type: ONNX
ONNX export.
n2d2 MobileNet_ONNX.ini -seed 1 -w /dev/null -export ONNX
Principle
The ONNX export generates an ONNX model from an N2D2 model. The generated ONNX model is optimized for inference and can be quantized beforehand with either post-training quantization or Quantization Aware Training (QAT).
Graph optimizations
- Weights are equalized between layers when possible;
- BatchNorm is automatically fused with the preceding Conv or Fc when possible;
- Padding layers are fused with Conv when possible;
- Dropout layers are removed.
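To illustrate what fusing BatchNorm into a preceding Fc means, here is a minimal sketch of the standard folding arithmetic in plain Python. It is not taken from the N2D2 sources: the function names, the per-output parameter layout, and the `eps` value are assumptions made for the example.

```python
import math


def fc(weights, biases, x):
    """Fully connected layer: one weight row and one bias per output."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def batchnorm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-time batch normalization applied per output."""
    return [g * (yi - mu) / math.sqrt(v + eps) + bt
            for yi, g, bt, mu, v in zip(y, gamma, beta, mean, var)]


def fuse_batchnorm_into_fc(weights, biases, gamma, beta, mean, var, eps=1e-5):
    """Fold the BatchNorm scale and shift into the Fc weights and biases.

    Hypothetical helper for illustration; eps is an assumed value.
    """
    fused_w, fused_b = [], []
    for row, b, g, bt, mu, v in zip(weights, biases, gamma, beta, mean, var):
        scale = g / math.sqrt(v + eps)           # per-output scaling factor
        fused_w.append([w * scale for w in row])
        fused_b.append((b - mu) * scale + bt)    # shifted and rescaled bias
    return fused_w, fused_b
```

After fusion, a single `fc` call with the fused weights and biases gives the same result, up to floating-point rounding, as `fc` followed by `batchnorm`, so the BatchNorm node can be dropped from the graph.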
Export parameters
Extra parameters can be passed during export using the -export-parameters params.ini command line argument. The parameters must be saved in an INI-like file.
List of available parameters:
Argument [default value] | Description |
---|---|
ImplicitCasting [0] | If true (1), casting in the graph is implicit and Cast ONNX operators are not inserted |
FakeQuantization [0] | If true (1), the graph is fake quantized, meaning floating-point ONNX operators are used for the computation |
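As a hedged sketch of how an export parameter file might be prepared and passed on the command line, the helpers below write a parameter file and assemble the n2d2 arguments. The `Name=Value` line layout is an assumption based on the "INI-like" description, and both function names are hypothetical; the command line flags are the ones used elsewhere on this page.

```python
from pathlib import Path


def write_export_params(path, params):
    """Write export parameters as Name=Value lines.

    The one-parameter-per-line layout without a section header is an
    assumption about the INI-like format, not a documented guarantee.
    """
    lines = [f"{name}={int(value)}" for name, value in params.items()]
    target = Path(path)
    target.write_text("\n".join(lines) + "\n")
    return target


def build_export_command(model_ini, params_path):
    """Assemble the argument list for an 8-bit ONNX export with extra parameters."""
    return [
        "n2d2", model_ini,
        "-seed", "1",
        "-w", "weights_validation",
        "-export", "ONNX",
        "-nbbits", "8",
        "-calib", "-1",
        "-export-parameters", str(params_path),
    ]
```

For instance, `write_export_params("params.ini", {"ImplicitCasting": True})` followed by `build_export_command("MobileNet_ONNX.ini", "params.ini")` yields an argument list that can be handed to `subprocess.run`.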
The parameters ImplicitCasting and FakeQuantization are useful only for quantized networks. For such networks, a full integer ONNX graph is generated when possible, notably using the ONNX ConvInteger and MatMulInteger operators when -nbbits is ≤ 8. An example of a generated graph is shown below, with a Single-shift activation rescaling mode (-act-rescaling-mode, see Post-training quantization):
By default, strict adherence to the ONNX standard is enforced by adding explicit Cast operators when required. The automatic insertion of Cast operators can be disabled by setting the ImplicitCasting export parameter to true. This results in the simplified graph below:
The FakeQuantization parameter makes it possible to export a quantized network using fake quantization: the parameters of the network are quantized (they take integer values), but they are stored in floating-point and the computation is done with floating-point operators. The output values of the network should nevertheless be almost identical to those obtained when the computation is done in integer. The differences are due to numerical errors, as not all integers can be represented exactly in floating-point.
Note
FakeQuantization, when set, implies ImplicitCasting, as no casting operator is required in a fully floating-point graph.
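The numerical differences mentioned above can be reproduced with a small sketch that emulates single-precision accumulation in plain Python. It assumes the fake-quantized graph computes in 32-bit floats and accumulates sequentially; a real ONNX runtime may use a different accumulation order or precision, and the helper names are hypothetical.

```python
import struct


def to_float32(value):
    """Round a number to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", value))[0]


def float32_accumulate(values):
    """Sum values one by one, rounding to float32 after every addition."""
    acc = 0.0
    for v in values:
        acc = to_float32(acc + to_float32(v))
    return acc


def int_accumulate(values):
    """Exact integer accumulation, as an integer graph would perform it."""
    return sum(int(v) for v in values)


# 1100 products of two 8-bit values (127 * 127): small enough to be exact
# individually, but their running sum exceeds 2**24, where float32 can no
# longer represent every integer.
products = [127 * 127] * 1100
exact_total = int_accumulate(products)
float_total = float32_accumulate(products)
```

Here `to_float32(2**24 + 1)` returns 16777216.0, since 16777217 has no exact single-precision representation, and `float_total` ends up slightly different from `exact_total` for the same reason.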
Example
n2d2 MobileNet_ONNX.ini -seed 1 -w weights_validation -export ONNX -nbbits 8 -calib -1 -db-export 100 -test
This command generates an 8-bit integer quantized ONNX model in the sub-directory export_ONNX_int8.