KITTIObject

class tri3d.datasets.KITTIObject(data_dir, split='training', label_map=None)

KITTI 3D object detection dataset.

The sensor labels are cam and img2 for the left camera, in 3D and image-plane (homogeneous) coordinates respectively, and velo for the lidar.

Note

To match tri3D conventions, box annotations are modified as follows:

  • center is at the geometric center of the box, not its bottom face

  • transform converts from tri3D object coordinates (x forward, z up), not KITTI's (x rightward, z forward)

  • size is (length, width, height), not (height, length, width)
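The conventions above can be illustrated with a small numpy sketch. All values and the exact axis mapping here are hypothetical (chosen only to show the bottom-to-center shift, the size reordering, and the axis permutation), not the library's actual implementation:

```python
import numpy as np

# Hypothetical raw KITTI box (camera coords: x right, y down, z forward).
# KITTI anchors the box at its bottom face and orders size as
# (height, length, width), per the note above.
kitti_bottom_center = np.array([2.0, 1.5, 10.0])  # x, y, z in camera frame
kitti_size = np.array([1.6, 4.0, 1.8])            # height, length, width

# tri3D size order: (length, width, height).
h, l, w = kitti_size
size = np.array([l, w, h])

# Shift from the bottom face to the geometric center: half the height
# upward, i.e. along -y in KITTI camera coordinates (y points down).
center_cam = kitti_bottom_center + np.array([0.0, -h / 2, 0.0])

# Axis permutation from KITTI camera coords (x right, y down, z forward)
# to tri3D-style coords (x forward, y left, z up).
cam_to_tri3d = np.array([
    [0.0,  0.0, 1.0],   # tri3D x <-  camera z (forward)
    [-1.0, 0.0, 0.0],   # tri3D y <- -camera x (left)
    [0.0, -1.0, 0.0],   # tri3D z <- -camera y (up)
])
center = cam_to_tri3d @ center_cam
```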

KITTI objects website

cam_sensors: List[str] = ['cam']

Camera names.

img_sensors: List[str] = ['img2']

Camera names (image-plane coordinates).

pcl_sensors: List[str] = ['velo']

Point cloud sensor names.

det_labels: List[str] = ['Car', 'Van', 'Truck', 'Pedestrian', 'Person_sitting', 'Cyclist', 'Tram', 'Misc', 'Person', 'DontCare']

Detection labels.

sequences()

Return the list of sequence/recording indices (0 to num_sequences - 1).

frames(seq=None, sensor=None)

Return the frames in the dataset or a particular sequence.

Parameters:
  • seq – Sequence index.

  • sensor – Sensor name.

Returns:

A list of (sequence, frame) index tuples sorted by sequence and frame.
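The returned tuples sort naturally, first by sequence and then by frame, so plain tuple operations are enough to work with them. A small sketch with hypothetical index values:

```python
# Hypothetical return value of frames(): (sequence, frame) tuples,
# sorted by sequence and then by frame.
frames = [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1)]

# Select the frames of a single sequence, e.g. sequence 0.
seq0 = [(s, f) for s, f in frames if s == 0]
```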

timestamps(seq, sensor)

Return the frame timestamps for a given sensor.

Parameters:
  • seq – Sequence index.

  • sensor – Sensor name.

Returns:

An array of timestamps.

Note

Frames are guaranteed to be sorted.

poses(seq, sensor, timeline=None)

Return all sensor to world transforms for a sensor.

Here, world refers to an arbitrary per-sequence coordinate system; not all datasets provide an actual global coordinate system.

Parameters:
  • seq – Sequence index.

  • sensor – Sensor name.

  • timeline – When specified, the sensor poses will be interpolated to the timestamps of that timeline if necessary.

Returns:

Sensor poses as a batched transform.
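The timeline argument matters when sensors are not triggered in sync. The sketch below shows the idea with plain numpy, interpolating the translation part of hypothetical poses onto another sensor's timestamps; the timestamps and positions are made up, and the real library interpolates full rigid transforms, not just translations:

```python
import numpy as np

# Hypothetical timestamps of two sensors with offset clocks.
t_velo = np.array([0.0, 0.1, 0.2, 0.3])
t_cam = np.array([0.05, 0.15, 0.25])

# Hypothetical per-frame sensor positions (translation part of the poses):
# the ego-car drives forward along x at 10 m/s.
xyz_velo = np.array([[0.0, 0.0, 0.0],
                     [1.0, 0.0, 0.0],
                     [2.0, 0.0, 0.0],
                     [3.0, 0.0, 0.0]])

# Linearly interpolate each coordinate onto the camera timeline, which is
# what requesting the lidar poses on the camera timeline requires.
xyz_on_cam = np.stack(
    [np.interp(t_cam, t_velo, xyz_velo[:, i]) for i in range(3)], axis=1
)
```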

image(seq, frame, sensor='img2')

Return the image from the given camera at the given frame.

A default sensor (for instance a front-facing camera) should be provided for convenience.

alignment(seq, frame, coords)

Return the transformation from one coordinate system and timestamp to another.

Parameters:
  • seq (int) – The sequence index

  • frame (int | tuple[int, int]) – Either a single frame or a (src, dst) tuple. The frame is respective to the sensor timeline as specified by coords.

  • coords (str | tuple[str, str]) – Either a single sensor/coordinate system or a (src, dst) tuple. The transformation also accounts for mismatches in sensor timelines and movement of the ego-car.

Returns:

A transformation that projects points from one coordinate system at one frame to another.
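Conceptually, such a transformation chains the source sensor's pose at the source frame with the inverse of the destination sensor's pose at the destination frame, which also absorbs ego-motion between the two frames. A minimal numpy sketch with hypothetical poses (identity rotations, made-up translations), not the library's actual code path:

```python
import numpy as np

def make_pose(R, t):
    """Build a 4x4 sensor-to-world homogeneous transform."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Hypothetical poses: lidar at the source frame, camera at the destination
# frame (the ego-car moved 2 m forward in between, camera mounted 1 m up).
pose_velo_src = make_pose(np.eye(3), np.array([0.0, 0.0, 0.0]))
pose_cam_dst = make_pose(np.eye(3), np.array([2.0, 0.0, 1.0]))

# Compose: world <- velo at src, then cam <- world at dst.
velo_to_cam = np.linalg.inv(pose_cam_dst) @ pose_velo_src

p_velo = np.array([5.0, 0.0, 0.0, 1.0])  # homogeneous point in lidar coords
p_cam = velo_to_cam @ p_velo
```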

boxes(seq, frame, coords=None)

Return the 3D box annotations.

This function will interpolate and transform annotations if necessary in order to match the requested coordinate system and timeline.

Parameters:
  • seq (int) – Sequence index.

  • frame (int) – Frame index.

  • coords (str | None) – The coordinate system and timeline to use.

Returns:

A list of box annotations.

instances(seq, frame)

Return pointwise instance annotations.

Parameters:
  • seq (int) – Sequence index.

  • frame (int) – Frame index.

Returns:

An array of pointwise instance labels.

points(seq, frame, sensor=None, coords=None)

Return an array of 3D point coordinates from lidars.

The first three columns contain xyz coordinates; additional columns are dataset-specific.

Parameters:
  • seq (int)

  • frame (int)

  • sensor (str | None)

  • coords (str | None)
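Since only the first three columns are guaranteed to be xyz, downstream code should slice them off explicitly. A small sketch on a hypothetical array (for KITTI's velodyne, the fourth column holds reflectance, but the values here are made up):

```python
import numpy as np

# Hypothetical point cloud: xyz in the first three columns, plus one
# dataset-specific column.
pts = np.array([[1.0, 2.0, 3.0, 0.5],
                [4.0, 5.0, 6.0, 0.9]])

xyz = pts[:, :3]     # coordinates, always present
extra = pts[:, 3:]   # dataset-specific columns, possibly empty

# Example use of the coordinates: distance to the sensor origin.
d = np.linalg.norm(xyz, axis=1)
```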

rectangles(seq, frame)

Return a list of 2D rectangle annotations.

Note

The default coordinate system should be documented.

Parameters:
  • seq (int) – Sequence index.

  • frame (int) – Frame index, or None to request annotations for the whole sequence.

Returns:

A list of 2D annotations.

semantic(seq, frame)

Return pointwise class annotations.

Parameters:
  • seq (int) – Sequence index.

  • frame (int) – Frame index.

Returns:

An array of pointwise class labels.
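The returned labels index into sem_labels, so standard numpy operations cover the common uses. A sketch with hypothetical label values:

```python
import numpy as np

# Hypothetical pointwise class labels, as indices into sem_labels.
labels = np.array([0, 0, 3, 3, 3, 1])

# Points per class, and a boolean mask selecting one class.
counts = np.bincount(labels, minlength=4)
mask = labels == 3
```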

sem_labels: List[str]

Segmentation labels.