ZODFrames

class tri3d.datasets.ZODFrames(root, metadata='trainval-frames-mini.json', split='train', anon_method='dnat')[source]

ZOD Frames (Zenseact Open Dataset).

Note

Notable differences with the original ZOD dataset:

  • Lidars are rotated by 90° around Z so that x points forward of the ego car.

  • Boxes are interpolated to all frames; use the timestamps to decide whether they are relevant.

cam_sensors: list[str] = ['front']

Camera names.

img_sensors: list[str] = ['img_front']

Camera names (image plane coordinate).

pcl_sensors: list[str] = ['velodyne']

Point cloud sensor names.

det_labels: list[str] = ['Animal', 'DynamicBarrier', 'Pedestrian', 'PoleObject', 'TrafficBeacon', 'TrafficGuide', 'TrafficSign', 'TrafficSignal', 'Vehicle', 'VulnerableVehicle']

Detection labels.

sem_labels: list[str] = []

Segmentation labels.
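The detection label list above maps naturally to the integer class ids most training pipelines need. A minimal sketch; the `ZODFrames` constructor call is shown as a comment because it requires the dataset on disk:

```python
# Hypothetical setup (requires the ZOD frames data on disk):
#   from tri3d.datasets import ZODFrames
#   dataset = ZODFrames("datasets/zod", "trainval-frames-mini.json", "train")
#   labels = dataset.det_labels

# The det_labels attribute documented above, as a plain list:
labels = [
    "Animal", "DynamicBarrier", "Pedestrian", "PoleObject",
    "TrafficBeacon", "TrafficGuide", "TrafficSign", "TrafficSignal",
    "Vehicle", "VulnerableVehicle",
]

# Map label names to contiguous integer ids for a classification head.
label_to_id = {name: i for i, name in enumerate(labels)}
print(label_to_id["Vehicle"])  # → 8
```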

sequences()[source]

Return the list of sequence/recording indices (0 to num_sequences - 1).

timestamps(seq, sensor)[source]

Return the frame timestamps for a given sensor.

Parameters:
  • seq – Sequence index.

  • sensor – Sensor name.

Returns:

An array of timestamps.

Note

Timestamps are guaranteed to be sorted.

../_images/tri3d.datasets.ZODFrames.timestamps.jpg
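Since timestamps come back sorted, a nearest-frame lookup is a binary search. A sketch with a synthetic stand-in array; the real call is shown as a comment because it needs the dataset on disk:

```python
import numpy as np

# Hypothetical call (needs the dataset on disk):
#   ts = dataset.timestamps(seq=0, sensor="velodyne")
# Synthetic stand-in illustrating the documented sortedness guarantee
# and a nearest-frame lookup for an arbitrary query time.
ts = np.array([0.0, 0.1, 0.2, 0.3, 0.4])
assert np.all(np.diff(ts) >= 0)  # sorted, as the note above guarantees

query = 0.27
i = np.searchsorted(ts, query)  # first index with ts[i] >= query
nearest = i if abs(ts[i] - query) < abs(ts[i - 1] - query) else i - 1
print(nearest)  # → 3
```

This is the kind of lookup needed to decide which interpolated boxes are relevant at a given sensor frame.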
image(seq, frame, sensor)[source]

Return image from given camera at given frame.

A default sensor (for instance a front-facing camera) should be provided for convenience.

Parameters:
  • seq – Sequence index.

  • frame – Frame index.

  • sensor – The image sensor to use.

../_images/tri3d.datasets.ZODFrames.image.jpg
alignment(seq, frame, coords)[source]

Return the transformation from one coordinate system and timestamp to another.

Parameters:
  • seq (int) – Sequence index.

  • frame (int | tuple[int, int]) – Either a single frame or a (src, dst) tuple. The frame is respective to the sensor timeline as specified by coords.

  • coords (str | tuple[str, str]) – Either a single sensor/coordinate system or a (src, dst) tuple. The transformation also accounts for mismatches in sensor timelines and movement of the ego-car.

Returns:

A transformation that projects points from one coordinate system at one frame to another.

Return type:

Transformation
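The returned Transformation maps points between coordinate systems. Its exact API is not shown on this page, so the sketch below uses a plain 4x4 homogeneous matrix to illustrate what such an alignment does; the call and the `apply` method name in the comment are assumptions:

```python
import numpy as np

# Hypothetical call (needs the dataset on disk):
#   tf = dataset.alignment(seq=0, frame=(3, 3), coords=("velodyne", "front"))
#   pts_cam = tf.apply(pts_lidar)   # assumed method name
# Stand-in: a rigid transform as a 4x4 homogeneous matrix, here a
# 90 degree rotation around Z plus a translation.
c, s = np.cos(np.pi / 2), np.sin(np.pi / 2)
T = np.array([
    [c, -s, 0.0, 1.0],
    [s,  c, 0.0, 0.0],
    [0., 0., 1.0, 0.5],
    [0., 0., 0.0, 1.0],
])

pts = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
hom = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous coords
out = (T @ hom.T).T[:, :3]
print(out)  # → [[1, 1, 0.5], [-1, 0, 0.5]] (up to rounding)
```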

boxes(seq, frame, coords)[source]

Return the 3D box annotations.

This function will interpolate and transform annotations if necessary in order to match the requested coordinate system and timeline.

Parameters:
  • seq (int) – Sequence index.

  • frame (int) – Frame index.

  • coords (str) – The coordinate system and timeline to use.

Returns:

A list of box annotations.

Return type:

Sequence[Box]

../_images/tri3d.datasets.ZODFrames.boxes.jpg
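A typical post-processing step is filtering the returned boxes by label and distance. The `Box` attribute names below (`label`, `position`) are assumptions for this sketch, not the library's guaranteed schema:

```python
import numpy as np
from dataclasses import dataclass

# Hypothetical call (needs the dataset on disk):
#   boxes = dataset.boxes(seq=0, frame=3, coords="velodyne")
# Stand-in box type; attribute names are assumptions for illustration.
@dataclass
class Box:
    label: str
    position: tuple  # (x, y, z) center in the requested coordinate system

boxes = [
    Box("Vehicle", (10.0, 2.0, 0.0)),
    Box("Pedestrian", (4.0, -1.0, 0.0)),
    Box("Vehicle", (55.0, 0.0, 0.0)),
]

# Keep vehicles within 50 m of the sensor origin.
near_vehicles = [
    b for b in boxes
    if b.label == "Vehicle" and np.hypot(b.position[0], b.position[1]) < 50.0
]
print(len(near_vehicles))  # → 1
```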
frames(seq, sensor)[source]

Return the frame indices of a particular sequence for a sensor.

The indices are normally contiguous (i.e. equivalent to np.arange(num_frames)).

Parameters:
  • seq (int) – Sequence index.

  • sensor (str) – Sensor name.

Returns:

An array of frame indices for the requested sensor, sorted in increasing order.

Return type:

ndarray
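The contiguity property makes iteration straightforward. A stand-in sketch (the real call needs the dataset on disk):

```python
import numpy as np

# Hypothetical call (needs the dataset on disk):
#   frames = dataset.frames(seq=0, sensor="velodyne")
# Stand-in illustrating the documented property: the indices are
# normally contiguous, i.e. equal to np.arange(num_frames).
frames = np.array([0, 1, 2, 3, 4])
assert np.array_equal(frames, np.arange(len(frames)))

# Iterate sensor-relative frame indices, e.g. to fetch each point cloud:
for frame in frames:
    pass  # e.g. dataset.points(0, int(frame), "velodyne")
print(len(frames))  # → 5
```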

instances(seq, frame, sensor)[source]

Return pointwise instance ids.

Parameters:
  • seq (int) – Sequence index.

  • frame (int) – Frame index.

  • sensor (str)

Returns:

An array of pointwise instance labels.

instances2d(seq, frame, sensor)[source]

Return pixelwise instance annotations.

Background label pixels will contain -1. Other instance ids will follow dataset-specific rules.

Parameters:
  • seq (int) – Sequence index.

  • frame (int) – Frame index.

  • sensor (str)

Returns:

An array of pixelwise instance labels.

points(seq, frame, sensor, coords=None)[source]

Return an array of 3D point coordinates from lidars.

The first three columns contain xyz coordinates; additional columns are dataset-specific.

For convenience, the point cloud can be returned in the coordinate system of another sensor. In that case, frame is understood as the frame for that sensor, and the point cloud with the nearest timestamp is retrieved and aligned.

Parameters:
  • seq (int) – Sequence index.

  • frame (int) – Frame index.

  • sensor (str) – The 3D sensor (generally a LiDAR) to use.

  • coords (str | None) – The coordinate system and timeline to use. Defaults to the sensor.

Returns:

A NxD array where the first 3 columns are X, Y, Z point coordinates and the remaining ones are dataset-specific.

Return type:

ndarray

../_images/tri3d.datasets.ZODFrames.points.jpg
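Slicing out the documented xyz columns is the usual first step when consuming the NxD array. A sketch with a synthetic stand-in (the extra column standing for a dataset-specific field such as intensity is an assumption):

```python
import numpy as np

# Hypothetical call (needs the dataset on disk):
#   pts = dataset.points(seq=0, frame=3, sensor="velodyne")
# Stand-in NxD array: the first three columns are x, y, z; the fourth
# stands in for a dataset-specific field (e.g. intensity).
pts = np.array([
    [10.0, 0.0, -1.5, 0.3],
    [ 2.0, 1.0, -1.4, 0.9],
    [80.0, 5.0,  0.0, 0.1],
])

xyz = pts[:, :3]                    # documented xyz columns
rng = np.linalg.norm(xyz, axis=1)   # per-point range from the sensor
keep = pts[rng < 50.0]              # drop far-away returns
print(len(keep))  # → 2
```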
poses(seq, sensor, timeline=None)[source]

Return all sensor to world transforms for a sensor.

World refers to an arbitrary per-sequence coordinate system; not all datasets provide an actual global coordinate system.

Parameters:
  • seq (int) – Sequence index.

  • sensor (str) – Sensor name.

  • timeline (str | None) – When specified, the sensor poses will be interpolated to the timestamps of that timeline if necessary.

Returns:

Sensor poses as a batched transform.

Return type:

RigidTransform

../_images/tri3d.datasets.ZODFrames.poses.jpg
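Sensor-to-world poses are commonly chained to recover ego motion between frames. The real return is a batched RigidTransform; the 4x4 matrix view below is an assumption made for a self-contained sketch:

```python
import numpy as np

# Hypothetical call (needs the dataset on disk):
#   poses = dataset.poses(seq=0, sensor="velodyne")
# Stand-in: sensor-to-world poses at two frames as 4x4 homogeneous
# matrices (the matrix representation here is an assumption).
pose0 = np.eye(4)
pose1 = np.eye(4)
pose1[:3, 3] = [3.0, 0.0, 0.0]  # ego moved 3 m forward in world frame

# Relative motion from frame 0 expressed in frame 1: inv(pose1) @ pose0.
rel = np.linalg.inv(pose1) @ pose0
displacement = np.linalg.norm(rel[:3, 3])
print(displacement)  # → 3.0
```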
rectangles(seq, frame, sensor)[source]

Return a list of 2D rectangle annotations.

Note

The default coordinate system should be documented.

Parameters:
  • seq (int) – Sequence index.

  • frame (int) – Frame index, or None to request annotations for the whole sequence.

  • sensor (str)

Returns:

A list of 2D annotations.
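Image-plane rectangles are often clipped to the image bounds before use. A minimal helper sketch; the call and any rectangle attribute names are assumptions, since the 2D annotation schema is not documented on this page:

```python
# Hypothetical call (needs the dataset on disk):
#   rects = dataset.rectangles(seq=0, frame=3, sensor="img_front")
# Generic helper: clip an axis-aligned rectangle to the image bounds.
def clip_rect(x0, y0, x1, y1, width, height):
    """Return the rectangle intersected with [0, width] x [0, height]."""
    return (max(x0, 0), max(y0, 0), min(x1, width), min(y1, height))

print(clip_rect(-10, 20, 2000, 900, 1920, 1080))  # → (0, 20, 1920, 900)
```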

semantic(seq, frame, sensor)[source]

Return pointwise class annotations.

Parameters:
  • seq (int) – Sequence index.

  • frame (int) – Frame index.

  • sensor (str) – The sensor for which annotations are returned.

Returns:

An array of pointwise class labels.
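A common consumer of per-point labels is a class histogram. A sketch with a synthetic label array; the convention that unlabeled points carry -1 is an assumption borrowed from the instances2d description above:

```python
import numpy as np

# Hypothetical call (needs the dataset on disk):
#   labels = dataset.semantic(seq=0, frame=3, sensor="velodyne")
# Stand-in per-point class labels; -1 marking unlabeled points is an
# assumption for this sketch.
labels = np.array([0, 0, 2, 1, 2, 2, -1])

# Per-class point counts, ignoring unlabeled points.
valid = labels[labels >= 0]
counts = np.bincount(valid)
print(counts)  # → [2 1 3]
```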

semantic2d(seq, frame, sensor)[source]

Return pixelwise class annotations.

Parameters:
  • seq (int) – Sequence index.

  • frame (int) – Frame index.

  • sensor (str)

Returns:

An array of pixelwise class labels.