Waymo

class tri3d.datasets.Waymo(root, split='training')

Waymo Open dataset (parquet file format).

Note

LiDAR timestamps are shifted by +0.05 so that they correspond to the middle of the sweep instead of the beginning.

Camera timestamps are the pose timestamps from the raw Waymo data. Other temporal information, such as trigger timestamps, is not exposed.

Waymo.instances2d() returns global ids [1] which are consistent across frames. These ids are not shared with the tracking and 3D annotations.

Waymo.frames() supports an extra sensor name, ‘SEG_LIDAR_TOP’, for which it returns the lidar frames that have 3D segmentation available.
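To illustrate the LiDAR timestamp offset mentioned in the note, the following minimal sketch converts sweep-start timestamps to mid-sweep timestamps. It assumes timestamps expressed in seconds and a 10 Hz sensor, so that half a sweep lasts 0.05; the array values and the names `sweep_start` and `HALF_SWEEP` are invented for the example and are not part of tri3d.

```python
import numpy as np

# Hypothetical sweep-start timestamps (assumed: seconds, 10 Hz sensor).
sweep_start = np.array([0.0, 0.1, 0.2, 0.3])

# Half a sweep period; mirrors the +0.05 offset described in the note.
HALF_SWEEP = 0.05

# Shift each timestamp from the beginning to the middle of its sweep.
sweep_middle = sweep_start + HALF_SWEEP
```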

cam_sensors: list[str] = ['CAM_FRONT', 'CAM_FRONT_LEFT', 'CAM_FRONT_RIGHT', 'CAM_SIDE_LEFT', 'CAM_SIDE_RIGHT', 'CAM_REAR_LEFT', 'CAM_REAR', 'CAM_REAR_RIGHT']: Camera names.

img_sensors: list[str] = ['IMG_FRONT', 'IMG_FRONT_LEFT', 'IMG_FRONT_RIGHT', 'IMG_SIDE_LEFT', 'IMG_SIDE_RIGHT', 'IMG_REAR_LEFT', 'IMG_REAR', 'IMG_REAR_RIGHT']: Camera names (image plane coordinate).

pcl_sensors: list[str] = ['LIDAR_TOP', 'LIDAR_FRONT', 'LIDAR_SIDE_LEFT', 'LIDAR_SIDE_RIGHT', 'LIDAR_REAR']: Point cloud sensor names.

det_labels: list[str] = ['UNKNOWN', 'VEHICLE', 'PEDESTRIAN', 'SIGN', 'CYCLIST']: Detection labels.

sem_labels: list[str] = ['UNDEFINED', 'CAR', 'TRUCK', 'BUS', 'OTHER_VEHICLE', 'MOTORCYCLIST', 'BICYCLIST', 'PEDESTRIAN', 'SIGN', 'TRAFFIC_LIGHT', 'POLE', 'CONSTRUCTION_CONE', 'BICYCLE', 'MOTORCYCLE', 'BUILDING', 'VEGETATION', 'TREE_TRUNK', 'CURB', 'ROAD', 'LANE_MARKER', 'OTHER_GROUND', 'WALKABLE', 'SIDEWALK']: Segmentation labels.
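The label lists can be used to turn integer class ids into readable names. The sketch below counts points per segmentation class. It assumes that class ids are indices into `sem_labels`, which this page does not state explicitly; the `class_ids` array is a made-up stand-in for the output of semantic().

```python
import numpy as np

# Label names copied from the sem_labels attribute documented above.
sem_labels = [
    'UNDEFINED', 'CAR', 'TRUCK', 'BUS', 'OTHER_VEHICLE', 'MOTORCYCLIST',
    'BICYCLIST', 'PEDESTRIAN', 'SIGN', 'TRAFFIC_LIGHT', 'POLE',
    'CONSTRUCTION_CONE', 'BICYCLE', 'MOTORCYCLE', 'BUILDING', 'VEGETATION',
    'TREE_TRUNK', 'CURB', 'ROAD', 'LANE_MARKER', 'OTHER_GROUND', 'WALKABLE',
    'SIDEWALK',
]

# Hypothetical pointwise class ids (assumed to index into sem_labels).
class_ids = np.array([1, 1, 18, 18, 18, 7, 0])

# Count points per class; minlength keeps one bin per label.
counts = np.bincount(class_ids, minlength=len(sem_labels))

# Keep only the classes that actually occur in this frame.
histogram = {sem_labels[i]: int(n) for i, n in enumerate(counts) if n > 0}
```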

sequences(): Return the list of sequence (recording) indices (0..num_sequences).

timestamps(seq, sensor)

Return the frame timestamps for a given sensor.

Parameters:

seq – Sequence index.
sensor – Sensor name.

Returns:

An array of timestamps.

Note

Frames are guaranteed to be sorted by timestamp.
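Because the timestamps are sorted, the frame of one sensor closest in time to a given instant can be found with a binary search. The sketch below is one way to do this with NumPy; `nearest_frame` and the sample `lidar_ts` array are hypothetical and not part of tri3d.

```python
import numpy as np


def nearest_frame(timestamps, query):
    """Return the index of the entry in sorted `timestamps` closest to `query`."""
    timestamps = np.asarray(timestamps)
    if len(timestamps) == 1:
        return 0
    # Binary search: index of the first timestamp >= query.
    i = int(np.searchsorted(timestamps, query))
    # Clamp so that both i - 1 and i are valid indices.
    i = min(max(i, 1), len(timestamps) - 1)
    left, right = timestamps[i - 1], timestamps[i]
    # Pick whichever neighbour is closer (ties go to the earlier frame).
    return i - 1 if query - left <= right - query else i


# Hypothetical sorted timestamps standing in for the output of timestamps().
lidar_ts = np.array([0.05, 0.15, 0.25, 0.35])
closest = nearest_frame(lidar_ts, 0.21)
```

Queries before the first or after the last timestamp fall back to the first or last frame respectively.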


image(seq, frame, sensor)

Return image from given camera at given frame.

A default sensor (for instance a front-facing camera) should be provided for convenience.

Parameters:

seq – Sequence index.
frame – Frame index.
sensor – The image sensor to use.


semantic(seq, frame, sensor)

Return pointwise class annotations.

Parameters:

seq – Sequence index.
frame – Frame index.
sensor – The point cloud sensor for which annotations are returned.

Returns:

An array of pointwise class labels.


instances(seq, frame, sensor)

Return pointwise instance ids.

Parameters:

seq – Sequence index.
frame – Frame index.
sensor – Sensor name.

Returns:

An array of pointwise instance ids.


semantic2d(seq, frame, sensor)

Return pixelwise class annotations.

Parameters:

seq – Sequence index.
frame – Frame index.
sensor – Sensor name.

Returns:

An array of pixelwise class labels.

instances2d(seq, frame, sensor)

Return pixelwise instance annotations.

Background label pixels will contain -1. Other instance ids will follow dataset-specific rules.

Parameters:

seq – Sequence index.
frame – Frame index.
sensor – Sensor name.

Returns:

An array of pixelwise instance ids.
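Since background pixels hold -1, the foreground can be isolated with a simple comparison. The sketch below lists the instance ids present in an image and their pixel counts; the small `instance_map` array is a made-up stand-in for the output of instances2d().

```python
import numpy as np

# Hypothetical 3x4 instance map; -1 marks background pixels.
instance_map = np.array([
    [-1, -1, 7, 7],
    [-1, 12, 7, 7],
    [-1, 12, 12, -1],
])

# Boolean mask of pixels that belong to some instance.
foreground = instance_map >= 0

# Distinct instance ids, background excluded.
instance_ids = np.unique(instance_map[foreground])

# Number of pixels covered by each instance.
pixel_counts = {int(i): int((instance_map == i).sum()) for i in instance_ids}
```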

frames(seq, sensor)

Return the frame indices of a particular sequence for a sensor.

The indices are normally contiguous (i.e., np.arange()).

Parameters:

seq – Sequence index.
sensor – Sensor name.

Returns:

A list of (sequence, frame) index tuples sorted by sequence and frame.

alignment(seq, frame, coords)

Return the transformation from one coordinate system and timestamp to another.

Parameters:

seq (int) – Sequence index.
frame (int | tuple[int, int]) – Either a single frame or a (src, dst) tuple. Frame indices are relative to the sensor timeline specified by coords.
coords (str | tuple[str, str]) – Either a single sensor/coordinate system or a (src, dst) tuple. The transformation also accounts for mismatches in sensor timelines and movement of the ego-car.

Returns:

A transformation that projects points from one coordinate system at one frame to another.

Return type:

Transformation
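Conceptually, a transformation from a source frame to a destination frame can be seen as going from the source sensor to the world and then from the world to the destination sensor. The sketch below shows that composition with plain 4x4 homogeneous matrices in NumPy. It is an illustration of the idea only, not tri3d's implementation or its Transformation API; `pose_matrix` and the pose values are hypothetical.

```python
import numpy as np


def pose_matrix(yaw, translation):
    """Build a 4x4 sensor-to-world matrix from a yaw (radians) and an xyz translation."""
    c, s = np.cos(yaw), np.sin(yaw)
    m = np.eye(4)
    m[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    m[:3, 3] = translation
    return m


# Hypothetical sensor-to-world poses at the source and destination frames.
src_to_world = pose_matrix(0.0, [10.0, 0.0, 0.0])
dst_to_world = pose_matrix(np.pi / 2, [12.0, 0.0, 0.0])

# Source -> world -> destination.
src_to_dst = np.linalg.inv(dst_to_world) @ src_to_world

# A point one unit ahead of the source sensor, in homogeneous coordinates.
point_src = np.array([1.0, 0.0, 0.0, 1.0])
point_dst = src_to_dst @ point_src
```

In this example the point ends up at approximately (0, 1, 0) in the destination frame, because the destination sensor has moved ahead and turned 90 degrees to the left.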

boxes(seq, frame, coords)

Return the 3D box annotations.

This function will interpolate and transform annotations if necessary in order to match the requested coordinate system and timeline.

Parameters:

seq (int) – Sequence index.
frame (int) – Frame index.
coords (str) – The coordinate system and timeline to use.

Returns:

A list of box annotations.

Return type:

Sequence[type[Box]]


points(seq, frame, sensor, coords=None)

Return an array of 3D point coordinates from lidars.

The first three columns contain xyz coordinates; additional columns are dataset-specific.

For convenience, the point cloud can be returned in the coordinate system of another sensor. In that case, frame is understood as the frame for that sensor and the point cloud which has the nearest timestamp is retrieved and aligned.

Parameters:

seq (int) – Sequence index.
frame (int) – Frame index.
sensor (str) – The 3D sensor (generally a LiDAR) to use.
coords (str | None) – The coordinate system and timeline to use. Defaults to the sensor.

Returns:

An NxD array where the first 3 columns are X, Y, Z point coordinates and the remaining ones are dataset-specific.

Return type:

ndarray
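Given this layout, the coordinates and the extra columns can be separated by slicing. The sketch below splits a point cloud and keeps only points within a chosen range of the sensor. The `cloud` array is a made-up stand-in for the output of points(), and the meaning of its two extra columns is deliberately left unspecified since it is dataset-specific.

```python
import numpy as np

# Hypothetical 4x5 point cloud: columns 0-2 are x, y, z; the rest are extras.
cloud = np.array([
    [3.0, 4.0, 0.0, 0.2, 0.0],
    [0.0, 0.0, 2.0, 0.9, 1.0],
    [6.0, 8.0, 0.0, 0.5, 0.0],
    [30.0, 40.0, 0.0, 0.1, 0.0],
])

xyz = cloud[:, :3]    # point coordinates
extra = cloud[:, 3:]  # dataset-specific columns

# Euclidean distance of each point to the sensor origin.
ranges = np.linalg.norm(xyz, axis=1)

# Keep the full rows of points at most 20 units away.
nearby = cloud[ranges <= 20.0]
```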


poses(seq, sensor, timeline=None)

Return all sensor-to-world transforms for a sensor.

"World" refers to an arbitrary coordinate system for a sequence; not all datasets provide an actual global coordinate system.

Parameters:

seq (int) – Sequence index.
sensor (str) – Sensor name.
timeline (str | None) – When specified, the sensor poses will be interpolated to the timestamps of that timeline if necessary.

Returns:

Sensor poses as a batched transform.

Return type:

RigidTransform
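To give an intuition for what interpolating poses onto another timeline involves, the sketch below linearly resamples sensor positions at the timestamps of a second timeline. It covers translations only (rotations would require a spherical interpolation, omitted here) and does not claim to reproduce the scheme tri3d uses; all arrays are invented for the example.

```python
import numpy as np

# Hypothetical sensor positions (xyz) sampled on the sensor's own timeline.
sensor_ts = np.array([0.0, 0.1, 0.2])
sensor_xyz = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [2.0, 0.5, 0.0],
])

# Timestamps of another (hypothetical) timeline to resample onto.
target_ts = np.array([0.05, 0.15])

# Interpolate each coordinate independently along time.
target_xyz = np.stack(
    [np.interp(target_ts, sensor_ts, sensor_xyz[:, k]) for k in range(3)],
    axis=1,
)
```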


rectangles(seq, frame, sensor)

Return a list of 2D rectangle annotations.

Note

The default coordinate system should be documented.

Parameters:

seq (int) – Sequence index.
frame (int) – Frame index, or None to request annotations for the whole sequence.
sensor (str) – Sensor name.

Returns:

A list of 2D annotations.