Introduction to Apollo Perception Module
In the last article, the general data processing chain inside a Cyber Component was introduced in detail. We know that the customized initialization and data processing of a perception component (for example, the Lidar detection component) are implemented in its Init() and Proc() member functions. In this article I would like to make the perception process clear, without going into the algorithm details.
People usually first want to see where the deep learning models are invoked in perception; actually there are several levels to go through before they appear. Like the last article, I will begin from a dag file, from which the perception is launched. Under the /modules/perception/production/dag path there are several dag files, each of which combines different perception components into a module. Each perception component class is a child class of Cyber::Component. All perception components are defined under the /modules/perception/onboard path. Here I will take dag_streaming_perception_lgsvl.dag as an example, which includes the following perception components:
- LidarDetectionComponent
- LidarTrackingComponent
- RadarDetectionComponent
- MultiSensorFusionComponent
At the time of writing, the code structure of the perception process has not been unified across the different sensors yet, especially the radar perception part, so I will address them separately.
Init() of a perception component
I. Lidar perception component
First, take a look at the structure of the LidarDetectionComponent class.
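Reconstructed from the members discussed later in this article (writer_, the detection pipeline and the InternalProc() interface), its skeleton looks approximately like this; the actual Apollo source contains more members, and the config proto type name here is an assumption:

```cpp
// Approximate skeleton of LidarDetectionComponent; not verbatim Apollo code.
class LidarDetectionComponent : public cyber::Component<drivers::PointCloud> {
 public:
  bool Init() override;
  bool Proc(const std::shared_ptr<drivers::PointCloud>& message) override;

 private:
  bool InternalProc(
      const std::shared_ptr<const drivers::PointCloud>& in_message,
      const std::shared_ptr<LidarFrameMessage>& out_message);

  std::string sensor_name_;                // "velodyne128" in this example
  PipelineConfig lidar_detection_config_;  // assumed proto type name
  std::shared_ptr<lidar::LidarObstacleDetection> lidar_detection_pipeline_;
  std::shared_ptr<cyber::Writer<LidarFrameMessage>> writer_;
};
```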
From the config section of LidarDetectionComponent in the dag file, we can see the config file path for this component:

```
config_file_path: "/apollo/modules/perception/production/conf/perception/lidar/velodyne128_detection_conf.pb.txt"
```
This file contains the following config items:

```
sensor_name: "velodyne128"
```
In LidarDetectionComponent::Init(), it will:

1. Load the config items above with GetProtoConfig() into the config-related member variables
2. Create a Writer<LidarFrameMessage> object writer_
3. Load configs with GetProtoFromFile() from lidar_detection_conf_file: "lidar_detection_pipeline.pb.txt" into lidar_detection_config_
4. Create a LidarObstacleDetection object lidar_detection_pipeline_
5. Initialize lidar_detection_pipeline_ with lidar_detection_config_ from step 3 (see the sketch after this list)
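Put together, a minimal C++ sketch of this Init() could look like the following; the names follow the article, while the config field names used here (for example the output channel name) are assumptions, and the real Apollo code adds more error handling:

```cpp
// Hedged sketch of LidarDetectionComponent::Init(); not verbatim Apollo code.
bool LidarDetectionComponent::Init() {
  LidarDetectionComponentConfig comp_config;  // hypothetical proto type name
  if (!GetProtoConfig(&comp_config)) {        // step 1: load component config
    return false;
  }
  sensor_name_ = comp_config.sensor_name();

  // step 2: writer that publishes LidarFrameMessage to the next component
  writer_ = node_->CreateWriter<LidarFrameMessage>(
      comp_config.output_channel_name());     // channel name is an assumption

  // step 3: load "lidar_detection_pipeline.pb.txt" into the config member
  ACHECK(cyber::common::GetProtoFromFile(
      comp_config.lidar_detection_conf_file(), &lidar_detection_config_));

  // steps 4 & 5: create the pipeline object and initialize it
  lidar_detection_pipeline_ = std::make_shared<lidar::LidarObstacleDetection>();
  return lidar_detection_pipeline_->Init(lidar_detection_config_);
}
```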
Let's take a look at the file "lidar_detection_pipeline.pb.txt" (excerpt):

```
pipeline_type: LIDAR_DETECTION
```
It defines 7 stages; these stage objects will be created during the initialization process starting from:

```
LidarDetectionComponent::Init()
```
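The stage creation itself is buried a few calls below Init(). As a purely illustrative sketch of the pattern (the actual Apollo pipeline/factory code differs in detail), the pipeline initialization iterates over the configured stages, creates each one via a factory and initializes it:

```cpp
// Illustrative only: how a Pipeline could create its configured stages.
bool Pipeline::Initialize(const PipelineConfig& pipeline_config) {
  for (const auto& stage_config : pipeline_config.stage_config()) {
    // each stage class (e.g. CNNSegmentation) registers itself in a factory
    std::shared_ptr<Stage> stage_ptr = CreateStage(stage_config.stage_type());
    if (stage_ptr == nullptr || !stage_ptr->Init(stage_config)) {
      return false;
    }
    stage_ptrs_.push_back(stage_ptr);  // executed in order by Process()
  }
  return true;
}
```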
Each stage has its own class definition and is a child class of Stage. In this example config file, CNNSegmentation is configured as the core detector. Its Init() will prepare the model:

1. Load configs from its stage_config section in the "lidar_detection_pipeline.pb.txt" file, which includes 4 files:

```
param_file: "/apollo/modules/perception/production/data/perception/lidar/models/cnnseg/cnnseg128_param.conf"
proto_file: "/apollo/modules/perception/production/data/perception/lidar/models/cnnseg/cnnseg128_caffe/deploy.prototxt"
weight_file: "/apollo/modules/perception/production/data/perception/lidar/models/cnnseg/cnnseg128_caffe/deploy.caffemodel"
engine_file: "/apollo/modules/perception/production/data/perception/lidar/models/cnnseg/cnnseg128_caffe/engine.conf"
```

2. Load the model parameters from the param_file above, which contains among others:

```
.....
model_type: "RTNet"
.....
ground_detector: "SpatioTemporalGroundDetector"
roi_filter: "HdmapROIFilter"
remove_ground_points: true
network_param { ... }
feature_param { ... }
```

3. Create the inference object inference_ according to the model_type config, here RTNet
4. Initialize inference_, which will create the network and the input & output blobs with TensorRT APIs
5. Initialize the feature generator feature_generator_; the features are the input of the inference network
Besides RTNet, Apollo also provides the CaffeNet, PaddleNet, TorchDet, OnnxObstacleDetector and MINet network classes. They are all inherited from the Inference class, and all of them have the core member variables blobs_, input_names_ and output_names_ for the model network input and output, plus the core member function Infer(). The function CreateInferenceByName works as an inference factory: it creates a corresponding inference object according to the model type config. At the time of writing this article, the other lidar detector classes CenterPoint, MaskPillarsDetection, PointPillarsDetection, NCutSegmentation and the radar detector class ContiArsDetector are not using this function yet.
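As a hedged usage sketch (the parameter order of CreateInferenceByName is my assumption, not verbatim from the Apollo headers):

```cpp
#include <memory>
#include <string>
#include <vector>

// Assumed factory-based construction of an inference backend; the blob names
// ("data", "confidence") are placeholders, not the real CNNSeg blob names.
std::shared_ptr<inference::Inference> BuildBackend(
    const std::string& model_type,    // e.g. "RTNet", "CaffeNet", "PaddleNet"
    const std::string& proto_file,    // network definition
    const std::string& weight_file) { // trained weights
  std::vector<std::string> input_names{"data"};
  std::vector<std::string> output_names{"confidence"};
  std::shared_ptr<inference::Inference> net(inference::CreateInferenceByName(
      model_type, proto_file, weight_file, output_names, input_names));
  return net;  // caller then calls net->Init(shapes) and later net->Infer()
}
```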
II. Radar perception component
Unlike the LidarDetectionComponent class, the RadarDetectionComponent class doesn't have a pipeline member variable (of the Pipeline class) that contains a series of stages.
In RadarDetectionComponent::Init(), it will create and initialize the following 3 member variables (a C++ sketch follows the block):

```
1. hdmap_input_->Init()
2. radar_preprocessor_->Init()
3. radar_perception_->Init(pipeline_name_) which includes:
       detector_->Init()
       roi_filter_->Init()
       tracker_->Init()
```
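As a hedged sketch in C++ (the registerer/factory names follow the common Apollo pattern but are not verified against the exact source; the config proto type name is hypothetical):

```cpp
// Hedged sketch of RadarDetectionComponent::Init(); not verbatim Apollo code.
bool RadarDetectionComponent::Init() {
  RadarComponentConfig comp_config;  // hypothetical proto type name
  if (!GetProtoConfig(&comp_config)) {
    return false;
  }

  // 1. HD map singleton, needed later by the ROI filter
  hdmap_input_ = map::HDMapInput::Instance();
  ACHECK(hdmap_input_->Init());

  // 2. preprocessor selected by config, e.g. "ContiArsPreprocessor"
  radar_preprocessor_.reset(
      radar::BasePreprocessorRegisterer::GetInstanceByName(
          comp_config.radar_preprocessor_method()));
  ACHECK(radar_preprocessor_->Init());

  // 3. perception wrapper holding detector_, roi_filter_ and tracker_
  radar_perception_.reset(
      radar::BaseRadarObstaclePerceptionRegisterer::GetInstanceByName(
          comp_config.radar_perception_method()));
  return radar_perception_->Init(comp_config.radar_pipeline_name());
}
```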
There are two config sections for RadarDetectionComponent in the dag file: "FrontRadarDetection" and "RearRadarDetection". Taking the first one as an example, we can see the config file path for this component:

```
config_file_path: "/apollo/modules/perception/production/conf/perception/radar/front_radar_component_conf.pb.txt"
```
This file contains the following config items:

```
radar_name: "radar_front"
tf_child_frame_id: "radar_front"
radar_forward_distance: 200.0
radar_preprocessor_method: "ContiArsPreprocessor"
radar_perception_method: "RadarObstaclePerception"
radar_pipeline_name: "FrontRadarPipeline"
odometry_channel_name: "/apollo/localization/pose"
output_channel_name: "/perception/inner/PrefusedObjects"
```

radar_pipeline_name: "FrontRadarPipeline" points to the radar detection pipeline config, which is located at /apollo/modules/perception/production/conf/perception/radar/front_radar_pipeline.config. The file begins as follows:

```
model_configs {
  ...
}
```
III. Lidar tracking component and Multi-Sensor fusion component
These two components have the same structure and initialization process as the Lidar detection component, so they will not be illustrated again here.
Proc() of a perception component
The implementations of the Proc(message) member function for the above perception component classes follow a similar process; basically they do the following two things (sketched below):
- InternalProc(in_message, out_message)
- writer_->Write(out_message)
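Taking the lidar detection component as an example, a minimal sketch of this common pattern might look as follows; the real implementation adds timing statistics and more status checks, so treat this as an outline rather than the actual code:

```cpp
// Hedged sketch of the common Proc() pattern; not verbatim Apollo code.
bool LidarDetectionComponent::Proc(
    const std::shared_ptr<drivers::PointCloud>& message) {
  auto out_message = std::make_shared<LidarFrameMessage>();
  if (!InternalProc(message, out_message)) {  // run the detection pipeline
    return false;
  }
  writer_->Write(out_message);  // publish to the downstream component
  return true;
}
```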
As LidarDetectionComponent, LidarTrackingComponent and MultiSensorFusionComponent have a similar pipeline and stage structure, their InternalProc() implementations also run through a similar process inside:

```
InternalProc(in_message, out_message)
-> lidar_detection_pipeline_->Process(data_frame)
   -> Pipeline::InnerProcess(data_frame)
      -> for loop: stage_ptr->Process(frame)
```

CNNSegmentation is a stage in the lidar detection pipeline; its Process() executes the following steps:

```
CNNSegmentation::Process(DataFrame* data_frame)
-> Detect(options, lidar_frame)
   -> feature_generator_->Generate(original_cloud_, point2grid_)  // generate features
   + inference_->Infer()                                          // model inference
   + GetObjectsFromSppEngine(&frame->segmented_objects)           // post-processing (clustering)
```
The other lidar detectors CenterPoint, MaskPillarsDetection and PointPillarsDetection have similar steps in Process():

```
CenterPointDetection::Process(DataFrame* data_frame)
-> Detect(options, lidar_frame)
   -> DoInference(points_data, num_points, &out_detections, &out_labels, &out_scores)
   + FilterScore(&out_detections, &out_labels, &out_scores, FLAGS_score_threshold, &out_detections_final, &out_labels_final)
   + GetObjects(&frame->segmented_objects, frame->lidar2world_pose, &out_detections_final, &out_labels_final)
```

RadarDetectionComponent is different: its current InternalProc() works in the following way:

```
1. radar_preprocessor_->Preprocess(raw_obstacles, preprocessor_options, &corrected_obstacles)
2. radar_perception_->Perceive(corrected_obstacles, options, &radar_objects) which includes:
       detector_->Detect(corrected_obstacles, options.detector_options, detect_frame_ptr)
       roi_filter_->RoiFilter(options.roi_filter_options, detect_frame_ptr)
       tracker_->Track(*detect_frame_ptr, options.track_options, tracker_frame_ptr)
```

From the InternalProc function interfaces, we can see the input and output message data types of each component:
- LidarDetectionComponent: bool InternalProc(const std::shared_ptr<const drivers::PointCloud>& in_message, const std::shared_ptr<LidarFrameMessage>& out_message)
- LidarTrackingComponent: bool InternalProc(const std::shared_ptr<const LidarFrameMessage>& in_message, const std::shared_ptr<SensorFrameMessage>& out_message)
- RadarDetectionComponent: bool InternalProc(const std::shared_ptr<ContiRadar>& in_message, std::shared_ptr<SensorFrameMessage> out_message)
- MultiSensorFusionComponent: bool InternalProc(const std::shared_ptr<SensorFrameMessage const>& in_message, std::shared_ptr<PerceptionObstacles> out_message, std::shared_ptr<SensorFrameMessage> viz_message)
So LidarDetectionComponent takes PointCloud data as input and outputs the LidarFrameMessage data type, which contains LidarFrame type data:

```cpp
struct LidarFrame {
  std::shared_ptr<base::AttributePointCloud<base::PointF>> cloud;
  std::shared_ptr<base::AttributePointCloud<base::PointD>> world_cloud;
  double timestamp = 0.0;
  std::shared_ptr<base::HdmapStruct> hdmap_struct = nullptr;
  std::vector<std::shared_ptr<base::Object>> segmented_objects;
  std::vector<std::shared_ptr<base::Object>> tracked_objects;
  base::PointIndices roi_indices;
  base::PointIndices non_ground_indices;
  base::PointIndices secondary_indices;
  base::SensorInfo sensor_info;
};
```

LidarDetectionComponent actually writes its output to segmented_objects in the LidarFrame. LidarTrackingComponent first writes to tracked_objects in the LidarFrame and then transfers the result to another, more general data type, SensorFrameMessage, which holds the output for the next fusion step. The SensorFrameMessage data type contains Frame type data, whose core is objects, i.e. the object info:
```cpp
struct alignas(16) Frame {
  SensorInfo sensor_info;
  double timestamp = 0.0;
  std::vector<std::shared_ptr<Object>> objects;
  Eigen::Affine3d sensor2world_pose;
  // sensor-specific frame supplements
  LidarFrameSupplement lidar_frame_supplement;
  RadarFrameSupplement radar_frame_supplement;
  CameraFrameSupplement camera_frame_supplement;
  UltrasonicFrameSupplement ultrasonic_frame_supplement;
};
```

RadarDetectionComponent is a little bit different in its input data type. ContiRadar is the raw message received from the Conti radar sensor; it can be found in the modules/common_msgs/sensor_msgs/conti_radar.proto file. Notice that it doesn't contain raw point cloud data like the Lidar detection input, but an obstacle array ContiRadarObs coming directly from the radar. Looking at the core function of the radar detector, ContiArsDetector::Detect(), it actually converts this obstacle info into the Frame data format by invoking RawObs2Frame().
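A hedged sketch of that Detect(), based on the description above (simplified; the option and member names not mentioned in this article are assumptions):

```cpp
// Sketch: the radar "detector" performs no model inference; it only converts
// the radar-reported obstacles into the generic Frame format.
bool ContiArsDetector::Detect(const drivers::ContiRadar& corrected_obstacles,
                              const DetectorOptions& options,
                              base::FramePtr detected_frame) {
  RawObs2Frame(corrected_obstacles, options, detected_frame);  // fill objects
  detected_frame->timestamp = corrected_obstacles.header().timestamp_sec();
  detected_frame->sensor2world_pose = *(options.radar2world_pose);
  return true;
}
```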
The output data type PerceptionObstacles from MultiSensorFusionComponent can be found in the perception_obstacle.proto file; it is the final output of the perception module.