Intelligent Driving Fusion Algorithm Research: sparse algorithms, temporal fusion and enhanced planning and control become the trend.
China Intelligent Driving Fusion Algorithm Research Report, 2024 released by ResearchInChina analyzes the status quo and trends of intelligent driving fusion algorithms (including perception, positioning, prediction, planning, decision, etc.), sorts out algorithm solutions and cases of chip vendors, OEMs, Tier1 & Tier2 suppliers and L4 algorithm providers, and summarizes the development trends of intelligent driving algorithms.
In the eight months between Musk's live test drive of FSD V12 Beta in August 2023 and the 30-day free trial of FSD V12 Supervised in March 2024, advanced intelligent driving such as urban NOA became the arena of major OEMs, and application cases for end-to-end algorithms, BEV Transformer algorithms and AI foundation model algorithms multiplied.
1. Sparse algorithms improve efficiency and reduce intelligent driving cost.
At present, most BEV algorithms are dense and consume considerable computing power and storage. Running smoothly at more than 30 frames per second requires expensive computing resources such as the NVIDIA A100, and even then only five to six 2MP cameras can be supported. For 8MP cameras, extremely expensive resources such as multiple H100 GPUs are needed.
The real world is sparse by nature. Sparsification helps sensors reduce noise and improves robustness. In addition, grids inevitably become sparse as distance increases, and a dense network can only be maintained within about 50 meters. By reducing queries and feature interactions, sparse perception algorithms speed up computation and lower storage requirements. This greatly improves the computing efficiency and system performance of the perception model, shortens system latency, extends the range over which perception remains accurate, and eases the impact of vehicle speed.
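The scaling argument can be made concrete with a rough count. The sketch below compares the number of cells in a dense square BEV grid with a fixed budget of sparse queries as the perception range grows. The 0.5 m cell size and the budget of 900 queries are illustrative assumptions, not figures from the report.

```python
def dense_bev_cells(range_m: float, cell_size_m: float) -> int:
    """Cells in a square BEV grid covering +/- range_m around the ego vehicle."""
    cells_per_side = round(2 * range_m / cell_size_m)
    return cells_per_side * cells_per_side


# Hypothetical fixed query budget for a sparse, target-level detector.
NUM_SPARSE_QUERIES = 900

if __name__ == "__main__":
    for range_m in (50, 100, 200):
        cells = dense_bev_cells(range_m, cell_size_m=0.5)
        print(f"range {range_m:>3} m: dense cells = {cells:>7}, "
              f"sparse queries = {NUM_SPARSE_QUERIES}")
```

Doubling the range quadruples the dense cell count, while the sparse query budget stays constant. This is one way to read the claim that sparse algorithms loosen the limit computing power places on perception range.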
![融合算法 1_副本.png](/UpLoads/Article/2024/融合算法%201_副本.png)
Therefore, academia has been shifting from dense grid-based algorithms to sparse target-level algorithms since 2021. After sustained effort, sparse target-level algorithms can now perform almost as well as dense grid-based ones. The industry also keeps iterating on sparse algorithms. Horizon Robotics recently open-sourced Sparse4D, its vision-only algorithm, which ranks first on both the nuScenes vision-only 3D detection and 3D tracking leaderboards.
Sparse4D is a series of algorithms for long-sequence temporal sparse 3D target detection, and falls within multi-view temporal fusion perception technology. In line with the industry trend towards sparse perception, Sparse4D builds a purely sparse fusion perception framework, which makes perception algorithms more efficient and accurate and simplifies perception systems. Compared with dense BEV algorithms, Sparse4D reduces computational complexity, loosens the limit that computing power places on perception range, and outperforms dense BEV algorithms in both perception quality and inference speed.
Another significant advantage of sparse algorithms is that they cut the cost of intelligent driving solutions by reducing dependence on sensors and consuming less computing power. Megvii Technology, for example, has said that through a range of measures (optimizing the BEV algorithm, reducing computing power, removing HD maps, RTK and LiDAR, unifying the algorithm framework, and automating annotation) it has lowered the cost of its intelligent driving solutions based on the PETR series of sparse algorithms by 20%-30% compared with conventional solutions on the market.
2. 4D algorithms offer higher accuracy and make intelligent driving more reliable.
As the sensor configurations of OEMs show, ever more sensors have been installed over the past three years, alongside a growing number of intelligent driving functions and application scenarios. Most urban NOA solutions are equipped with 10-12 cameras, 3-5 radars, 12 ultrasonic radars and 1-3 LiDARs.
![融合算法 2_副本.png](/UpLoads/Article/2024/融合算法%202_副本.png)
As the number of sensors increases, ever more perception data are generated, and improving the utilization of these data is now on the agenda of OEMs and algorithm providers. Although the algorithm details differ slightly from company to company, the general idea behind the current mainstream BEV Transformer solutions is basically the same: conversion from 2D to 3D and then to 4D.
Temporal fusion can greatly improve algorithm continuity. The memory of obstacles helps handle occlusion and allows better perception of speed information, while the memory of road signs can improve driving safety and the accuracy of vehicle behavior prediction. Fusing information from historical frames can improve the perception accuracy of the current object, while fusing information from future frames can verify that accuracy, thereby enhancing the reliability and accuracy of the algorithm.
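As a minimal sketch of how obstacle memory can bridge an occlusion, the example below carries a remembered obstacle position from the previous ego frame into the current one and blends it with a new detection when one is available. The `Track` type, the function names and the blending weight `alpha` are hypothetical and chosen for illustration only. Production systems typically fuse learned features rather than raw positions.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Track:
    x: float  # metres, forward in the ego frame
    y: float  # metres, left in the ego frame


def to_current_frame(track: Track, ego_dx: float, ego_dy: float,
                     ego_dyaw: float) -> Track:
    """Re-express a point from the previous ego frame in the current ego frame.

    ego_dx, ego_dy and ego_dyaw describe the ego motion between the two
    frames, measured in the previous ego frame.
    """
    # Shift the origin to the new ego position, then rotate by -ego_dyaw.
    tx, ty = track.x - ego_dx, track.y - ego_dy
    c, s = math.cos(-ego_dyaw), math.sin(-ego_dyaw)
    return Track(c * tx - s * ty, s * tx + c * ty)


def fuse(memory: Track, detection: Optional[Track], alpha: float = 0.7) -> Track:
    """Blend memory with a new detection; keep the memory if occluded."""
    if detection is None:
        return memory
    return Track(alpha * detection.x + (1 - alpha) * memory.x,
                 alpha * detection.y + (1 - alpha) * memory.y)


if __name__ == "__main__":
    # Obstacle seen 20 m ahead; the ego vehicle then drives 2 m straight on.
    memory = to_current_frame(Track(20.0, 0.0), ego_dx=2.0, ego_dy=0.0, ego_dyaw=0.0)
    print(fuse(memory, None))              # occluded: memory alone is used
    print(fuse(memory, Track(17.0, 0.0)))  # visible: memory and detection blended
```

Compensating for ego motion before fusing is what keeps the remembered obstacle aligned with the current frame. Without it, historical information would be blended in at the wrong location.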
Tesla's Occupancy Network algorithm is a typical 4D algorithm.
![融合算法 3_副本.png](/UpLoads/Article/2024/融合算法%203_副本.png)
Tesla adds height information to the vector space of 2D BEV + temporal information output by the original Transformer algorithm, building a 4D spatial representation of 3D BEV + temporal information. The network runs every 10 ms on the FSD, that is, at 100 FPS, which greatly improves the speed of model detection.
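To illustrate what a "3D + temporal" representation means as a data structure, the sketch below keeps a short history of occupied voxels and reports which voxels stayed occupied across several frames. It is a toy illustration under stated assumptions, not Tesla's implementation: the `OccupancyHistory` class, the voxel indexing and the `min_hits` threshold are all hypothetical.

```python
from collections import deque
from typing import Deque, Dict, Set, Tuple

Voxel = Tuple[int, int, int]  # (x, y, z) grid indices; z carries the height

FRAME_PERIOD_S = 0.010                         # one network run every 10 ms
FRAMES_PER_SECOND = round(1 / FRAME_PERIOD_S)  # 100 runs per second


class OccupancyHistory:
    """Fixed-length history of 3D occupancy frames (the fourth dimension is time)."""

    def __init__(self, max_frames: int) -> None:
        # deque(maxlen=...) drops the oldest frame automatically.
        self.frames: Deque[Set[Voxel]] = deque(maxlen=max_frames)

    def push(self, occupied: Set[Voxel]) -> None:
        self.frames.append(set(occupied))

    def persistent(self, min_hits: int) -> Set[Voxel]:
        """Voxels occupied in at least min_hits of the stored frames."""
        counts: Dict[Voxel, int] = {}
        for frame in self.frames:
            for voxel in frame:
                counts[voxel] = counts.get(voxel, 0) + 1
        return {voxel for voxel, hits in counts.items() if hits >= min_hits}


if __name__ == "__main__":
    history = OccupancyHistory(max_frames=3)
    history.push({(1, 2, 0), (1, 2, 1)})
    history.push({(1, 2, 0)})
    history.push({(1, 2, 0), (5, 5, 0)})
    print(history.persistent(min_hits=2))
```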
![融合算法 4.png](/UpLoads/Article/2024/融合算法%204.png)
3. End-to-end algorithms integrating perception, planning and control enable more anthropomorphic intelligent driving.
Mainstream intelligent driving algorithms have adopted the “BEV+Transformer” architecture, and many innovative perception algorithms have emerged. However, rule-based algorithms still prevail among planning and control algorithms. Some OEMs face technical and practical challenges in both perception and planning & control systems, which are sometimes in a "split" state. In some complex scenarios, the perception module may fail to accurately recognize or understand the environmental information, and the decision module may make incorrect driving decisions due to improper handling of the perception results or algorithm limitations. This restricts the development of advanced intelligent driving to some extent.
UniAD, an end-to-end intelligent driving algorithm jointly released by SenseTime, OpenDriveLab and Horizon Robotics, won the Best Paper award at CVPR 2023. For the first time, UniAD integrates three main tasks (perception, prediction and planning) and six sub-tasks (target detection, target tracking, scene mapping, trajectory prediction, grid prediction and path planning) into a unified Transformer-based end-to-end network framework, yielding a general model that covers the full stack of driving-critical tasks. On the nuScenes real-scene dataset, UniAD delivers the best performance in the field on all tasks, and its prediction and planning results in particular are far better than those of the previous best solution.
A basic end-to-end algorithm takes sensor inputs directly and outputs predictive control, but it is difficult to optimize, because it lacks effective feature communication between network modules and effective interaction between tasks, and has to output results in phases. The decision-oriented integrated perception and decision design proposed by UniAD uses token features for deep fusion along the perception-prediction-decision pipeline, so that the metrics of all decision-oriented tasks improve consistently.
![融合算法 5_副本.png](/UpLoads/Article/2024/融合算法%205_副本.png)
For planning and control, Tesla adopts an interactive search + evaluation model approach, combining conventional search algorithms with artificial intelligence to produce comfortable and effective trajectories:
First, candidate goals are obtained according to lane lines, occupancy networks and obstacles, and decision trees and candidate goal sequences are generated.
Next, trajectories for reaching these goals are constructed in parallel using conventional search and neural networks.
Finally, the interaction between the vehicle and other participants in the scene is predicted to form new trajectories, and the final trajectory is selected after multiple evaluations. Each generated trajectory is scored on collision checks, comfort analysis, the likelihood of the driver taking over and similarity to human driving, and the score decides the strategy to execute.
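The final scoring step can be sketched as a simple ranking over candidate trajectories using the four criteria named above. The `Candidate` fields, the equal weighting and the 5 m/s³ jerk limit are hypothetical illustration choices. The report does not describe how Tesla actually combines these terms.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    name: str
    collides: bool           # result of the collision check
    max_jerk: float          # m/s^3; lower means a more comfortable ride
    takeover_prob: float     # estimated likelihood of driver takeover, 0..1
    human_similarity: float  # similarity to human driving, 0..1


def score(c: Candidate, max_comfortable_jerk: float = 5.0) -> float:
    """Higher is better; colliding trajectories are ruled out entirely."""
    if c.collides:
        return float("-inf")
    comfort = max(0.0, 1.0 - c.max_jerk / max_comfortable_jerk)
    # Equal weights are an assumption made for this sketch.
    return comfort + (1.0 - c.takeover_prob) + c.human_similarity


def select(candidates: List[Candidate]) -> Candidate:
    """Pick the highest-scoring candidate trajectory."""
    return max(candidates, key=score)


if __name__ == "__main__":
    candidates = [
        Candidate("keep_lane", collides=True, max_jerk=0.5,
                  takeover_prob=0.9, human_similarity=0.9),
        Candidate("hard_brake", collides=False, max_jerk=4.0,
                  takeover_prob=0.3, human_similarity=0.4),
        Candidate("nudge_left", collides=False, max_jerk=1.0,
                  takeover_prob=0.1, human_similarity=0.8),
    ]
    print(select(candidates).name)
```

Treating the collision check as a hard constraint rather than a weighted term means no amount of comfort or human-likeness can make an unsafe trajectory win.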
![融合算法 6.png](/UpLoads/Article/2024/融合算法%206.png)
XBrain, the ultimate architecture of Xpeng’s all-scenario intelligent driving, is composed of XNet 2.0, a deep vision neural network, and XPlanner, a neural-network-based planning and control module with the following features:
Rule algorithm
Long time sequence (minute-level)
Multi-object (multi-agent decision, gaming capability)
Strong reasoning
Previous advanced algorithms and ADAS functional architectures were fragmented, consisting of many small logic-based planning and control algorithms for individual sub-scenes, whereas XPlanner has a unified planning and control algorithm architecture. XPlanner is supported by a foundation model and trained in simulation on a large number of extreme driving scenarios, which helps it cope with a wide variety of complex situations.
![融合算法 7.png](/UpLoads/Article/2024/融合算法%207.png)