1. The development of autonomous driving is gradually driven by data rather than technology
Today, autonomous driving sensor solutions and computing platforms have become increasingly homogeneous, and the technology gap between suppliers is narrowing. In the past two years, the iteration of autonomous driving technology has advanced rapidly, and mass production has accelerated. According to ResearchInChina, a total of 4.79 million passenger cars with L2 assisted driving were insured in China in 2021, a year-on-year increase of 58.0%. From January to June 2022, the penetration rate of L2 assisted driving in the Chinese new passenger car market climbed to 32.4%.
For autonomous driving, data runs through the entire life cycle, from R&D, testing and mass production to operation and maintenance. As the number of sensors in intelligent connected vehicles swells, the amount of data generated by ADAS and autonomous vehicles is growing exponentially, from gigabytes to terabytes, petabytes, exabytes, and even zettabytes in the future. The evolution of data-driven vehicles can meet the personalized demands of users and facilitate the long-term development of automakers.
According to the "Safety Guidelines for Processing of Data Collected by Automobiles", the data collected by automobiles refers to the data collected by automotive sensors and control units, as well as additional data generated after the aforementioned data is processed, including out-of-vehicle data, cockpit data, operation data, position data, trajectory data, etc.
The “Several Provisions on Management of Automobile Data Security (Draft)” issued by the Cyberspace Administration of China in August 2021 details regulations for the collection, analysis, storage, transmission, query, application, deletion, etc. of automobile data. It requires that automobile data processing adhere to the principles of "in-vehicle processing", "data should not be collected by default", "applicable accuracy range", "desensitization processing" and so on, so as to reduce the disorderly collection and illegal abuse of automobile data. During the development of autonomous driving technology, data collection and processing must be legal and compliant.
Data collection/cleaning
The massive unstructured data (images, video, speech) collected by automotive cameras, radar, LiDAR, and ultrasonic sensors can be raw and messy. To make it meaningful, it should be cleaned, structured, and organized. First, data from multiple sources is imported into appropriate repositories, its formats are standardized, and it is aggregated according to relevant rules. Then, checks are run to detect corrupt, duplicated, or missing data points, and data that might degrade the overall quality of the dataset is discarded. Finally, labels are used to classify videos captured under different conditions, such as daytime, night, sunny weather, rain, etc. This step yields the cleaned, structured data that will be used for training and validation.
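The cleaning steps described above can be illustrated with a minimal Python sketch. The `Clip` record, its fields, and the 06:00-18:00 daytime rule are assumptions made for illustration only and do not come from any particular toolchain.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Clip:
    """Hypothetical metadata record for one recorded video clip."""
    clip_id: str
    checksum: str                 # content hash, used to spot duplicates
    duration_s: Optional[float]   # None models a missing field
    hour: int                     # local hour of capture, 0-23
    weather: str                  # e.g. "sunny", "rain"


def clean(clips: List[Clip]) -> List[Clip]:
    """Drop corrupt, missing, and duplicated clips, keeping the first copy."""
    seen = set()
    kept = []
    for clip in clips:
        if clip.duration_s is None or clip.duration_s <= 0:
            continue  # missing or corrupt duration
        if clip.checksum in seen:
            continue  # duplicate content
        seen.add(clip.checksum)
        kept.append(clip)
    return kept


def condition_tags(clip: Clip) -> Tuple[str, str]:
    """Tag a clip by capture condition (assumed rule: 06:00-18:00 is daytime)."""
    time_of_day = "daytime" if 6 <= clip.hour < 18 else "night"
    return (time_of_day, clip.weather)


raw = [
    Clip("a", "h1", 12.0, 9, "sunny"),
    Clip("b", "h1", 12.0, 9, "sunny"),   # duplicate of "a"
    Clip("c", "h2", None, 22, "rain"),   # missing duration
    Clip("d", "h3", 8.5, 22, "rain"),
]
cleaned = clean(raw)
tagged = {clip.clip_id: condition_tags(clip) for clip in cleaned}
```

In this toy input, only clips "a" and "d" survive cleaning, and each carries a (time of day, weather) tag that can later be used to balance training and validation sets.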
Data annotation
The structured data that are cleaned after data collection should be labeled. Labeling is the process of assigning encoded values to raw data. Encoded values include, but are not limited to, assigning class labels, drawing bounding boxes, and marking object boundaries. High-quality annotation is needed to teach supervised learning models what objects are and to measure the performance of trained models.
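As a rough illustration of how a bounding-box annotation can be used to measure a trained model, the sketch below computes intersection over union (IoU), a commonly used overlap metric, between a labeled box and a predicted box. The `Box` structure and the sample coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical 2D bounding-box annotation in pixel coordinates."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    inter_h = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # boxes do not overlap
    inter = inter_w * inter_h
    return inter / (a.area() + b.area() - inter)


ground_truth = Box("pedestrian", 0, 0, 10, 10)   # human annotation
prediction = Box("pedestrian", 5, 0, 15, 10)     # model output
score = iou(ground_truth, prediction)
```

Here the two boxes share half of their width, so the overlap is 50 of 150 total pixels and the IoU is one third. The lower the annotation quality, the less such a score says about the model itself.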
In the field of autonomous driving, data annotation usually covers scenarios such as vehicles changing lanes to overtake, passing through intersections, turning left or right without traffic light control, running red lights and parking illegally on roadsides, as well as pedestrians jaywalking.
Popular annotation tools support general image bounding boxes, lane line annotation, driver face annotation, 3D point cloud annotation, 2D/3D fusion annotation, panoramic semantic segmentation, etc. Driven by the development of big data and the surge in the number of large datasets, data annotation tools are being used more and more widely.
Data transmission
Nowadays, data collection occurs every few milliseconds and requires high-precision data across thousands of signal dimensions (such as bus signals, the internal state of sensors, software instrumentation, user behaviors, and environmental perception data). At the same time, data loss, disorder, jumps and delays must be avoided, and transmission and storage costs must be greatly reduced without sacrificing precision or quality. Because the uplink and downlink of IoV data are long (from the automotive MCU, DCU and gateways through 4G/5G to the cloud), transmission quality has to be ensured at every node along the link.
In response to new changes in data transmission, some companies have been able to provide efficient data acquisition and vehicle-cloud integrated transmission solutions. For example, EXCEEDDATA’s flexible data acquisition platform solution performs real-time computation on live data in the automotive computing environment at 10-millisecond granularity to trigger flexible data collection and upload. Because the data is computed and filtered first, the amount of uploaded data is significantly reduced. In addition, the original signals are losslessly compressed by a factor of 100-300 and stored on the vehicle. The cloud management platform saves lossless high-quality vehicle signals at a high compression ratio, and supports the issuance of data acquisition algorithms, the triggering of multiple acquisition modes, and the one-click download of acquired data uploaded to the business desktop in real time. The data can be flexibly filtered by vehicle, event, time, etc., and storage and calculation are separated, realizing a closed loop of collection-calculation-upload-processing of vehicle-cloud isomorphic data. In 2021, HiPhi X became China's first production model equipped with EXCEEDDATA’s solution.
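The general idea of trigger-based collection can be sketched as follows. This is a simplified illustration and not EXCEEDDATA's implementation: the `TriggeredRecorder` class, the `hard_brake` rule, and the 6.0 m/s² threshold are all assumptions. The vehicle keeps only a short rolling window of recent samples and uploads that window when a trigger condition fires.

```python
from collections import deque


class TriggeredRecorder:
    """Keeps a rolling window of samples and uploads it when a trigger fires."""

    def __init__(self, trigger, pre_samples):
        self.trigger = trigger                    # callable: sample -> bool
        self.buffer = deque(maxlen=pre_samples)   # oldest samples fall out
        self.uploads = []                         # stands in for the cloud uplink

    def on_sample(self, sample):
        self.buffer.append(sample)
        if self.trigger(sample):
            # Upload the window ending at the triggering sample, then reset.
            self.uploads.append(list(self.buffer))
            self.buffer.clear()


def hard_brake(sample):
    """Hypothetical trigger: deceleration of at least 6.0 m/s^2."""
    return sample["decel_mps2"] >= 6.0


recorder = TriggeredRecorder(hard_brake, pre_samples=3)
decelerations = [0.5, 0.4, 0.6, 1.0, 7.2, 0.3, 0.2]
stream = [{"t_ms": 10 * i, "decel_mps2": d} for i, d in enumerate(decelerations)]
for sample in stream:
    recorder.on_sample(sample)

uploaded_samples = sum(len(window) for window in recorder.uploads)
```

In this toy stream of seven samples at 10 ms spacing, only the three samples around the braking event are uploaded, which is the effect the filtering step aims for.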
Data storage
In order to perceive the surrounding environment more clearly, autonomous vehicles carry more sensors and generate massive data. Some high-level autonomous driving systems are even equipped with more than 40 assorted sensors to accurately perceive the 360° environment around vehicles. The R&D of autonomous driving systems has to go through multiple links such as data collection, data aggregation, cleaning and annotation, model training, simulation, big data analysis, etc. It involves the aggregation and storage of massive data, the flow of data between the systems of different links, and the reading and writing of massive data during model training. Storage bottlenecks therefore pose new challenges.
In this regard, the technology and capabilities of many cloud service providers have become the key to automakers. For example, Amazon Web Services (AWS) offers cloud computing services. AWS is centered on the autonomous driving data lake, helping automakers build an end-to-end autonomous driving data closed loop. Automakers can exploit Amazon Simple Storage Service (Amazon S3) to build an autonomous driving data lake so as to realize data collection, data management and analysis, data annotation, model and algorithm development, simulation verification, map development, DevOps and MLOps, as well as to conduct development, testing and application of autonomous driving easily.
For example, Baidu's data closed loop solution provides data retrieval services for multi-source data information of roadsides and vehicles, which are used for massive data search on business platforms, with advantages like multi-dimensional retrieval (vehicle information, mileage, autonomous driving duration, etc.), management of the entire life cycle from data production to destruction, support for panoramic data views, data traceability, data openness and sharing.
2. The efficient development of autonomous driving requires construction of a data closed loop system
The development of autonomous driving is gradually driven by data rather than technology. However, data-driven business models face many difficulties.
Difficulty in processing massive data: High-level autonomous driving test vehicles collect terabytes of data every day, so development teams need petabytes of storage space. However, less than 5% of the data is valuable enough to be used for training. In addition, there are strict security compliance requirements for the data collected by sensors such as automotive cameras, LiDAR, and high-precision sensors, which undoubtedly poses great challenges to the access, storage, desensitization, and processing of massive data.
High data annotation cost: Data annotation costs a lot of labor and time. As the capabilities of autonomous driving advance, scenarios are becoming more and more complex, and difficult scenarios will occur more often. Improving the accuracy of vehicle perception models places higher requirements on the scale and quality of training datasets. In terms of efficiency and cost, traditional manual annotation can no longer meet the demand of model training for massive datasets.
Low simulation test efficiency: Virtual simulation is an effective means to accelerate the training of autonomous driving algorithms, but simulation scenarios, especially complex and dangerous ones, are difficult to construct and reproduce reality with low fidelity. Together with insufficient parallel simulation capability, this makes simulation tests inefficient and algorithm iteration cycles too long.
Limited coverage of HD maps: HD maps mainly rely on self-collected data and in-house mapping, and in the experimental stage only cover designated roads. In the future, commercial HD maps will face prominent challenges in coverage, dynamic update, cost and efficiency when extending to urban streets in major cities across the country.
To address these difficulties, the efficient development of autonomous driving requires the construction of an efficient data closed loop system.
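To give a sense of the scale behind the data processing difficulty, the following back-of-the-envelope sketch estimates yearly storage for a test fleet. The fleet size, the 4 TB per vehicle per day, and the 250 collection days are purely illustrative assumptions. Only the "terabytes per day" order of magnitude and the 5% ceiling on valuable data come from the text above.

```python
def yearly_storage_pb(vehicles, tb_per_vehicle_day, days, useful_fraction):
    """Return (raw petabytes, valuable petabytes), with 1 PB = 1000 TB."""
    raw_tb = vehicles * tb_per_vehicle_day * days
    raw_pb = raw_tb / 1000
    return raw_pb, raw_pb * useful_fraction


# Hypothetical fleet: 50 test vehicles, 4 TB per vehicle per day, 250 days a year.
raw_pb, useful_pb = yearly_storage_pb(
    vehicles=50, tb_per_vehicle_day=4, days=250, useful_fraction=0.05
)
print(f"raw: {raw_pb:.1f} PB, valuable (upper bound): {useful_pb:.1f} PB")
```

Under these assumptions the fleet produces about 50 PB of raw data a year, of which at most about 2.5 PB would be usable for training, which is why filtering close to the source matters so much.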
A closed loop of autonomous driving data must, above all, resolve the Corner Cases encountered in autonomous driving. This requires enough data samples and convenient in-vehicle verification methods. Shadow mode is one of the best solutions for Corner Cases.
Shadow mode was proposed by Tesla in April 2019 and applied to its vehicles to compare relevant decisions and trigger data upload. The autonomous driving software on sold vehicles continuously records the data detected by sensors and, at the appropriate time, selectively sends it back for machine learning and refinement of the Autopilot algorithm.
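The comparison at the heart of shadow mode can be sketched in a few lines. This is a conceptual illustration only and not Tesla's implementation: the `Frame` fields, the use of steering angle as the compared decision, and the 5-degree threshold are assumptions. The algorithm runs silently alongside the human driver, and only frames where the two disagree noticeably are marked for upload.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    """Hypothetical snapshot pairing the driver's action with the shadow output."""
    t_ms: int
    driver_steer_deg: float   # what the human actually did
    shadow_steer_deg: float   # what the silent algorithm would have done


def select_for_upload(frames: List[Frame], threshold_deg: float = 5.0) -> List[Frame]:
    """Keep only frames where driver and shadow algorithm clearly disagree."""
    return [
        f for f in frames
        if abs(f.driver_steer_deg - f.shadow_steer_deg) > threshold_deg
    ]


frames = [
    Frame(0, 1.0, 1.5),      # agreement, not uploaded
    Frame(100, 2.0, 9.0),    # 7 degrees apart, candidate corner case
    Frame(200, -3.0, -2.0),  # agreement, not uploaded
]
flagged = select_for_upload(frames)
```

Only the frame at 100 ms is flagged here. In practice the surrounding sensor data for such moments is what would be sent back for training.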
In 2021, Tesla delivered 936,200 vehicles globally, of which 484,100 came from its Chinese factory. It delivered 560,000 units in 2022H1. Tesla takes advantage of mass production to continuously optimize its algorithm through shadow mode, using its millions of sold vehicles as test vehicles to perceive the surrounding environment and capture special road conditions, thereby continuously strengthening the capability to predict, avoid, and learn from uncertain events. With millions of vehicles on the road, more Corner Cases and extreme working conditions can be covered. The high-quality data collected by flexible triggering helps iterate better algorithms, which in turn determine the value of the software. In software update subscription services, the potential of the data closed loop has only just begun to emerge.
3. Data closed loop becomes the core of iterative upgrade of autonomous driving
The premise of continuous iteration of autonomous driving systems lies in the constant optimization of algorithms, which hinges on the efficiency of data closed loop systems. The efficient flow of data in each scenario of autonomous driving development is crucial, and data intelligence will become the key to accelerating mass production of autonomous vehicles.
In December 2021, Haomo.AI officially released MANA (Snow Lake), the first autonomous driving data intelligence system in China, to accelerate the evolution of autonomous driving technology from the perspectives of perception, cognition, annotation, simulation and calculation. In the next three years, the assisted driving system of Haomo.AI will land on more than 1 million passenger cars. By virtue of its fully self-developed autonomous driving system, Haomo.AI has achieved remarkable advantages in data accumulation, processing and application. Massive data brings advantages in technology iteration, such as marked cost reduction and efficiency improvement.
Momenta has developed leading full-process data-driven technology. Its algorithm modules for perception, fusion, prediction, and planning and control can be efficiently iterated and updated in a data-driven manner. Momenta’s Closed Loop Automation (CLA) is a complete toolchain that lets data streams drive the automatic iteration of data-driven algorithms. CLA can automatically select massive amounts of golden data, drive the automatic iteration of algorithms, and make the autonomous driving flywheel spin faster and faster.
In the context of software-defined vehicles, data, algorithms and computing power are the three elements of autonomous driving development. Automakers have shortened their R&D cycles and accelerated functional iteration. In the future, they can continue to collect data at low cost and with high efficiency and high performance, and iterate algorithms on real data, finally forming a data closed loop and a business closed loop, which are the crux of the sustainable development of autonomous driving companies.