Addressing Data Processing Challenges in Autonomous Vehicles

Alex Vakulov
Illustration: © IoT For All

The rise of self-driving cars is a testament to advancements in artificial intelligence, but their success hinges on much more than just AI. Autonomous vehicles rely on a network of sensors, including cameras, GPS, sonars, lidars, and radars, to navigate diverse environments. The car’s onboard computer processes this information in real time; some data is also transmitted to external data centers for deeper analysis, eventually moving through various cloud systems. Handling these vast amounts of data is a significant challenge for the autonomous vehicle industry.

In this context, the role of the Internet of Things becomes crucial. It is not only about the AI capabilities but also about the power of onboard computing, peripheral servers, and cloud technologies. The efficiency of IoT infrastructure in enabling rapid data transmission and ensuring low latency is vital for the seamless functioning of autonomous vehicles.

Data Processing Challenges

Today, even conventional, driver-operated cars produce growing amounts of data. Self-driving cars generate far more, around 1 TB per hour, and the challenge lies in processing all of it.

It is impractical to rely solely on cloud or peripheral data centers for processing all of a self-driving car’s data, as this introduces excessive delays. In the world of autonomous driving, even a 100-millisecond delay can be critical, potentially being the difference between life and death for a pedestrian or car passenger. Therefore, these vehicles must be equipped to respond to changing situations immediately, making swift data processing vital.
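To put the 100-millisecond figure in perspective, the short calculation below estimates how far a vehicle travels while it waits for a delayed response. It is a rough sketch: the speeds are illustrative assumptions rather than figures from this article, and the function name is hypothetical.

```python
def distance_during_delay(speed_kmh: float, delay_ms: float) -> float:
    """Return the metres travelled at a constant speed during a delay."""
    speed_m_per_s = speed_kmh * 1000 / 3600  # km/h -> m/s
    return speed_m_per_s * delay_ms / 1000   # ms -> s


# Assumed example speeds (urban, highway, fast highway).
for speed in (50, 100, 130):
    metres = distance_during_delay(speed, delay_ms=100)
    print(f"{speed} km/h: {metres:.2f} m travelled in 100 ms")
```

At 100 km/h, the car covers roughly 2.8 meters in that time, before any braking has even begun.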

To minimize the lag between receiving and responding to information, a portion of the data is processed by the car’s onboard computer. Take the new Jeep models, for example. They come equipped with an onboard computer consisting of about 50 processing cores. This computer powers a range of functions like blind-spot monitoring, cruise control, automatic braking, obstacle warning, etc. The various vehicle nodes communicate internally, creating an in-vehicle network.

This configuration aligns well with the concept of edge computing within the Internet of Things framework, considering the onboard computer as a peripheral node of the IoT network. As a result, autonomous vehicles form a complex hybrid network that integrates centralized data centers, cloud services, and numerous peripheral nodes. Nodes are not limited to vehicles; they are also embedded in charging stations, control posts, traffic lights, etc.

Data centers and servers outside the vehicle greatly aid in driverless navigation. They enable the vehicle to “see” beyond its sensor range, manage road network traffic loads, and assist in making optimal driving decisions. This interconnected system represents a significant leap forward in road safety.

The Data Exchange Revolution in Self-Driving Car Technology

Computer vision systems and GPS equip self-driving cars with essential information about their location and surroundings. Yet however far its sensing and positioning range extends, a single car can gather only a finite amount of data. Therefore, data exchange between vehicles is critical. This exchange allows each vehicle to better understand driving conditions using a larger dataset collected by the entire fleet of autonomous vehicles. Vehicle-to-vehicle systems use mesh networks formed by vehicles within the same area to share information and send signals like distance warnings to one another.
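The distance warnings mentioned above can be illustrated with a toy example. This is only a sketch: the Beacon record, the shared local coordinate frame, and the 30-meter threshold are all assumptions made for illustration and do not correspond to any real vehicle-to-vehicle message format.

```python
import math
from dataclasses import dataclass


@dataclass
class Beacon:
    """Hypothetical position broadcast from one vehicle in the mesh."""
    vehicle_id: str
    x_m: float  # metres east in an assumed shared local frame
    y_m: float  # metres north in the same frame


def distance_warnings(me: Beacon, neighbours: list[Beacon],
                      min_gap_m: float = 30.0) -> list[str]:
    """Return the IDs of neighbours closer than the assumed safety gap."""
    return [
        n.vehicle_id
        for n in neighbours
        if math.hypot(n.x_m - me.x_m, n.y_m - me.y_m) < min_gap_m
    ]


me = Beacon("car-A", 0.0, 0.0)
nearby = [Beacon("car-B", 12.0, 5.0), Beacon("car-C", 80.0, -40.0)]
print(distance_warnings(me, nearby))  # only car-B is within 30 m
```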

Moreover, vehicle-to-vehicle networks are progressively expanding to include interactions with road infrastructure, like traffic lights. This is where vehicle-to-infrastructure (V2I) communication comes into play. V2I standards are continually evolving. In the United States, for example, the Federal Highway Administration regularly publishes guides and reports to foster technological advancements. The benefits of V2I go beyond road safety: the technology also offers mobility and environmental advantages.

Just as drivers who travel the same route daily become familiar with every pothole, self-driving cars are also continuously learning from their environment. Autonomous vehicles will upload useful information to peripheral data centers, which could be integrated into charging stations and other objects. Equipped with AI algorithms, these stations will analyze data from cars and propose potential solutions. This information will then be shared with other autonomous vehicles via the cloud.

If this model of data exchange among all self-driving cars comes to life in the next few years, we can anticipate a staggering amount of data being generated daily – potentially reaching millions of terabytes. By that time, estimates suggest that the number of self-driving cars on the roads could range from hundreds of thousands to tens of millions.

Autonomous Cars and 5G

Self-driving cars gather information about pedestrians and cyclists not only through their own sensors but also via data shared by other vehicles, traffic lights, and other urban infrastructure systems. Several 5G-connected car projects are facilitating this. Autonomous cars use Cellular Vehicle-to-Everything (C-V2X) technology and 5G networks to communicate with traffic lights, cyclists, and other cars.

Traffic lights may be fitted with thermal imagers to sense pedestrians nearing crosswalks, triggering alerts to appear on the dashboard of the car. Cyclists connected to this network can broadcast their location to nearby vehicles, significantly reducing the risk of accidents. Additionally, in poor visibility conditions, parked vehicles can automatically activate their emergency flashers, alerting other drivers to their presence.

The advent of 5G mobile networks is proving invaluable for the advancement of self-driving cars. 5G networks offer high speeds, extremely low latency, and the capacity to handle numerous connections simultaneously. Without these capabilities, autonomous vehicles would struggle to outperform humans in critical tasks like detecting pedestrians at a nearby crosswalk. Minimal delay is vital, as even a fraction of a second can be the difference between safety and an accident.

Major automotive manufacturers, including Toyota, BMW, Hyundai, and Ford, already incorporate 5G technology into their vehicles. With billions of dollars invested by cellular operators in building 5G networks, the timing could not be better for equipping vehicles with capabilities essential for everyday operation.

However, all the progress and experiments with 5G-connected autonomous cars hinge on the availability of a robust 5G infrastructure. Given that an autonomous vehicle can generate up to 1 TB of data per hour, these networks must already be prepared to handle such immense data transfer demands, with the potential to cover even greater demands in the future.
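As a back-of-the-envelope check, the sketch below converts an hourly data volume into the sustained network rate needed to move it. It assumes decimal units (1 TB = 10^12 bytes), and the uploaded fraction is a hypothetical parameter, since only part of the data leaves the vehicle.

```python
def sustained_gbps(terabytes_per_hour: float,
                   fraction_uploaded: float = 1.0) -> float:
    """Return the average rate in Gbit/s needed to move the data as it is made.

    Assumes decimal units: 1 TB = 1e12 bytes, 1 Gbit = 1e9 bits.
    """
    bits_per_hour = terabytes_per_hour * 1e12 * 8 * fraction_uploaded
    return bits_per_hour / 3600 / 1e9


print(f"Everything uploaded: {sustained_gbps(1.0):.2f} Gbit/s")
print(f"Assumed 5% uploaded: {sustained_gbps(1.0, 0.05):.3f} Gbit/s")
```

Streaming a full 1 TB per hour would take a little over 2.2 Gbit/s of sustained throughput per vehicle, which is one reason much of the data is handled onboard or deferred.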

Storing and Processing Exabytes of Data Effectively

Not every piece of data collected by self-driving cars demands immediate processing, and there are limitations to the performance and storage capabilities of onboard computers. Therefore, it is practical to accumulate data that can afford some delay and analyze it in peripheral data centers. Simultaneously, other data sets can be migrated to the cloud for processing.
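One way to picture this split is a simple rule that assigns each workload to a processing tier based on how much delay it can tolerate. The sketch below is illustrative only: the thresholds and the example workloads are assumptions, not values from any production system.

```python
from enum import Enum


class Tier(Enum):
    ONBOARD = "onboard computer"
    EDGE = "peripheral data center"
    CLOUD = "cloud"


def choose_tier(max_delay_ms: float,
                onboard_limit_ms: float = 100.0,
                edge_limit_ms: float = 60_000.0) -> Tier:
    """Pick a tier from the delay a workload can tolerate (assumed limits)."""
    if max_delay_ms <= onboard_limit_ms:
        return Tier.ONBOARD  # must be handled immediately in the vehicle
    if max_delay_ms <= edge_limit_ms:
        return Tier.EDGE     # can afford some delay
    return Tier.CLOUD        # no urgency; suitable for deeper analysis


# Hypothetical workloads with the delay each one can tolerate, in ms.
workloads = {
    "automatic braking": 20,
    "pothole report": 5_000,
    "model training data": 86_400_000,
}
for name, delay in workloads.items():
    print(f"{name}: {choose_tier(delay).value}")
```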

The responsibility of collecting, processing, moving, safeguarding, and analyzing data about every pedestrian, car, pothole, or traffic jam should fall on both city governments and automakers. Some smart city planners are already leveraging machine learning algorithms to more efficiently analyze traffic data. These algorithms can swiftly identify road issues like potholes, optimize traffic flow, and provide immediate responses to accidents. On a broader scale, machine learning algorithms are being used to offer recommendations for enhancing city infrastructure.

Integrating fully autonomous driving into our daily lives requires addressing the challenge of processing and storing massive amounts of data. A single self-driving vehicle can generate up to 20 TB of data each day. Looking ahead, this could lead to the generation of exabytes of data in a single day. Managing this requires a flexible, high-performance, reliable, and secure edge infrastructure for data storage, along with efficient data processing capabilities.
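The jump from terabytes per vehicle to exabytes per fleet is straightforward arithmetic. The sketch below uses the 20 TB per vehicle per day figure quoted above and decimal units (1 EB = 1,000,000 TB); the fleet sizes are assumptions chosen to span the "hundreds of thousands to tens of millions" range mentioned earlier.

```python
TB_PER_EB = 1_000_000  # decimal units: 1 EB = 10^6 TB


def fleet_exabytes_per_day(vehicles: int,
                           tb_per_vehicle_per_day: float = 20.0) -> float:
    """Return the total daily data volume of a fleet, in exabytes."""
    return vehicles * tb_per_vehicle_per_day / TB_PER_EB


# Assumed fleet sizes for illustration.
for fleet in (100_000, 1_000_000, 10_000_000):
    print(f"{fleet:>10,} vehicles: {fleet_exabytes_per_day(fleet):,.0f} EB/day")
```

Under these assumptions, even a fleet of 100,000 vehicles would produce about 2 EB per day if every byte were retained.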

For an onboard computer to make real-time decisions, it must have access to the latest information about its environment. Data that is outdated, such as the vehicle’s location and speed from an hour ago, typically becomes redundant for immediate decision-making. However, this historical data holds significant value for the ongoing improvement of autonomous driving algorithms, necessitating a balance between real-time processing and long-term data utilization.

To effectively train deep learning networks, system developers require substantial amounts of data. This includes identifying objects and their movements through camera feeds and lidar information and optimally integrating data about the environment and infrastructure for decision-making. For road safety experts, the data gathered by autonomous cars immediately before incidents or hazardous situations is invaluable.

The necessity for a structured and efficient data storage system grows as autonomous vehicles gather data that is relayed to peripheral data centers and ultimately stored in the cloud. Fresh data should be analyzed promptly to refine machine learning models, requiring high throughput and low latency. Solid State Drives (SSDs) and high-capacity Heat-Assisted Magnetic Recording (HAMR) drives, equipped with support for multi-drive technologies, are ideally suited for these tasks.

Once data from autonomous vehicles has undergone initial analysis, it needs to be stored more cost-effectively, ideally on high-capacity but lower-cost traditional nearline storage solutions. These storage servers are needed for data that might be helpful in the future. Older data that is less likely to be used but still needs to be retained can be sent to archival storage.
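The storage lifecycle described here (fast storage for fresh data, nearline for analyzed data, archive for older data) can be expressed as a small placement rule. This is a minimal sketch: the one-year nearline window and the tier labels are hypothetical, and a real policy would depend on access patterns and retention requirements.

```python
from datetime import timedelta


def storage_tier(age: timedelta, analyzed: bool,
                 nearline_window: timedelta = timedelta(days=365)) -> str:
    """Choose a storage tier for a dataset (assumed, simplified policy)."""
    if not analyzed:
        return "fast"      # fresh data awaiting analysis: high throughput
    if age <= nearline_window:
        return "nearline"  # analyzed, but may still be useful
    return "archive"       # rarely used, kept for retention


print(storage_tier(timedelta(hours=2), analyzed=False))
print(storage_tier(timedelta(days=30), analyzed=True))
print(storage_tier(timedelta(days=900), analyzed=True))
```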

The shift towards processing and analyzing data at the edge is a hallmark of Industry 4.0, revolutionizing our data usage. Edge computing enables data to be processed near its point of collection rather than relying on traditional distant cloud servers. This approach allows for much quicker analysis, enabling immediate responses to changing situations. A fast, efficient network that supports the transfer of information between data centers and vehicles will improve the safety and reliability of self-driving technology.


The advancement of self-driving cars showcases a leap in artificial intelligence and the crucial role of IoT in handling complex data networks. Autonomous vehicles, equipped with an array of sensors and supported by edge computing, are reshaping road safety and urban mobility. The introduction of 5G networks is further enhancing their capabilities, enabling faster, more reliable communication with other vehicles and urban infrastructure.

However, the effective processing and storage of the vast amounts of data generated remain a significant challenge. As we move towards a future with potentially millions of data-generating autonomous vehicles on the roads, developing efficient and secure data infrastructure becomes imperative for the success and safety of this revolutionary technology.

Alex Vakulov
Alex Vakulov is a cybersecurity researcher with over 20 years of experience in malware analysis. Alex has strong malware removal skills. He is writing for numerous tech-related publications, sharing his security experience. Alex is assisting organ...