Videos, Slides, Films

Software reliability in the Big Data era with an industry‐minded focus

Conferences
AEiC 2021 - 25th Ada-Europe International Conference on Reliable Software Technologies Keynote 1: Ángel Conde (2021)
Available as
Online
Summary

In order to build reliable Big Data software, we should prepare ourselves for the distributed fallacy, "A distributed system will always fail". For this reason, Big Data systems have been architected from the ground up to survive multiple system failures. We will give a short introduction to different Big Data frameworks and the architectural decisions that allow them to survive multiple system failures and remain reliable. Next, we will focus on the Industrial Internet of Things (IIoT). The IIoT is a rapidly growing trend with significant implications for the global economy. It spans industries including manufacturing, mining, agriculture, oil and gas, and utilities. It also encompasses companies that depend on durable physical goods to conduct business, such as organizations that operate hospitals, warehouses and ports or that offer transportation, logistics and healthcare services. Not surprisingly, the IIoT's potential payoff is enormous. A specific example of this potential is the predictive maintenance of assets, which saves over scheduled repairs, reduces overall maintenance costs and eliminates breakdowns. For example, Thames Water, one of the largest providers of water in the UK, is using sensors and real-time data to anticipate equipment failures and respond more quickly to critical situations, such as leaks or adverse weather events. However, analyzing such large quantities of real-time data, which usually arrives out of order from different sensors and systems, is a real challenge for Big Data analytics frameworks. The final part of the talk will be a hands-on workshop in which we will use Apache Zeppelin, Apache Kafka, Apache Avro, Apache Cassandra and Spark's Structured Streaming API to see how we can solve challenges related to IIoT projects, such as handling late, unordered data. All the code used, along with the developer environment, will be available on GitHub right after the keynote.
