As data volumes grow, so does the time needed to run the ETL (extract, transform, load) processes that support the myriad downstream workloads. Apache Spark has emerged as the de facto standard for at-scale ETL workloads and analytics processing, and organisations use it to process large amounts of data in record time. Spark offers a set of easy-to-use APIs for ETL, machine learning, and graph processing across a variety of data sets from different sources. Currently, Spark is being run on millions of on-premise and cloud servers.
NVIDIA introduced end-to-end GPU acceleration for Apache Spark 3.0 in 2020. This allows data scientists and machine learning engineers, for the first time, to apply GPU acceleration to ETL workloads. This capability also delivers the performance and the scale needed to bring together the power of AI and the potential of big data.
To help understand and appreciate the true potential of this technology, NVIDIA and Micropoint, with Analytics India Magazine, are organising a webinar on ‘Performance boosting ETL workloads using RAPIDS on Spark 3.0’ on October 20th 2021.
The session will be conducted by Saurav Agarwal, Sr. Enterprise Architect – Big Data, Advanced Analytics & ML, at NVIDIA. He will speak about the most commonly used data architectures and ETL workloads, and how they can be accelerated using GPUs and RAPIDS on Apache Spark 3.0.
The webinar will cover:
- ETL/data architecture and workflows in the industry
- Hands-on examples of speeding up the workflows using open source plugins on Spark
- Introduction to best practices around performance optimisation and speed-ups
- Introduction to NVIDIA RAPIDS and how it can help boost performance of ETL workloads
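For readers who want a feel for what enabling a GPU plugin on Spark 3.0 can look like, here is a minimal sketch that assembles a `spark-submit` command line with the RAPIDS Accelerator turned on. It is not taken from the webinar material. The configuration keys are the ones documented for Spark 3.x GPU scheduling and the RAPIDS Accelerator, but the helper name `build_spark_submit`, the jar path, the job script name, and the GPU share per task are all assumptions for illustration, and a real cluster may need further settings (for example a GPU discovery script).

```python
# Sketch only: builds the argument list for a spark-submit call that enables
# the RAPIDS Accelerator plugin. The jar path and job script are hypothetical.

RAPIDS_CONF = {
    # Load the RAPIDS Accelerator SQL plugin.
    "spark.plugins": "com.nvidia.spark.SQLPlugin",
    # Allow supported SQL/DataFrame operations to run on the GPU.
    "spark.rapids.sql.enabled": "true",
    # Spark 3.x GPU scheduling: one GPU per executor (assumed value).
    "spark.executor.resource.gpu.amount": "1",
    # Let four tasks share that GPU (assumed value).
    "spark.task.resource.gpu.amount": "0.25",
}


def build_spark_submit(job_script, plugin_jar, extra_conf=None):
    """Return spark-submit arguments as a list; extra_conf overrides defaults."""
    conf = dict(RAPIDS_CONF)
    conf.update(extra_conf or {})
    args = ["spark-submit", "--jars", plugin_jar]
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    args.append(job_script)
    return args


if __name__ == "__main__":
    # Hypothetical paths, shown only to illustrate the resulting command line.
    print(" ".join(build_spark_submit("etl_job.py", "/path/to/rapids-4-spark.jar")))
```

Because the plugin is switched on through configuration, an existing DataFrame or SQL ETL job would typically not need code changes; passing `{"spark.rapids.sql.enabled": "false"}` as `extra_conf` gives a CPU-only run of the same job for comparison.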
Who should attend?
- Data science, data engineering, analytics & Big Data enthusiasts
- Data engineering professionals & aspirants
- Aspiring data engineers
- Working professionals interested in the analytics domain
- Data science & analytics professionals looking to pivot
- Students from engineering/technical background
Saurav Agarwal – Sr. Enterprise Architect – Big Data, Advanced Analytics & ML
Saurav has around ten years of data industry experience implementing AI/data science/analytics solutions on big data platforms, including large-scale data lake systems. He is a senior architect and seasoned data engineer who has built distributed real-time data science pipelines. Along with hands-on architecture and implementation experience in enterprise data landscapes, including the Hadoop and Spark ecosystems, Saurav has been part of multiple large-scale projects covering end-to-end data landscape solutions for automotive, supply chain, healthcare, banks, fintech, and more. His top projects include streaming predictive alerts of heart ailments for a primary healthcare provider and building a petabyte-scale data lake for a large fintech firm and its various partner consumers.