Course Overview
Learn how to use Apache Spark and clusters running on the Azure Databricks platform to run large data engineering workloads in the cloud.
Outline
Module 1 : Explore Azure Databricks
- Provision an Azure Databricks workspace
- Identify core workloads for Azure Databricks
- Use the data governance tools Unity Catalog and Microsoft Purview
- Describe key concepts of an Azure Databricks solution
Module 2 : Perform data analysis with Azure Databricks
- Ingest data using Azure Databricks.
- Use the data exploration tools in Azure Databricks.
- Analyze data with DataFrame APIs.
Module 3 : Use Apache Spark in Azure Databricks
- Describe key elements of the Apache Spark architecture.
- Create and configure a Spark cluster.
- Describe use cases for Spark.
- Use Spark to process and analyze data stored in files.
- Use Spark to visualize data.
Module 4 : Manage data with Delta Lake
- What Delta Lake is
- How to manage ACID transactions using Delta Lake
- How to use schema versioning and time travel in Delta Lake
- How to maintain data integrity with Delta Lake
Module 5 : Build data pipelines with Delta Live Tables
- Describe Delta Live Tables
- Ingest data into Delta Live Tables
- Use data pipelines for real-time data processing
Module 6 : Deploy workloads with Azure Databricks Workflows
- What Azure Databricks Workflows are
- The key components and benefits of Azure Databricks Workflows
- How to deploy workloads using Azure Databricks Workflows
Prerequisites
Who Should Attend
Students who want to implement a data lakehouse analytics solution with Azure Databricks