Data Engineering on Microsoft Azure (DP-203T00)

SS Course: GK821362

Course Overview

Students will begin by understanding the core compute and storage technologies used to build an analytical solution, and will learn how to interactively explore data stored in files in a data lake. They will learn the various ingestion techniques that can be used to load data, whether using the Apache Spark capability found in Azure Synapse Analytics or Azure Databricks, or using Azure Data Factory or Azure Synapse pipelines. Students will also learn the various ways they can transform data using the same technologies used to ingest it, and will understand the importance of implementing security so that data is protected at rest and in transit. Finally, students will learn how to build real-time analytical solutions using stream processing.

One Microsoft exam voucher included with class.

Scheduled Classes

02/06/23 - GVT - Virtual Classroom - Virtual Instructor-Led
03/06/23 - GVT - Virtual Classroom - Virtual Instructor-Led
03/13/23 - GVT - Virtual Classroom - Virtual Instructor-Led
07/10/23 - GVT - Virtual Classroom - Virtual Instructor-Led
07/17/23 - GVT - Virtual Classroom - Virtual Instructor-Led

Outline

Module 1: Introduction to Azure Synapse Analytics

  • Identify the business problems that Azure Synapse Analytics addresses.
  • Describe core capabilities of Azure Synapse Analytics.
  • Determine when to use Azure Synapse Analytics.

Module 2: Explore Azure Databricks

  • Provision an Azure Databricks workspace.
  • Identify core workloads and personas for Azure Databricks.
  • Describe key concepts of an Azure Databricks solution.

Module 3: Introduction to Azure Data Lake storage

  • Decide when you should use Azure Data Lake Storage Gen2
  • Create an Azure storage account by using the Azure portal
  • Compare Azure Data Lake Storage Gen2 and Azure Blob storage
  • Explore the stages for processing big data by using Azure Data Lake Store
  • List the supported open-source platforms
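
As a brief aside on how analytics engines address files in Data Lake Storage Gen2: they use `abfss://` (Azure Blob File System, secure) URIs. A minimal sketch of the URI shape, with hypothetical account, container, and path names:

```python
def abfss_uri(account: str, container: str, path: str) -> str:
    """Build the abfss:// URI that Azure analytics services use to
    address files in Azure Data Lake Storage Gen2. The account,
    container, and path used below are hypothetical examples."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path}"

print(abfss_uri("mydatalake", "files", "sales/2023/orders.csv"))
# abfss://files@mydatalake.dfs.core.windows.net/sales/2023/orders.csv
```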

Module 4: Get started with Azure Stream Analytics

  • Understand data streams.
  • Understand event processing.
  • Get started with Azure Stream Analytics.
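
A central event-processing idea in this module is windowed aggregation. The sketch below mimics what a Stream Analytics tumbling window (fixed, non-overlapping windows) computes, in plain Python with hypothetical sensor events and no Azure dependency:

```python
from collections import defaultdict

def tumbling_window_counts(events, window_seconds):
    """Group (timestamp, value) events into fixed, non-overlapping
    windows and count events per window -- the same aggregation a
    Stream Analytics TumblingWindow query performs."""
    counts = defaultdict(int)
    for timestamp, _value in events:
        # Each event belongs to exactly one window, keyed by its start.
        window_start = timestamp - (timestamp % window_seconds)
        counts[window_start] += 1
    return dict(counts)

# Hypothetical sensor readings: (epoch seconds, temperature)
events = [(0, 21.5), (3, 21.7), (9, 22.0), (12, 22.4), (14, 22.1)]
print(tumbling_window_counts(events, 10))  # {0: 3, 10: 2}
```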

Module 5: Use Azure Synapse serverless SQL pool to query files in a data lake

  • Identify capabilities and use cases for serverless SQL pools in Azure Synapse Analytics
  • Query CSV, JSON, and Parquet files using a serverless SQL pool
  • Create external database objects in a serverless SQL pool
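
To give a sense of what querying data-lake files looks like, serverless SQL pools read files directly with the T-SQL `OPENROWSET` function. The sketch below builds such a query with a hypothetical Python helper; the storage URL is made up, and note that CSV sources additionally require a `PARSER_VERSION` argument:

```python
def openrowset_query(file_url: str, file_format: str, top_n: int = 10) -> str:
    """Compose a serverless SQL pool query that reads files straight
    from a data lake via OPENROWSET. Hypothetical helper; the T-SQL
    shape is what Azure Synapse serverless SQL pools accept for
    Parquet sources."""
    return (
        f"SELECT TOP {top_n} *\n"
        f"FROM OPENROWSET(\n"
        f"    BULK '{file_url}',\n"
        f"    FORMAT = '{file_format}'\n"
        f") AS [result];"
    )

# Hypothetical storage account and container names.
sql = openrowset_query(
    "https://mydatalake.dfs.core.windows.net/files/sales/*.parquet",
    "PARQUET",
)
print(sql)
```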

Module 6: Create a lake database in Azure Synapse Analytics

  • Understand lake database concepts and components
  • Describe database templates in Azure Synapse Analytics
  • Create a lake database

Module 7: Secure data and manage users in Azure Synapse serverless SQL pools

  • Choose an authentication method in Azure Synapse serverless SQL pools
  • Manage users in Azure Synapse serverless SQL pools
  • Manage user permissions in Azure Synapse serverless SQL pools

Module 8: Use Apache Spark in Azure Databricks

  • Describe key elements of the Apache Spark architecture.
  • Create and configure a Spark cluster.
  • Describe use cases for Spark.
  • Use Spark to process and analyze data stored in files.
  • Use Spark to visualize data.

Module 9: Use Delta Lake in Azure Databricks

  • Describe core features and capabilities of Delta Lake.
  • Create and use Delta Lake tables in Azure Databricks.
  • Create Spark catalog tables for Delta Lake data.
  • Use Delta Lake tables for streaming data.
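
One Delta Lake capability worth previewing is the MERGE upsert. The sketch below mimics its matched-update / not-matched-insert semantics in plain Python with hypothetical rows; a real MERGE runs on Spark against Delta tables with ACID guarantees, which this deliberately does not attempt:

```python
def merge_upsert(target, updates, key):
    """Mimic the semantics of a Delta Lake MERGE: rows in `updates`
    that match an existing key overwrite the target row (WHEN MATCHED
    THEN UPDATE), and new keys are appended (WHEN NOT MATCHED THEN
    INSERT). Conceptual illustration only."""
    by_key = {row[key]: row for row in target}
    for row in updates:
        by_key[row[key]] = row  # update if present, insert otherwise
    return list(by_key.values())

# Hypothetical inventory rows.
target = [{"id": 1, "qty": 5}, {"id": 2, "qty": 3}]
updates = [{"id": 2, "qty": 7}, {"id": 3, "qty": 1}]
print(merge_upsert(target, updates, "id"))
# [{'id': 1, 'qty': 5}, {'id': 2, 'qty': 7}, {'id': 3, 'qty': 1}]
```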

Module 10: Analyze data with Apache Spark in Azure Synapse Analytics

  • Identify core features and capabilities of Apache Spark.
  • Configure a Spark pool in Azure Synapse Analytics.
  • Run code to load, analyze, and visualize data in a Spark notebook.

Module 11: Integrate SQL and Apache Spark pools in Azure Synapse Analytics

  • Describe the integration methods between SQL and Spark Pools in Azure Synapse Analytics
  • Understand the use cases for SQL and Spark Pools integration
  • Authenticate in Azure Synapse Analytics
  • Transfer data between SQL and Spark Pool in Azure Synapse Analytics
  • Authenticate between Spark and SQL Pool in Azure Synapse Analytics
  • Integrate SQL and Spark Pools in Azure Synapse Analytics
  • Externalize the use of Spark Pools within Azure Synapse workspace
  • Transfer data outside the Synapse workspace using SQL Authentication
  • Transfer data outside the Synapse workspace using the PySpark Connector
  • Transform data in Apache Spark and write back to SQL Pool in Azure Synapse Analytics

Module 12: Use data loading best practices in Azure Synapse Analytics

  • Understand data loading design goals
  • Explain loading methods into Azure Synapse Analytics
  • Manage source data files
  • Manage singleton updates
  • Set up dedicated data loading accounts
  • Manage concurrent access to Azure Synapse Analytics
  • Implement Workload Management
  • Simplify ingestion with the Copy Activity

Module 13: Petabyte-scale ingestion with Azure Data Factory or Azure Synapse Pipeline

  • Introduction
  • List the data factory ingestion methods
  • Describe data factory connectors
  • Exercise: Use the data factory copy activity
  • Exercise: Manage the self-hosted integration runtime
  • Exercise: Set up the Azure integration runtime
  • Understand data ingestion security considerations
  • Knowledge check
  • Summary

Module 14: Integrate data with Azure Data Factory or Azure Synapse Pipeline

  • Understand Azure Data Factory
  • Describe data integration patterns
  • Explain the data factory process
  • Understand Azure Data Factory components
  • Azure Data Factory security
  • Set up Azure Data Factory
  • Create linked services
  • Create datasets
  • Create data factory activities and pipelines
  • Manage integration runtime
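
The component hierarchy this module covers (pipelines contain activities; activities reference datasets; datasets reference linked services, i.e. connections) can be illustrated with a hypothetical Copy-activity definition, expressed here as a Python dict mirroring the JSON shape Data Factory uses. All names are made up:

```python
import json

# Hypothetical pipeline definition in the shape Azure Data Factory
# uses for a Copy activity: the pipeline wraps activities, and each
# activity references input/output datasets by name.
pipeline = {
    "name": "CopySalesToLake",
    "properties": {
        "activities": [
            {
                "name": "CopySalesData",
                "type": "Copy",
                "inputs": [{"referenceName": "SqlSalesDataset",
                            "type": "DatasetReference"}],
                "outputs": [{"referenceName": "LakeSalesDataset",
                             "type": "DatasetReference"}],
                "typeProperties": {
                    "source": {"type": "AzureSqlSource"},
                    "sink": {"type": "ParquetSink"},
                },
            }
        ]
    },
}

# The definition round-trips through JSON, which is how it is stored.
print(json.dumps(pipeline, indent=2)[:60])
```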

Module 15: Perform code-free transformation at scale with Azure Data Factory or Azure Synapse Pipeline

  • Introduction
  • Explain Data Factory transformation methods
  • Describe Data Factory transformation types
  • Use Data Factory mapping data flow
  • Debug mapping data flow
  • Use Data Factory wrangling data
  • Use compute transformations within Data Factory
  • Integrate SQL Server Integration Services packages within Data Factory
  • Knowledge check
  • Summary

Module 16: Orchestrate data movement and transformation in Azure Data Factory or Azure Synapse Pipeline

  • Introduction
  • Understand data factory control flow
  • Work with data factory pipelines
  • Debug data factory pipelines
  • Add parameters to data factory components
  • Integrate a Notebook within Azure Synapse Pipelines
  • Execute data factory packages
  • Knowledge check
  • Summary

Module 17: Plan hybrid transactional and analytical processing using Azure Synapse Analytics

  • Describe Hybrid Transactional / Analytical Processing patterns.
  • Identify Azure Synapse Link services for HTAP.

Module 18: Implement Azure Synapse Link with Azure Cosmos DB

  • Configure an Azure Cosmos DB Account to use Azure Synapse Link.
  • Create an analytical store enabled container.
  • Create a linked service for Azure Cosmos DB.
  • Analyze linked data using Spark.
  • Analyze linked data using Synapse SQL.

Module 19: Secure a data warehouse in Azure Synapse Analytics

  • Understand network security options for Azure Synapse Analytics
  • Configure Conditional Access
  • Configure Authentication
  • Manage authorization through column and row level security
  • Manage sensitive data with Dynamic Data masking
  • Implement encryption in Azure Synapse Analytics

Module 20: Configure and manage secrets in Azure Key Vault

  • Explore proper usage of Azure Key Vault
  • Manage access to an Azure Key Vault
  • Explore certificate management with Azure Key Vault
  • Configure a Hardware Security Module Key-generation solution

Module 21: Implement compliance controls for sensitive data

  • Plan and implement data classification in Azure SQL Database
  • Understand and configure row-level security and dynamic data masking
  • Understand the usage of Microsoft Defender for SQL
  • Explore how Azure SQL Database Ledger works
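
Dynamic data masking substitutes masked values for non-privileged readers at query time. The sketch below approximates the behavior of two of the built-in Azure SQL masks in plain Python; this is an illustration only, since the real masking happens inside the database engine:

```python
def mask_email(email: str) -> str:
    """Approximate Azure SQL's built-in email() mask, which exposes
    the first character of the address and a fixed pattern for the
    rest. Approximation for illustration."""
    return email[0] + "XXX@XXXX.com"

def mask_default(value):
    """Approximate the full default() mask: zero for numeric columns,
    XXXX for string columns."""
    return 0 if isinstance(value, (int, float)) else "XXXX"

print(mask_email("alice@contoso.com"))  # aXXX@XXXX.com
print(mask_default(51000))              # 0
print(mask_default("secret note"))      # XXXX
```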

Module 22: Enable reliable messaging for Big Data applications using Azure Event Hubs

  • Create an event hub using the Azure CLI
  • Configure applications to send or receive messages through the event hub
  • Evaluate performance of event hub using the Azure portal
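
A key Event Hubs concept behind this module is partitioning: events that share a partition key always land on the same partition, which preserves per-key ordering. The sketch below illustrates that guarantee conceptually; the actual hash Event Hubs applies is internal to the service, so sha256 here is just a stable stand-in:

```python
import hashlib

def assign_partition(partition_key: str, partition_count: int) -> int:
    """Map a partition key to a partition index the way Event Hubs
    does conceptually: hash the key, then take it modulo the number
    of partitions. Equal keys always map to the same partition."""
    digest = hashlib.sha256(partition_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % partition_count

# Events for the same (hypothetical) device always map to the same
# partition, so per-device ordering is preserved.
p1 = assign_partition("device-42", 4)
p2 = assign_partition("device-42", 4)
print(p1 == p2)        # True
print(0 <= p1 < 4)     # True
```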

Prerequisites

Successful students start this course with knowledge of cloud computing and core data concepts, and professional experience with data solutions.

Specifically, completion of:

  • AZ-900 - Azure Fundamentals
  • DP-900 - Microsoft Azure Data Fundamentals

      Who Should Attend

      TOP

      The primary audience for this course is data professionals, data architects, and business intelligence professionals who want to learn about data engineering and building analytical solutions using data platform technologies that exist on Microsoft Azure. The secondary audience for this course data analysts and data scientists who work with analytical solutions built on Microsoft Azure.