Description
Module 1:
- What is a Data Pipeline
- What is Azure Databricks
- Azure Databricks Architecture
- Azure Account Setup
- Workspace Setup
Module 2:
- Navigate the Workspace
- Runtimes
- Clusters
- Notebooks
- Libraries
- Repos
- Databricks File System (DBFS)
- DBUTILS
- Widgets
- Workflows
- Metastore – Set up an external Metastore
Module 3:
- What is RDD
- Creating RDD
- RDD transformations
- RDD Actions
- RDD Joins
- Pair RDD
- Broadcast Variables
- Accumulators
- Convert RDD to DataFrame
- Import & Read data
- Create a table using the UI
- Create a table in a Notebook
Module 4:
- Create DataFrames
- Define Schema
- Functions
- Casting Operations
- Filter Transformation
- Union, UnionAll & UnionByName
- OrderBy & SortBy
- GroupBy
- Remove Duplicates
- Window Functions
- Date and Timestamp Functions
- UDF (User Defined Function)
- JOIN
- Handle corrupt records using the badRecordsPath
- File metadata column
Module 5:
- Read Parquet File
- Read CSV Files
- Read JSON Files
- Read XML Files
- Read Excel Files
- SQL databases using JDBC
- Azure blob storage
Module 6:
- What is Spark Structured Streaming
- Data Source & Sink
- Rate & File Source
- Kafka Source
- Sinks: Console, Memory, File & Custom
- Build Streaming ETL
- Streaming ETL 1: Set up Event Hub
- Streaming ETL 2: Event Hub Producer
- Streaming ETL 3: Integrate Event Hubs with Databricks
- Streaming ETL 4: Transformation
- Streaming ETL 5: Ingest into Azure Data storage
- Twitter Sentiment Analysis – Introduction
- Setup Twitter Developer Account
- Twitter Sentiment Analysis – II
- Twitter Sentiment Analysis – III
Module 7:
- Components in Databricks SQL
- Configuring a SQL Endpoint
- Creating a Table from a CSV File
- Create Queries
- Parameterized Query
- Query Profile
- Building Visualizations (Table, Bar & Pie)
- Building Line Chart & Counter Chart
- Adding Charts to Dashboards
- Defining a Query Alert
- Access Control on Databricks SQL Objects
- Lab: Data Object Access Control
- Transfer Ownership
- Access SQL Endpoint from Python
- Databricks SQL CLI
Who this course is for:
- Data Engineering Students & Developers
- Big Data Developers
- Python & SQL Developers
Requirements
- Basic Python Skills
- Basic SQL Skills
- Azure Account
Last Updated 10/2022