The Cloud Data Engineering training program is designed to give students a solid understanding of the data processing components of the major cloud platforms. The program comprises modules that start with the basics of data engineering and progress to the data engineering and analytics services on cloud platforms such as AWS and Azure. It equips students with the skills required to take up projects in cloud data engineering.
DATABASE CONCEPTS
- Basics of Databases
- DML vs DDL Operations
- SQL vs PL/SQL
- RDBMS vs NoSQL
- Basic database objects
- Data Normalization concepts (1NF, 2NF, 3NF and BCNF)
- Basics of Data Modeling
SQL
- Select Statements
- Restricting and Sorting Data
- Single-row functions
- Aggregating Data Using Group functions
- Manipulating Data
- Creating and Managing Tables
- Joins
- Including Constraints
- Using SET Operators
- Datetime Functions
- Subqueries
PL/SQL
- Declaring Variables
- Writing executable statements
- Writing control structures
- Composite data types
- Cursors
- Creating Procedures
- Creating Functions
- Creating Triggers
Introduction to NoSQL
- What is NoSQL?
- CAP Theorem
- BASE Concept
- What are the Types of NoSQL Databases?
- Intro to MongoDB
- RDBMS vs MongoDB
- Key-Value Pairs
- CRUD operations
Data Warehousing
- Data Warehousing basics
- What is a Data Warehouse?
- Data Warehouse vs OLTP System
- Top-down approach
- Bottom-up approach
- Enterprise Data Warehouse vs Data Marts
- Typical Data Warehouse Architecture
- Logical vs Physical Design
- Star Schema
- Snowflake Schema
- Facts and Dimensions
- Slowly changing dimensions
ETL/ Data Integration
- Data Sources and Extraction
- Data Transformation
- Data Loading and Refreshing
- Data Load time and Throughput
- Mapping and Process scheduling
- Data Load Administration and Monitoring
- Lookups and other important transformations
- Time Series analysis & data loading process for Slowly Changing Dimension (SCD)
- ETL Tool Walkthrough (Informatica or Talend)
OLAP/ Data Visualization/ Business Intelligence
- Decision support systems
- Modeling the data
- Business Intelligence Overview
- Data Quality
- How is Data Analysed?
- What is OLAP?
- What is Data Mining?
- Visualizing Data
- Tabular Data, Charts and Dashboards
- ROLAP and MOLAP
- Report automation and scheduling
- OLAP Tool walkthrough (Tableau or Power BI)
Big Data/Hadoop
Big Data Testing
Introduction to PIG scripting
Testing in Data Engineering Context
- Test Plan
- Test cases & scenarios
- Testing cycle
- UTC
Introduction to Agile
- Agile overview
- Agile types
- Agile methodologies
- Agile methodology in testing
Introduction to Unix
- Unix Basics
- UNIX commands for various operations
- UNIX file I/O operations and file permissions
Introduction to Cloud and Azure Fundamentals
Azure Storage
Azure Data Catalog
Azure Data Factory
Azure Data Lake
Azure Synapse Analytics (formerly SQL DW) & PolyBase
Tabular Model
Power BI
Event Hub & Stream Analytics
Logic Apps
- Streaming data from social media
Azure Databricks
Azure Cosmos DB (DocumentDB)
- Provides an introduction to DocumentDB, Microsoft's NoSQL offering on the cloud.
HDInsight
Introduction to AWS
- Introduction to cloud computing & AWS
AWS EC2
AWS S3
AWS IAM Lab
AWS Lambda
AWS Redshift
Basics of EMR
- Amazon Elastic MapReduce (EMR) is an Amazon Web Services (AWS) tool for big data processing and analysis.
Basics of Glue
- Fully managed extract, transform, and load (ETL) service
Intro to Amazon Kinesis Data Streams
- Massively scalable, highly durable data ingestion and processing service optimized for streaming data
Basics of DynamoDB
- Fully managed proprietary NoSQL database service that supports key-value and document data structures
Basics of Athena
- Serverless interactive query service
Route 53 (DNS)
VPCs
High Availability
Project involving creation of database objects, Dimensional Modeling, ETL transformations, Mappings, and OLAP reports using cloud services