This course covers the following topics with practical examples:
MapReduce Distributed Processing
Apache Spark -
- Challenges of MapReduce and Why We Need Spark
- Apache Spark vs Databricks
- DAG (Directed Acyclic Graph)
- Spark Transformations
- PySpark Use Case Problems with Large Datasets
Architecture - All the Spark core components will be discussed in detail.
Spark Core APIs - RDD (rarely used directly in real-world projects, but will be discussed with a practical example)
Higher-Level Spark APIs - DataFrames, Spark SQL
DataFrames in Depth - All the DataFrame operations will be discussed with examples.
Caching (Performance Optimization Technique)
Spark Resource Management Architecture - YARN
DataFrameWriter API
Spark Workflow
Optimization Techniques for Better Performance
Duration: 30 Days (Including Weekends).
Interview-oriented questions will be shared, and the relevant practical implementations will be worked through.
Practice datasets and assignments will be provided.
This is not a certificate course.
Real-world, hands-on Spark course.