Course Content for Hadoop Developer
This course covers 100% of the Developer syllabus and 40% of the Administration syllabus.
Introduction to Big Data and Hadoop:-
Big Data Introduction
Hadoop Introduction
What is Hadoop? Why Hadoop?
Hadoop History
Different Types of Components in Hadoop
HDFS, MapReduce, PIG, Hive, SQOOP, HBASE, OOZIE, Flume, Zookeeper and so on…
What is the scope of Hadoop?
Deep Dive into HDFS (for Storing the Data):-
Introduction to HDFS
HDFS Design
HDFS role in Hadoop
Features of HDFS
Daemons of Hadoop and their functionality
o Name Node
o Secondary Name Node
o Job Tracker
o Data Node
o Task Tracker
Anatomy of File Write
Anatomy of File Read
Network Topology
o Nodes
o Racks
o Data Center
Parallel Copying using DistCp
Basic Configuration for HDFS
Data Organization
o Blocks
o Replication
Rack Awareness
Heartbeat Signal
How to Store the Data into HDFS
How to Read the Data from HDFS
Accessing HDFS (Introduction to Basic UNIX Commands)
CLI commands
MapReduce using Java (Processing the Data):-
Introduction to MapReduce
MapReduce Architecture
Data flow in MapReduce
o Splits
o Mapper
o Partitioning
o Sort and shuffle
o Combiner
o Reducer
Difference Between Block and InputSplit
Role of RecordReader
Basic Configuration of MapReduce
MapReduce life cycle
o Driver Code
o Mapper
o Reducer
How MapReduce Works
Writing and Executing the Basic MapReduce Program using Java
Submission and Initialization of a MapReduce Job
File Input/Output Formats in MapReduce Jobs
o Text Input Format
o Key Value Input Format
o Sequence File Input Format
o NLine Input Format
Joins
o Map-side Joins
o Reducer-side Joins
Word Count Example
Partition MapReduce Program
Side Data Distribution
o Distributed Cache (with Program)
Counters (with Program)
o Types of Counters
o Task Counters
o Job Counters
o User Defined Counters
o Propagation of Counters
Job Scheduling
PIG:-
Introduction to Apache PIG
Introduction to PIG Data Flow Engine
MapReduce vs. PIG in detail
When should PIG be used?
Data Types in PIG
Basic PIG programming
Modes of Execution in PIG
o Local Mode
o MapReduce Mode
Execution Mechanisms
o Grunt Shell
o Script
o Embedded
Operators/Transformations in PIG
PIG UDFs with Program
Word Count Example in PIG
Difference between MapReduce and PIG
SQOOP:-
Introduction to SQOOP
Use of SQOOP
Connecting to a MySQL Database
SQOOP commands
o Import
o Export
o Eval
o Codegen, etc.
Joins in SQOOP
Export to MySQL
Export to HBase
HIVE:-
Introduction to HIVE
HIVE Meta Store
HIVE Architecture
Tables in HIVE
o Managed Tables
o External Tables
Hive Data Types
o Primitive Types
o Complex Types
Partition
Joins in HIVE
HIVE UDFs and UDAFs with Programs
Word Count Example
HBASE:-
Introduction to HBASE
Basic Configurations of HBASE
Fundamentals of HBase
What is NoSQL?
HBase Data Model
o Table and Row
o Column Family and Column Qualifier
o Cell and its Versioning
Categories of NoSQL Databases
o Key-Value Database
o Document Database
o Column Family Database
HBASE Architecture
o HMaster
o Region Servers
o Regions
o MemStore
o Store
SQL vs. NOSQL
How HBASE Differs from RDBMS
HDFS vs. HBase
Client-side buffering or bulk uploads
HBase Designing Tables
HBase Operations
o Get
o Scan
o Put
o Delete
MongoDB:-
What is MongoDB?
Where to Use?
Configuration On Windows
Inserting Data into MongoDB
Reading Data from MongoDB
Cluster Setup:-
Downloading and Installing Ubuntu 12.x
Installing Java
Installing Hadoop
Creating Cluster
Increasing and Decreasing the Cluster Size
Monitoring the Cluster Health
Starting and Stopping the Nodes
Zookeeper:-
Introduction to Zookeeper
Data Model
Operations
OOZIE:-
Introduction to OOZIE
Use of OOZIE
Where to use?
Flume:-
Introduction to Flume
Uses of Flume
Flume Architecture
o Flume Master
o Flume Collectors
o Flume Agents