hadoop

By attending Hadoop workshop, Delegates will learn:

  • Big Data relies on volume, velocity, and variety with respect to processing.
  • Data can be divided into three types—unstructured data, semi-structured data, and structured data.
  • Big Data technology understands and navigates big data sources, analyzes unstructured data, and ingests data at a high speed.
  • Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment.

In Hadoop training course, you will understand Big Data, the limitations of the existing solutions for Big Data problem, how Hadoop solves the Big Data problem, the common Hadoop ecosystem components, Hadoop Architecture, HDFS, Anatomy of File Write and Read, how MapReduce Framework works.

COURSE AGENDA

  • Introduction to Big Data & Hadoop Fundamentals
  • Dimensions of Big data
  • Type of Data generation
  • Apache ecosystem & its projects
  • Hadoop distributors
  • HDFS core concepts
  • Modes of Hadoop employment
  • HDFS Flow architecture
  • HDFS MrV1 vs. MrV2 architecture
  • Types of Data compression techniques
  • Rack topology
  • HDFS utility commands
  • Min h/w requirements for a cluster & property files changes
  • MapReduce Design flow
  • MapReduce Program (Job) execution
  • Types of Input formats & Output Formats
  • MapReduce Datatypes
  • Performance tuning of MapReduce jobs
  • Counters techniques
  • Hive architecture flow
  • Types of hive tables flow
  • DML/DDL commands explanation
  • Partitioning logic
  • Bucketing logic
  • Hive script execution in shell & HUE
  • Introduction to Pig concepts
  • Pig modes of execution/storage concepts
  • Pig program logics explanation
  • Pig basic commands
  • Pig script execution in shell/HUE
  • Introduction to Hbase concepts
  • Introduction to NoSQL/CAP theorem concepts
  • Hbase design/architecture flow
  • Hbase table commands
  • Hive + Hbase integration module/jars deployment
  • Hbase execution in shell/HUE
  • Introduction to Sqoop concepts
  • Sqoop internal design/architecture
  • Sqoop Import statements concepts
  • Sqoop Export Statements concepts
  • Quest Data connectors flow
  • Incremental updating concepts
  • Creating a database in MySQL for importing to HDFS
  • Sqoop commands execution in shell/HUE
  • Introduction to Flume & features
  • Flume topology & core concepts
  • Property file parameters logic
  • Introduction to Hue design
  • Hue architecture flow/UI interface
  • Introduction to zookeeper concepts
  • Zookeeper principles & usage in Hadoop framework
  • Basics of Zookeeper