Accessing Hadoop Data Using Hive
Hive is a data warehousing tool built on top of Hadoop. Learn how to query and analyze Big Data with this course on Apache Hive.
About This Course
Writing MapReduce programs to analyze Big Data can get complex, and Hive makes querying that data much easier. Apache Hive, first created at Facebook, is a data warehouse system for Hadoop that facilitates data summarization, ad-hoc queries, and the analysis of large datasets stored in Hadoop-compatible file systems. Hive provides a mechanism to project structure onto this data and to query it using a SQL-like language called HiveQL. This course will get you started so that you can use Hive for data warehousing tasks on your Big Data projects.
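To give a feel for what "projecting structure onto data" means, here is a minimal HiveQL sketch. It assumes tab-delimited log files already sit in an HDFS directory; the table name page_views, its columns, and the path /data/page_views are hypothetical and chosen only for illustration.

```sql
-- Hypothetical example: table name, columns, and HDFS path are assumptions.
-- The files under /data/page_views are left where they are; Hive only
-- records a schema that describes how to read them.
CREATE EXTERNAL TABLE IF NOT EXISTS page_views (
  view_time STRING,
  user_id   BIGINT,
  page_url  STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/page_views';

-- An ad-hoc summary query; Hive turns this into jobs on the cluster,
-- so no hand-written MapReduce code is needed.
SELECT page_url, COUNT(*) AS views
FROM page_views
GROUP BY page_url
ORDER BY views DESC
LIMIT 10;
```

The schema is applied when the data is read, so the same files could be described by a different table definition without being rewritten.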
What will I get after passing this course?
You will receive a completion certificate.
Course Syllabus
Lesson 1 – Introduction to Hive
Describe what Hive is, what it is used for, and how it compares to similar technologies
Describe the Hive architecture
Describe the main components of Hive
List interesting ways others are using Hive
Lesson 2 – Hive DDL
Create databases and tables in Hive using a variety of data types
Run a variety of DDL commands
Use Partitioning to improve performance of Hive queries
Create Managed and External tables in Hive
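As a rough preview of the kind of DDL this lesson covers, the sketch below creates a database, a managed partitioned table, and an external table. All names (sales_demo, orders, raw_orders) and the HDFS path are hypothetical, not taken from the course materials.

```sql
-- Hypothetical names throughout; this is a sketch, not course material.
CREATE DATABASE IF NOT EXISTS sales_demo;
USE sales_demo;

-- Managed table: Hive owns the data, so DROP TABLE deletes the files too.
-- Each distinct order_date value is stored in its own subdirectory.
CREATE TABLE IF NOT EXISTS orders (
  order_id BIGINT,
  customer STRING,
  amount   DOUBLE,
  tags     ARRAY<STRING>          -- one of Hive's complex data types
)
PARTITIONED BY (order_date STRING);

-- External table: Hive tracks only the metadata, so DROP TABLE
-- leaves the files under LOCATION in place.
CREATE EXTERNAL TABLE IF NOT EXISTS raw_orders (
  line STRING
)
LOCATION '/data/raw_orders';

-- Filtering on the partition column lets Hive read only the matching
-- partition directory instead of scanning the whole table.
SELECT customer, SUM(amount) AS total
FROM orders
WHERE order_date = '2015-01-01'
GROUP BY customer;
```

Partitioning pays off mainly when queries filter on the partition column; a column with a very large number of distinct values is usually a poor partition key because it produces many small directories.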
Lesson 3 – Hive DML
Load data into Hive
Export data out of Hive
Run a variety of HiveQL DML queries
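The three objectives above might look roughly like the following in HiveQL. The table name, file path, and export directory are hypothetical, and the sketch assumes a comma-delimited file already exists in HDFS at the given path.

```sql
-- Hypothetical table and paths, for illustration only.
CREATE TABLE IF NOT EXISTS daily_orders (
  order_id BIGINT,
  customer STRING,
  amount   DOUBLE
)
PARTITIONED BY (order_date STRING)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ',';

-- Load: moves the HDFS file into the table's partition directory.
-- Hive does not parse or validate the file at load time.
LOAD DATA INPATH '/staging/orders_2015-01-01.csv'
INTO TABLE daily_orders PARTITION (order_date = '2015-01-01');

-- Export: writes the query result as files into an HDFS directory,
-- replacing whatever was there before.
INSERT OVERWRITE DIRECTORY '/exports/daily_totals'
SELECT customer, SUM(amount) AS total
FROM daily_orders
WHERE order_date = '2015-01-01'
GROUP BY customer;

-- Query: an ordinary HiveQL SELECT over the loaded partition.
SELECT order_id, customer, amount
FROM daily_orders
WHERE order_date = '2015-01-01' AND amount > 100
ORDER BY amount DESC
LIMIT 20;
```

Adding the LOCAL keyword (LOAD DATA LOCAL INPATH) would copy the file from the client's local file system instead of moving it within HDFS.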
Lesson 4 – Hive Operators and Functions
Use a variety of Hive Operators in your queries
Utilize Hive’s Built-in Functions
Explain ways to extend Hive functionality