Apache Hive - Interview Questions

What is Hive?

Hive is a data warehousing framework built on Apache Hadoop that enables easy data summarization, ad hoc querying, and analysis of large volumes of data stored in Hadoop.

Hive provides a SQL-like interface (HiveQL) for querying data in Hadoop-based databases and file systems.

Hive is suitable for batch processing and data warehousing tasks. It is not suitable for online transaction processing.

HiveQL queries are implicitly converted into MapReduce, Apache Tez, or Spark jobs, depending on the configured execution engine.
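
As a rough illustration, the engine for a session can be selected with the hive.execution.engine property. The sketch below assumes the chosen engine is installed and configured on the cluster, and that an employees table with an age column exists (as in the examples further down this page).

```sql
-- Show the execution engine currently in effect for this session
SET hive.execution.engine;

-- Switch this session to Tez (other accepted values: mr, spark)
SET hive.execution.engine=tez;

-- Later queries are compiled into jobs for the selected engine
SELECT age, COUNT(*) AS num_employees
FROM employees
GROUP BY age;
```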

What are the key features of Hive?

The following are key features of Hive.

Hive provides a SQL-like interface for querying data in Hadoop-based databases and file systems. Hence, Hive brings SQL-based portability to Hadoop.

Hive supports different storage types, such as plain text files and HBase.

What are the key architectural components of Hive?

The following are the major components of the Hive architecture.

Metastore - Stores the metadata for each of the Hive tables.

Driver - Controller that receives the HiveQL statements.

Compiler - Compiles the HiveQL query to an execution plan.

Optimizer - Performs transformations on the execution plan.

Executor - Executes the tasks after compilation and optimization.

UI - Command-line interface for interacting with Hive.
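
One way to observe the output of the compiler and optimizer is the EXPLAIN statement, which prints the plan for a query without running it. A minimal sketch, assuming an employees table with an age column already exists:

```sql
-- Print the execution plan (stages and operators) instead of running the query
EXPLAIN
SELECT age, COUNT(*) AS num_employees
FROM employees
GROUP BY age;
```

The exact stages shown depend on the Hive version and the configured execution engine.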

How do you create Hive tables using HiveQL?

You can create a Hive table using the DDL 'CREATE TABLE' statement.

hive> CREATE TABLE employees (fname STRING, age INT);
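
When a table will be populated from delimited text files, the statement usually also declares how rows are laid out. A sketch, assuming comma-separated input; the table name employees_csv is hypothetical and used only for illustration:

```sql
-- Hypothetical table for comma-separated text input
CREATE TABLE IF NOT EXISTS employees_csv (fname STRING, age INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE;
```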

How do you create Hive tables that can be partitioned using HiveQL?

You can create a partitioned Hive table using the DDL 'CREATE TABLE... PARTITIONED BY...' statement.

hive> CREATE TABLE employees (fname STRING, lname STRING, age INT) PARTITIONED BY (ds STRING);

How do you list tables in Hive?

You can list Hive tables using the DDL statement 'SHOW TABLES'.

hive> SHOW TABLES;

How do you list columns of a table in Hive?

You can list the columns of a table using the DDL statement 'DESCRIBE'.

hive> DESCRIBE employees;
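
For more than column names and types, the FORMATTED variant can be used. A brief sketch, assuming the employees table from the earlier examples:

```sql
-- Column names and types only
DESCRIBE employees;

-- Columns plus table details such as location, storage format, and partition columns
DESCRIBE FORMATTED employees;
```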

How do you add new columns to a table in Hive?

You can add new columns to a table in Hive using the DDL statement 'ALTER TABLE... ADD COLUMNS...'.

hive> ALTER TABLE employees ADD COLUMNS (lname STRING);

Where does Hive store the table metadata?

By default, Hive stores table metadata in an embedded Apache Derby database.

How do you load data from flat files into Hive?

You can load data from flat files into Hive using the 'LOAD DATA... INTO TABLE' command.

-- Load from the local file system
hive> LOAD DATA LOCAL INPATH './files/employees.txt' OVERWRITE INTO TABLE employees;
-- Load from HDFS
hive> LOAD DATA INPATH './files/employees.txt' OVERWRITE INTO TABLE employees;

How do you load data from flat files into different partitions of a Hive table?

You can load data from flat files into different partitions of a Hive table using the 'LOAD DATA... INTO TABLE... PARTITION...' command.

-- Load from the local file system
hive> LOAD DATA LOCAL INPATH './files/employees.txt' OVERWRITE INTO TABLE employees PARTITION (ds='2008-08-15');
-- Load from HDFS
hive> LOAD DATA INPATH './examples/files/employees.txt' OVERWRITE INTO TABLE employees PARTITION (ds='2008-08-15');
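
After a load like the one above, the partition can be checked and queried. A short sketch, assuming the partitioned employees table (fname, lname, age, partitioned by ds) created earlier on this page:

```sql
-- List the partitions that exist for the table
SHOW PARTITIONS employees;

-- Filter on the partition column so that only the matching partition is read
SELECT fname, lname, age
FROM employees
WHERE ds = '2008-08-15';
```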
 