Apache Hue, which stands for 'Hadoop User Experience', is an open-source interactive tool for analyzing and visualizing Hadoop data.
Compared to other Big Data and Hadoop technologies, detailed technical questions on Hue may be asked less frequently. But it is important that you know where and how Hue fits into the Hadoop and Big Data ecosystem, and the key capabilities it provides.
The Apache Hue interview questions below address these topics.
Hue, which stands for 'Hadoop User Experience', is an open-source web interface for analyzing data with Apache Hadoop. Hue is an analytics workbench designed for fast data discovery, intelligent query assistance, and seamless collaboration.
Hue focuses on SQL but also supports job submissions.
Hue is present in some major Hadoop distributions - CDH, HDP and MapR.
Hue primarily focuses on Apache Hive and Apache Impala.
In addition, Hue is also compatible with and supports the following sources.
- Any SQL database - Oracle, MySQL, SparkSQL, Apache Phoenix, Presto, Apache Drill, Apache Kylin, PostgreSQL, Redshift, BigQuery, etc.
- Solr SQL
Apache Hue consists of the following products.
Hue Query Editor - The Hue Query Editor is a user interface with features that make querying data easier and more productive.
Hue Dashboard - The Hue Dashboard is an interactive dashboard for visualizing data quickly and easily.
Hue Scheduler - The Hue Scheduler lets you build workflows and then schedule them to run regularly and automatically.
Apache Hue editor provides the following capabilities.
Importing and Managing Data - The Apache Hue editor provides a convenient user interface that assists in the management of metadata of the data sources. It provides drag & drop features, and table creation and import wizards to import data easily.
Querying Data - The Apache Hue editor provides a powerful autocomplete feature that supports language syntax and highlights any syntax or logical errors. It provides quick previews of datasets, highlights common columns and JOINs, and provides recommendations for optimizing queries. Results can be exported to S3/HDFS/ADLS or downloaded as CSV/Excel.
Apache Hue Dashboards provide an interactive way to explore and visualize data quickly and easily. No programming is required; dashboards are created by dragging and dropping the widgets provided by Hue.
Hue Dashboards provide the following widgets - Text, Timeline, Pie, Line, Bar, Map, Filters, Grid and HTML. These can be combined to generate dynamic dashboards quickly and easily.
Apache Hue provides scheduling features and capabilities that can be used to create workflows and schedule them to run automatically.
Apache Hue provides a workflow editor that can be used to create, update, import and export workflows.
Apache Hue provides a monitoring interface that can be used to check the progress and status of jobs, and to start or pause them.
You configure SQL data sources in Apache Hue by using the configuration file 'hue.ini'. Some examples are listed below:
# Host where Hive Server is running.
# Host where Impala Server is running
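The settings that belong to these comments are not shown above. As a rough sketch, assuming a stock hue.ini layout in which HiveServer2 is configured under the [beeswax] section and Impala under the [impala] section, the entries typically look like the following. The host names and port numbers are placeholder values chosen for illustration, not taken from this article, and should be checked against the hue.ini shipped with your distribution.

```ini
[beeswax]
  # Host where Hive Server is running (placeholder host name).
  hive_server_host=hive-host.example.com
  # Port of the HiveServer2 Thrift service (assumed default).
  hive_server_port=10000

[impala]
  # Host where Impala Server is running (placeholder host name).
  server_host=impala-host.example.com
  # Port of the Impala server (assumed default).
  server_port=21050
```

Hue needs to be restarted after hue.ini is changed for the new settings to take effect.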