
Architecture of Hive

Hive is a Hadoop data warehouse infrastructure tool for processing structured data. It resides on top of Hadoop to summarize Big Data and facilitates querying and analyzing.
Hive Characteristics:
- It is intended for OLAP.
- HiveQL or HQL is a querying language that is similar to SQL.
- It is well-known, quick, scalable, and extensible.
- It keeps the schema in a database and the processed data in HDFS.
User Interface:
Hive provides various interfaces to cater to the diverse needs of users. These interfaces include:
- Hive Web UI: This web-based interface offers a user-friendly way to interact with Hive, making it accessible to those who prefer a graphical approach.
- Hive Command Line: For those who are more inclined towards a command-line experience, Hive offers a command-line interface that allows users to issue queries and commands directly.
- Hive HD Insight (In Windows Server): This is an interface tailored for Windows users, ensuring a seamless experience when working with Hive on Windows-based systems.
Meta Store:
Hive’s Meta Store is where the magic begins. It relies on chosen database servers to house the essential metadata. This metadata includes information about tables, databases, columns within tables, their respective data types, and the crucial mapping to HDFS.
In other words, it’s the brain behind the structure of your data, ensuring everything is organized and accessible.
Architecture of Hive
HiveQL Process Engine:
Hive Query Language (HiveQL) takes center stage when it comes to interacting with the Meta Store. Think of HiveQL as a cousin of SQL, designed specifically for querying metadata stored in the Meta Store.
Execution Engine:
Here’s where the magic happens. The Execution Engine is the point where HiveQL and MapReduce seamlessly blend together. This engine processes your queries and generates results, much like MapReduce. It leverages the power of MapReduce under the hood, ensuring your data processing tasks are executed efficiently and effectively.
HDFS or HBase:
When it comes to data storage, Hive provides you with options. You can store your data in the Hadoop Distributed File System (HDFS) or opt for HBase, another robust storage technique. HDFS is well-suited for large-scale batch processing, while HBase is designed for real-time, NoSQL-style data storage.