Hive Integration

Introduction

Hive integration means two things:
  • Read and write Hive metadata. You can read the schema of existing Hive tables and store the schema of newly created tables in the Hive metastore (a create statement stores the table schema in the Hive metastore).
  • Read and write existing Hive tables. You can run analytics directly on Hive tables or join them with other data sources (a select statement can read a Hive table).

Configuration

  • Set zeppelin.flink.enableHive to true.
  • Set HIVE_CONF_DIR to the folder that contains hive-site.xml.
  • Copy the following dependencies into the Flink lib folder:
    • flink-connector-hive_2.11-{flink.version}.jar
    • flink-hadoop-compatibility_2.11-{flink.version}.jar
    • hive-exec-2.x.jar if you are using Hive 2.x; or hive-exec-1.x.jar, hive-metastore-1.x.jar, libfb303-0.9.2.jar and libthrift-0.9.2.jar if you are using Hive 1.x
  • Set zeppelin.flink.hive.version to the Hive version you are using (only required for Flink 1.10; you do not need to set it for Flink 1.11 or later).
  • Make sure the Hive metastore has been started and is running.
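
Before starting the interpreter, it can help to confirm that the jars listed above are actually present in the Flink lib folder. The following is a minimal sketch of such a check, not part of Zeppelin or Flink. The function names required_jar_patterns and missing_jars are hypothetical, and the wildcard patterns stand in for the unspecified "1.x" and "2.x" version numbers.

```python
from fnmatch import fnmatchcase
from pathlib import Path


def required_jar_patterns(flink_version, hive_major):
    """Return file-name patterns for the jars named in the list above.

    Hypothetical helper: '*' stands in for the unspecified minor versions.
    """
    patterns = [
        f"flink-connector-hive_2.11-{flink_version}.jar",
        f"flink-hadoop-compatibility_2.11-{flink_version}.jar",
    ]
    if hive_major == 2:
        patterns.append("hive-exec-2.*.jar")
    elif hive_major == 1:
        patterns += [
            "hive-exec-1.*.jar",
            "hive-metastore-1.*.jar",
            "libfb303-0.9.2.jar",
            "libthrift-0.9.2.jar",
        ]
    else:
        raise ValueError("this sketch only covers Hive 1.x and 2.x")
    return patterns


def missing_jars(lib_dir, flink_version, hive_major):
    """Return the patterns that no jar file in lib_dir matches."""
    names = [p.name for p in Path(lib_dir).glob("*.jar")]
    return [
        pattern
        for pattern in required_jar_patterns(flink_version, hive_major)
        if not any(fnmatchcase(name, pattern) for name in names)
    ]
```

For example, missing_jars("/path/to/flink/lib", "1.10.0", 2) returns the patterns that still need a matching jar, where the path and version are placeholders for your own installation. An empty list means all of the listed jars were found.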

Query Hive

After completing the configuration above, you can query the HiveCatalog and read Hive data from a Zeppelin note.
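
As a rough illustration of what such a query might look like, the sketch below assembles the text of a Zeppelin paragraph for the Flink batch SQL interpreter (%flink.bsql). The helper hive_query_paragraph and the table name my_hive_table are hypothetical. The sketch also assumes the Hive catalog is already the current catalog once Hive integration is enabled; if it is not, a use catalog statement would be needed first.

```python
def hive_query_paragraph(table, limit=10):
    """Build the text of a %flink.bsql paragraph that reads a Hive table.

    Hypothetical helper: it only formats text and does not contact Hive.
    """
    statements = [
        "show tables",                            # list tables known to the metastore
        f"select * from {table} limit {limit}",   # read rows from an existing Hive table
    ]
    # Each statement is terminated with ';' inside the paragraph.
    return "%flink.bsql\n" + "\n".join(s + ";" for s in statements)


# my_hive_table is a placeholder for a table that exists in your metastore.
print(hive_query_paragraph("my_hive_table"))
```

Pasting the printed text into a note paragraph would list the tables visible through the Hive metastore and then read a few rows from the chosen table.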

Community

Join the Zeppelin community to discuss with other users.