Hadoop Distributed File System
Hadoop Distributed File System (HDFS) is a distributed file system that stores and manages large data sets on commodity hardware.
If you select Hadoop Distributed File System (HDFS) from the Storage section, configure the following details:
Field | Description |
---|---|
Display name | Enter the name to be displayed. |
Thrift URI | Enter the Thrift URI. |
Thrift Port | Enter the Thrift port. |
Kerberos authentication | Select the Kerberos authentication checkbox for a secure connection. Then enter the HDFS principal, Hive client principal, and Hive server principal, and upload the Kerberos config file (.config), HDFS keytab file (.keytab), and Hive keytab file (.keytab). |
Upload core site file (.xml) | Upload the Hadoop core site configuration file (core-site.xml). |
Upload HDFS site file (.xml) | Upload the HDFS site configuration file (hdfs-site.xml). |
Associate catalog | Add a catalog for your storage. The catalog is associated with your storage and serves as the query interface for the data stored within it. |
Catalog type | The only supported catalog type is Apache Hive. |
Catalog name | Enter the name of your catalog. |
Create | Click Create to create the storage. |
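The core site and HDFS site files referenced above are the standard Hadoop client configuration files. A minimal sketch of each is shown below; the hostname `namenode.example.com` and the port numbers are placeholders for your own cluster's values:

```xml
<!-- core-site.xml: points clients at the HDFS NameNode -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
```

```xml
<!-- hdfs-site.xml: client-side HDFS settings, e.g. the replication factor -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
```

For the Thrift URI and port fields, a Hive metastore endpoint commonly takes the form `thrift://<metastore-host>:9083`, where 9083 is the default Hive metastore port; confirm the actual host and port with your Hadoop administrator.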