Hadoop Distributed File System

Hadoop Distributed File System (HDFS) is a distributed file system that manages large data sets and runs on commodity hardware.
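
For context, HDFS presents a standard file system interface to applications through the Hadoop FileSystem API. The following is a minimal sketch in Java that lists the HDFS root directory, assuming a hypothetical NameNode at namenode.example.com on port 8020; it only illustrates HDFS as a file system and is not part of the registration steps that follow.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsListExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Hypothetical NameNode endpoint; replace with your cluster's fs.defaultFS value.
            conf.set("fs.defaultFS", "hdfs://namenode.example.com:8020");

            try (FileSystem fs = FileSystem.get(conf)) {
                // List the entries directly under the HDFS root directory.
                for (FileStatus status : fs.listStatus(new Path("/"))) {
                    System.out.println(status.getPath() + "\t" + status.getLen() + " bytes");
                }
            }
        }
    }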

If you select Hadoop Distributed File System (HDFS) from the Storage section, configure the following details:

Register bucket

Display name: Enter the name to be displayed.
Thrift URI: Enter the Thrift URI.
Thrift Port: Enter the Thrift port.
Kerberos authentication: Select the Kerberos authentication checkbox for a secure connection. If you enable it:
    a. Enter the following information:
        i. HDFS principal
        ii. Hive client principal
        iii. Hive server principal
    b. Upload the following files:
        i. Kerberos config file (.config)
        ii. HDFS keytab file (.keytab)
        iii. Hive keytab file (.keytab)
    Example principal and configuration file formats are shown after this field list.
Upload core site file (.xml): Upload the core site file (core-site.xml). An example is shown after this field list.
Upload HDFS site file (.xml): Upload the HDFS site file (hdfs-site.xml). An example is shown after this field list.
Associate catalog: Add a catalog for your storage. This catalog is associated with your storage and serves as your query interface for the data stored within it.
Catalog type: The supported catalog is Apache Hive.
Catalog name: Enter the name of your catalog.
Create: Click Create to register the storage.
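
If Kerberos authentication is enabled, the HDFS and Hive principals are typically Kerberos service principals in the form service/host@REALM, and the Kerberos config file is a standard krb5-style configuration file. The following is a minimal sketch, assuming a hypothetical realm EXAMPLE.COM, a hypothetical KDC host kdc.example.com, and a hypothetical cluster host hdp1.example.com; confirm the actual principals, realm, and keytab files with your cluster administrator.

    # Hypothetical principal values (service/host@REALM format)
    HDFS principal:         hdfs/hdp1.example.com@EXAMPLE.COM
    Hive client principal:  hive/hdp1.example.com@EXAMPLE.COM
    Hive server principal:  hive/hdp1.example.com@EXAMPLE.COM

    # Minimal krb5-style Kerberos config file for the hypothetical realm
    [libdefaults]
        default_realm = EXAMPLE.COM

    [realms]
        EXAMPLE.COM = {
            kdc = kdc.example.com
            admin_server = kdc.example.com
        }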
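
The core site and HDFS site files are the standard Hadoop client configuration files core-site.xml and hdfs-site.xml. The following is a minimal sketch of what each file might contain, assuming a hypothetical NameNode at namenode.example.com on port 8020; in practice, upload the configuration files generated by your own Hadoop cluster.

    <!-- core-site.xml: identifies the default file system (the HDFS NameNode) -->
    <configuration>
      <property>
        <name>fs.defaultFS</name>
        <value>hdfs://namenode.example.com:8020</value>
      </property>
    </configuration>

    <!-- hdfs-site.xml: HDFS-specific client settings -->
    <configuration>
      <property>
        <name>dfs.namenode.rpc-address</name>
        <value>namenode.example.com:8020</value>
      </property>
      <property>
        <name>dfs.replication</name>
        <value>3</value>
      </property>
    </configuration>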