Auto Configuration with Auto-Discovery
At startup, Datagen loads application.properties into memory and then runs a step called auto-discovery.
When deployed in CDP with cm.autodiscovery set to true, Datagen auto-discovers ALL services through the CM API, so the user does not need to specify anything.
It is also possible to rely on the fields described as AUTO-DISCOVERY, which are filled in by default.
These fields reference configuration files such as hdfs-site.xml, hive-site.xml, etc. Datagen parses these files and automatically configures every other field it can.
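To illustrate the configuration-file path, here is a minimal Python sketch of looking up a property in a Hadoop-style site file. It is not Datagen's own implementation (Datagen is a Java application); the function name `read_site_property` and the inline sample document are hypothetical, and the sample values are borrowed from the log excerpt further down.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample following the usual <configuration>/<property> layout
# of Hadoop-style *-site.xml files.
SAMPLE_HBASE_SITE = """<configuration>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>server_zk1,server_zk_2,server_zk_3</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
</configuration>
"""


def read_site_property(xml_text, property_name):
    """Return the <value> of the named <property>, or None if it is absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == property_name:
            return prop.findtext("value")
    return None


if __name__ == "__main__":
    print(read_site_property(SAMPLE_HBASE_SITE, "hbase.zookeeper.quorum"))
```

A lookup for a name that is not in the file returns None, which a caller could treat as "leave this field for the user to set".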
Once Datagen has started, its web server logs contain lines such as the following:
If using CM auto-discovery:
9:35:04.701 AM INFO PropertiesLoader [main] Going to auto-discover hbase.zookeeper.quorum with CM API
9:35:04.701 AM INFO PropertiesLoader [main] Going to auto-discover hbase.zookeeper.port with CM API
If using configuration-file auto-discovery instead of CM auto-discovery:
2022-10-13 13:33:49,220 INFO [main] com.cloudera.frisch.randomdatagen.config.PropertiesLoader: Going to auto-discover hbase.zookeeper.quorum
2022-10-13 13:33:49,222 DEBUG [main] com.cloudera.frisch.randomdatagen.Utils: Return value: server_zk1,server_zk_2,server_zk_3 from file: dev-support/test_files/hbase-site.xml for property: hbase.zookeeper.quorum
2022-10-13 13:33:49,222 INFO [main] com.cloudera.frisch.randomdatagen.config.PropertiesLoader: Going to auto-discover hbase.zookeeper.port
2022-10-13 13:33:49,223 DEBUG [main] com.cloudera.frisch.randomdatagen.Utils: Return value: 2181 from file: dev-support/test_files/hbase-site.xml for property: hbase.zookeeper.property.clientPort
Using a specific Cloudera Manager for Auto-Discovery
When running the program locally, or outside a CDP-managed environment, you can still specify a CM server and use it for auto-discovery to configure services automatically.
The following configuration properties are required:
- cm.autodiscovery=true
- cm.url= : URL to reach CM in the form of: https://:
- cm.user= : user that can read configuration from CM
- cm.password= : password of that user
- cm.cluster.name= : name of the cluster in CM that will be used