Compute Engine - VM Instance
Provision
In the GCP console, go to Compute Engine > VM Instances > Create Instance and choose these settings:
- Machine Configuration: E2 > e2-micro
- Boot Disk: Debian GNU/Linux 12 (bookworm)
- Volume: Default of 10GB
Important: You must be able to SSH into this machine, either through browser SSH, gcloud, or ssh with OS Login or metadata keys.
Important: You must have a firewall rule that allows TCP connections from your IP to port 4242, applied to the VM's network using network tags.
Note: You can use a more powerful machine type (especially more CPU or memory).
Note: You can use another OS, as long as Java 17 can be installed; this is just an example.
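The console settings and the firewall requirement above can also be expressed with the gcloud CLI. The following is a minimal sketch, not part of the original instructions: the instance name, zone, network tag, rule name and the `default` network are assumptions, and the source IP is a placeholder from a documentation range. The commands are wrapped in functions so they can be reviewed before anything is created.

```shell
# Hypothetical names and values: adjust them to your own project.
VM_NAME="datagen-vm"
ZONE="europe-west1-b"
NETWORK_TAG="datagen"
MY_IP="203.0.113.10"   # placeholder: replace with your own public IP

# Creates an e2-micro VM on Debian 12 with a 10 GB boot disk and a network tag.
create_datagen_vm() {
  gcloud compute instances create "$VM_NAME" \
    --zone="$ZONE" \
    --machine-type=e2-micro \
    --image-family=debian-12 \
    --image-project=debian-cloud \
    --boot-disk-size=10GB \
    --tags="$NETWORK_TAG"
}

# Allows TCP 4242 from a single IP to every VM carrying the network tag.
# Assumes the VM is attached to the "default" network.
allow_datagen_port() {
  gcloud compute firewall-rules create allow-datagen-ui \
    --network=default \
    --direction=INGRESS \
    --action=ALLOW \
    --rules=tcp:4242 \
    --source-ranges="${MY_IP}/32" \
    --target-tags="$NETWORK_TAG"
}

# Once gcloud is authenticated and a project is selected:
#   create_datagen_vm && allow_datagen_port
#   gcloud compute ssh "$VM_NAME" --zone="$ZONE"
```

Restricting the source range to a single /32 address keeps the UI port closed to the rest of the internet.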
Installation
SSH into the VM, then run the following steps.
1. Install Java:
sudo apt-get update
sudo apt-get install -y openjdk-17-jre
2. Download & extract Datagen:
wget https://datagen-repo.s3.eu-west-3.amazonaws.com/1.0.0/standalone/datagen-standalone-files.tar.gz
tar -xvzf datagen-standalone-files.tar.gz
cd datagen_standalone-1.0.0/
3. Launch it:
./launch.sh \
--min-mem=512M \
--max-mem=1G \
--log-dir=/tmp/datagen/ \
--load-default-models=false
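If the launch step fails, one thing worth checking is whether Java 17 or later is the version found on the PATH. The helper below is an illustrative sketch (the function names are made up for this example); it parses the first version line printed by `java -version`.

```shell
# Reads `java -version` style text on stdin and prints the major version.
# Legacy "1.8.0" style versions print 1, which still fails the check below.
java_major_version() {
  sed -n 's/.*version "\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Succeeds when the java found on the PATH reports major version 17 or later.
check_java() {
  major="$(java -version 2>&1 | java_major_version)"
  if [ "${major:-0}" -ge 17 ]; then
    echo "Java ${major} found"
  else
    echo "Java 17 or later not found on PATH" >&2
    return 1
  fi
}
```

Running `check_java` on the VM after step 1 should print the detected major version.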
Access
Access the UI over HTTP, using the VM's public IP (or full hostname) and port 4242, for example: http://34.1.5.173:4242/
Use admin/admin as the user/password to log in and start generating data.
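Before opening a browser, it can be useful to confirm from your own machine that the port answers at all, since a timeout usually points at the firewall rule rather than at Datagen. This is a small sketch with hypothetical helper names; it only checks that an HTTP response comes back and does not log in.

```shell
# Builds the UI URL from a host or IP and an optional port (default 4242).
datagen_url() {
  printf 'http://%s:%s/\n' "$1" "${2:-4242}"
}

# Polls the URL until curl gets any HTTP response, or gives up after N tries.
wait_for_ui() {
  url="$1"
  tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -s -o /dev/null --max-time 5 "$url"; then
      echo "UI reachable at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "UI not reachable at $url" >&2
  return 1
}
```

For example, `wait_for_ui "$(datagen_url 34.1.5.173)"` polls the example address used above.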
Custom Configuration
To launch it later as a background process, add the option --launch-with-nohup=true, for example:
./launch.sh \
--min-mem=512M \
--max-mem=1G \
--log-dir=/tmp/datagen/ \
--load-default-models=false \
--launch-with-nohup=true
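When Datagen runs in the background there is no console output to watch, so a quick way to see whether it came up is to check the port and the log directory. The helpers below are a sketch under assumptions: the function names are hypothetical, the log file names inside /tmp/datagen/ are not documented here (so the newest file is picked by modification time), and `ss` is assumed to be available, as it is on a default Debian 12 image.

```shell
# Succeeds if something is listening on the given TCP port (default 4242).
is_listening() {
  port="${1:-4242}"
  ss -ltn "sport = :$port" | grep -q LISTEN
}

# Prints the path of the most recently modified file in the log directory.
latest_log() {
  dir="${1:-/tmp/datagen}"
  newest="$(ls -t "$dir" 2>/dev/null | head -n 1)"
  [ -n "$newest" ] && printf '%s/%s\n' "${dir%/}" "$newest"
}
```

For example, `is_listening 4242 && tail -f "$(latest_log /tmp/datagen/)"` follows the newest log once the port is open.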