Virtual Machine

Provision

In the Azure portal, go to Virtual Machines, create a new VM, and choose these settings:

  • Availability options: Not required
  • Image: Ubuntu Server 24.04 LTS
  • Size: Standard_B1s
  • Disks: OS disk size of 30 GB, disk type Standard HDD

Important: You must select an existing Azure SSH key or provide a public SSH key during creation.

Important: A network security group must be applied to your VM, allowing TCP connections from your IP address to port 4242.

Note: You can use a more powerful machine (more CPU or memory in particular).

Note: You can use another OS, as long as Java 17 can be installed; this setup is only an example.
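For readers who prefer the command line, the settings above can be approximated with the Azure CLI. This is a sketch under assumptions, not part of the original procedure: the resource group, VM, and NSG names, the admin user, the key path, and the IP address are placeholders, and the `Ubuntu2404` image alias is assumed to be available in your Azure CLI version. By default the script only prints the commands; set DRY_RUN=0 to run them.

```shell
#!/bin/sh
# Sketch only: every name below is a placeholder, not a value from this guide.
set -e

RG="datagen-rg"        # hypothetical resource group (must already exist)
VM="datagen-vm"        # hypothetical VM name
NSG="datagen-nsg"      # hypothetical network security group name
MY_IP="203.0.113.10"   # replace with your own public IP address

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Create the VM: Standard_B1s, 30 GB OS disk on Standard HDD (Standard_LRS).
run az vm create \
  --resource-group "$RG" \
  --name "$VM" \
  --image Ubuntu2404 \
  --size Standard_B1s \
  --os-disk-size-gb 30 \
  --storage-sku Standard_LRS \
  --admin-username azureuser \
  --ssh-key-values ~/.ssh/id_rsa.pub \
  --nsg "$NSG"

# Allow inbound TCP on port 4242 from your IP only.
# Priority 1010 is chosen to avoid clashing with the default SSH rule (1000).
run az network nsg rule create \
  --resource-group "$RG" \
  --nsg-name "$NSG" \
  --name allow-datagen-ui \
  --priority 1010 \
  --direction Inbound \
  --access Allow \
  --protocol Tcp \
  --source-address-prefixes "$MY_IP" \
  --destination-port-ranges 4242
```

Reviewing the printed commands before setting DRY_RUN=0 is a cheap way to catch a wrong name or IP address.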

Installation

Then SSH into the VM.

1. Install Java:

sudo apt update
sudo apt install -y openjdk-17-jre

2. Download & extract Datagen:

wget https://datagen-repo.s3.eu-west-3.amazonaws.com/1.0.0/standalone/datagen-standalone-files.tar.gz
tar -xvzf datagen-standalone-files.tar.gz
cd datagen_standalone-1.0.0/

3. Launch it:

./launch.sh \
  --min-mem=512M \
  --max-mem=1G \
  --log-dir=/tmp/datagen/ \
  --load-default-models=false
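If the launch fails, one thing worth checking is the Java version installed in step 1. The snippet below is a minimal sketch: `java_major` is a hypothetical helper (not part of Datagen) that extracts the major version from `java -version` output.

```shell
# Extract the major version from `java -version` output,
# e.g. 'openjdk version "17.0.12" 2024-07-16' gives 17.
java_major() {
  printf '%s\n' "$1" | sed -n 's/.*version "\([0-9][0-9]*\).*/\1/p'
}

if command -v java >/dev/null 2>&1; then
  # java -version writes to stderr, so redirect it to stdout
  major="$(java_major "$(java -version 2>&1)")"
  if [ "$major" = "17" ]; then
    echo "Java 17 found"
  else
    echo "This guide assumes Java 17 (found: ${major:-unknown})" >&2
  fi
else
  echo "java not found on PATH" >&2
fi
```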

Access

Access the UI over HTTP, using the VM's full hostname or public IP and port 4242, for example: http://20.61.59.19:4242/

Use admin/admin as user/password to log in and start generating data.
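If the page does not load in a browser, a quick reachability check from your own machine can help narrow things down. The helpers below (`datagen_url` and `check_ui`) are hypothetical names introduced for this sketch, and it assumes curl is installed locally.

```shell
# Build the UI URL from a host and an optional port (4242 by default).
datagen_url() {
  printf 'http://%s:%s/\n' "$1" "${2:-4242}"
}

# Print the HTTP status code returned by the UI.
# curl reports 000 when no connection could be made within 5 seconds.
check_ui() {
  url="$(datagen_url "$@")"
  code="$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$url" || true)"
  echo "$url -> HTTP $code"
}

# Example with the IP used in this guide (replace with your VM's public IP):
# check_ui 20.61.59.19
```

A result of 000 usually points to the network security group rule or to Datagen not running, rather than to the login itself.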

Custom Configuration

To launch it later as a background process, add the option --launch-with-nohup=true, for example:

./launch.sh \
  --min-mem=512M \
  --max-mem=1G \
  --log-dir=/tmp/datagen/ \
  --load-default-models=false \
  --launch-with-nohup=true
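Because a background launch returns to the prompt immediately, it can be useful to confirm on the VM that something is listening on port 4242. This is a sketch assuming the `ss` tool from iproute2 is available (it is on stock Ubuntu); `port_listening` is a hypothetical helper, not a Datagen command.

```shell
# Succeed if a TCP socket is listening on the given local port.
port_listening() {
  # Column 4 of `ss -ltn` is "Local Address:Port"
  ss -ltn 2>/dev/null | awk -v p=":$1" '$4 ~ p"$" { found = 1 } END { exit !found }'
}

if port_listening 4242; then
  echo "A process is listening on port 4242"
else
  echo "Nothing is listening on port 4242 yet; check the logs in /tmp/datagen/" >&2
fi
```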