Node Wizard Installation
Node Wizard is a node-level component that must be installed on each VM host as a node agent. For a host to be managed by a control plane (Cluster Wizard or Node Client), install Node Wizard on the VM host, then register it with the control plane. Registration links the host to the control plane and establishes the identity and network information used for ongoing management and communication.
Node Wizard Requirements
- Supported Host OSes are Ubuntu 20.04, 22.04, 24.04, Rocky Linux 9.2, RHEL 9.2 and SLES 15 SP7
- Network Bridge
- Node Wizard assumes a network bridge exists on the host. If one does not exist, create it before Node Wizard installation.
- See the Tutorials for Network Bridge for more information.
- This bridge requires Internet access and is used for the network configuration of VMs.
- Network connectivity to the selected control plane (Cluster Wizard or Node Client), depending on your deployment
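The host OS requirement can be checked before installation with a small helper. The sketch below is only an illustration and is not part of Node Wizard: the function name is_supported_host_os is hypothetical, and it assumes the host's /etc/os-release reports the ID and VERSION_ID values shown in the comments (for example, sles and 15.7 for SLES 15 SP7).

```shell
# Hypothetical helper: succeed when ID / VERSION_ID (as found in
# /etc/os-release) match one of the supported host OSes listed above.
is_supported_host_os() {
  case "$1:$2" in
    ubuntu:20.04|ubuntu:22.04|ubuntu:24.04) return 0 ;;
    rocky:9.2|rhel:9.2) return 0 ;;
    sles:15.7) return 0 ;;   # assumed VERSION_ID for SLES 15 SP7
    *) return 1 ;;
  esac
}

# Typical use on a host:
#   . /etc/os-release
#   is_supported_host_os "$ID" "$VERSION_ID" && echo "supported"
```

The auto deployment script performs its own OS detection, so a check like this is only useful as an early sanity test.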
Node Wizard creates a permanent identity for each node at registration time.
- After registration, do not clone this system or move its operating system to another machine. Doing so will cause identity and authentication failures and may require the Node Wizard to be re-registered.
- Node Wizard data stored on each node needs to persist across reboots and updates. Do not make any changes to the installation directory.
- If the IP address assigned to the bridge changes after registration, the node will be unable to reconnect until its configuration is updated in Cluster Wizard or Node Client.
Install Node Wizard as system service
- On the host, download the auto deployment script from Cluster Wizard Product Deployments.
wget https://raw.githubusercontent.com/corespeq-cw/deployments/refs/heads/main/scripts/auto_deploy_software.sh
- Install the Node Wizard systemd service by running the downloaded script, auto_deploy_software.sh. It detects the host OS, then downloads and installs Node Wizard accordingly.
bash auto_deploy_software.sh node-wizard 0.5.2
Alternatively, you can download a release tarball and install Node Wizard manually. The file naming convention is node-wizard-<version>-<distribution>.tgz.
The steps are:
- Download a tarball from Node Wizard Downloads
- Extract the tarball
# assume Host OS is Ubuntu 22.04
tar -xvf node-wizard-0.5.2-ubuntu22.tgz
- Run the included installer, deploy-node-wizard.sh. See the included README or use -h for more information.
cd node-wizard-0.5.2
sudo ./deploy-node-wizard.sh
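The tarball naming convention mentioned above, node-wizard-<version>-<distribution>.tgz, can be split into its parts with plain shell parameter expansion. This is a minimal sketch; the function name parse_node_wizard_tarball is hypothetical and it only assumes that the version itself contains no hyphen.

```shell
# Split a release tarball name of the form
# node-wizard-<version>-<distribution>.tgz into its two parts.
# Prints "<version> <distribution>", or fails if the name does not match.
parse_node_wizard_tarball() {
  local name rest
  name=${1##*/}                 # drop any leading directory
  case "$name" in
    node-wizard-*-*.tgz) ;;
    *) return 1 ;;
  esac
  rest=${name#node-wizard-}     # e.g. 0.5.2-ubuntu22.tgz
  rest=${rest%.tgz}             # e.g. 0.5.2-ubuntu22
  printf '%s %s\n' "${rest%%-*}" "${rest#*-}"
}
```

For example, parse_node_wizard_tarball node-wizard-0.5.2-ubuntu22.tgz prints the version followed by the distribution tag, which helps confirm that a downloaded file matches the host OS before extracting it.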
Once the installation begins, you will be asked to specify several settings for VMs that run on this host. From version 0.5.0, the deployment script automatically detects and lists the available options for the gateway IP address, name servers, network bridge, and virsh pool for Ceph RBD. Below is an example of gateway selection.
Select a gateway for configuring vm's network:
1) 192.168.128.111
2) Custom
#? 1
Chosen gateway: 192.168.128.111
The settings requested are:
- A gateway IP address
- See Basic VM Network Settings for steps on obtaining gateway IP addresses from VM host.
- A list of name servers (DNS)
- See Basic VM Network Settings for information on obtaining DNS IP addresses from the VM host.
- The name of a network bridge to be used as the default network for VMs on this host
- At least one network bridge must exist on the host.
- If multiple bridges are available, the selected bridge will be used by default for newly created VMs unless explicitly overridden during VM creation.
- See Basic VM Network Settings for help listing available network bridges.
- If a bridge is not available, see Tutorials for Network Bridge for guidance on creating network bridges.
- Virsh storage pool name configured for Ceph RBD pool (optional)
- See Ceph Storage Pool for more information about listing virsh pools.
- If a virsh storage pool for Ceph is not available and a Ceph storage service is available, see Ceph Basics for Cluster Wizard for steps on creating a virsh RBD storage pool.
The network configuration provided here will be used for VMs only and will not modify the host machine’s network settings.
A token is printed after installation. It is required to register this VM host as a node with the selected control plane (Cluster Wizard or Node Client). To retrieve the token again after installation, run the following command.
sudo /root/bin/node_wizard/node-wizard --token
Verify Node Wizard is running
- To verify the installation of Node Wizard, check that the systemd service is in the active (running) state.
sudo systemctl status node-wizard.service
Once this is verified, the VM host is ready to be registered with Cluster Wizard or Node Client using the CLI or WebUI.
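The status check above can also be scripted. The following sketch reads systemctl status text and extracts the state word from the "Active:" line; the function name service_active_state is hypothetical, and the only assumption is the "Active: <state> ..." line layout seen in the sample outputs in the Troubleshooting section.

```shell
# Read `systemctl status` text on stdin and print the state word from the
# "Active:" line, e.g. "active" or "failed". Prints nothing if the line
# is missing.
service_active_state() {
  awk '$1 == "Active:" { print $2; exit }'
}

# Typical use on a host:
#   sudo systemctl status node-wizard.service | service_active_state
```

When only a yes/no answer is needed, systemctl is-active node-wizard.service gives the same information directly through its exit status.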
Node Wizard location
- The Node Wizard binary is located at /root/bin/node_wizard/node-wizard.
GuestOS Setup
ISO management commands through the control plane (Cluster Wizard or Node Client), including listing ISO files and downloading them directly on a node, are available from version 0.5.0.
- To create a VM with one of the following guest OSes, place the corresponding installer ISO file in the /storage/VM/iso directory:
GuestOS : Installer ISO file
- RedHat 9.2 : rhel-9.2-x86_64-dvd.iso
- RedHat 9.5 : rhel-9.5-x86_64-dvd.iso
- RedHat 10.0 : rhel-10.0-x86_64-dvd.iso
- Rocky Linux 9.2 : Rocky-9.2-x86_64-dvd.iso
- Rocky Linux 9.5 : Rocky-9.5-x86_64-dvd.iso
- Rocky Linux 10.0 : Rocky-10.0-x86_64-dvd1.iso
- SUSE Linux 15 SP5 : SLE-15-SP5-Full-x86_64-GM-Media1.iso
- SUSE Linux 15 SP6 : SLE-15-SP6-Full-x86_64-GM-Media1.iso
- SUSE Linux 15 SP7 : SLE-15-SP7-Full-x86_64-GM-Media1.iso
- Ubuntu 20.04 : ubuntu-20.04.4-live-server-amd64.iso
- Ubuntu 22.04 : ubuntu-22.04.2-live-server-amd64.iso
- Ubuntu 24.04 : ubuntu-24.04.1-live-server-amd64.iso
- Windows Server 2019: Windows_Server2019.iso / virtio-win.iso
- To download these files and verify their checksums, see the GuestOS Installers download page.
Basic VM Network Settings
An easy way to configure VM network settings is to reuse the network configuration from the VM host. This guide shows how to retrieve the required network values from a basic VM host setup.
Gateway IP address
The gateway IP address currently used by the VM host. On the VM host, run:
ip route show | grep default | head -1 | cut -d" " -f3
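The same lookup can be written as a small filter that is easier to reuse in scripts. This is a sketch, not the method used by the deployment script; the function name default_gateway is hypothetical, and it assumes the usual "default via <gateway> dev <interface>" line format of ip route show.

```shell
# Read `ip route show` output on stdin and print the gateway of the first
# default route.
default_gateway() {
  awk '$1 == "default" && $2 == "via" { print $3; exit }'
}

# Typical use on a host:
#   ip route show | default_gateway
```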
Nameserver (DNS) IP address
The DNS nameserver IP addresses currently used by the VM host. On the VM host, run:
# Ubuntu
resolvectl status | grep "Current DNS" | head -1 | awk '{print $4}'
# Rocky Linux / RHEL
nmcli dev show | grep DNS | cut -d ":" -f2 | sed 's/^[ \t]*//'
# SLES
cat /etc/resolv.conf | grep nameserver | cut -d " " -f2
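For resolv.conf-style files, a slightly stricter filter avoids picking up commented-out entries. The sketch below is an illustration only; resolv_nameservers is a hypothetical name. Note that on hosts using systemd-resolved, /etc/resolv.conf may list only the local stub address 127.0.0.53, which is why the Ubuntu command above queries resolvectl instead.

```shell
# Print every nameserver address from resolv.conf-style text on stdin,
# ignoring commented-out lines.
resolv_nameservers() {
  awk '$1 == "nameserver" { print $2 }'
}

# Typical use on a host:
#   resolv_nameservers < /etc/resolv.conf
```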
Network Bridges
List available network bridges on the VM host by running:
ip link show type bridge
If a bridge is not available, see Tutorials for Network Bridge for guidance on creating network bridges.
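To get only the bridge names, for example to paste one into the installer prompt, the one-line output mode of ip can be filtered. This is a sketch under the assumption that ip -o link prints one interface per line in the form "<index>: <name>: <flags> ..."; the function name bridge_names is hypothetical.

```shell
# Read `ip -o link show type bridge` output on stdin (one interface per
# line, e.g. "3: br0: <BROADCAST,...> mtu 1500 ...") and print the names.
bridge_names() {
  awk -F': ' 'NF > 1 { print $2 }'
}

# Typical use on a host:
#   ip -o link show type bridge | bridge_names
```

An empty result means no bridge exists yet, in which case one must be created before installing Node Wizard.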
Ceph Storage Pool
List storage pools on the VM host and verify whether any are configured for Ceph RBD. On the VM host, run:
virsh pool-list --all
If a virsh storage pool for Ceph is not available and a Ceph storage service is available, see Ceph Basics for Cluster Wizard for steps on creating a virsh RBD storage pool.
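virsh pool-list shows pool names and states but not the pool type, so deciding whether a pool is backed by Ceph RBD requires looking at its definition. The sketch below checks the XML printed by virsh pool-dumpxml; the function name is_rbd_pool_xml and the pool name in the usage comment are hypothetical.

```shell
# Succeed if the pool XML on stdin (as printed by `virsh pool-dumpxml`)
# declares an RBD pool, i.e. its root element is <pool type='rbd'>.
is_rbd_pool_xml() {
  grep -Eq "<pool[[:space:]]+type=['\"]rbd['\"]"
}

# Typical use on a host (replace my-pool with a name from pool-list):
#   virsh pool-dumpxml my-pool | is_rbd_pool_xml && echo "my-pool is RBD"
```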
Troubleshooting
Quick Diagnostic Commands
Run these first to quickly assess the situation:
# Check if node-wizard is running
systemctl status node-wizard
# Check for logs
cat /var/log/node-wizard-err.log
cat /var/log/node-wizard-out.log
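The log files can grow long, so it often helps to look only at the most recent error lines. This is a minimal sketch that assumes the "<date> <time> ERROR : <message>" line format seen in the examples in this section; the function name last_errors is hypothetical.

```shell
# Print the last N (default 5) ERROR lines from a Node Wizard log file.
last_errors() {
  grep ' ERROR : ' "$1" | tail -n "${2:-5}"
}

# Typical use on a host:
#   last_errors /var/log/node-wizard-err.log 10
```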
Common Problems
Corrupted Node Wizard tarball file
- Bash output of tar -xvf node-wizard-0.5.2-<distribution>.tgz is one of:
- gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
- gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now
- gzip: stdin: invalid compressed data--crc error
gzip: stdin: invalid compressed data--length error
tar: Child returned status 1
tar: Error is not recoverable: exiting now
- tar: This does not look like a tar archive
tar: Skipping to next header
tar: A lone zero block at 11
tar: Exiting with failure status due to previous errors
- Cause: The node-wizard.tgz file is corrupted.
- Verify:
- Verify the file type matches expectations:
file node-wizard-latest-ubuntu22.tgz
node-wizard-latest-ubuntu22.tgz: gzip compressed data...
- Check compression integrity (no output indicates success):
gzip -t node-wizard-latest-ubuntu22.tgz
- Solution: Download Node Wizard tarball again or use the auto deployment script in Install Node Wizard as system service instead.
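The two verification steps can be combined into one pre-extraction check. The following is a sketch rather than part of the Node Wizard tooling; the function name check_tarball is hypothetical, and it relies only on gzip -t and tar -tzf, which test the compressed stream and list the archive without extracting it.

```shell
# Return 0 only if the file is a gzip stream that decompresses cleanly and
# contains a readable tar archive; otherwise report which check failed.
check_tarball() {
  if ! gzip -t "$1" 2>/dev/null; then
    echo "gzip check failed: $1" >&2
    return 1
  fi
  if ! tar -tzf "$1" >/dev/null 2>&1; then
    echo "tar listing failed: $1" >&2
    return 2
  fi
}

# Typical use:
#   check_tarball node-wizard-0.5.2-ubuntu22.tgz && tar -xvf node-wizard-0.5.2-ubuntu22.tgz
```

A failure at either step points to an incomplete or damaged download, so downloading the file again is the appropriate next step.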
Corrupted RHEL subscription
- Bash output of auto_deploy_software.sh contains:
Retrieving all needed dependencies
Update package list
Installing This...
Error: Failed to install This.
- Cause: The deployment script cannot retrieve the required dependencies.
- Verify:
- Verify the subscription using:
subscription-manager status
- Solution: Reset and re-register your RHEL system with subscription-manager:
subscription-manager clean
subscription-manager register
Missing env_node.json file
- Systemctl Output:
× node-wizard.service - Run node_wizard for kvm functionality expose
Loaded: loaded (/etc/systemd/system/node-wizard.service; enabled; preset: enabled)
Active: failed (Result: exit-code) since Mon 2025-09-22 23:04:27 CEST; 7min ago
Duration: 42ms
Process: 268436 ExecStart=/root/bin/node_wizard/node-wizard (code=exited, status=1/FAILURE)
Main PID: 268436 (code=exited, status=1/FAILURE)
CPU: 26ms
- Error Logs:
2025/09/22 23:04:27 ERROR : open /root/bin/node_wizard/env_node.json: no such file or directory
2025/09/22 23:04:27 ERROR : SHUTDOWN node-wizard - failed to set env values - open /root/bin/node_wizard/env_node.json: no such file or directory
- Cause: The env_node.json file has not been created.
- Solution: Re-deploy Node Wizard. See Install Node Wizard as system service
Invalid env_node.json file
- Systemctl Output:
× node-wizard.service - Run node_wizard for kvm functionality expose
Loaded: loaded (/etc/systemd/system/node-wizard.service; enabled; preset: enabled)
Active: failed (Result: exit-code) since Mon 2025-09-22 23:27:00 CEST; 1s ago
Duration: 55ms
Process: 271557 ExecStart=/root/bin/node_wizard/node-wizard (code=exited, status=1/FAILURE)
Main PID: 271557 (code=exited, status=1/FAILURE)
CPU: 33ms
- Error Logs:
2025/09/22 23:27:00 ERROR : invalid character ':' after top-level value
2025/09/22 23:27:00 ERROR : SHUTDOWN node-wizard - failed to set env values - invalid character ':' after top-level value
- Cause: The env_node.json file is malformed.
- Solution: Re-deploy Node Wizard. See Install Node Wizard as system service
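To confirm that a malformed file is the cause before re-deploying, the file can be run through any JSON parser. The sketch below uses Python's built-in json.tool module and therefore assumes python3 is installed on the host; the function name json_is_valid is hypothetical. It checks syntax only and says nothing about which keys Node Wizard expects.

```shell
# Succeed if the given file parses as JSON. Assumes python3 is installed.
json_is_valid() {
  python3 -m json.tool "$1" >/dev/null 2>&1
}

# Typical use on a host:
#   sudo sh -c 'python3 -m json.tool /root/bin/node_wizard/env_node.json >/dev/null' \
#     || echo "env_node.json is not valid JSON"
```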
NTP Problem
- Systemctl Output:
× node-wizard.service - Run node_wizard for kvm functionality expose
Loaded: loaded (/etc/systemd/system/node-wizard.service; enabled; preset: enabled)
Active: failed (Result: exit-code) since Mon 2025-09-22 23:37:23 CEST; 30s ago
Duration: 2min 20.154s
Process: 272695 ExecStart=/root/bin/node_wizard/node-wizard (code=exited, status=1/FAILURE)
Main PID: 272695 (code=exited, status=1/FAILURE)
CPU: 66ms
- Error Logs:
2025/09/22 23:37:23 ERROR : time servers - retrying attempt 15 of 15
2025/09/22 23:37:23 ERROR : cannot get current time from time server 0
2025/09/22 23:37:23 ERROR : cannot get current time from time server 1
2025/09/22 23:37:23 ERROR : cannot get current time from time server 2
2025/09/22 23:37:23 ERROR : cannot get current time from time server 3
2025/09/22 23:37:23 ERROR : cannot get current time from time server 4
2025/09/22 23:37:23 ERROR : cannot get current time from time servers
2025/09/22 23:37:23 ERROR : SHUTDOWN node-wizard - failed to check current time - ERROR : cannot get current time from time servers
- Cause: NTP servers are not available.
- Solution: Check your firewall settings and your internet connection.
Service Running But Server Unavailable
NTP Problem
- Systemctl Output:
● node-wizard.service - Run node_wizard for kvm functionality expose
Loaded: loaded (/etc/systemd/system/node-wizard.service; enabled; preset: enabled)
Active: active (running) since Mon 2025-09-22 23:39:54 CEST; 10s ago
Main PID: 273395 (node-wizard)
Tasks: 6 (limit: 76757)
Memory: 8.3M (peak: 9.2M)
CPU: 32ms
CGroup: /system.slice/node-wizard.service
└─273395 /root/bin/node_wizard/node-wizard
- Error Logs:
2025/09/22 23:37:23 ERROR : time servers - retrying attempt 5 of 15
2025/09/22 23:37:23 ERROR : cannot get current time from time server 0
2025/09/22 23:37:23 ERROR : cannot get current time from time server 1
2025/09/22 23:37:23 ERROR : cannot get current time from time server 2
2025/09/22 23:37:23 ERROR : cannot get current time from time server 3
2025/09/22 23:37:23 ERROR : cannot get current time from time server 4
...
- Cause: NTP servers are not reachable, and the server is still retrying to contact them.
- Solution: Check your firewall settings and your internet connection.
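The two NTP cases in this section differ only in how far the retry counter has advanced. A small filter over the error log makes that visible. This is a sketch that assumes the "time servers - retrying attempt <n> of <max>" message format shown in the logs above; the function name ntp_retry_progress is hypothetical.

```shell
# Read node-wizard error-log text on stdin and report the most recent
# time-server retry as "<attempt> <max>". Prints nothing if the log
# contains no retry lines.
ntp_retry_progress() {
  sed -n 's/.*time servers - retrying attempt \([0-9][0-9]*\) of \([0-9][0-9]*\).*/\1 \2/p' | tail -n 1
}

# Typical use on a host:
#   ntp_retry_progress < /var/log/node-wizard-err.log
```

In the sample logs, the service shuts down after the final attempt, so a counter below the maximum together with an active service suggests Node Wizard is still waiting for a time server.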
Network Management
- Systemctl Output:
● node-wizard.service - Run node_wizard for kvm functionality expose
Loaded: loaded (/etc/systemd/system/node-wizard.service; enabled; preset: enabled)
Active: active (running) since Mon 2025-09-22 23:39:54 CEST; 10s ago
Main PID: 273395 (node-wizard)
Tasks: 6 (limit: 76757)
Memory: 8.3M (peak: 9.2M)
CPU: 32ms
CGroup: /system.slice/node-wizard.service
└─273395 /root/bin/node_wizard/node-wizard
- Output Logs:
2025/09/22 23:48:27 Startup Vxlan: Vxlan {vxl517 517 enp2s0} already exists. Skipping.
2025/09/22 23:48:27 Startup Vxlan: Creating vxlan network.vxlan{Vxlanname:"vxl758", Vxlanid:758, Dev:"bla"}
2025/09/22 23:48:27 Startup Vxlan: Warning - Failed to create vxlan {Vxlanname:vxl758 Vxlanid:758 Dev:bla}. Cannot find device "bla"
, exit status 1
...
- Cause: Node Wizard creates its configured networks at startup. If a network is based on an interface that no longer exists, the server keeps trying to create it and remains unavailable while it does so.
- Solution: After startup, remove the networks that cause the failures so that the next startup is faster, using the unregister-vxlan, unregister-vlan, or unregister-bridge commands.
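To find out which interfaces are behind the failing networks, the output log can be filtered for the missing device names. This is a sketch that assumes the 'Cannot find device "<name>"' wording shown in the log excerpt above; the function name missing_devices is hypothetical.

```shell
# Read node-wizard output-log text on stdin and list each interface named
# in a 'Cannot find device "<name>"' message, once per name.
missing_devices() {
  sed -n 's/.*Cannot find device "\([^"]*\)".*/\1/p' | sort -u
}

# Typical use on a host:
#   missing_devices < /var/log/node-wizard-out.log
```

Each name in the result identifies an interface that no longer exists; the networks registered on top of it are the candidates for removal.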