added QW troubleshooting
rachel-netq committed Nov 7, 2024
1 parent 8209a2c commit 1181820
Showing 3 changed files with 29 additions and 30 deletions.
@@ -36,7 +36,7 @@ You can install NetQ either on your premises or as a remote, cloud solution. If
| Server Arrangement | Hypervisor | Requirements & Installation |
| :--- | --- | :---: |
| Single server | KVM or VMware | {{<link title="Set Up Your Virtual Machine for a Single Cloud Server" text="Start install">}} |
| High-availability cluster | KVM or VMware | {{<link title="Set Up Your Virtual Machine for a Cloud HA Server Cluster" text="Start install">}}|
| High-availability cluster | KVM or VMware | {{<link title="Set Up Your Virtual Machine for a Cloud HA Server Cluster" text="Start install">}}|

## Base Command Manager

@@ -27,7 +27,7 @@ netq check roce

## View RoCE Counters Networkwide in the UI

1. From the header or {{<img src="https://icons.cumulusnetworks.com/01-Interface-Essential/03-Menu/navigation-menu.svg" height="18" width="18">}} menu, select **Spectrum-X**, then **RoCE**.
1. From the header or {{<img src="https://icons.cumulusnetworks.com/01-Interface-Essential/03-Menu/navigation-menu.svg" height="18" width="18">}} Menu, select **Spectrum-X**, then **RoCE**.

2. Select either **RoCE switches** or **RoCE DPUs**.

55 changes: 27 additions & 28 deletions content/cumulus-netq-412/Troubleshoot-Issues/_index.md
@@ -72,63 +72,62 @@ Verified installer version FINISHED
...
```
{{< /expand >}}
<!--
## Troubleshoot Installation and Upgrade

## Troubleshoot NetQ Installation and Upgrade Issues

Before you attempt a NetQ installation or upgrade, verify that your system meets the {{<link title="Install the NetQ System" text="minimum VM requirements">}} for your deployment type.

{{%notice note%}}
If an upgrade or installation process stalls or fails, run the `netq bootstrap reset` command to stop the process, followed by the `netq install` command to re-attempt the installation or upgrade.
If an upgrade or installation process stalls or fails, run the {{<link title="bootstrap" text="netq bootstrap reset">}} command to stop the process, followed by the {{<link title="install" text="netq install">}} command to re-attempt the installation or upgrade.
{{%/notice%}}
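
Two of the installer checks listed in this section (disk utilization under 70% and SSE4.2 support) can be approximated on the VM before you start. The following is a minimal sketch, not NetQ's own implementation: the function names are hypothetical, and it assumes a Linux host where `/proc/cpuinfo` reports the `sse4_2` flag.

```python
import shutil
from typing import List

# Threshold taken from the installer's disk-space message ("keep under 70%").
DISK_LIMIT_PERCENT = 70.0


def disk_percent_used(total_bytes: int, used_bytes: int) -> float:
    """Return disk utilization as a percentage of total capacity."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return 100.0 * used_bytes / total_bytes


def has_sse42(cpuinfo_text: str) -> bool:
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists sse4_2."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and "sse4_2" in value.split():
            return True
    return False


def preflight(path: str = "/", cpuinfo_path: str = "/proc/cpuinfo") -> List[str]:
    """Collect warnings for the two checks; an empty list means both passed."""
    warnings = []
    usage = shutil.disk_usage(path)
    percent = disk_percent_used(usage.total, usage.used)
    if percent >= DISK_LIMIT_PERCENT:
        warnings.append(f"{path} is {percent:.0f}% utilized; keep it under 70%")
    with open(cpuinfo_path, encoding="utf-8") as handle:
        if not has_sse42(handle.read()):
            warnings.append("CPU does not report SSE4.2 support")
    return warnings


if __name__ == "__main__":
    for warning in preflight():
        print(warning)
```

The mount point passed to `preflight` is an assumption; point it at whichever filesystem the installer reports in its message.
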
## Known Installation and Upgrade Issues

{{<tabs "TabID113" >}}

{{<tab "Upgrade Issues">}}

| Error | Setup | Solution |
| Error Message | Deployment Type | Solution |
| ---- | ---- | ---- |
| Cannot upgrade a non-bootstrapped NetQ server. Please reset the cluster and re-install.| | Only a server that has been bootstrapped and has a valid `/etc/app-release` file can be upgraded.<br> 1. Run the `netq bootstrap reset` command. <br> 2. Run the `netq install` command. |
| Unable to get response from admin app. | | Re-run the `netq upgrade bundle <tarball>` command. If the retry fails with same error, reset the server and run the `install` command:<br> 1. Run the `netq bootstrap reset` command <br> 2. Run the `netq install` command. |
| Unable to get response from kubernetes api server. | | Re-run the `netq upgrade bundle <tarball>` command. If the retry fails with same error, reset the server and run the `install` command:<br> 1. Run the `netq bootstrap reset` command <br> 2. Run the `netq install` command. |
| Cluster vip is an invalid parameter for standalone upgrade. | Single server | Remove the `cluster-vip` option from the `netq upgrade bundle` command. |
| Please provide cluster-vip option and run command. | HA server cluster | Include the `cluster-vip` option in the `netq upgrade bundle` command. |
| Could not find admin app pod, please re-run the command. | | Re-run the `netq upgrade bundle <tarball>` command. |
| Cannot upgrade a non-bootstrapped NetQ server. Please reset the cluster and re-install.| | Only a server that has been bootstrapped and has a valid `/etc/app-release` file can be upgraded.<br> 1. Run the `netq bootstrap reset` command. <br> 2. Run the {{<link title="install" text="netq install">}} command according to your deployment type. |
| Unable to get response from admin app. | | Re-run the {{<link title="upgrade" text="netq upgrade bundle <text-bundle-url>">}} command. If the retry fails with the same error, reset the server and run the `install` command:<br> 1. Run the `netq bootstrap reset` command. <br> 2. Run the {{<link title="install" text="netq install">}} command according to your deployment type. |
| Unable to get response from kubernetes api server. | | Re-run the {{<link title="upgrade" text="netq upgrade bundle <text-bundle-url>">}} command. If the retry fails with the same error, reset the server and run the `install` command:<br> 1. Run the `netq bootstrap reset` command. <br> 2. Run the {{<link title="install" text="netq install">}} command according to your deployment type. |
| Cluster vip is an invalid parameter for standalone upgrade. | Single server | Remove the `cluster-vip` option from the {{<link title="upgrade" text="netq upgrade bundle">}} command. |
| Please provide cluster-vip option and run command. | HA cluster | Include the `cluster-vip` option in the {{<link title="upgrade" text="netq upgrade bundle">}} command. |
| Could not find admin app pod, please re-run the command. | | Re-run the {{<link title="upgrade" text="netq upgrade bundle <text-bundle-url>">}} command. |
| Could not upgrade server, unable to restore got exception: {} | On-premises | The backup/restore option is only applicable for on-premises deployments which use {{<link title="Install a Custom Signed Certificate" text="self-signed certificates">}}.|
{{</tab>}}

{{<tab "Installation Issues" >}}

| Error | Setup | Solution |
| Error Message | Deployment Type | Solution |
| ---- | ---- | ---- |
| NetQ is operational with version: {}. Run the bootstrap reset before re-installing NetQ. | | 1. Run the `netq bootstrap reset` command. <br> 2. Run the `netq install` command. |
| NetQ is operational with version: {}. Run the bootstrap reset before re-installing NetQ. | | NetQ has previously been installed. To install a new version:<br>1. Run the {{<link title="bootstrap" text="netq bootstrap reset">}} command. <br> 2. Run the {{<link title="install" text="netq install">}} command. |
| The Default interface was not found | | You must have a default route configured in your routing table, and the associated network interface must correspond to this default route. |
| No default route found. Please set the default route via interface {} and re-run the installation. | | See above. |
| Default route set via a different interface {}. Please set the default route via interface {} and re-run the installation.| | See above. |
| No default route found. Please set the default route via interface {} and re-run the installation. | | You must have a default route configured in your routing table, and the associated network interface must correspond to this default route. |
| Default route set via a different interface {}. Please set the default route via interface {} and re-run the installation.| | You must have a default route configured in your routing table, and the associated network interface must correspond to this default route. |
| Minimum of {} GB RAM required but {} GB RAM detected.| | Increase VM resources according to your {{<link title="Install the NetQ System" text="deployment model requirements">}}.|
| Minimum of {} CPU cores required but {} detected.| | Increase VM resources according to your {{<link title="Install the NetQ System" text="deployment model requirements">}}.|
| Please free up disk as {} is {}% utilised. Recommended to keep under 70%. | | Delete previous software tarballs in the `/mnt/installables/` directory to regain space. If you cannot decrease disk usage to under 70%, contact the NVIDIA support team. |
| Did not find the vmw_pvscsi driver enabled on this NetQ VM. Please re-install the NetQ VM on ESXi server. | VMware | The NetQ VM must have the `vmw_pvscsi` driver enabled. |
| Did not find the `vmw_pvscsi` driver enabled on this NetQ VM. Please re-install the NetQ VM on ESXi server. | VMware | The NetQ VM must have the `vmw_pvscsi` driver enabled. |
| Error: Bootstrapped IP does not belong to any local IPs | | The IP address used for bootstrapping should be from the local network. |
| ERROR: IP address mismatch. Bootstrapped with: {} kube_Config: {} Admin_kube_config: {}| | The bootstrap IP address must match the kube config and admin kube config IP addresses. |
| ERROR: Clock not synchronised. Please check timedatectl. | | The system clock must be synchronized. Verify synchronization using the `timedatectl` command. |
| ERROR: Clock not synchronised. Please check `timedatectl`. | | The system clock must be synchronized. Verify synchronization using the `timedatectl` command. |
| {} does not have sse4.2 capabilities. Check lscpu. | | The CPU model used for the installation must support SSE4.2. |
| NTP is installed. Please uninstall NTP as it will conflict with chrony installation.| | Uninstall NTP and any other NTP services, such as `ntpd` or SNTP.|
| Netqd service is not running | | Verify that the `netqd` service is up and running prior to installation. |
| Found identical ip for {} and {}/ Please provide different ip for cluster vip/workers. | Cluster | The cluster virtual IP address (VIP) and worker node IP addresses must be unique. |
| Please provide worker nodes IPV6 addresses in order to have IPV6 support. | Cluster | IPv6 addresses must be provided for worker nodes if IPv6 support is required. |
| Master node is not initialised. Run “net install cluster master-init” on master node before NetQ Install/upgrade command. | Cluster | Initialize the master node with the `netq install cluster master-init` command.|
| Worker node Ip {} is not reachable| Cluster | Make sure worker nodes are reachable in your network. |
| Worker node {} is not initialised. Please initialise worker node and re-run the command.| Cluster | After initializing the cluster on the master node, initialize each worker nodes with the `netq install cluster worker-init` command. |
| Cluster VIP is not valid IP address | Cluster | Provide a valid cluster IP address. |
| All cluster addresses must belong to the same subnet. Master node net mask = {} | Cluster | Make sure all cluster IP addresses---the master, worker and virtual IP---belong to the same subnet.|
| Virtual IP {} is already used | Cluster | Provide a unique virtual IP address. |
| Package {} with version {} must be installed. | | Make sure `netq-apps` version is the same as the tarball version. |
| Master node is already bootstrapped | | Run the `netq bootstrap reset` command, followed by the `netq install` command to re-attempt the installation. |
| Found identical ip for {} and {}/ Please provide different ip for cluster vip/workers. | HA cluster | The cluster virtual IP address (VIP) and worker node IP addresses must be unique. |
| Please provide worker nodes IPV6 addresses in order to have IPV6 support. | HA cluster | Provide IPv6 addresses for each worker node. |
| Master node is not initialised. Run “net install cluster master-init” on master node before NetQ Install/upgrade command. | HA cluster | Initialize the master node with the `netq install cluster master-init` command.|
| Worker node Ip {} is not reachable| HA cluster | Make sure worker nodes are reachable in your network. |
| Worker node {} is not initialised. Please initialise worker node and re-run the command.| HA cluster | After initializing the cluster on the master node, initialize each worker node with the `netq install cluster worker-init` command. |
| Cluster VIP is not valid IP address | HA cluster | Provide a valid cluster IP address. |
| All cluster addresses must belong to the same subnet. Master node net mask = {} | HA cluster | Make sure all cluster IP addresses---the master, worker, and virtual IP---belong to the same subnet.|
| Virtual IP {} is already used | HA cluster | Provide a unique virtual IP address. |
| Package {} with version {} must be installed. | | Make sure the `netq-apps` version is the same as the tarball version. |
| Master node is already bootstrapped | | Run the {{<link title="bootstrap" text="netq bootstrap reset">}} command, followed by the {{<link title="install" text="netq install">}} command to re-attempt the installation. |
{{</tab>}}

{{</tabs>}}
-->
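
Several of the HA cluster errors above come down to address planning: the master, worker, and virtual IP addresses must be valid, unique, and in the same subnet. The sketch below checks those three conditions with Python's standard `ipaddress` module. It is an illustration only; the function name and the labels `master`, `worker1`, and `vip` are hypothetical, and the installer's own validation may differ.

```python
import ipaddress
from typing import List


def check_cluster_addresses(master: str, workers: List[str], vip: str,
                            netmask: str) -> List[str]:
    """Return a list of problems; an empty list means the plan looks consistent."""
    names = ["master"] + [f"worker{i + 1}" for i in range(len(workers))] + ["vip"]
    raw = [master] + list(workers) + [vip]

    # Step 1: every entry must parse as an IP address.
    problems = []
    parsed = {}
    for name, text in zip(names, raw):
        try:
            parsed[name] = ipaddress.ip_address(text)
        except ValueError:
            problems.append(f"{name}: {text!r} is not a valid IP address")
    if problems:
        return problems

    # Step 2: no two roles may share an address.
    seen = {}
    for name, addr in parsed.items():
        if addr in seen:
            problems.append(f"{name} and {seen[addr]} share the address {addr}")
        else:
            seen[addr] = name

    # Step 3: all addresses must fall inside the master node's subnet.
    subnet = ipaddress.ip_network(f"{master}/{netmask}", strict=False)
    for name, addr in parsed.items():
        if addr not in subnet:
            problems.append(f"{name}: {addr} is outside {subnet}")
    return problems


if __name__ == "__main__":
    # Example addresses are placeholders, not values from the NetQ documentation.
    for problem in check_cluster_addresses(
        "10.0.0.11", ["10.0.0.12", "10.0.0.13"], "10.0.0.10", "255.255.255.0"
    ):
        print(problem)
```

Returning a list of problems rather than raising on the first one lets you correct the whole address plan in one pass before re-running the installation.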

## Installation and Upgrade Hook Scripts

NVIDIA might provide hook scripts to patch issues encountered during a NetQ installation or upgrade. When you run the `netq install` or `netq upgrade` command, NetQ checks for specific hook script filenames in the `/usr/bin` directory. The expected filenames for NetQ 4.12.0 are: