1.3.1. Analyze a Turbine Platform Installer Support Bundle
A support bundle is a collection of logs and command output from the underlying Kubernetes cluster resources. This information is invaluable for troubleshooting because it covers node health, Swimlane pod health and logs, KOTS pod health and logs, and more.

The support bundle is a gzip'd tarball (.tar.gz) that can be extracted using the tar command.

The support bundle consists of logs divided into several directories:

- ceph: Logs and outputs related to the status and configuration of the Ceph storage system used by the KOTS-related pods.
- cluster-info: Information about the Kubernetes version.
- cluster-resources: Definitions of all of the resources running in the cluster. Each resource type is a different subdirectory (deployments, services, statefulsets, etc.). Each resource subdirectory has a JSON file for each namespace that has that resource type. Swimlane resources are in the default namespace.
- kots: Logs specific to the KOTS UI and process pods.
- goldpinger: Goldpinger checks the network connections between each unique pair of nodes. Results are stored as JSON in default/kotsadm-###/goldpinger-statistics-stdout.txt. When working correctly, each ping should return a 200 status code.
- swimlane: The custom support bundle collectors written for the Swimlane platform resources. Each pod type has a separate subdirectory for the logs collected for each pod of that type.

  What we collect:

  - api
    - logs: API pod logs.
    - pip list: List of Python packages installed in the API pods.
  - sw-mongo
    - cert: Expiration date of the MongoDB certificate.
    - df: Disk usage for the partitions of the MongoDB pods.
    - free: Memory usage of the MongoDB pods.
    - logs: MongoDB pod logs.
    - mongo version: The version of MongoDB.
    - stats: MongoDB database stats.
    - time drift: The current time of the MongoDB pods. Compare it against the support bundle date and time to check for time drift.
    - top: Top running processes in the MongoDB pods.
  - tasks
    - logs: Tasks pod logs.
    - pip list: List of Python packages installed in the tasks pods.
  - tools
    - environment validator: Output of the environment validator script run from the tools pod.
    - logs: Tools pod logs.
    - ping: A ping test done from the tools pod to see if there is internet access. Note: this only confirms that the node the tools pod is running on has internet access, not every node in the cluster.
  - web
    - logs: Web pod logs.

- velero: Logs for the Velero resources that handle snapshots.
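Because the bundle is a gzip'd tarball, extraction can be sketched as a small shell helper. This is a minimal sketch, not an official procedure: the function name extract_bundle and the file names in the usage comment are placeholders chosen for this example.

```shell
# Extract a support bundle archive ($1) into a destination directory ($2).
# Both arguments are placeholders; substitute your own paths.
extract_bundle() {
  mkdir -p "$2" && tar -xzf "$1" -C "$2"
}

# Example call (hypothetical file name):
#   extract_bundle support-bundle.tar.gz ./bundle
#   ls ./bundle
```

After extraction, the directories described above (for example cluster-info and cluster-resources) can be browsed with ordinary tools such as ls and grep.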
1.3.2. Generate a Support Bundle
If you're having issues with the Turbine Platform Installer (TPI), your Swimlane support representative will likely ask you to generate a support bundle to help diagnose your issue. Support bundles contain logs from all relevant pods, as well as other useful information from your deployment.

Airgapped deployments

Use these instructions for an airgapped deployment.

Before you begin

Before generating a support bundle, you must first ensure that you have the support bundle command installed.

Install the support bundle command

If you have already installed the support bundle command, you can skip these steps.

To install the support bundle command:

1. From a computer that is connected to the internet, download the support bundle files.
2. Move the files to the server.
3. Next, connect to the airgapped server and make the support bundle file executable:
     chmod 777 support-bundle
4. Execute the installer for support bundle:
     ./support-bundle
5. Finally, generate the support bundle:
     kubectl support-bundle /path/to/spec.yaml
6. Give this support bundle to Swimlane Support.

Generating a support bundle with CLI (airgap)

To generate a support bundle with CLI (airgap):

1. Log in to TPI, and connect to the server.
2. Run the support bundle command. If this command throws an error, it means that a support bundle file has been saved in the directory from which the commands were run. If you do not have the support bundle command, see Install the support bundle command.
3. The support bundle uploads to your Turbine Platform Installer UI. From the Turbine Platform Installer UI, click Dashboard and then open the Troubleshoot tab.
4. On Troubleshoot, click Download Bundle.
5. Deliver the downloaded file to Swimlane Support.

Online deployments

Use these instructions for an online deployment that is connected to the internet.

Before you begin

Before generating a support bundle, you must first ensure that you have the support bundle command installed. See Install the support bundle command for details.

Generating a support bundle with CLI (online)

To generate a support bundle with CLI (online):

1. Log in to TPI, and connect to the server.
2. Run this command to install the support bundle command:
     curl https://krew.sh/support-bundle | bash
3. Generate the support bundle by running this command:
     kubectl support-bundle secret/default/kotsadm-turbine-supportbundle
   If this command throws an error, it means that a support bundle file has been saved in the directory from which the commands were run.
4. The support bundle uploads to your Turbine Platform Installer UI. From the Turbine Platform Installer UI, click Dashboard and then open the Troubleshoot tab.
5. On Troubleshoot, click Download Bundle.
6. Deliver the downloaded file to Swimlane Support.

Generating a support bundle through the TPI UI

Use these instructions to get a support bundle from the TPI UI.

1. Log in to the Turbine Platform Installer UI.
2. From the Turbine Platform Installer UI, click Dashboard and then open the Troubleshoot tab. If you do not have the support bundle command, see Install the support bundle command.
3. On the Troubleshoot tab, click Analyze Swimlane. The TPI checks the deployment for issues. This may take some time. The process is finished when you receive an analysis overview.
4. To view the content of the support bundle, click the File Inspector tab.
5. To download the support bundle, click Download Bundle.
6. Deliver the downloaded file to Swimlane Support.
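Each CLI procedure in this topic assumes that the support bundle command is already installed. One way to check is sketched below. It relies on the kubectl plugin naming convention, in which a plugin invoked as "kubectl support-bundle" is an executable named kubectl-support_bundle on the PATH. The function name has_kubectl_plugin is hypothetical and is not part of TPI or kubectl.

```shell
# Return success if a kubectl plugin with the given name is on the PATH.
# kubectl maps dashes in a plugin name to underscores in the file name,
# so "support-bundle" is looked up as "kubectl-support_bundle".
has_kubectl_plugin() {
  plugin_file="kubectl-$(printf '%s' "$1" | tr '-' '_')"
  command -v "$plugin_file" >/dev/null 2>&1
}

# Example call:
#   has_kubectl_plugin support-bundle || echo "Install the support bundle command first"
```

If the check fails, follow the installation steps for your deployment type before generating the bundle.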
1.3.3. Stop and Restart a Single-Node Deployment
On occasion, single-node TPI deployments must be stopped and restarted, for example when you need to perform scheduled maintenance.

To stop and restart a single-node TPI deployment:

1. Log in to the node as a user with administrator privileges.
2. Find the name of the node. In the example within this topic, the node is named master3a.
3. Drain the node:
     # /bin/kubectl drain master3a --ignore-daemonsets --delete-local-data
4. Check that the node is in the "Ready,SchedulingDisabled" status.
5. Stop or restart the server or VM.
6. Next, uncordon the node (i.e., make the node schedulable).
7. Finally, ensure that the node is in the "Ready" status.
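The drain, status check, and uncordon steps can be sketched as shell helpers. This is an illustration under assumptions, not the documented TPI procedure: the function names are hypothetical, and the status check assumes the default column layout of kubectl get node, in which STATUS is the second column.

```shell
# Evict workloads from the node and mark it unschedulable (flags as in this topic).
drain_node() {
  kubectl drain "$1" --ignore-daemonsets --delete-local-data
}

# Make the node schedulable again once the server or VM is back up.
uncordon_node() {
  kubectl uncordon "$1"
}

# Print the STATUS column for one node, for example
# "Ready,SchedulingDisabled" after a drain or "Ready" after an uncordon.
node_status() {
  kubectl get node "$1" --no-headers | awk '{print $2}'
}

# Example sequence for the node named in this topic:
#   drain_node master3a && node_status master3a
#   (stop or restart the server or VM)
#   uncordon_node master3a && node_status master3a
```

On newer kubectl releases, --delete-local-data is deprecated in favor of --delete-emptydir-data, so the drain flag may need adjusting for your cluster version.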
1.3.4. Troubleshooting Checklist
Use this topic as a checklist for troubleshooting issues with the Swimlane Platform Installer (SPI).

Check the configuration

1. Ensure that containerd and kubelet are running.
   - containerd: Run systemctl status containerd. If the service is in a failed state, search the service log for errors with this command:
       journalctl -fu containerd
   - kubelet: Run systemctl status kubelet. If the service is in a failed state, search the service log for errors with this command:
       journalctl -fu kubelet
2. Check cluster node health. Run kubectl get nodes. All nodes should show Ready in the STATUS column. If any nodes are not showing as Ready, run the following and look for relevant issues or errors:
     kubectl describe node nodename
3. Check for any pods that are not running. Run:
     kubectl get pods -A | grep -v Running | grep -v Completed
   If any pods are listed:
   - Check the pod logs for issues with this command:
       kubectl logs --all-containers podname -n namespace
   - Check the pod describe output for issues with this command:
       kubectl describe pod podname -n namespace
4. Check for errors in the events. Run:
     kubectl get events --sort-by=.metadata.creationTimestamp
5. Ensure the infrastructure is configured according to the System Requirements for an Embedded Cluster Install or the System Requirements for an Existing Cluster Install.

Resources to check; logs/info to retrieve

Ensure that there is a support bundle available. See Generate a Support Bundle for details.

In addition to the support bundle, get the status and logs of the containerd and kubelet services (see the systemctl and journalctl commands under Check the configuration).

If you can't generate a support bundle (for example, if the issue occurs before the SPI is successfully installed and running, or the support bundle generation fails), run the following commands and provide the output:

1. Run:
     kubectl get pods -A | grep -v Running | grep -v Completed
   If any pods are listed, include the pod logs with:
     kubectl logs --all-containers podname -n namespace
   Include the pod describe output with:
     kubectl describe pod podname -n namespace
2. Then continue providing output for the cluster nodes. If any nodes have a status other than Ready, also include the describe output of the node.
3. Next, get the status and logs of the containerd and kubelet services.
4. Finally, provide the following:
   - OS and version
   - Infrastructure setup:
     - Instance provider, for example, bare metal, VM, AWS instance, Azure instance, GCP instance, etc.
     - Instance sizes, including memory, CPU, disk/partitions
     - Load balancer type and setup
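When a support bundle cannot be generated, the outputs listed in this checklist can be gathered into one directory. The sketch below is an assumption-laden illustration, not a Swimlane-provided script: the function names and output file names are invented for this example, and journalctl is called with -u and --no-pager rather than -fu, because following the log would never return in a collection script.

```shell
# Keep only pods that are neither Running nor Completed.
# Reads `kubectl get pods -A` output on stdin; the header line passes through.
filter_unhealthy_pods() {
  grep -v Running | grep -v Completed
}

# Collect basic diagnostics into the directory given as $1.
# The file names below are arbitrary choices for this example.
collect_diagnostics() {
  out="$1"
  mkdir -p "$out"
  kubectl get nodes > "$out/nodes.txt" 2>&1
  kubectl get pods -A | filter_unhealthy_pods > "$out/unhealthy-pods.txt"
  kubectl get events --sort-by=.metadata.creationTimestamp > "$out/events.txt" 2>&1
  for svc in containerd kubelet; do
    systemctl status "${svc}" > "$out/${svc}-status.txt" 2>&1
    journalctl -u "${svc}" --no-pager > "$out/${svc}-journal.txt" 2>&1
  done
}

# Example call (hypothetical directory name):
#   collect_diagnostics ./spi-diagnostics
```

Pod logs and describe output still need to be added per pod, because the pod and namespace names depend on what the unhealthy-pods listing shows.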
1.3.5. Upload a New TLS Certificate
If you've already gone through the setup process once and you want to upload new TLS certificates, run this command to restore the ability to upload them:

    kubectl -n default annotate secret kotsadm-tls acceptAnonymousUploads=1

Adding this annotation temporarily creates a vulnerability that lets an attacker maliciously upload TLS certificates. Once TLS certificates have been uploaded again, the vulnerability goes away.

After adding the annotation, you will need to restart the kurl proxy server. The simplest way to do that is to delete the kurl proxy pod (the pod will automatically get restarted) with this command, where PROXY_SERVER is the name of the kurl proxy pod:

    kubectl delete pods PROXY_SERVER

After the pod has been restarted, direct your browser to http://<your ip>:8800/tls to see the same page that you did during the initial installation. Then, load your TLS certificate.

Swimlane recommends that you complete this process as soon as possible to prevent anyone from nefariously uploading TLS certificates. After this process is complete, the vulnerability is closed, and uploading new TLS certificates is disallowed again. Repeat the steps above whenever you need to upload new TLS certificates.
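The annotate and restart steps can be sketched as shell helpers. The function names are hypothetical, and the pod lookup rests on an assumption that should be verified on your cluster: that the proxy pod runs in the default namespace and that its name starts with "kurl-proxy".

```shell
# Re-enable TLS certificate uploads (command from this topic).
allow_tls_upload() {
  kubectl -n default annotate secret kotsadm-tls acceptAnonymousUploads=1
}

# Print the name of the first pod whose name starts with "kurl-proxy".
# Assumption: the proxy pod lives in the default namespace under that prefix.
find_proxy_pod() {
  kubectl -n default get pods --no-headers | awk '$1 ~ /^kurl-proxy/ {print $1; exit}'
}

# Delete the proxy pod so that Kubernetes recreates it.
restart_proxy() {
  pod="$(find_proxy_pod)"
  [ -n "$pod" ] && kubectl -n default delete pod "$pod"
}

# Example sequence:
#   allow_tls_upload && restart_proxy
```

If find_proxy_pod prints nothing, list the pods manually and pass the correct pod name to kubectl delete pods instead.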