PowerEdge R720 QEMU & Kubernetes Setup #23

Merged
eric merged 4 commits from r720-setup into main 2025-07-31 21:41:20 +00:00

# Part 1

<video controls type="video/mp4" src="https://minio.eom.dev/public/Videos/2025-07-30_20-07-33.mp4"></video>

In this video I attempt to reuse the `main.yaml` Ansible playbook from this repository for the PowerEdge R720 server. The goal is to deploy five additional virtual machines to the existing Kubernetes cluster to alleviate recent compute and disk pressure. To my surprise, the original code for deploying virtual machines was never copied to DevOps/ansible-role-libvirt-guest and now exists only in the commit history. I found a non-functional version in another file and eventually discovered the original in commit v1.0.0, but this problem prevented me from completing the PR in one sitting. Even so, only a small amount of work remains to fully debug the new playbook. Once that is done, I will have several follow-up tasks, which are documented in the following issues:

- DevOps/ansible-role-libvirt-guest#1
- DevOps/ansible-role-mastodon#2

Additionally, the original issue being addressed here, DevOps/ansible-role-eom#37, will not be resolved until the new NFS client is in use, as I believe the service crashes are due to disk I/O limitations rather than insufficient compute power.
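
For readers who need to dig a removed file out of the commit history in the same way, here is a minimal sketch. It only wraps two standard Git commands (`git log --all -- <path>` and `git show <rev>:<path>`). The file path used in the comments is hypothetical, since the actual location of the original VM deployment code is not stated here.

```python
import subprocess


def history_cmd(path):
    """Build a command listing every commit, on any ref, that touched `path`."""
    return ["git", "log", "--all", "--oneline", "--", path]


def show_cmd(rev, path):
    """Build a command printing `path` as it existed at revision `rev`."""
    return ["git", "show", f"{rev}:{path}"]


def recover_file(rev, path, repo="."):
    """Return the contents of `path` at `rev` from the repository at `repo`.

    Raises subprocess.CalledProcessError if the revision or path is unknown.
    """
    result = subprocess.run(
        show_cmd(rev, path),
        cwd=repo,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Hypothetical usage (the path below is an assumption, not taken from this PR):
#   text = recover_file("v1.0.0", "tasks/guests.yaml")
```

Building the argument lists separately from running them keeps the commands easy to inspect before anything touches the repository.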
eric added 1 commit 2025-07-31 01:59:30 +00:00
eric changed title from PowerEdge R720 QEMU & Kubernetes Setup to WIP: PowerEdge R720 QEMU & Kubernetes Setup 2025-07-31 02:00:15 +00:00
eric added 1 commit 2025-07-31 03:01:43 +00:00
eric added 1 commit 2025-07-31 14:00:27 +00:00
eric added spent time 2025-07-31 14:03:08 +00:00
2 hours 54 minutes
eric added spent time 2025-07-31 14:03:27 +00:00
30 minutes
eric started working 2025-07-31 15:40:15 +00:00
eric stopped working 2025-07-31 16:21:51 +00:00
41 minutes 36 seconds

# Part 2

<video controls type="video/mp4" src="https://minio.eom.dev/public/Videos/2025-07-31_11-28-24.mp4"></video>

Following up on yesterday's live stream, I was able to get further into the deployment by referencing the code from commit 6e7ee42c1e; unfortunately, the playbook then failed due to networking issues. I believe these were caused by the assignment of a new IP address to the virtual bridge created for the `wan` libvirt network: once this resource was created, the host and its guests had to be accessed via a new address. Having rebooted the R720, I believe this is now fixed, and I will be able to continue in the next stream.
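
To check which bridge and address a libvirt network defines, the XML printed by `virsh net-dumpxml wan` can be inspected. The sketch below parses such a definition with the Python standard library. The sample XML, including the bridge name and the address, is invented for illustration and is not the configuration used on the R720.

```python
import xml.etree.ElementTree as ET

# Hypothetical network definition; real output comes from `virsh net-dumpxml wan`.
SAMPLE_XML = """
<network>
  <name>wan</name>
  <bridge name='virbr1' stp='on' delay='0'/>
  <ip address='192.168.100.1' netmask='255.255.255.0'/>
</network>
"""


def bridge_addresses(network_xml):
    """Extract the network name, bridge device, and bridge IPs from libvirt XML."""
    root = ET.fromstring(network_xml.strip())
    bridge = root.find("bridge")
    return {
        "network": root.findtext("name"),
        # Some network definitions have no <bridge> element.
        "bridge": bridge.get("name") if bridge is not None else None,
        # A network may define zero, one, or several <ip> elements.
        "addresses": [ip.get("address") for ip in root.findall("ip")],
    }


info = bridge_addresses(SAMPLE_XML)
print(info)
```

Comparing the reported addresses with the address used in the Ansible inventory is one way to spot the kind of mismatch described above.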
eric added 1 commit 2025-07-31 21:31:07 +00:00
eric changed title from WIP: PowerEdge R720 QEMU & Kubernetes Setup to PowerEdge R720 QEMU & Kubernetes Setup 2025-07-31 21:36:57 +00:00
eric merged commit 9015a48417 into main 2025-07-31 21:41:20 +00:00

# Part 3

<video controls type="video/mp4" src="https://minio.eom.dev/public/Videos/2025-07-31_17-03-34.mp4"></video>

Success! After rebooting the hypervisor and rerunning the playbook, I confirmed that 5 nodes were added to the Kubernetes cluster. As mentioned in the video, I do not expect this to solve the issue presented in DevOps/ansible-role-eom#37, but I will address that in another PR once I have confirmed my assumption that the second NFS client is needed. Please note that while this PR is merged and therefore technically closed, users are invited to continue posting questions or suggestions here (as well as on any other issue or pull request).
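
One way to confirm that new nodes have joined and are healthy is to read the `Ready` condition from the output of `kubectl get nodes -o json`. The following is a minimal sketch of that check. The node names in the sample are hypothetical, and the sample is trimmed to the fields the function reads.

```python
import json

# Trimmed, hypothetical stand-in for `kubectl get nodes -o json` output.
SAMPLE_NODES = json.dumps({
    "items": [
        {
            "metadata": {"name": "r720-node-1"},
            "status": {"conditions": [{"type": "Ready", "status": "True"}]},
        },
        {
            "metadata": {"name": "r720-node-2"},
            "status": {"conditions": [{"type": "Ready", "status": "False"}]},
        },
    ]
})


def ready_nodes(nodes_json):
    """Return the names of nodes whose Ready condition has status "True"."""
    ready = []
    for node in json.loads(nodes_json)["items"]:
        conditions = node.get("status", {}).get("conditions", [])
        if any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in conditions
        ):
            ready.append(node["metadata"]["name"])
    return ready


print(ready_nodes(SAMPLE_NODES))
```

In practice the JSON could be piped in from `kubectl`, and the length of the returned list compared with the expected node count.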
eric deleted branch r720-setup 2025-07-31 22:16:45 +00:00
eric added spent time 2025-07-31 22:17:04 +00:00
43 minutes
Total Time Spent: 4 hours 48 minutes
eric
4 hours 48 minutes
Reference: DevOps/software-infrastructure#23