View Issue Details

ID: 0006832
Project: Rocky-Linux-9
Category: General
View Status: public
Last Update: 2025-01-09 13:31
Reporter: Brad 2014
Assigned To: Louis Abel
Priority: normal
Severity: minor
Reproducibility: always
Status: needinfo
Resolution: open
Platform: x86_64
OS: Rocky 9
OS Version: 9.4
Summary: 0006832: Image Rocky-9-GenericCloud-Base-9.4-20240509.0.x86_64.qcow2 does not boot under qemu-kvm
Description: On a working x86_64 host running Rocky 9.4 (kernel 5.14.0-427.16.1.el9_4.x86_64), I am creating a qemu-kvm virtual machine.

On a Rocky 9.4 host with qemu-kvm, libvirt, virt-manager and virt-install packages installed,
the virtual machine starts cleanly with base=Rocky-9-GenericCloud-Base-9.3-20231113.0.x86_64.qcow2
and fails to start with base=Rocky-9-GenericCloud-Base-9.4-20240509.0.x86_64.qcow2

Steps To Reproduce: To reproduce (see below for the content of ./test-user-data):
# qemu-img create -b $base -f qcow2 -F qcow2 test.qcow2
# virt-install --import --os-variant=rocky9 --autostart --graphics none --autoconsole none "--name=test" "--ram=2048" "--vcpus=2" --network "type=default" --network "bridge=bridge,source=eno1,model=virtio,type=direct,trustGuestRxFilters=on" --disk "path=test.qcow2,format=qcow2" --cloud-init "user-data=test-user-data"
# virsh console test # (boot sequence output ending with login prompt, ^] to exit)
# virsh destroy test
# virsh undefine test




Additional Information: In the above example, the cloud-init data in ./test-user-data is minimal:
---------------------
#cloud-config
preserve_hostname: false
hostname: test
fqdn: test.example.com
ssh_pwauth: True
users:
  - name: root
    hashed_passwd: [redacted]
    lock_passwd: false
    ssh_authorized_keys:
      - ssh-ed25519 [redacted]
---------------------
Tags: No tags attached.

Activities

Brad 2014


2024-05-27 12:16

reporter   ~0007195

This failure to boot is also reproducible when using the LVM image: Rocky-9-GenericCloud-LVM-9.4-20240509.0.x86_64.qcow2
Brad 2014


2024-05-28 12:32

reporter   ~0007228

I notice the following in the syslog:

May 28 12:22:58 ... setroubleshoot[181117]: SELinux is preventing /usr/libexec/qemu-kvm from getattr access on the file /proc/sys/vm/max_map_count. For complete SELinux messages run: sealert -l 6b4461e1-8775-46d6-975b-2ec47d990999
May 28 12:22:58 ... setroubleshoot[181117]: SELinux is preventing /usr/libexec/qemu-kvm from getattr access on the file /proc/sys/vm/max_map_count.

So this may be related to https://gitlab.com/qemu-project/qemu/-/issues/2324
Brad 2014


2024-05-28 12:36

reporter   ~0007229

Belay that. The AVC above occurs on both booting and non-booting images. Probably a red herring.
Brad 2014


2025-01-08 17:57

reporter   ~0009274

This continues to fail on Rocky-9-GenericCloud-Base-9.5-20241118.0.x86_64.qcow2

The last working release appears to be Rocky-9-GenericCloud-Base-9.3-20231113.0.x86_64.qcow2.
Louis Abel


2025-01-08 18:09

administrator   ~0009275

Thank you for the report. Your host system is Rocky Linux 9.4, which is not supported. Please update your host system to 9.5. https://wiki.rockylinux.org/rocky/version/#__tabbed_1_2

Your issue is NOT reproducible on Rocky Linux 9.5.

% wget https://dl.rockylinux.org/pub/rocky/9/images/x86_64/Rocky-9-GenericCloud-Base-9.5-20241118.0.x86_64.qcow2 -O genclo9.5.qcow2
% cat data
#cloud-config
package_upgrade: true
growpart:
  mode: auto
  ignore_growroot_disabled: false

users:
  - default
  - name: ansible
    shell: /bin/bash
    sudo:
      - ALL=(ALL) NOPASSWD:ALL
    ssh_authorized_keys:
      - . . .
    uid: 1000
    passwd: randomnoiseandthisisntahash
  - name: testuser
    shell: /bin/bash
    sudo:
      - ALL=(ALL) NOPASSWD:ALL
    uid: 1001
    passwd: . . .
    lock_passwd: false

fqdn: testsystem.angelsofclockwork.net

% virt-install --memory 8192 --vcpus 4 --cloud-init 'user-data=/var/lib/libvirt/images/data' --os-variant rocky9 --name testsystem.angelsofclockwork.net --disk $PWD/genclo9.5.qcow2 --import --network bridge=br1000 --boot uefi,loader_secure=no --autoconsole none

Starting install...
Creating domain... | 00:00:00 Domain creation completed.
[root@npxh00s0 images]# virsh console testsystem.angelsofclockwork.net
Connected to domain 'testsystem.angelsofclockwork.net'
Escape character is ^] (Ctrl + ])

testsystem login: testuser
Password:
[testuser@testsystem ~]$
%
% audit2why < /var/log/audit/audit.log
Nothing to do
% grep /proc/sys/vm/max_map_count /var/log/messages
% echo $?
1

Setting to needinfo.
Brad 2014


2025-01-08 18:27

reporter   ~0009276

Thank you, I think your example pointed me to the failure. Boot fails on both Rocky-9.4 and 9.5 unless I specify the "--boot uefi" option, which was not required on Rocky-9.3. Out of curiosity, is there any documentation that would indicate this requirement or change?

FIX: Starting with Rocky 9.4, the virt-install "--boot uefi" option is required.
Louis Abel


2025-01-08 18:40

administrator   ~0009277

UEFI is not required. The images are designed to boot on UEFI and non-UEFI. Here's the image booted in non-UEFI.

% virt-install --memory 8192 --vcpus 4 --cloud-init 'user-data=/var/lib/libvirt/images/data' --os-variant rocky9 --name testsystem.angelsofclockwork.net --disk $PWD/genclo9.5-2.qcow2 --import --network bridge=br1000 --autoconsole none
% virsh console testsystem.angelsofclockwork.net
Connected to domain 'testsystem.angelsofclockwork.net'
Escape character is ^] (Ctrl + ])

testsystem login: testuser
Password:
Last login: Wed Jan 8 18:36:21 on ttyS0
[testuser@testsystem ~]$ sudo su -
Last login: Wed Jan 8 18:36:23 UTC 2025 on ttyS0
[root@testsystem ~]# test -d /sys/firmware/efi
[root@testsystem ~]# echo $?
1
[root@testsystem ~]# fdisk -l /dev/vda
Disk /dev/vda: 10 GiB, 10737418240 bytes, 20971520 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: E811E2B5-98BE-4773-8387-812FAE82D5C6

Device Start End Sectors Size Type
/dev/vda1 2048 6143 4096 2M BIOS boot
/dev/vda2 6144 210943 204800 100M EFI System
/dev/vda3 210944 2258943 2048000 1000M Linux extended boot
/dev/vda4 2258944 20971486 18712543 8.9G Linux root (x86-64)

The above output shows the system is not booted via UEFI and that the partition table supports both BIOS boot and EFI boot. This can be verified in the VM xml as well.

### This shows this system is running with UEFI on.
% grep firmware xmpp01.angelsofclockwork.net.xml
  <os firmware='efi'>
    <firmware>
    </firmware>

### This shows the system is NOT running with UEFI.
% grep firmware testsystem.angelsofclockwork.net.xml
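
For anyone scripting the two checks shown above (the guest-side test for /sys/firmware/efi and the host-side grep of the domain XML), a minimal sketch follows. The function names guest_boot_mode and domain_uses_efi are made up for illustration, and the optional root argument exists only so the check can be tried against a scratch directory. A domain configured with a hand-written loader path rather than firmware='efi' would not be detected by the second helper.

```shell
#!/bin/sh
# Hypothetical helpers; names and arguments are illustrative, not part of any tool.

# Report how the running system was booted.
# $1: optional alternate root (defaults to /), handy for trying this on a scratch tree.
guest_boot_mode() {
    root="${1:-}"
    if [ -d "$root/sys/firmware/efi" ]; then
        echo uefi
    else
        echo bios
    fi
}

# Succeed if a saved libvirt domain XML file requests EFI firmware.
# $1: path to an XML file, e.g. saved with `virsh dumpxml DOMAIN > DOMAIN.xml`.
domain_uses_efi() {
    grep -q "firmware='efi'" "$1"
}
```

Inside the guest, guest_boot_mode with no argument prints uefi or bios; on the host, domain_uses_efi testsystem.angelsofclockwork.net.xml returns non-zero for a BIOS guest like the one above.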
Brad 2014


2025-01-08 20:06

reporter   ~0009278

I am able to reproduce the failure of the virt-install on a 9.5 host starting a 9.5 guest when not using the --boot uefi flag, so I'll do more research to find out why we're seeing different behavior.
Brad 2014


2025-01-08 22:18

reporter   ~0009281

Upon further analysis, the regression is related to the --graphics=none option to virt-install, which creates a headless guest (no video device).

1. With the GenericCloud-Base-9.3 image, virt-install --graphics=none boots SUCCESSFULLY:
# virt-install --memory 2048 --vcpus 2 --os-variant rocky9 --name test2 --disk image.qcow2 --import --network type=default --graphics none --autoconsole=text

2. With the GenericCloud-Base-9.5 image, virt-install --graphics=none FAILS TO BOOT (it hangs before the first console message):
# virt-install --memory 2048 --vcpus 2 --os-variant rocky9 --name test2 --disk image.qcow2 --import --network type=default --graphics none --autoconsole=text

3. As a workaround, it turns out that booting 9.5 using UEFI rather than BIOS works SUCCESSFULLY; it seems to deal with the missing video device:
# virt-install --memory 2048 --vcpus 2 --os-variant rocky9 --name test2 --disk image.qcow2 --import --network type=default --graphics none --boot uefi --autoconsole=text

4. Alternatively, removing the --graphics=none option also allows the BIOS boot to SUCCEED (though the guest machine is no longer headless, it now has a graphics device/vnc port, even though the console remains text):
# virt-install --memory 2048 --vcpus 2 --os-variant rocky9 --name test2 --disk $PWD/test2.qcow2 --import --network type=default --autoconsole=text

So the regression is that BIOS boot of a guest created using --graphics=none worked in 9.3 and fails in 9.4 and 9.5. The workaround is to either remove the --graphics=none option, causing a video device/vnc to be added to the guest, or to use the --boot=uefi option, which boots when there is no video device.

I'll leave the above here for the google-fu. The workarounds suffice for me, if this is considered a feature rather than a bug.
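
To confirm that a given guest really has no video device, one option is to inspect its saved domain XML. The helper below is only a sketch: the name domain_is_headless is hypothetical, and it assumes the XML was saved from `virsh dumpxml`, where graphics and video devices appear as <graphics> and <video> elements.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a saved libvirt domain XML has neither a
# <graphics> nor a <video> element, i.e. the guest is fully headless.
# $1: path to an XML file, e.g. saved with `virsh dumpxml test2 > test2.xml`.
domain_is_headless() {
    if grep -Eq '<(graphics|video)[ />]' "$1"; then
        return 1
    fi
    return 0
}
```

Running it against the XML of the guests from cases 2 and 4 above should show whether the only difference is the added graphics/video device.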
Brad 2014


2025-01-09 13:31

reporter   ~0009307

[ typos corrected ]

Upon further analysis, the regression is related to the --graphics=none option to virt-install, which creates a headless guest (no video device).

1. With the GenericCloud-Base-9.3 image, virt-install --graphics=none boots SUCCESSFULLY:
# virt-install --memory 2048 --vcpus 2 --os-variant rocky9 --name test2 --disk image.qcow2 --import --network type=default --graphics none --autoconsole=text

2. With the GenericCloud-Base-9.5 image, virt-install --graphics=none FAILS TO BOOT (it hangs before the first console message):
# virt-install --memory 2048 --vcpus 2 --os-variant rocky9 --name test2 --disk image.qcow2 --import --network type=default --graphics none --autoconsole=text

3. As a workaround, it turns out that booting 9.5 using UEFI rather than BIOS works SUCCESSFULLY; it seems to deal with the missing video device:
# virt-install --memory 2048 --vcpus 2 --os-variant rocky9 --name test2 --disk image.qcow2 --import --network type=default --graphics none --boot uefi --autoconsole=text

4. Alternatively, removing the --graphics=none option also allows the BIOS boot to SUCCEED (though the guest machine is no longer headless, it now has a graphics device/vnc port, even though the console remains text):
# virt-install --memory 2048 --vcpus 2 --os-variant rocky9 --name test2 --disk image.qcow2 --import --network type=default --autoconsole=text

So the regression is that BIOS boot of a guest created (on a Rocky 9.5 system) using --graphics=none worked for GenericCloud-Base 9.3 images and fails on 9.4 and 9.5 images. The workaround is to either remove the --graphics=none option, causing a video device/vnc to be added to the guest, or to use the --boot=uefi option, which boots when there is no video device.

I'll leave the above here for the google-fu. The workarounds suffice for me, if this is considered a feature rather than a bug in the GenericCloud-Base image.
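
The two workarounds can be captured in a small wrapper that only prints the virt-install command line, so it can be reviewed before running. This is a sketch under assumptions: the function name headless_install_cmd and the keywords "uefi" and "video" are hypothetical, and the remaining options are copied from the commands above.

```shell
#!/bin/sh
# Sketch only: prints (does not run) a virt-install command line for a
# GenericCloud-Base 9.4/9.5 guest, using one of the two workarounds above.
# Function name, arguments and keyword values are hypothetical.

# $1: guest name, $2: disk image path, $3: workaround ("uefi" or "video")
headless_install_cmd() {
    name="$1"; disk="$2"; workaround="${3:-uefi}"
    common="virt-install --memory 2048 --vcpus 2 --os-variant rocky9 --name $name --disk $disk --import --network type=default --autoconsole=text"
    case "$workaround" in
        # Stay headless and boot via UEFI.
        uefi)  echo "$common --graphics none --boot uefi" ;;
        # BIOS boot; omitting --graphics none lets a video device/vnc be added.
        video) echo "$common" ;;
        *)     echo "unknown workaround: $workaround" >&2; return 1 ;;
    esac
}
```

For example, headless_install_cmd test2 image.qcow2 uefi prints the command from case 3, and the video keyword prints the command from case 4.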

Issue History

Date Modified Username Field Change
2024-05-23 16:42 Brad 2014 New Issue
2024-05-27 12:16 Brad 2014 Note Added: 0007195
2024-05-28 12:32 Brad 2014 Note Added: 0007228
2024-05-28 12:36 Brad 2014 Note Added: 0007229
2025-01-08 17:57 Brad 2014 Note Added: 0009274
2025-01-08 18:09 Louis Abel Assigned To => Louis Abel
2025-01-08 18:09 Louis Abel Status new => needinfo
2025-01-08 18:09 Louis Abel Note Added: 0009275
2025-01-08 18:27 Brad 2014 Note Added: 0009276
2025-01-08 18:40 Louis Abel Note Added: 0009277
2025-01-08 20:06 Brad 2014 Note Added: 0009278
2025-01-08 22:18 Brad 2014 Note Added: 0009281
2025-01-09 13:31 Brad 2014 Note Added: 0009307