Proxmox server won't boot

Hello!
I had this server with 4 disks in this situation:

  • Disks A and B have Proxmox running on raid1
  • Disks C and D are empty

I wanted to reinstall everything from scratch, so I planned to install Proxmox on ZFS on disks C and D (mirrored) and wipe disks A and B. I did not find any option in the installer to do anything with disks A and B, but the installation finished without errors. I used the Proxmox ISO.

After rebooting, I noticed everything was running like nothing happened. It was booting the old Proxmox installation.

Then I thought: maybe deleting the partitions on disks A and B will allow me to boot the fresh Proxmox instance on the other disks without much work :joy:

Now I know it doesn’t work that way.

So, how can I boot Proxmox installed on disks C and D?

Reinstalling is an option, but I want to learn before this happens with valuable data.

I hope you can help me. Thanks

You need to boot into a rescue mode and reinstall GRUB with the correct drive/boot options. If you installed GRUB on disks C and D as part of the installation, you could try giving those a higher boot priority in the BIOS, so it tries them before disks A and B.


To add to that and make things a bit clearer: GRUB should ideally be installed on the disk that is the primary boot option in the BIOS. So usually that will still be disk A even after the new install (which is also why it booted into the old Proxmox).
But you need to tell that GRUB explicitly that there is another install on a different disk, and that it is the one you want to boot.

If you go the rescue-mode route, make sure you mount the new install, chroot into it, and maybe write GRUB to all disks so that every possible BIOS boot order is covered :wink:
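
Roughly like this from a rescue shell - a minimal sketch only, assuming a classic ext4-on-mdraid layout with the new root on /dev/md1 and a separate /boot on /dev/md0. Device names are just examples; if the new install's root is on ZFS, the mounting part works differently (pool import) and this sketch doesn't apply as-is:

    # find out what is where first
    lsblk -f
    cat /proc/mdstat

    # mount the new install (adjust device names to your layout)
    mount /dev/md1 /mnt                 # root of the new install
    mount /dev/md0 /mnt/boot            # separate /boot, if you have one
    for d in /dev /proc /sys; do mount --bind "$d" "/mnt$d"; done

    # chroot in and write grub to every disk, so any BIOS boot order works
    chroot /mnt /bin/bash
    # (now inside the chroot)
    grub-install /dev/sda
    grub-install /dev/sdb
    grub-install /dev/sdc
    grub-install /dev/sdd
    update-grub
    exit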

Writing grub to all disks is a good shout.

I think I’m not prepared for all of this yet.

I reinstalled Proxmox with ZFS on disks C and D, but it still won't boot.

Checked the BIOS and iLO, and I don't have an option to select a specific disk to boot from.

Is this rescue mode?

https://hostballs.com/uploads/default/original/2X/2/2e19c8af46bf53647f348df9bd1190c5bd3d1235.png

Any good tutorial that explains how to do this? Found this but I don’t know where to write it.

No, sorry, at least that's not what I meant. Instead, boot from a rescue ISO that gives you an actual shell. GParted Live or whatever your provider offers should do.

From there you need to mount your installed system, chroot into it and rewrite GRUB. See: SystemRescue - Repairing a damaged Grub

By the way, did you install the root partition on ZFS as well (I hope you didn't)? Do you have a separate /boot partition?
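
From the rescue shell you can check that quickly, for example (a sketch; the zpool command is only there if the rescue ISO ships ZFS support):

    lsblk -f           # partitions, filesystems and labels on all disks
    cat /proc/mdstat   # any mdraid arrays and their state
    zpool import       # lists ZFS pools it can see, without actually importing them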

Alternatively, you can try to manually boot from the correct partition using GRUB rescue (what you see in the screenshot); there is a howto here: Repair Linux boot failures in GRUB 2 rescue mode. That probably won't work if everything is on ZFS, though.
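
For reference, the manual route from that grub rescue prompt looks roughly like this - just a sketch, the (hdX,gptY) numbers are something you have to discover with ls, and as said it likely won't help if /boot is on ZFS:

    grub rescue> ls                         # list the disks/partitions grub can see
    grub rescue> ls (hd2,gpt2)/             # poke around until one of them contains grub's files
    grub rescue> set root=(hd2,gpt2)
    grub rescue> set prefix=(hd2,gpt2)/boot/grub   # or (hd2,gpt2)/grub if /boot is its own partition
    grub rescue> insmod normal
    grub rescue> normal                     # should drop you into the regular grub menu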

Which controller is in there? IIRC those HP RAID controllers let you select which RAID device to boot off of. You need to get into the RAID controller's BIOS though. It probably says something like "Press Ctrl-A to enter BIOS" during POST.

As @skorous said above, I'm sure those HPs have an option to select which disk you want - where the screenshot says "See Boot Controller Order".

Another simple option might be to boot while viewing the console in iLO (I can't remember which key brings up the boot menu); when the boot menu comes up and you go to HDD, you can then change which drive to use. Try this to see if you can actually boot into the correct Proxmox. If not, then GRUB is probably installed wrong, or not at all.

Under Boot Controller Order there were only 2 options. I have 4 disks.

I tried iLO, BIOS, the boot menu… nothing. The only option is HDD and not a specific one.

Thank you guys for your time. I got frustrated and decided to use the Debian ISO instead, then install Proxmox on top using raid1. That way was easier. Maybe later I'll have time for more tests.

When you selected HDD did you not get a second menu asking which HDD?

Glad it’s sorted. That’s how I always install proxmox normally - on top of standard Debian install.

After selecting #5, it continues to boot from the first disk.

Out of curiosity, which HP model is this?

DL360e gen8

Now I see this before shutdown:

:frowning:

And this on startup:

Then everything continues fine, and mdadm shows the RAID devices as clean.
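
For reference, I checked with something along these lines (md0 is just one of the arrays):

    cat /proc/mdstat            # overview of all arrays: [UU] means both mirror halves are present
    mdadm --detail /dev/md0     # per-array detail; "State : clean" is what you want to see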

I suggest you first define what you want to achieve in terms of partitioning, volumes and so on, and only then decide what type of RAID/filesystems you want to roll with and what to put where.

Mixing everything up just because you can doesn't make sense and can potentially lead to problems, as the kernel might need to load drivers early, etc.

Rather, try to keep it simple. For instance:

  • Make a separate /boot partition, if you haven't already; that's plain, simple raid1.
  • Don't put the root partition on fancy sh*t like ZFS or LVM unless really needed.
  • Use those only for your data folder (/var/lib/vz?), which can be (un)mounted accordingly at runtime.
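
As a rough sketch of what I mean - names, sizes and devices are placeholders, not a recipe:

    # OS pair (say sdc/sdd): two small partitions each, plain mdraid mirrors
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdc1 /dev/sdd1   # -> /boot, ext4
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc2 /dev/sdd2   # -> /,     ext4
    # the data folder (/var/lib/vz) then lives on the other pair,
    # on whatever raid/filesystem you like, and can be mounted at runtime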


LVM is there because it has always been there on all my servers. To be honest, I don't know how necessary it is.

I have 3 raid devices.

  • sdc/sdd
    • md0
      • /boot
    • md1
      • lvm
        • /root
        • swap
  • sda/sdb
    • md2
      • /data

So LVM is not needed then?

Why did I do it that way? What I want:

  • OS and important VMs running on sdc/sdd. They are more reliable disks.
  • VMs for dev or tests running on sda/sdb. They are smaller, lower-performance SSDs.
  • Mirroring.

I see. I'm not saying LVM is wrong, it's just that I prefer to remove layers that aren't needed. And in your case it probably isn't, unless you plan to use some of its specific features at some point.

If that's not the case, you could of course put root directly on md1, and swap even directly on a partition of each disk (no need to have swap on raid1).
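
For the swap part that would just be two plain swap partitions with equal priority, something like this in /etc/fstab (partition names are examples; UUIDs are more robust in practice):

    # /etc/fstab
    /dev/sdc3  none  swap  sw,pri=1  0  0
    /dev/sdd3  none  swap  sw,pri=1  0  0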

For the data, ZFS could make sense if you want to benefit from the caching and such, but then you obviously wouldn't need mdraid; you'd use ZFS's own mirroring directly. On the other hand, if it's rather for staging, the benefit might not be worth the hassle and the memory consumption.
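
If you did want ZFS for the data pair, that would be something like this (a sketch only; pool name, ashift and devices are assumptions, and using /dev/disk/by-id/... paths is nicer in practice):

    # mirror the two data disks directly with ZFS, no mdraid underneath
    zpool create -o ashift=12 tank mirror /dev/sda /dev/sdb
    zfs set compression=lz4 tank
    zfs set mountpoint=/var/lib/vz tank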

Overall your setup is not wrong; as usual, there are many ways to get there. But if root is on LVM, then the messages you see on shutdown or startup might make sense, as LVM obviously can only activate and mount your root partition after the mdraid is up and running, and vice versa when shutting everything down.
Eventually things wait or get retried anyway, so that's why the message might be there even though nothing gets hurt. It's just that, at that moment in time, things are not ready for that task…


Sorry if I’m being a bit slow. Why are we using MD on this box? Doesn’t it have a Smart Array controller to handle the raid creation?

I was wondering if it might be smart, in case a disk abruptly dies, not to risk losing swap contents…? :slight_smile:

When I bought the server, I was told the controller doesn't work with the latest version of Debian. TBH I didn't try.