Re: Problem after upgrade 4.5 to 4.6: ERR M

From: Uwe Dippel
Date: Monday, March 22, 2010 - 1:41 am

Having done upgrades from 4.0 onwards on an OpenBSD-only server (amd64), 
this time something must have gone wrong. Despite the 'successful' upgrade 
(no error messages; done remotely via serial console, as I have no 
physical access), when I was asked to reboot, I did, as always. Alas, it 
came up with

Attempting Boot From Floppy Drive (A:)
Attempting Boot From CD-ROM
Attempting Boot From Hard Drive (C:)
Using drive 0, partition 3.
Loading...
ERR M

on an HP ML350G4p.

From all I know, it is a problem with the MBR.
What I'd really like to know, before I drive there and get physical 
access, is the best and most straightforward way to solve this problem. 
Talking about what went wrong can wait, since this is a production 
machine and should be back up as soon as possible.

Thanks in advance,

Uwe

From: Chris Bennett
Date: Monday, March 22, 2010 - 3:32 am

This sort of problem happens a lot. It was covered recently.
You will probably be able to fix it.
Search here, try ERR M

http://blog.gmane.org/gmane.os.openbsd.misc
http://marc.info

-- 
A human being should be able to change a diaper, plan an invasion,
butcher a hog, conn a ship, design a building, write a sonnet, balance
accounts, build a wall, set a bone, comfort the dying, take orders,
give orders, cooperate, act alone, solve equations, analyze a new
problem, pitch manure, program a computer, cook a tasty meal, fight
efficiently, die gallantly. Specialization is for insects.
   -- Robert Heinlein

From: Tobias Ulmer
Date: Monday, March 22, 2010 - 3:45 am

That is biosboot(8) telling you that it cannot find /boot, which is the
boot loader that prints the boot prompt and so on. Biosboot has to be

As explained above, no, you likely moved around/corrupted /boot in a way


From: Uwe Dippel
Date: Monday, March 22, 2010 - 6:59 am

Hmm. Actually I didn't. Through the serial console, I had rebooted the 
server, just 'to make sure', before booting to bsd.rd, and everything 
went through. I rebooted again, immediately, to bsd.rd, and went through 
the normal, standard procedure like umpteen times before. One 
exception: bsd.mp was reported as corrupted by its sha256 hash. The 
install program continued anyway, so I could not rectify this. 
This being a multi-CPU box, in the end it automatically copied the 
(corrupted) bsd.mp to bsd, which then had a size of 1.3 MB. Therefore, 
at the very end, after the device nodes, at the 'reboot now' prompt, I 
ftp-ed a correct version from another location and cp-ed it 
to bsd. Then, strangely enough, a bsd.sp of size 0 suddenly appeared, 
which had not been there before.
I found this quite strange: the installer going through despite 
the wrong hash; even more so the (new?) automatic move of bsd.mp to bsd 
on a multicore machine, though the size was wrong; and in the end a 
zero-sized bsd.sp after moving in a healthy bsd.mp.
I would not totally exclude interference from this (new?) code that 
led to the described situation. Honestly, I did nothing at all in that 
session, between the two boots, aside from what I wrote. I guess nothing 
of what I did should hurt /boot?

Thanks for the reply. I'll go there next to try what has been proposed. 
Before I try, in case the
# /usr/*m*dec/installboot -v boot /*usr/mdec*/biosboot sd0
does NOT work, what else could I do? (I am asking because the 
server room is quite far away, difficult to get to, and gives me little 
chance to communicate.) So, is there any alternative or additional 
solution to fall back on when I am there and installboot doesn't cut it?

Uwe

From: Tobias Ulmer
Date: Monday, March 22, 2010 - 9:03 am

On Mon, Mar 22, 2010 at 09:59:20PM +0800, Uwe Dippel wrote:

Well, re-install. The installer does it the same way.

From: Nick Holland
Date: Monday, March 22, 2010 - 12:32 pm

Actually, the hash wasn't reported as "wrong", just different from what 
was stored in bsd.rd. There are a lot of legitimate reasons why that 
might be, so that's why the installer continues.
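
A manual check before copying a kernel into place could look like the sketch below. It assumes a BSD-style SHA256 list (lines of the form "SHA256 (bsd.mp) = <digest>") sitting next to the downloaded sets; the function names file_sha256 and verify_set are made up for illustration and are not part of the installer.

```shell
#!/bin/sh
# Sketch: compare a file's SHA256 digest with its entry in a BSD-style
# SHA256 list. Function names are hypothetical, not installer code.

# Print the hex digest of a file, using whichever tool is present.
file_sha256() {
    if command -v sha256 >/dev/null 2>&1; then
        sha256 -q "$1"                      # OpenBSD
    else
        sha256sum "$1" | awk '{print $1}'   # GNU coreutils
    fi
}

# verify_set LISTFILE FILE
# Returns 0 on match, 1 on mismatch, 2 if the file is not listed
# or could not be hashed.
verify_set() {
    list=$1
    file=$2
    name=$(basename "$file")
    # A list line splits into: SHA256, (name), =, digest
    want=$(awk -v n="($name)" '$1 == "SHA256" && $2 == n {print $4}' "$list")
    [ -n "$want" ] || { echo "$name: not listed in $list" >&2; return 2; }
    got=$(file_sha256 "$file") || return 2
    if [ "$got" = "$want" ]; then
        echo "$name: OK"
    else
        echo "$name: MISMATCH" >&2
        return 1
    fi
}
```

Something like `verify_set SHA256 bsd.mp && cp bsd.mp /mnt/bsd` would then refuse to copy a kernel whose digest does not match the list. Whether a mismatch is a real problem is still a judgment call.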

It is up to you to determine "I know why the hashes didn't match, so 

Well, something did damage /boot. I doubt it was anything you did 
intentionally, but something also caused a bad bsd.sp to be copied over. 
Possibly related. May indicate system problems of some type.

The upgrade process will install a new boot loader for you, but 

Well, I'd certainly copy over a new, verified sane /boot before running 
it, and I'm not sure what all the asterisks are doing in there... and you 
had better make sure you are in the right directory if you are planning 
on running it as you show.

Might even want to wait a bit before hitting "reboot", in case something 
really odd is going on, like a caching RAID controller that dumps 
unwritten data as part of a reboot (I really hope I'm wrong on that one...).

Nick.

From: Uwe Dippel
Date: Monday, March 22, 2010 - 10:52 pm

Okay, back. Works!

But we should not stop here.
Because when mounting my / on /mnt, I noticed that /boot had also 
gone to zero size, like that bsd.sp, which was okay but went to 0 
after copying bsd.mp to bsd. What would make /boot zero now?

/usr/mdec/installboot -v boot /usr/mdec/biosboot sd0
says something like
boot: boot proto: /usr/mdec/biosboot device:/dev/rsd0c
no error message, but /boot is still '0'.
Then I removed /boot, and then an error message came up:
... boot: No such file or directory
Meaning, it couldn't rectify the /boot of size 0.
Last chance: I copied /usr/mdec/boot to /mnt/
Again:
/usr/mdec/installboot -v boot /usr/mdec/biosboot sd0
boot: boot proto: /usr/mdec/biosboot device:/dev/rsd0c
boot is 3 blocks x 16384 bytes
fs block shift 2; offset 63; inode block 24, offset 936
using MBR partition 3: type 0xA6 offset 63
Now, this looked promising and actually worked.
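
For anyone landing here with the same ERR M, the sequence above can be sketched as a small script to run from bsd.rd. The root partition (/dev/sd0a) is an assumption, as is the cd into /mnt, which follows the earlier advice about being in the right directory when "boot" is given as a relative path. The helper names step and fix_boot are made up. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the recovery steps from this thread, meant for the bsd.rd shell.
# ROOT_DEV is an assumption; the disk (sd0) and the /usr/mdec paths are the
# ones quoted in the thread. Commands are only printed unless RUN=1 is set.

ROOT_DEV=${ROOT_DEV:-/dev/sd0a}   # assumed root partition
DISK=${DISK:-sd0}

# Print a command, and execute it only when RUN=1.
step() {
    echo "+ $*"
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    fi
}

fix_boot() {
    step mount "$ROOT_DEV" /mnt &&
    step cp /usr/mdec/boot /mnt/boot &&
    step cd /mnt &&
    step /usr/mdec/installboot -v boot /usr/mdec/biosboot "$DISK"
}
```

Calling fix_boot prints the four steps for review. With RUN=1 set beforehand it runs them and stops at the first failure.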
I'd still bet a round of drinks that there is a bug in the 
recent install/upgrade code that tends to truncate files to zero 
size.
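
Until that is settled, a quick look for zero-length files before hitting "reboot" seems cheap insurance. A minimal sketch, assuming the root is mounted on /mnt and that boot and bsd are the files worth checking (both defaults are my assumption; check_nonempty is a made-up name):

```shell
#!/bin/sh
# Sketch: report files that are missing or zero-length under a mounted root.
# The default root (/mnt) and file list (boot, bsd) are assumptions.

# check_nonempty [ROOT [FILE...]]
# Returns 0 only if every listed file exists and has a size above zero.
check_nonempty() {
    root=${1:-/mnt}
    [ $# -gt 0 ] && shift
    [ $# -gt 0 ] || set -- boot bsd
    bad=0
    for f in "$@"; do
        if [ -s "$root/$f" ]; then
            echo "ok      $root/$f"
        else
            echo "EMPTY   $root/$f (missing or zero size)"
            bad=1
        fi
    done
    return $bad
}
```

Running `check_nonempty /mnt boot bsd bsd.sp` just before the reboot would have flagged both the empty /boot and the empty bsd.sp in my case.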

Thanks for all the input to get this production box back!

Uwe
