kern.geom.debugflags=16 does NOT allow me to write to device

From: Peter Schuller
Date: Thursday, April 22, 2010 - 1:55 pm

open() in O_RDWR fails on the device in question (which is "used" by
glabel, indirectly by gmirror and zfs).

This is on an 8.0 userland and an 8-STABLE kernel. That is a bit stupid, I
know (never mind why), but given that a plain open() syscall is failing,
I highly doubt it has anything to do with the userland being out of
sync. I cannot imagine GEOM changing like that between 8.0 and
8-STABLE before the 8.1 release (correct me if this is a poor
assumption).

Observe:

% whoami
root
% sysctl -w kern.geom.debugflags=16
kern.geom.debugflags: 16 -> 16
% sysctl kern.geom.debugflags
kern.geom.debugflags: 16
% ktrace disklabel -B /dev/ad9s1
disklabel: Class not found

kdump shows:

 15399 disklabel CALL  open(0x800c02040,O_RDWR,<unused>0xa1a5)
 15399 disklabel NAMI  "/dev/ad9s1"
 15399 disklabel RET   open -1 errno 1 Operation not permitted
 15399 disklabel CALL  open(0x800651b68,O_RDONLY,<unused>0)
 15399 disklabel NAMI  "/dev/geom.ctl"
 15399 disklabel RET   open 4
 15399 disklabel CALL  ioctl(0x4,GEOM_CTL,0x800c04040)
 15399 disklabel RET   ioctl 0
 15399 disklabel CALL  close(0x4)
 15399 disklabel RET   close 0
 15399 disklabel CALL  write(0x2,0x7fffffffde90,0xb)
 15399 disklabel GIO   fd 2 wrote 11 bytes
       "disklabel: "
 15399 disklabel RET   write 11/0xb
 15399 disklabel CALL  write(0x2,0x7fffffffdf70,0xf)
 15399 disklabel GIO   fd 2 wrote 15 bytes
       "Class not found"
 15399 disklabel RET   write 15/0xf

It has these labels:

label/prboot1r1     N/A  ad9s1a
label/prswap1r1     N/A  ad9s1b
label/prtank1r1     N/A  ad9s1d

prtank1r1 is part of a ZFS pool.

-- 
/ Peter Schuller
_______________________________________________
freebsd-fs@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-fs
To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
From: Freddie Cash
Date: Thursday, April 22, 2010 - 2:14 pm

You need to set it to 17 now.

-- 
Freddie Cash
fjwcash@gmail.com
_______________________________________________
From: Peter Schuller
Date: Thursday, April 22, 2010 - 2:29 pm

I saw some references to that while Googling, but it doesn't work:

% sysctl -w kern.geom.debugflags=17
kern.geom.debugflags: 17 -> 17
% ktrace disklabel -B /dev/ad9s1
disklabel: Class not found

And kdump still shows:

 15535 disklabel CALL  open(0x800c02040,O_RDWR,<unused>0xa1a5)
 15535 disklabel NAMI  "/dev/ad9s1"
 15535 disklabel RET   open -1 errno 1 Operation not permitted

In addition, geom(4) still has:

     0x10 (allow foot shooting)
             Allow writing to Rank 1 providers.  This would, for example,
             allow the super‐user to overwrite the MBR on the root disk or
             write random sectors elsewhere to a mounted disk.  The implica‐
             tions are obvious.

In addition, geom/geom_subr.c has:

	/* If foot-shooting is enabled, any open on rank#1 is OK */
	if ((g_debugflags & 16) && pp->geom->rank == 1)
		;

I wonder if the problem is that it's not of rank 1 because I'm writing
to the slice's first sector rather than the MBR... That's now feeling
pretty likely and could explain a lot of the confusion that seems to
exist, judging by what Googling turns up.
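
To make the hypothesis concrete, here is a minimal Python model of the quoted
check. It is only a sketch: `may_open_for_write`, its parameters, and the
simplified "already held exclusively" condition are assumptions made for
illustration, not the actual logic of geom_subr.c. Only the `& 16` and
rank == 1 test is taken from the snippet above.

```python
import errno

FOOT_SHOOTING = 0x10  # the "allow foot shooting" bit from geom(4)


def may_open_for_write(debugflags, rank, held_exclusively):
    """Return 0 if the open would be allowed, else errno.EPERM.

    Hypothetical model: the real kernel code performs more checks.
    """
    # Mirrors the quoted test: the flag only bypasses checks on rank 1.
    if (debugflags & FOOT_SHOOTING) and rank == 1:
        return 0
    # Assumed simplification: an exclusive holder blocks writers.
    if held_exclusively:
        return errno.EPERM
    return 0


# Whole disk (rank 1) versus a slice stacked on top of it (rank 2).
for flags in (16, 17):
    print(flags, may_open_for_write(flags, 1, True),
          may_open_for_write(flags, 2, True))
```

In this model 17 behaves exactly like 16, because the only bit the check
looks at is 0x10; the rank of the provider is what decides the outcome.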

Does anyone have thoughts on what the proper action is here? Or do I need
to patch my kernel to update my label? :)

(I could pop it out of geom/zfs temporarily and hope the other disk
doesn't go. But as a matter of principle I don't want to go that
route...)

-- 
/ Peter Schuller
_______________________________________________
From: Peter Schuller
Date: Thursday, April 22, 2010 - 2:57 pm

Well, for now I went with that option anyway (zpool offline, gmirror
remove, glabel stop, then disklabel, then zpool online etc).

It would be interesting to hear, though, whether there is a technical
reason why the foot-shooting flag does not apply to providers that are
not rank 1. If I'm missing something please enlighten me, but the
particular use case of writing a bootstrap to a slice is presumably not
unusual at all, so I think it ought to be possible.
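
For reference, geom(4) describes rank roughly as follows: a geom with no
attached consumers has rank 1, and any other geom ranks one higher than the
highest-ranked geom below it. The sketch below applies that rule to a stack
like the one in this thread. The `BUILT_ON` table is an assumed, simplified
picture of the topology (one parent per provider), not output from the
actual machine.

```python
# Assumed topology: each provider maps to the provider its geom consumes.
BUILT_ON = {
    "ad9": None,                  # the disk itself
    "ad9s1": "ad9",               # MBR slice
    "ad9s1a": "ad9s1",            # BSD partition inside the slice
    "label/prboot1r1": "ad9s1a",  # glabel on top of the partition
}


def rank(provider):
    """Rank 1 at the bottom of the stack, one more per layer above it."""
    below = BUILT_ON[provider]
    return 1 if below is None else rank(below) + 1


for name in BUILT_ON:
    print(name, rank(name))
```

Under these assumptions only ad9 is rank 1, which would be consistent with
the flag having no effect on /dev/ad9s1.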

-- 
/ Peter Schuller
_______________________________________________
From: Jeremy Chadwick
Date: Friday, April 23, 2010 - 1:23 am

You'd have seen this problem on 8.0-RELEASE as well.  I've been bitching
about it since trying out 8.0-RC1.  The "Class not found" errors have
existed for way too long, and are confusing people left and right.

Supposedly we're meant to use gpart(8) now, but I haven't figured out
how to use it in the same way as bsdlabel.

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

_______________________________________________
From: Andrey V. Elsukov
Date: Friday, April 23, 2010 - 1:59 am

It's easy.

# gpart create -s mbr md0
md0 created
# gpart add -t freebsd md0
md0s1 added
# gpart create -s bsd md0s1
md0s1 created
# gpart add -t freebsd-ufs -s 50m md0s1
md0s1a added
# gpart add -t freebsd-swap md0s1
md0s1b added
# gpart bootcode -b /boot/boot md0s1
md0s1 has bootcode
# gpart bootcode -b /boot/mbr md0
md0 has bootcode
# gpart set -a active -i 1 md0
md0s1 has active set


-- 
WBR, Andrey V. Elsukov
_______________________________________________
From: Jeremy Chadwick
Date: Friday, April 23, 2010 - 2:02 am

Thank you.  Based on this, I take it the following two commands are
equivalent?

Old: bsdlabel -B ad0s1
New: gpart bootcode -b /boot/boot ad0s1

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

_______________________________________________
From: Martin Simmons
Date: Friday, April 23, 2010 - 1:09 pm

Yes, they should be.

__Martin
_______________________________________________
From: Peter Schuller
Date: Friday, April 23, 2010 - 3:34 am

> It's easy.

Thank you for posting the example. I never really understood that
gpart was to be the generic tool; I thought it was gpt specific.
Obviously I should have read up better.

Is gpart to be considered "tested", "stable", "production quality"
and/or "default" now then, or is it still cutting edge/experimental?

So for example, would it make sense to submit patches (not promising
anything) to adjust the handbook and relegate disklabel to an
historical artifact?

-- 
/ Peter Schuller
_______________________________________________
From: Andriy Gapon
Date: Friday, April 23, 2010 - 5:25 am

Yes, it's "tested", "stable", "production quality" and/or "default".
All other tools are slowly rotting now, but can be fixed to work correctly via
the GEOM interface, the same way gpart does now.

gpart is the way to go and is the best tool to use on recent FreeBSD.

-- 
Andriy Gapon
_______________________________________________
From: Antony Mawer
Date: Friday, April 23, 2010 - 5:42 am

Agreed. Having just gone through updating $work's installer to use
gpart instead, it makes partitioning so much more pleasant and
consistent than previous tools...

--Antony
_______________________________________________
From: John Baldwin
Date: Friday, April 23, 2010 - 5:44 am

Actually, the other tools were already fixed to work properly with GEOM, but
they used the older set of GEOM classes (GEOM_BSD, GEOM_MBR, etc.) instead
of the GEOM_PART classes.

-- 
John Baldwin
_______________________________________________
From: Andriy Gapon
Date: Friday, April 23, 2010 - 7:24 am

Yes.  Thanks for the clarification!

-- 
Andriy Gapon
_______________________________________________
From: Ivan Voras
Date: Friday, April 23, 2010 - 8:18 am

It also depends on the meaning of "fixed" :) They mostly wrote to disk
drives directly from userland and relied on GEOM to pick up the changes
via the "spoil" mechanism, which is why they couldn't write to
partition tables if a file system on one of the partitions was mounted,
etc., requiring hacks like kern.geom.debugflags=16 to drop the
permission (actually reference counting) checks.

Gpart has a kernel-userland interface so that the userland part(s) tell 
the kernel what they need and the kernel does the writing (if applicable).

_______________________________________________
From: John Baldwin
Date: Monday, April 26, 2010 - 6:59 am

Not true.  They sent verb actions to the modules to write updated sectors.
Look at the write_disk() method in fdisk.c for example.  It uses a verb
for the GEOM_MBR class to update the MBR and only falls back to raw disk
access if that fails.
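
The pattern described here (ask the GEOM class to do the write, and only
touch the raw device if that fails) can be sketched roughly as below. This
is not the code from fdisk.c: `send_class_verb` and `write_raw` are
hypothetical callables standing in for the real control request and the
raw device write.

```python
def write_sector(data, send_class_verb, write_raw):
    """Try the GEOM class first; fall back to a raw write on failure.

    Both callables are hypothetical stand-ins and should return None on
    success or an error string on failure.
    """
    error = send_class_verb(data)
    if error is None:
        return "verb"
    # The class refused or is not available: fall back to direct I/O.
    error = write_raw(data)
    if error is None:
        return "raw"
    raise OSError(error)


# Example: the class reports an error, so the raw path is used.
print(write_sector(b"\x00" * 512,
                   lambda d: "Class not found",
                   lambda d: None))
```

The return value records which path succeeded, so a caller could warn the
user when the fallback was needed.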

-- 
John Baldwin
_______________________________________________
From: Martin Simmons
Date: Friday, April 23, 2010 - 12:58 pm

Ironically, disklabel is using GEOM, but sysinstall and sade are still using
libdisk which does direct I/O...and also generates different BSD labels from
GEOM_PART.

__Martin
_______________________________________________
From: jhell
Date: Friday, April 23, 2010 - 6:12 am

A curious question: I seem to remember a thread that discussed the
need to specify -b & -s to "add". I recall this being fixed in more
recent versions, i.e. >= 8.X. Does anyone know if these changes can be
MFC'd to stable/7?

Attached is a diff I just generated "stable/7/sbin/geom/class/part ->
stable/8/sbin/geom/class/part" as of recent commit r207113.

Thanks

-- 

 jhell
From: Peter Schuller
Date: Friday, April 23, 2010 - 3:29 am

And this avoids the problem by talking to GEOM via a published
interface and having the kernel do the actual change, instead of
writing directly to disk, correct? (I got that impression on a quick
look.)

-- 
/ Peter Schuller
_______________________________________________