Re: Can not boot 7.0-BETA3 with IPSEC

From: Frank Behrens
Date: Thursday, November 22, 2007 - 1:46 am

Hi,

I tried to use the new 7.0 version, but have run into trouble. The PC has been running 5.x/6.x
for years without problems, but my new kernel does not boot. A self-compiled GENERIC
7.0-BETA3 kernel runs without problems.

When I use the following kernel configuration
-----------
include         GENERIC

ident           GENIPSEC

makeoptions     DEBUG=-g                # Build kernel with gdb(1) debug symbols

options         INVARIANTS              # Enable calls of extra sanity checking
options         INVARIANT_SUPPORT       # Extra sanity checks of internal structures, required by INVARIANTS
options         WITNESS                 # Enable checks to detect deadlocks and cycles
options         WITNESS_SKIPSPIN        # Don't run witness on spinlocks for speed

options         IPSEC                   #IP security
device          crypto
#options         IPSEC_DEBUG             #debug for IP security
options         IPSEC_FILTERTUNNEL         #filter ipsec packets from a tunnel

device          puc
nodevice        uart
options         COM_MULTIPORT
----------

the kernel boots until

FreeBSD 7.0-BETA3-200711220702 #1: Thu Nov 22 08:10:52 CET 2007
    frank@moon.behrens:/data3/sys/obj/data3/sources/fbsd7/sys/GENIPSEC
WARNING: WITNESS option enabled, expect reduced performance.
...
cryptosoft0: <software crypto> on motherboard
crypto: assign cryptosoft0 driver id 0, flags 100663296
...
Fast IPsec: Initialized Security Association Processing.
...
SMP: AP CPU #1 Launched!
cpu1 AP:
     ID: 0x01000000   VER: 0x00050014 LDR: 0x00000000 DFR: 0xffffffff
  lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000000 SVR: 0x000001ff
  timer: 0x000200ef therm: 0x00010000 err: 0x00010000 pcm: 0x00010000
WARNING: WITNESS option enabled, expect reduced performance.
Trying to mount root from ufs:/dev/ad6s1a
start_init: trying /sbin/init


Then the system seems to hang: no messages, no reaction on the serial console.
With a different kernel including DDB I was not able to enter the ...
From: Bjoern A. Zeeb
Date: Thursday, November 22, 2007 - 6:47 am

On Thu, 22 Nov 2007, Frank Behrens wrote:


looks ok, from what I can see skipping over it. IPSEC, crypto,
optional FILTERTUNNEL.

What is strange is that it seems to hang once it enters userland after/
while starting init.

Just some random things that come to my mind:
a) do you have ipsec enabled in rc.conf so that a policy would be
    inserted?
b) you are not trying to mount anything from nfs?
c) is anything else displayed on the screen if not on serial console?
d) can you try without the puc/COM_MULTIPORT but with IPSEC?


I'll think a bit more this evening (CEST).

-- 
Bjoern A. Zeeb                                 bzeeb at Zabbadoz dot NeT
Software is harder than hardware  so better get it right the first time.
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
From: Frank Behrens
Date: Thursday, November 22, 2007 - 8:31 am

Hi Bjoern,

thanks for your answer!


Meanwhile I added some debug print statements. In init_main.c, in function start_init(), the execve
of /sbin/init returns without an error. But init does not seem to be called, even a print statement ...




A new kernel is compiling now. I'll post the result.

Best regards,
   Frank
-- 
Frank Behrens, Osterwieck, Germany
PGP-key 0x5B7C47ED on public servers available.

From: Frank Behrens
Date: Thursday, November 22, 2007 - 8:54 am

3 Points!
The IPSEC kernel now boots fine. Now I'll try to see what's wrong with sio/puc/MULTIPORT.
I'll post the result.

Best regards,
   Frank
-- 
Frank Behrens, Osterwieck, Germany
PGP-key 0x5B7C47ED on public servers available.

From: Frank Behrens
Date: Friday, November 23, 2007 - 3:20 am

The problem described in my previous email is not caused by the IPSEC setup. I should have
investigated this much better. :-(
Thanks to Bjoern for his debugging hints.

7.0-BETA3 seems to have a problem with the puc(4) driver. When I enable the driver with the
kernel configuration

device          puc
nodevice        uart
options         COM_MULTIPORT

I see the following effects:

1. The sio(4) driver does not attach to the ports provided by the puc(4) driver as it does in
RELENG_6. In RELENG_7 it shows as

puc0: <Oxford Semiconductor OX16PCI954 UARTs> port 0xdf00-0xdf1f,0xdec0-0xdedf mem 0xfe6f8000-0xfe6f8fff,0xfe6f7000-0xfe6f7fff irq 21 at device 13.0 on pci2
puc0: Reserved 0x20 bytes for rid 0x10 type 4 at 0xdf00
ioapic0: routing intpin 21 (PCI IRQ 21) to vector 54
puc0: [FILTER]
sio0 on puc0
sio0: type 16550A, console
sio0: [FILTER]
sio1: reserved for low-level i/o

where RELENG_6 shows

puc0: <Oxford Semiconductor OX16PCI954 UARTs> port 0xdf00-0xdf1f,0xdec0-0xdedf mem 0xfe6f8000-0xfe6f8fff,0xfe6f7000-0xfe6f7fff irq 21 at device 13.0 on pci2
sio4: <Oxford Semiconductor OX16PCI954 UARTs> on puc0
sio4: type 16550A
sio4: unable to activate interrupt in fast mode - using normal mode
sio5: <Oxford Semiconductor OX16PCI954 UARTs> on puc0
sio5: type 16550A
sio5: unable to activate interrupt in fast mode - using normal mode
....
sio0: <16550A-compatible COM port> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
sio0: type 16550A, console
sio1: reserved for low-level i/o


In RELENG_6, sio0 and sio1 are the ports on the motherboard; the ports on the external card are mapped to sio4, sio5 and so on.
In RELENG_7 the 1st port on the external card seems to be mapped to sio0, and nothing else is attached.

In both cases the content of /boot/device.hints ...
From: Frank Behrens
Date: Friday, November 23, 2007 - 7:57 am

Hi, dear FreeBSD developers!


Meanwhile I was able to restore the previous behavior. My RELENG_7 kernel boots with the right
sio port assignments. :-))

IMHO the reason for the error in 7.0 is that puc.c calls device_add_child(dev, NULL, -1).
The unit is not specified, so the bus subsystem assigns the 1st sio unit. That is 0, which is
wrong in my case. The patch uses some code from RELENG_6 to determine the 1st free sio
unit. I know this is a hack, but it should show where to look for the problem.

Regards,
    Frank

===================================================================
RCS file: /data/freebsd/src/sys/dev/puc/puc.c,v
retrieving revision 1.50
diff -u -w -p -r1.50 puc.c
--- puc.c	6 Jun 2007 22:17:01 -0000	1.50
+++ puc.c	23 Nov 2007 14:17:29 -0000
@@ -47,6 +47,7 @@ __FBSDID("$FreeBSD: src/sys/dev/puc/puc.
 #include <dev/puc/puc_bfe.h>
 
 #define	PUC_ISRCCNT	5
+#define PUC_DEBUG	1
 
 struct puc_port {
 	struct puc_bar	*p_bar;
@@ -70,6 +71,31 @@ const char puc_driver_name[] = "puc";
 
 MALLOC_DEFINE(M_PUC, "PUC", "PUC driver");
 
+
+static int
+puc_find_free_unit(device_t dev, char *name)
+{
+        devclass_t dc;
+        int start;
+        int unit;
+
+        unit = 0;
+        start = 0;
+        while (resource_int_value(name, unit, "port", &start) == 0 &&
+            start > 0)
+                unit++;
+        dc = devclass_find(name);
+        if (dc == NULL)
+                return (-1);
+        while (devclass_get_device(dc, unit))
+                unit++;
+#if PUC_DEBUG
+        device_printf(dev, "Using %s%d\n", name, unit);
+#endif
+        return (unit);
+}
+
+
 struct puc_bar *
 puc_get_bar(struct puc_softc *sc, int rid)
 {
@@ -201,6 +227,13 @@ puc_bfe_attach(device_t dev)
 	bus_space_handle_t bsh;
 	bus_space_tag_t bst;
 	int error, idx;
+#if PUC_DEBUG
+	int oldverbose = bootverbose;
+        bootverbose = 1;
+
+        device_printf(dev, "puc_bfe_attach\n");
+#endif
+
 
 	sc = device_get_softc(dev);
 
@@ ...
From: Marcel Moolenaar
Date: Saturday, November 24, 2007 - 11:20 am

No, it isn't. The puc(4) driver can have different children. Currently, it
can have 3 different children. Standard bus probing determines which
driver will attach. The puc(4) driver does not care about unit numbers for
the simple reason that it doesn't care about which driver attaches.

FYI,

-- 
Marcel Moolenaar
xcllnt@mac.com


From: Frank Behrens
Date: Sunday, November 25, 2007 - 5:46 am

Marcel,

thanks for your explanation.


OK. I interpret this as: it is not puc's problem which sio units are assigned; that is the job of
the sio driver itself.

Then may I ask the community this question:

How do I set up my 7.0 configuration so that it behaves the same as in 6.x (POLA)?
The serial devices on the motherboard should be attached to sio0 and sio1, where sio0 is the
console. The external PCI card with additional serial ports should be attached to subsequent
sio units.

When the loader uses sio0 as console and puc/sio later assigns sio0 to an external port, this
is definitely wrong. In any case it stops the boot sequence.

Is this an error in the sio(4) driver that was not detected until 7.0? In device.hints I have the
default entries
hint.sio.0.at="isa"
hint.sio.0.port="0x3F8"
hint.sio.0.flags="0x10"
hint.sio.0.irq="4"

and the puc(4) port is assigned to sio0.


Regards,
   Frank
-- 
Frank Behrens, Osterwieck, Germany
PGP-key 0x5B7C47ED on public servers available.

From: Marcel Moolenaar
Date: Sunday, November 25, 2007 - 2:56 pm

It's actually more a job for the newbus infrastructure. Whenever
a child is created in a particular device class, it's assigned
a unit number. That is where you want to implement policies
about unit numbers, not in the individual drivers, whether leaf
or otherwise.

FYI,

-- 
Marcel Moolenaar
xcllnt@mac.com


From: Frank Behrens
Date: Monday, November 26, 2007 - 7:27 am

Do you believe I should create a PR stating that we have a regression in the 7.0 newbus
infrastructure?

Or is my setup wrong?

Regards,
   Frank
-- 
Frank Behrens, Osterwieck, Germany
PGP-key 0x5B7C47ED on public servers available.

From: Marcel Moolenaar
Date: Monday, November 26, 2007 - 10:59 am

I don't think we have a regression in the newbus infrastructure,
because we never had this support there. The problem has always
been that we have treated the COM ports as a special case when
we shouldn't have. Granted, other bugs forced us to treat the
COM port specially (i.e. wiring the low-level console to a unit
number before bus enumeration proper), but this is just more of
the same...

My suggestion is to use uart(4) and not worry about the unit
number. Your console will work irrespective...

-- 
Marcel Moolenaar
xcllnt@mac.com

