Re: uvm(9) clarifications wrt. uvm_km_alloc() / uvm_km_free()

From: Wouter Coene
Date: Wednesday, December 8, 2010 - 8:49 am

Hi,

Having spent quite some hours figuring out why the kernel panics when I free
memory allocated through uvm_km_kmemalloc() with uvm_km_free()
("pmap_remove_pte: managed page without PG_PVLIST for <address>" on amd64),
I added the following clarification to uvm(9):

Index: uvm.9
===================================================================
RCS file: /cvs/openbsd/src/share/man/man9/uvm.9,v
retrieving revision 1.42
diff -u -a -r1.42 uvm.9
--- uvm.9	9 Nov 2010 16:03:38 -0000	1.42
+++ uvm.9	8 Dec 2010 15:41:28 -0000
@@ -518,7 +518,7 @@
 .Fn uvm_km_zalloc
 functions allocate
 .Fa size
-bytes of wired kernel memory in map
+bytes of page-aligned wired kernel memory in map
 .Fa map .
 In addition to allocation,
 .Fn uvm_km_zalloc
@@ -532,9 +532,15 @@
 .Fn uvm_km_alloc1
 function allocates and returns
 .Fa size
-bytes of wired memory in the kernel map, zeroing the memory if the
+bytes of page-aligned wired memory in the kernel map, zeroing the memory if
+the
 .Fa zeroit
-argument is non-zero.
+argument is non-zero. Unless called on an interrupt-safe map, if memory is
+currently unavailable,
+.Fn uvm_km_alloc1
+may sleep to wait for resources to be released by other processes. If not
+enough memory is available, this function returns
+.Dv NULL .
 .Pp
 The
 .Fn uvm_km_kmemalloc
@@ -607,10 +613,15 @@
 .Fa size
 bytes of memory in the kernel map, starting at address
 .Fa addr .
+The memory must have been allocated with
+.Fn uvm_km_alloc ,
+.Fn uvm_km_zalloc
+or
+.Fn uvm_km_alloc1 .
 .Fn uvm_km_free_wakeup
-calls
-.Fn thread_wakeup
-on the map before unlocking the map.
+wakes up any processes waiting for memory on the map (via 
+.Fn thread_wakeup )
+before unlocking the map.
 .Sh ALLOCATION OF PHYSICAL MEMORY
 .nr nS 1
 .Ft struct vm_page *

Is this correct? And if so, any chance this could go in?

Regards,
Wouter Coene

From: Ariane van der Steldt
Date: Thursday, December 9, 2010 - 12:32 am

uvm_km_alloc and uvm_km_zalloc are already mentioned as being
implemented in terms of uvm_km_alloc1. So they don't need explicit


Uvm can only deal with page-aligned addresses and sizes. For smaller
sizes, either malloc(9) or pool(9) will have to do. If the
page-alignment note is really needed, it should be put in the
description or notes section.
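
As an illustration of the page-granularity point above, here is a minimal userland sketch of rounding a size up to a page boundary and checking an address for alignment. The 4096-byte page size (amd64) and all toy_* names are assumptions made for this example; they are stand-ins, not the kernel's own definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for this sketch (4 KB, as on amd64). */
#define TOY_PAGE_SIZE	((uint64_t)4096)
#define TOY_PAGE_MASK	(TOY_PAGE_SIZE - 1)

/* Round x up to the next page boundary (no change if already aligned). */
uint64_t
toy_round_page(uint64_t x)
{
	/* Guard against wrap-around near the top of the address space. */
	assert(x <= UINT64_MAX - TOY_PAGE_MASK);
	return ((x + TOY_PAGE_MASK) & ~TOY_PAGE_MASK);
}

/* Round x down to the page boundary at or below it. */
uint64_t
toy_trunc_page(uint64_t x)
{
	return (x & ~TOY_PAGE_MASK);
}

/* Non-zero if x lies exactly on a page boundary. */
int
toy_page_aligned(uint64_t x)
{
	return ((x & TOY_PAGE_MASK) == 0);
}
```

Under these assumptions a 100-byte request still occupies a whole page once rounded, which is why malloc(9) or pool(9) is the better fit for small objects.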

I would recommend getting hold of the CDC paper on uvm to understand it

I'm not yet convinced. What are you trying to do?


That's actually an error from the pmap layer. While uvm controls a lot
of the pmap layer, many parts of the kernel will manage pages themselves
(often using pmap_kenter_pa).
The error message complains that you are freeing a managed page (one
that was entered using pmap_enter) but its pvlist is missing. The
pvlist is a list that keeps track of which pmaps use that page. The
missing pvlist usually happens when the page was entered using
pmap_kenter_pa.
The kernel_map in uvm consists of managed pages, only kmem_map (and
other intrsafe maps) may contain unmanaged pages. Unmanaged pages cannot
be shared across processes (because the invalidation of such a page is
impossible due to the lack of pvlist).
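
The mismatch described above can be sketched as a toy userland model. This is not the real pmap code: struct toy_page and the toy_* functions are hypothetical stand-ins, and the model only captures the single rule that removing a managed page with no pvlist is treated as fatal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a physical page; not the kernel's struct vm_page. */
struct toy_page {
	bool managed;		/* page is tracked by the VM system */
	bool has_pvlist;	/* stands in for the PG_PVLIST flag */
	bool mapped;		/* a mapping currently exists */
};

/* Models pmap_enter(): the mapping is recorded on the page's pvlist. */
void
toy_pmap_enter(struct toy_page *pg)
{
	assert(pg != NULL);
	pg->mapped = true;
	if (pg->managed)
		pg->has_pvlist = true;
}

/* Models pmap_kenter_pa(): a mapping is made, but no pvlist is kept. */
void
toy_pmap_kenter_pa(struct toy_page *pg)
{
	assert(pg != NULL);
	pg->mapped = true;
}

/*
 * Models the removal path behind the panic message.
 * Returns false where the real kernel would panic with
 * "managed page without PG_PVLIST".
 */
bool
toy_pmap_remove(struct toy_page *pg)
{
	assert(pg != NULL);
	if (pg->managed && !pg->has_pvlist)
		return (false);
	pg->mapped = false;
	pg->has_pvlist = false;
	return (true);
}
```

In this model, a page mapped with toy_pmap_enter() can be removed cleanly, while a managed page mapped with toy_pmap_kenter_pa() fails the check on removal, which is the situation the panic message reports.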

Ciao,
-- 
Ariane

From: Wouter Coene
Date: Thursday, December 9, 2010 - 2:32 pm

Pretty much as I described: I allocate kernel memory through
uvm_km_kmemalloc() from kernel_map, and then free it through uvm_km_free(),
returning it to the kernel_map.

The code I'm working on is a bit large as a test-case, but I've abused
diskmap(4) to reproduce the panic:

Index: diskmap.c
===================================================================
RCS file: /cvs/openbsd/src/sys/dev/diskmap.c,v
retrieving revision 1.2
diff -u -a -r1.2 diskmap.c
--- diskmap.c   14 Jun 2010 16:51:55 -0000      1.2
+++ diskmap.c   9 Dec 2010 09:04:56 -0000
@@ -37,6 +37,9 @@
 #include <sys/proc.h>
 #include <sys/vnode.h>

+#include <uvm/uvm_extern.h>
+#define DIOCBOOM       _IO('d', 42)
+
 int
 diskmapopen(dev_t dev, int flag, int fmt, struct proc *p)
 {
@@ -59,6 +62,18 @@
        struct vnode *vp = NULL, *ovp;
        char *devname;
        int fd, error = EINVAL;
+
+       if (cmd == DIOCBOOM) {
+               vaddr_t addr;
+
+               addr = uvm_km_kmemalloc(kernel_map, NULL, PAGE_SIZE,
+                   UVM_KMF_CANFAIL | UVM_KMF_ZERO);
+               if (addr == 0)
+                       return (ENOMEM);
+               uvm_km_free(kernel_map, addr, PAGE_SIZE);
+
+               return (0);
+       }

        if (cmd != DIOCMAP)
                return EINVAL;

This triggers the panic with the following test program:

#include <sys/ioctl.h>
#include <sys/dkio.h>

#include <err.h>
#include <fcntl.h>
#include <unistd.h>

#define DIOCBOOM        _IO('d', 42)

int
main(void)
{
        int fd;

        fd = open("/dev/diskmap", O_RDWR, 0);
        if (fd < 0)
                err(1, "open(/dev/diskmap)");
        if (ioctl(fd, DIOCBOOM, 0) < 0)
                err(1, "DIOCBOOM");
        close(fd);

        return (0);
}

This is in -current from a few days ago, basically in GENERIC on amd64 but
with an ISA ne2k driver added (as bochs' PCI ne2k and OpenBSD don't agree
much).

DDB output:

panic: pmap_remove_pte: managed page without PG_PVLIST for 0xffff80000607b000
Stopped at   ...

From: Ariane van der Steldt
Date: Friday, December 10, 2010 - 12:47 am

If you specify NULL as your object, you'll be given intr-safe memory.
You want to use kernel object instead:

		addr = uvm_km_kmemalloc(kernel_map, uvm.kernel_object,

Exception is expected: page is entered by pmap_kenter_pa, but removed by

It's not well documented. I think the object parameter is one of the
more recent additions, actually.

I hope this helps you,
-- 
Ariane

From: Wouter Coene
Date: Friday, December 10, 2010 - 6:04 am

Ah, now it makes sense. How about this diff:

Index: uvm.9
===================================================================
RCS file: /cvs/openbsd/src/share/man/man9/uvm.9,v
retrieving revision 1.42
diff -u -a -r1.42 uvm.9
--- uvm.9	9 Nov 2010 16:03:38 -0000	1.42
+++ uvm.9	10 Dec 2010 13:00:05 -0000
@@ -534,7 +534,12 @@
 .Fa size
 bytes of wired memory in the kernel map, zeroing the memory if the
 .Fa zeroit
-argument is non-zero.
+argument is non-zero. Unless called on an interrupt-safe map, if memory is
+currently unavailable,
+.Fn uvm_km_alloc1
+may sleep to wait for resources to be released by other processes. If not
+enough memory is available, this function returns
+.Dv NULL .
 .Pp
 The
 .Fn uvm_km_kmemalloc
@@ -542,6 +547,10 @@
 .Fa size
 bytes of wired kernel memory into
 .Fa obj .
+.Fa obj
+can only be
+.Dv NULL
+when allocating from an interrupt-safe map.
 The flags can be any of:
 .Bd -literal
 #define UVM_KMF_NOWAIT  0x1                     /* matches M_NOWAIT */
@@ -608,9 +617,9 @@
 bytes of memory in the kernel map, starting at address
 .Fa addr .
 .Fn uvm_km_free_wakeup
-calls
-.Fn thread_wakeup
-on the map before unlocking the map.
+wakes up any processes waiting for memory on the map (via 
+.Fn thread_wakeup )
+before unlocking the map.
 .Sh ALLOCATION OF PHYSICAL MEMORY
 .nr nS 1
 .Ft struct vm_page *

Also, maybe a stupid question, but why doesn't the irq-safety of the
allocation depend on the VM_MAP_INTRSAFE flag, like for uvm_km_free()?

Thanks,
Wouter Coene

From: Jason McIntyre
Date: Friday, December 10, 2010 - 7:03 am

new sentence, new line, for man page diffs please.

From: Ariane van der Steldt
Date: Wednesday, December 15, 2010 - 8:56 pm

uvm_km_alloc1 may not be called with an intr-safe map. Only kmemalloc is
capable of understanding them. And valloc can handle any map type, since
it only allocates space, no backing store (physical pages) for the
allocation.




It should. In fact, I don't think there is more than 1 kernel object,
but I'd have to verify.


Strictly speaking, uvm_km should do:
[1] allocation and freeing of kernel memory
[2] submap management
[3] support the intr-safe mapping of single pages (which is special
    because some archs are pmap_direct archs).
[4] initialize kernel memory

Task 4 is implemented in uvm_km_init.
Task 3 is managed by uvm_km_getpage, uvm_km_putpage. This works well and
is clean. Non-pmap-direct archs get a lot of support code to create the
correct behaviour.
Task 2 is trivial and lives in uvm_km_suballoc.

After splitting off these functions, you get, for task 1, uvm_km_free*
for freeing (2 functions), page management (uvm_km_pgremove*) and
allocators...
  uvm_km_kmemalloc  (via a define in uvm/uvm_extern.h)
  uvm_km_kmemalloc_pla
  uvm_km_alloc1
  uvm_km_valloc
  uvm_km_valloc_try
  uvm_km_valloc_align
  uvm_km_valloc_prefer_wait
  uvm_km_valloc_wait
  uvm_km_zalloc  (via a define in uvm/uvm_extern.h)
  uvm_km_alloc  (via a define in uvm/uvm_extern.h)
This gives you 10 ways of allocating kernel memory.
uvm_km_kmemalloc is the most versatile.
The only special cases are the _wait allocation functions:
uvm_km_kmemalloc would need an additional flag to duplicate their
behaviour.

There are too many allocation functions and it's too easy to get confused.
They need to be reduced (point 3 on my todo list).
Only uvm_km_kmemalloc can handle intr-safe maps.

Wired memory simply means that the memory will be there during
interrupts, but only intr-safe maps can be used to allocate from during
interrupt time. And the reason the intr-safe map is safe is because the
caller protects the map against other cpus and interrupts.


I hope this helps you,
-- 
Ariane
