* Ingo Molnar (mingo@elte.hu) wrote:

Probing vmalloc faults is _really_ tricky: it also implies that the
handler (let's call it the probe) connected to the probe point (marker
or kprobe) must _never_ cause a vmalloc page fault. It must therefore
never touch vmalloc'd memory, which is a very restrictive constraint,
especially for tracing, which may need large buffers (the only sane
alternative is to allocate the buffers statically before the kernel
boots).

As for the location of the probe point, we have to determine how we
want to handle an OOPSing probe. If we put the probe point too early in
the do_page_fault function, we will end up with recursive page faults
rather than an OOPS, which may make things harder to debug. In the
LTTng instrumentation, I deliberately excluded the bad_area and
bad_area_nosemaphore paths from the page fault instrumentation for this
exact reason.

Currently, I have markers around the handle_mm_fault call:

	trace_mark(kernel_arch_trap_entry, "trap_id %d ip #p%ld", 14,
		instruction_pointer(regs));
	fault = handle_mm_fault(mm, vma, address, write);
	trace_mark(kernel_arch_trap_exit, MARK_NOARGS);

I also instrument handle_mm_fault, but I leave these markers in
do_page_fault to get the architecture-specific trap id (the
"trap_entry" and "trap_exit" events) and the instruction pointer
causing the fault.
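
To make the constraint on probes concrete (no vmalloc'd memory, and no
recursion if the probe itself faults), here is a minimal user-space
sketch of a probe that only writes to a statically allocated buffer and
drops events when it is re-entered. It is not the LTTng code: the names
probe_record, trace_buf, probe_nesting, events_lost and TRACE_BUF_SIZE
are hypothetical, and a real kernel probe would need a per-CPU nesting
counter and per-CPU buffers rather than the single globals used here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TRACE_BUF_SIZE 4096	/* hypothetical size, fixed at build time */

/* Static storage: it exists before any probe fires and is never vmalloc'd. */
static char trace_buf[TRACE_BUF_SIZE];
static size_t trace_pos;
static int probe_nesting;	/* would be per-CPU in a real kernel */
static unsigned long events_lost;

/*
 * Append one record to the static buffer.
 * Returns 1 if the record was stored, 0 if it was dropped, either
 * because the probe was re-entered or because the buffer is full.
 */
int probe_record(const void *data, size_t len)
{
	int stored = 0;

	/* Only the outermost invocation is allowed to write. */
	if (probe_nesting++ == 0) {
		if (len <= TRACE_BUF_SIZE - trace_pos) {
			memcpy(trace_buf + trace_pos, data, len);
			trace_pos += len;
			stored = 1;
		}
	}
	if (!stored)
		events_lost++;

	probe_nesting--;
	assert(probe_nesting >= 0);
	return stored;
}
```

Dropping the event and counting it, rather than blocking or allocating,
keeps the probe free of anything that could fault again while it runs.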

My handle_mm_fault instrumentation (note that handle_mm_fault is also
called by get_user_pages, not only by do_page_fault):

int handle_mm_fault(struct mm_struct *mm, struct vm_area_struct *vma,
		unsigned long address, int write_access)
{
	int res;
	pgd_t *pgd;
	pud_t *pud;
	pmd_t *pmd;
	pte_t *pte;

	trace_mark(mm_handle_fault_entry,
		"address %lu ip #p%ld write_access %d",
		address, KSTK_EIP(current), write_access);

	__set_current_state(TASK_RUNNING);

	count_vm_event(PGFAULT);

	if (unlikely(is_vm_hugetlb_page(vma))) {
		res = hugetlb_fault(mm, vma, address, write_access);
		goto end;
	}

	pgd = pgd_offset(mm, address);
	pud = pud_alloc(mm, pgd, address);
	if (!pud) {
		res = VM_FAULT_OOM;
		goto end;
	}
	pmd = pmd_alloc(mm, pud, address);
	if (!pmd) {
		res = VM_FAULT_OOM;
		goto end;
	}
	pte = pte_alloc_map(mm, pmd, address);
	if (!pte) {
		res = VM_FAULT_OOM;
		goto end;
	}

	res = handle_pte_fault(mm, vma, address, pte, pmd, write_access);
end:
	trace_mark(mm_handle_fault_exit, MARK_NOARGS);
	return res;
}

Mathieu

-- 
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
