Sorry again about the delay.Indeed, I think it is getting there. This makes me think about RF a little more. If you ever set it, there are some places we need to clear it too. That is, when the PC is being changed before returning to user mode, which is in signals and in ptrace. If the PC is changing to other than the breakpoint location hit by the handler that set RF, we need clear RF so that the first instruction at the changed PC can be a breakpoint hit of its own and not get masked. In fact, it may also be necessary to clear RF when freshly setting a new instruction breakpoint (when RF is set because the stop was not a debug exception at all), so that it isn't skipped if the PC happens to be right there already. I sort of wondered from the beginning why it was there. The rationale I can see is to avoid flutter. That is, when unregistering frees up a slot for a lower-priority allocation waiting in the wings, and then the new registration will just displace it again. The priority list diddling is wasted work to get back to just how it was before, but more importantly you don't want to have those callbacks for a momentarily-available slot coming and going. I don't know if this can really come up with the current code. Ugh. I've never heard of anyone using them, but I don't know the full story. The documentation I have says that RF is set in the trap frame on the stack (i.e. pt_regs.eflags) by every other kind of exception. However, for a debug exception that is due to an instruction breakpoint, RF=0 in the trap frame and the manual explicitly says that the handler must set the bit so that iret will resume and execute it rather than hit the breakpoint again. [later:] Is this really "some CPUs"? Or is it actually always as I described above (i.e. RF set usually but cleared for an instruction breakpoint hit)? The important thing is that there aren't any difficult races (i.e. what you get with callbacks). If register with no callback followed by unregister on seeing "registered but not installed" return value is simple and cheap, that is fine. It's no less meaningful than for a kernel allocation. In neither case is there a guarantee you'll keep it forever. What callers I had in mind want is a quick answer when the answer is negative at the time of the call, so they just punt on the complexity of dealing with a positive answer. It looks right to me. That is, it preserves the existing behavior for kernel-mode traps, and does not touch TF at all for user-mode traps. I think it's fine if a CPU getting an exception before it's processed the IPI looks at changed global state and says "oh, mine was stale", and punts the hit. (Or perhaps it transmorgifies its apparent DR# based on the new global state, if the CPU's old setting corresponds to one of the new settings. Probably the changing of settings can just preserve the old DR# selection in such cases and simplify the situation for the handler doing the catch-up to just if (old->dr[n] != new->dr[n]) ignore;.) You mean a context switch before the IPI gets in? switch_to_thread_hw_breakpoint can just install the latest global state. I'd still prefer to have a single machine-dependent field and not have .len. It looks promising. I'm not entirely sanguine about an 8-bit gennum. For the kernel settings, it's going to be fine--there won't be 256 updates before all the CPUs process their IPIs. But for the thbi->gennum comparison, a thread might very well not have run for days, while there have been many more updates than that, and its gennum%256 matching the current one or not is just luck. You may need some memory barriers around the switching/restart stuff. In fact, I think it would be better not to delve into reinventing the low-level bits there at all. Instead use read_seqcount_retry there (linux/seqlock.h). Using that read_seqcount_begin's value as the number to compare in thbi would also give a 32-bit sequence number. I don't see why notify_all_threads ever needs to be used. The sequence number changed, so the next switch in will always update. I guess that's how you were avoiding the untrustworthy 8-bit sequence number issue. But I think it's better to do the whole thing with seqcount and rely on 32-bit sequence numbers being good enough to let thread updates be entirely lazy. [later:] It looks to me like there is quite a lot to be shared. Of course the code can refer to constants like HB_NUM, they just have to be defined per machine. The dr7 stuff can all be a couple of simple arch_foo hooks, which will be empty on other machines. All of the list-managing logic, the prio stuff, etc., would be bad to copy. The two flavors could probably be accomodated cleanly with an HB_TYPES_NUM macro that's 1 on x86 and 2 on ia64, and is used in loops around some of the calls. I'm not suggesting you try to figure out that code structure ahead of time. But I don't think it will be a big barrier to code sharing. So it sounds like maybe the real behavior is that any dr[0-3]-induced exception resets the DR_TRAP[0-3] bits to just the new hit, but not the other bits (i.e. just DR_STEP in practice). Is that part true on all CPUs? I'm not sure what you have in mind using a new thread flag. To be consistent with existing (and machine) behavior, shouldn't that be clear only all the low (DR_TRAP[0-3]) bits when one of those bits is set? That looks fine. I'd like to see this concretely working on x86_64 as well as i386. That should be a simple matter of the new header file and the makefile patches to share the code. I can test on x86_64 if you can't. Do you have some simple test cases prepared? That is, some simple modules using the generic kernel hw_breakpoint support to readily report working or not working on basic functionality. I'd like to have something we can agree on as the baseline smoke test for trying the patches, and for new machine ports. I also want to get this machine-independent code sharing going for real. I'd like to have powerpc working as the non-x86 demonstration before we declare things in good shape. I don't expect you to write any powerpc support, but I hope I can get you to do the arch code separation to make the way for it. If you'll take a crack at it, I'll fill in and test the powerpc bits and I think we'll get something very satisfactory ironed out pretty fast. So consider the powerpc64 situation and imagine how you would do the implementation for it, and I think you'll find a lot of the code you've written is naturally shared for it. It's a bit of a degenerate case, because HB_NUM is 1, but that needn't really matter. There are only data address breakpoints of length 8 with an aligned address, so the only control info aside from the address is r/w bits. There is no separate control register. The control bits are stored in the low bits of the register whose high bits are the high bits of the aligned address. (I think other machines store their control bits the same way.) So in fact, not only is there no need for .len, but .type is actually just bits that could be stored directly in address.va (if noone expected to look at that for the address, or they used an accessor that masks off the low bits). But there are bits to spare there next to .priority, so keeping them separate doesn't hurt. What's important is that the chbi->dabr and thbi->dabr fields are stored in fully-encoded form for quick switching. Thanks, Roland -
| Linus Torvalds | Linux 2.6.27-rc8 |
| Greg KH | [patch 00/71] 2.6.26-stable review |
| Dmitry Torokhov | 2.6.27-rc8+ - first impressions |
| Rafael J. Wysocki | [Bug #11215] INFO: possible recursive locking detected ps2 command |
git: | |
| Christian MICHON | Re: MinGW port - initial work uploaded |
| Luiz Fernando N. Capitulino | Libification project (SoC) |
| Linus Torvalds | People unaware of the importance of "git gc"? |
| Jakub Narebski | [RFC] Git User's Survey 2008 |
| Richard Stallman | Real men don't attack straw men |
| Tony Abernethy | Re: What is our ultimate goal?? |
| GVG GVG | ssh_exchange_identification: Connection closed by remote host |
| James Hartley | scp batch mode? |
| Ingo Molnar | Re: [TCP]: TCP_DEFER_ACCEPT causes leak sockets |
| Timo Teräs | Re: xfrm_state locking regression... |
| Ingo Molnar | Re: [bug] stuck localhost TCP connections, v2.6.26-rc3+ |
| Natalie Protasevich | [BUG] New Kernel Bugs |
