On Tue, Nov 06, 2007 at 03:08:08PM -0800, Linus Torvalds wrote:

We have the following properties in the character sets we handle:
- every ASCII character is encoded with the same byte as in ASCII
- if the eighth bit is 0, the byte can't be part of a multi-byte character
- no ASCII character can be encoded in a different way

This (plus most likely some other properties I've forgotten to mention)
allows some parsing based on ASCII characters.

But if you want to match "one character" (like TOMOYO does) or want to
check for printable characters except space (like Smack does) you must
know whether the byte string 0xC3 0xA0 is the character à or a sequence
of two characters with the second one being NBSP.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed
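
The 0xC3 0xA0 ambiguity mentioned in the mail can be illustrated with a short sketch. It is not code from TOMOYO or Smack; it only assumes that the two candidate encodings are UTF-8 and ISO-8859-1, and the helper name describe() is made up for this illustration.

```python
# Illustration only: the same two bytes decoded under two assumed encodings.
raw = bytes([0xC3, 0xA0])

as_utf8 = raw.decode("utf-8")          # one character: a with grave accent
as_latin1 = raw.decode("iso-8859-1")   # two characters, the second is NBSP


def describe(text):
    """Return the code points of text as 'U+XXXX' strings (hypothetical helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(len(as_utf8), describe(as_utf8))
print(len(as_latin1), describe(as_latin1))

# ASCII bytes, by contrast, decode identically under both encodings,
# which is why parsing on ASCII delimiters is safe without knowing the charset.
assert b"/".decode("utf-8") == b"/".decode("iso-8859-1")
```

A "match one character" rule would consume both bytes under the first interpretation and only the first byte under the second, so the byte string alone does not settle the question.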
