On Wed, Aug 13, 2008 at 10:39:51AM -0700, Arjan van de Ven wrote:

Something else to think about is what happens if the file is naturally written in pieces. For example, I've been playing with bittorrent recently, and it appears that trackerd will do something... not very intelligent, in that it will repeatedly try to index a file which is being written in pieces, and in some cases it will do things like call pdftext that aren't exactly cheap.

A timeout *can* help (i.e., don't try to scan/index this file until 15 minutes after the last write), but it won't help if the torrent is very large, or the download bitrate is very slow. One very simple workaround is to disable trackerd altogether while you are downloading the file, but that's not a very pleasant solution; it's horribly manual.

Most of this may end up being outside of the kernel (i.e., some kind of interface where a bittorrent client can say, "look, this file is still being downloaded, so don't bother scanning it unless some process *other* than the bittorrent client tries to access the file"). And maybe there should be some other more complex policies, such as the bittorrent client explicitly telling the indexer/scanner that the file has been completely downloaded, so it's safe to index it now.

But what this points out is that if you want a good solution, (a) it probably shouldn't all be in the kernel, since trying to get all of this complexity into the kernel will be painful, and (b) the policy about whether or not a bittorrent client should be allowed to say, "it's OK not to check the file until it's completely downloaded, even if I am handing out pieces to other people over the network; after all, the entire file has its own SHA checksum for data integrity verification" is very much a policy question where different system administrators will come down on different sides about what should and shouldn't be allowed, and therefore this kind of policy decision should ****NOT**** be in the kernel.

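To make the policy above concrete, here is a rough userspace sketch of the decision an indexer/scanner might make. It is only an illustration: `FileState`, `should_scan`, the `downloader_pid` hint and the `complete` flag are all hypothetical names, not anything trackerd or any bittorrent client actually provides, and the 15-minute quiet period is just the figure used above.

```python
from dataclasses import dataclass
from typing import Optional

# Quiet period from the example above: 15 minutes after the last write.
QUIET_PERIOD = 15 * 60  # seconds


@dataclass
class FileState:
    last_write: float                      # time of the most recent write, in seconds
    downloader_pid: Optional[int] = None   # hypothetical "still downloading" hint
    complete: bool = False                 # hypothetical "download finished" notice


def should_scan(state: FileState, now: float,
                accessor_pid: Optional[int] = None) -> bool:
    """Decide whether the indexer/scanner should look at the file now.

    accessor_pid is the process currently trying to access the file,
    or None for a routine background pass.
    """
    if state.complete:
        # The downloader said it is done, so it is safe to index.
        return True
    if state.downloader_pid is not None:
        # Still being downloaded: scan only if some process *other*
        # than the downloader tries to access the file.
        return accessor_pid is not None and accessor_pid != state.downloader_pid
    # No hint at all: fall back to the plain timeout.
    return now - state.last_write >= QUIET_PERIOD
```

Whether a client is allowed to set such a hint at all is exactly the administrator-level policy question raised above, which is why the sketch lives entirely in userspace.
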
We have i_version support for NFSv4, so we already have that as far as the version of the file goes. We can have a single bit which means "block on open" that is stored on a file, and some kind of policy which dictates whether or not any modification to the file contents should automatically set the bit. However, the record of which version of the virus database was used to scan a particular file should be stored outside of the filesystem, since each product will have its own version namespace, and the question of what happens if a user switches from one product to another is going to be messy. So better that this be done in userspace, and that this information be stored in some on-disk database.

- Ted
