:How will this affect parallel IO (reads, but especially writes)? Would
:having such a global structure serialize it? (I'm assuming, possibly
:wrongly, that having trees per-cluster allowed you to lock individual
:clusters).
Reads will not be affected at all... the locking occurs at the B-Tree
node layer.
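
To illustrate what locking at the node layer means, here is a rough
sketch in Python. It is not the actual filesystem code: the Node class,
its fields, and the lock-coupling descent in lookup() are all
assumptions made up for this example. The point is only that each node
carries its own lock, so lookups that end up in different nodes do not
wait on one global lock.

```python
import threading
from bisect import bisect_right

class Node:
    """Toy B-Tree node with its own lock (hypothetical layout)."""
    def __init__(self, keys, children=None, values=None):
        self.lock = threading.Lock()
        self.keys = keys          # separator keys (internal) or element keys (leaf)
        self.children = children  # list of Node for internal nodes, else None
        self.values = values      # list of values for leaf nodes, else None

def lookup(root, key):
    """Descend with lock coupling: hold at most two node locks at once."""
    node = root
    node.lock.acquire()
    while node.children is not None:
        child = node.children[bisect_right(node.keys, key)]
        child.lock.acquire()      # take the child before dropping the parent
        node.lock.release()
        node = child
    try:
        if key in node.keys:
            return node.values[node.keys.index(key)]
        return None
    finally:
        node.lock.release()

# Two leaves under one root; keys below 10 live in leaf_a, the rest in leaf_b.
leaf_a = Node(keys=[1, 5], values=["a", "b"])
leaf_b = Node(keys=[10, 20], values=["c", "d"])
root = Node(keys=[10], children=[leaf_a, leaf_b])
```

A lookup for key 5 only ever holds the root lock briefly and then the
lock on leaf_a, so a concurrent lookup for key 20 is free to work in
leaf_b.
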
Writes will not be serialized and will still be asynchronous so the
most typical striping setups on multi-disk filesystems should still
yield very high performance. Writes WILL be far more likely to be
sequential which should actually improve write performance. Also
keep in mind that writes are buffered by the buffer cache, so there
is a caching layer between userland and the physical disk.
Mixed data writes (parallel write operations by multiple processes in
different parts of the filesystem) will generally lay down new
information sequentially on disk, which can be detrimental for read
performance since the individual files will not be entirely sequential.
I seem to recall a paper at a USENIX long ago where someone tested
locality of reference for reads after laying down writes from
parallel sources sequentially, and it was no worse than trying to zone
the disparate writes, so I'm not really worried about this case.
Also, once you get over a track or two's worth of data, it costs about
the same to seek 3 tracks as it does to seek 10 tracks, so as long as
writes are not *completely* strewn about due to lots of parallel write
activity occurring, it shouldn't be a problem. They won't be because
writes are cached in the buffer cache prior to being flushed out. We
should get nice long bursts of sequentially ordered data on disk.
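
A toy model of the effect described above, as a sketch only. The
function names lay_down() and extents(), the round-robin ordering of
writers, and the fixed burst size are assumptions for illustration, not
how the real flusher behaves. Each writer's data is appended to the
"disk" in bursts, and we count how many separate contiguous runs each
file ends up with.

```python
def lay_down(num_files, blocks_per_file, burst):
    """Append blocks from several writers sequentially, `burst` blocks at a time.

    Returns the disk as a list where each entry is the id of the file
    owning that block, in on-disk order.
    """
    disk = []
    remaining = [blocks_per_file] * num_files
    while any(remaining):
        for f in range(num_files):
            n = min(burst, remaining[f])
            disk.extend([f] * n)
            remaining[f] -= n
    return disk

def extents(disk, f):
    """Count the contiguous runs of blocks belonging to file `f`."""
    runs = 0
    prev = None
    for owner in disk:
        if owner == f and prev != f:
            runs += 1
        prev = owner
    return runs

unbuffered = lay_down(num_files=3, blocks_per_file=8, burst=1)
buffered = lay_down(num_files=3, blocks_per_file=8, burst=4)
```

With a burst of one block, every file is chopped into 8 single-block
pieces. With a burst of four blocks, as you would get when the buffer
cache collects writes before flushing, each file lands in 2 runs of 4
blocks, which is much closer to sequential.
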
--
I don't like to think that I wasted a ton of time building the
cluster mechanism, and it's kinda sad to see so much code removed. But
most of the work over the last few months has been B-Tree centric,
implementing the inode cache, high level VOPs, record structures, etc...
and those parts of the codebase remain intact.
It really got to the point where implementing the last bits was starting
to take way way too much time. When things start to take that much time
to do, I know I've made a mistake somewhere in the design. Better to
fix it now than to try to slog through the complexity later on.
-Matt
Matthew Dillon
<dillon@backplane.com>
