I want to modify the Linux 2.4 kernel to treat compressed files as directories and allow all directory operations to be performed on them.
This modification would make all the compression and decompression tools available today, such as gunzip, obsolete.
I want to do this as my semester project.
I have around 3 months with me.
Please tell me about its feasibility.
Or you could do something useful instead
Disc compression exchanges computing power (expensive) for
storage space (cheap), and so it is rarely of any value.
There are already compressed filesystems such as cramfs, an
extension to ISO9660 and plugins for JFFS2. These are aimed at
constant files, or read-only devices where decompression
effectively increases the transfer rate.
Beyond all possible doubt, the compression commands are required.
I do not want the kernel on an ftp/http server to decompress files
before sending them over the internet to me.
There are a few kernel TODO lists and wish lists. Some of them
were created by people who know what they are talking about.
My personal item for the wish list is to improve quality.
I have heard worrying things about SCSI error reporting when
the SCSI command set is used over a non-SCSI bus like SATA or USB
(I could be out of date). Proper open source drivers for wireless
network interfaces and graphics cards would be nice too.
That's kind of old thinking.
If you can reduce the number of blocks you have to read at any given time, you run the risk of speeding everything up, since disk I/O is so much slower than CPU cycles, which are becoming disposable. I suppose that's why every filesystem that has been created in the last few years seems to have compression built in, or a plan and mechanism for adding it.
Now what would be cool is doing it at the device mapper layer, like snapshots, seamless to the filesystems. You'd have to somehow communicate the differences in available space to the filesystem (maybe a resize?) as you compress blocks.
There are easier ways to solve
the I/O throughput problem. Striping and such will get you much better performance than compression ever will.
Besides, what really kills when doing disk I/O isn't throughput, but seek times, and compression does absolutely nothing to solve that.
Ok...
Show me how to stripe 2.5" 4500 RPM laptop drives in a subnotebook. Actually, first, show me how to get more than one in there. :-) Servers aren't the only systems that benefit from performance improvements.
Compression can reduce the number of seeks by reducing total file size. For a given level of fragmentation, the total number of seeks for a linear read of a given file is proportional to its file size. (Granted, there's a bunch of seeking for metadata too.)
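To put rough numbers on that argument, here is a minimal sketch of the model it implies. The function name, the fixed average extent size, and every figure in the example are made up for illustration; the model ignores metadata seeks and the CPU cost of decompression.

```python
import math

def linear_read_ms(size_bytes, extent_bytes, seek_ms, throughput_bytes_per_s):
    """Estimate a linear read: one seek per extent plus raw transfer time.

    All parameters are hypothetical inputs to the model, not measurements.
    """
    seeks = math.ceil(size_bytes / extent_bytes)
    transfer_ms = size_bytes / throughput_bytes_per_s * 1000
    return seeks, seeks * seek_ms + transfer_ms

if __name__ == "__main__":
    MIB = 2 ** 20
    # Same fragmentation (256 KiB extents), file halved by compression.
    for label, size in (("uncompressed", 8 * MIB), ("compressed", 4 * MIB)):
        seeks, total = linear_read_ms(size, 256 * 1024, 12.0, 20 * MIB)
        print(f"{label}: {seeks} seeks, about {total:.0f} ms")
```

Under these assumptions both the seek count and the transfer time scale with file size, which is the point being made; whether the CPU time spent decompressing eats the gain is a separate question.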
Well...
How about just buying a 5200RPM 2.5" drive instead? That will improve both your seek times and throughput, and in terms of time/money invested it's likely the cheaper solution.
Look at FUSE ( http://fuse.sourceforge.net/ ). It might be a reasonable interface for implementing your idea, but it has probably already been done.
Is it possible using FUSE?
I will keep it as an option for the user whether to enable
compression and decompression at kernel level.
FUSE is, I guess, a different thing.
I will illustrate what will be possible if I modify
the kernel.
Suppose there is one txt file inside a zipped file, sample.zip.
With this modification in the Linux kernel, you will be able to execute the following commands
on zipped (or tarred) files:
ls sample.zip ( will give you list)
vi sample.zip/abc.txt ( will allow you to edit the file)
etc..
sample.zip may reside on any filesystem!!
Is it possible to do such a thing using FUSE?
Please reply
'ls sample.zip' already has a well defined meaning.
Changing it to list the contents of a zip file will break existing
scripts, so that would be a really unpopular change.
'vi sample.zip/abc.txt' is more subtle. Changing vi to look
inside zip files could be considered a feature, but making it
general will break any program expecting ENOTDIR.
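For anyone who has not run into ENOTDIR, this is a small sketch of the behaviour programs currently rely on, assuming a Linux system. The function name is made up, and the file is not a real zip; any regular file behaves the same way.

```python
import errno
import os
import tempfile

def probe_child_of_regular_file():
    """Return the errno raised when a path treats a regular file as a directory."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "sample.zip")
        with open(archive, "wb") as f:
            f.write(b"placeholder contents, not a real zip")
        try:
            # Equivalent to asking the kernel about sample.zip/abc.txt.
            os.stat(os.path.join(archive, "abc.txt"))
        except OSError as exc:
            return exc.errno
    return None

if __name__ == "__main__":
    code = probe_child_of_regular_file()
    print(errno.errorcode.get(code, code))
```

A kernel that started resolving sample.zip/abc.txt would turn this error into a success for every program on the system, not just for vi.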
For your next challenge, tar stands for Tape ARchive, and was
designed on the assumption that random access was very expensive
on backup devices. Tar files can be read in sequence or appended to
efficiently. If you want to read the last file in a tar archive,
you must start at the beginning. If you want to modify a file, you
add a new copy to the end of the archive. This gets more difficult
with compression - you would have to regenerate the compression
algorithm's internal state to append data to a compressed file.
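The "modify by appending a new copy" behaviour can be seen from userspace with Python's standard tarfile module. This is only a sketch: the helper names and file names are invented, and the archive is uncompressed (tarfile's append mode only works on uncompressed archives).

```python
import io
import os
import tarfile
import tempfile

def add_bytes(tar, name, data):
    """Add an in-memory file to an open tar archive."""
    info = tarfile.TarInfo(name=name)
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

def demo_tar_update(path):
    """Create a tar, 'modify' abc.txt by appending, and read it back."""
    with tarfile.open(path, "w") as tar:
        add_bytes(tar, "abc.txt", b"version 1\n")
        add_bytes(tar, "other.txt", b"unrelated\n")
    # The old copy stays where it is; the new one goes at the end.
    with tarfile.open(path, "a") as tar:
        add_bytes(tar, "abc.txt", b"version 2\n")
    with tarfile.open(path, "r") as tar:
        names = tar.getnames()
        # getmember() resolves duplicate names to the last occurrence.
        latest = tar.extractfile(tar.getmember("abc.txt")).read()
    return names, latest

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo_tar_update(os.path.join(tmp, "demo.tar")))
```

Both copies of abc.txt remain in the archive, so a directory view of a tar file would have to scan the whole thing to find out which copy is current.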
Zip files are better suited for random access because each file
is compressed separately, and the required offsets are stored to
allow random access at file level (but not block level). The bad
news is this reduces the effectiveness of the compression algorithm.
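A quick way to see both halves of that trade-off is to pack the same small files into a zip and into a gzipped tar and compare. This is a sketch with made-up file names and contents, using only Python's standard zipfile and tarfile modules; real archives will give different numbers.

```python
import io
import tarfile
import zipfile

def make_files(count=50):
    """Hypothetical set of small, similar text files."""
    return {
        f"note{i:02d}.txt":
            f"Entry {i}: the quick brown fox jumps over the lazy dog.\n".encode() * 4
        for i in range(count)
    }

def build_zip(files):
    """Each member is deflated on its own, with its own headers."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()

def build_tar_gz(files):
    """One gzip stream over the whole tar, so members share context."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def read_member(zip_bytes, name):
    """Fetch one member by name via the zip's central directory."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return zf.read(name)

if __name__ == "__main__":
    files = make_files()
    z, t = build_zip(files), build_tar_gz(files)
    print("zip bytes:", len(z), "tar.gz bytes:", len(t))
    print(read_member(z, "note07.txt")[:20])
```

The zip lets you pull out note07.txt without decompressing anything else, and pays for that with a larger archive.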
Block level compression hits the compression algorithms
even harder. Knoppix CDs use zisofs so each file on the CD is
compressed, but most of the files are on an ISO image mounted
via a loop device so the compression algorithm has the bulk of
the ISO image to work on instead of lots of little files.
For compression to work well on a filesystem, the filesystem
needs to increase its block size to get better compression, but
not increase the size so much that it spends much of its time
decompressing and compressing unused data.
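The block size trade-off is easy to try out with zlib. The sketch below compresses the same data in independent blocks of different sizes, then counts how many bytes must be decompressed to fetch a single byte. The sample data and function names are invented for the illustration, and a real filesystem would behave differently in detail.

```python
import zlib

def sample_data():
    """Hypothetical repetitive log-like data, a few hundred KB."""
    lines = [f"user{i % 97} logged in from host{i % 13}\n" for i in range(20000)]
    return "".join(lines).encode()

def compress_in_blocks(data, block_size):
    """Compress each fixed-size block independently, as a filesystem might."""
    return [zlib.compress(data[i:i + block_size])
            for i in range(0, len(data), block_size)]

def read_byte(blocks, block_size, offset):
    """Return (byte value, bytes decompressed) for one random-access read."""
    block = zlib.decompress(blocks[offset // block_size])
    return block[offset % block_size], len(block)

if __name__ == "__main__":
    data = sample_data()
    for block_size in (512, 4096, 65536):
        blocks = compress_in_blocks(data, block_size)
        stored = sum(len(b) for b in blocks)
        _, touched = read_byte(blocks, block_size, 100000)
        print(f"block {block_size}: stored {stored} bytes, "
              f"decompressed {touched} bytes to read 1 byte")
```

Larger blocks store fewer bytes in total, but every small read drags a whole block through the decompressor.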
A more interesting concept would be making swap hierarchical.
At present, low priority swap partitions are not used until high
priority ones are full. A more interesting behaviour would be
to have data copied from fast swap devices to slower ones when
the fast device is nearly full. Servers often have a totally
redundant GPU and memory. That memory is slower than system memory
(From the CPU's point of view), but much faster than disc or flash.
If video memory was used as a fast swap device, there is some chance
of getting a performance boost. (Fitting sufficient system RAM
would be a much better choice unless the working set is truly
enormous). Another idea I have heard of is to compress VM
pages that are good candidates for being swapped out, and only swap
out compressed pages. Again this is probably a marginal performance
gain that does nothing of value for the majority of systems.
I think he wants to do what Windows can do: open and display a zip as a directory. Windows cannot modify the files in the zip on the fly, just open them. Doesn't GNOME do this already?
Dunno, but you're on the right track
I dunno if GNOME or KDE do this, but the appropriate place for "treating ZIP files as directories" is definitely in the shell (graphical or otherwise) and not the kernel. If you want a transparently compressed filesystem (and Windows provides that, too, with folder/file granularity), that needs to live at the kernel level. If you don't understand the difference between the two, then perhaps this isn't the debate you want to participate in. :-)
All that said, if you want read-only access to a ZIP, making ZIP files look like directories, why don't you look into FUSE and see if you can do something there?
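As a starting point, here is a minimal sketch of the read-only logic such a filesystem would need, written against Python's standard zipfile module. It does not use any FUSE binding: the class and method names are hypothetical, and wiring them to FUSE's readdir/getattr/read callbacks is left out.

```python
import io
import zipfile

class ZipDirectoryView:
    """Read-only view that answers directory-style queries about a ZIP file."""

    def __init__(self, file):
        self._zip = zipfile.ZipFile(file)

    def listdir(self, path=""):
        """List the immediate children of a directory inside the archive."""
        prefix = path.strip("/")
        prefix = prefix + "/" if prefix else ""
        entries = set()
        for name in self._zip.namelist():
            if name.startswith(prefix) and name != prefix:
                # Keep only the first path component below the prefix.
                entries.add(name[len(prefix):].split("/", 1)[0])
        return sorted(entries)

    def is_dir(self, path):
        """True for the archive root or any path that has children."""
        stripped = path.strip("/")
        if not stripped:
            return True
        return any(n.startswith(stripped + "/") for n in self._zip.namelist())

    def read(self, path):
        """Return the decompressed contents of one member."""
        return self._zip.read(path.strip("/"))

if __name__ == "__main__":
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("abc.txt", "hello\n")
        zf.writestr("docs/readme.txt", "read me\n")
    view = ZipDirectoryView(buf)
    print(view.listdir(""), view.listdir("docs"), view.read("abc.txt"))
```

Mounted somewhere other than on top of sample.zip itself, a view like this gives ls and vi something to work with without changing what 'ls sample.zip' means.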
Can you explain what I can do with FUSE?
Suppose I want to let the user run the commands ls, cd and vi on zip files.
Should I modify each of these programs individually?
I think modifying the kernel to look into a zip and present it as a directory would be a good idea?
And tell me how it is possible to do this in the shell?
I am talking about the command line interface and not KDE or GNOME.
I will do this project to understand the internals of the kernel!!
If you have any good topic (other than this), let me know!!
Regards!!
Check out Reiser v4
History
Reiser v4 already proposed this as a plugin. Linus and many others came down hard on it because of the interface issues (which were also mentioned above).
This idea isn't tied to Reiser; it dates back further, of course. Novell has one and Microsoft had "DoubleSpace" (oops, they got sued for that, I mean "DriveSpace"). Just search around and you'll find things, like Hans talking about the plugin:
http://kerneltrap.org/node/5654
So, if this is really just a programming project and not student research then enjoy practicing your coding. I estimate that would be the main benefit from such a project.
Methods
I do agree that FUSE is the thing to use, or just put it in an app and forget modifying the kernel. You could look into using device mapper (example use: dm-crypt) as well.
Ending Comment
I suggest that you don't say it will make all the compression and decompression tools available today obsolete, because that means you must implement _all_ the compression algorithms out there today, and many of them might be protected by some form of IP or another. Building yet another tool that does xxx is not the way to replace all existing tools with a global standard - it just causes more fracturing.
Ya, it is just a programming project. That's why we are doing it!!
Hope that I will get support from this site