The kernel newbies community often gets inquiries from CS students who
need a project for their studies and would like to do something with
the Linux kernel, but would also like their code to be useful to the
community afterwards.

In order to make it easier for them, I am trying to put together a
page with projects that:
- Are self contained enough that the students can implement the
project by themselves, since that is often a university requirement.
- Are self contained enough that Linux could merge the code (maybe
with additional changes) after the student has been working on it
for a few months.
- Are large enough to qualify as a student project. Luckily there is
flexibility here, since we get inquiries for anything from 6 week
projects to 6 month projects.

If you have ideas on what projects would be useful, please add them
to this page (or email me):

http://kernelnewbies.org/KernelProjects
thanks,
Rik
--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
-
Hard stuff:
* network character device -- similar to nbd, but for char devices.
either figure out how to forward ioctls(), or implement
usb-over-network, or...
* openMosix -- they seem to have a userspace solution, but it is not GPLed.
* compression for ext4. It's about time someone did it right. Special
bonus if you can do it in a way that does not slow things down. If cpu
is free, compress, if it is busy, just write it straight to disk.
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
So if I decide that the cpu is busy (because something is asking me to
write, the cpu is clearly doing something and hence busy), then I can
skip compression and just write to disk. So by that definition ext4
already does compression. What a simple project. :)

Did you mean it ought to come back and do the compression later?

Is it possible that for some data compressing it and writing will take
less time than not compressing it and writing it to disk?

--
Len Sorensen
-
Yes. Typically for all zeros. It will be similar for
highly-compressible data (pictures, timetables, ....)

root@amd:/data/tmp# time ( cat /dev/zero | head -c 100000000 > delme; sync )
0.04user 0.48system 6.52 (0m6.521s) elapsed 7.97%CPU
root@amd:/data/tmp# time ( cat /dev/zero | head -c 100000000 > delme; sync )
0.05user 0.61system 6.33 (0m6.333s) elapsed 10.42%CPU
root@amd:/data/tmp# time ( cat /dev/zero | head -c 100000000 | gzip - > delme; sync )
1.57user 0.32system 1.74 (0m1.749s) elapsed 100.00%CPU
root@amd:/data/tmp# time ( cat /dev/zero | head -c 100000000 | gzip - > delme; sync )
1.61user 0.18system 1.65 (0m1.652s) elapsed 100.00%CPU

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
I don't have one. I graduated 7 years ago. I was just pointing holes
If it doesn't it seems the compression feature is going to be rather
unpredictable and my optimization would be perfectly within spec and

That would make it tricky to say if you should ever skip compression due
to cpu load. There is a chance cpu load would be better off by doing
--
Len Sorensen
-
And most executables. There's a reason why my vmlinux files are 11M and my
IBM's AIX supported file system compression on the JFS filesystem years ago. I
was able to get up to 30% throughput increases by converting the /usr
filesystem to compressed - because even a 33 MHz Power chipset could read in 5
512-byte blocks and decompress them to the original 4K faster than the disk could
read in 8 512-byte blocks. Oh, and it worked for compression on r/w workloads
as well - that was one of the ways to get a RS6K model 250 (which was a
PowerPC601 chipset, a dead heat with a Mac 6600 (same chipset, same clock)) to
handle a million e-mail msgs/day - even /var/spool/mqueue worked better.

Given that today there's an even *bigger* disparity in CPU speed versus disk
speed, I'd be surprised if it doesn't help today too. As a first try, you
might consider compressing each 4K filesystem block in-place, and only write as
many sectors as the compressed data takes (with the obvious fix for the pathological
"grows with compression" case of "just write it without"). Probably even
more wins can be found if you find a way to store the compressed chunks in a
way that minimizes seeks, but that's a filesystem design issue and probably
a too-large project (It's easy to do the stupid way - just store the whole
file as compressed - the tough part is doing it and not making lseek() *too*
painful. Trying to figure out where in a .gz file byte 65536... ouch. ;)
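
For illustration, the "compress each 4K block, write only the sectors it
needs, fall back to raw" idea could be sketched roughly like this in Python.
zlib only stands in for whatever compressor a filesystem would really use,
and pack_block() and its return value are hypothetical names made up for
this sketch, not taken from any existing filesystem.

```python
import os
import zlib
from typing import Tuple

BLOCK_SIZE = 4096   # filesystem block size used in the message above
SECTOR_SIZE = 512   # disk sector size used in the message above
SECTORS_PER_BLOCK = BLOCK_SIZE // SECTOR_SIZE


def pack_block(block: bytes) -> Tuple[bool, bytes, int]:
    """Return (is_compressed, payload, sectors_needed) for one 4K block.

    Hypothetical helper: the block is stored raw whenever compression
    would not save at least one whole sector.
    """
    if len(block) != BLOCK_SIZE:
        raise ValueError("expected exactly one 4K block")
    packed = zlib.compress(block)
    sectors = -(-len(packed) // SECTOR_SIZE)  # ceiling division
    if sectors >= SECTORS_PER_BLOCK:
        # Pathological "grows with compression" case: write it without.
        return False, block, SECTORS_PER_BLOCK
    return True, packed, sectors


if __name__ == "__main__":
    print(pack_block(bytes(BLOCK_SIZE))[2])       # sectors for an all-zero block
    print(pack_block(os.urandom(BLOCK_SIZE))[2])  # sectors for random data
```

The sketch says nothing about where the saved sectors go or how lseek()
finds a block afterwards, which is the hard part mentioned above.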
On Fri, 02 Nov 2007 23:08:23 -0400
The problem is that disk seek times have not gotten much
faster over the years, while disk throughput rates have
skyrocketed.

Transferring a little less data is not going to help you
when 80% of your disk time is spent seeking, not reading
or writing.

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
-
however, if you can manage to avoid seeks by packing more data onto each
track (or each stripe of a raid array) you could probably see a
significant win

that's something for aspiring (and experienced) filesystem designers to
struggle with for a while (especially trying to figure out what the size
of a track or stripe is for the optimal layout)

David Lang
-
This sounds like flash based media are an ideal candidate for compression.
No seek times to speak of, transfer rates that are lower than those of
disks and limited capacity.

I believe JFFS2 (a flash filesystem) already does compression though.
-
-
On 10/15/2007 8:01 AM, Rik van Riel wrote:
> The kernel newbies community often gets inquiries from CS students who
> need a project for their studies and would like to do something with
> the Linux kernel, but would also like their code to be useful to the
> community afterwards.
>
> In order to make it easier for them, I am trying to put together a
> page with projects that:
> - Are self contained enough that the students can implement the
> project by themselves, since that is often a university requirement.
> - Are self contained enough that Linux could merge the code (maybe
> with additional changes) after the student has been working on it
> for a few months.
> - Are large enough to qualify as a student project, luckily there is
> flexibility here since we get inquiries for anything from 6 week
> projects to 6 month projects.
>
> If you have ideas on what projects would be useful, please add them
> to this page (or email me):
>
> http://kernelnewbies.org/KernelProjects

Well, I know something that might be interesting for kernel newbies
including students. So let me share it with you.
It's an Ubuntu 7.04 based LiveCD with a TOMOYO Linux kernel.

Directions:
1. visit the following URL and save ISO image
http://tomoyo.sourceforge.jp/wiki-e/?TomoyoLive
2. burn CD/DVD and boot from the disc
(or start up VM from the downloaded image)
3. open "TOMOYO Linux Policy Editor" icon on the gnome desktop
4. browse "domains" with cursor keys
you can see how processes were created (great experience)
5. choose a domain and enter return key
you can see the behavior of the selected "domain" (ACL mode)
6. enter return to step 5 (domain transition mode)
(repeat 5-7 as you like, type 'r' to refresh screen)
7. enter q to quit the editpolicy program

As it's LiveCD like KNOPPIX, hard disks will not be
affected unless you mount them and operate with intention.
I mean, it's safe to play w...
How about a static code tool that will check for initialization races?
Yesterday I found a lurking bug in some of my code that wouldn't have
been exposed had I not tripped over it. I wrote some infrastructure code
that initializes its lists and notification trees in late_init.

Then I found out that there was a client of my infrastructure calling
my register API at core_init time. It didn't crash / fail noticeably,
but wasn't correct, because at that time I was using a static array.
When I changed my code to use an array of pointers instead it went boom!
(FWIW I've fixed this issue for now...)

It made me feel uneasy how that issue got by unnoticed and I worry that
there could be more like it. A tool could scan the code for boot-up init
calls and check for any callers that enter a module before the
module is fully initialized.

--mgross
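
A very rough sketch of the core check such a tool might do, in Python.
It assumes somebody has already extracted, for every module, which
initcall level its init function is registered at and which other
modules that init function calls into; that extraction is the hard part
and is not shown. The input format and the find_init_races() name are
invented for this example.

```python
# Kernel initcall levels run in this order at boot (lower runs first).
INITCALL_LEVEL = {
    "core_initcall": 1,
    "postcore_initcall": 2,
    "arch_initcall": 3,
    "subsys_initcall": 4,
    "fs_initcall": 5,
    "device_initcall": 6,
    "late_initcall": 7,
}


def find_init_races(module_init, calls):
    """Flag calls made into a module before that module's own initcall ran.

    module_init: {module name: initcall level name}  (hypothetical format)
    calls: list of (caller, callee) pairs, each meaning "caller's init
           function calls into callee".
    Calls between modules at the same level are not flagged here; their
    order depends on link order, which this sketch does not model.
    """
    races = []
    for caller, callee in calls:
        caller_level = INITCALL_LEVEL[module_init[caller]]
        callee_level = INITCALL_LEVEL[module_init[callee]]
        if caller_level < callee_level:
            races.append((caller, callee))
    return races


if __name__ == "__main__":
    levels = {"infra": "late_initcall", "client": "core_initcall"}
    print(find_init_races(levels, [("client", "infra")]))
```

With the situation described above (infrastructure set up at late init,
a client registering at core init), the client/infrastructure pair is
the one that gets reported.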
-
Hello,
I read the messages about the company list and now this CS projects list and
I was wondering if there is any similar list of labs/universities that host
PhD projects related to the Linux kernel. I am thinking about switching from
physics to CS and it would be really cool to work with the kernel.

Thanks in advance,
Guilherme
-
You might take a look at proceedings for conferences with recent
linux-related papers (linuxsymposium.org, usenix.org, linux.org.au, ?)
and look for urls and presenters with .edu addresses.

--b.
-
Is there already a make config option that will do a good job at setting
a default .config file based on what is already running on a system?

I get tired of trimming down my .config for my laptop build so it takes
less than 30min to build a kernel.

Bonus credit for additional "expert" options (like those powertop puts
out) for target uses, laptop, HPC, home file share, embedded targets....

Oh, and let's make the expert configs easily extensible.
-
another config thing that would be nice would be to take something like
Rob Landley's miniconfig tool and make it work well enough to be
integrated (it creates a version of .config that only contains the things
that need to be set, not everything that's at a default that doesn't make
any difference)

David Lang
-
Ehh? You do it once, then leave it aside or in /proc/config.gz, on new
kernel copy it back, "make oldconfig", answer several questions and here
-
yeah I know that. It's a lot more than a few questions, and as we are
talking about a linear search for a fully tweaked .config where each
pass takes 30 min to know if things work this isn't how I want to spend
my time.
-
Ah yes, but then you buy a new system to which the old config does not
apply.

Folkert van Heusden
--
www.vanheusden.com/multitail - win a vlaai from multivlaai! Make sure
that multitail gets included in Fedora Core, AIX, Solaris or
HP/UX and win a vlaai of your choice
----------------------------------------------------------------------
Phone: +31-6-41278122, PGP-key: 1F28D8AE, www.vanheusden.com
-
I have discussed this briefly with Kay Sievers.
What udev can provide is the list of modules needed, so what the kernel
needs to provide is a simple module to CONFIG option(s) converter + a base
config to start out with.
Nothing particularly difficult but needs a few days work to do.

Sam
-
could you explain better what you need? I think I already have such
tools ;-)

ciao
cate
-
base function:
Start from a stock distro (FC, Ubuntu, OpenSuSE...), put down a
kernel.org tree and automatically create a .config with all the drivers
needed for the platform I'm building on.

expert configs for different applications:
laptop battery, virtualization, HPC, tiny, multi-media, testing

--mgross
-
Too easy. Since opensuse's udev loads most of the modules for your hardware,
all that would be needed is to transform the lsmod list of modules plus
the static options in /proc/config.gz (stuff like psmouse) back
into kconfig options ;-)
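
To make the "lsmod list back into kconfig options" step concrete, here
is a minimal Python sketch. It assumes the mapping can be read from
kbuild Makefile lines of the form "obj-$(CONFIG_FOO) += foo.o"; the
function names are made up for this example. It ignores composite
modules, line continuations and all Kconfig dependencies, so it shows
the idea only.

```python
import re

# Matches kbuild lines such as:  obj-$(CONFIG_FOO) += foo.o bar.o
OBJ_LINE = re.compile(r"^obj-\$\((CONFIG_\w+)\)\s*[+:]?=\s*(.+)$")


def module_to_config(makefile_text):
    """Map module names to the set of CONFIG_ symbols that build them."""
    mapping = {}
    for line in makefile_text.splitlines():
        match = OBJ_LINE.match(line.strip())
        if not match:
            continue
        symbol, objects = match.groups()
        for obj in objects.split():
            if obj.endswith(".o"):
                # lsmod shows '_' where the object file name has '-'
                name = obj[:-2].replace("-", "_")
                mapping.setdefault(name, set()).add(symbol)
    return mapping


def configs_for(loaded_modules, mapping):
    """Return the CONFIG_ symbols needed for the given lsmod module names."""
    needed = set()
    for mod in loaded_modules:
        needed |= mapping.get(mod.replace("-", "_"), set())
    return needed
```

Keeping a set of symbols per module is deliberate: one module name can
map to more than one CONFIG_ symbol, and the sketch does not try to pick
the minimal one.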
-
but then you miss the USB devices that you eventually plug in.
Anyway, in attachment I send:
a python script that will create the "mod" file. Call it with
one argument: the kernel source directory.

The second file "mod" is the output: it lists modules with
proper dependencies.

BTW I'm restoring the autoconfiguration (but more hackerish
than the old tante versions ;-) )

ciao
cate
On Tue, 16 Oct 2007 22:09:04 +0200 (CEST)
Well, at that point it does not know whether or not you
occasionally plug in an ipod or a digital camera.

Going back from the lsmod output to all the right CONFIG
options is also not as trivial as it sounds, due to all
the dependencies there are.

This project sounds like it could be a great undergraduate
project, maybe built on top of Ketchup to automatically
fetch, configure, compile and install a working kernel :)

Are there any volunteers to write down the project
description on the kernelnewbies.org wiki?

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
-
As part of Linux Kernel Driver DataBase, yesterday I "solved"
also this problem:
From a module name, I can obtain the related kernel
configuration item.

You can see the result in http://cateee.net/lkddb
(grep '^drv module' drivers-db). I count 2570 such items.

But I have problems in a few cases:
sometimes there is one module name with several CONFIG_s.
Normally such cases happen with modules in the same directory,
as support module or as parent module.
I don't see a method to distinguish the right (minimal)
configuration.

One solution would be to remove some dependencies from the
Makefile, and check for and possibly create such dependencies
in Kconfig. But this requires a kernel modification.

Or do you think there is a better (non-invasive) method?

ciao
cate
-
you can ask the user to plug in all the different devices that they want to
use when doing the config scan

bonus points if you have both the ability to go from nothing to a config
_and_ take an existing config and add any additional drivers needed for
the current hardware
-
Which is why building an allmod kernel (or what the distros do)
is IMO the better solution.
-
if all you want is a config that will work you are right.
however if you want a good base for an optimized, minimal kernel it's not
much help (other than possibly as a stepping stone to then examine all the
modules that were loaded and document which ones are needed)

David Lang
-
that would be cool.
--mgross
-
How about this in the Device Mapper raid-1/mirror code?
/* FIXME: add read balancing */

That comment has been in there for many releases. I've wanted read
balancing for several servers and had all sorts of ideas about it, like
adding functions to the underlying device queues to return a "queuing
cost" to determine which is the best queue to add the read request. I
think that could work better for queues like CFQ than the MD
closest-head.

An implementation would also need to be benchmarked against the MD
raid-1.

Along with the time to submit it to LKML, get it reviewed and polish it
up, it might make a good student project.
--
Zan Lynx <zlynx@acm.org>
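
As a toy illustration of the "queuing cost" idea, the following Python
sketch picks the mirror whose queue looks cheapest for a given read.
The cost formula, the QUEUE_PENALTY weight and the "head"/"queued"
fields are all assumptions invented for this example; they are not how
Device Mapper or MD compute anything.

```python
# Assumed weight: one queued request counts like a 1000-sector seek.
QUEUE_PENALTY = 1000


def queuing_cost(mirror, sector):
    """Estimated cost of adding a read at `sector` to this mirror's queue."""
    seek_distance = abs(mirror["head"] - sector)
    return seek_distance + QUEUE_PENALTY * mirror["queued"]


def pick_mirror(mirrors, sector):
    """Return the index of the mirror with the lowest estimated cost.

    mirrors: list of dicts with hypothetical fields
      "head"   - last sector the device serviced
      "queued" - number of requests already waiting in its queue
    """
    return min(range(len(mirrors)),
               key=lambda i: queuing_cost(mirrors[i], sector))
```

With empty queues this degenerates to a closest-head choice; a busy
queue can outweigh a short seek, which is the difference being
suggested. Whether any such formula helps would have to come out of the
benchmarking mentioned above.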
another couple of raid enhancements would be:
1. teach the system that a raid456 stripe is handled most efficiently if
treated as a single block of data

by this I mean that if you read one block from the stripe the system reads
the entire stripe, so it should take this into account when doing
read-ahead and not always throw away most of the data it read because it's
outside the current readahead window (if nothing else, look at putting it
on the tail of the LRU list instead of just forgetting it)

if you write one block of the stripe the system must read the stripe, then
update two blocks of the stripe (the data block and the parity block), but
if you are going to write the entire stripe out you can ignore whatever's
there and just calculate the parity block from the data you are writing.
this should make writing to a raid456 stripe as fast as writing to a raid0
stripe (well, almost, you have one more block to write).

2. not directly a kernel project, create userspace tools that make
managing raid and partitioning on linux as easy as the zfs tools

3. there is currently the ability to grow a raid56 array by adding a disk,
but there is not the ability to take a raid5 array, add a disk and make
the result a raid6 array.

David Lang
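
The difference between the two write paths in point 1 can be shown with
plain XOR parity, as used by raid4/raid5 (raid6 adds a second syndrome
that is not covered here). This is a small Python sketch with made-up
function names, not code from the MD driver.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_stripe_parity(data_blocks):
    """Full-stripe write: parity comes from the new data alone, no reads."""
    return reduce(xor_blocks, data_blocks)


def partial_write_parity(old_parity, old_block, new_block):
    """Single-block write: the old data and old parity must be read first."""
    return xor_blocks(xor_blocks(old_parity, old_block), new_block)
```

Both paths end up with the same parity; the full-stripe path just gets
there without reading anything back from disk.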
-
On Mon, 15 Oct 2007 11:10:32 -0600
I've written down the basic description:
http://kernelnewbies.org/KernelProjects/Raid1ReadBalancing
Could you add any ideas that you have to the page?
It is a wiki, so anybody can edit the site (after
creating an account).

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
-
I'm also quite interested in what compsci students can do for the
kernel project. I'm currently doing a little embedded development and
research at school, but I and a few others would jump at the chance
to work on the kernel (besides finding duplicate problems that the
x86 merge is already taking care of, of course. ;)

Also (as an aside), we're looking at redoing our operating systems
curriculum out here at school...anyone aware of (relatively good) OS
curricula? (time scope: one semester.)

regards/thanks,
--
Doug Whitesell
CSU Channel Islands - Computer Science
"Unprecedented performance: nothing we had has ever worked like this
before..."
-
Maybe this:
Allow removal of select from Kconfig files
Difficulty: 4
Many config options depend on other options in unrelated submenus. As a
result, people have complained about not being able to select the
desired option because finding all dependencies is too complicated.
Select solves this problem and creates a near-identical new one. Now it
is just as hard to turn some options _off_ as it was before to turn
others _on_.

The solution would be to have smarter tools that give the user
information roughly like this:
[ ] CONFIG_FOO
If you enable this option, you will also enable CONFIG_BAR.
Or :
[x] CONFIG_BAR
If you disable this option you will also disable CONFIG_FOO
and CONFIG_FOO2.

Difficulty is somewhat increased by the number of tools that require
such functionality. Support for xconfig and menuconfig appears to have
priority as those users have a harder time grepping the kernel.

Jörn
--
There is no worse hell than that provided by the regrets
for wasted opportunities.
-- Andre-Louis Moreau in Scaramouche
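
The information such a tool would show can be computed from the select
relations alone. A minimal Python sketch follows, assuming the selects
have already been parsed out of the Kconfig files into a dictionary;
that parsing, and every function name here, is hypothetical.

```python
def build_reverse(selects):
    """Turn {option: options it selects} into {option: options selecting it}."""
    reverse = {}
    for option, targets in selects.items():
        for target in targets:
            reverse.setdefault(target, set()).add(option)
    return reverse


def closure(start, edges):
    """All options reachable from `start` along `edges`, excluding `start`."""
    seen, stack = set(), [start]
    while stack:
        for nxt in edges.get(stack.pop(), ()):
            if nxt != start and nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def also_enabled(option, selects):
    """Options that get switched on when `option` is enabled."""
    return closure(option, selects)


def also_disabled(option, selects):
    """Options that would have to go off for `option` to be disabled."""
    return closure(option, build_reverse(selects))
```

For the example above, with CONFIG_FOO and CONFIG_FOO2 both selecting
CONFIG_BAR, also_enabled("CONFIG_FOO", ...) gives CONFIG_BAR and
also_disabled("CONFIG_BAR", ...) gives CONFIG_FOO and CONFIG_FOO2.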
-
Hi Rik.
In the kernel build area a few possible projects exists.
Increase speed for a build with no updates
==========================================
On a reasonably fast machine with a decent config it takes
roughly 10 seconds to do a make where nothing is updated.
Generating one single Makefile is assumed to speed up things
and will in addition allow a simpler syntax than what is used today
for some of the uglier constructs.

Contact: Sam Ravnborg <sam@ravnborg.org>
Difficulty: 5
Language: Perl or C

Increase speed for a build which updates a single file
======================================================
We often edit a single file and then do a build.
And the result is that we spend 80% of the time linking
the kernel.
So an obvious improvement for the kernel community would
be to improve the speed of the linker (and decrease memory footprint).

Contact: ?
Difficulty: ?
Language: C

Update menuconfig to a modern ncurses look&feel
===============================================
htop, aptitude, tig and other ncurses based programs have
a more modern and effective look&feel than current menuconfig.
Rip out all the lxdialog stuff and replace it with a ncurses
based frontend that looks better and has more functionality.

Contact: Sam Ravnborg <sam@ravnborg.org>
Difficulty: 5
Language: C

They are independent but challenging and would be very much appreciated
by the kernel community.

I could come up with more projects but these are the ones that are most
straightforward to start with.

Sam
-
Isn't make -j 2 or more implemented by running multiple make in sub-dirs ?
Parallel make is more and more used even on cheap hardware.

--
Phe
-
The kernel build system supports parallel make and I guess all
kernel developers use it. People tell me that a 32 way machine is
quite good for kernel compilation.

The bottleneck is that we spawn so many make instances and each has
to read all the same makefiles and stat in total a zillion files
for a simple kernel build.

With a single Makefile we can run a single instance make where
we read all files only once and stat the same file only once.

Sam
-
-
make -j works fine with a single Makefile, if that's the question.
Xav
-
Even now, make -j8 really pays off on bigiron AMD.
-
On Mon, 15 Oct 2007 16:23:52 +0200
Thank you Sam, I have added your project ideas to the page.
--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
-
Thanks for adding these.
Sam
-
Thanks very much, Rik. I need this eagerly.
I want to find a kernel project that can both be my graduation thesis
and contribute to the Linux kernel community. I read that page and
think your project, Swapout Clustering, is interesting to me.
Is it alright for me to work on it? And can you give some help?

Thanks!
--
May the Source Be With You.
-
On Mon, 15 Oct 2007 18:40:34 +0800
You would be the third student to take on that project
simultaneously. That is not a problem for me (on the
contrary, it increases the chances of one codebase being
likeable to Linus), but it does decrease the chances of
your patch being the one to make it upstream.

Still, it should be a fun project to implement and
benchmark, so go ahead if you want.

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
-
