Re: [PATCH 1/7] lguest: documentation pt I: Preparation

From: Rob Landley
Date: Wednesday, July 25, 2007 - 12:30 pm

On Monday 23 July 2007 10:21:13 pm Randy Dunlap wrote:

My cent and a half:

Writing documentation is not the hardest part.  It's brutally hard, falls way 
behind the rest of development, and we may never have enough of it, but it 
turns out it's not the real _problem_.

The problem, as Rusty pointed out, is that nobody can find the documentation 
we've got because it's horribly indexed.

There's Documentation/ and "make htmldocs" in the kernel, which don't 
cross-reference each other.  Each of those has strong structural constraints: 

  Documentation is text and thus doesn't link out to the rest of the world
  gracefully.  The index really needs to be HTML because it's going to link to
  other HTML, PDF, video, wikis, tarballs of example code, source control web
  interface entries with an interesting checkin comment, and strange things I
  haven't even encountered yet.

  The htmldocs output is generated from the kernel source.  It doesn't even
  link out to the text files in Documentation most of the time, let alone out
  to the web.

People assume including all the documentation into the kernel tarball is a 
reasonable thing to do, but just the Ottawa Linux Symposium PDF files total 
several megabytes, and that's a single source of information, only about half 
of which is actually relevant to the kernel.  (Selecting that half turns out 
to be nontrivial, and it changes over time as speculative things turn real 
and other stuff goes the way of "caloric fluid".  Reiser 4 has bounced back 
and forth something like 5 times now.)

There's documentation out on developers' web pages.  There's documentation in
Wikipedia.  There's documentation on "magazine" style websites (Linux Weekly
News, Kerneltrap, Linux Journal, and more).  There's documentation in 
developer blogs (kernelplanet.org aggregates several).  There's documentation 
on project pages on sourceforge.  There's documentation in wikis like 
kernelnewbies or Rik van Riel's mm stuff.  There's documentation in freely 
available online books like Linux Device Drivers and Mel Gorman's memory 
management book.

Lots of the time, the _rationale_ for something was explained on linux-kernel 
and the best thing to do to really understand it is link to three or four 
messages out of an lkml archive.  And sometimes, you need a summary.  
(Summarizing the recent GPLv3 discussion from the kernel developers' 
perspective isn't something I'm looking forward to, and no, linking to a 1000+
message thread and saying "read this" is not a substitute for a coherent 
summary.  Jonathan Corbet does this kind of stuff, but it was still in 
progress last time he wrote about it, and he didn't really try to extract a 
coherent policy decision out of the flamewar and bounce it off Linus for a 
thumbs up/thumbs down.)

So I'm focusing on indexing all this existing (and new) documentation.  I'm 
writing a few bits that I happen to think I know about, or that people come 
to me and ask "where can I find documentation on this" and I can't find any, 
so I research it and write it.  But mostly I'm attempting to turn 
http://kernel.org/doc into the first stop to find something else.

Currently, that page is horrible, mostly because keeping up with the influx of 
NEW information that needs organizing is almost impossible and the huge pile 
of existing information gets neglected.  (I spent almost three months on 
triage, which is kind of frustrating.)  But I'm getting on top of it and hope 
to have a useful (if skeletal) index up there by the end of the week.  
(Moving back to Austin is screwing it up but I'm -><- this close, darn it.)

To get the old to-do heap under control, most of linux-kernel is falling on 
the floor, my to-read pile of things linked from lwn is getting laughably 
long, things are sometimes scrolling off the bottom of kernelplanet.org 
before I get to them, and so on.  But once I've got a skeleton to hang things 
on, I can delegate bits of it (like the whole VFS documentation and 
filesystems under it) to other people.


There's a mailing list, linux-doc@vger.kernel.org, that was _made_ for this 
kind of discussion.  I'm happy to have people tell me what I'm doing wrong, 
make suggestions, or volunteer to tackle some problem.  (Note, after the 
recent hotplug documentation thread, I feel the need to clarify that "you are 
an idiot" does not, in and of itself, qualify as useful feedback.)


Agreed.  Keep in mind that whatever your infrastructure is, you will never be 
generating the bulk of the contributions to it.  You will be integrating 
outside contributions.  And the outside contributions are primarily in HTML 
and PDF these days.  (Postscript less so, but it's still there.  And the 
occasional batch of "source formats" (tex, docbook, etc) from which HTML and 
PDF get produced, but if a web browser can't view the data format directly 
the audience for the documentation's going to be about three people, 
including the author.  Source code comments are an exception to this, but 
that can of worms is familiar enough here already.  I note that man pages are 
also sort of an exception, but a rapidly diminishing one as things like 
doclifter get off the ground.  I note that the maintainer of the man-pages 
package now has his own http://kernel.org/doc/man-pages directory because he 
wants to generate his own html versions rather than having me do it with 
doclifter.  The masters for most of the man pages are now in docbook anyway.)


Keep in mind that the licensing on a lot of documentation allows it to be 
freely redistributed but not freely modified.  (Yes, this sucks, but it's a 
real world problem the same way PDF is a real world format.  Yes you can talk 
to the author, convert it into another format, or write new documentation 
once you've learned what you need from the other documentation.  But this 
takes time.)


I don't want to impose a workflow on documentation authors.  I don't care what 
tools they used to create HTML or PDF: they can do it in emacs or vi, they 
can use a word processor, they can use latex, or something else entirely.  I 
really don't care.  I just want to index the result.  Ideally I want to be 
able to mirror it, and being able to send comments back to the author to get 
updated versions is wonderful but at the moment it's sadly a luxury.

For example, I have Mel Gorman's memory management book mirrored at 
http://kernel.org/doc/gorman but Mel hasn't got time to update it, and he has 
to ping his publisher to see what rights have reverted to him, and when it 
comes to theoretical third-party contributions to it (of which there have so 
far been none) he has to figure out how much control he wants to give up over 
his baby anyway.  I put him and Don Marti of LinuxWorld in touch with each 
other and Mel MIGHT have time to break the book up into a series of smaller 
(updated) articles to run on LinuxWorld, each of which would be much easier 
to replace with a new version if one of them gets too badly outdated...

Half the time, when something gets passed from maintainer to maintainer, it 
gets remastered into a new source format anyway.


Yay Rusty!  He writes good docs.  Kudos to his hamster.

Now the questions are:

1) How do people _find_ this documentation?
2) Reviewing it and integrating feedback.
3) Keeping it up to date in future.

I'm happy to index Rusty's doc.  This thread hasn't included the URL?  Google 
comes up with the lwn.net copy of the start of this thread:
http://lwn.net/Articles/242558/

I'm trying to maintain a documentation index: I can't maintain all the 
documentation I index, any more than Linus personally maintains the ipw2200 
wireless driver (and associated firmware).  I can sometimes note when 
something's out of date and try to do something about it, but the "something" 
varies on a case-by-case basis.


If by the current kernel-doc tools you mean the giant perl script that beats
docbook out of javadoc-style comments in the kernel source with regexes,
that's great for documenting the arguments to functions in the kernel, and 
kind of sucks for correlating the most recent release of the man-pages 
package (which documents the syscalls and such people actually _use_) with 
anything else.
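
For anyone who hasn't seen one, the kind of comment that script feeds on looks
roughly like the sketch below.  The function clamp_to_range() is hypothetical,
made up purely for illustration and not taken from the kernel tree; only the
layout of the comment (opening "/**", a "name - summary" line, one "@arg:" line
per argument, then free-form description) is meant to follow the kernel-doc
convention.

```c
/**
 * clamp_to_range - limit a value to a closed interval
 * @val: the value to limit
 * @lo: smallest value allowed
 * @hi: largest value allowed
 *
 * Hypothetical helper, here only to give the comment something to describe.
 * Returns @val if it lies between @lo and @hi inclusive, otherwise whichever
 * bound is nearer.
 */
int clamp_to_range(int val, int lo, int hi)
{
	if (val < lo)
		return lo;
	if (val > hi)
		return hi;
	return val;
}
```

That covers the arguments nicely, and that's about all it covers: nothing in
that comment can point at a man page, a PDF, or a mailing list thread.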

Here's a video of a talk Linus Torvalds gave on git a couple months back:
http://youtube.com/watch?v=4XpnKHJAok8

Is this something kernel developers might be interested in?  Probably.  Is 
this something it makes the SLIGHTEST bit of sense to try to integrate into 
the kernel-doc infrastructure?  Not really, no.

Rob
-- 
"One of my most productive days was throwing away 1000 lines of code."
  - Ken Thompson.