Re: LINUX Scheduling

!MAILaRCHIVE_VOTE_RePLACE
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
Date: Sunday, June 27, 1993 - 8:19 am

Cc: 

In article <20jor2$2d0@hunter-05.cs.strath.ac.uk> gor@cs.strath.ac.uk (Gordon Russell) writes:

The linux scheduling algorithm is one of the simplest ones possible:
what essentially happens is roughly:

 - every task gets a fixed number of ticks (100 Hz), defaulting to 15. 
   At each clocktick the timer interrupt decreases the current task
   counter, and when the counter reaches zero, it sets the
   'need_resched' flag.  Note that this doesn't actually result in a
   task switch yet. 

 - at every return to user mode, the low-level routines check the
   'need_resched' flag, and call the scheduler if it is set, otherwise
   just do a normal return. 

 - the scheduler just checks all processes, selects the one that has the
   highest tick-number, and lets it run.  If all runnable processes have
   zero ticks remaining, the scheduler does an aging algorithm and tries
   again. 

 - the aging algorithm just does a divide-by-two on the tick counter (0
   for all runnable processes, but may be non-zero for sleeping
   processes), and adds the base timer value (15 by default, but changed
   by 'nice' etc) to the result. 

The above is the basic logic - very simple, and results in round-robin
behaviour when all tasks are CPU-bound.  If something sleeps for a
longer time, it will get a higher tick-counter due to the aging
mechanism, so that when it wakes up it can have up to 2*default ticks to
do any work (although I doubt the aging actually makes much difference). 

In addition to the above, the linux 'wake_up()' functions also set the
'need_resched' flag if the task awakened has a higher tick counter than
the currently running task.  This is done to get better IO throughput
and response times.  Note that linux does not keep any state information
in the scheduler: this makes the actual scheduling a bit inefficient,
but gives more freedom in the sleep/wakup code to do some things that
would otherwise be hard to do cleanly. 

The linux scheduler can be bogged down under heaveir load, but the logic
hasn't changed for some time, mainly due to:

 - it's simple, and *very* flexible.  Anybody interested in why things
   are done as they are, should probably take a look at the
   "add_wait_queue()" and "remove_wait_queue()" functions, especially
   the way they are used by select() and the tty routines.  It can be
   confusing to see a process set it's own state to TASK_INTERRUPTIBLE
   (ie sleeping), yet still continue to run, but there are very good
   reasons for it (races). 

 - it actually gives reasonable performance and good responsetimes under
   most circumstances, and has little overhead due to various priority
   calculations.  Jim Wissner actually did some testing on the algorithm
   and implemented a more clever version (with priority queues), but he
   also said that the simple version seemed to work very well in almost
   all circumstances.

Hope that gave you some idea about it,

                Linus
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
Re: LINUX Scheduling, Linus Torvalds, (Sun Jun 27, 8:19 am)