Pliant 'FastSem' semaphore implementation (as opposed to 'Sem') uses 'yield'
http://old.fullpliant.org/
Basically, if the resource you are protecting with the semaphore will be held
for a significant time, then a full semaphore might be better, but if the
resource will be held for just a few cycles, then lightweight acquiring might
bring the best result, because the most significant cost is in
acquiring/releasing.
So the acquiring algorithm for fast semaphores might be:
try to acquire with a hardware atomic read-and-set instruction, then if it fails,
call yield and retry (at least on a single-processor, single-core system).
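
The acquire loop described above could be sketched in C roughly as follows. This is only an illustration, not Pliant's actual FastSem code: the names fast_sem, fast_sem_try_acquire, fast_sem_acquire and fast_sem_release are made up here, the C11 atomic_flag stands in for the hardware atomic read-and-set instruction, and POSIX sched_yield stands in for 'yield'.

```c
#include <sched.h>      /* sched_yield (POSIX) */
#include <stdatomic.h>  /* atomic_flag (C11) */
#include <stdbool.h>

/* Hypothetical type for illustration; not Pliant's real FastSem layout. */
typedef struct {
    atomic_flag locked;   /* set while somebody holds the semaphore */
} fast_sem;

#define FAST_SEM_INIT { ATOMIC_FLAG_INIT }

/* One attempt: atomically set the flag and report whether it was clear
   before, i.e. whether this caller is the one who took the semaphore. */
static bool fast_sem_try_acquire(fast_sem *s) {
    return !atomic_flag_test_and_set_explicit(&s->locked, memory_order_acquire);
}

/* Try the atomic instruction; on failure give up the CPU so the current
   holder can run and release, then retry. */
static void fast_sem_acquire(fast_sem *s) {
    while (!fast_sem_try_acquire(s)) {
        sched_yield();
    }
}

/* Releasing is a single atomic clear of the flag. */
static void fast_sem_release(fast_sem *s) {
    atomic_flag_clear_explicit(&s->locked, memory_order_release);
}
```

A caller would declare `fast_sem s = FAST_SEM_INIT;` and wrap the few protected cycles between fast_sem_acquire(&s) and fast_sem_release(&s). The uncontended path costs one atomic instruction each way, which is the point of the light approach. A waiter never sleeps on a queue, so a long hold time just turns into repeated yields.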