Yes, UltraSPARC has a RAS (Return Address Stack). I think it has
effectively zero latency (i.e. you can call some function, immediately
"ret", and it hits the RAS). This is probably because, due to delay slots,
there is always going to be one instruction in between anyway. :)
It doesn't flush the pipeline; it just stalls it while waiting for the
address computation.
Branches are predicted and can execute in the same cycle as the
condition-code setting instruction they depend upon.
There really isn't anything special done here for indirect jumps,
other than pushing onto the RAS. Indirects just suck :)
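
For anyone who wants the RAS idea spelled out, here is a rough toy model
in Python. It is only a sketch of the general technique, not how the
UltraSPARC hardware is actually built: the class name, the depth of 4 and
the call addresses are all made up for illustration. The one SPARC detail
it relies on is that a return goes to the call's address + 8, i.e. past
the call and its delay slot.

```python
class ReturnAddressStack:
    """Toy model of a return address stack (RAS) predictor."""

    def __init__(self, depth=4):
        # depth=4 is an illustrative assumption, not the real hardware size.
        self.depth = depth
        self.entries = []

    def on_call(self, call_pc):
        # SPARC returns to call_pc + 8: past the call and its delay slot.
        if len(self.entries) == self.depth:
            self.entries.pop(0)  # overflow: drop the oldest entry
        self.entries.append(call_pc + 8)

    def on_return(self, actual_target):
        # Pop the prediction; an empty stack means there is no prediction.
        predicted = self.entries.pop() if self.entries else None
        return predicted == actual_target


def simulate(depth, nesting):
    """Nest `nesting` calls, return from all of them, count correct predictions."""
    ras = ReturnAddressStack(depth)
    call_pcs = [0x1000 + 0x100 * i for i in range(nesting)]  # made-up addresses
    for pc in call_pcs:
        ras.on_call(pc)
    hits = 0
    for pc in reversed(call_pcs):
        hits += ras.on_return(pc + 8)
    return hits
```

Under these assumptions, simulate(4, 3) predicts every return, while
nesting deeper than the stack (say simulate(4, 6)) loses the predictions
for the outermost calls, since their entries were overwritten.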