> On Tue, 2006-06-27 at 11:38 -0700, James Richard Tyrer wrote:
> > One of the reasons that SMP systems don't scale proportionately is that
> > the processors compete for memory access. It should be noted that this
> > is also the reason that faster clock speeds don't scale proportionately
> > -- memory access becomes the limiting factor.
>
> Intel CPUs have a traditional Front Side Bus (FSB). This means a single
> shared bus between chipset, memory and cpu(s). In other words you have
> two problems: several components compete for bandwidth, and the shared
> bus picks up more electrical noise.
>
> Recent example: on the Xeon line of cpus they could only do 533MHz
> instead of 800MHz like the P4 could do because of the reduced signal
> quality due to having another cpu on the bus. This of course reduced
> the total FSB bandwidth while doubling the theoretical computation rate.
>
> In other words, a cpu that was already starved for memory and I/O
> bandwidth is starved even more when two cpus share the bus. With 4-8
> cpus the problem becomes really critical, since the FSB still runs at
> 533MHz. This is to some extent patched over by increasing cache sizes,
> something Intel has done to a great extent lately.
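To put rough numbers on the shared-bus squeeze described above, here is a minimal sketch. It assumes an 8-byte-wide bus and treats the quoted "MHz" figures as effective transfer rates; neither detail is stated in the post. The even split between cpus is also a simplification, since real traffic is bursty and caches absorb part of it.

```python
# Rough per-cpu share of a shared front side bus.
# Assumptions (not from the post): the bus is 8 bytes wide and the
# quoted "MHz" figures are effective transfer rates in MT/s.
BUS_WIDTH_BYTES = 8


def fsb_bandwidth_mb_s(rate_mt_s):
    """Peak bus bandwidth in MB/s for a given effective transfer rate."""
    return rate_mt_s * BUS_WIDTH_BYTES


def per_cpu_share_mb_s(rate_mt_s, n_cpus):
    """Even split of the peak bandwidth across n_cpus (a simplification)."""
    return fsb_bandwidth_mb_s(rate_mt_s) / n_cpus


if __name__ == "__main__":
    print("1 cpu  @ 800:", per_cpu_share_mb_s(800, 1), "MB/s")
    for n in (2, 4, 8):
        print(f"{n} cpus @ 533:", per_cpu_share_mb_s(533, n), "MB/s")
```

Under these assumptions the per-cpu share halves each time the cpu count doubles, because the bus rate does not change.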
>
>
> AMD has solved this problem excellently.
> In a single-cpu system the cpu has 1 or 2 integrated memory controllers
> plus a separate HyperTransport (HT) connection to the chipset. This
> makes sure that chipset and memory traffic never has to compete for
> bandwidth, and also the latency for the cpu to request something from
> memory is greatly reduced.
>
> In 4-cpu systems each cpu has 3 HT connections plus 2 memory channels.
> Thus you have 8 memory channels, and this provides a huge max
> memory bandwidth.
> Cpu 1 uses one HT connection to connect to the chipset; the other
> two are used to connect to cpus 2 and 4.
> Cpu 2 uses its HTs to connect to cpus: 1,3,4
> Cpu 3 uses its HTs to connect to cpus: 2,4
> Cpu 4 uses its HTs to connect to cpus: 1,2,3
>
> As you can see, cpus 1 and 3 have no direct connection and have to
> relay through cpu 2 or 4 (whichever has the least load).
> This really has no big impact on performance since the HT links
> are more than fast enough to handle the traffic.
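The 4-cpu link layout listed above can be written down as a small graph to check the hop counts. This is only an illustrative sketch: the `LINKS` table is transcribed from the list in the post (the chipset link is left out), and the function names are made up for the example.

```python
from collections import deque

# HT links between cpus as listed in the post (chipset link omitted).
LINKS = {
    1: [2, 4],
    2: [1, 3, 4],
    3: [2, 4],
    4: [1, 2, 3],
}


def hops(src, dst, links=LINKS):
    """Minimum number of HT links between two cpus (breadth-first search)."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in links[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # dst cannot be reached from src


def relays(src, dst, links=LINKS):
    """Cpus directly linked to both src and dst, i.e. possible single relays."""
    return sorted(set(links[src]) & set(links[dst]))


if __name__ == "__main__":
    print("hops 1 -> 3:", hops(1, 3))
    print("possible relays:", relays(1, 3))
```

Only the 1-3 pair needs two hops in this layout; every other pair is directly linked.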
>
> This means that communication between cpus, from any cpu to any
> memory chip, and from any cpu to the chipset is a lot more
> complicated than just using a common FSB. But this is also regarded
> as a _very_ good solution to the problem since it scales upwards
> to 8 cpus very nicely.
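A back-of-the-envelope comparison of why per-cpu memory controllers scale where a shared bus does not. The per-channel figure (DDR-400 at 3200 MB/s peak) and the 8-byte FSB width are assumptions for illustration, not numbers from the post, and the totals are peak local bandwidth only; remote accesses over HT would see less.

```python
# Peak aggregate memory bandwidth as cpus are added.
# Assumptions (not from the post): each memory channel is DDR-400 at
# 3200 MB/s peak, and the shared FSB is 8 bytes wide at 533 MT/s.
CHANNEL_MB_S = 3200
CHANNELS_PER_CPU = 2
FSB_MB_S = 533 * 8


def integrated_controller_total(n_cpus):
    """Every cpu brings its own channels, so the total grows with n_cpus."""
    return n_cpus * CHANNELS_PER_CPU * CHANNEL_MB_S


def shared_fsb_total(n_cpus):
    """All cpus sit behind one bus, so the total is the same for any n_cpus."""
    return FSB_MB_S


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, "cpus:", integrated_controller_total(n), "vs", shared_fsb_total(n))
```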
>
> Also, if the OS supports NUMA, it can make sure that a process
> runs on a cpu close to the memory it uses, rather than being placed
> at random. This is a great thing on big systems since on 8-way
> systems a cpu might have to use several relays on its way to
> a distant memory chip. Smaller systems benefit relatively little
> from it.
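As a sketch of what NUMA-aware placement buys, the snippet below picks the idle cpu with the fewest hops to the memory a process uses. The hop table is derived from the 4-cpu layout described earlier in the post; `pick_cpu` is a hypothetical policy for illustration and not how any particular kernel scheduler works.

```python
# Hop counts between the four cpus of the post's topology
# (0 = local memory, 1 = one HT link, 2 = relayed through a neighbour).
HOPS = {
    1: {1: 0, 2: 1, 3: 2, 4: 1},
    2: {1: 1, 2: 0, 3: 1, 4: 1},
    3: {1: 2, 2: 1, 3: 0, 4: 1},
    4: {1: 1, 2: 1, 3: 1, 4: 0},
}


def pick_cpu(memory_node, idle_cpus, hops=HOPS):
    """Return the idle cpu with the fewest hops to memory_node.

    Ties are broken by the lowest cpu number. Hypothetical policy,
    used here only to illustrate the idea.
    """
    if not idle_cpus:
        raise ValueError("no idle cpu available")
    return min(idle_cpus, key=lambda cpu: (hops[cpu][memory_node], cpu))


def average_hops(memory_node, cpus, hops=HOPS):
    """Mean hop count if the process lands on a random cpu from cpus."""
    return sum(hops[c][memory_node] for c in cpus) / len(cpus)


if __name__ == "__main__":
    print("best cpu for memory on node 3:", pick_cpu(3, [1, 2]))
    print("average hops with random placement:", average_hops(3, [1, 2, 3, 4]))
```

In this 4-cpu layout random placement costs one hop on average, which fits the remark that small systems gain relatively little.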
>
>
> Intel is rumored to be implementing a design similar to AMD's in
> its upcoming cpus.
>
>
> I hope this gives you some valuable information, and hopefully
> I haven't made any serious errors in my interpretation of the designs.
>
> -HK
>
> _______________________________________________
> Open-graphics mailing list
> Open-graphics@duskglow.com
> http://lists.duskglow.com/mailman/listinfo/open-graphics
> List service provided by Duskglow Consulting, LLC (www.duskglow.com)