Dual core vs. hyper threading

Postby enk » Mon Jan 14, 2008 1:52 am UTC

How much is HT like (or unlike) plain dual core?

How does my 3 GHz P4 HT compare to a 1.5 GHz Core 2 Duo for instance?
phlip wrote:Ha HA! Recycled emacs jokes.
User avatar
enk
 
Posts: 754
Joined: Mon Sep 10, 2007 12:20 am UTC
Location: Aalborg, Denmark

Re: Dual core vs. hyper threading

Postby EvanED » Mon Jan 14, 2008 2:11 am UTC

Similarities:
- An SMT (simultaneous multithreading, = HT) chip presents itself to the OS as two CPUs

- The SMT chip can sort of run two threads at the same time


Differences:
- SMT chips share most things. They share dispatch ports, ALUs, memory read/write ports, maybe the reorder buffer, all caches, etc.; Core 2 Duos share just the L2 cache and I/O lines out of the chip between the cores

- In particular, SMT chips are more susceptible to two threads affecting each other's cache behavior. If you're lucky this can actually increase performance as they prefetch for each other, but if they are from separate processes, or from the same process but doing very different things, it'll probably be a negative. Multi-core chips can only see this effect at the L2 cache, which means it'll happen less often

- Peak performance of an SMT chip is equal to peak performance of the same chip with HT turned off. (In other words, if the OS sees CPU0 and CPU1 as the two hardware contexts, the peak performance of CPU0 and CPU1 together is equal to the peak performance of CPU0 alone.) Peak performance of a multicore chip is equal to twice the performance of one of the cores.
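
The peak-performance point can be put into a toy model. This is only a sketch under loud assumptions: `issue_width` (instructions a core can issue per cycle) and the per-thread IPC figures are made-up illustration values, not measurements of any real P4 or Core 2.

```python
def smt_core_ipc(thread_ipcs, issue_width):
    """Throughput of ONE core running all threads as SMT contexts.

    The threads share the core's issue slots, so the total can never
    exceed issue_width, however many hardware contexts there are.
    """
    return min(issue_width, sum(thread_ipcs))


def multicore_ipc(thread_ipcs, issue_width):
    """Throughput when every thread gets its own core (one thread per core)."""
    return sum(min(issue_width, ipc) for ipc in thread_ipcs)


# Hypothetical numbers: a 3-wide core and threads that could each fill it.
one_thread = smt_core_ipc([3.0], issue_width=3)
two_on_smt = smt_core_ipc([3.0, 3.0], issue_width=3)
two_on_dual = multicore_ipc([3.0, 3.0], issue_width=3)

# Threads that leave issue slots empty are where SMT "sort of" runs two at once.
stalling_pair_on_smt = smt_core_ipc([1.0, 1.5], issue_width=3)
```

With two threads that could each saturate the core, the SMT chip tops out where the single thread did, while the dual core doubles. SMT only gains when each thread leaves slots idle.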
EvanED
 
Posts: 3765
Joined: Mon Aug 07, 2006 6:28 am UTC
Location: Madison, WI

Re: Dual core vs. hyper threading

Postby b.i.o » Mon Jan 14, 2008 2:40 am UTC

Also, Core 2 Duos are a lot more efficient for their clock speed, so it's not an even comparison between the two. Pentium 4s were Intel's attempt to brute-force their way to performance superiority by running at very high clock speeds. Core 2 Duos are such powerful processors precisely because they don't do this.
User avatar
b.i.o
Green is the loneliest number
 
Posts: 2511
Joined: Fri Jul 27, 2007 4:38 pm UTC
Location: Hong Kong

Re: Dual core vs. hyper threading

Postby akashra » Mon Jan 14, 2008 2:44 am UTC

EvanED wrote:Similarities:
- An SMT (simultaneous multithreading, = HT) chip presents itself to the OS as two CPUs

- The SMT chip can sort of run two threads at the same time

This is a myth, it's still a single core processor, with only one simultaneous operation in use.

The main difference is that it provides a much less intensive way of context switching. Internally, what HT does is basically provide a second stack, which can quickly be switched to, in contrast to doing a full context switch and having to copy around all kinds of data every time a context switch takes place. This massively reduces overhead. In some ways, this can mean HT is actually more efficient than dual-core.
( find / -name \*base\* -exec chown us : us { } \ ; )
akashra
 
Posts: 503
Joined: Tue Jan 01, 2008 6:54 am UTC
Location: Melbourne, AU

Re: Dual core vs. hyper threading

Postby EvanED » Mon Jan 14, 2008 3:05 am UTC

akashra wrote:
EvanED wrote:Similarities:
- An SMT (simultaneous multithreading, = HT) chip presents itself to the OS as two CPUs

- The SMT chip can sort of run two threads at the same time

This is a myth, it's still a single core processor, with only one simultaneous operation in use.

Unless it's superscalar (hint: like the P4), in which case it could be dispatching instructions from both instruction streams at once.

Edit: And even if it's not dispatching instructions from multiple threads (as is the case on, e.g., the Sun Niagara), it's entirely possible and even probable that instructions from multiple streams are in flight, at various stages of the pipeline, at any given instant.
Last edited by EvanED on Mon Jan 14, 2008 3:07 am UTC, edited 1 time in total.

Re: Dual core vs. hyper threading

Postby Anpheus » Mon Jan 14, 2008 3:06 am UTC

akashra wrote:
EvanED wrote:Similarities:
- An SMT (simultaneous multithreading, = HT) chip presents itself to the OS as two CPUs

- The SMT chip can sort of run two threads at the same time

This is a myth, it's still a single core processor, with only one simultaneous operation in use.

The main difference is that it provides a much less intensive way of context switching. Internally, what HT does is basically provide a second stack, which can quickly be switched to, in contrast to doing a full context switch and having to copy around all kinds of data every time a context switch takes place. This massively reduces overhead. In some ways, this can mean HT is actually more efficient than dual-core.


Not true. In certain applications the performance is greatly reduced; database applications, being highly optimized to work with 'real' cores, are chief among them (tests on a stable Linux box with MySQL have shown anywhere from a low single-digit performance decrease to a whopping 35% hit). Performance testing of real-world applications rarely shows hyperthreading increasing performance beyond a few percentage points. As you said, all it does is provide a marginally less intensive method of context switching. It's... only going to help out some of the time, and usually only when switching between two threads very rapidly.
Spoiler:
Code: Select all
  /###\_________/###\
  |#################|
  \#################/
   |##┌         ┐##|
   |##  (¯`v´¯)  ##|
   |##  `\ ♥ /´  ##|
   |##   `\¸/´   ##|
   |##└         ┘##|
  /#################\
  |#################|
  \###/¯¯¯¯¯¯¯¯¯\###/
User avatar
Anpheus
I can't get any worse, can I?
 
Posts: 860
Joined: Fri Nov 16, 2007 10:38 pm UTC
Location: A privileged frame of reference.

Re: Dual core vs. hyper threading

Postby akashra » Mon Jan 14, 2008 3:20 am UTC

Did you miss the bit where I said 'can', or are you just trolling for a pointless argument on semantics?

Re: Dual core vs. hyper threading

Postby Anpheus » Mon Jan 14, 2008 3:29 am UTC

No, because you're actually wrong because there is no comparison in the performance, the dual core chip will win hands down in every situation, whereas the hyperthreading only provides a better baseline performance when compared to a single-threaded chip.

Edit: Also, don't weasel word your way out of your own statements.

Re: Dual core vs. hyper threading

Postby akashra » Mon Jan 14, 2008 3:41 am UTC

You're trying to do the same thing! It's absolutely bullshit to say that a dual-core chip will win *every* time (your words, not mine) - especially if you're loading one core with lots of threads, with the other core a control thread - not at all an uncommon thing to do. Not everything is about raw throughput.

I haven't at all denied that a dual-core chip is generally going to be quicker, but you don't know what you're talking about if you think that it's always going to be.

Re: Dual core vs. hyper threading

Postby Anpheus » Mon Jan 14, 2008 3:51 am UTC

Two problems with your counter-argument regarding control threads. The first and most apparent to me was that you'd have to force the threads to maintain those boundaries between cores using an OS API. The control thread / non-control thread core idea is one that you, I believe, just invented on the spot. Yes, control threads are sometimes used; no, by default, no OS is going to... I guess you're thinking the OS just knows which thread is a control thread and reserves a core for that thread? Come on. No, the scheduler for Windows at least is going to throw the two most CPU-intensive applications on different cores, and if one core starts hitting 100% it will pull applications that need more priority off of that core and onto less used ones. The second and less apparent problem was that the above is irrelevant. You see, you're already positing a multithreaded application, in which case the multiple-core machine will win in every case, as per the disambiguated situation below. I don't know where you get your ideas, but they're a bit of a jumble as it is. Do some research or ask some questions, but don't state facts you aren't aware of.
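
For what it's worth, "forcing the threads onto particular cores using an OS API" looks roughly like this on Linux. A minimal sketch: `pin_to_cpu` is a made-up helper name, while `os.sched_getaffinity` and `os.sched_setaffinity` are real Python standard-library calls that exist only on some platforms (Linux among them). On Windows the equivalent would go through a different API, which this sketch does not cover.

```python
import os


def pin_to_cpu(cpu):
    """Restrict the caller to one logical CPU and return the resulting CPU set.

    Returns None where the affinity calls are unavailable. Passing 0 as the
    pid means "the caller" (on Linux that is the calling thread).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(0)
    if cpu not in allowed:
        raise ValueError(f"CPU {cpu} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {cpu})
    return os.sched_getaffinity(0)


# Usage idea (not executed here): a control thread would call pin_to_cpu(0)
# and the worker threads would pin themselves to the remaining CPUs.
```

Nothing here happens by default, which is the point being argued: without a call like this the scheduler is free to place threads wherever it likes.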

I will clarify and apologize for saying always. Always, though, is not a weasel word, it is very much the opposite of a weasel word. Saying "always" makes it very easy for people to contradict you, and I didn't make my meaning clear. I'll elaborate on my statement and add a thought experiment of sorts. Compare a single core chip of arbitrary capabilities with an identical chip that has hyperthreading, and an identical chip that replicates all the functions as per modern multiple-core technology (Intel or AMD native multicore, they're pretty close to identical so take your pick.) Each chip has unlimited bandwidth to the rest of the machine. On all applications that use multiple threads, the dual core chip will have the greatest performance. On applications that utilize a single thread, the dual core and single core chip will tie, and the hyperthreaded chip will vary, depending on application, to either a few percent above the throughput of the others, or as much as thirty percent below. However, in no case will a multithreaded application perform more slowly on the dual core CPU than the hyperthreaded single core CPU.

Re: Dual core vs. hyper threading

Postby EvanED » Mon Jan 14, 2008 3:56 am UTC

Anpheus wrote:However, in no case will a multithreaded application perform more slowly on the dual core CPU than the hyperthreaded single core CPU.

I'm sure you could concoct an example where this isn't true.

The dual core is almost always the way to go, but there are certain... I would say whatever the opposite of pathological is... examples where the SMT core would be faster.

Re: Dual core vs. hyper threading

Postby Anpheus » Mon Jan 14, 2008 4:00 am UTC

You would be struggling to find an example where a same-speed set of chips, one single core hyperthreaded, one dual-core natively, would perform better on the former.

Re: Dual core vs. hyper threading

Postby akashra » Mon Jan 14, 2008 4:00 am UTC

Anpheus wrote:The control thread / non-control thread core idea is one that you, I believe just invented on the spot.

If by 'invented on the spot' you mean 'took from an example of what I work on every day in my full-time job', then yes, you'd be correct.

Re: Dual core vs. hyper threading

Postby akashra » Mon Jan 14, 2008 4:01 am UTC

Anpheus wrote:You would be struggling to find an example where a same-speed set of chips, one single core hyperthreaded, one dual-core natively, would perform better on the former.

It might pay you to go do some research on the Sun UltraSPARC T1 processor. Similar kind of technology, it's designed specifically for this kind of problem. They could have just put 32 cores on, but that'd mean way more transistors and cost. Instead it's 8 cores, with 32 virtualised cores (4 threads per core).

Re: Dual core vs. hyper threading

Postby Anpheus » Mon Jan 14, 2008 4:06 am UTC

The SPARC T1 and T2 Niagara would suffer from the same problem. Hyperthreading is cheaper in real estate to put on a chip than multiple cores. If they could put 64 hardware cores instead of 8 hardware and 8 threads per core, they'd gain a shitload of performance.

Re: Dual core vs. hyper threading

Postby EvanED » Mon Jan 14, 2008 4:08 am UTC

Anpheus wrote:You would be struggling to find an example where a same-speed set of chips, one single core hyperthreaded, one dual-core natively, would perform better on the former.

A real world example? Perhaps. But all you need to do is make it so that the two threads are behaving nicely with respect to the cache when run on an SMT core, and behaving mean when run on separate cores. Both doing lots of writes to the same small area of memory should do that.
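
A toy model of that concocted case, counting how often a cache line has to change hands. Everything here is an assumption for illustration: `coherence_transfers` is a hypothetical function, there is a single cache line, and the coherence protocol is reduced to "a write by a different core steals the line".

```python
def coherence_transfers(writes, shared_l1):
    """Count ownership transfers of ONE cache line for a sequence of writes.

    writes:    thread ids in the order the writes happen, e.g. [0, 1, 0, 1].
    shared_l1: True models two SMT contexts on one core (a single L1 copy);
               False models one thread per core, each with a private L1.
    """
    if shared_l1:
        return 0  # both hardware threads hit the same L1 copy
    transfers = 0
    owner = None
    for tid in writes:
        if owner is not None and tid != owner:
            transfers += 1  # line is invalidated in the old owner's cache
        owner = tid
    return transfers


alternating = [0, 1] * 4          # two threads taking turns writing
on_smt = coherence_transfers(alternating, shared_l1=True)
on_dual_core = coherence_transfers(alternating, shared_l1=False)
```

On the dual core nearly every write pays for a line transfer, while on the SMT core none do. Whether that cost ever outweighs having a second full core is exactly what a real benchmark would have to show.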

Anpheus wrote:The SPARC T1 and T2 Niagara would suffer from the same problem. Hyperthreading is cheaper in real estate to put on a chip than multiple cores. If they could put 64 hardware cores instead of 8 hardware and 8 threads per core, they'd gain a shitload of performance.

On the T1, increasing to 32 native cores would probably give about a 2x speedup over the current setup on the workloads that we were playing around with. (Granted, perhaps not so realistic, but at the same time, perhaps they are. I don't know.) That 2x speedup would be from a 4x increase in transistors.

They would probably be better served to add more SMT cores (increase to, say, 12 four-way SMT cores) rather than do that.

The nice thing about SMT is that it is relatively cheap in terms of silicon, far cheaper than adding a new core.
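
The arithmetic behind that, as a rough sketch. The 2x-for-4x figure is the one quoted above; the numbers for the 12-core option are purely my assumption (1.5x the transistors, throughput scaling linearly with core count), so treat the second result as illustrative only.

```python
def perf_per_transistor(speedup, transistor_ratio):
    """Relative performance bought per unit of silicon (baseline chip = 1.0)."""
    if transistor_ratio <= 0:
        raise ValueError("transistor_ratio must be positive")
    return speedup / transistor_ratio


# From the post: about 2x speedup for about 4x the transistors.
native_32_cores = perf_per_transistor(speedup=2.0, transistor_ratio=4.0)

# ASSUMPTION (not from the post): 12 SMT cores instead of 8 costs 1.5x the
# transistors and scales throughput linearly to 1.5x.
smt_12_cores = perf_per_transistor(speedup=1.5, transistor_ratio=1.5)
```

Under those assumptions the all-native design gets half the performance per transistor of the baseline, while adding more SMT cores holds it steady.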

Re: Dual core vs. hyper threading

Postby akashra » Mon Jan 14, 2008 4:12 am UTC

No, again, because this time instead of having two contexts that you can switch between, you have four. If you're only doing n operations with small stacks, where n is the number of processors, then you won't see any gain, but when you're rapidly switching, having four threads executing on the same core makes much more sense. As soon as you move to n+1, you have to do context switches at some point.

It seems to me that you're arguing that raw throughput is going to be higher, with single threads per core, on a dual-core processor - which is true. However the reality is, systems run far more concurrent threads than we have processors in systems. If we were back in the old days where there was no multitasking, and/or only running a single application each with a single thread for each core, we wouldn't have had the need for hyperthreading.
Things just don't work that way in the real world. The simple fact is, applications tend to use more threads than are available cores.
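
The "n+1" point as a tiny sketch. `hardware_contexts` and `threads_waiting` are hypothetical helper names, and the model ignores everything except how many threads can be resident in hardware at once.

```python
def hardware_contexts(cores, threads_per_core=1):
    """Number of threads the hardware can hold without an OS context switch."""
    return cores * threads_per_core


def threads_waiting(runnable, cores, threads_per_core=1):
    """Runnable threads that must wait for the OS scheduler to swap them in."""
    return max(0, runnable - hardware_contexts(cores, threads_per_core))


# 3 runnable threads on a plain dual core vs. a single core with 2-way SMT.
dual_core = threads_waiting(3, cores=2)
single_core_ht = threads_waiting(3, cores=1, threads_per_core=2)

# 32 runnable threads on a T1-style layout (8 cores x 4 threads) vs. 8 plain cores.
t1_style = threads_waiting(32, cores=8, threads_per_core=4)
eight_plain_cores = threads_waiting(32, cores=8)
```

This only counts who has to wait. It says nothing about how fast the resident threads actually run, which is the other half of the argument.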

Re: Dual core vs. hyper threading

Postby Anpheus » Mon Jan 14, 2008 4:25 am UTC

And due to this, there's almost always a case where two applications or two threads can run simultaneously. I would say 'always,' in fact, because I'm confident there exists no working environment which suffers under multicore more than hyperthreading.

Regrettably, the argument is moot: modern dual-core processors have individual cores superior to most hyperthreaded architectures... well, your best bet is to find someone with a dual-core and hyperthreaded Xeon willing to run a VM for a bit.

Re: Dual core vs. hyper threading

Postby akashra » Mon Jan 14, 2008 4:40 am UTC

Anpheus wrote:And due to this, there's almost always a case where two applications or two threads can run simultaneously. I would say 'always,' in fact, because I'm confident there exists no working environment which suffers under multicore more than hyperthreading.

Please familiarise yourself with Exchangers, (and to some degree, Barriers and Latches) and then revisit what you've just written.

Re: Dual core vs. hyper threading

Postby Anpheus » Mon Jan 14, 2008 4:45 am UTC

Why exactly would I review different locking mechanisms?

Edit: You realize I said "working environment", right? Not "single application instance." I want you to tell me of a real-world working environment that would run better with one core disabled and hyperthreading on than with both cores enabled and hyperthreading off. To demonstrate this, we can either search for one of the Xeons that have both available (I believe there were some dual-core Xeons that had HT), or, as I'm not certain about that, we can wait until Nehalem.

Re: Dual core vs. hyper threading

Postby akashra » Mon Jan 14, 2008 5:00 am UTC

Now who's trying to weasel their way out of things?

Re: Dual core vs. hyper threading

Postby Anpheus » Mon Jan 14, 2008 5:02 am UTC

I'm not, in fact, I would be overjoyed to see you come up with a VM to back up your argument. When Nehalem comes out, we'll be able to prove which one of us is right. Until then, the jury is very much out as, I just looked up, no Xeons simultaneously supported multi-core and hyperthreading. Without a fair comparison we can't prove which one of us is right.

I'll go ahead and say any OS is fine, and that it should be distributed as a disk image or VM to be booted by someone who ends up owning Nehalem. I'll put money on the multi-core throughput beating out the hyper-threading throughput for any benchmarking application of your choice.

Edit: Are those terms acceptable to you?

Re: Dual core vs. hyper threading

Postby Anpheus » Tue Jan 15, 2008 6:29 am UTC

I'm sorry I don't want to accuse you of anything so hastily, but... I came up with a way to test which one of us is right, all it would take is a little time, a little effort and knowing someone, anyone who will get Nehalem and is willing to boot an image for us, and... you disappeared?

What conclusions should I draw from that? I'm going to give you the benefit of the doubt and say you didn't see my reply yet or haven't had time to reply, but the alternative is that when confronted with a way to prove your own beliefs true or false, you wavered.

Re: Dual core vs. hyper threading

Postby enk » Tue Jan 15, 2008 9:58 am UTC

While we're waiting for Nehalem...

Silver2Falcon wrote:Core 2 Duos are a lot more efficient for their clock speed and so it's not an even comparison between the two.


I know processor architectures improve, but how much?

Comparing my 2004 CPU to the new ones is difficult on the Tom's Hardware charts, as they use different tests for different years.

I guess a new Core 2 Duo @ 1.5 GHz (I say 1.5 as it's half of 3) will beat the shit out of my 3 GHz P4 HT, but when will they be about equal? The C2D @ 1.0 GHz? Even lower?

Re: Dual core vs. hyper threading

Postby Anpheus » Tue Jan 15, 2008 10:13 am UTC

There's no such comparison. You could, of course, use a specific benchmark, but "beat the shit out of" X is so subjective (based on which benchmark you use) in computing that there exists no baseline. You could use MIPS, FLOPS, but even that is subject to the type of benchmarking, whether that's throughput from disk, memory, cache, if you want to find out performance while running a particular OS, whether it will use SSE or just the FPU, etc.

Re: Dual core vs. hyper threading

Postby enk » Tue Jan 15, 2008 10:50 am UTC

Anpheus wrote:There's no such comparison. You could, of course, use a specific benchmark, but "beat the shit out of" X is so subjective (based on which benchmark you use) in computing that there exists no baseline. You could use MIPS, FLOPS, but even that is subject to the type of benchmarking, whether that's throughput from disk, memory, cache, if you want to find out performance while running a particular OS, whether it will use SSE or just the FPU, etc.


Can you give me rough estimate...?

If not, what kinds of tasks do the P4 and C2D differ in, provided the rest of the setup is the same?

Re: Dual core vs. hyper threading

Postby Anpheus » Tue Jan 15, 2008 10:52 am UTC

I don't actually know any hard facts on the differences in how many clock cycles each operation takes. I do know that the NetBurst architecture (P4 era) advocated an excruciatingly long pipeline, while Core 2 utilizes just a whole shitload of processor advancements that came in the interim: a much shorter pipeline compared to the P4, better floating point execution, etc.

I couldn't give you any hard numbers though. If you call up Intel they'll send you a manual for free on their processors, the performance of different operations, etc.

Re: Dual core vs. hyper threading

Postby enk » Sat Jan 19, 2008 6:56 pm UTC

Sorry if it's already answered in the skirmish above, but will the P4 run faster non-HT in some situations?

Re: Dual core vs. hyper threading

Postby davean » Sat Jan 19, 2008 7:13 pm UTC

enk wrote:Sorry if it's already answered in the skirmish above, but will the P4 run faster non-HT in some situations?


Most situations actually ...
User avatar
davean
Site Ninja
 
Posts: 2410
Joined: Sat Apr 08, 2006 7:50 am UTC

Re: Dual core vs. hyper threading

Postby enk » Sat Jan 19, 2008 7:24 pm UTC

davean wrote:
enk wrote:Sorry if it's already answered in the skirmish above, but will the P4 run faster non-HT in some situations?


Most situations actually ...


Then why HT?

Re: Dual core vs. hyper threading

Postby Anpheus » Sun Jan 20, 2008 6:59 am UTC

It was a bad idea that was initially supposed to be an intermediate step to full-fledged on-chip SMP.

Re: Dual core vs. hyper threading

Postby e946 » Sun Jan 20, 2008 7:20 am UTC

It's also a term that just begs to be marketed to hell and back. Without knowing what it was, wouldn't "hyperthreading" sound like an awesome piece of technology?
User avatar
e946
 
Posts: 621
Joined: Wed Jul 11, 2007 6:32 am UTC

Re: Dual core vs. hyper threading

Postby wst » Sun Jan 20, 2008 11:34 am UTC

Didn't AMD make 'hypertransport' before Intel did 'hyperthreading', which was basically the same thing? Or am I getting confused due to the initials? I always thought that HT was a benefit 0.o
"If it looks like a duck, and quacks like a duck, we have at least to consider the possibility that we have a small aquatic bird of the family anatidae on our hands." - Douglas Adams
User avatar
wst
 
Posts: 2582
Joined: Sat Nov 24, 2007 10:06 am UTC

Re: Dual core vs. hyper threading

Postby davean » Sun Jan 20, 2008 3:10 pm UTC

wst wrote:Didn't AMD make 'hypertransport' before Intel did 'hyperthreading', which was basically the same thing? Or am I getting confused due to the initials? I always thought that HT was a benefit 0.o


HyperTransport is AMD's system bus. It is a high speed serial bus for transferring data around locally which can be cache coherent. This is a good thing, you want it or Intel's upcoming copy.

HyperThreading, on the other hand, is a way of "dealing" with the fact that your processor and memory are so mismatched your system is coming apart at the seams. You tack on a spare set of extra registers (cheap since x86 systems lack an appreciable number of registers to begin with) and while one instruction stream is waiting for memory to return data, the other set of registers swaps in and that stream proceeds with data that is (hopefully) loaded to cache.
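
A toy simulation of that latency-hiding idea, to make the "swap in while waiting for memory" part concrete. It is a sketch of the general technique as described here, not of Intel's actual implementation: `run_cycles`, the burst lengths and the stall length are all made up, and the core issues at most one instruction per cycle.

```python
def run_cycles(threads, stall):
    """Cycles for one single-issue core to finish every thread.

    threads: list of lists; each inner list holds compute-burst lengths
             (positive ints), and every burst is followed by a memory
             stall of `stall` cycles.
    While one thread is stalled, any other ready thread may use the core.
    """
    if any(b < 1 for t in threads for b in t):
        raise ValueError("burst lengths must be positive")
    remaining = [list(t) for t in threads]  # bursts still to run, per thread
    ready_at = [0] * len(threads)           # cycle at which each thread may issue
    cycle = 0
    while any(remaining):
        for i, bursts in enumerate(remaining):
            if bursts and ready_at[i] <= cycle:
                bursts[0] -= 1              # issue one compute cycle
                if bursts[0] == 0:
                    bursts.pop(0)
                    ready_at[i] = cycle + 1 + stall  # now waiting on memory
                break                       # single issue: one thread per cycle
        cycle += 1
    return max([cycle] + ready_at)          # include the final outstanding stall


one_thread = run_cycles([[2, 2]], stall=4)
two_interleaved = run_cycles([[2, 2], [2, 2]], stall=4)
two_without_stalls = run_cycles([[2, 2], [2, 2]], stall=0)
```

Running the two threads back to back would take twice `one_thread`; interleaved, the second thread's work mostly fits inside the first thread's memory waits. With no stalls to hide there is nothing to gain, and the total is just the sum of the compute cycles.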

This is fine in theory, and several other CPUs use or have used this quite well (IBM's POWER, Cray's MTA). Those didn't happen to use a craptastic and cheap implementation of it though. Which is really the point; Intel's was tacked on, not designed in.

Intel really has a bad history of marketing and cheapness overshadowing processor design. Why can't they just be in the business of making good processors?

Re: Dual core vs. hyper threading

Postby wst » Sun Jan 20, 2008 4:17 pm UTC

davean wrote:
HyperThreading, on the other hand, is a way of "dealing" with the fact that your processor and memory are so mismatched your system is coming apart at the seams. You tack on a spare set of extra registers (cheap since x86 systems lack an appreciable number of registers to begin with) and while one instruction stream is waiting for memory to return data, the other set of registers swaps in and that stream proceeds with data that is (hopefully) loaded to cache.


That sounds like an accident waiting to happen. I never trusted any of the old P4's, but mainly because I heard they lacked in floating point calculation abilities, but just sticking HT on like that? Whoops. Lucky I stuck with AMD for that time then. (I needed floating point for AI simulation)

As you're in the know, can you confirm that floating point issue? (Not that I'd touch a P4 now with C2Q's out and Phenom yet to prove itself, and it being outclassed totally by my single core Athlon 64 :P )
"If it looks like a duck, and quacks like a duck, we have at least to consider the possibility that we have a small aquatic bird of the family anatidae on our hands." - Douglas Adams

Re: Dual core vs. hyper threading

Postby mosc » Thu Jan 24, 2008 12:17 am UTC

Simultaneous multithreading (Hyper-Threading if you're Intel) is multiple threads on a single core. That's been said, but here's an easier way to look at it:

Think of a room of secretaries all with their own job functions. Some do filing, some do copying, some do appointment making. Now picture each thread as a boss telling them what to do. Classically, you got one boss and he commands whoever he needs to do his bidding. SMT (or HT) is basically adding another boss. Ideally, each boss doesn't need to use all the secretaries so they can both get what they need done just as fast. The problem you run into with SMT is that if the two tasks conflict, you end up with two bosses yelling for the same secretary to do their bidding and we all know how that turns out.

In order for SMT to give tangible gains, you need to have multiple tasks that do not conflict on resources. If you had, say, one boss who needed some filing and another who needed some copying, they could get along great. Effectively getting both tasks done with one staff. If, however, they both need to make appointments at some point, they will actually slow each other down. It would even be faster in that situation to deal with one and then the other rather than trying to do both at the same time.

So, like all things in architecture, it has more to do with the software than the hardware. If the tasks assigned are not designed to benefit from the SMT, it's actually faster to turn it off. It's a nifty hardware feature and it can be great at specific tasks but since very little software really cares to use it, it's not that effective.
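
The bosses-and-secretaries picture can be written down as a toy model. Loud assumptions: `cycles_to_finish` is a hypothetical function, each "secretary" (execution unit) handles one request per cycle, each boss (thread) asks for one thing per cycle in order, and boss A always wins a tie.

```python
def cycles_to_finish(boss_a, boss_b):
    """Cycles until both bosses' request lists are done.

    Each list names the secretary (unit) needed for each step, in order.
    A unit serves one boss per cycle; boss A has priority on conflicts.
    """
    ia = ib = 0
    cycles = 0
    while ia < len(boss_a) or ib < len(boss_b):
        busy = set()
        if ia < len(boss_a):
            busy.add(boss_a[ia])   # A claims the unit it needs this cycle
            ia += 1
        if ib < len(boss_b) and boss_b[ib] not in busy:
            ib += 1                # B proceeds only if its unit is free
        cycles += 1
    return cycles


no_conflict = cycles_to_finish(["filing"] * 4, ["copying"] * 4)
full_conflict = cycles_to_finish(["appointments"] * 4, ["appointments"] * 4)
mixed = cycles_to_finish(["filing", "copying"], ["copying", "filing"])
```

When the two bosses want different secretaries, both finish in the time one would take alone. When they want the same one, the model shows no gain over doing one job and then the other. The extra slowdown described above would come from contention overheads this sketch does not model.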
Title: It was given by the XKCD moderators to me because they didn't care what I thought (I made some rantings, etc). I care what YOU think, the joke is forums.xkcd doesn't care what I think.
User avatar
mosc
Doesn't care what you think.
 
Posts: 4955
Joined: Fri May 11, 2007 3:03 pm UTC

Re: Dual core vs. hyper threading

Postby mosc » Thu Jan 24, 2008 12:44 am UTC

wst wrote:That sounds like an accident waiting to happen. I never trusted any of the old P4's, but mainly because I heard they lacked in floating point calculation abilities, but just sticking HT on like that? Whoops. Lucky I stuck with AMD for that time then. (I needed floating point for AI simulation)

As you're in the know, can you confirm that floating point issue? (Not that I'd touch a P4 now with C2Q's out and Phenom yet to prove itself, and it being outclassed totally by my single core Athlon 64 :P )

Intel's netburst was actually pretty good with floating point apps. Don't know where you're getting that. For something like MP3 encoding, they were always very competitive. It's been one of their strengths, not their weakness. HT was also very late to the party. Most P4s don't even have it and many have it turned off. The problem with the P4 was always the netburst architecture.

Netburst was basically a very long pipeline, and one that used extra caches to store things that had classically (P3/Athlon) been done in hard-coded routines. The idea was that it would allow for much higher clock speeds (which it did) and work very well on predictable processes (which it did). The main problems were that a) most software was never redesigned to utilize it, b) it was inherently slower on apps that were unpredictable or diverse, and c) extra GHz generated extra heat and sucked too much power.

a) Rewriting an application to take advantage of new hardware capability is extremely expensive. One of the strengths of the x86 architecture (and why 90%+ of the world's computers run on a calculator's architecture from 1978) is that each iteration works with existing code. Anyway, Netburst worked fine on old code and was a major improvement for certain tasks (predictable and repetitive) but never was embraced. Particularly for games, which dictate so much of what we think of as "fast" or "slow", the architecture responded poorly, since games are often very unpredictable and unrepetitive tasks.

b) Netburst is slower on these apps because the penalties for missing are higher. It's hard to give a simplistic explanation of this but basically the thing is like an assembly line with lots of robots. Instead of the conventional architecture which pre-programs the robots, Netburst had some robots be a little dumber (and thus faster) and had a cue card made for them to know what to do. The issue is if they need to switch cards, they have to go looking. Also, there were more robots in the pipeline (Netburst was basically 20 stages; later it would be expanded to 31 with Prescott), so more cycles were lost if you had somebody up front say "This isn't what I thought was going to happen. Clear everything after me and we'll start again". Hopefully that helps.

c) More GHz means more heat and more power. See, you've got to understand the history a little for this. The P3 never put out obscene power. I think the hottest was about 70 watts and that was only at peak. However, the first P4s came in at more than that. They caused a revolution in heat sink design and power supply design in order to deal with it. Soon they were well over 100 watts and idled much hotter. Meanwhile, Athlons were essentially just P3s brought up to current process so they ran cooler on fewer GHz. Heat is not just an issue for heat sinks. It also affects performance. Routing a chip with heat problems is difficult. You have to make sure you don't make one area too hot. You quickly get to the point where you're limited. You've got this long pipe (Netburst) ready to go but you can't turn that assembly line up to full speed or robot #5's going to burst even though 1 through 4 and 6 through 20 are fine. That's an exaggeration but you get the idea.
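The "clear everything after me" cost in point b) can be put into a back-of-the-envelope formula: effective cycles per instruction = base CPI + (fraction of instructions that are branches) x (misprediction rate) x (cycles lost per flush). Below is a minimal Python sketch of that textbook estimate. Only the 20-stage figure comes from the post. The 10-stage comparison design, the 20% branch fraction, the 5% misprediction rate, and the simplification that a flush costs about one pipeline depth are all assumptions picked for illustration, not measured values.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, flush_penalty_cycles):
    """Textbook estimate of cycles per instruction with branch mispredictions.

    Every mispredicted branch is assumed to waste `flush_penalty_cycles`
    cycles while the pipeline is cleared and refilled.
    """
    return base_cpi + branch_fraction * mispredict_rate * flush_penalty_cycles


if __name__ == "__main__":
    # Assumed workload numbers, chosen only to illustrate the trend.
    base_cpi = 1.0
    branch_fraction = 0.20

    # 10 stages is a hypothetical shorter pipeline; 20 is the figure from the post.
    for depth in (10, 20):
        for rate in (0.02, 0.10):
            # Simplification: a flush costs roughly one full pipeline depth.
            cpi = effective_cpi(base_cpi, branch_fraction, rate, depth)
            print(f"{depth:2d} stages, {rate:.0%} mispredicted: CPI ~ {cpi:.2f}")
```

The gap between the short and the long pipeline grows with the misprediction rate, which is the sense in which unpredictable code hurts a deep pipeline more. A higher clock can still make up for it, so this says nothing by itself about which chip is faster overall.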

P3s (and Athlons, which are like them) and P4s actually have middle chunks of their pipeline which are asynchronous, and that complicates things greatly, but I left that out for simplicity. Also, it's been a few years since I was up on this stuff so I can't really talk about the Core Duo or the new AMD chip. I do know that the Core Duo is a step back to a more P3-like design though and retains very little of Netburst.

davean wrote:This is fine in theory and some several other CPUs use or have used this quite well (IBM's power, Crays MTA). Those didn't happen to use a craptastic and cheap implementation of it though. Which is really the point; Intel's was tacked on, not designed in.

This is false. It was supposed to be SMT from the get go. One of the ways they expected to exploit their much longer pipeline was SMT. The term HT dates back to before the P4 was ever released and early versions internally had it. I used a 1.0ghz P4 which had HT that dated from before the P4's commercial release. Every retail P4 has Hyperthreading, it's just been turned off at the hardware level on most of them. Saying it was never designed from the beginning is completely false!

The real reason it never worked for them was the software. IBM's Power and Cray's MTA didn't run Windows apps designed for a 386, now did they?

Also if I may speak freely Mr. Site admin (if not don't click)
Spoiler:
You may want to avoid obvious opinion fanboy ranting about how craptastic Intel is when the guy clearly is asking for some facts. Also, a little respect for the company that's been the driving force for the industry for like 25 years might do you some good.
Title: It was given by the XKCD moderators to me because they didn't care what I thought (I made some rantings, etc). I care what YOU think, the joke is forums.xkcd doesn't care what I think.
User avatar
mosc
Doesn't care what you think.
 
Posts: 4955
Joined: Fri May 11, 2007 3:03 pm UTC

Re: Dual core vs. hyper threading

Postby wst » Thu Jan 24, 2008 9:44 pm UTC

mosc wrote:
wst wrote:That sounds like an accident waiting to happen. I never trusted any of the old P4's, but mainly because I heard they lacked in floating point calculation abilities, but just sticking HT on like that? Whoops. Lucky I stuck with AMD for that time then. (I needed floating point for AI simulation)

As you're in the know, can you confirm that floating point issue? (Not that I'd touch a P4 now with C2Q's out and Phenom yet to prove itself, and it being outclassed totally by my single core Athlon 64 :P )

Intel's netburst was actually pretty good with floating point apps. Don't know where you're getting that. For something like MP3 encoding, they were always very competitive. It's been one of their strengths, not their weakness. HT was also very late to the party. Most P4s don't even have it and many have it turned off. The problem with the P4 was always the netburst architecture.

A fricken' WALL!


Don't worry, I think my source might have been a bit biased, and I clung to the concept that my ol' Duron was a world beater with a different name, back then.

That wall of info was also interesting. So, summed up, Netburst/HyperThreading was a good idea, but software companies didn't find it economically viable to spend money having a separate thing made to allow it to be used to its full potential.

The reverting back to a more P3-like design isn't too surprising. The K7 and the P4 era was one of mega speeds and drinking power, and if it was inefficient, who cared, it produced more power. K8 gave Intel a wake-up call when their P4s were being thrashed up and down by these things with lower clock speeds and lower power drain. So Intel obviously looked into efficient CPU design, and look where they are now :D

We really need a processor (hyper)thread just to mutter stuff about processors, share knowledge, and ask questions about features.

On a side note: I'm rapidly getting to like Via for their C7. Awesome low-power processor. A friend of mine's getting one- I'll have a play with it and might get a few to do some parallel processing/home fileserver stuff. But they're sooo cute in their little Pico-ITX mobos. (Yes, hide a full computer in a 5.25" bay!) Okay, OT a bit there, whoops. Sorry guys, just excitable a bit :D
"If it looks like a duck, and quacks like a duck, we have at least to consider the possibility that we have a small aquatic bird of the family anatidae on our hands." - Douglas Adams
User avatar
wst
 
Posts: 2582
Joined: Sat Nov 24, 2007 10:06 am UTC

Re: Dual core vs. hyper threading

Postby mosc » Thu Jan 24, 2008 10:35 pm UTC

You can't equate them as the same thing. Netburst was largely a dead end (at least in the x86 world) but hyperthreading (really, SMT with a fancy name) is totally different.

Much of the reason HT didn't work well was the same reason why dual and quad cores are largely sitting idle. The code they are designed to run simply doesn't handle parallel task lines very well.
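One standard way to see why extra cores (or an extra hardware thread) sit idle on mostly serial code is Amdahl's law: if only a fraction p of the work can run in parallel, the best possible speedup on n cores is 1 / ((1 - p) + p / n). The Python sketch below just evaluates that formula. The parallel fractions in the loop are made-up examples, not measurements of any real application.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Upper bound on speedup from Amdahl's law.

    parallel_fraction: share of the run time that can be spread across cores.
    cores: number of cores (or hardware threads) available.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical parallel fractions, for illustration only.
    for p in (0.10, 0.50, 0.90):
        for n in (2, 4):
            print(f"{p:.0%} parallel on {n} cores: at most {amdahl_speedup(p, n):.2f}x")
```

This is an idealized bound: it ignores synchronization and memory contention, so real gains on code that was never written with threads in mind are lower still.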

You already see SMT used effectively in many other chips (some of which were already mentioned) and I don't think it's going away but in the x86 world, the code is much MUCH slower to change. However, once dual and quad cores become commonplace, you may see SMT re-appear.

On the AMD success during that era, I don't credit AMD with much of it personally. Intel's own P3s outperformed their P4s early on, even with the P4s' healthy 40% clock advantage. It's also important to recognize the tremendous hole Intel backed itself into with its decision to support exclusively RDRAM and not DDR memory for over a year. This placed early P4s further behind in power consumption and performance/dollar.

Intel got very arrogant with their transition from P3 to P4 in a way they never did before (and haven't done since). They heavily limited third parties from making motherboards, supported only one type of memory, put out the most radically different chip since the first Pentium, and set their price points through the ceiling. AMD meanwhile basically kept scaling the same P3 clone while phasing out its own mobo chipsets in favor of a more competitive third party market. To me, it was more a case of the old Intel mentality beating out the new one.
Last edited by mosc on Thu Jan 24, 2008 11:13 pm UTC, edited 1 time in total.
Title: It was given by the XKCD moderators to me because they didn't care what I thought (I made some rantings, etc). I care what YOU think, the joke is forums.xkcd doesn't care what I think.
User avatar
mosc
Doesn't care what you think.
 
Posts: 4955
Joined: Fri May 11, 2007 3:03 pm UTC

Re: Dual core vs. hyper threading

Postby wst » Thu Jan 24, 2008 10:54 pm UTC

mosc wrote:You can't equate them as the same thing. Netburst was largely a dead end (at least in the x86 world) but hyperthreading (really, SMT with a fancy name) is totally different.


NetBurst not HT. Whoops. That wall really screwed up my mind and somewhere it got them mixed into one... I'll re-read that post again, very slowly, and re-comment/edit this post when I understand that, before progressing with other comments. Dx

EDIT: I get it! Netburst allowed higher clock speeds, but this made more heat, and was not good for unpredictable stuff. This meant Intel had to be careful with where they put components so they didn't fry the Netburst.

Would floating point come under 'unpredictable'?
Last edited by wst on Fri Jan 25, 2008 6:49 pm UTC, edited 2 times in total.
"If it looks like a duck, and quacks like a duck, we have at least to consider the possibility that we have a small aquatic bird of the family anatidae on our hands." - Douglas Adams
User avatar
wst
 
Posts: 2582
Joined: Sat Nov 24, 2007 10:06 am UTC
