BruceTruax
Super Moderator
     
Posts: 54
Registered: 13-5-2009
|
posted on 13-5-2009 at 06:48 AM |
|
|
MultiCore Computers and ZEMAX
I just purchased a dual quad-core Mac Pro because I have a non-sequential system to design and I needed the extra speed. The new computer is a
2.23GHz system with 8GB RAM and dual quad-core Nehalem processors. I have set up the computer so that I can either boot directly into Windows (WinXP
SP3 32-bit) or boot into Mac OS X and run Windows through Parallels. I figured that for ultimate speed I would want to boot directly into Windows but that for
most work I would use Parallels.
These new processors from Intel support hyperthreading, so each of the 8 cores can run two threads, supposedly quite efficiently. When you boot
directly into either Windows or OS X, the operating system thinks you have 16 total processors. When I first saw this I figured I would be spending a
lot of time booted into Windows because at this time Parallels only supports 8 CPUs.
I ran an experiment tracing 10,000,000 rays in a non-sequential system. I did the experiment using three different configurations:
1. Booted natively into Windows using Boot Camp and assigning all 16 CPUs to the ray trace
2. Booted natively into Windows using Boot Camp and assigning 8 of 16 CPUs to the ray trace
3. Booted into Mac OS X running Windows XP in virtualization mode under Parallels. Parallels only supports 8 CPUs at this time, so the ZEMAX test
was run assigning all 8 CPUs to the ray trace
The Results
Windows Native 16 CPUs: 102 seconds
Windows Native 8 CPUs: 46.6 seconds
Windows under Parallels virtualization 8 CPUs: 59.4 seconds
I also tried using the Task Manager in Windows to set the affinity of the ZEMAX task to either the first 8 CPUs or every other CPU. Assigning
affinity to the first 8 resulted in the same time as telling ZEMAX to use 8 CPUs. Using every other CPU resulted in a time of 65 seconds. I would
guess that using the first 8 CPUs keeps one logical CPU busy on each physical core, whereas assigning every other CPU probably uses all 8 hyperthreaded
CPUs of a single processor. This could explain the performance hit taken by Parallels: during processing, as new threads are issued, they are
assigned somewhat randomly to the 8 CPUs. This puts Parallels performance somewhere between the optimal CPU allocation (first 8) and the slowest
(every other CPU).
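For anyone who wants to repeat the affinity test, the two selections can be written as bitmasks, which is the form affinity settings are commonly expressed in. This is only a sketch: it assumes 16 logical CPUs numbered 0 to 15 in the order Task Manager lists them, and the function name is made up for illustration. How those numbers map onto physical cores depends on the OS and firmware and is not established by the test above.

```python
# Sketch only: assumes 16 logical CPUs numbered 0..15 (an assumption,
# matching the 16 CPUs Task Manager shows on this machine).
LOGICAL_CPUS = 16


def affinity_mask(cpu_indices):
    """Return a bitmask with one bit set per selected logical CPU."""
    mask = 0
    for index in cpu_indices:
        mask |= 1 << index
    return mask


first_eight = list(range(8))                   # CPUs 0 through 7
every_other = list(range(0, LOGICAL_CPUS, 2))  # CPUs 0, 2, 4, ..., 14

first_eight_mask = affinity_mask(first_eight)
every_other_mask = affinity_mask(every_other)

print(f"first 8 CPUs:    {first_eight_mask:#06x}")  # 0x00ff
print(f"every other CPU: {every_other_mask:#06x}")  # 0x5555
```

Both masks select 8 of the 16 logical CPUs, so any difference in run time comes from which logical CPUs are chosen, not from how many.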
One more note: The Nehalem architecture provides for a "Turbo" mode. If only one of the two hyperthreaded cores is busy and the CPU temperature is
acceptable, the processor can add clock speed in increments of 133MHz. This could also help when the threads are assigned to the cores such that hyperthreading is not used.
The conclusion:
ZEMAX, or possibly Windows, does not efficiently use all 16 processors for ray tracing. Best performance is obtained when the number of CPUs equals
the number of physical cores.
Parallels suffers roughly a 27% performance penalty (59.4 versus 46.6 seconds) when compared to Windows running under Boot Camp.
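The comparison between configurations is simple arithmetic on the three timings reported above. A minimal sketch, taking the native 8-CPU run as the baseline (the variable and function names are just for illustration):

```python
# Timings in seconds, copied from the results above.
timings_s = {
    "Windows native, 16 CPUs": 102.0,
    "Windows native, 8 CPUs": 46.6,
    "Windows under Parallels, 8 CPUs": 59.4,
}

# The fastest configuration is used as the reference point.
baseline = timings_s["Windows native, 8 CPUs"]


def slowdown_percent(seconds, reference):
    """Extra run time relative to the reference, in percent."""
    return (seconds / reference - 1.0) * 100.0


for label, seconds in timings_s.items():
    extra = slowdown_percent(seconds, baseline)
    print(f"{label}: {seconds:.1f} s ({extra:+.0f}% vs. native 8 CPUs)")
```

With these numbers the Parallels run comes out about 27% slower than the native 8-CPU run, and the native 16-CPU run takes a bit more than twice as long.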
|
|
|
BruceTruax
Super Moderator
     
Posts: 54
Registered: 13-5-2009
|
posted on 19-11-2009 at 05:39 PM |
|
|
Update - Parallels 5
I just installed Parallels 5. Running the same design file as in the May 13 post under Parallels with 8 cores, I now get 49.8 seconds, only 3.2
seconds (7%) slower than native Windows.
It appears that Parallels 5 assigns the 8 Windows cores such that one hyperthreaded core is used on each physical core. This is obvious in MenuMeters, where now, when
running flat out, every other processor is at 100% and the other 8 are at 0%. Previously Windows would run on 8 randomly assigned virtual cores, and
the cores changed each time a new thread was launched. Often two hyperthreaded cores on the same physical core would be running and, as I noticed when
running Windows natively, this tends to slow things down, probably due to conflicts in memory access and FPU access.
Bruce Truax
Diffraction Limited Design LLC
|
|
|
Mark
Newbie
Posts: 1
Registered: 9-8-2010
|
posted on 9-8-2010 at 03:26 PM |
|
|
Hey Bruce,
Tolis Deslis pointed me at this posting. If you have 8 real cores, ZEMAX will set itself to 8 CPUs. You can override this, but as you've found out,
it doesn't make things go any faster: quite the opposite in fact, as you start to run out of resources.
Check out http://www.zemax.com/kb/articles/196/1/Running-ZEMAX-on--a-Multi-CPU-Computer... and http://www.zemax.com/kb/articles/156/1/How-to-Run-ZEMAX-on-an-Intel-based-App... for more info.
- Mark
|
|
|
BruceTruax
Super Moderator
     
Posts: 54
Registered: 13-5-2009
|
posted on 18-8-2010 at 07:19 PM |
|
|
Mark,
I have seen those articles on the ZEMAX website. The articles came out after I posted this information, had a few discussions with ZEMAX, and sent
them a detailed email with the same information. The articles still do not address the hyperthreading issue.
Bruce
Bruce Truax
Diffraction Limited Design LLC
|
|
|