DLD-LLC Discussion Forums
Subject: MultiCore Computers and ZEMAX
BruceTruax
Super Moderator
*******


Posts: 54
Registered: 13-5-2009
[*] posted on 13-5-2009 at 06:48 AM
MultiCore Computers and ZEMAX


I just purchased a dual quad-core Mac Pro because I have a non-sequential system to design and I needed the extra speed. The new computer is a 2.23GHz system with 8GB RAM and dual quad-core Nehalem processors. I have set up the computer so that I can either boot directly into Windows (WinXP SP3 32-bit) or boot into Mac OS X and run Windows through Parallels. I figured that for ultimate speed I would want to boot directly into Windows, but that for most work I would use Parallels.

These new processors from Intel support hyperthreading, so each of the 8 cores can run two threads, supposedly quite efficiently. When you boot directly into either Windows or OS X, the operating system thinks you have 16 total processors. When I first saw this I figured I would be spending a lot of time booted into Windows, because at this time Parallels only supports 8 CPUs.

I ran an experiment tracing 10,000,000 rays in a non-sequential system. I did the experiment using three different configurations:

1. Booted natively into Windows using Boot Camp and assigning all 16 CPUs to the ray trace

2. Booted natively into Windows using Boot Camp and assigning 8 of 16 CPUs to the ray trace

3. Booted into Mac OS X running Windows XP in virtualization mode under Parallels. Parallels only supports 8 CPUs at this time, so the ZEMAX test was run assigning all 8 CPUs to the ray trace


The Results

Windows Native 16 CPUs: 102 seconds
Windows Native 8 CPUs: 46.6 seconds
Windows under Parallels virtualization 8 CPUs: 59.4 seconds
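
To put the three timings on a common footing, here is a minimal Python sketch that expresses each run as a percentage slowdown relative to the native 8-CPU run. The timings are the ones reported above; the dictionary labels and the `slowdown_percent` helper are hypothetical names chosen for illustration.

```python
# Timings reported above, in seconds, for tracing 10,000,000 rays.
timings = {
    "native, 16 CPUs": 102.0,
    "native, 8 CPUs": 46.6,
    "Parallels, 8 CPUs": 59.4,
}


def slowdown_percent(seconds, baseline):
    """Extra run time relative to the baseline, as a percentage."""
    return (seconds / baseline - 1.0) * 100.0


# Use the fastest configuration (native, 8 CPUs) as the reference.
baseline = timings["native, 8 CPUs"]
report = {name: slowdown_percent(t, baseline) for name, t in timings.items()}

for name, pct in report.items():
    print(f"{name}: {timings[name]:.1f} s ({pct:+.0f}% vs native 8 CPUs)")
```

By this measure the Parallels run takes about 27% longer than the native 8-CPU run, and the native 16-CPU run takes about 119% longer, i.e. more than twice as long.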

I also tried using the Windows Task Manager to set the affinity of the ZEMAX task to either the first 8 CPUs or every other CPU. Assigning affinity to the first 8 gave the same time as telling ZEMAX to use 8 CPUs. Using every other CPU gave a time of 65 seconds. My guess is that using the first 8 CPUs keeps one logical CPU on each physical core busy, while assigning every other CPU probably uses the 8 hyperthreaded CPUs of one processor. This could explain the performance hit taken by Parallels: during processing, new threads are assigned somewhat randomly to the 8 CPUs as they are issued. That puts Parallels performance somewhere between the optimal CPU allocation (first 8) and the slower one (every other CPU).
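
The two affinity settings can be written down as bitmasks, which is how Windows represents process affinity internally (one bit per logical CPU). The following Python sketch only computes the masks; it does not change any process. It assumes that "every other CPU" starts at CPU 0, which the test above does not specify, and the function and variable names are hypothetical.

```python
def affinity_mask(cpus):
    """Build a bitmask with bit n set for each logical CPU index n."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


LOGICAL_CPUS = 16  # logical CPUs reported by Windows on this machine

first_eight = list(range(8))                   # CPUs 0 through 7
every_other = list(range(0, LOGICAL_CPUS, 2))  # assumed: CPUs 0, 2, ..., 14

first_eight_mask = affinity_mask(first_eight)
every_other_mask = affinity_mask(every_other)

print(f"first 8:     {first_eight_mask:#06x}")
print(f"every other: {every_other_mask:#06x}")
```

Which of these masks lands on one logical CPU per physical core depends on how the operating system numbers the hyperthreaded siblings, so the masks alone do not settle the question; the timings do.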

One more note: the Nehalem architecture provides a "Turbo" mode. If only one of the two hyperthreaded cores is busy and the CPU temperature is acceptable, the processor can raise its clock speed in increments of 133MHz. This could also help when the cores are assigned such that hyperthreading is not used.

The conclusion:

ZEMAX, or possibly Windows, does not use all 16 logical CPUs efficiently for ray tracing. Best performance is obtained when the number of CPUs assigned equals the number of physical cores.

Parallels suffers a performance penalty of about 27% (59.4 seconds versus 46.6 seconds) compared to Windows running under Boot Camp.
BruceTruax
[*] posted on 19-11-2009 at 05:39 PM
Update - Parallels 5


I just installed Parallels 5. Running the same design file as in the May 13 post under Parallels with 8 cores, I now get 49.8 seconds. That is only 3.2 seconds (7%) slower than native Windows.

It appears that Parallels 5 assigns the 8 Windows cores such that one logical CPU is used on each physical core. This is obvious in MenuMeters: when running flat out, every other processor is now at 100% and the other 8 are at 0%. Previously Windows would run on 8 randomly assigned virtual cores, and the cores changed each time a new thread was launched. Often two hyperthreaded cores on the same physical core would be running, and as I noticed when running Windows natively, this tends to slow things down, probably due to conflicts in memory access and FPU access.
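
As a rough illustration of why random placement so often doubles up a physical core, here is a small Python calculation. It is a simplified model, not a description of how Parallels or any scheduler actually works: it assumes the 8 busy threads occupy 8 of the 16 logical CPUs chosen uniformly at random, with each physical core owning exactly 2 logical CPUs. All names are hypothetical.

```python
from math import comb

LOGICAL_CPUS = 16   # 8 physical cores x 2 hyperthreads
PHYSICAL_CORES = 8
BUSY_THREADS = 8


def prob_core_doubled(logical, busy):
    """Probability that both logical CPUs of one given physical core are
    busy when `busy` of `logical` CPUs are picked uniformly at random."""
    return comb(logical - 2, busy - 2) / comb(logical, busy)


p = prob_core_doubled(LOGICAL_CPUS, BUSY_THREADS)

# Expected number of physical cores running two threads at once.
expected_doubled = PHYSICAL_CORES * p

# Probability of the ideal layout: exactly one busy logical CPU per core
# (2 choices of sibling on each of the 8 cores).
p_ideal = 2 ** PHYSICAL_CORES / comb(LOGICAL_CPUS, BUSY_THREADS)

print(f"P(a given core is doubled) = {p:.3f}")
print(f"expected doubled cores     = {expected_doubled:.2f}")
print(f"P(one thread per core)     = {p_ideal:.3f}")
```

Under this model, on average just under 2 of the 8 physical cores carry two threads at any moment, and the ideal one-thread-per-core layout occurs only about 2% of the time. The model says nothing about how large the resulting slowdown is; that comes from the measured timings.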




Bruce Truax
Diffraction Limited Design LLC
Mark
Newbie
*




Posts: 1
Registered: 9-8-2010
[*] posted on 9-8-2010 at 03:26 PM


Hey Bruce,

Tolis Deslis pointed me at this posting. If you have 8 real cores, ZEMAX will set itself to 8 CPUs. You can override this, but as you've found out, it doesn't make things go any faster. Quite the opposite in fact, as you start to run out of resources.

Check out http://www.zemax.com/kb/articles/196/1/Running-ZEMAX-on--a-Multi-CPU-Computer... and http://www.zemax.com/kb/articles/156/1/How-to-Run-ZEMAX-on-an-Intel-based-App... for more info.

- Mark
BruceTruax
[*] posted on 18-8-2010 at 07:19 PM


Mark,

I have seen those articles on the ZEMAX website. They came out after I posted this information, had a few discussions with ZEMAX, and sent them a detailed email with the same information. The articles still do not address the hyperthreading issue.

Bruce




