Message boards :
Aggie The Pew message board
: W/U runtimes
Author |
Message |
Neo Volunteer tester
Joined: 28 Oct 10 Posts: 710 ID: 71509 Credit: 91,178,992 RAC: 0
                   
|
Hey team,
Do you see a speed-up in the per bit timings for each type of w/u, but especially a longer w/u, if you are running a mix, rather than all of your cores on the MEGA port?
My findings:
5 cores running MEGA w/u's : 4.8 ms/bit 256K FFT
compared against
2 cores running MEGA w/u's :
2.8 ms/bit 192K FFT
3.7 ms/bit 256K FFT with
4 cores running PPSeLow w/u's: .184 ms/bit (41 seconds per w/u)
I am running a 1055T 6x o/c'd to 3.19 GHz, but also running DDR2-800, so I am thinking I might have a memory bottleneck somewhere when running 5 or 6 MEGA w/u's at a time. Anyone get similar results?
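For anyone who wants to compare their own numbers the same way, here is a minimal sketch of the arithmetic using the 256K FFT timings above. The function names are made up for illustration, and it only compares the two 256K figures so the comparison is like for like.

```python
# Timings quoted in the post (ms per bit, 256K FFT).
ALL_MEGA_MS_PER_BIT = 4.8  # 5 cores, all running MEGA
MIXED_MS_PER_BIT = 3.7     # 2 cores on MEGA, 4 cores on PPSeLow


def speedup(before_ms: float, after_ms: float) -> float:
    """Ratio of per-bit times: how many times faster a single task runs."""
    return before_ms / after_ms


def aggregate_rate(tasks: int, ms_per_bit: float) -> float:
    """Combined bits per millisecond for `tasks` tasks at the same per-bit time."""
    return tasks / ms_per_bit


per_task = speedup(ALL_MEGA_MS_PER_BIT, MIXED_MS_PER_BIT)
print(f"per-task speed-up in the mix: {per_task:.2f}x")
print(f"MEGA rate, 5 cores all MEGA:  {aggregate_rate(5, ALL_MEGA_MS_PER_BIT):.2f} bits/ms")
print(f"MEGA rate, 2 cores in a mix:  {aggregate_rate(2, MIXED_MS_PER_BIT):.2f} bits/ms")
```

Each MEGA task gets faster in the mix, but total MEGA output still drops because fewer cores are on it. Whether the mix pays off overall depends on how much you value the PPSeLow work done on the other cores.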
Neo
| |
|
|
Hey team,
Do you see a speed-up in the per bit timings for each type of w/u, but especially a longer w/u, if you are running a mix, rather than all of your cores on the MEGA port?
Interesting question. I was asking myself a similar question when doubling up on MEGA wu's per core.
Since the team has so valiantly held our position on the MEGA port for the time being, I think I'll run some tests. I'll let you know my findings.
How much RAM are you running Neo?
--C | |
|
Neo Volunteer tester
Joined: 28 Oct 10 Posts: 710 ID: 71509 Credit: 91,178,992 RAC: 0
                   
|
I have 8 gigs of ram.
4 gigs of DDR-2 PC-8500
4 gigs of DDR-2 PC-6400
All eight run at 800 MHz (or 400 MHz x 2).
I have an AMD 1055T x6 @ 2.8 GHz;
Due to the core multiplier being locked on this CPU (non Black Edition), I had to increase the HT ref clock from 200 to 228 MHz. I downclocked my memory to 780 MHz.
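A rough sketch of how the reference clock bump gets to the 3.19 GHz quoted earlier in the thread. It assumes the locked multiplier is simply the 2.8 GHz core clock divided by the 200 MHz reference clock; the names are illustrative only.

```python
STOCK_REF_MHZ = 200    # reference clock before the change
STOCK_CORE_MHZ = 2800  # 1055T core clock at that reference clock

# Assumption: with a locked multiplier, core clock = reference clock * multiplier.
MULTIPLIER = STOCK_CORE_MHZ / STOCK_REF_MHZ


def core_clock_mhz(ref_mhz: float) -> float:
    """Core clock for a given reference clock with the multiplier held fixed."""
    return ref_mhz * MULTIPLIER


overclocked = core_clock_mhz(228)
print(f"multiplier: {MULTIPLIER:g}")
print(f"core clock at 228 MHz ref: {overclocked:.0f} MHz ({overclocked / 1000:.2f} GHz)")
```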
My nvidia GT430, however, is really burning up. I have it downclocked all the way to 525 MHz and my temps are 68 degrees. Is it possible that the MB is delivering more power to my graphics card now? I used to be able to run it at stock 700 MHz and it would sit at 68 while doing a manual tpsieve...
Neo | |
|
|
Can you d/c the memory on your GT430 instead? There was some other thread on the main boards that recommended doing that to save on power and heat for PG, as apparently the memory clock isn't as important here (advice does not apply if you run other BOINC projects, I guess).
--Gary | |
|
|
I just completed some quick tests.
- 1 Mega wu per core vs. 2 Mega wu's per core w/o GPU
- 1 Mega wu per core vs. 2 Mega wu's per core w/ GPU
- 1 Mega wu per core + 1 PPSElow wu per core w/o GPU
- 1 Mega wu per core + 1 PPSElow wu per core w/ GPU
- 2 Mega wu's per core + 2 PPSElow wu's per core w/o GPU
- 2 Mega wu's per core + 2 PPSElow wu's per core w/ GPU
The matrix is a bit much to post, but here are my conclusions:
To address Neo's question first.
- Without the GPU, irrespective of 1 or 2 wu's per core, or whether a mix of Mega and Low wu's or all Mega wu's, my results were almost linear. In all cases, there was a slight advantage to running 2 wu's per core. There was no bottleneck for me. Note: in all tests I ran a 50-50 ratio of Mega and Low wu's. Also, time per bit was consistent.
- Add in the GPU. There were spikes in time per bit on Mega wu's when applying 2 wu's per core, as much as 50%. I hit my bottleneck. When applying 2 Mega and 2 Low wu's per core, however, the spikes in time per bit almost disappeared.
In my case, if I'm using the GPU and all wu's are dedicated to Mega, then I'm better off with 1 wu per core. However, if I have a mix of Mega and Low wu's (50-50), or in any other scenario I tested, I would use 2 wu's per core. I didn't feel like doing the arithmetic, so I can't quantify the advantage of 2 wu's per core vs. 1 wu per core. The advantage was very slight for Mega wu's and better for Low wu's.
I'm running 8 GB DDR3 downclocked to 1600.
I'd be interested to hear of any other observations of the team with their specific systems.
Hope everyone is having a nice weekend. For the Americans, enjoy the 4th. For the Brits, enjoy the 4th! :)
| |
|
Neo Volunteer tester
Joined: 28 Oct 10 Posts: 710 ID: 71509 Credit: 91,178,992 RAC: 0
                   
|
Can you d/c the memory on your GT430 instead? There was some other thread on the main boards that recommended doing that to save on power and heat for PG, as apparently the memory clock isn't as important here (advice does not apply if you run other BOINC projects, I guess).
--Gary
Yes, I have heard the same thing. Memory is downclocked all the way to 525 MHz.
What is odd is that my card will go from 74 degrees down to 60, and then back up to 74, about every 20 minutes or so. My tpsieve P/sec rate stays the same (I'm sieving on my GPU via tpsieve). So.... | |
|
Neo Volunteer tester
Joined: 28 Oct 10 Posts: 710 ID: 71509 Credit: 91,178,992 RAC: 0
                   
|
Caravaggio,
I couldn't see your systems under your profile because they're hidden. However, assuming you have 6 or 8 cores, I'm talking about running only one w/u per core. So, I guess the comparison would be:
All cores running 1 MEGA w/u each
vs.
All cores active, but with two cores running 1 MEGA w/u each and the remaining cores running 1 PPSeLow w/u each
Neo | |
|
|
Neo, sorry I didn't directly answer your question. Yes, I did see an improvement in speed on the Mega wu's when I dropped some Mega wu's and replaced them with PPSElow wu's. This held both with 1 wu per core and with 2.
I have an i7 2600K running at 4 GHz. | |
|
|
Not quite in the league of Caravaggio's robust and very in-depth testing, mainly because I stumbled on this by accident when the 27 port went down, but when I run all 4 cores on port 27 I get an average of 8 hours per WU. When I run 3 cores on port 27 and 1 core on PPSElow, I get an average of 6 hours per WU.
The way I see it, running 3 cores (port 27) produces the same amount of crunching as running 4 cores, plus I get a core running PPSElow stuff.
I hope my math is good:
4 cores at 8 hours per WU, after 24 hours = 12 WU's
3 cores at 6 hours per WU, after 24 hours = 12 WU's
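The math above checks out. Here is a small sketch of it for anyone who wants to plug in their own core counts and runtimes. The helper name is made up, and it assumes every core finishes WUs back to back at the average runtime.

```python
def wus_per_day(cores: int, hours_per_wu: float, hours: float = 24.0) -> float:
    """WUs completed in `hours` when each core runs one WU at a time."""
    return cores * hours / hours_per_wu


all_port_27 = wus_per_day(cores=4, hours_per_wu=8)  # all 4 cores on port 27
mixed = wus_per_day(cores=3, hours_per_wu=6)        # 3 cores on port 27, 1 on PPSElow

print(f"4 cores at 8 h/WU: {all_port_27:g} WU's per day")
print(f"3 cores at 6 h/WU: {mixed:g} WU's per day")
```

Both cases come out to the same port 27 output, so the fourth core's PPSElow work is extra on top.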
____________
Welcome to Holland
| |
|
|
I find this interesting. I should play a little myself.
As my main box has plenty of RAM (8 GB DDR3) and a quad core, I'm running 4 cores on the 27 port, and nothing else.
My other two rigs are both 1090T CPUs with 6 cores, but I'm also running my GPU on them for some sieving. Those rigs each have only 2 GB of RAM though, so I could have some problems there.
I may have to start playing with mixing what ports I'm on to see if I can get the same type of time improvements.
____________
| |
|