Message boards :
Number crunching :
GPU tasks faster when PPS runs
Bur Volunteer tester
Joined: 25 Feb 20 Posts: 332 ID: 1241833 Credit: 22,611,276 RAC: 0
|
I'm running GFN17-mega on a 1660 Super. Until now the CPU (i7-4790K) was running Cullen, Woodall or SoB multithreaded (4 threads), and GFN throughput was 200 tasks per 24 h.
Now, in preparation for the challenge, I switched the CPU to PPS-mega and GFN throughput increased to 215 per 24 h.
HT is enabled in the BIOS, BOINC is set to 50% CPU, and I assigned cores 0, 2, 4 and 6 to the LLR2 task. The GFN task was assigned all cores.
Why is the GPU task so strongly affected by the choice of CPU subproject? All of them use 4 threads. Is it the smaller FFT size of PPS-mega? More importantly, can anything be done to minimize the impact of the CPU task on the GPU?
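For reference, the core assignment described above can be sketched in Python. This is only an illustration: `one_thread_per_core` and `pin_process` are hypothetical helper names, the numbering assumes that sibling hyperthreads are consecutive (0/1, 2/3, ...), which is what the 0, 2, 4, 6 choice implies, and `os.sched_setaffinity` is not available on every platform.

```python
import os


def one_thread_per_core(n_logical: int, threads_per_core: int = 2) -> list[int]:
    """Return one logical CPU index per physical core.

    Assumption: sibling hyperthreads are numbered consecutively (0/1, 2/3, ...).
    Some systems number them differently, so check the topology first.
    """
    return list(range(0, n_logical, threads_per_core))


def pin_process(pid: int, cpus: list[int]) -> bool:
    """Pin a process to the given logical CPUs.

    Returns False when the platform does not expose os.sched_setaffinity.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(pid, set(cpus))
    return True


# 8 logical CPUs (4 cores with HT) -> one logical CPU per physical core
llr_cpus = one_thread_per_core(8)
print(llr_cpus)
```

Calling `pin_process(pid, llr_cpus)` with the PID of the LLR2 task would then restrict it to those four logical CPUs on platforms that support it.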
____________
Primes: 1281979 & 12+8+1979 & 1+2+8+1+9+7+9 & 1^2+2^2+8^2+1^2+9^2+7^2+9^2 & 12*8+19*79 & 12^8-1979 & 1281979 + 4 (cousin prime)
|
Yves Gallot Volunteer developer Project scientist
Joined: 19 Aug 12 Posts: 644 ID: 164101 Credit: 305,010,093 RAC: 0
|
If the FFT data size is larger than the L3 cache, the cache is continually flushed.
The GPU driver still has to run on the CPU (to control execution of the OpenCL code), and rather than reading its data from the L3 cache it has to compete with LLR for access to main memory.
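A rough way to picture the size comparison, as a sketch only. It assumes 8 bytes per FFT element (one double-precision value) and ignores any extra tables or buffers LLR keeps. The two FFT lengths are hypothetical examples of a small and a large transform, not measured values for PPS-mega or SoB.

```python
BYTES_PER_ELEMENT = 8  # assumption: one double-precision value per FFT element
MIB = 1024 * 1024


def fft_bytes(fft_len: int) -> int:
    """Approximate size of the FFT data in bytes."""
    return fft_len * BYTES_PER_ELEMENT


def fits_in_l3(fft_len: int, l3_mib: int) -> bool:
    """True if the FFT data alone would fit in an L3 cache of l3_mib MiB."""
    return fft_bytes(fft_len) <= l3_mib * MIB


L3_MIB = 8  # i7-4790K, from the list in this thread

# Hypothetical FFT lengths, for illustration only
for label, fft_len in [("small FFT (256K)", 256 * 1024),
                       ("large FFT (2560K)", 2560 * 1024)]:
    size_mib = fft_bytes(fft_len) / MIB
    print(f"{label}: {size_mib:.0f} MiB, fits in L3: {fits_in_l3(fft_len, L3_MIB)}")
```

With these assumed numbers the small transform stays inside the 8 MB cache and the large one does not, which is the situation described above.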
|
Bur Volunteer tester
Joined: 25 Feb 20 Posts: 332 ID: 1241833 Credit: 22,611,276 RAC: 0
|
OK, thanks. Throughput has now increased to 235 per 24 h, versus 200 before.
But there's nothing that can be done besides having a larger L3 cache, I guess?
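To put the figures quoted in this thread side by side, here is a small sketch of the arithmetic. The helper names are made up for illustration. The inputs are the 200, 215 and 235 tasks per 24 h reported above.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def relative_gain_percent(new: int, old: int) -> float:
    """Percentage change in throughput going from old to new."""
    return 100 * (new - old) / old


def seconds_per_task(tasks_per_day: int) -> float:
    """Average wall-clock time per GPU task at a given daily throughput."""
    return SECONDS_PER_DAY / tasks_per_day


baseline = 200  # GFN tasks per 24 h with Cullen/Woodall/SoB on the CPU
for tasks in (215, 235):  # figures reported with PPS-mega on the CPU
    print(f"{tasks}/day: {relative_gain_percent(tasks, baseline):+.1f}% "
          f"({seconds_per_task(tasks):.0f} s per task)")
```

Going from 200 to 235 tasks per day is a gain of 17.5%.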
|
Yves Gallot Volunteer developer Project scientist
Joined: 19 Aug 12 Posts: 644 ID: 164101 Credit: 305,010,093 RAC: 0
|
But there's nothing that can be done besides having a larger L3 cache, I guess?
Yes.
i7-4790K 4 cores, L3 8 MB
i5-10600K 6 cores, L3 12 MB
i7-10700K 8 cores, L3 16 MB
Ryzen 5 3600X/5600X 6 cores, L3 32 MB
Ryzen 7 3800X/5800X 8 cores, L3 32 MB
|
|
But there's nothing that can be done besides having a larger L3 cache, I guess?
Yes.
i7-4790K 4 cores, L3 8 MB
i5-10600K 6 cores, L3 12 MB
i7-10700K 8 cores, L3 16 MB
Ryzen 5 3600X/5600X 6 cores, L3 32 MB
Ryzen 7 3800X/5800X 8 cores, L3 32 MB
Your Ryzen numbers aren't right from a performance perspective:
3600X - L3 2x16 MB
5600X - L3 32 MB
3800X - L3 2x16 MB
5800X - L3 32 MB
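The distinction can be made concrete with a small sketch that separates total L3 from the largest single L3 slice. The `Cpu` class and its field names are invented for this example. The core counts and cache sizes are the ones listed in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    name: str
    cores: int
    l3_slices_mb: tuple[int, ...]  # one entry per separate L3 slice

    @property
    def total_l3_mb(self) -> int:
        return sum(self.l3_slices_mb)

    @property
    def largest_slice_mb(self) -> int:
        return max(self.l3_slices_mb)


# Figures as quoted in the posts above
CPUS = [
    Cpu("i7-4790K", 4, (8,)),
    Cpu("i5-10600K", 6, (12,)),
    Cpu("i7-10700K", 8, (16,)),
    Cpu("Ryzen 5 3600X", 6, (16, 16)),
    Cpu("Ryzen 5 5600X", 6, (32,)),
    Cpu("Ryzen 7 3800X", 8, (16, 16)),
    Cpu("Ryzen 7 5800X", 8, (32,)),
]

for cpu in CPUS:
    print(f"{cpu.name}: total L3 {cpu.total_l3_mb} MB, "
          f"largest slice {cpu.largest_slice_mb} MB")
```

The 3600X and 5600X both total 32 MB, but the largest single slice is 16 MB on the former and 32 MB on the latter.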
|
Bur Volunteer tester
Joined: 25 Feb 20 Posts: 332 ID: 1241833 Credit: 22,611,276 RAC: 0
|
Unfortunately it seems the i7-4790K is about the fastest you can go with the LGA 1150 socket. And I don't feel like replacing the mainboard...
|