PrimeGrid

Message boards : The Riesel Problem : Optimal number of instances on i5 and i7 for Riesel problem

Author Message
jMac
Joined: 23 Jul 16
Posts: 23
ID: 452768
Credit: 151,907,713
RAC: 0
Message 109084 - Posted: 25 Jul 2017 | 17:36:41 UTC

I have several i5s and i7s running the Riesel problem when they aren't doing something else. What is the optimum number of threads to run on the i5 and on the i7 for maximum throughput? (I generally have four running on each.)

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Joined: 21 Jan 10
Posts: 13513
ID: 53948
Credit: 237,712,514
RAC: 0
Message 109085 - Posted: 25 Jul 2017 | 17:57:15 UTC - in response to Message 109084.

I have several i5s and i7s running the Riesel problem when they aren't doing something else. What is the optimum number of threads to run on the i5 and on the i7 for maximum throughput? (I generally have four running on each.)


If you're not also running a GPU in those computers, my advice would be as follows:

1) On the i5s, run all 4 cores
2) On the i7s, run 50% (i.e., 4 cores), *OR* disable hyperthreading in the BIOS and run 100% (also 4 cores).
3) Use app_config to run TRP in multi-threaded mode, using 4 threads for the task.

You will then be running a single TRP task on all 4 cores (all four "full" cores in the case of the i7s). That's the most efficient way to run TRP on those computers.

app_config.xml should look like this:

<app_config>
  <app>
    <name>llrTRP</name>
    <fraction_done_exact/>
    <max_concurrent>1</max_concurrent>
  </app>
  <app_version>
    <app_name>llrTRP</app_name>
    <cmdline>-t 4</cmdline>
    <avg_ncpus>4</avg_ncpus>
  </app_version>
</app_config>


The app_config.xml file should be in C:\ProgramData\BOINC\projects\www.primegrid.com\
assuming you're running a normal Windows installation.

If you ARE also running a GPU, you *might* want to leave a CPU core free to service the GPU. You'll get more done on the GPU, but less on the CPU. Note that in this scenario you might want to leave hyperthreading turned on on the i7s and use the hyperthreads to service the GPU. Leave HT on and set the number of CPUs to 50%.

To leave a (full) core free for a GPU when there's no hyperthreading (either the i5s, or if you have turned off hyperthreading on the i7s), set the number of CPUs to 75% and change "4" to "3" in the two lines near the end of the app_config.xml file.
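
For anyone who would rather generate the file than edit it by hand, here is a minimal Python sketch that builds the same app_config.xml for a chosen thread count. The function name `build_app_config` and its parameters are made up for illustration and are not part of BOINC; the element names are the ones shown above. It only returns the XML text, so saving it into the project directory is left to you.

```python
import xml.etree.ElementTree as ET


def build_app_config(threads: int, app_name: str = "llrTRP") -> str:
    """Return app_config.xml text that runs one task across `threads` threads.

    Hypothetical helper: it mirrors the hand-written file, nothing more.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")

    root = ET.Element("app_config")

    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "fraction_done_exact")
    ET.SubElement(app, "max_concurrent").text = "1"

    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    # These two values must change together (for example 4 -> 3 to free a core).
    ET.SubElement(version, "cmdline").text = f"-t {threads}"
    ET.SubElement(version, "avg_ncpus").text = str(threads)

    ET.indent(root, space="  ")  # pretty-print; needs Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Three threads: the "leave one core free for the GPU" case.
    print(build_app_config(3))
```

Keeping the thread count in one parameter avoids the easy mistake of changing `-t` but forgetting `avg_ncpus`, or the other way round.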
____________
My lucky number is 75898^524288+1

Rafael
Volunteer tester
Joined: 22 Oct 14
Posts: 885
ID: 370496
Credit: 334,085,845
RAC: 0
Message 109086 - Posted: 25 Jul 2017 | 17:58:53 UTC - in response to Message 109084.
Last modified: 25 Jul 2017 | 18:01:01 UTC

I have several i5s and i7s running the Riesel problem when they aren't doing something else. What is the optimum number of threads to run on the i5 and on the i7 for maximum throughput? (I generally have four running on each.)

That's... not an easy question. There are many things that affect performance, so just knowing whether it's an i5 or an i7 isn't really helpful. And while we can look at your PCs to see their clocks and generations, that gives us no info on RAM, which is also a major factor. Even then, it's still hard to guess performance from specs alone; that's how difficult a question this is.

If you want the most precise answer possible, download the beta of Prime95 29.2 and use its benchmark feature. Please look at the screenshot below:

http://i.imgur.com/aCwQ5lJ.png
*EDIT: the "number of workers to benchmark" field should be 1,2,4, not 1,4. Oops, I made a typo in the screenshot.

On the right are the benchmark settings you should use (for Riesel, 864K FFT). On the left are the results of a quick run I did on one of my PCs. The first number shows 4 cores crunching a single WU. The second is for 2 cores per unit, with 2 units running at the same time. The last is for each core running its own WU. On my PC, 4 cores on 1 unit seems to be the best; on yours, it might be 2 units with 2 cores each, or maybe even 1 core per unit. Who knows?

mackerel
Project donor
Volunteer tester
Joined: 2 Oct 08
Posts: 2460
ID: 29980
Credit: 442,802,854
RAC: 0
Message 109087 - Posted: 25 Jul 2017 | 18:07:55 UTC

There are good answers already. While not directly relevant here, this may matter in the near future as CPUs get ever more cores: there seems to be some inefficiency in running more than around 8 cores, and in that situation using the hyperthreads beyond the real cores seemed to help. The safest option is to try it all and see how it actually responds.

The Prime95 benchmark is a good indicator, but I suspect things can run a little differently in LLR, so don't rely purely on P95.

chip
Joined: 12 Apr 11
Posts: 136
ID: 94709
Credit: 231,058,792
RAC: 0
Message 122127 - Posted: 6 Nov 2018 | 11:30:41 UTC

Multithread results:
CPU time ~= Real computation time * Core count
Run time = ... WTF?

recoil44
Project donor
Joined: 20 Dec 15
Posts: 167
ID: 433037
Credit: 411,347,492
RAC: 0
Message 122157 - Posted: 6 Nov 2018 | 19:23:54 UTC - in response to Message 122127.

You're seeing stacked run times because more than 1 WU was grabbed at the start of a multi-threading run. This is a known BOINC issue, not PG's fault.

If you're trying to get an average run time, either ignore the obviously high ones or work through them and subtract the times of the ones done before.
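
As a rough sketch of the "subtract" approach in Python: this assumes the tasks from one batch ran one after another and that each reported run time counts from the moment the batch started, which is my reading of the advice and not documented BOINC behaviour. The function name `unstack_run_times` and the sample numbers are made up purely for illustration.

```python
def unstack_run_times(reported):
    """Convert stacked (cumulative) run times into per-task run times.

    Assumption: every task in the batch had its clock started at the same
    moment, and the tasks then ran one after another.
    """
    per_task = []
    previous = 0
    for total in sorted(reported):
        # Each task's own time is its reported total minus the total of
        # the task that finished just before it.
        per_task.append(total - previous)
        previous = total
    return per_task


if __name__ == "__main__":
    # Hypothetical reported run times in seconds, not real results.
    stacked = [15050, 5000, 10100]
    times = unstack_run_times(stacked)
    print(times)
    print("average:", sum(times) / len(times))
```

If the batch did not actually run strictly back to back, the differences will be off, so it is worth checking them against a task that was downloaded on its own.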

chip
Joined: 12 Apr 11
Posts: 136
ID: 94709
Credit: 231,058,792
RAC: 0
Message 122189 - Posted: 7 Nov 2018 | 8:47:07 UTC
Last modified: 7 Nov 2018 | 9:00:26 UTC

w/o HT:

Timings for 960K all-complex FFT length (6 cores, 1 worker): 0.61 ms. Throughput: 1637.29 iter/sec.
Timings for 960K all-complex FFT length (6 cores, 2 workers): 1.25, 1.24 ms. Throughput: 1610.60 iter/sec.
Timings for 960K all-complex FFT length (6 cores, 3 workers): 2.26, 2.25, 2.25 ms. Throughput: 1332.79 iter/sec.
Timings for 960K all-complex FFT length (6 cores, 4 workers): 4.72, 4.64, 2.32, 2.36 ms. Throughput: 1281.87 iter/sec.
Timings for 960K all-complex FFT length (6 cores, 5 workers): 4.88, 4.85, 4.86, 4.86, 2.30 ms. Throughput: 1257.76 iter/sec.
Timings for 960K all-complex FFT length (6 cores, 6 workers): 5.27, 5.20, 5.22, 5.21, 5.16, 5.22 ms. Throughput: 1150.77 iter/sec.


with HT:
Timings for 1280K all-complex FFT length (6 cores, 1 worker): 0.81 ms. Throughput: 1230.40 iter/sec.
Timings for 1280K all-complex FFT length (6 cores hyperthreaded, 1 worker): 0.77 ms. Throughput: 1296.76 iter/sec.


large FFT with HT:
Timings for 1920K all-complex FFT length (6 cores, 1 worker): 1.32 ms. Throughput: 760.07 iter/sec.
Timings for 1920K all-complex FFT length (6 cores hyperthreaded, 1 worker): 1.39 ms. Throughput: 719.25 iter/sec.


P.S. Measured on an i7-8700K locked at 4000 MHz.
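
If you have a lot of benchmark output to compare, a small Python sketch like the one below can pick out the best worker count. The regular expression is written against the 960K lines quoted in this post only and may need adjusting for other Prime95 versions; the names `LINE_RE`, `parse_benchmark` and `best_row` are made up for illustration.

```python
import re

# Matches one throughput line in the format quoted in this post.
LINE_RE = re.compile(
    r"\((?P<cores>\d+) cores(?: hyperthreaded)?, (?P<workers>\d+) workers?\): "
    r"(?P<timings>[\d., ]+) ms\. +Throughput: (?P<throughput>[\d.]+) iter/sec"
)

# The "w/o HT" results from the post (i7-8700K, 960K FFT).
SAMPLE = """\
Timings for 960K all-complex FFT length (6 cores, 1 worker): 0.61 ms. Throughput: 1637.29 iter/sec.
Timings for 960K all-complex FFT length (6 cores, 2 workers): 1.25, 1.24 ms. Throughput: 1610.60 iter/sec.
Timings for 960K all-complex FFT length (6 cores, 3 workers): 2.26, 2.25, 2.25 ms. Throughput: 1332.79 iter/sec.
Timings for 960K all-complex FFT length (6 cores, 4 workers): 4.72, 4.64, 2.32, 2.36 ms. Throughput: 1281.87 iter/sec.
Timings for 960K all-complex FFT length (6 cores, 5 workers): 4.88, 4.85, 4.86, 4.86, 2.30 ms. Throughput: 1257.76 iter/sec.
Timings for 960K all-complex FFT length (6 cores, 6 workers): 5.27, 5.20, 5.22, 5.21, 5.16, 5.22 ms. Throughput: 1150.77 iter/sec.
"""


def parse_benchmark(text):
    """Return one dict per benchmark line found in `text`."""
    rows = []
    for match in LINE_RE.finditer(text):
        rows.append({
            "workers": int(match.group("workers")),
            "timings_ms": [float(t) for t in match.group("timings").split(",")],
            "throughput": float(match.group("throughput")),
        })
    return rows


def best_row(rows):
    """Pick the configuration with the highest total throughput."""
    return max(rows, key=lambda row: row["throughput"])


if __name__ == "__main__":
    rows = parse_benchmark(SAMPLE)
    for row in rows:
        # Cross-check: total throughput is roughly the sum of 1000 / ms per worker.
        estimate = sum(1000.0 / t for t in row["timings_ms"])
        print(row["workers"], row["throughput"], round(estimate, 2))
    print("best worker count:", best_row(rows)["workers"])
```

The cross-check column is only approximate because the per-worker timings are rounded to two decimals in the benchmark output.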

chip
Joined: 12 Apr 11
Posts: 136
ID: 94709
Credit: 231,058,792
RAC: 0
Message 122196 - Posted: 7 Nov 2018 | 11:33:04 UTC
Last modified: 7 Nov 2018 | 11:33:33 UTC

TRP (Credit = 4,061)

1. Intel(R) Core(TM) i7-3720QM CPU @ 2.6GHz
Using AVX FFT length 864K, Pass1=384, Pass2=2304, 4 threads
Real Time = 19,523 s
CPU Time = 76,300 s
PPD ~= 18k
CPU Power = 30W
Efficiency math = 1730
Efficiency power = 600

2. Intel(R) Core(TM) i7-4600U CPU @ 2.2GHz
Using FMA3 FFT length 864K, Pass1=384, Pass2=2304, 2 threads
Real Time = 31,010 s
CPU Time = 61,307 s
PPD ~= 11.3k
CPU Power = 17W
Efficiency math = 2568
Efficiency power = 665

3. Intel(R) Core(TM) i7-8700K CPU @ 4.0GHz
Using FMA3 FFT length 864K, Pass1=384, Pass2=2304, 6 threads
Real Time = 5,105 s
CPU Time = 30,237 s
PPD ~= 68.7k
CPU Power = 115W
Efficiency math = 2862
Efficiency power = 600

P.S. Efficiency math = PPD / (Frequency * Core), Efficiency power = PPD / Power
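
For reference, the figures above can be reproduced with a short Python sketch. It assumes PPD is computed as credit per task times 86,400 divided by the real time in seconds, which matches the posted numbers but is not stated in the post; the function name `summarize` is made up for illustration. Small differences from the posted efficiency values come from rounding PPD before dividing.

```python
SECONDS_PER_DAY = 86_400
CREDIT_PER_TASK = 4_061  # TRP credit quoted in the post

# (CPU, clock in GHz, cores/threads used, real time s, CPU time s, power W),
# all copied from the three results in the post.
RUNS = [
    ("i7-3720QM", 2.6, 4, 19_523, 76_300, 30),
    ("i7-4600U", 2.2, 2, 31_010, 61_307, 17),
    ("i7-8700K", 4.0, 6, 5_105, 30_237, 115),
]


def summarize(clock_ghz, cores, real_s, cpu_s, power_w, credit=CREDIT_PER_TASK):
    """Compute PPD and the two efficiency figures for one result."""
    # Assumed formula: credit per task times tasks per day.
    ppd = credit * SECONDS_PER_DAY / real_s
    return {
        "ppd": ppd,
        "efficiency_math": ppd / (clock_ghz * cores),
        "efficiency_power": ppd / power_w,
        # For a multi-threaded task this should sit close to the core count.
        "cpu_to_real_ratio": cpu_s / real_s,
    }


if __name__ == "__main__":
    for name, *values in RUNS:
        result = summarize(*values)
        print(name, {key: round(value, 2) for key, value in result.items()})
```

The last column also illustrates the earlier observation in this thread that CPU time is roughly real time multiplied by the core count.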

Copyright © 2005 - 2023 Rytis Slatkevičius and PrimeGrid community.
Generated 29 Sep 2023 | 23:24:00 UTC