1)
Message boards :
Wieferich and WallSunSun Prime Search :
WW 1.05 credits
(Message 147267)
Posted 106 days ago by Yves Gallot
WW 1.05
GTX 1660 11.09
RTX 2070 14.7
What I see on your computer is
RTX 2070: 2304 cores @ 1620MHz, 666 sec
GTX 1660 SUPER: 1408 cores @ 1785MHz, 1000 sec
cores * freq ratio ~ 1.5, speed ratio ~ 1.5
and 14.7/11.09 ~ 1.33...?
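
The three ratios above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the dictionary keys and variable names are made up for illustration, and "ww105" simply holds the 14.7 and 11.09 figures quoted for WW 1.05.

```python
# Figures copied from the post; key names are hypothetical.
rtx_2070 = {"cores": 2304, "mhz": 1620, "seconds": 666, "ww105": 14.7}
gtx_1660_super = {"cores": 1408, "mhz": 1785, "seconds": 1000, "ww105": 11.09}

# Raw hardware ratio: cores * frequency.
hardware_ratio = (rtx_2070["cores"] * rtx_2070["mhz"]) / (
    gtx_1660_super["cores"] * gtx_1660_super["mhz"]
)

# Measured speed ratio: the faster card needs fewer seconds.
speed_ratio = gtx_1660_super["seconds"] / rtx_2070["seconds"]

# Ratio of the quoted WW 1.05 figures.
quoted_ratio = rtx_2070["ww105"] / gtx_1660_super["ww105"]

print(f"cores * freq: {hardware_ratio:.2f}")
print(f"speed:        {speed_ratio:.2f}")
print(f"quoted:       {quoted_ratio:.2f}")
```

The first two ratios agree with each other; the third is the one that stands out.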

2)
Message boards :
Wieferich and WallSunSun Prime Search :
New Version Testing
(Message 147252)
Posted 106 days ago by Yves Gallot
Final test.
I searched (with a version of the program compiled with a larger threshold) for "almost-near-primes" and found:
18446736287849978869 is a Wieferich special instance (-1 +39003642 p)
18446742680598714497 is a WallSunSun special instance (+0 -59149319 p)
Both results are correct and close to 2^{64}.

3)
Message boards :
Wieferich and WallSunSun Prime Search :
New Version Testing
(Message 147250)
Posted 106 days ago by Yves Gallot
I'm using 27.20.100.7987 under Windows. I do own an 8400 which has HD630, I can test it later once I eventually get back home.
If it runs on the HD 520, it will on the HD 630.
My computer doesn't want to update my iGPU driver because it already has the latest version from the manufacturer, and 64-bit atomics were not implemented in that release. This is surprising because it is OpenCL 2.1 NEO and FP64 is supported. It is not a real problem, but I like to validate programs on an iGPU: I already found a bug in genefer that was not detected with the Nvidia driver.

4)
Message boards :
Wieferich and WallSunSun Prime Search :
New Version Testing
(Message 147237)
Posted 106 days ago by Yves Gallot
As usual, I ran some tests on one of my Intel iGPUs. I ran it on my 6006u's HD520, and everything worked without a hitch. Improvement was between ~3x and almost 4x in one of the tests. Very impressive!
It's interesting, another OpenCL compiler is a good test.
On my computer, WWocl doesn't work on iGPU: unsupported OpenCL extension 'cl_khr_int64_base_atomics'.
The iGPU is an HD 630 and the driver is 23.20.16.4973. What is your driver?

5)
Message boards :
Generalized Fermat Prime Search :
GFN CPU task L3 Cache size?
(Message 146975)
Posted 115 days ago by Yves Gallot
Is there any published L3 cache usage for GFN CPU tasks, like LLR tasks publish their memory usage?
fma/avx/sse4/sse2 transforms, GFNn: 2^{n} * size of FP64
fmai/avxi/sse4i/sse2i transforms, GFNn: 2^{n+1} * size of FP64
GFN16: 1M
GFN17: 2M
GFN18: 4M
GFN19: 8M
GFN20: 16M
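
As a rough check of the two formulas, here is a small Python sketch. It assumes an FP64 value occupies 8 bytes; the function name and the i_variant flag are hypothetical, chosen only to distinguish the plain transforms from the "i" ones.

```python
FP64_BYTES = 8  # assumption: one FP64 value occupies 8 bytes


def transform_bytes(n, i_variant=False):
    """Transform size in bytes for GFNn.

    Plain transforms (fma/avx/sse4/sse2):     2^n     * size of FP64
    "i" transforms (fmai/avxi/sse4i/sse2i):   2^(n+1) * size of FP64
    """
    exponent = n + 1 if i_variant else n
    return (1 << exponent) * FP64_BYTES


for n in range(16, 21):
    plain_kib = transform_bytes(n) // 1024
    i_kib = transform_bytes(n, i_variant=True) // 1024
    print(f"GFN{n}: plain {plain_kib} KiB, i-variant {i_kib} KiB")
```

With 8-byte FP64 values, the sizes listed above (1M for GFN16 up to 16M for GFN20) correspond to the 2^{n+1} formula; the 2^{n} formula gives half of each.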

6)
Message boards :
Number crunching :
Great Conjunction Challenge
(Message 146915)
Posted 117 days ago by Yves Gallot
I am running on i7-8650U CPU @ 1.90GHz. Why are my estimated times for the Genfer19 Tasks at 50 days; Genfer 18 at 10 days, and Genfer 20 at 200 days? At those completion times, I will never get any credit for this challenge.
You are running 8 tasks on a 4-core processor. Set the usage limit to 50% of the CPU.
On this processor, 4 GFN20 can be completed during this challenge.

7)
Message boards :
Proth Prime Search :
Can PPSMega be a Fermat Divisor?
(Message 146762)
Posted 121 days ago by Yves Gallot
(As an aside, the extremely low-n, high-k end of the FermatSearch spectrum doesn't even search one candidate at a time; they literally expand out the Fermat number and try to factor it with the elliptic curve method. That's how we know some Fermat divisors where k has ~50 digits.)
And we know a Fermat divisor where k has > 500 digits, the largest prime factor of F_{11}. :)

8)
Message boards :
Number crunching :
Start discussing new goals for 2021
(Message 146621)
Posted 125 days ago by Yves Gallot
So out of curiosity, is there a GFN24 thru GFN100 etc set of numbers to crunch at some point in the future? I'm not asking for a time frame, just IF there are possible prime numbers that fit into those categories too.
Technically speaking, these GFN exponents can be sieved. But the size of the remaining candidates grows quickly and the primality test of these numbers can't yet be computed in a reasonable amount of time.
Because of rapid advances in hardware technology, it's unnecessary to sieve a range that will be tested in 10 or 20 years' time.
GFN22 > 25M digits
GFN23 > 50M digits
GFN24 > 100M digits
GFN25 > 250M digits
GFN26 > 500M digits
GFN27 > 1000M digits

9)
Message boards :
Number crunching :
Start discussing new goals for 2021
(Message 146598)
Posted 126 days ago by Yves Gallot
Is it related to the fact that the double-precision floating-point format uses 52 bits (out of 64) for its mantissa component? /JeppeSN
Yes. x mod p = x - p * [x/p]. With FP64, p_inv = 1.0/p can be precomputed and x mod p = x - p * [x * p_inv]. This is an FP implementation of Barrett reduction.
The AVX instruction set can test four primes simultaneously.
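
The formula x mod p = x - p * [x * p_inv] can be illustrated with a minimal Python sketch. It is not the sieve code: the name fp_mod is hypothetical, and the sketch assumes 0 <= x < 2^52 and 2 <= p < 2^52, so that x is exactly representable in FP64 and the estimated quotient is off by at most one.

```python
import math


def fp_mod(x, p, p_inv):
    """x mod p using a precomputed FP64 reciprocal p_inv = 1.0 / p.

    Assumes 0 <= x < 2**52 and 2 <= p < 2**52 (sketch only).
    """
    # Estimated quotient [x * p_inv]; rounding can make it off by one.
    q = math.floor(float(x) * p_inv)
    # Exact integer remainder for the estimated quotient.
    r = x - p * q
    # One conditional correction brings r back into [0, p).
    if r < 0:
        r += p
    elif r >= p:
        r -= p
    return r


p = 4503599627370449      # an arbitrary odd modulus below 2^52
p_inv = 1.0 / p           # computed once, reused for every x
x = 4503599627370495      # 2^52 - 1
print(fp_mod(x, p, p_inv) == x % p)
```

The division is paid once per p; each reduction then costs a multiplication, a floor, a multiply-subtract and a correction.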

10)
Message boards :
Number crunching :
Start discussing new goals for 2021
(Message 146590)
Posted 126 days ago by Yves Gallot
It shouldn't be too hard to support up to 2^62, although AVX code cannot be used between 2^52 and 2^62.
This has me quite curious.
Please, explain more!
The AVX2 instruction set has no 64-bit multiplication. Four 64-bit integers can be added or subtracted, but the multiplication is 32x32 => 64.
For this reason, basic 64-bit integer instructions are faster. If four primes are tested simultaneously, the latency of the MUL operation (3-4 cycles) can be hidden and the throughput is 1 cycle.
The best algorithm for sieving is Montgomery modular multiplication. The modular multiplication of two numbers for any p < 2^64 is calculated with 3 multiplications, 1 subtraction and 1 conditional addition. The theoretical speed limit of the modular multiplication of two 64-bit integers is 3 cycles because the subtraction and conditional addition can be executed in parallel on another unit of the core. This is difficult to achieve because x64 has a small number of registers, but with hyper-threading we are close to the limit.
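
The operation count (3 multiplications, 1 subtraction, 1 conditional addition) can be seen in a minimal Python sketch of Montgomery multiplication with R = 2^64. Python integers stand in for 64-bit registers here, and all names (mont_setup, mont_mul, to_mont, from_mont) are hypothetical, not taken from the sieve program. It assumes p is odd, p < 2^64, and both operands are below p.

```python
MASK64 = (1 << 64) - 1


def mont_setup(p):
    """Precompute q = p^-1 mod 2^64 and r2 = 2^128 mod p (p odd)."""
    q = pow(p, -1, 1 << 64)   # modular inverse, needs Python 3.8+
    r2 = pow(2, 128, p)
    return q, r2


def mont_mul(a, b, p, q):
    """Return a * b * 2^-64 mod p for 0 <= a, b < p."""
    t = a * b                      # multiplication 1: full 128-bit product
    hi, lo = t >> 64, t & MASK64
    m = (lo * q) & MASK64          # multiplication 2: low 64 bits only
    mp_hi = (m * p) >> 64          # multiplication 3: high 64 bits only
    r = hi - mp_hi                 # the subtraction
    if r < 0:                      # the conditional addition
        r += p
    return r


def to_mont(a, p, q, r2):
    return mont_mul(a, r2, p, q)   # a * 2^64 mod p


def from_mont(a, p, q):
    return mont_mul(a, 1, p, q)    # a * 2^-64 mod p


p = 18446736287849978869           # modulus close to 2^64
q, r2 = mont_setup(p)
a, b = 1234567890123456789, 9876543210987654321
product = from_mont(mont_mul(to_mont(a, p, q, r2), to_mont(b, p, q, r2), p, q), p, q)
print(product == (a * b) % p)
```

The low 64 bits of a * b and of m * p are equal by construction, so only the high halves need to be subtracted. The conversions to and from Montgomery form are paid once per chain of multiplications, not once per product.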
