PrimeGrid

Message boards : Generalized Fermat Prime Search : Source code of Genefer for OpenCL is available.

Author Message
Yves Gallot
Volunteer developer
Project scientist
Send message
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
GFN Double Silver: Earned 200,000,000 credits (295,254,118)
Message 67982 - Posted: 30 Jul 2013 | 21:31:17 UTC

The source code of a variant of Genefer for OpenCL is available on assembla:
[...]\branches\yves\2013\OclGenefer and [...]\branches\yves\2013\Common

I tested it with the NVidia SDK on Fermi and Kepler GPUs, and with the Intel SDK for OpenCL running on the CPU, because Intel HD Graphics has no FP64. I have no ATI card, but the code doesn't depend on any external library, so it should run on ATI cards.

It is about as fast as GeneferCUDA on my computers, but the algorithm is different, so the speed ratio depends on the hardware and the exponent.

It is not finished: the error is not computed yet and many improvements are still possible.

The source code is "standard" C++, so you can compile it with Visual Studio 2012 (Express) or gcc, using the NVidia, ATI or Intel SDK for OpenCL.

Yves

Yves Gallot
Message 67999 - Posted: 31 Jul 2013 | 12:31:21 UTC

Here are some benchmarks on NVidia: this version doesn't like the Fermi architecture :-(

GTX 680 (GK 104)
Genefer CUDA
538452^1048576+1 Time: 2.5 ms/mul. Err: 0.2031 6009544 digits
440400^2097152+1 Time: 4.76 ms/mul. Err: 0.2051 11836006 digits
360204^4194304+1 Time: 9.27 ms/mul. Err: 0.2167 23305854 digits
294612^8388608+1 Time: 20.5 ms/mul. Err: 0.1797 45879398 digits
Genefer OpenCL
538452^1048576+1 Time: 3.11 ms/mul. 6009544 digits
440400^2097152+1 Time: 6.7 ms/mul. 11836006 digits
360204^4194304+1 Time: 12.9 ms/mul. 23305854 digits
294612^8388608+1 Time: 26.8 ms/mul. 45879398 digits
------------------------------------------
GTX 660 (GK 106)
Genefer CUDA
538452^1048576+1 Time: 3.99 ms/mul. Err: 0.2031 6009544 digits
440400^2097152+1 Time: 7.87 ms/mul. Err: 0.2051 11836006 digits
360204^4194304+1 Time: 15.9 ms/mul. Err: 0.2167 23305854 digits
294612^8388608+1 Time: 35.1 ms/mul. Err: 0.1797 45879398 digits
Genefer OpenCL
538452^1048576+1 Time: 4.82 ms/mul. 6009544 digits
440400^2097152+1 Time: 10.6 ms/mul. 11836006 digits
360204^4194304+1 Time: 20.7 ms/mul. 23305854 digits
294612^8388608+1 Time: 43 ms/mul. 45879398 digits
------------------------------------------
GTX 580 (GF 110)
Genefer CUDA
538452^1048576+1 Time: 1.72 ms/mul. Err: 0.2031 6009544 digits
440400^2097152+1 Time: 3.28 ms/mul. Err: 0.1953 11836006 digits
360204^4194304+1 Time: 6.8 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 14.1 ms/mul. Err: 0.1797 45879398 digits
Genefer OpenCL
538452^1048576+1 Time: 2.83 ms/mul. 6009544 digits
440400^2097152+1 Time: 6.17 ms/mul. 11836006 digits
360204^4194304+1 Time: 15.5 ms/mul. 23305854 digits
294612^8388608+1 Time: 38.9 ms/mul. 45879398 digits

Profile chip
Avatar
Send message
Joined: 12 Apr 11
Posts: 128
ID: 94709
Credit: 164,082,201
RAC: 5,606
321 LLR Amethyst: Earned 1,000,000 credits (1,081,800)ESP LLR Amethyst: Earned 1,000,000 credits (1,958,365)PPS LLR Ruby: Earned 2,000,000 credits (3,000,162)PSP LLR Amethyst: Earned 1,000,000 credits (1,022,562)SoB LLR Ruby: Earned 2,000,000 credits (2,015,539)SR5 LLR Ruby: Earned 2,000,000 credits (2,000,481)SGS LLR Ruby: Earned 2,000,000 credits (2,000,014)TRP LLR Ruby: Earned 2,000,000 credits (3,000,865)321 Sieve Ruby: Earned 2,000,000 credits (2,000,357)Cullen/Woodall Sieve (suspended) Ruby: Earned 2,000,000 credits (4,000,096)PPS Sieve Emerald: Earned 50,000,000 credits (50,000,422)TRP Sieve (suspended) Ruby: Earned 2,000,000 credits (2,000,500)AP 26/27 Sapphire: Earned 20,000,000 credits (20,000,721)GFN Sapphire: Earned 20,000,000 credits (20,000,117)PSA Emerald: Earned 50,000,000 credits (50,000,198)
Message 68002 - Posted: 31 Jul 2013 | 14:02:42 UTC

Tests on Tahiti are very much needed.

Profile rebirther
Avatar
Send message
Joined: 10 Aug 05
Posts: 783
ID: 85
Credit: 175,774,608
RAC: 0
Eliminated 3 conjecture "k"s321 LLR Silver: Earned 100,000 credits (186,594)Cullen LLR Silver: Earned 100,000 credits (106,665)ESP LLR Gold: Earned 500,000 credits (502,416)PPS LLR Gold: Earned 500,000 credits (504,111)PSP LLR Gold: Earned 500,000 credits (513,785)SoB LLR Gold: Earned 500,000 credits (564,944)SR5 LLR Ruby: Earned 2,000,000 credits (2,790,118)SGS LLR Gold: Earned 500,000 credits (501,099)TPS LLR (retired) Bronze: Earned 10,000 credits (46,235)TRP LLR Gold: Earned 500,000 credits (708,706)Woodall LLR Silver: Earned 100,000 credits (133,626)321 Sieve Bronze: Earned 10,000 credits (21,527)Cullen/Woodall Sieve (suspended) Jade: Earned 10,000,000 credits (14,729,132)PPS Sieve Double Bronze: Earned 100,000,000 credits (132,786,707)Sierpinski (ESP/PSP/SoB) Sieve (suspended) Gold: Earned 500,000 credits (510,317)TRP Sieve (suspended) Gold: Earned 500,000 credits (620,991)AP 26/27 Silver: Earned 100,000 credits (418,876)GFN Amethyst: Earned 1,000,000 credits (1,795,613)PSA Jade: Earned 10,000,000 credits (18,329,123)
Message 68003 - Posted: 31 Jul 2013 | 14:07:41 UTC

I tried compiling it but ended up with the same errors as with the other GeneferCUDA version; maybe it's a problem with OpenCL.lib.

Profile rebirther
Message 68007 - Posted: 31 Jul 2013 | 16:00:51 UTC

App crashed with "cannot create program".

Added current status to my dropbox. If someone is interested tell me.

Yves Gallot
Message 68011 - Posted: 31 Jul 2013 | 16:26:13 UTC - in response to Message 68003.

You can compile it with cl.h and OpenCL.lib from any OpenCL SDK (Intel, Nvidia or ATI).
The executable will load the OpenCL.dll installed by the graphics driver(s).
On my laptop, with an HD 4000 and a GeForce GT 740M, any OpenCL binary can run on the Intel GPU and on the NVidia GPU. I don't have to build one with the Intel SDK and another one with the NVidia SDK.

Yves Gallot
Message 68012 - Posted: 31 Jul 2013 | 16:28:06 UTC - in response to Message 68007.

App crashed with "cannot create program".

Added current status to my dropbox. If someone is interested tell me.


It doesn't find "Genefer.cl".

Profile rebirther
Message 68013 - Posted: 31 Jul 2013 | 16:33:02 UTC - in response to Message 68012.
Last modified: 31 Jul 2013 | 16:51:54 UTC

App crashed with "cannot create program".

Added current status to my dropbox. If someone is interested tell me.


It doesn't find "Genefer.cl".


Oh, thx, my fault, forgot to add it.

Test results on Asus HD7950:

Platform 'AMD Accelerated Parallel Processing': 1 GPU device(s) found.
Platform 'AMD Accelerated Parallel Processing': 1 CPU device(s) found.
"C:\Users\user\AppData\Local\Temp\OCLBA77.tmp.cl", line 12: warning: OpenCL extension is now part of core
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
^
Running on platform 'AMD Accelerated Parallel Processing' and device 'Tahiti'.
2199064^8192+1 Time: 99.6 us/mul. 51956 digits
1798620^16384+1 Time: 95.8 us/mul. 102481 digits
1471094^32768+1 Time: 98 us/mul. 202102 digits
1203210^65536+1 Time: 122 us/mul. 398482 digits
984108^131072+1 Time: 182 us/mul. 785521 digits
804904^262144+1 Time: 385 us/mul. 1548156 digits
658332^524288+1 Time: 916 us/mul. 3050541 digits
538452^1048576+1 Time: 2.01 ms/mul. 6009544 digits
440400^2097152+1 Time: 4.53 ms/mul. 11836006 digits
360204^4194304+1 Time: 12 ms/mul. 23305854 digits
294612^8388608+1 Time: 31.1 ms/mul. 45879398 digits
30^32+1 is a probable prime. (0.8 sec., err = 0.00e+000)
20000066^32+1 is a probable prime. (0.8 sec., err = 0.00e+000)
102^64+1 is a probable prime. (0.8 sec., err = 0.00e+000)
15000250^64+1 is a probable prime. (0.9 sec., err = 0.00e+000)
120^128+1 is a probable prime. (0.8 sec., err = 0.00e+000)
10000038^128+1 is a probable prime. (0.9 sec., err = 0.00e+000)
278^256+1 is a probable prime. (0.9 sec., err = 0.00e+000)
5684328^256+1 is a probable prime. (1.1 sec., err = 0.00e+000)
46^512+1 is a probable prime. (1.0 sec., err = 0.00e+000)
4619000^512+1 is a probable prime. (1.5 sec., err = 0.00e+000)
824^1024+1 is a probable prime. (1.4 sec., err = 0.00e+000)
3752220^1024+1 is a probable prime. (2.1 sec., err = 0.00e+000)
150^2048+1 is a probable prime. (1.8 sec., err = 0.00e+000)
3066672^2048+1 is a probable prime. (3.7 sec., err = 0.00e+000)
1534^4096+1 is a probable prime. (4.0 sec., err = 0.00e+000)
2485064^4096+1 is a probable prime. (7.3 sec., err = 0.00e+000)
30406^8192+1 is a probable prime. (12.2 sec., err = 0.00e+000)
2030234^8192+1 is a probable prime. (17.0 sec., err = 0.00e+000)
67234^16384+1 is a probable prime. (25.7 sec., err = 0.00e+000)
1651902^16384+1 is a probable prime. (32.7 sec., err = 0.00e+000)
70906^32768+1 is a probable prime. (50.5 sec., err = 0.00e+000)
1277444^32768+1 is a probable prime. (66.1 sec., err = 0.00e+000)


I aborted the rest; it was taking too long, but it's working.

What are command lines for the app?

Profile Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Avatar
Send message
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
The "Shut up already!" badge:  This loud mouth has mansplained on the forums over 10 thousand times!  Sheesh!!!Discovered the World's First GFN-19 prime!!!Discovered 1 mega primeFound 1 prime in the 2018 Tour de PrimesFound 1 prime in the 2019 Tour de Primes321 LLR Ruby: Earned 2,000,000 credits (2,063,182)Cullen LLR Ruby: Earned 2,000,000 credits (2,005,249)ESP LLR Ruby: Earned 2,000,000 credits (3,820,430)Generalized Cullen/Woodall LLR Ruby: Earned 2,000,000 credits (2,145,754)PPS LLR Ruby: Earned 2,000,000 credits (2,773,744)PSP LLR Ruby: Earned 2,000,000 credits (2,632,269)SoB LLR Sapphire: Earned 20,000,000 credits (34,158,496)SR5 LLR Turquoise: Earned 5,000,000 credits (8,293,415)SGS LLR Ruby: Earned 2,000,000 credits (2,012,781)TRP LLR Ruby: Earned 2,000,000 credits (2,737,347)Woodall LLR Ruby: Earned 2,000,000 credits (2,195,123)321 Sieve Turquoise: Earned 5,000,000 credits (5,046,112)Cullen/Woodall Sieve (suspended) Ruby: Earned 2,000,000 credits (4,170,256)Generalized Cullen/Woodall Sieve Turquoise: Earned 5,000,000 credits (5,059,304)PPS Sieve Sapphire: Earned 20,000,000 credits (20,110,788)Sierpinski (ESP/PSP/SoB) Sieve (suspended) Amethyst: Earned 1,000,000 credits (1,035,522)TRP Sieve (suspended) Ruby: Earned 2,000,000 credits (2,051,121)AP 26/27 Turquoise: Earned 5,000,000 credits (7,090,096)GFN Emerald: Earned 50,000,000 credits (64,594,991)PSA Jade: Earned 10,000,000 credits (10,135,447)
Message 68017 - Posted: 31 Jul 2013 | 18:48:19 UTC - in response to Message 68013.

What are command lines for the app?


There are no command line options for this program. What you see is all it can do.

Running a predefined list of known primes through the algorithm is all that program does. It's not intended to be a complete application. Like Shoichiro's OpenCL code, it's the code for an algorithm that's intended to be plugged into our existing Genefer framework. It's not intended to be useful on its own.

After 18 months of wishing we had an OpenCL implementation, we now have two!

Which to use? Yves' is more portable as it does not require an external FFT library. Yves wrote his own transforms in portable C++, so it will run on any platform. Shoichiro used AMD's FFT library, which Iain says isn't available on Mac. It would be interesting to see whether Yves' transform is faster than the AMD libraries.

I had relatively little difficulty building Yves' application with VS2012 Express as an x64 app using the CUDA 3.2 toolkit. On my GTX 460, run times are slightly more than twice those of the CUDA version.
____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

Profile rebirther
Message 68018 - Posted: 31 Jul 2013 | 19:11:52 UTC - in response to Message 68017.

You are right, Yves' version looks much better because it only uses 2 files. But you can split them: CUDA for Nvidia cards and OpenCL for ATI cards. I hope OpenCL on the high-end ATI cards can be much faster than CUDA.

Yves Gallot
Message 68019 - Posted: 31 Jul 2013 | 20:18:19 UTC - in response to Message 68017.

What are command lines for the app?


Sorry but it is just a first release for tests.
I will update it and add the command line option "-q".


Test results on Asus HD7950: [...]

The HD7950 is about as fast as a GTX 680. This seems consistent with other benchmarks.


I had relatively little difficulty building Yves' application with VS2012 Express as an x64 app.

Note that a win32 app is as fast as an x64 one.

I have been playing with OpenCL for only two months, so there are still many variations to test and a large speed improvement is expected!

Yves Gallot
Message 68022 - Posted: 31 Jul 2013 | 23:10:48 UTC - in response to Message 68019.

What are command lines for the app?


I just added the option -q "b^N+1".
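For readers who want to see what handling such an expression involves, here is a minimal sketch of parsing a "b^N+1" string in C++. It is not taken from the Genefer sources: the names GfnExpr and parseGfn are hypothetical, and the checks (b >= 2, N a power of two) are assumptions based on the expressions shown in this thread.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical holder for the two parameters of b^N+1.
struct GfnExpr {
    unsigned long b;
    unsigned long n;
};

// Hypothetical parser (an illustration, not the parser used by OclGenefer).
// Accepts "b^N+1" with b >= 2 and N a power of two; returns false otherwise.
bool parseGfn(const std::string& text, GfnExpr& out) {
    // Reject empty input, leading whitespace and signs before calling sscanf.
    if (text.empty() || text[0] < '0' || text[0] > '9') return false;

    unsigned long b = 0, n = 0;
    int consumed = -1;
    // %n stores the number of characters read, but only if "+1" matched.
    if (std::sscanf(text.c_str(), "%lu^%lu+1%n", &b, &n, &consumed) != 2) return false;
    if (consumed < 0 || static_cast<std::size_t>(consumed) != text.size()) return false;

    // N must be a non-zero power of two for a generalized Fermat number.
    if (b < 2 || n == 0 || (n & (n - 1)) != 0) return false;

    out.b = b;
    out.n = n;
    return true;
}
```

With this sketch, parseGfn("30^32+1", e) succeeds with e.b == 30 and e.n == 32, while "30^33+1" is rejected because 33 is not a power of two.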

rogue
Volunteer developer
Avatar
Send message
Joined: 8 Sep 07
Posts: 1180
ID: 12001
Credit: 18,565,548
RAC: 0
PPS LLR Bronze: Earned 10,000 credits (31,229)PSA Jade: Earned 10,000,000 credits (18,533,435)
Message 68023 - Posted: 1 Aug 2013 | 0:41:19 UTC

Hello Yves, it has been a while. You might recall me from helping get the original genefer running on PowerMac. I wrote the checkpointing code that the various flavors of genefer use today.

I'm excited to see you back in action on this. I've been waiting for someone to port this to OpenCL so that I can run it on my Mac Pro as it has an ATI card.

Now if you could only port a command line version of Proth to OpenCL... :-)

To get this to build on Mac, I had to make this change:

#ifdef __APPLE__
#include <OpenCL/cl.h>
#else
#include <CL/cl.h>
#endif

in OclGenefer.cpp.

Also, to use clang (Apple's newer incarnation of gcc) I had to make this change:

virtual ~ITransform() {};

although llvm (Apple's older incarnation of gcc) allowed the original code:

virtual ~ITransform() = 0 {};

and the updated version.

Unfortunately, I am running into this problem with clang:

OclGenefer.cpp:176:29: error: no matching conversion for functional-style cast from 'std::ifstream' (aka 'basic_ifstream<char>') to 'std::istreambuf_iterator<char>'
const std::string source((std::istreambuf_iterator<char>(std::ifstream("Genefer.cl", std::ifstream::in))), std::istreambuf_iterator<char>());
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/include/c++/4.2.1/bits/streambuf_iterator.h:48:11: note: candidate constructor (the implicit copy constructor) not viable: no known conversion from 'std::ifstream' (aka 'basic_ifstream<char>') to 'const std::istreambuf_iterator<char, std::char_traits<char> >' for 1st argument
class istreambuf_iterator
^
/usr/include/c++/4.2.1/bits/streambuf_iterator.h:98:7: note: candidate constructor not viable: no known conversion from 'std::ifstream' (aka 'basic_ifstream<char>') to 'istream_type &' (aka 'basic_istream<char, std::char_traits<char> > &') for 1st argument
istreambuf_iterator(istream_type& __s) throw()
^
/usr/include/c++/4.2.1/bits/streambuf_iterator.h:102:7: note: candidate constructor not viable: no known conversion from 'std::ifstream' (aka 'basic_ifstream<char>') to 'streambuf_type *' (aka 'basic_streambuf<char, std::char_traits<char> > *') for 1st argument
istreambuf_iterator(streambuf_type* __s) throw()
^
/usr/include/c++/4.2.1/bits/streambuf_iterator.h:94:7: note: candidate constructor not viable: requires 0 arguments, but 1 was provided
istreambuf_iterator() throw()


llvm doesn't give as useful a message:

OclGenefer.cpp: In constructor ‘Program::Program(bool)’:
OclGenefer.cpp:176: error: invalid conversion from ‘void*’ to ‘std::basic_streambuf<char, std::char_traits<char> >*’
OclGenefer.cpp:176: error: initializing argument 1 of ‘std::istreambuf_iterator<_CharT, _Traits>::istreambuf_iterator(std::basic_streambuf<_CharT, _Traits>*) [with _CharT = char, _Traits = std::char_traits<char>]’

I'm not good enough with C++ streams to fix this.
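A possible fix, judging from the clang notes above: the istreambuf_iterator constructor takes a non-const reference to a stream, and a temporary std::ifstream cannot bind to it in standard C++. Giving the stream a name avoids the problem. The sketch below is only an illustration; the helper name readTextFile is hypothetical and is not part of OclGenefer.cpp.

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper: read a whole text file (for example an OpenCL kernel
// source such as "Genefer.cl") into a string.
std::string readTextFile(const std::string& path) {
    // A named stream is an lvalue, so it can bind to the iterator's
    // 'istream_type&' constructor parameter.
    std::ifstream file(path.c_str(), std::ios::in | std::ios::binary);
    if (!file) {
        // Report a missing file explicitly instead of passing an empty source on.
        throw std::runtime_error("cannot open " + path);
    }
    return std::string((std::istreambuf_iterator<char>(file)),
                       std::istreambuf_iterator<char>());
}
```

The explicit check also turns a missing kernel file into a clear error message rather than a later failure when the program is created.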

Profile Roger
Volunteer moderator
Project administrator
Volunteer developer
Volunteer tester
Project scientist
Avatar
Send message
Joined: 27 Nov 11
Posts: 1112
ID: 120786
Credit: 261,530,863
RAC: 1,408
Found 1 prime in the 2018 Tour de Primes321 LLR Ruby: Earned 2,000,000 credits (2,012,522)Cullen LLR Amethyst: Earned 1,000,000 credits (1,359,862)ESP LLR Ruby: Earned 2,000,000 credits (2,213,934)Generalized Cullen/Woodall LLR Ruby: Earned 2,000,000 credits (2,088,705)PPS LLR Ruby: Earned 2,000,000 credits (2,617,785)PSP LLR Ruby: Earned 2,000,000 credits (2,420,512)SoB LLR Amethyst: Earned 1,000,000 credits (1,780,064)SR5 LLR Ruby: Earned 2,000,000 credits (2,238,295)SGS LLR Ruby: Earned 2,000,000 credits (2,139,392)TRP LLR Ruby: Earned 2,000,000 credits (2,125,391)Woodall LLR Amethyst: Earned 1,000,000 credits (1,311,937)321 Sieve Turquoise: Earned 5,000,000 credits (5,190,731)Cullen/Woodall Sieve (suspended) Silver: Earned 100,000 credits (207,387)Generalized Cullen/Woodall Sieve Turquoise: Earned 5,000,000 credits (5,049,697)PPS Sieve Double Bronze: Earned 100,000,000 credits (100,422,123)Sierpinski (ESP/PSP/SoB) Sieve (suspended) Ruby: Earned 2,000,000 credits (3,227,972)TRP Sieve (suspended) Turquoise: Earned 5,000,000 credits (5,021,659)AP 26/27 Sapphire: Earned 20,000,000 credits (20,295,860)GFN Emerald: Earned 50,000,000 credits (56,515,310)PSA Sapphire: Earned 20,000,000 credits (43,298,465)
Message 68030 - Posted: 1 Aug 2013 | 22:38:54 UTC - in response to Message 68013.

Test result with HD7970Ghz using Rebirther's Dropbox genefer.exe:



It was using 18% of CPU (X6 1100T) and up to 99% of GPU. The high use of the GPU is too aggressive and the program does not finish long runs.
Not shown in screen shot: 572186^131072+1 took 3424.5 seconds and 2418^262144+1 took 8000+ seconds.

Yves Gallot
Message 68056 - Posted: 2 Aug 2013 | 10:00:28 UTC - in response to Message 68023.

Hi Mark,

Also, to use clang (Apple's newer incarnation of gcc) I had to make this change:
virtual ~ITransform() {};
although llvm (Apple's older incarnation of gcc) allowed the original code:
virtual ~ITransform() = 0 {};
and the updated version.


This is a compiler bug with the pure virtual destructor!?
I removed the polymorphism; it is useless in this version... but that will not work with the full version of Genefer, so maybe you could report the bug.
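A note on portability: in standard C++ a declaration cannot carry both the pure specifier "= 0" and a body, and Visual C++ accepts that form as an extension. A pure virtual destructor is allowed, but its definition has to be written outside the class. The sketch below shows that form. Only the name ITransform comes from the thread; the member name(), DummyTransform and makeTransform are hypothetical and exist only to make the example complete.

```cpp
#include <memory>
#include <string>

class ITransform {
public:
    // Pure virtual destructor: keeps the class abstract.
    virtual ~ITransform() = 0;
    // Hypothetical member, added only so the example has something to call.
    virtual std::string name() const = 0;
};

// The destructor is still called by derived destructors, so it must be
// defined; the definition goes outside the class body.
inline ITransform::~ITransform() {}

// Hypothetical concrete transform, for illustration only.
class DummyTransform : public ITransform {
public:
    std::string name() const override { return "dummy"; }
};

// Hypothetical factory returning the interface type.
std::unique_ptr<ITransform> makeTransform() {
    return std::unique_ptr<ITransform>(new DummyTransform());
}
```

If the class has other pure virtual functions, the simpler "virtual ~ITransform() {}" used in the Mac build keeps it abstract as well.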

I removed STL calls (and then C++ streams).

You can download the latest version.

Yves Gallot
Message 68057 - Posted: 2 Aug 2013 | 10:28:20 UTC - in response to Message 68030.

Test result with HD7970Ghz using Rebirther's Dropbox genefer.exe:

Thanks for the results.

I don't understand the second warning with FP_CONTRACT.
This pragma is defined with OpenCL 1.0, 1.1 and 1.2.
Rebirther doesn't have this warning on his computer.

Something else that I don't understand:
If 1203210^65536+1 runs at 164 us/mul. and 984108^131072+1 at 236 us/mul., and 48594^65536+1 takes 172 sec., then we expect
62722^131072+1 ~ 172 * 236/164 * 2 ~ 500 sec.
1500 sec. is too slow.

Was using 18% of CPU (X6 1100T) and up to 99% of GPU.

This should be 0% of CPU and up to 99% of GPU. Why is the CPU running?

Profile rebirther
Message 68063 - Posted: 2 Aug 2013 | 14:26:00 UTC

The latest app with result, Yves you need a version number...

OclGenefer 2013, Copyright (C) 2001-2013, Yves Gallot.
Options:
-q "b^N+1" Test expression.
Platform 'AMD Accelerated Parallel Processing': 1 GPU device(s) found.
Platform 'AMD Accelerated Parallel Processing': 1 CPU device(s) found.
Running on platform 'AMD Accelerated Parallel Processing' and device 'Tahiti'.
2199064^8192+1 Time: 98.9 us/mul. 51956 digits
1798620^16384+1 Time: 96.8 us/mul. 102481 digits
1471094^32768+1 Time: 99.2 us/mul. 202102 digits
1203210^65536+1 Time: 123 us/mul. 398482 digits
984108^131072+1 Time: 177 us/mul. 785521 digits
804904^262144+1 Time: 386 us/mul. 1548156 digits
658332^524288+1 Time: 910 us/mul. 3050541 digits
538452^1048576+1 Time: 1.98 ms/mul. 6009544 digits
440400^2097152+1 Time: 4.53 ms/mul. 11836006 digits
360204^4194304+1 Time: 12.3 ms/mul. 23305854 digits
294612^8388608+1 Time: 31.3 ms/mul. 45879398 digits
30^32+1 is a probable prime. (0.8 sec., err = 0.00e+000)
20000066^32+1 is a probable prime. (0.9 sec., err = 0.00e+000)
102^64+1 is a probable prime. (0.8 sec., err = 0.00e+000)
15000250^64+1 is a probable prime. (0.9 sec., err = 0.00e+000)
120^128+1 is a probable prime. (0.8 sec., err = 0.00e+000)
10000038^128+1 is a probable prime. (1.0 sec., err = 0.00e+000)
278^256+1 is a probable prime. (0.9 sec., err = 0.00e+000)
5684328^256+1 is a probable prime. (1.1 sec., err = 0.00e+000)
46^512+1 is a probable prime. (1.0 sec., err = 0.00e+000)
4619000^512+1 is a probable prime. (1.4 sec., err = 0.00e+000)
824^1024+1 is a probable prime. (1.4 sec., err = 0.00e+000)
3752220^1024+1 is a probable prime. (2.1 sec., err = 0.00e+000)
150^2048+1 is a probable prime. (1.8 sec., err = 0.00e+000)...

Profile Michael Goetz
Message 68066 - Posted: 2 Aug 2013 | 14:58:24 UTC - in response to Message 68063.

The latest app with result, Yves you need a version number...


He doesn't need a version number because that's not an app. It's just some code that will be included in the real app at some point. It's not released code.



Profile shoichiro yamada
Send message
Joined: 6 Dec 10
Posts: 50
ID: 75840
Credit: 3,315,642
RAC: 0
PPS Sieve Gold: Earned 500,000 credits (586,188)TRP Sieve (suspended) Bronze: Earned 10,000 credits (21,775)PSA Ruby: Earned 2,000,000 credits (2,696,411)
Message 68069 - Posted: 2 Aug 2013 | 15:31:46 UTC

HD7750 with Ubuntu

$ ./Genefer
OclGenefer 2013, Copyright (C) 2001-2013, Yves Gallot.
Options:
-q "b^N+1" Test expression.
Platform 'AMD Accelerated Parallel Processing': 1 GPU device(s) found.
Platform 'AMD Accelerated Parallel Processing': 1 CPU device(s) found.
"/tmp/OCLYaqLaz.cl", line 12: warning: OpenCL extension is now part of core
#pragma OPENCL EXTENSION cl_khr_fp64 : enable
^
Running on platform 'AMD Accelerated Parallel Processing' and device 'Capeverde'.
2199064^8192+1 Time: 122 us/mul. 51956 digits
1798620^16384+1 Time: 183 us/mul. 102481 digits
1471094^32768+1 Time: 122 us/mul. 202102 digits
1203210^65536+1 Time: 244 us/mul. 398482 digits
984108^131072+1 Time: 488 us/mul. 785521 digits
804904^262144+1 Time: 977 us/mul. 1548156 digits
658332^524288+1 Time: 3.91 ms/mul. 3050541 digits
538452^1048576+1 Time: 7.81 ms/mul. 6009544 digits
440400^2097152+1 Time: 15.6 ms/mul. 11836006 digits
360204^4194304+1 Time: 31.2 ms/mul. 23305854 digits
294612^8388608+1 Time: 125 ms/mul. 45879398 digits
30^32+1 is a probable prime. (1.0 sec., err = 0.00e+00)
20000066^32+1 is a probable prime. (1.0 sec., err = 0.00e+00)
102^64+1 is a probable prime. (1.0 sec., err = 0.00e+00)
15000250^64+1 is a probable prime. (1.0 sec., err = 0.00e+00)
120^128+1 is a probable prime. (2.0 sec., err = 0.00e+00)
10000038^128+1 is a probable prime. (1.0 sec., err = 0.00e+00)
278^256+1 is a probable prime. (2.0 sec., err = 0.00e+00)
5684328^256+1 is a probable prime. (1.0 sec., err = 0.00e+00)
46^512+1 is a probable prime. (2.0 sec., err = 0.00e+00)
4619000^512+1 is a probable prime. (2.0 sec., err = 0.00e+00)
824^1024+1 is a probable prime. (2.0 sec., err = 0.00e+00)
3752220^1024+1 is a probable prime. (4.0 sec., err = 0.00e+00)
150^2048+1 is a probable prime. (2.0 sec., err = 0.00e+00)
3066672^2048+1 is a probable prime. (6.0 sec., err = 0.00e+00)
1534^4096+1 is a probable prime. (8.0 sec., err = 0.00e+00)
2485064^4096+1 is a probable prime. (13.0 sec., err = 0.00e+00)
30406^8192+1 is a probable prime. (20.0 sec., err = 0.00e+00)
2030234^8192+1 is a probable prime. (27.0 sec., err = 0.00e+00)
67234^16384+1 is a probable prime. (42.0 sec., err = 0.00e+00)
1651902^16384+1 is a probable prime. (54.0 sec., err = 0.00e+00)
70906^32768+1 is a probable prime. (107.0 sec., err = 0.00e+00)
1277444^32768+1 is a probable prime. (133.0 sec., err = 0.00e+00)
48594^65536+1 is a probable prime. (327.0 sec., err = 0.00e+00)
857678^65536+1 is a probable prime. (407.0 sec., err = 0.00e+00)
62722^131072+1 is a probable prime. (1342.0 sec., err = 0.00e+00)
572186^131072+1 is a probable prime. (1608.0 sec., err = 0.00e+00)

Yves Gallot
Message 68073 - Posted: 2 Aug 2013 | 16:36:50 UTC - in response to Message 68057.

Test result with HD7970Ghz using Rebirther's Dropbox genefer.exe [...]

There is clearly a problem with this card if we compare with the HD7750 results:

HD7750
1203210^65536+1 Time: 244 us/mul.
984108^131072+1 Time: 488 us/mul.
48594^65536+1 327 sec... OK
62722^131072+1 1342 sec... OK
HD7970Ghz
1203210^65536+1 Time: 164 us/mul.
984108^131072+1 Time: 236 us/mul
48594^65536+1 172 sec... OK
62722^131072+1 1509 sec... NOK

Is the graphics driver up to date?

rogue
Volunteer developer
Joined: 8 Sep 07
Posts: 1180
ID: 12001
Credit: 18,565,548
RAC: 0
Message 68083 - Posted: 2 Aug 2013 | 22:44:34 UTC

Ugh! The Radeon HD 5870 only supports cl_amd_fp64, not cl_khr_fp64. I can't run the code on my MacPro. When it tries to use the CPU, I get this error:

inline Complex _MulWeightOut(const Complex z, const Complex aw, const double nInv, const double g)
^

Running on platform 'Apple' and device 'Intel(R) Xeon(R) CPU W3530 @ 2.80GHz'.


Error detected on GPU device.

I don't expect it to be fast on the CPU though, but I wouldn't expect this error either.

Roger
Message 68085 - Posted: 2 Aug 2013 | 22:59:20 UTC - in response to Message 68073.

Running Catalyst 12.8. That might explain the extra pragma warning and the test result.

Roger
Message 68086 - Posted: 3 Aug 2013 | 6:28:12 UTC - in response to Message 68085.

HD7970Ghz test result with new version:



Same 18% CPU and 99% GPU usage. Catalyst 12.8. I'll update to 13.4 and run again. Not shown is 572186^131072+1, which took 5985.4 sec.

Yves Gallot
Message 68089 - Posted: 3 Aug 2013 | 10:27:20 UTC - in response to Message 68083.

Ugh! The Radeon HD 5870 only supports cl_amd_fp64, not cl_khr_fp64. I can't run the code on my MacPro.

It should run with cl_amd_fp64 (I think so, although I can't find the cl_amd_fp64 instruction set).
In any case, only 5 operations are called: +, -, *, fma, rint.

There was an error in the logic of the defines: cl_amd_fp64 was not set with OpenCL 1.2. I fixed this. You can download the latest release.
I removed "_MulWeightOut"; it was unreachable code.

Roger
Message 68091 - Posted: 3 Aug 2013 | 12:53:02 UTC - in response to Message 68086.
Last modified: 3 Aug 2013 | 13:23:00 UTC

Tested HD7970Ghz with Catalyst 13.4. Behavior is different. Now uses around 6% of CPU but still 99% of GPU.
No #pragma warning, faster times, but maybe a memory leak when testing 676754^262144+1?
In this screen shot you can see memory use maxed out at 6GB, is not using any GPU, so has gone zombie:



I tested -q "676754^262144+1", -q "75898^524288+1" and -q "475856^524288+1" and all had memory usage that quickly went crazy after only seconds, as shown by Task Manager.
Note this test was straight after updating to Catalyst 13.4.
I reran the test after a reboot and got some slightly better times for high N and slightly worse times for low N.

rogue
Message 68092 - Posted: 3 Aug 2013 | 13:40:53 UTC - in response to Message 68089.

Ugh! The Radeon HD 5870 only supports cl_amd_fp64, not cl_khr_fp64. I can't run the code on my MacPro.

It should run with cl_amd_fp64 (I think: I can't find cl_amd_fp64 instruction set).
But 5 instructions only are called : +, - , *, fma, rint.

There was an error in the logic of the defines : cl_amd_fp64 was not set with OpenCL 1.2. I solved this. You can download the latest release.
I removed "_MulWeightOut", it was an unreachable code.


That only solves part of the problem. This code:

clGetDeviceInfo(devices[d], CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE, sizeof(cl_uint), &dWidth, NULL);

returns a dWidth of 0 on the HD 5870, so the device cannot be chosen.

Yves Gallot
Message 68097 - Posted: 3 Aug 2013 | 17:39:27 UTC - in response to Message 68091.

Tested HD7970Ghz with Catalyst 13.4. Behavior is different. Now uses around 6% of CPU but still 99% of GPU.
No #pragma warning, faster times, but maybe a memory leak when testing 676754^262144+1?
In this screen shot you can see memory use maxed out at 6GB, is not using any GPU, so has gone zombie:
I tested -q "676754^262144+1", -q "75898^524288+1" and -q "475856^524288+1" and all had memory usage that quickly went crazy after only seconds, as shown by Task Manager.

On my computer, OclGenefer -q "475856^524288+1" gives:
=> CPU Memory = 29.7 MB (the program allocates memory when it starts but not during the computation),
=> CPU load = 12.5 % (a bug in the NVidia driver: https://forums.geforce.com/default/topic/543115/opencl-driver-support-for-fah/)
=> GPU Load = 99 %.
I don't understand... can someone else check this with an ATI card?

rebirther
Message 68099 - Posted: 3 Aug 2013 | 18:42:26 UTC - in response to Message 68097.
Last modified: 3 Aug 2013 | 18:42:35 UTC

Tested HD7970Ghz with Catalyst 13.4. Behavior is different. Now uses around 6% of CPU but still 99% of GPU.
No #pragma warning, faster times, but maybe a memory leak when testing 676754^262144+1?
In this screen shot you can see memory use maxed out at 6GB, is not using any GPU, so has gone zombie:
I tested -q "676754^262144+1", -q "75898^524288+1" and -q "475856^524288+1" and all had memory usage that quickly went crazy after only seconds, as shown by Task Manager.

On my computer, OclGenefer -q "475856^524288+1"
=> CPU Memory = 29.7 MB (the program allocates memory when it starts but not during the computation),
=> CPU load = 12.5 % (a bug of NVidia driver: https://forums.geforce.com/default/topic/543115/opencl-driver-support-for-fah/)
=> GPU Load = 99 %.
I don't understand... someone else can check this with an ATI card ?


Tested and confirmed. The program took all my 16GB memory and cpu was at full usage (memory leak).

Yves Gallot
Message 68105 - Posted: 3 Aug 2013 | 22:06:28 UTC - in response to Message 68099.
Last modified: 3 Aug 2013 | 22:12:50 UTC

Tested HD7970Ghz [...] had memory usage that quickly went crazy after only seconds.

Tested and confirmed. The program took all my 16GB memory and cpu was at full usage (memory leak).

I've got it! The same problem occurred with Intel's driver.
I enqueued all OpenCL commands with a single clFinish (which is a synchronization point) at the end. The Intel and ATI drivers create a huge stack with all the commands!
You can download the latest version (OclGenefer 2013-08-03). It enqueues 1024 commands, then the CPU waits for the GPU, and so on.
This solves the problem with Intel's driver.
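
The batching idea described above can be sketched as follows. This is only an illustration, not the code from OclGenefer: `run_batched`, `enqueue` and `finish` are hypothetical names, and the two callbacks stand in for the real OpenCL calls (a kernel enqueue and `clFinish`).

```cpp
#include <cstddef>
#include <functional>

// Hypothetical helper: issue `total` commands and force a synchronization
// point after every `batch` commands, so the driver never has to hold the
// whole computation in its command queue.
// Returns the number of synchronization points that were issued.
std::size_t run_batched(std::size_t total, std::size_t batch,
                        const std::function<void(std::size_t)>& enqueue,
                        const std::function<void()>& finish)
{
    std::size_t syncs = 0;
    std::size_t pending = 0;
    for (std::size_t i = 0; i < total; ++i) {
        enqueue(i);                  // stands in for a kernel enqueue
        if (++pending == batch) {    // batch is full: the CPU waits for the GPU
            finish();                // stands in for clFinish(queue)
            pending = 0;
            ++syncs;
        }
    }
    if (pending != 0) {              // drain the commands left in the last batch
        finish();
        ++syncs;
    }
    return syncs;
}
```

With `batch` set to 1024, as in the post, a run of 2500 commands is synchronized three times (after 1024, after 2048, and at the end) instead of once.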

rebirther
Message 68106 - Posted: 3 Aug 2013 | 22:42:09 UTC - in response to Message 68105.

Tested HD7970Ghz [...] had memory usage that quickly went crazy after only seconds.

Tested and confirmed. The program took all my 16GB memory and cpu was at full usage (memory leak).

I've got it! The same problem occured with Intel's driver.
I enqueued all OpenCL commands with a single CLFinish (which is a synchronization point) at the end. Intel and ATI drivers create a huge stack with all commands!
You can download the lastest version (OclGenefer 2013-08-03). It enqueues 1024 commands and CPU waits GPU, etc.
This solves the problem with Intel's driver.


Looks good on ATI now.

Roger
Message 68107 - Posted: 3 Aug 2013 | 23:34:12 UTC - in response to Message 68105.

Bug is fixed, 676754^262144+1 test uses 39.0MB memory, only 2% of CPU and 97% of GPU. This is a much better combination, 99% GPU makes the computer less responsive.
^524288 tests use 1.0% of CPU, 43.9MB and 98% GPU. Usage depends on exponent being tested.

Roger
Message 68118 - Posted: 4 Aug 2013 | 10:48:34 UTC - in response to Message 68107.

Full test run on HD7970Ghz with Catalyst 13.4:



Is all sweetness and light. What's the next step for this algorithm?
Great work Yves!

Yves Gallot
Message 68142 - Posted: 5 Aug 2013 | 12:25:16 UTC - in response to Message 68118.

Full test run on HD7970Ghz with Catalyst 13.4.
[...] What's the next step for this algorithm?

Thanks, it works fine for a first release.
The next steps are:
1- To improve performance. On one hand, we have the Folding@Home FP64 benchmark:
GTX Titan 17.9
HD7990 17.1
HD7970GHz 10.1
GTX 780 8.7
GTX 580 8.3
GTX 680 5.5

and on the other hand the GeneferOpenCL and GeneferCUDA comparison:
GTX580
GNF-OCL 2^20 2.8 ms/mul
GNF-OCL 2^22 15.5 ms/mul
GNF-CUDA 2^20 1.7 ms/mul
GNF-CUDA 2^22 6.8 ms/mul

GTX680
GNF-OCL 2^20 3.1 ms/mul
GNF-OCL 2^22 12.9 ms/mul
GNF-CUDA 2^20 2.5 ms/mul
GNF-CUDA 2^22 9.3 ms/mul

If GeneferOpenCL is as fast as GeneferCUDA then the expected timing on a HD7970GHz is
GNF-OCL 2^20 2 ms/mul -> 1.5 ms/mul
GNF-OCL 2^22 12.4 ms/mul -> 5.6 ms/mul

It is a reasonable target because many optimizations have not been tested.

2- To plug the transform into the Genefer application. This can be done quickly because the interface of OclTransform is based on AVXTransform.
I'm finishing the tests of some "simple" optimizations and then I will create a full GeneferOpenCL application.

Yves Gallot
Message 68236 - Posted: 10 Aug 2013 | 11:32:46 UTC

My benchmark was incorrect.
That's good news, because OclGenefer runs faster than it was displayed!
During the benchmark (which is similar to the Genefer code), 24 loops are computed (to initialize memory and caches) and then the true benchmark starts. But I forgot to add a synchronization point at the end of the 24 loops, so the computation of these transforms biased the estimates (especially for large n).
The error is corrected in OclGenefer 2013-08-10.
I'm on holiday and tested the program on a GeForce GT 740M (GK107, 2 SMX, 384 shaders). OclGenefer is faster than GeneferCUDA on it.
So today, GeneferCUDA is (still :-) ) faster on Fermi, but OclGenefer is faster on Kepler with few SMX (it should be faster on GK107, i.e. GT 640 and GTX 650). I cannot extrapolate to GK104 (GTX 680) or GK110 (GTX 780 & Titan)... a benchmark on the Titan would be very interesting.
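
The benchmark fix described in this post (synchronize after the warm-up loops, before the clock starts) can be sketched like this. It is a minimal illustration under assumptions, not the OclGenefer benchmark itself: `time_per_iteration_ms`, `enqueue` and `sync` are hypothetical names, and the callbacks stand in for an asynchronous kernel enqueue and a blocking call such as `clFinish`.

```cpp
#include <chrono>
#include <functional>

// Hypothetical benchmark helper. `enqueue` queues one unit of asynchronous
// work; `sync` blocks until everything queued so far has completed.
double time_per_iteration_ms(int warmup, int iterations,
                             const std::function<void()>& enqueue,
                             const std::function<void()>& sync)
{
    for (int i = 0; i < warmup; ++i) enqueue();
    sync();  // without this, leftover warm-up work is billed to the timed loop

    const auto t0 = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) enqueue();
    sync();  // the timed work must be finished before the clock stops
    const auto t1 = std::chrono::steady_clock::now();

    const std::chrono::duration<double, std::milli> elapsed = t1 - t0;
    return elapsed.count() / iterations;
}
```

Calling it with `warmup = 24` mirrors the 24 initialization loops mentioned above; dropping the first `sync()` reproduces the bias, which grows with the cost of each warm-up transform.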

Roger
Message 68245 - Posted: 11 Aug 2013 | 7:50:28 UTC - in response to Message 68236.
Last modified: 11 Aug 2013 | 8:09:28 UTC

New assembla OclGenefer.cpp revision 381 code run on HD7970Ghz:



Benchmarks for large N are definitely faster. New device initialisation information.
I guess the full probable prime tests are unaffected by the latest code change?

GFN-OCL 2^20 1.84 ms/mul
2^20 Goal=1.5 ms/mul => another 18% drop required
GFN-OCL 2^22 7.67 ms/mul
2^22 Goal=5.6 ms/mul => another 27% drop required

Yves Gallot
Message 68246 - Posted: 11 Aug 2013 | 10:15:28 UTC - in response to Message 68245.

New device initialisation information.

Thanks, this information is very useful for optimization. Programming guides don't give all parameters.
On a Fermi, it is
Global mem size = 2048 MB, cache size = 32 kB (ReadWrite), cache line size = 128 Bytes.
Local mem size = 48 kB (dedicated), Constant mem size = 64 kB.
Max workgroup size = 1024.

I guess the full probable prime tests are unaffected by the latest code change?

Yes, OpenCL code didn't change in this version.

GFN-OCL 2^20 1.84 ms/mul
2^20 Goal=1.5 ms/mul => another 18% drop required
GFN-OCL 2^22 7.67 ms/mul
2^22 Goal=5.6 ms/mul => another 27% drop required

And then it will run faster on a HD 7970 GHz than GeneferCUDA on a GeForce GTX Titan :o)

Yves Gallot
Message 68288 - Posted: 12 Aug 2013 | 20:57:06 UTC

A new version is available on assembla (2013-08-12).
It is much faster than the previous one on my laptop; I hope that it is also fast on high-end and mid-range graphics cards.
The maximum error is now computed.

Any benchmark is welcome.

If tests are ok, the next version will be a beta release of geneferOpenCL.

Roger
Message 68312 - Posted: 13 Aug 2013 | 13:09:43 UTC - in response to Message 68288.
Last modified: 13 Aug 2013 | 13:10:09 UTC

Latest assembla Genefer.cl and OclGenefer.cpp revision 382 code run on HD7970Ghz:



Of course I will run full test after the challenge.

GFN-OCL 2^20 1.46 ms/mul
2^20 Goal=1.5 ms/mul => Goal achieved!
GFN-OCL 2^22 5.61 ms/mul
2^22 Goal=5.6 ms/mul => Goal achieved!
Congrats!

And then it will run faster on a HD 7970 GHz than GeneferCUDA on a GeForce GTX Titan :o)

rebirther
Message 68313 - Posted: 13 Aug 2013 | 14:21:44 UTC

HD7950

OclGenefer 2013-08-12, Copyright (C) 2001-2013, Yves Gallot.
Options: -q "b^N+1" Test expression.
Platform 'AMD Accelerated Parallel Processing': 1 GPU device(s) found.
Platform 'AMD Accelerated Parallel Processing': 1 CPU device(s) found.
Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1124.2)' and driver '1124.2 (VM)'.
Clock frequency = 900 MHz, compute units = 28.
Global mem size = 2048 MB, cache size = 16 kB (ReadWrite), cache line size = 64 Bytes.
Local mem size = 32 kB (dedicated), Constant mem size = 64 kB.
Max workgroup size = 256.
2199064^8192+1 Time: 83.5 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 87 us/mul. Err: 0.2227 102481 digits
1471094^32768+1 Time: 86.4 us/mul. Err: 0.2383 202102 digits
1203210^65536+1 Time: 123 us/mul. Err: 0.2305 398482 digits
984108^131072+1 Time: 166 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 353 us/mul. Err: 0.2266 1548156 digits
658332^524288+1 Time: 770 us/mul. Err: 0.2227 3050541 digits
538452^1048576+1 Time: 1.49 ms/mul. Err: 0.2109 6009544 digits
440400^2097152+1 Time: 2.9 ms/mul. Err: 0.2188 11836006 digits
360204^4194304+1 Time: 5.61 ms/mul. Err: 0.2109 23305854 digits
294612^8388608+1 Time: 11.1 ms/mul. Err: 0.2109 45879398 digits
102^64+1 is a probable prime. (0.8 sec., err = 1.46e-011)
15000250^64+1 is a probable prime. (0.8 sec., err = 0.375)
120^128+1 is a probable prime. (0.8 sec., err = 5.09e-011)
10000038^128+1 is a probable prime. (1.0 sec., err = 0.344)
278^256+1 is a probable prime. (0.9 sec., err = 3.49e-010)
5684328^256+1 is a probable prime. (1.2 sec., err = 0.164)
46^512+1 is a probable prime. (1.0 sec., err = 1.73e-011)
4619000^512+1 is a probable prime. (1.7 sec., err = 0.174)
824^1024+1 is a probable prime. (1.5 sec., err = 8.5e-009)
3752220^1024+1 is a probable prime. (2.5 sec., err = 0.188)
150^2048+1 is a probable prime. (2.0 sec., err = 4.37e-010)
Testing 3066672^2048+1...


It's faster than the old test, but the error rate is much higher.

Yves Gallot
Message 68317 - Posted: 13 Aug 2013 | 16:07:48 UTC - in response to Message 68313.

The error was not computed in the previous release.
HD 7950 is about as fast as the HD 7970 GHz... is it an overclocked HD7950 ?

The HD 7950 is faster than a GeForce GTX Titan with full DP enabled!
The relative GPU value (genefer mark / price) is 4 / 1.

Great, I'm working on the OpenCL version of the real genefer app.

rebirther
Message 68318 - Posted: 13 Aug 2013 | 16:17:24 UTC - in response to Message 68317.

The error was not computed in the previous release.
HD 7950 is about as fast as the HD 7970 GHz... is it an overclocked HD7950 ?

The HD 7950 is faster than a GeForce GTX Titan with full DP enabled!
The relative GPU value (genefer mark / price) is 4 / 1.

Great, I'm working on the OpenCL version of the real genefer app.


Yes. factory OC 900/1250

rogue
Message 68319 - Posted: 13 Aug 2013 | 16:50:11 UTC

Yves, should I be able to use my 5870 or not? If not, I'm okay with that (albeit disappointed).

Yves Gallot
Volunteer developer
Project scientist
Message 68321 - Posted: 13 Aug 2013 | 17:15:07 UTC - in response to Message 68319.

Yves, should I be able to use my 5870 or not? If not, I'm okay with that (albeit disappointed).

The HD 5870 and 5850 support double-precision FP, so they should be able to run it.

Try it without the test on CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE.
If it works, I will replace that test with a query of CL_DEVICE_EXTENSIONS and check whether it contains cl_khr_fp64 or cl_amd_fp64.
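
A minimal sketch of the extension check described above, assuming the extension list has already been read into a string (in a real host program it would come from clGetDeviceInfo with CL_DEVICE_EXTENSIONS). The name hasFp64Extension is hypothetical and is not taken from the genefer source.

```cpp
#include <sstream>
#include <string>

// Returns true if the space-separated OpenCL extension list names one of the
// two double-precision extensions mentioned in the post.
// Hypothetical helper: the real program may perform this check differently.
bool hasFp64Extension(const std::string& extensions) {
    std::istringstream tokens(extensions);
    std::string name;
    while (tokens >> name) {
        // Compare whole tokens so that a longer extension name that merely
        // contains "cl_khr_fp64" as a substring is not accepted by mistake.
        if (name == "cl_khr_fp64" || name == "cl_amd_fp64") {
            return true;
        }
    }
    return false;
}
```

Comparing whole tokens rather than searching for a substring is a design choice of this sketch; either way, a device that reports only cl_amd_fp64 (as the HD 5870 might) would pass.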

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Message 68322 - Posted: 13 Aug 2013 | 17:22:31 UTC

On my stock GTX 460:

GeneferCUDA:

genefercuda 3.1.0-1 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: genefercuda -b

Generalized Fermat Number Bench
2199064^8192+1 Time: 164 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 162 us/mul. Err: 0.2188 102481 digits
1471094^32768+1 Time: 214 us/mul. Err: 0.2031 202102 digits
1203210^65536+1 Time: 338 us/mul. Err: 0.2031 398482 digits
984108^131072+1 Time: 599 us/mul. Err: 0.2070 785521 digits
804904^262144+1 Time: 1.06 ms/mul. Err: 0.2031 1548156 digits
658332^524288+1 Time: 1.99 ms/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 3.98 ms/mul. Err: 0.2031 6009544 digits
440400^2097152+1 Time: 8.3 ms/mul. Err: 0.1953 11836006 digits
360204^4194304+1 Time: 16.7 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 34.7 ms/mul. Err: 0.1797 45879398 digits


The first OpenCL version:

OclGenefer 2013, Copyright (C) 2001-2013, Yves Gallot.
Platform 'NVIDIA CUDA': 1 GPU device(s) found.
Platform 'AMD Accelerated Parallel Processing': 1 CPU device(s) found.
Running on platform 'NVIDIA CUDA' and device 'GeForce GTX 460'.

2199064^8192+1 Time: 116 us/mul. 51956 digits
1798620^16384+1 Time: 193 us/mul. 102481 digits
1471094^32768+1 Time: 327 us/mul. 202102 digits
1203210^65536+1 Time: 439 us/mul. 398482 digits
984108^131072+1 Time: 726 us/mul. 785521 digits
804904^262144+1 Time: 1.61 ms/mul. 1548156 digits
658332^524288+1 Time: 3.04 ms/mul. 3050541 digits
538452^1048576+1 Time: 6.39 ms/mul. 6009544 digits
440400^2097152+1 Time: 14.4 ms/mul. 11836006 digits
360204^4194304+1 Time: 35.2 ms/mul. 23305854 digits
294612^8388608+1 Time: 88.4 ms/mul. 45879398 digits


The most recent OpenCL version:

OclGenefer 2013-08-12, Copyright (C) 2001-2013, Yves Gallot.
Options:
-q "b^N+1" Test expression.
Platform 'NVIDIA CUDA': 1 GPU device(s) found.
Platform 'AMD Accelerated Parallel Processing': 1 CPU device(s) found.
Running on platform 'NVIDIA CUDA', device 'GeForce GTX 460', version 'OpenCL 1.1 CUDA' and driver '320.57'.
Clock frequency = 1350 MHz, compute units = 7.
Global mem size = 1024 MB, cache size = 112 kB (ReadWrite), cache line size = 128 Bytes.
Local mem size = 48 kB (dedicated), Constant mem size = 64 kB.
Max workgroup size = 1024.

2199064^8192+1 Time: 85.3 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 127 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 168 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 306 us/mul. Err: 0.2188 398482 digits
984108^131072+1 Time: 557 us/mul. Err: 0.2422 785521 digits
804904^262144+1 Time: 1.13 ms/mul. Err: 0.2178 1548156 digits
658332^524288+1 Time: 2.19 ms/mul. Err: 0.2256 3050541 digits
538452^1048576+1 Time: 4.56 ms/mul. Err: 0.2031 6009544 digits
440400^2097152+1 Time: 9.48 ms/mul. Err: 0.2305 11836006 digits
360204^4194304+1 Time: 19.9 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 42.2 ms/mul. Err: 0.1973 45879398 digits


____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

rogue
Volunteer developer
Message 68344 - Posted: 14 Aug 2013 | 0:07:09 UTC

I think that timings are incorrect. If they are correct, then something appears to be impacting the overall throughput. I commented out the check for CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE.

Running on platform 'Apple', device 'ATI Radeon HD 5870', version 'OpenCL 1.1 ' and driver '1.0'.
Clock frequency = 850 MHz, compute units = 20.
Global mem size = 1024 MB, cache size = 0 kB (None), cache line size = 0 Bytes.
Local mem size = 32 kB (dedicated), Constant mem size = 64 kB.
Max workgroup size = 1024.

2199064^8192+1 Time: 169 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 173 us/mul. Err: 0.2344 102481 digits
1471094^32768+1 Time: 171 us/mul. Err: 0.2734 202102 digits
1203210^65536+1 Time: 177 us/mul. Err: 0.2344 398482 digits
984108^131072+1 Time: 178 us/mul. Err: 0.2584 785521 digits
804904^262144+1 Time: 249 us/mul. Err: 0.2486 1548156 digits
658332^524288+1 Time: 252 us/mul. Err: 0.2378 3050541 digits
538452^1048576+1 Time: 266 us/mul. Err: 0.2188 6009544 digits
440400^2097152+1 Time: 304 us/mul. Err: 0.2275 11836006 digits
360204^4194304+1 Time: 352 us/mul. Err: 0.2168 23305854 digits
294612^8388608+1 Time: 337 us/mul. Err: 0.2344 45879398 digits

102^64+1 is a probable prime. (0.1 sec., err = 1.82e-11)
15000250^64+1 is a probable prime. (0.3 sec., err = 0.406)
120^128+1 is a probable prime. (0.2 sec., err = 5.09e-11)
10000038^128+1 is a probable prime. (0.5 sec., err = 0.406)
278^256+1 is a probable prime. (0.3 sec., err = 3.53e-10)
5684328^256+1 is a probable prime. (0.9 sec., err = 0.172)
46^512+1 is a probable prime. (0.5 sec., err = 1.64e-11)
4619000^512+1 is a probable prime. (1.8 sec., err = 0.195)
824^1024+1 is a probable prime. (1.6 sec., err = 9.08e-09)
3752220^1024+1 is a probable prime. (3.7 sec., err = 0.195)
150^2048+1 is a probable prime. (2.4 sec., err = 5.24e-10)
3066672^2048+1 is a probable prime. (7.1 sec., err = 0.219)
1534^4096+1 is a probable prime. (7.3 sec., err = 8.2e-08)
2485064^4096+1 is a probable prime. (14.5 sec., err = 0.215)
30406^8192+1 is a probable prime. (20.7 sec., err = 5.25e-05)
2030234^8192+1 is a probable prime. (28.9 sec., err = 0.234)
67234^16384+1 is a probable prime. (44.9 sec., err = 0.00037)
1651902^16384+1 is a probable prime. (57.9 sec., err = 0.22)
70906^32768+1 is a probable prime. (92.1 sec., err = 0.000671)
1277444^32768+1 is a probable prime. (113.7 sec., err = 0.227)
48594^65536+1 is a probable prime. (174.8 sec., err = 0.000458)
857678^65536+1 is a probable prime. (222.8 sec., err = 0.148)
62722^131072+1 is a probable prime. (365.5 sec., err = 0.00122)
572186^131072+1 is a probable prime. (429.6 sec., err = 0.105)
24518^262144+1 is a probable prime. (904.9 sec., err = 0.000275)

Yves Gallot
Volunteer developer
Project scientist
Message 68348 - Posted: 14 Aug 2013 | 10:43:29 UTC - in response to Message 68344.

I think that timings are incorrect. If they are correct, then something appears to be impacting the overall throughput. I commented out the check for CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE. [...]

The HD 5870 is very fast!
24518^262144+1: 904.9 sec => 904.9 / (262144 * log2(24518)) = 237 us/mul.
genefercuda on a GTX 460: b^262144+1 Time: 1.06 ms/mul.
This program on a HD 7970 GHz: b^262144+1 Time: 336 us/mul.
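
The conversion used above can be written as a small helper. This is only a sketch: microsecondsPerMul is a hypothetical name, and it assumes, as the post does, that a test of b^N+1 costs about N * log2(b) multiplications.

```cpp
#include <cmath>

// Average time per multiplication in microseconds for a test of b^n+1 that
// took totalSeconds, assuming roughly n * log2(b) multiplications in total
// (the estimate used in the post, not an exact operation count).
double microsecondsPerMul(double totalSeconds, double b, double n) {
    const double multiplications = n * std::log2(b);
    return totalSeconds / multiplications * 1e6;
}
```

With the figures from the post, microsecondsPerMul(904.9, 24518, 262144) comes out at about 237.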

If the timing is correct for N = 8388608, your computer could enter the TOP500 list of supercomputers :o)
... or there is a problem with the accuracy of the clock() function or with OpenCL synchronization.

I will replace the CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE test with a CL_DEVICE_EXTENSIONS check to support cl_amd_fp64.

shoichiro yamada
Message 68350 - Posted: 14 Aug 2013 | 11:28:20 UTC

clock() returns elapsed time on Windows.
clock() returns CPU time on Unix.
time() returns elapsed time on Unix, but its resolution is one second.

Yves Gallot
Volunteer developer
Project scientist
Message 68351 - Posted: 14 Aug 2013 | 11:38:00 UTC - in response to Message 68350.

clock() returns elapsed time on Windows.
clock() returns CPU time on Unix.

OK. Apple's OS is Unix, and because the computation runs on the GPU, the CPU time is ~0. Thanks!
It will be correct with the real genefer, which uses gettimeofday() on Unix.
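
To illustrate the difference between the two clocks, here is a small sketch. It uses std::chrono::steady_clock for wall-clock time instead of the gettimeofday() call mentioned above, and it replaces the GPU wait with a plain sleep; Timings and timeIdleWait are hypothetical names.

```cpp
#include <chrono>
#include <ctime>
#include <thread>

struct Timings {
    double wallSeconds;  // elapsed real time from a monotonic clock
    double cpuSeconds;   // difference of std::clock() readings
};

// Sleeps for the given time, as a stand-in for a host thread that waits
// while a GPU does the work, and reports what both clocks measured.
Timings timeIdleWait(int milliseconds) {
    const auto wallStart = std::chrono::steady_clock::now();
    const std::clock_t cpuStart = std::clock();

    std::this_thread::sleep_for(std::chrono::milliseconds(milliseconds));

    const std::clock_t cpuEnd = std::clock();
    const auto wallEnd = std::chrono::steady_clock::now();

    Timings t;
    t.wallSeconds = std::chrono::duration<double>(wallEnd - wallStart).count();
    t.cpuSeconds = static_cast<double>(cpuEnd - cpuStart) / CLOCKS_PER_SEC;
    return t;
}
```

On a Unix system, timeIdleWait(200) should report a wall time of at least 0.2 s and a CPU time close to zero, which is the effect seen in the HD 5870 benchmark timings.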

Yves Gallot
Volunteer developer
Project scientist
Message 68352 - Posted: 14 Aug 2013 | 11:51:12 UTC

A new version is available on assembla (2013-08-14).

It supports "cl_khr_fp64" and "cl_amd_fp64".

A bug was fixed: a possible final round-off error. It occurred with composite numbers but not with primes!

This may be the last release of this program: the next one will be a beta of "GeneferOCL" (the real genefer app).

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Message 68357 - Posted: 14 Aug 2013 | 15:54:58 UTC

I expect I'll be releasing GeneferOCL as a beta/app_info app later today. I'm playing with it right now.

It appears to be happy running on ANYTHING. On my computer, specifying "--device 0" runs on my GTX 460 and "--device 1" runs on all 4 cores of my Core2Quad.

Although I don't have this hardware on this computer, it should also run on an ATI or Intel GPU just as easily.

So caution is warranted when running this on computers with multiple OpenCL capable devices.

This raises the possibility of officially supporting not just AMD/ATI GPUs but also Intel GPUs, and possibly CPU-based multicore apps. (Well, that last part might be wishful thinking. It looks like GeneferOCL running on all 4 cores of my computer is about 10 times SLOWER than GenefX64 running on a single core.)

OpenCL on CPU:

Command line: geneferocl -q 212346^1048576+1 -d 1
Running on platform 'AMD Accelerated Parallel Processing', device 'Intel(R) Core(TM)2 Quad CPU @ 2.40GHz', version 'OpenCL 1.1 AMD-APP (831.4)' and driver '2.0'.
Testing 212346^1048576+1...
Starting initialization...
Initialization complete (6.343 seconds).
Estimated total run time for 212346^1048576+1 is 2940:06:11


GenefX64 on CPU:

Command line: genefx64 -q 212346^1048576+1
Priority change succeeded.
Testing 212346^1048576+1...
Starting initialization...
Initialization complete (327.333 seconds).
Estimated total run time for 212346^1048576+1 is 246:35:21


OpenCL on GTX 460:

Command line: geneferocl -q 212346^1048576+1 -d 0
Running on platform 'NVIDIA CUDA', device 'GeForce GTX 460', version 'OpenCL 1.1 CUDA' and driver '320.57'.
Testing 212346^1048576+1...
Starting initialization...
Initialization complete (6.359 seconds).
Estimated total run time for 212346^1048576+1 is 23:28:41


CUDA on GTX 460:

Command line: genefercuda -q 212346^1048576+1 -shift 7
Testing 212346^1048576+1...
SHIFT override specified; using SHIFT=7 (instead of default value of 7)
Starting initialization...
maxErr during b^N initialization = 0.0000 (21.825 seconds).
Estimated total run time for 212346^1048576+1 is 20:26:50
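
The four estimates above are easier to compare once converted to seconds. A small sketch, assuming the estimates always have the H:MM:SS shape shown in these logs; toSeconds and slowdown are hypothetical helper names.

```cpp
#include <cstdio>
#include <string>

// Converts an "H:MM:SS" run-time estimate to seconds.
// Returns -1 if the text does not have that shape.
long toSeconds(const std::string& hms) {
    long h = 0, m = 0, s = 0;
    if (std::sscanf(hms.c_str(), "%ld:%ld:%ld", &h, &m, &s) != 3) {
        return -1;
    }
    return h * 3600 + m * 60 + s;
}

// Ratio of two estimates; both arguments are assumed to be valid and nonzero.
double slowdown(const std::string& slower, const std::string& faster) {
    return static_cast<double>(toSeconds(slower)) /
           static_cast<double>(toSeconds(faster));
}
```

With the values above, slowdown("2940:06:11", "246:35:21") is roughly 12, in line with the "about 10 times slower" remark, and slowdown("23:28:41", "20:26:50") is roughly 1.15 for OpenCL versus CUDA on the GTX 460.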



Yves Gallot
Volunteer developer
Project scientist
Message 68360 - Posted: 14 Aug 2013 | 17:25:50 UTC - in response to Message 68357.

[...], it should also run on an ATI or Intel GPU just as easily.

Intel HD Graphics doesn't support double FP.

Intel CPUs do, but it's incredibly slow!
On my laptop:
geneferocl -q 212346^32768+1 -d 1
Running on platform 'Intel(R) OpenCL', device 'Intel(R) Core(TM) i3-3217U CPU @ 1.80GHz'
Estimated total run time for 212346^32768+1 is 4:44:56

geneferavx -q "212346^32768+1"
Estimated total run time for 212346^32768+1 is 0:04:04

And according to Intel's OpenCL compiler: "all kernel functions were successfully vectorized".
Does Intel's SDK emulate double FP?

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Message 68363 - Posted: 14 Aug 2013 | 17:37:06 UTC - in response to Message 68360.

Intel HD Graphics doesn't support double FP.


We can forget about Intel GPUs. Oh well.

Intel CPUs do, but it's incredibly slow!
On my laptop:
geneferocl -q 212346^32768+1 -d 1
Running on platform 'Intel(R) OpenCL', device 'Intel(R) Core(TM) i3-3217U CPU @ 1.80GHz'
Estimated total run time for 212346^32768+1 is 4:44:56

geneferavx -q "212346^32768+1"
Estimated total run time for 212346^32768+1 is 0:04:04

And according to Intel's OpenCL compiler: "all kernel functions were successfully vectorized".
Does Intel's SDK emulate double FP?


That could explain it. So, it will run... although "crawl" would be a better word than "run".

DeleteNull
Project donor
Volunteer tester
Message 68373 - Posted: 14 Aug 2013 | 19:24:20 UTC - in response to Message 68358.

Hi,

Testing with my 7950, the estimated runtime (after 3% of the work is done) is 27050 seconds. GPU load is 99%, CPU is about 1%. I am using this app_info.xml:

<app_info>
    <app>
        <name>genefer</name>
        <user_friendly_name>Genefer OCL</user_friendly_name>
    </app>
    <file_info>
        <name>geneferocl-windows.exe</name>
        <executable/>
    </file_info>
    <app_version>
        <app_name>genefer</app_name>
        <version_num>206</version_num>
        <api_version>7.0.64</api_version>
        <avg_ncpus>1.000000</avg_ncpus>
        <max_ncpus>1.000000</max_ncpus>
        <file_ref>
            <file_name>geneferocl-windows.exe</file_name>
            <main_program/>
        </file_ref>
        <platform>windows_intelx86</platform>
        <coproc>
            <type>ATI</type>
            <count>1.000000</count>
        </coproc>
    </app_version>
</app_info>

____________
DeleteNull

Roger
Volunteer moderator
Project administrator
Volunteer developer
Volunteer tester
Project scientist
Message 68402 - Posted: 15 Aug 2013 | 10:45:00 UTC - in response to Message 68352.

Latest assembla OclGenefer.cpp revision 386 code with full run on HD7970Ghz:



Are we expecting the same "B" limits as GeneferCUDA?

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Message 68405 - Posted: 15 Aug 2013 | 13:32:04 UTC - in response to Message 68402.
Last modified: 15 Aug 2013 | 13:32:39 UTC

Are we expecting the same "B" limits as GeneferCUDA?


Running the full-blown geneferocl app with the -l option, the b limits mostly look similar. Let's see if I can cut and paste something useful...

Here's GeneferCUDA:

Generalized Fermat Number b Limits
The upper bound m = 8192, b = 2650000, Err = 0.2813
The upper bound m = 16384, b = 2280000, Err = 0.2969
The upper bound m = 32768, b = 1840000, Err = 0.2969
The upper bound m = 65536, b = 1525000, Err = 0.2969
The upper bound m = 131072, b = 1270000, Err = 0.2969
The upper bound m = 262144, b = 995000, Err = 0.2813
The upper bound m = 524288, b = 815000, Err = 0.2813
The upper bound m = 1048576, b = 695000, Err = 0.3047
The upper bound m = 2097152, b = 580000, Err = 0.2969
The upper bound m = 4194304, b = 475000, Err = 0.3125
The upper bound m = 8388608, b = 400000, Err = 0.3125


GeneferOCL:

Generalized Fermat Number b Limits
The upper bound m = 8192, b = 2670000, Err = 0.2910
The upper bound m = 16384, b = 2210000, Err = 0.2969
The upper bound m = 32768, b = 1780000, Err = 0.2969
The upper bound m = 65536, b = 1505000, Err = 0.2969
The upper bound m = 131072, b = 1240000, Err = 0.2969
The upper bound m = 262144, b = 1015000, Err = 0.3047
The upper bound m = 524288, b = 825000, Err = 0.3057
The upper bound m = 1048576, b = 680000, Err = 0.3047
The upper bound m = 2097152, b = 555000, Err = 0.2969
The upper bound m = 4194304, b = 455000, Err = 0.2813
The upper bound m = 8388608, b = 385000, Err = 0.3125


The higher limits are green and the lower limits are red. As you can see, the limits are similar but not identical.
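
To put a number on "similar but not identical", the two listings can be compared row by row. The sketch below copies the limits from the output above; the struct and function names are hypothetical.

```cpp
#include <cmath>
#include <cstddef>

struct BLimit {
    long m;     // transform size
    long cuda;  // GeneferCUDA upper bound for b
    long ocl;   // GeneferOCL upper bound for b
};

// Values copied from the two "b Limits" listings above.
static const BLimit kLimits[] = {
    {8192, 2650000, 2670000},    {16384, 2280000, 2210000},
    {32768, 1840000, 1780000},   {65536, 1525000, 1505000},
    {131072, 1270000, 1240000},  {262144, 995000, 1015000},
    {524288, 815000, 825000},    {1048576, 695000, 680000},
    {2097152, 580000, 555000},   {4194304, 475000, 455000},
    {8388608, 400000, 385000},
};
static const std::size_t kLimitCount = sizeof(kLimits) / sizeof(kLimits[0]);

// Signed difference of the OpenCL limit relative to the CUDA limit, in percent.
double relativeDiffPercent(const BLimit& row) {
    return 100.0 * static_cast<double>(row.ocl - row.cuda) /
           static_cast<double>(row.cuda);
}

// Largest absolute relative difference over all rows, in percent.
double maxAbsDiffPercent() {
    double worst = 0.0;
    for (std::size_t i = 0; i < kLimitCount; ++i) {
        const double d = std::fabs(relativeDiffPercent(kLimits[i]));
        if (d > worst) worst = d;
    }
    return worst;
}

// Number of transform sizes where the OpenCL limit is the higher one.
int countOclHigher() {
    int count = 0;
    for (std::size_t i = 0; i < kLimitCount; ++i) {
        if (kLimits[i].ocl > kLimits[i].cuda) ++count;
    }
    return count;
}
```

On these figures the OpenCL limit is higher for 3 of the 11 sizes (m = 8192, 262144 and 524288), and the two programs never differ by more than about 4.3%.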

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Message 68409 - Posted: 15 Aug 2013 | 15:29:52 UTC

From the database:

Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1124.2)' and driver '1124.2 (VM)'.


Starting initialization...
Initialization complete (2.464 seconds).
Testing 216560^1048576+1...
Estimated total run time for 216560^1048576+1 is 7:39:03
216560^1048576+1 is complete. (5594760 digits) (err = 0.0469) (time = 9:48:38) 06:53:04


Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1124.2)' and driver '1124.2 (VM)'.


Starting initialization...
Initialization complete (3.205 seconds).
Testing 216878^1048576+1...
Estimated total run time for 216878^1048576+1 is 14:56:13
216878^1048576+1 is complete. (5595428 digits) (err = 0.0488) (time = 7:51:30) 14:44:38

Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1124.2)' and driver '1124.2 (VM)'.


Starting initialization...
Initialization complete (3.078 seconds).
Testing 216862^1048576+1...
Estimated total run time for 216862^1048576+1 is 7:52:07
216862^1048576+1 is complete. (5595394 digits) (err = 0.0469) (time = 8:20:41) 16:17:09


A few interesting things. The first and most obvious is that it works.

The second is that it doesn't tell you the exact device model. That's a shame.

The third is that the estimates vary by more than I'm accustomed to seeing on my system. These three results are from the same computer, but it's not mine.

Yves Gallot
Volunteer developer
Project scientist
Message 68411 - Posted: 15 Aug 2013 | 16:03:23 UTC - in response to Message 68405.

Running the full-blown geneferocl app using the -l option, [...] As you can see, the limits are similar but not identical.

Yes, because the algorithms are similar but not identical.
There are many ways to split a large FFT into smaller ones, so the round-off error of CUFFT is not identical to that of the OpenCL transform.

More surprisingly, the error of the OpenCL implementation seems to depend on the hardware...
"cl_khr_fp64" requires arithmetic to be IEEE 754-2008 compliant, and I set the flag FP_CONTRACT=OFF, which forbids the implementation from contracting expressions.
But that's not enough... hopefully round-off errors are very similar.
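
Why contraction matters for round-off can be shown on the host CPU. The sketch below is an analogy in standard C++, not OpenCL kernel code: std::fma plays the role of a contracted multiply-add (one rounding), and a volatile intermediate forces the uncontracted form (two roundings). The function names are hypothetical.

```cpp
#include <cmath>

// a*b + c computed with a single rounding, as a contracted expression would be.
double fusedMulAdd(double a, double b, double c) {
    return std::fma(a, b, c);
}

// a*b + c with the product rounded to double before the addition.
// The volatile store keeps the compiler from fusing the two operations itself.
double separateMulAdd(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}
```

For x = 1 + 2^-30, the exact square is 1 + 2^-29 + 2^-60, which rounds to p = 1 + 2^-29 in double precision. separateMulAdd(x, x, -p) then returns 0, while fusedMulAdd(x, x, -p) returns 2^-60: the same expression gives two different results depending on whether the hardware or compiler contracts it.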

DeleteNull
Project donor
Volunteer tester
Message 68412 - Posted: 15 Aug 2013 | 16:54:55 UTC - in response to Message 68409.



A few interesting things. The first and most obvious is that it works.

The second is that it doesn't tell you the exact device model. That's a shame.

The third is that the estimates vary by more than I'm accustomed to seeing on my system. These three results are from the same computer, but it's not mine.


My computer has two devices:
device 0 = HD7970
device 1 = HD7950

I started with device 1 (device 0 was running DistrRTgen); the estimated time was about 7.5 hours.
Up to 75% of the WU, the estimated time and the run time were consistent.
At 90%, device 0 ran dry (no new work), and geneferocl ran very slowly. So instead of 7.5 hours, the run time increased to 9.8 hours.

The second WU started on device 1, and after one hour the progress was about 2% (estimated time = 50 hours!).
After I started a third WU on device 0, geneferocl ran faster: the second WU finished after 7.8 hours, the third WU after 8.33 hours.

At the moment:
device 1 (7950): 54.3% at 4 hours, estimated time = 7.37 hours
device 0 (7970): 19.55% at 2.5 hours, estimated time = 11.76 hours

It seems to me that the app does not use only the declared device...
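
For reference, the 50-hour and 7.37-hour figures above are consistent with a simple linear extrapolation from progress so far. The sketch below assumes a constant rate; the function name is hypothetical and this is not necessarily how the app computes its estimate.

```python
def estimate_total_hours(elapsed_hours, percent_done):
    # Linear extrapolation: assumes the work proceeds at a constant rate.
    if percent_done <= 0:
        raise ValueError("percent_done must be positive")
    return elapsed_hours * 100.0 / percent_done

second_wu = estimate_total_hours(1.0, 2.0)    # 2% after one hour
device_1 = estimate_total_hours(4.0, 54.3)    # 54.3% after 4 hours
device_0 = estimate_total_hours(2.5, 19.55)   # 19.55% after 2.5 hours
print(second_wu, round(device_1, 2), round(device_0, 2))
```

For device 0 this formula gives about 12.8 hours, not the 11.76 hours reported, so that figure probably comes from a different estimator.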




____________
DeleteNull

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68422 - Posted: 15 Aug 2013 | 20:01:30 UTC - in response to Message 68412.

My computer has two devices:
device 0 = HD7970
device 1 = HD7950

If you start genefer in interactive mode and enter a command, it prints the list of devices that it found.
Could you copy this list? I don't know if there is one OpenCL 'platform' or two.

I downloaded DistrRTgen source code and compared OpenCL initialisation.
I found a difference and modified genefer code (now, initialisations are similar). Mike, could you update GeneferOCL?

I noticed that DistrRTgen uses boinc_get_opencl_ids() to get platform and device ids. Mike, do you know if we should use this function?

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68424 - Posted: 15 Aug 2013 | 20:20:58 UTC - in response to Message 68422.
Last modified: 15 Aug 2013 | 20:24:53 UTC

I downloaded DistrRTgen source code and compared OpenCL initialisation.
I found a difference and modified genefer code (now, initialisations are similar). Mike, could you update GeneferOCL?


Certainly.

I noticed that DistrRTgen uses boinc_get_opencl_ids() to get platform and device ids. Mike, do you know if we should use this function?


I don't know anything about it, unfortunately. I do recall some chatter about how much fun it is to get the right OpenCL devices, but I don't remember if that was BOINC related or PRPNet related.

My guess would be that the function is there to solve a problem related to selecting the right device, so chances are it's a good idea to use it.

EDIT: I'm going to bump the variant (-#) version number, so people can tell one version from another. We'll probably use a lot of them, but that's ok.
____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68427 - Posted: 15 Aug 2013 | 20:57:23 UTC

The latest OpenCL beta build is available for download. Its version number is 3.1.2-1, and it can be downloaded via this thread.

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68430 - Posted: 15 Aug 2013 | 21:07:08 UTC

On my system (Core2Quad and GTX 460) GeneferCUDA uses almost no CPU at all. GeneferOCL seems to use an entire CPU core as well as the entire GPU.

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68433 - Posted: 15 Aug 2013 | 22:00:45 UTC - in response to Message 68430.

On my system (Core2Quad and GTX 460) GeneferCUDA uses almost no CPU at all. GeneferOCL seems to use an entire CPU core as well as the entire GPU.

Yes, this is a bug in the NVidia driver: https://forums.geforce.com/default/topic/543115/opencl-driver-support-for-fah/
NVidia also removed the OpenCL documentation from its SDK and still uses OpenCL 1.1 (the OpenCL 1.2 specifications were defined in November 2011!).
It is clear today that NVidia doesn't actively support OpenCL.

DeleteNull
Project donor
Volunteer tester
Joined: 6 Apr 06
Posts: 226
ID: 2663
Credit: 5,102,624,386
RAC: 136,622
Message 68435 - Posted: 15 Aug 2013 | 22:27:18 UTC - in response to Message 68433.

Hi Yves,

the output with both versions of geneferocl-windows.exe is:

Device List:
0: GPU device 'Tahiti' on 'AMD Accelerated Parallel Processing'.
1: GPU device 'Tahiti' on 'AMD Accelerated Parallel Processing'.
2: CPU device ' Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz' on 'AMD Accelerated Parallel Processing'.
____________
DeleteNull

chip
Joined: 12 Apr 11
Posts: 128
ID: 94709
Credit: 164,082,201
RAC: 5,606
Message 68440 - Posted: 16 Aug 2013 | 5:08:56 UTC

Running on platform 'NVIDIA CUDA', device 'GeForce GTX 580', version 'OpenCL 1.1 CUDA' and driver '326.58'.

geneferocl 3.1.2-1 (Windows 64-bit OpenGL)
2199064^8192+1 Time: 68.5 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 72.4 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 76.2 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 114 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 198 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 411 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 793 us/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 1.59 ms/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 3.29 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 6.83 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 14.6 ms/mul. Err: 0.1895 45879398 digits


genefercuda 3.1.2-2 (Windows 32-bit CUDA)
2199064^8192+1 Time: 87.5 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 98.1 us/mul. Err: 0.2188 102481 digits
1471094^32768+1 Time: 114 us/mul. Err: 0.2031 202102 digits
1203210^65536+1 Time: 168 us/mul. Err: 0.2031 398482 digits
984108^131072+1 Time: 312 us/mul. Err: 0.2070 785521 digits
804904^262144+1 Time: 519 us/mul. Err: 0.2031 1548156 digits
658332^524288+1 Time: 945 us/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 1.65 ms/mul. Err: 0.2031 6009544 digits
440400^2097152+1 Time: 3.17 ms/mul. Err: 0.1953 11836006 digits
360204^4194304+1 Time: 6.58 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 14.1 ms/mul. Err: 0.1797 45879398 digits
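
The digit counts in these benchmark lines can be cross-checked with the usual formula for the number of decimal digits of b^N+1, namely floor(N*log10(b)) + 1. The sketch below is only a sanity check with a hypothetical function name. Genefer's own digit computation is not shown in this thread and may differ in edge cases, for example when b is an exact power of ten.

```python
import math

def decimal_digits(b, n):
    # Number of decimal digits of b**n + 1, assuming b is not a power of ten
    # (adding 1 to b**n then never changes the digit count).
    return math.floor(n * math.log10(b)) + 1

first_row = decimal_digits(2199064, 8192)
print(first_row)
```

The value for the first benchmark row agrees with the "51956 digits" printed by both applications.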

Roger
Volunteer moderator
Project administrator
Volunteer developer
Volunteer tester
Project scientist
Joined: 27 Nov 11
Posts: 1112
ID: 120786
Credit: 261,530,863
RAC: 1,408
Message 68441 - Posted: 16 Aug 2013 | 5:55:41 UTC - in response to Message 68405.

Are we expecting the same "B" limits as GeneferCUDA?


Running the full blown geneferocl app using the -l option, the b limits mostly look to be similar. Let's see if I can cut and paste something useful...

GeneferOCL:

Generalized Fermat Number b Limits
The upper bound m = 8192, b = 2670000, Err = 0.2910
The upper bound m = 16384, b = 2210000, Err = 0.2969
The upper bound m = 32768, b = 1780000, Err = 0.2969
The upper bound m = 65536, b = 1505000, Err = 0.2969
The upper bound m = 131072, b = 1240000, Err = 0.2969
The upper bound m = 262144, b = 1015000, Err = 0.3047
The upper bound m = 524288, b = 825000, Err = 0.3057
The upper bound m = 1048576, b = 680000, Err = 0.3047
The upper bound m = 2097152, b = 555000, Err = 0.2969
The upper bound m = 4194304, b = 455000, Err = 0.2813
The upper bound m = 8388608, b = 385000, Err = 0.3125


The higher limits are green and the lower limits are red. As you can see, the limits are similar but not identical.


I got exactly the same "B" limits testing GeneferOCL -l on the HD7970Ghz.

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68450 - Posted: 16 Aug 2013 | 12:49:59 UTC

Some of DeleteNull's GeneferOCL tasks have now been validated (one with a CPU, the other with a Tesla GPU), so we now officially have our first credit awarded to GeneferOCL and an ATI/AMD GPU.

This is a milestone many people have been waiting for.

Many thanks and congratulations to Yves for the great achievement! (This is, of course, just the latest advance for which Yves deserves our thanks.)

Carlos Augusto Engel
Joined: 2 Jul 10
Posts: 1
ID: 63166
Credit: 301,093,567
RAC: 19,633
Message 68453 - Posted: 16 Aug 2013 | 17:39:18 UTC

Results using an HD 7970 on an i7 with Windows 7 64-bit:

C:\Users\user\Downloads>geneferocl-windows.exe -b
geneferocl 3.1.2-1 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -b


Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1272.2)' and driver '1272.2 (VM)'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 70.8 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 73.9 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 74.5 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 87.9 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 122 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 322 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 820 us/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 1.56 ms/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 2.81 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 5.63 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 11.3 ms/mul. Err: 0.1895 45879398 digits
Genefer Mark = 865.



C:\Users\user\Downloads>geneferocl-windows.exe -b3
geneferocl 3.1.2-1 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -b3


Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1272.2)' and driver '1272.2 (VM)'.

14^32768+1 37557 digits 0 days 0.0 hours (0.10 ms/mul, 124758 iterations) 5772 GFLOPS
75898^32768+1 159916 digits 0 days 0.0 hours (0.08 ms/mul, 531226 iterations) 19240 GFLOPS
700000^32768+1 191533 digits 0 days 0.0 hours (0.08 ms/mul, 636255 iterations) 23088 GFLOPS
5000000^32768+1 219512 digits 0 days 0.0 hours (0.08 ms/mul, 729201 iterations) 26455 GFLOPS

14^65536+1 75113 digits 0 days 0.0 hours (0.09 ms/mul, 249517 iterations) 10101 GFLOPS
75898^65536+1 319831 digits 0 days 0.0 hours (0.09 ms/mul, 1062453 iterations) 44252 GFLOPS
710000^65536+1 383469 digits 0 days 0.0 hours (0.09 ms/mul, 1273852 iterations) 52910 GFLOPS
2500000^65536+1 419296 digits 0 days 0.0 hours (0.09 ms/mul, 1392868 iterations) 58201 GFLOPS

14^131072+1 150226 digits 0 days 0.0 hours (0.15 ms/mul, 499036 iterations) 36075 GFLOPS
75898^131072+1 639662 digits 0 days 0.0 hours (0.12 ms/mul, 2124908 iterations) 124579 GFLOPS
700000^131072+1 766129 digits 0 days 0.0 hours (0.12 ms/mul, 2545023 iterations) 150553 GFLOPS
1000000^131072+1 786432 digits 0 days 0.0 hours (0.12 ms/mul, 2612469 iterations) 152958 GFLOPS

14^262144+1 300451 digits 0 days 0.0 hours (0.32 ms/mul, 998074 iterations) 152477 GFLOPS
75898^262144+1 1279324 digits 0 days 0.3 hours (0.33 ms/mul, 4249818 iterations) 670033 GFLOPS
468750^262144+1 1486604 digits 0 days 0.4 hours (0.32 ms/mul, 4938388 iterations) 750360 GFLOPS
815000^262144+1 1549575 digits 0 days 0.4 hours (0.33 ms/mul, 5147574 iterations) 809523 GFLOPS

14^524288+1 600902 digits 0 days 0.4 hours (0.82 ms/mul, 1996149 iterations) 790764 GFLOPS
75898^524288+1 2558647 digits 0 days 1.9 hours (0.84 ms/mul, 8499637 iterations) 3450213 GFLOPS
468750^524288+1 2973207 digits 0 days 2.1 hours (0.80 ms/mul, 9876777 iterations) 3800381 GFLOPS
710000^524288+1 3067745 digits 0 days 2.3 hours (0.84 ms/mul, 10190825 iterations) 4112550 GFLOPS



C:\Users\user\Downloads>more genefer.bench
Generalized Fermat Number Bench
2199064^8192+1 Time: 70.8 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 73.9 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 74.5 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 87.9 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 122 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 322 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 820 us/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 1.56 ms/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 2.81 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 5.63 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 11.3 ms/mul. Err: 0.1895 45879398 digits
Genefer Mark = 865.

____________

Death[Kiev]
Volunteer tester
Joined: 12 Jun 08
Posts: 126
ID: 24043
Credit: 3,622,648
RAC: 0
Message 68454 - Posted: 16 Aug 2013 | 17:43:28 UTC

Running on platform 'AMD Accelerated Parallel Processing', device 'AMD Athlon(tm) 64 X2 Dual Core Processor 4200+', version 'OpenCL 1.2 AMD-APP (1124.2)' and driver '1124.2 (sse2)'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 8.73 ms/mul. Err: 0.2344 51956 digits



ATI 3550 or something like that.....

____________
wbr, Me. Dead J. Dona

DeleteNull
Project donor
Volunteer tester
Joined: 6 Apr 06
Posts: 226
ID: 2663
Credit: 5,102,624,386
RAC: 136,622
Message 68455 - Posted: 16 Aug 2013 | 17:58:39 UTC - in response to Message 68409.


The second is that it doesn't tell you the exact device model. That's a shame.


Why does the program report only 'Tahiti' for both the HD7950 and the HD7970?

I ran a test with a little Java program (it works with the OpenCL DLLs):
querying CL_DEVICE_NAME gives back
GeForce GTS 450 for my little NVIDIA
Tahiti for my HD7950
Tahiti for my HD7970

So, if the name is Tahiti, we also have to query CL_DEVICE_MAX_COMPUTE_UNITS:
Tahiti, 32 => HD7970
Tahiti, 28 => HD7950
Tahiti, 24 => HD7870 Boost Edition
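
The lookup described above could be sketched as follows. This is only an illustration in Python: the function and table names are hypothetical, and the two arguments stand for the values an application would obtain by querying CL_DEVICE_NAME and CL_DEVICE_MAX_COMPUTE_UNITS. The table contains only the three combinations listed in this post.

```python
# Compute-unit counts for 'Tahiti' devices, as listed in the post above.
TAHITI_MODELS = {
    32: "HD7970",
    28: "HD7950",
    24: "HD7870 Boost Edition",
}

def describe_device(device_name, compute_units):
    # Refine the generic chip name only when it is ambiguous ('Tahiti').
    if device_name == "Tahiti":
        fallback = f"Tahiti ({compute_units} compute units)"
        return TAHITI_MODELS.get(compute_units, fallback)
    return device_name

print(describe_device("Tahiti", 32))
print(describe_device("Tahiti", 28))
print(describe_device("GeForce GTS 450", 4))
```

Unknown compute-unit counts fall back to the chip name plus the count, so an unlisted card is still reported without guessing its model.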



____________
DeleteNull

Husu*
Joined: 16 Jan 12
Posts: 15
ID: 127298
Credit: 165,338,156
RAC: 0
Message 68456 - Posted: 16 Aug 2013 | 18:17:24 UTC

For reference, it seems that the OpenCL version is faster than the CUDA version on the GTX Titan.

First runs with Double Precision ENABLED.

CUDA:

genefercuda-windows.exe -b
genefercuda 3.1.2-2 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: genefercuda-windows.exe -b

Generalized Fermat Number Bench
2199064^8192+1 Time: 135 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 157 us/mul. Err: 0.2188 102481 digits
1471094^32768+1 Time: 173 us/mul. Err: 0.2500 202102 digits
1203210^65536+1 Time: 208 us/mul. Err: 0.2352 398482 digits
984108^131072+1 Time: 331 us/mul. Err: 0.5000 785521 digits
804904^262144+1 Time: 500 us/mul. Err: 0.2227 1548156 digits
658332^524288+1 Time: 889 us/mul. Err: 0.2500 3050541 digits
538452^1048576+1 Time: 1.75 ms/mul. Err: 0.2031 6009544 digits
440400^2097152+1 Time: 3.23 ms/mul. Err: 0.2051 11836006 digits
360204^4194304+1 Time: 6.09 ms/mul. Err: 0.2167 23305854 digits
294612^8388608+1 Time: 13.3 ms/mul. Err: 0.1797 45879398 digits

OpenCL:

geneferocl-windows.exe -b
geneferocl 3.1.2-1 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX TITAN', version 'OpenCL 1.1 CUDA' and driver '320.49'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 82.5 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 80.3 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 85.2 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 97.7 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 134 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 279 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 465 us/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 859 us/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 1.63 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 3.33 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 6.88 ms/mul. Err: 0.1895 45879398 digits
Genefer Mark = 1487.

----------

Double Precision DISABLED.

CUDA:

genefercuda-windows.exe -b
genefercuda 3.1.2-2 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: genefercuda-windows.exe -b

Generalized Fermat Number Bench
2199064^8192+1 Time: 146 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 153 us/mul. Err: 0.2188 102481 digits
1471094^32768+1 Time: 177 us/mul. Err: 0.2500 202102 digits
1203210^65536+1 Time: 229 us/mul. Err: 0.2352 398482 digits
984108^131072+1 Time: 425 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 646 us/mul. Err: 0.2227 1548156 digits
658332^524288+1 Time: 1.08 ms/mul. Err: 0.2500 3050541 digits
538452^1048576+1 Time: 2.02 ms/mul. Err: 0.2031 6009544 digits
440400^2097152+1 Time: 3.66 ms/mul. Err: 0.2051 11836006 digits
360204^4194304+1 Time: 6.86 ms/mul. Err: 0.2167 23305854 digits
294612^8388608+1 Time: 14.7 ms/mul. Err: 0.1797 45879398 digits


OpenCL:

geneferocl-windows.exe -b
geneferocl 3.1.2-1 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX TITAN', version 'OpenCL 1.1 CUDA' and driver '320.49'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 88.7 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 82.5 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 98.5 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 125 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 180 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 346 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 609 us/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 1.15 ms/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 2.28 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 4.64 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 9.69 ms/mul. Err: 0.1895 45879398 digits
Genefer Mark = 1072.

----------

I'm switching my main computer to a Haswell system, so I can't run these again on the same platform. I can post new results after the motherboard/CPU change (ASUS P8Z68-V PRO GEN3 & i7 2600K -> ASUS Z87 Maximus VI Gene & i5-4670k).

I used the precompiled executables from http://www.primegrid.com/forum_thread.php?id=4889#63012, in case they have since been updated (they should be the same ones that are in the PrimeGrid BOINC beta).

Profile Michael GoetzProject donor
Volunteer moderator
Project administrator
Project scientist
Avatar
Send message
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68460 - Posted: 16 Aug 2013 | 18:50:05 UTC - in response to Message 68454.

Running on platform 'AMD Accelerated Parallel Processing', device 'AMD Athlon(tm) 64 X2 Dual Core Processor 4200+', version 'OpenCL 1.2 AMD-APP (1124.2)' and driver '1124.2 (sse2)'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 8.73 ms/mul. Err: 0.2344 51956 digits



ATI 3550 or something like that.....


It's running on the CPU.
____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

Profile Michael GoetzProject donor
Volunteer moderator
Project administrator
Project scientist
Avatar
Send message
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68462 - Posted: 16 Aug 2013 | 19:03:37 UTC

If anyone builds a Linux or Mac build of GeneferOCL before we do, the source has been changed to 3.1.2-2, which has a few minor tweaks to the utility functions. The only significant change is that under Linux and Mac, the benchmarks should produce the correct results.
____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

Profile Crystal PelletProject donor
Avatar
Send message
Joined: 9 Nov 08
Posts: 180
ID: 31494
Credit: 77,230,917
RAC: 390
Message 68463 - Posted: 16 Aug 2013 | 19:21:46 UTC - in response to Message 68461.

I'm testing on an ATI Radeon HD 7770 stock speed with one CPU core reserved -> http://www.primegrid.com/show_host_detail.php?hostid=407824

Estimated elapsed time 27½ hours. CPU run time 82% of elapsed time
____________

Profile VictordeHollanderProject donor
Send message
Joined: 13 Jan 11
Posts: 25
ID: 81079
Credit: 300,776,184
RAC: 37
Message 68465 - Posted: 16 Aug 2013 | 20:23:18 UTC

Testing on a HD7950 with 1 CPU core reserved, estimated time for a WU ~28,000 sec.

Profile VictordeHollanderProject donor
Send message
Joined: 13 Jan 11
Posts: 25
ID: 81079
Credit: 300,776,184
RAC: 37
Message 68469 - Posted: 16 Aug 2013 | 20:48:52 UTC - in response to Message 68456.

For reference, it seems that OpenCL version is faster than CUDA on GTX Titan.

CUDA:
538452^1048576+1 Time: 1.75 ms/mul. Err: 0.2031 6009544 digits

OpenCL:
538452^1048576+1 Time: 859 us/mul. Err: 0.2266 6009544 digits

Wow, OpenCL is almost twice as fast on your Titan! That is totally unexpected.
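To put a number on "almost twice as fast", here is a small sketch that parses two bench lines in the format quoted above and computes the ratio. The regular expression and the function name `time_per_mul_us` are only illustrative; they are not part of genefer.

```python
import re

# Matches lines such as
# "538452^1048576+1 Time: 859 us/mul. Err: 0.2266 6009544 digits".
BENCH_LINE = re.compile(
    r"(?P<b>\d+)\^(?P<n>\d+)\+1\s+Time:\s+(?P<value>[\d.]+)\s+(?P<unit>us|ms)/mul"
)

def time_per_mul_us(line):
    """Return (b, N, microseconds per multiplication) for one bench line."""
    match = BENCH_LINE.search(line)
    if match is None:
        raise ValueError(f"not a bench line: {line!r}")
    # The bench output switches from us to ms for the larger exponents.
    scale = 1000.0 if match["unit"] == "ms" else 1.0
    return int(match["b"]), int(match["n"]), float(match["value"]) * scale

cuda = time_per_mul_us("538452^1048576+1 Time: 1.75 ms/mul. Err: 0.2031 6009544 digits")
opencl = time_per_mul_us("538452^1048576+1 Time: 859 us/mul. Err: 0.2266 6009544 digits")
speedup = cuda[2] / opencl[2]
print(f"OpenCL speedup over CUDA: {speedup:.2f}x")
```

The same function can be run over every line of a `-b` listing to see how the ratio changes with the exponent.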

Yves Gallot
Volunteer developer
Project scientist
Send message
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68470 - Posted: 16 Aug 2013 | 21:50:31 UTC - in response to Message 68469.

Wow, OpenCL is almost twice as fast on your Titan! That is totally unexpected.

No, that's not unexpected :o)
First, my graphics card is a Kepler, so I optimized the current version for this architecture.
Second, the maximum number of registers per thread and per multiprocessor is (I think) a major bottleneck. With 255 registers per thread, the new Kepler GK110 removed this limitation.

It would be interesting to compare the real GFLOPS of genefer with the peak GFLOPS of GPUs.
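A rough way to get a "real GFLOPS" figure from a bench time, as a sketch only. It assumes that one multiplication costs two transforms of length N (forward and inverse) and that each transform needs about 5·N·log2(N) floating point operations, the usual textbook estimate for an FFT. Neither number comes from the genefer source, so treat `ffts_per_mul` and `flops_per_point` as hypothetical parameters to adjust.

```python
import math

def estimated_gflops(n, seconds_per_mul, ffts_per_mul=2, flops_per_point=5.0):
    """Rough sustained GFLOPS for one multiplication of size n.

    Assumes each multiplication costs `ffts_per_mul` transforms of length n,
    each about flops_per_point * n * log2(n) operations. Both parameters are
    assumptions for illustration, not genefer's real operation count.
    """
    flops = ffts_per_mul * flops_per_point * n * math.log2(n)
    return flops / seconds_per_mul / 1e9

# GTX TITAN, OpenCL, N = 1048576 at 859 us/mul (figure taken from this thread).
sustained = estimated_gflops(1048576, 859e-6)
print(f"estimated sustained rate: {sustained:.0f} GFLOPS")
```

Dividing the result by the vendor's FP64 peak for the card would give the efficiency; the peak is not quoted in this thread, so it is left out here.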

Profile Peciak
Avatar
Send message
Joined: 21 Jul 09
Posts: 17
ID: 43788
Credit: 349,954,843
RAC: 261
Message 68472 - Posted: 16 Aug 2013 | 22:06:40 UTC

ATI 7970
Completed and validated
Run time 26,231.64
CPU time 491.36
Credit 29,125.69

Yves Gallot
Volunteer developer
Project scientist
Send message
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68474 - Posted: 16 Aug 2013 | 22:27:11 UTC

I tuned geneferocl. It's a bit faster on my computer for large exponents.

I committed the new parameters on assembla, in the branch [...]\branches\yves\2013\OclGenefer.

I think that it is faster on a 'Tahiti', but only a real experiment can answer the question. Could someone please compile it and run the bench on a HD79x0?

Mike, could you also test it on your GTX 460? I don't know why, but your card doesn't like my algorithm :o) If it is the card that found 75898^524288+1 with GeneferCUDA, I understand it!

If the new tuning is faster on Tahiti and Fermi, I will update the real genefer.

Thanks, Yves

Profile Michael GoetzProject donor
Volunteer moderator
Project administrator
Project scientist
Avatar
Send message
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68475 - Posted: 16 Aug 2013 | 22:34:50 UTC

The GeneferOCL we've been running is a 64 bit version. Until today, I had not tried running a 32 bit version. I would expect the 32 bit and 64 bit builds to run at the same speed, and in fact their benchmark speeds are the same. This behavior is consistent with GeneferCUDA.

However, the B limits are NOT the same.

64 bit GeneferOCL:

Command line: geneferocl-windows-x64.exe -l


Running on platform 'NVIDIA CUDA', device 'GeForce GTX 460', version 'OpenCL 1.1 CUDA' and driver '320.57'.

Generalized Fermat Number b Limits
The upper bound m = 8192, b = 2670000, Err = 0.2910
The upper bound m = 16384, b = 2210000, Err = 0.2969
The upper bound m = 32768, b = 1780000, Err = 0.2969
The upper bound m = 65536, b = 1505000, Err = 0.2969
The upper bound m = 131072, b = 1240000, Err = 0.2969
The upper bound m = 262144, b = 1015000, Err = 0.3047
The upper bound m = 524288, b = 825000, Err = 0.3057
The upper bound m = 1048576, b = 680000, Err = 0.3047
The upper bound m = 2097152, b = 555000, Err = 0.2969
The upper bound m = 4194304, b = 455000, Err = 0.2813
The upper bound m = 8388608, b = 385000, Err = 0.3125


32 bit GeneferOCL:

Command line: geneferocl-windows-x86.exe -l


Running on platform 'NVIDIA CUDA', device 'GeForce GTX 460', version 'OpenCL 1.1 CUDA' and driver '320.57'.

Generalized Fermat Number b Limits
The upper bound m = 8192, b = 2720000, Err = 0.2969
The upper bound m = 16384, b = 2210000, Err = 0.2969
The upper bound m = 32768, b = 1830000, Err = 0.3008
The upper bound m = 65536, b = 1490000, Err = 0.2969
The upper bound m = 131072, b = 1235000, Err = 0.3008
The upper bound m = 262144, b = 1015000, Err = 0.2891
The upper bound m = 524288, b = 840000, Err = 0.3047
The upper bound m = 1048576, b = 690000, Err = 0.3008
The upper bound m = 2097152, b = 565000, Err = 0.3066
The upper bound m = 4194304, b = 470000, Err = 0.3125
The upper bound m = 8388608, b = 385000, Err = 0.3125


While I wasn't surprised that there were small differences between GeneferCUDA and GeneferOCL, this does surprise me a bit because the math done on the GPU shouldn't be affected by the integer size on the CPU, especially since that doesn't affect the double precision floating point on either the CPU or the GPU.
____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1
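For anyone who wants to see the two `-l` listings side by side, the sketch below copies the b limits from the 64 bit and 32 bit GTX 460 runs above and prints the difference for each m. The dictionary and function names are illustrative only.

```python
# b limits copied from the two "-l" runs above (GTX 460), keyed by m.
LIMITS_64 = {
    8192: 2670000, 16384: 2210000, 32768: 1780000, 65536: 1505000,
    131072: 1240000, 262144: 1015000, 524288: 825000, 1048576: 680000,
    2097152: 555000, 4194304: 455000, 8388608: 385000,
}
LIMITS_32 = {
    8192: 2720000, 16384: 2210000, 32768: 1830000, 65536: 1490000,
    131072: 1235000, 262144: 1015000, 524288: 840000, 1048576: 690000,
    2097152: 565000, 4194304: 470000, 8388608: 385000,
}

def compare(limits_a, limits_b):
    """Return {m: limits_b[m] - limits_a[m]} for every m present in both."""
    return {m: limits_b[m] - limits_a[m] for m in sorted(limits_a) if m in limits_b}

diff = compare(LIMITS_64, LIMITS_32)
for m, delta in diff.items():
    print(f"m = {m:>8}  32-bit minus 64-bit = {delta:+d}")
```

The sign flips between sizes (for example m = 65536 versus m = 524288), and the step between tested b values appears to be 5000, so differences of this size may be close to the resolution of the test.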

Yves Gallot
Volunteer developer
Project scientist
Send message
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68476 - Posted: 16 Aug 2013 | 22:45:42 UTC - in response to Message 68475.

While I wasn't surprised that there were small differences between GeneferCUDA and GeneferOCL, this does surprise me a bit because the math done on the GPU shouldn't be affected by the integer size on the CPU, especially since that doesn't affect the double precision floating point on either the CPU or the GPU.

The cos/sin tables are computed during initialisation by the CPU.
I think that this is more accurate than using the GPU sin/cos functions.

Win32 binaries still use FP80 for internal computation. x64 uses SSE2 and therefore FP64. If you compile the win32 app with "SSE2 instruction set", the results may be identical...?
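To illustrate why the precision of intermediate values on the CPU can change the table, here is a small sketch. Python has no portable FP80 type, so FP32 and FP64 stand in for FP64 and FP80; this is an analogy, not a reproduction of what the Win32 and x64 builds do. The function names are hypothetical.

```python
import math
import struct

def to_f32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def table_narrow(n):
    """cos table where the angle is rounded to FP32 before calling cos."""
    return [to_f32(math.cos(to_f32(2.0 * math.pi * k / n))) for k in range(n)]

def table_wide(n):
    """cos table computed in FP64 and rounded to FP32 once, at the end."""
    return [to_f32(math.cos(2.0 * math.pi * k / n)) for k in range(n)]

n = 1024
narrow, wide = table_narrow(n), table_wide(n)
differing = sum(1 for a, b in zip(narrow, wide) if a != b)
worst = max(abs(a - b) for a, b in zip(narrow, wide))
print(f"{differing} of {n} entries differ, worst difference {worst:.3g}")
```

Both tables are stored at the same final precision, yet some entries differ in their last bits because one path rounded earlier. Small differences of this kind in the twiddle factors could plausibly shift the round-off error of the transform slightly.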

Profile Michael GoetzProject donor
Volunteer moderator
Project administrator
Project scientist
Avatar
Send message
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68477 - Posted: 17 Aug 2013 | 0:30:05 UTC - in response to Message 68476.
Last modified: 17 Aug 2013 | 0:32:43 UTC

While I wasn't surprised that there were small differences between GeneferCUDA and GeneferOCL, this does surprise me a bit because the math done on the GPU shouldn't be affected by the integer size on the CPU, especially since that doesn't affect the double precision floating point on either the CPU or the GPU.

The cos/sin tables are computed during initialisation by the CPU.
I think that this is more accurate than using the GPU sin/cos functions.

Win32 binaries still use FP80 for internal computation. x64 uses SSE2 and therefore FP64. If you compile the win32 app with "SSE2 instruction set", the results may be identical...?


Not at all. With SSE2 enabled on the 32 bit build, none of the B limits changed, except for N=19 through 22, where all went up by 5000. Those 4 B limits were already higher than the 64 bit versions, so with SSE2 the gap was slightly larger.

There's no obvious pattern to which version (32 or 64 bit) is better here, so I suspect that the differences might reflect inaccuracies in the B limit testing process rather than actual differences in precision.

EDIT: For future builds, I'm going to stay with the 32 bit (SSE2) version because it will run on either 32 or 64 bit platforms. The -l limit test seems to indicate that the limits are higher at the N values we care about, so it's also a better choice from that perspective.
____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

Husu*
Avatar
Send message
Joined: 16 Jan 12
Posts: 15
ID: 127298
Credit: 165,338,156
RAC: 0
Message 68478 - Posted: 17 Aug 2013 | 0:38:07 UTC

I'll probably run your OpenCL version instead of CUDA on this card, if the results also prove correct on full runs :)

I got my Haswell system put together; I intend to use it mostly for BOINC when it's not otherwise in use. This is the same card with a different motherboard, CPU and memory. The platform change makes little difference.

GF WR bugged out with the Titan and 690s in the same computer, so I had to build a new one just for the Titan :S

These were run at stock speeds on both CPU and GPU.

-----

Double precision ENABLED.

CUDA:

genefercuda-windows.exe -b
genefercuda 3.1.2-2 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: genefercuda-windows.exe -b

Generalized Fermat Number Bench
2199064^8192+1 Time: 131 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 141 us/mul. Err: 0.2188 102481 digits
1471094^32768+1 Time: 159 us/mul. Err: 0.2500 202102 digits
1203210^65536+1 Time: 206 us/mul. Err: 0.2352 398482 digits
984108^131072+1 Time: 328 us/mul. Err: 0.5000 785521 digits
804904^262144+1 Time: 477 us/mul. Err: 0.2227 1548156 digits
658332^524288+1 Time: 793 us/mul. Err: 0.2500 3050541 digits
538452^1048576+1 Time: 1.61 ms/mul. Err: 0.2031 6009544 digits
440400^2097152+1 Time: 2.96 ms/mul. Err: 0.2051 11836006 digits
360204^4194304+1 Time: 5.75 ms/mul. Err: 0.2167 23305854 digits
294612^8388608+1 Time: 12.8 ms/mul. Err: 0.1797 45879398 digits


OpenCL, with the two different versions 3.1.2-1 and 3.1.2-2:

geneferocl-windows.exe -b
geneferocl 3.1.2-1 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX TITAN', version 'OpenCL 1.1 CUDA' and driver '320.49'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 75.2 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 76.5 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 81.2 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 95.5 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 131 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 275 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 459 us/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 855 us/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 1.61 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 3.33 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 6.84 ms/mul. Err: 0.1895 45879398 digits
Genefer Mark = 1494.


geneferocl-windows.exe -b
geneferocl 3.1.2-2 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX TITAN', version 'OpenCL 1.1 CUDA' and driver '320.49'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 75.3 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 76.4 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 81.1 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 95.5 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 131 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 275 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 459 us/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 852 us/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 1.6 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 3.31 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 6.94 ms/mul. Err: 0.1895 45879398 digits
Genefer Mark = 1494.

-----

And here are the -b3 runs. The difference in estimated time gets huge at the larger exponents.

CUDA:

genefercuda 3.1.2-2 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
14^262144+1 300451 digits 0 days 0.1 hours (0.48 ms/mul, 998074 iterations) 230880 GFLOPS
75898^262144+1 1279324 digits 0 days 0.5 hours (0.48 ms/mul, 4249818 iterations) 972582 GFLOPS
468750^262144+1 1486604 digits 0 days 0.6 hours (0.48 ms/mul, 4938388 iterations) 1132755 GFLOPS
815000^262144+1 1549575 digits 0 days 0.6 hours (0.47 ms/mul, 5147574 iterations) 1176045 GFLOPS

14^524288+1 600902 digits 0 days 0.4 hours (0.80 ms/mul, 1996149 iterations) 762866 GFLOPS
75898^524288+1 2558647 digits 0 days 1.8 hours (0.80 ms/mul, 8499637 iterations) 3253965 GFLOPS
468750^524288+1 2973207 digits 0 days 2.1 hours (0.80 ms/mul, 9876777 iterations) 3785951 GFLOPS
710000^524288+1 3067745 digits 0 days 2.2 hours (0.80 ms/mul, 10190825 iterations) 3901391 GFLOPS

14^1048576+1 1201803 digits 0 days 1.8 hours (1.63 ms/mul, 3992299 iterations) 3127943 GFLOPS
75898^1048576+1 5117293 digits 0 days 7.5 hours (1.60 ms/mul, 16999276 iterations) 13082238 GFLOPS
468750^1048576+1 5946413 digits 0 days 8.8 hours (1.62 ms/mul, 19753555 iterations) 15344381 GFLOPS
700000^1048576+1 6129030 digits 0 days 9.0 hours (1.61 ms/mul, 20360194 iterations) 15757079 GFLOPS

14^2097152+1 2403605 digits 0 days 6.5 hours (2.94 ms/mul, 7984600 iterations) 11298690 GFLOPS
75898^2097152+1 10234585 digits 1 days 3.7 hours (2.94 ms/mul, 33998553 iterations) 48078355 GFLOPS
380742^2097152+1 11703432 digits 1 days 7.7 hours (2.94 ms/mul, 38877955 iterations) 54960022 GFLOPS
570000^2097152+1 12070945 digits 1 days 8.7 hours (2.94 ms/mul, 40098808 iterations) 56705090 GFLOPS

14^4194304+1 4807210 digits 1 days 1.2 hours (5.68 ms/mul, 15969202 iterations) 43659408 GFLOPS
1248^4194304+1 12986466 digits 2 days 19.7 hours (5.66 ms/mul, 43140102 iterations) 117384683 GFLOPS
10000^4194304+1 16777217 digits 3 days 15.8 hours (5.67 ms/mul, 55732704 iterations) 152105187 GFLOPS
50000^4194304+1 19708909 digits 4 days 6.8 hours (5.66 ms/mul, 65471576 iterations) 178148932 GFLOPS
150000^4194304+1 21710101 digits 4 days 17.3 hours (5.66 ms/mul, 72119391 iterations) 196272531 GFLOPS
309258^4194304+1 23028076 digits 5 days 0.2 hours (5.66 ms/mul, 76497608 iterations) 208261456 GFLOPS
480000^4194304+1 23828853 digits 5 days 4.8 hours (5.68 ms/mul, 79157734 iterations) 216188817 GFLOPS

14^8388608+1 9614419 digits 4 days 13.2 hours (12.31 ms/mul, 31938406 iterations) 189172009 GFLOPS
36^8388608+1 13055212 digits 6 days 4.4 hours (12.32 ms/mul, 43368473 iterations) 257018502 GFLOPS
100^8388608+1 16777217 digits 7 days 22.9 hours (12.33 ms/mul, 55732704 iterations) 330588895 GFLOPS


OpenCL:


geneferocl 3.1.2-2 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Running on platform 'NVIDIA CUDA', device 'GeForce GTX TITAN', version 'OpenCL 1.1 CUDA' and driver '320.49'.

14^32768+1 37557 digits 0 days 0.0 hours (0.09 ms/mul, 124758 iterations) 4810 GFLOPS
75898^32768+1 159916 digits 0 days 0.0 hours (0.08 ms/mul, 531226 iterations) 21164 GFLOPS
700000^32768+1 191533 digits 0 days 0.0 hours (0.09 ms/mul, 636255 iterations) 25974 GFLOPS
5000000^32768+1 219512 digits 0 days 0.0 hours (0.09 ms/mul, 729201 iterations) 29341 GFLOPS

14^65536+1 75113 digits 0 days 0.0 hours (0.10 ms/mul, 249517 iterations) 11544 GFLOPS
75898^65536+1 319831 digits 0 days 0.0 hours (0.10 ms/mul, 1062453 iterations) 48100 GFLOPS
710000^65536+1 383469 digits 0 days 0.0 hours (0.10 ms/mul, 1273852 iterations) 58682 GFLOPS
2500000^65536+1 419296 digits 0 days 0.0 hours (0.10 ms/mul, 1392868 iterations) 63973 GFLOPS

14^131072+1 150226 digits 0 days 0.0 hours (0.13 ms/mul, 499036 iterations) 31746 GFLOPS
75898^131072+1 639662 digits 0 days 0.0 hours (0.13 ms/mul, 2124908 iterations) 133718 GFLOPS
700000^131072+1 766129 digits 0 days 0.0 hours (0.13 ms/mul, 2545023 iterations) 160173 GFLOPS
1000000^131072+1 786432 digits 0 days 0.0 hours (0.13 ms/mul, 2612469 iterations) 164502 GFLOPS

14^262144+1 300451 digits 0 days 0.0 hours (0.28 ms/mul, 998074 iterations) 134199 GFLOPS
75898^262144+1 1279324 digits 0 days 0.3 hours (0.28 ms/mul, 4249818 iterations) 561808 GFLOPS
468750^262144+1 1486604 digits 0 days 0.3 hours (0.28 ms/mul, 4938388 iterations) 655122 GFLOPS
815000^262144+1 1549575 digits 0 days 0.3 hours (0.28 ms/mul, 5147574 iterations) 680615 GFLOPS

14^524288+1 600902 digits 0 days 0.2 hours (0.47 ms/mul, 1996149 iterations) 448292 GFLOPS
75898^524288+1 2558647 digits 0 days 1.0 hours (0.46 ms/mul, 8499637 iterations) 1880229 GFLOPS
468750^524288+1 2973207 digits 0 days 1.2 hours (0.46 ms/mul, 9876777 iterations) 2185183 GFLOPS
710000^524288+1 3067745 digits 0 days 1.3 hours (0.47 ms/mul, 10190825 iterations) 2289079 GFLOPS

14^1048576+1 1201803 digits 0 days 0.9 hours (0.87 ms/mul, 3992299 iterations) 1672437 GFLOPS
75898^1048576+1 5117293 digits 0 days 4.0 hours (0.86 ms/mul, 16999276 iterations) 7064447 GFLOPS
468750^1048576+1 5946413 digits 0 days 4.6 hours (0.86 ms/mul, 19753555 iterations) 8133229 GFLOPS
700000^1048576+1 6129030 digits 0 days 4.8 hours (0.86 ms/mul, 20360194 iterations) 8392488 GFLOPS

14^2097152+1 2403605 digits 0 days 3.6 hours (1.64 ms/mul, 7984600 iterations) 6282822 GFLOPS
75898^2097152+1 10234585 digits 0 days 15.2 hours (1.61 ms/mul, 33998553 iterations) 26393913 GFLOPS
380742^2097152+1 11703432 digits 0 days 17.4 hours (1.61 ms/mul, 38877955 iterations) 30144751 GFLOPS
570000^2097152+1 12070945 digits 0 days 17.9 hours (1.61 ms/mul, 40098808 iterations) 31091359 GFLOPS

14^4194304+1 4807210 digits 0 days 14.9 hours (3.37 ms/mul, 15969202 iterations) 25885496 GFLOPS
1248^4194304+1 12986466 digits 1 days 15.9 hours (3.33 ms/mul, 43140102 iterations) 69098536 GFLOPS
10000^4194304+1 16777217 digits 2 days 3.4 hours (3.32 ms/mul, 55732704 iterations) 89080719 GFLOPS
50000^4194304+1 19708909 digits 2 days 12.4 hours (3.33 ms/mul, 65471576 iterations) 104741598 GFLOPS
150000^4194304+1 21710101 digits 2 days 18.7 hours (3.33 ms/mul, 72119391 iterations) 115515517 GFLOPS
309258^4194304+1 23028076 digits 2 days 22.5 hours (3.32 ms/mul, 76497608 iterations) 122234125 GFLOPS
480000^4194304+1 23828853 digits 3 days 0.9 hours (3.32 ms/mul, 79157734 iterations) 126370244 GFLOPS

14^8388608+1 9614419 digits 2 days 14.4 hours (7.04 ms/mul, 31938406 iterations) 108104750 GFLOPS
36^8388608+1 13055212 digits 3 days 12.6 hours (7.02 ms/mul, 43368473 iterations) 146522220 GFLOPS
100^8388608+1 16777217 digits 4 days 12.2 hours (6.99 ms/mul, 55732704 iterations) 187490914 GFLOPS
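The columns of the -b3 listings can be checked by hand. The sketch below is a reconstruction from the printed numbers, not code from genefer: the digit count is floor(N·log10(b)) + 1, the iteration count is close to N·log2(b) (about one squaring per bit of b^N), and the estimated time is ms/mul multiplied by the number of iterations. The function names are made up for the example.

```python
import math

def digits(b, n):
    """Number of decimal digits of b^n + 1."""
    return math.floor(n * math.log10(b)) + 1

def approx_iterations(b, n):
    """About one squaring per bit of b^n, i.e. n * log2(b)."""
    return n * math.log2(b)

def estimated_hours(ms_per_mul, iterations):
    """Total run time in hours for the given time per multiplication."""
    return ms_per_mul * 1e-3 * iterations / 3600.0

# CUDA line: 100^8388608+1, 12.33 ms/mul, 55732704 iterations.
hours = estimated_hours(12.33, 55732704)
days, rest = divmod(hours, 24)
print(digits(100, 8388608), "digits")
print(f"about {approx_iterations(100, 8388608):.0f} iterations")
print(f"{int(days)} days {rest:.1f} hours")
```

Since the iteration count is the same for both programs, the ratio of estimated times is just the ratio of ms/mul, which is why the absolute gap grows with the size of the test.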

Profile Michael GoetzProject donor
Volunteer moderator
Project administrator
Project scientist
Avatar
Send message
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68479 - Posted: 17 Aug 2013 | 0:43:41 UTC - in response to Message 68474.
Last modified: 17 Aug 2013 | 0:49:03 UTC

I tuned geneferocl. It's a bit faster on my computer for large exponents.

I committed the new parameters on assembla, in the branch [...]\branches\yves\2013\OclGenefer.

I think that it is faster on a 'Tahiti', but only a real experiment can answer the question. Could someone please compile it and run the bench on an HD79x0?

Mike, could you also test it on your GTX 460? I don't know why, but your card doesn't like my algorithm :o) If it is the card that found 75898^524288+1 with GeneferCUDA, I understand it!

If the new tuning is faster on Tahiti and Fermi, I will update the real genefer.

Thanks, Yves


Here's the new code on my 460:

OclGenefer 2013-08-16, Copyright (C) 2001-2013, Yves Gallot.
Options:
-q "b^N+1" Test expression.
Platform 'NVIDIA CUDA': GPU device 'GeForce GTX 460' found.
Platform 'AMD Accelerated Parallel Processing': CPU device 'Intel(R) Core(TM)2 Quad CPU @ 2.40GHz' found.
Running on platform 'NVIDIA CUDA', device 'GeForce GTX 460', version 'OpenCL 1.1 CUDA' and driver '320.57'.
Clock frequency = 1350 MHz, compute units = 7.
Global mem size = 1024 MB, cache size = 112 kB (ReadWrite), cache line size = 128 Bytes.
Local mem size = 48 kB (dedicated), Constant mem size = 64 kB.
Max workgroup size = 1024.
2199064^8192+1 Time: 83 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 125 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 166 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 304 us/mul. Err: 0.2188 398482 digits
984108^131072+1 Time: 520 us/mul. Err: 0.2422 785521 digits
804904^262144+1 Time: 1.05 ms/mul. Err: 0.2178 1548156 digits
658332^524288+1 Time: 2.02 ms/mul. Err: 0.2256 3050541 digits
538452^1048576+1 Time: 4.24 ms/mul. Err: 0.2031 6009544 digits
440400^2097152+1 Time: 8.83 ms/mul. Err: 0.2305 11836006 digits
360204^4194304+1 Time: 18.6 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 39.6 ms/mul. Err: 0.1973 45879398 digits


EDIT: This is slightly faster than the previous versions.
____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68480 - Posted: 17 Aug 2013 | 0:52:16 UTC

If anyone has BOTH an Nvidia and an AMD GPU in the same system, I'd like to know which one GeneferOCL chooses.
____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

NeoMetal*
Volunteer tester
Joined: 25 Mar 11
Posts: 418
ID: 92179
Credit: 1,747,428,303
RAC: 0
Eliminated 1 conjecture "k"
Found 1 prime in the 2018 Tour de Primes
321 LLR Turquoise: Earned 5,000,000 credits (7,698,294)
Cullen LLR Turquoise: Earned 5,000,000 credits (6,655,866)
ESP LLR Turquoise: Earned 5,000,000 credits (8,198,062)
Generalized Cullen/Woodall LLR Ruby: Earned 2,000,000 credits (2,817,032)
PPS LLR Jade: Earned 10,000,000 credits (10,016,096)
PSP LLR Jade: Earned 10,000,000 credits (10,891,779)
SoB LLR Jade: Earned 10,000,000 credits (11,111,741)
SR5 LLR Turquoise: Earned 5,000,000 credits (7,278,494)
SGS LLR Turquoise: Earned 5,000,000 credits (7,457,856)
TRP LLR Turquoise: Earned 5,000,000 credits (7,714,186)
Woodall LLR Turquoise: Earned 5,000,000 credits (5,726,778)
Cullen/Woodall Sieve (suspended) Ruby: Earned 2,000,000 credits (4,094,130)
Generalized Cullen/Woodall Sieve Jade: Earned 10,000,000 credits (11,628,594)
PPS Sieve Double Amethyst: Earned 1,000,000,000 credits (1,040,865,445)
Sierpinski (ESP/PSP/SoB) Sieve (suspended) Sapphire: Earned 20,000,000 credits (20,235,150)
TRP Sieve (suspended) Sapphire: Earned 20,000,000 credits (20,376,844)
AP 26/27 Jade: Earned 10,000,000 credits (11,615,539)
GFN Double Silver: Earned 200,000,000 credits (427,593,017)
PSA Double Bronze: Earned 100,000,000 credits (125,462,363)
Message 68481 - Posted: 17 Aug 2013 | 2:23:53 UTC
Last modified: 17 Aug 2013 | 2:28:18 UTC

Here are some benchmarks for some Fermi cards.

These are on AMD Phenom II 1100T systems @ 3.8 GHz, except the 570, which is on a 2600K @ 4.3 GHz.
All GPUs at stock speed

GeForce GTX 470

C:\>genefercuda-windows.exe -b
genefercuda 3.1.2-2 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: genefercuda-windows.exe -b

Generalized Fermat Number Bench
2199064^8192+1 Time: 111 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 119 us/mul. Err: 0.2188 102481 digits
1471094^32768+1 Time: 177 us/mul. Err: 0.2031 202102 digits
1203210^65536+1 Time: 271 us/mul. Err: 0.2031 398482 digits
984108^131072+1 Time: 527 us/mul. Err: 0.2070 785521 digits
804904^262144+1 Time: 742 us/mul. Err: 0.2031 1548156 digits
658332^524288+1 Time: 1.35 ms/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 2.5 ms/mul. Err: 0.2031 6009544 digits
440400^2097152+1 Time: 4.77 ms/mul. Err: 0.1953 11836006 digits
360204^4194304+1 Time: 10 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 20.9 ms/mul. Err: 0.1797 45879398 digits


C:\>geneferocl-windows.exe -b
geneferocl 3.1.2-2 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX 470', version 'OpenCL 1.1 CUDA' and driver '320.18'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 90.3 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 98.9 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 134 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 195 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 322 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 654 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 1.19 ms/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 2.46 ms/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 5 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 10.5 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 22.2 ms/mul. Err: 0.1895 45879398 digits
Genefer Mark = 483.


GeForce GTX 560ti

C:\>genefercuda-windows.exe -d 1 -b
genefercuda 3.1.2-2 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: genefercuda-windows.exe -d 1 -b

Generalized Fermat Number Bench
2199064^8192+1 Time: 97.7 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 111 us/mul. Err: 0.2188 102481 digits
1471094^32768+1 Time: 150 us/mul. Err: 0.2031 202102 digits
1203210^65536+1 Time: 239 us/mul. Err: 0.2031 398482 digits
984108^131072+1 Time: 444 us/mul. Err: 0.2070 785521 digits
804904^262144+1 Time: 752 us/mul. Err: 0.2031 1548156 digits
658332^524288+1 Time: 1.48 ms/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 2.97 ms/mul. Err: 0.2031 6009544 digits
440400^2097152+1 Time: 6.09 ms/mul. Err: 0.1953 11836006 digits
360204^4194304+1 Time: 12.3 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 25.6 ms/mul. Err: 0.1797 45879398 digits

C:\>geneferocl-windows.exe -d 1 -b
geneferocl 3.1.2-2 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -d 1 -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX 560 Ti', version 'OpenCL 1.1 CUDA' and driver '320.18'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 68.4 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 73.2 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 106 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 190 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 366 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 791 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 1.58 ms/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 3.28 ms/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 6.95 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 14.5 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 30.3 ms/mul. Err: 0.1895 45879398 digits
Genefer Mark = 353.


GeForce GTX 570

C:\>genefercuda-windows.exe -b
genefercuda 3.1.2-2 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: genefercuda-windows.exe -b

Generalized Fermat Number Bench
2199064^8192+1 Time: 91.6 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 98 us/mul. Err: 0.2188 102481 digits
1471094^32768+1 Time: 145 us/mul. Err: 0.2031 202102 digits
1203210^65536+1 Time: 226 us/mul. Err: 0.2031 398482 digits
984108^131072+1 Time: 410 us/mul. Err: 0.2070 785521 digits
804904^262144+1 Time: 596 us/mul. Err: 0.2031 1548156 digits
658332^524288+1 Time: 1.09 ms/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 2.03 ms/mul. Err: 0.2031 6009544 digits
440400^2097152+1 Time: 3.91 ms/mul. Err: 0.1953 11836006 digits
360204^4194304+1 Time: 8.13 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 16.7 ms/mul. Err: 0.1797 45879398 digits

C:\>geneferocl-windows.exe -b
geneferocl 3.1.2-2 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX 570', version 'OpenCL 1.1 CUDA' and driver '310.70'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 81.8 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 88.8 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 118 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 170 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 273 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 527 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 957 us/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 1.91 ms/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 3.91 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 8.2 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 17.3 ms/mul. Err: 0.1895 45879398 digits
Genefer Mark = 618.


GeForce GTX 580

C:\>genefercuda-windows.exe -b
genefercuda 3.1.2-2 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: genefercuda-windows.exe -b

Generalized Fermat Number Bench
2199064^8192+1 Time: 98.4 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 94.8 us/mul. Err: 0.2188 102481 digits
1471094^32768+1 Time: 114 us/mul. Err: 0.2031 202102 digits
1203210^65536+1 Time: 159 us/mul. Err: 0.2031 398482 digits
984108^131072+1 Time: 278 us/mul. Err: 0.2070 785521 digits
804904^262144+1 Time: 489 us/mul. Err: 0.2031 1548156 digits
658332^524288+1 Time: 867 us/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 1.61 ms/mul. Err: 0.2031 6009544 digits
440400^2097152+1 Time: 3.09 ms/mul. Err: 0.1953 11836006 digits
360204^4194304+1 Time: 6.42 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 13.3 ms/mul. Err: 0.1797 45879398 digits

C:\>geneferocl-windows.exe -b
geneferocl 3.1.2-2 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX 580', version 'OpenCL 1.1 CUDA' and driver '314.22'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 67.3 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 71.8 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 74.5 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 112 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 186 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 396 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 752 us/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 1.52 ms/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 3.13 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 6.64 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 14.2 ms/mul. Err: 0.1895 45879398 digits
Genefer Mark = 767.


If any specific testing is needed, let me know.
____________
Largest Primes to Date:
As Double Checker: SR5 109208*5^1816285+1 Dgts-1,269,534
As Initial Finder: SR5 243944*5^1258576-1 Dgts-879,713


Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68482 - Posted: 17 Aug 2013 | 2:31:14 UTC - in response to Message 68481.

Here are some benchmarks for some Fermi cards.

These are on AMD Phenom II 1100T systems @ 3.8 GHz, except the 570, which is on a 2600K @ 4.3 GHz.
All GPUs at stock speed


Interesting. On my machine, I saw identical speeds with both 32 and 64 bits. On yours, with a variety of Fermi GPUs, the 32-bit builds are all slightly faster.

One difference: Core2 on my system, Phenom II on yours.

Looks like yet another reason to use a 32 bit build.
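
For anyone who wants to check this kind of comparison straight from pasted bench output, here is a minimal Python sketch. It assumes the line layout seen in the benchmarks in this thread (`b^N+1 Time: <value> us/mul.` or `ms/mul.`). The names `parse_bench` and `speed_ratios` are made up for illustration and are not part of Genefer.

```python
import re

# Matches bench lines such as
# "538452^1048576+1 Time: 2.5 ms/mul. Err: 0.2031 6009544 digits".
BENCH_LINE = re.compile(r"(\d+)\^(\d+)\+1\s+Time:\s+([\d.]+)\s+(us|ms)/mul")

def parse_bench(text):
    """Return {N: microseconds per multiplication} for every bench line found."""
    times = {}
    for match in BENCH_LINE.finditer(text):
        n = int(match.group(2))
        value = float(match.group(3))
        # Normalise everything to microseconds so the two units can be compared.
        times[n] = value * 1000.0 if match.group(4) == "ms" else value
    return times

def speed_ratios(baseline, candidate):
    """Ratio candidate/baseline per N; values above 1.0 mean the candidate is slower."""
    return {n: candidate[n] / baseline[n] for n in sorted(baseline) if n in candidate}

# Two lines each from the GTX 470 runs quoted earlier (CUDA build, then OpenCL build).
cuda = parse_bench("""
1203210^65536+1 Time: 271 us/mul. Err: 0.2031 398482 digits
294612^8388608+1 Time: 20.9 ms/mul. Err: 0.1797 45879398 digits
""")
ocl = parse_bench("""
1203210^65536+1 Time: 195 us/mul. Err: 0.2656 398482 digits
294612^8388608+1 Time: 22.2 ms/mul. Err: 0.1895 45879398 digits
""")

for n, ratio in speed_ratios(cuda, ocl).items():
    print(f"N={n}: OpenCL/CUDA = {ratio:.2f}")
```

With those GTX 470 figures the ratio comes out at about 0.72 for N=65536 and about 1.06 for N=8388608, so which build looks faster depends on the exponent as well as on the card.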
____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

NeoMetal*
Volunteer tester
Joined: 25 Mar 11
Posts: 418
ID: 92179
Credit: 1,747,428,303
RAC: 0
Message 68483 - Posted: 17 Aug 2013 | 3:41:36 UTC
Last modified: 17 Aug 2013 | 3:42:32 UTC

I do have a second 560ti, the exact same Zotac model as in the first bench, but on my 2600K SB. The benchmarks are slightly slower. Here's the bench from the 2600K, with the PII 1100T below it for comparison.

GTX 560ti on 2600K

C:\>genefercuda-windows.exe -d 1 -b
genefercuda 3.1.2-2 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: genefercuda-windows.exe -d 1 -b

Generalized Fermat Number Bench
2199064^8192+1 Time: 97.7 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 112 us/mul. Err: 0.2188 102481 digits
1471094^32768+1 Time: 151 us/mul. Err: 0.2031 202102 digits
1203210^65536+1 Time: 243 us/mul. Err: 0.2031 398482 digits
984108^131072+1 Time: 449 us/mul. Err: 0.2070 785521 digits
804904^262144+1 Time: 771 us/mul. Err: 0.2031 1548156 digits
658332^524288+1 Time: 1.5 ms/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 3.01 ms/mul. Err: 0.2031 6009544 digits
440400^2097152+1 Time: 6.25 ms/mul. Err: 0.1953 11836006 digits
360204^4194304+1 Time: 12.5 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 26.1 ms/mul. Err: 0.1797 45879398 digits

C:\>geneferocl-windows.exe -d 1 -b
geneferocl 3.1.2-2 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -d 1 -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX 560 Ti', version 'OpenCL 1.1 CUDA' and driver '310.70'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 69.6 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 74.2 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 107 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 195 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 374 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 806 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 1.59 ms/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 3.32 ms/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 6.95 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 14.6 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 30.6 ms/mul. Err: 0.1895 45879398 digits
Genefer Mark = 350.


GTX 560ti on 1100T

C:\>genefercuda-windows.exe -d 1 -b
genefercuda 3.1.2-2 (Windows 32-bit CUDA)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: genefercuda-windows.exe -d 1 -b

Generalized Fermat Number Bench
2199064^8192+1 Time: 97.7 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 111 us/mul. Err: 0.2188 102481 digits
1471094^32768+1 Time: 150 us/mul. Err: 0.2031 202102 digits
1203210^65536+1 Time: 239 us/mul. Err: 0.2031 398482 digits
984108^131072+1 Time: 444 us/mul. Err: 0.2070 785521 digits
804904^262144+1 Time: 752 us/mul. Err: 0.2031 1548156 digits
658332^524288+1 Time: 1.48 ms/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 2.97 ms/mul. Err: 0.2031 6009544 digits
440400^2097152+1 Time: 6.09 ms/mul. Err: 0.1953 11836006 digits
360204^4194304+1 Time: 12.3 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 25.6 ms/mul. Err: 0.1797 45879398 digits

C:\>geneferocl-windows.exe -d 1 -b
geneferocl 3.1.2-2 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -d 1 -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX 560 Ti', version 'OpenCL 1.1 CUDA' and driver '320.18'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 68.4 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 73.2 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 106 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 190 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 366 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 791 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 1.58 ms/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 3.28 ms/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 6.95 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 14.5 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 30.3 ms/mul. Err: 0.1895 45879398 digits
Genefer Mark = 353.

I wonder if it's the CPU, the driver version, GPU chip quality or something else. I'll try a different driver version a little later.
____________
Largest Primes to Date:
As Double Checker: SR5 109208*5^1816285+1 Dgts-1,269,534
As Initial Finder: SR5 243944*5^1258576-1 Dgts-879,713


Roger
Volunteer moderator
Project administrator
Volunteer developer
Volunteer tester
Project scientist
Joined: 27 Nov 11
Posts: 1112
ID: 120786
Credit: 261,530,863
RAC: 1,408
Found 1 prime in the 2018 Tour de Primes
321 LLR Ruby: Earned 2,000,000 credits (2,012,522)
Cullen LLR Amethyst: Earned 1,000,000 credits (1,359,862)
ESP LLR Ruby: Earned 2,000,000 credits (2,213,934)
Generalized Cullen/Woodall LLR Ruby: Earned 2,000,000 credits (2,088,705)
PPS LLR Ruby: Earned 2,000,000 credits (2,617,785)
PSP LLR Ruby: Earned 2,000,000 credits (2,420,512)
SoB LLR Amethyst: Earned 1,000,000 credits (1,780,064)
SR5 LLR Ruby: Earned 2,000,000 credits (2,238,295)
SGS LLR Ruby: Earned 2,000,000 credits (2,139,392)
TRP LLR Ruby: Earned 2,000,000 credits (2,125,391)
Woodall LLR Amethyst: Earned 1,000,000 credits (1,311,937)
321 Sieve Turquoise: Earned 5,000,000 credits (5,190,731)
Cullen/Woodall Sieve (suspended) Silver: Earned 100,000 credits (207,387)
Generalized Cullen/Woodall Sieve Turquoise: Earned 5,000,000 credits (5,049,697)
PPS Sieve Double Bronze: Earned 100,000,000 credits (100,422,123)
Sierpinski (ESP/PSP/SoB) Sieve (suspended) Ruby: Earned 2,000,000 credits (3,227,972)
TRP Sieve (suspended) Turquoise: Earned 5,000,000 credits (5,021,659)
AP 26/27 Sapphire: Earned 20,000,000 credits (20,295,860)
GFN Emerald: Earned 50,000,000 credits (56,515,310)
PSA Sapphire: Earned 20,000,000 credits (43,298,465)
Message 68485 - Posted: 17 Aug 2013 | 5:47:53 UTC - in response to Message 68474.

I tuned geneferocl. It's a bit faster on my computer for large exponents.

Full test run with newly tuned geneferocl, revision 394. HD7970GHz GPU, CPU is 3.5 GHz AMD Phenom II X6 1100T:



It seems the higher exponents are slower and some lower exponents are faster. Can the GeneferOCL program optimise its parameters at run time during initialisation?
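
To make the question concrete, here is a rough Python sketch of what run-time tuning could look like in general: try each candidate parameter set a few times at start-up and keep the fastest. This is only an illustration of the idea and says nothing about how geneferocl is actually written. The chunk sizes, the costs in `fake_costs` and the function names are all invented.

```python
import time

def time_call(fn, *args):
    """Wall-clock seconds taken by one call to fn(*args)."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start

def autotune(candidates, measure, repeats=3):
    """Pick the candidate with the lowest best-of-`repeats` measurement.

    `measure(params)` must return a cost (for example seconds per multiplication).
    Taking the minimum over a few repeats reduces the effect of one-off stalls.
    """
    best_params, best_cost = None, float("inf")
    for params in candidates:
        cost = min(measure(params) for _ in range(repeats))
        if cost < best_cost:
            best_params, best_cost = params, cost
    return best_params, best_cost

# Invented per-multiplication costs (ms) for three invented chunk sizes,
# standing in for real timings taken on the device.
fake_costs = {64: 5.4, 256: 5.1, 1024: 5.6}
print(autotune(list(fake_costs), fake_costs.get))
```

In a real program `measure` would wrap an actual timed run, for example `lambda p: time_call(run_some_multiplications, p)`, where `run_some_multiplications` is a hypothetical function that executes a short burst of multiplications with parameter set `p`. The trade-off is a longer start-up in exchange for not depending on one fixed set of parameters for every card.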

Roger
Volunteer moderator
Project administrator
Volunteer developer
Volunteer tester
Project scientist
Joined: 27 Nov 11
Posts: 1112
ID: 120786
Credit: 261,530,863
RAC: 1,408
Message 68486 - Posted: 17 Aug 2013 | 6:12:31 UTC - in response to Message 68427.

The latest OpenCL beta build is available for download. Its version number is 3.1.2-1, and can be downloaded via this thread.

Benching geneferocl 3.1.2-1 Windows 64-bit
HD7970GHz on X6 1100T

C:\>geneferocl-windows.exe -b
Generalized Fermat Number Bench
2199064^8192+1 Time: 80 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 77.1 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 78.1 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 83.7 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 129 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 335 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 791 us/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 1.52 ms/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 2.8 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 5.36 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 10.8 ms/mul. Err: 0.1895 45879398 digits
Genefer Mark = 892.

C:\>geneferocl-windows.exe -b3
geneferocl 3.1.2-1 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1124.2)' and driver '1124.2 (VM)'.

14^32768+1 37557 digits 0 days 0.0 hours (0.09 ms/mul, 124758 iterations) 5291 GFLOPS
75898^32768+1 159916 digits 0 days 0.0 hours (0.08 ms/mul, 531226 iterations) 19721 GFLOPS
700000^32768+1 191533 digits 0 days 0.0 hours (0.08 ms/mul, 636255 iterations) 23569 GFLOPS
5000000^32768+1 219512 digits 0 days 0.0 hours (0.08 ms/mul, 729201 iterations) 26936 GFLOPS

14^65536+1 75113 digits 0 days 0.0 hours (0.08 ms/mul, 249517 iterations) 9139 GFLOPS
75898^65536+1 319831 digits 0 days 0.0 hours (0.09 ms/mul, 1062453 iterations) 47619 GFLOPS
710000^65536+1 383469 digits 0 days 0.0 hours (0.08 ms/mul, 1273852 iterations) 47619 GFLOPS
2500000^65536+1 419296 digits 0 days 0.0 hours (0.11 ms/mul, 1392868 iterations) 72631 GFLOPS

14^131072+1 150226 digits 0 days 0.0 hours (0.14 ms/mul, 499036 iterations) 33189 GFLOPS
75898^131072+1 639662 digits 0 days 0.0 hours (0.14 ms/mul, 2124908 iterations) 143819 GFLOPS
700000^131072+1 766129 digits 0 days 0.0 hours (0.12 ms/mul, 2545023 iterations) 151515 GFLOPS
1000000^131072+1 786432 digits 0 days 0.0 hours (0.13 ms/mul, 2612469 iterations) 156806 GFLOPS

14^262144+1 300451 digits 0 days 0.0 hours (0.33 ms/mul, 998074 iterations) 157287 GFLOPS
75898^262144+1 1279324 digits 0 days 0.4 hours (0.34 ms/mul, 4249818 iterations) 702741 GFLOPS
468750^262144+1 1486604 digits 0 days 0.4 hours (0.31 ms/mul, 4938388 iterations) 740740 GFLOPS
815000^262144+1 1549575 digits 0 days 0.4 hours (0.31 ms/mul, 5147574 iterations) 772486 GFLOPS

14^524288+1 600902 digits 0 days 0.4 hours (0.89 ms/mul, 1996149 iterations) 853294 GFLOPS
75898^524288+1 2558647 digits 0 days 1.9 hours (0.81 ms/mul, 8499637 iterations) 3315533 GFLOPS
468750^524288+1 2973207 digits 0 days 2.3 hours (0.87 ms/mul, 9876777 iterations) 4151992 GFLOPS
710000^524288+1 3067745 digits 0 days 2.2 hours (0.81 ms/mul, 10190825 iterations) 3974984 GFLOPS

Fast, but not as fast as a TITAN.
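
The columns in the `-b3` output can be cross-checked with a little arithmetic: the digit count of b^N+1 is floor(N·log10(b)) + 1, and the estimated run time is roughly the ms/mul figure multiplied by the iteration count. A minimal Python sketch follows. The function names are made up for illustration, and the calculation uses ordinary floating point, so it could be off by one digit in the rare case where N·log10(b) lands extremely close to an integer.

```python
import math

def gfn_digits(b, n):
    """Decimal digits of b^n + 1 for b >= 2, via floor(n * log10(b)) + 1."""
    return math.floor(n * math.log10(b)) + 1

def estimated_hours(ms_per_mul, iterations):
    """Total run time in hours if every iteration costs ms_per_mul milliseconds."""
    return ms_per_mul * iterations / 1000.0 / 3600.0

print(gfn_digits(14, 32768))                      # 37557
print(round(estimated_hours(0.81, 8499637), 1))   # 1.9
```

Both values agree with the lines for 14^32768+1 and 75898^524288+1 above. The printed ms/mul is rounded to two decimals, so for other rows the recomputed hours may differ from the printed ones by about a tenth of an hour.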

VictordeHollander
Project donor
Joined: 13 Jan 11
Posts: 25
ID: 81079
Credit: 300,776,184
RAC: 37
321 LLR Silver: Earned 100,000 credits (433,133)
Cullen LLR Silver: Earned 100,000 credits (448,329)
ESP LLR Gold: Earned 500,000 credits (623,750)
PPS LLR Gold: Earned 500,000 credits (520,342)
PSP LLR Silver: Earned 100,000 credits (160,346)
SoB LLR Silver: Earned 100,000 credits (173,517)
SR5 LLR Silver: Earned 100,000 credits (133,674)
SGS LLR Silver: Earned 100,000 credits (338,786)
TRP LLR Gold: Earned 500,000 credits (618,625)
Woodall LLR Silver: Earned 100,000 credits (260,361)
Cullen/Woodall Sieve (suspended) Ruby: Earned 2,000,000 credits (4,153,179)
PPS Sieve Double Silver: Earned 200,000,000 credits (280,553,783)
Sierpinski (ESP/PSP/SoB) Sieve (suspended) Amethyst: Earned 1,000,000 credits (1,160,571)
TRP Sieve (suspended) Amethyst: Earned 1,000,000 credits (1,406,640)
GFN Ruby: Earned 2,000,000 credits (4,856,299)
PSA Ruby: Earned 2,000,000 credits (4,934,850)
Message 68489 - Posted: 17 Aug 2013 | 7:40:35 UTC

My first WU with Genefer OpenCL on my 7950 is complete and validated:

http://www.primegrid.com/workunit.php?wuid=343200843

It took 27,721 sec.

Roger
Volunteer moderator
Project administrator
Volunteer developer
Volunteer tester
Project scientist
Joined: 27 Nov 11
Posts: 1112
ID: 120786
Credit: 261,530,863
RAC: 1,408
Message 68492 - Posted: 17 Aug 2013 | 11:56:03 UTC - in response to Message 68373.

I am trying to get geneferOCL working with BOINC. I've never used app_info.xml, never had a need for it.
I created an app_info.xml file same as DeleteNull, except have <platform>windows_x86_64</platform>
Then I placed the file in C:\ProgramData\BOINC\projects\www.primegrid.com, along with the geneferocl-windows.exe
Then I set Primegrid preferences for short GFN, but there is no GFN option for ATI.
When I start BOINC it downloads some GFN for CPU. If I enable ATI in the preferences it starts PPS Sieve on ATI.
Have I missed a step? What should I be setting the Primegrid preferences to?

Profile Michael GoetzProject donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
The "Shut up already!" badge:  This loud mouth has mansplained on the forums over 10 thousand times!  Sheesh!!!Discovered the World's First GFN-19 prime!!!Discovered 1 mega primeFound 1 prime in the 2018 Tour de PrimesFound 1 prime in the 2019 Tour de Primes321 LLR Ruby: Earned 2,000,000 credits (2,063,182)Cullen LLR Ruby: Earned 2,000,000 credits (2,005,249)ESP LLR Ruby: Earned 2,000,000 credits (3,820,430)Generalized Cullen/Woodall LLR Ruby: Earned 2,000,000 credits (2,145,754)PPS LLR Ruby: Earned 2,000,000 credits (2,773,744)PSP LLR Ruby: Earned 2,000,000 credits (2,632,269)SoB LLR Sapphire: Earned 20,000,000 credits (34,158,496)SR5 LLR Turquoise: Earned 5,000,000 credits (8,293,415)SGS LLR Ruby: Earned 2,000,000 credits (2,012,781)TRP LLR Ruby: Earned 2,000,000 credits (2,737,347)Woodall LLR Ruby: Earned 2,000,000 credits (2,195,123)321 Sieve Turquoise: Earned 5,000,000 credits (5,046,112)Cullen/Woodall Sieve (suspended) Ruby: Earned 2,000,000 credits (4,170,256)Generalized Cullen/Woodall Sieve Turquoise: Earned 5,000,000 credits (5,059,304)PPS Sieve Sapphire: Earned 20,000,000 credits (20,110,788)Sierpinski (ESP/PSP/SoB) Sieve (suspended) Amethyst: Earned 1,000,000 credits (1,035,522)TRP Sieve (suspended) Ruby: Earned 2,000,000 credits (2,051,121)AP 26/27 Turquoise: Earned 5,000,000 credits (7,090,096)GFN Emerald: Earned 50,000,000 credits (64,594,991)PSA Jade: Earned 10,000,000 credits (10,135,447)
Message 68496 - Posted: 17 Aug 2013 | 12:11:50 UTC - in response to Message 68492.

I am trying to get geneferOCL working with BOINC. I've never used app_info.xml, never had a need for it.
I created an app_info.xml file same as DeleteNull, except have <platform>windows_x86_64</platform>
Then I placed the file in C:\ProgramData\BOINC\projects\www.primegrid.com, along with the geneferocl-windows.exe
Then I set Primegrid preferences for short GFN, but there is no GFN option for ATI.
When I start BOINC it downloads some GFN for CPU. If I enable ATI in the preferences it starts PPS Sieve on ATI.
Have I missed a step? What should I be setting the Primegrid preferences to?


You had it right the first time. You need to tell the server you're running a CPU Genefer. There's no ATI option (yet) on the server.

It will send you "CPU tasks", but a task is just a task. App_info overrides the "how it's run" part, and tells your computer to use GeneferOCL instead of what the server is telling it to do.

Tasks sent to a computer consist of two parts: The "what needs to be crunched" part, and the "how does it get crunched" part. App_info redefines the "how does it get crunched" portion, so all the server is really sending is "what needs to be crunched." It doesn't actually matter (much) that it's running on an ATI instead of a CPU.
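
To make the "how does it get crunched" half concrete, here is a minimal Python sketch that writes an anonymous-platform app_info.xml. The element layout follows BOINC's anonymous platform format as I understand it. The app name `genefer`, the version number 312 and the helper name `build_app_info` are placeholders for illustration, not PrimeGrid's actual values; the executable name and platform string are the ones mentioned in this thread. A real file may need further elements.

```python
import xml.etree.ElementTree as ET


def build_app_info(app_name: str, exe_name: str, platform: str, version_num: int) -> str:
    """Return the text of a minimal anonymous-platform app_info.xml."""
    root = ET.Element("app_info")

    # The application the server's tasks belong to ("what needs to be crunched").
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name

    # The local executable that will actually run them ("how it gets crunched").
    file_info = ET.SubElement(root, "file_info")
    ET.SubElement(file_info, "name").text = exe_name
    ET.SubElement(file_info, "executable")

    # Ties the app name to the local executable for one platform.
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "version_num").text = str(version_num)
    ET.SubElement(version, "platform").text = platform
    file_ref = ET.SubElement(version, "file_ref")
    ET.SubElement(file_ref, "file_name").text = exe_name
    ET.SubElement(file_ref, "main_program")

    # Pretty-print when available (ET.indent was added in Python 3.9).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # "genefer" and 312 are placeholder values, not taken from PrimeGrid.
    print(build_app_info("genefer", "geneferocl-windows.exe", "windows_x86_64", 312))
```

The point of the sketch is only that the file maps an app name onto a local executable; the tasks themselves still come from the server unchanged.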
____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68498 - Posted: 17 Aug 2013 | 12:54:18 UTC - in response to Message 68485.

HD7970GHz The higher exponents are slower and some lower exponents are faster. Can the GeneferOCL program optimise its parameters at run time during initialisation?

Yes, it should, because with the new parameters GeneferOCL is faster on NVidia and slower on ATI (for N = 1048576 and 4194304).

So the next step is a self-tuning transform...

Profile Roger
Volunteer moderator
Project administrator
Volunteer developer
Volunteer tester
Project scientist
Joined: 27 Nov 11
Posts: 1112
ID: 120786
Credit: 261,530,863
RAC: 1,408
Message 68499 - Posted: 17 Aug 2013 | 13:51:20 UTC - in response to Message 68496.
Last modified: 17 Aug 2013 | 13:53:18 UTC

You had it right the first time. You need to tell the server you're running a CPU Genefer. There's no ATI option (yet) on the server.

It will send you "CPU tasks", but a task is just a task. App_info overrides the "how it's run" part, and tells your computer to use GeneferOCL instead of what the server is telling it to do.

Tasks sent to a computer consist of two parts: The "what needs to be crunched" part, and the "how does it get crunched" part. App_info redefines the "how does it get crunched" portion, so all the server is really sending is "what needs to be crunched." It doesn't actually matter (much) that it's running on an ATI instead of a CPU.

OK. BOINC is now reading my app_info file. Trick was to Remove Primegrid project, close BOINC, create www.primegrid.com directory and place app_info.xml file in it, restart BOINC, and add Primegrid project.
In the Primegrid preferences I have set to use CPU only and get only Generalized Fermat Prime Search (short) tasks.
When I update the Primegrid project it's not downloading any tasks. In the BOINC Event log I get this:

17/08/2013 9:27:01 PM | PrimeGrid | update requested by user
17/08/2013 9:27:03 PM | PrimeGrid | Sending scheduler request: Requested by user.
17/08/2013 9:27:03 PM | PrimeGrid | Requesting new tasks for CPU
17/08/2013 9:27:09 PM | PrimeGrid | Scheduler request completed: got 0 new tasks
17/08/2013 9:27:09 PM | PrimeGrid | No tasks sent
17/08/2013 9:27:09 PM | PrimeGrid | No tasks are available for Genefer
17/08/2013 9:27:09 PM | PrimeGrid | Tasks for NVIDIA GPU are available, but your preferences are set to not accept them
17/08/2013 9:27:09 PM | PrimeGrid | Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them
17/08/2013 9:27:09 PM | PrimeGrid | Project has no tasks available

It also whines "Your app_info.xml file doesn't have a usable version of..." for the 13 other sub-projects.

Profile Michael GoetzProject donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68500 - Posted: 17 Aug 2013 | 14:07:18 UTC - in response to Message 68498.

Yves,

I put self-tuning ability into the CUDA version earlier this year. It's particularly helpful when running on hardware very different from what we have available to test.

The CUDA tuning (essentially a full "-b2 N" test) is only run once at the start of the test, and the result saved in a file for use during restarts. Tuning (aka "shift") can also be set from the command line, and can be specified from the PrimeGrid preferences.
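
The run-once-and-cache behaviour described above could look roughly like the following Python sketch. This is not Genefer's actual code: the cache file, the candidate shift values and the `benchmark` callable are all hypothetical stand-ins, and the only things taken from the description are the ordering (explicit setting first, saved result second, full tuning run last) and the fact that the result is written to a file for restarts.

```python
import json
import os
from typing import Callable, Iterable, Optional


def choose_shift(
    cache_path: str,
    candidates: Iterable[int],
    benchmark: Callable[[int], float],
    override: Optional[int] = None,
) -> int:
    """Pick a tuning value ("shift"), benchmarking only when nothing is cached."""
    # A value given explicitly (command line or preferences) wins outright.
    if override is not None:
        return override

    # On a restart, reuse the value saved by the first run.
    if os.path.exists(cache_path):
        with open(cache_path) as f:
            return json.load(f)["shift"]

    # First start: time each candidate and keep the one with the lowest time.
    best = min(candidates, key=benchmark)
    with open(cache_path, "w") as f:
        json.dump({"shift": best}, f)
    return best
```

A user who wants less screen lag would simply pass a slower value as `override`, which skips both the cache and the benchmark.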

If you want, we can do that with OCL too. I didn't try OCL with N=22 tests, but there may be some value in users "de-tuning" OCL a bit to improve screen lag.

Profile Michael GoetzProject donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68502 - Posted: 17 Aug 2013 | 14:19:00 UTC - in response to Message 68499.
Last modified: 17 Aug 2013 | 20:25:29 UTC

OK. BOINC is now reading my app_info file. Trick was to Remove Primegrid project, close BOINC, create www.primegrid.com directory and place app_info.xml file in it, restart BOINC, and add Primegrid project.


It's not necessary to remove PrimeGrid. Just put the app_info file in there and restart the BOINC client. It's best to reboot to make sure, since the BOINC client runs in the background -- it's NOT the program you're interacting with on the screen.

In the Primegrid preferences I have set to use CPU only and get only Generalized Fermat Prime Search (short) tasks.
When I update the Primegrid project it's not downloading any tasks. In the BOINC Event log I get this:


I'm not sure what's going on, because I'm looking at your preferences and they're not set that way.

On the "---" venue, you have SGS and PPS Sieve (ATI & CUDA) selected (and CPU and ATI enabled). On the "work" venue you have SR5 and PPS Sieve enabled (and CPU/ATI/CUDA processing enabled).

On neither venue is GFN enabled. I'm assuming you've turned it off?

Profile [DPC]CamulosProject donor
Joined: 23 Mar 11
Posts: 86
ID: 91898
Credit: 100,928,703
RAC: 0
321 LLR Gold: Earned 500,000 credits (650,015)Cullen LLR Gold: Earned 500,000 credits (820,374)ESP LLR Gold: Earned 500,000 credits (550,246)Generalized Cullen/Woodall LLR Gold: Earned 500,000 credits (607,632)PPS LLR Turquoise: Earned 5,000,000 credits (9,245,280)PSP LLR Gold: Earned 500,000 credits (539,445)SoB LLR Gold: Earned 500,000 credits (627,950)SR5 LLR Gold: Earned 500,000 credits (801,970)SGS LLR Ruby: Earned 2,000,000 credits (2,005,187)TRP LLR Gold: Earned 500,000 credits (523,441)Woodall LLR Gold: Earned 500,000 credits (718,450)Cullen/Woodall Sieve (suspended) Ruby: Earned 2,000,000 credits (4,084,280)Generalized Cullen/Woodall Sieve Gold: Earned 500,000 credits (510,148)PPS Sieve Sapphire: Earned 20,000,000 credits (41,708,032)TRP Sieve (suspended) Gold: Earned 500,000 credits (560,644)AP 26/27 Turquoise: Earned 5,000,000 credits (5,061,836)GFN Sapphire: Earned 20,000,000 credits (31,550,516)PSA Silver: Earned 100,000 credits (357,578)
Message 68503 - Posted: 17 Aug 2013 | 15:51:35 UTC - in response to Message 68486.
Last modified: 17 Aug 2013 | 15:54:00 UTC

It seems that a GFN task takes 20+ hours on an HD5870 (based on 5% done in 1 hour).
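
That figure is a straight linear extrapolation from the progress so far. A small Python sketch of the arithmetic (the function name is made up for illustration, and it assumes progress stays linear for the rest of the task):

```python
def estimate_total_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Linear extrapolation: total time = elapsed time / fraction completed."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


# 5% done after 1 hour, as reported above.
print(round(estimate_total_hours(1.0, 0.05), 1))
```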

Running the B3 test on an AMD HD5870 in combination with Windows x64 and an Intel Core i7-860

PS C:\Users\E> .\geneferocl-windows.exe -b3
geneferocl 3.1.2-2 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: C:\Users\E\geneferocl-windows.exe -b3


Running on platform 'AMD Accelerated Parallel Processing', device 'Cypress', version 'OpenCL 1.2 AMD-APP (1124.2)' and driver '1124.2 (VM)'.

14^32768+1 37557 digits 0 days 0.0 hours (0.17 ms/mul, 124758 iterations) 10101 GFLOPS
75898^32768+1 159916 digits 0 days 0.0 hours (0.14 ms/mul, 531226 iterations) 37037 GFLOPS
700000^32768+1 191533 digits 0 days 0.0 hours (0.14 ms/mul, 636255 iterations) 43290 GFLOPS
5000000^32768+1 219512 digits 0 days 0.0 hours (0.14 ms/mul, 729201 iterations) 50505 GFLOPS

14^65536+1 75113 digits 0 days 0.0 hours (0.24 ms/mul, 249517 iterations) 28379 GFLOPS
75898^65536+1 319831 digits 0 days 0.0 hours (0.24 ms/mul, 1062453 iterations) 120250 GFLOPS
710000^65536+1 383469 digits 0 days 0.0 hours (0.26 ms/mul, 1273852 iterations) 157287 GFLOPS
2500000^65536+1 419296 digits 0 days 0.0 hours (0.23 ms/mul, 1392868 iterations) 155363 GFLOPS

14^131072+1 150226 digits 0 days 0.0 hours (0.53 ms/mul, 499036 iterations) 126022 GFLOPS
75898^131072+1 639662 digits 0 days 0.2 hours (0.50 ms/mul, 2124908 iterations) 513708 GFLOPS
700000^131072+1 766129 digits 0 days 0.3 hours (0.51 ms/mul, 2545023 iterations) 626743 GFLOPS
1000000^131072+1 786432 digits 0 days 0.3 hours (0.50 ms/mul, 2612469 iterations) 626743 GFLOPS

14^262144+1 300451 digits 0 days 0.2 hours (0.94 ms/mul, 998074 iterations) 453583 GFLOPS
75898^262144+1 1279324 digits 0 days 1.0 hours (0.93 ms/mul, 4249818 iterations) 1894659 GFLOPS
468750^262144+1 1486604 digits 0 days 1.2 hours (0.92 ms/mul, 4938388 iterations) 2192398 GFLOPS
815000^262144+1 1549575 digits 0 days 1.3 hours (0.92 ms/mul, 5147574 iterations) 2285231 GFLOPS

14^524288+1 600902 digits 0 days 1.0 hours (1.89 ms/mul, 1996149 iterations) 1816256 GFLOPS
75898^524288+1 2558647 digits 0 days 4.5 hours (1.95 ms/mul, 8499637 iterations) 7951411 GFLOPS
468750^524288+1 2973207 digits 0 days 5.0 hours (1.85 ms/mul, 9876777 iterations) 8788832 GFLOPS
710000^524288+1 3067745 digits 0 days 5.2 hours (1.86 ms/mul, 10190825 iterations) 9122165 GFLOPS

14^1048576+1 1201803 digits 0 days 4.6 hours (4.21 ms/mul, 3992299 iterations) 8093787 GFLOPS
75898^1048576+1 5117293 digits 0 days 19.4 hours (4.11 ms/mul, 16999276 iterations) 33606027 GFLOPS
468750^1048576+1 5946413 digits 0 days 22.5 hours (4.11 ms/mul, 19753555 iterations) 39041327 GFLOPS
700000^1048576+1 6129030 digits 0 days 23.2 hours (4.11 ms/mul, 20360194 iterations) 40289041 GFLOPS

14^2097152+1 2403605 digits 0 days 19.6 hours (8.85 ms/mul, 7984600 iterations) 34004295 GFLOPS
75898^2097152+1 10234585 digits 3 days 10.0 hours (8.69 ms/mul, 33998553 iterations) 142110007 GFLOPS
380742^2097152+1 11703432 digits 3 days 21.9 hours (8.70 ms/mul, 38877955 iterations) 162711237 GFLOPS
570000^2097152+1 12070945 digits 4 days 0.7 hours (8.69 ms/mul, 40098808 iterations) 167550578 GFLOPS

14^4194304+1 4807210 digits 3 days 14.8 hours (19.58 ms/mul, 15969202 iterations) 150366853 GFLOPS
1248^4194304+1 12986466 digits 9 days 14.8 hours (19.26 ms/mul, 43140102 iterations) 399673001 GFLOPS
10000^4194304+1 16777217 digits 12 days 10.8 hours (19.31 ms/mul, 55732704 iterations) 517570911 GFLOPS
50000^4194304+1 19708909 digits 14 days 13.0 hours (19.20 ms/mul, 65471576 iterations) 604485206 GFLOPS

150000^4194304+1 21710101 digits 16 days 1.7 hours (19.26 ms/mul, 72119391 iterations) 668048875 GFLOPS
309258^4194304+1 23028076 digits 17 days 1.1 hours (19.25 ms/mul, 76497608 iterations) 708420648 GFLOPS
480000^4194304+1 23828853 digits 17 days 13.5 hours (19.17 ms/mul, 79157734 iterations) 730009371 GFLOPS

14^8388608+1 9614419 digits 16 days 10.5 hours (44.47 ms/mul, 31938406 iterations) 683195084 GFLOPS
36^8388608+1 13055212 digits 22 days 4.9 hours (44.24 ms/mul, 43368473 iterations) 922773007 GFLOPS
100^8388608+1 16777217 digits 28 days 7.7 hours (43.91 ms/mul, 55732704 iterations) 1177114263 GFLOPS
____________
Member of the Dutch Power Cows
My Stats

Profile Michael GoetzProject donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68508 - Posted: 17 Aug 2013 | 17:22:07 UTC - in response to Message 68503.

Running the B3 test on an AMD HD5870 in combination with Windows x64 and an Intel Core i7-860


The -b test is more concise and much faster, and it is the better choice for comparing relative performance. The -b3 test was put in there mostly as fluff, and to give you an estimate of run times for complete computations.
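
For anyone wondering where those run-time estimates come from: the hours column in the -b3 output quoted earlier agrees, to within the rounding of the printed figures, with ms/mul multiplied by the iteration count. A small Python check using two rows from the HD5870 run (the function name is mine, and that this is how the estimate is computed is inferred from the numbers, not taken from the source code):

```python
def estimated_hours(ms_per_mul: float, iterations: int) -> float:
    """Run time in hours for `iterations` multiplications at `ms_per_mul` each."""
    return ms_per_mul * iterations / 1000.0 / 3600.0


# (candidate, ms/mul, iterations, hours printed by -b3) from the HD5870 run.
rows = [
    ("75898^1048576+1", 4.11, 16999276, 19.4),
    ("700000^1048576+1", 4.11, 20360194, 23.2),
]

for name, ms, iters, printed in rows:
    print(f"{name}: computed {estimated_hours(ms, iters):.2f} h, printed {printed} h")
```

Small differences are expected because the ms/mul value shown is itself rounded to two decimals.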




Profile Michael GoetzProject donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68515 - Posted: 17 Aug 2013 | 21:08:19 UTC

The latest version of GeneferOCL (3.1.2-3) incorporating Yves' most recent changes can now be downloaded via this thread.

It's faster. You want it.

Husu*
Joined: 16 Jan 12
Posts: 15
ID: 127298
Credit: 165,338,156
RAC: 0
321 LLR Bronze: Earned 10,000 credits (15,207)PPS LLR Silver: Earned 100,000 credits (373,654)SGS LLR Silver: Earned 100,000 credits (129,350)TRP LLR Bronze: Earned 10,000 credits (12,028)Woodall LLR Bronze: Earned 10,000 credits (11,232)Cullen/Woodall Sieve (suspended) Silver: Earned 100,000 credits (292,273)PPS Sieve Double Bronze: Earned 100,000,000 credits (159,185,362)GFN Turquoise: Earned 5,000,000 credits (5,316,037)
Message 68516 - Posted: 17 Aug 2013 | 21:47:17 UTC

Not much effect on Titan it seems.

New drivers fixed my Genefer crashes, so I'm leaving the Titan to crunch overnight with BOINC & geneferocl and will see the results tomorrow.

All test runs I've made have been ok (no crashes or bailouts because of errors).

-----

geneferocl 3.1.2-3 (Windows 32-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX TITAN', version 'OpenCL 1.1 CUDA' and driver '326.41'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 74.3 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 76.2 us/mul. Err: 0.2344 102481 digits
1471094^32768+1 Time: 80 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 95.2 us/mul. Err: 0.2813 398482 digits
984108^131072+1 Time: 129 us/mul. Err: 0.2295 785521 digits
804904^262144+1 Time: 274 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 486 us/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 973 us/mul. Err: 0.2188 6009544 digits
440400^2097152+1 Time: 1.7 ms/mul. Err: 0.2188 11836006 digits
360204^4194304+1 Time: 3.41 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 6.81 ms/mul. Err: 0.2070 45879398 digits
Genefer Mark = 117.

-----

geneferocl 3.1.2-3 (Windows 32-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -b3


Running on platform 'NVIDIA CUDA', device 'GeForce GTX TITAN', version 'OpenCL 1.1 CUDA' and driver '326.41'.

14^32768+1 37557 digits 0 days 0.0 hours (0.10 ms/mul, 124758 iterations) 294 GFLOPS
75898^32768+1 159916 digits 0 days 0.0 hours (0.10 ms/mul, 531226 iterations) 1253 GFLOPS
700000^32768+1 191533 digits 0 days 0.0 hours (0.08 ms/mul, 636255 iterations) 1501 GFLOPS
5000000^32768+1 219512 digits 0 days 0.0 hours (0.08 ms/mul, 729201 iterations) 1720 GFLOPS

14^65536+1 75113 digits 0 days 0.0 hours (0.10 ms/mul, 249517 iterations) 1243 GFLOPS
75898^65536+1 319831 digits 0 days 0.0 hours (0.10 ms/mul, 1062453 iterations) 5292 GFLOPS
710000^65536+1 383469 digits 0 days 0.0 hours (0.10 ms/mul, 1273852 iterations) 6345 GFLOPS
2500000^65536+1 419296 digits 0 days 0.0 hours (0.10 ms/mul, 1392868 iterations) 6938 GFLOPS

14^131072+1 150226 digits 0 days 0.0 hours (0.14 ms/mul, 499036 iterations) 5233 GFLOPS
75898^131072+1 639662 digits 0 days 0.0 hours (0.14 ms/mul, 2124908 iterations) 22281 GFLOPS
700000^131072+1 766129 digits 0 days 0.0 hours (0.14 ms/mul, 2545023 iterations) 26687 GFLOPS
1000000^131072+1 786432 digits 0 days 0.1 hours (0.14 ms/mul, 2612469 iterations) 27394 GFLOPS

14^262144+1 300451 digits 0 days 0.0 hours (0.27 ms/mul, 998074 iterations) 21978 GFLOPS
75898^262144+1 1279324 digits 0 days 0.3 hours (0.27 ms/mul, 4249818 iterations) 93581 GFLOPS
468750^262144+1 1486604 digits 0 days 0.3 hours (0.26 ms/mul, 4938388 iterations) 108744 GFLOPS
815000^262144+1 1549575 digits 0 days 0.3 hours (0.28 ms/mul, 5147574 iterations) 113350 GFLOPS

14^524288+1 600902 digits 0 days 0.2 hours (0.49 ms/mul, 1996149 iterations) 92097 GFLOPS
75898^524288+1 2558647 digits 0 days 1.1 hours (0.48 ms/mul, 8499637 iterations) 392151 GFLOPS
468750^524288+1 2973207 digits 0 days 1.3 hours (0.50 ms/mul, 9876777 iterations) 455688 GFLOPS
710000^524288+1 3067745 digits 0 days 1.3 hours (0.48 ms/mul, 10190825 iterations) 470178 GFLOPS

14^1048576+1 1201803 digits 0 days 0.9 hours (0.90 ms/mul, 3992299 iterations) 385133 GFLOPS
75898^1048576+1 5117293 digits 0 days 4.1 hours (0.88 ms/mul, 16999276 iterations) 1639903 GFLOPS
468750^1048576+1 5946413 digits 0 days 4.8 hours (0.89 ms/mul, 19753555 iterations) 1905606 GFLOPS
700000^1048576+1 6129030 digits 0 days 5.0 hours (0.89 ms/mul, 20360194 iterations) 1964127 GFLOPS

14^2097152+1 2403605 digits 0 days 3.7 hours (1.67 ms/mul, 7984600 iterations) 1607512 GFLOPS
75898^2097152+1 10234585 digits 0 days 15.6 hours (1.65 ms/mul, 33998553 iterations) 6844813 GFLOPS
380742^2097152+1 11703432 digits 0 days 17.8 hours (1.65 ms/mul, 38877955 iterations) 7827166 GFLOPS
570000^2097152+1 12070945 digits 0 days 18.2 hours (1.64 ms/mul, 40098808 iterations) 8072956 GFLOPS

14^4194304+1 4807210 digits 0 days 15.1 hours (3.42 ms/mul, 15969202 iterations) 6697969 GFLOPS
1248^4194304+1 12986466 digits 1 days 16.3 hours (3.37 ms/mul, 43140102 iterations) 18094270 GFLOPS
10000^4194304+1 16777217 digits 2 days 4.4 hours (3.38 ms/mul, 55732704 iterations) 23375990 GFLOPS
50000^4194304+1 19708909 digits 2 days 13.2 hours (3.37 ms/mul, 65471576 iterations) 27460769 GFLOPS
150000^4194304+1 21710101 digits 2 days 19.8 hours (3.39 ms/mul, 72119391 iterations) 30249065 GFLOPS
309258^4194304+1 23028076 digits 2 days 23.9 hours (3.38 ms/mul, 76497608 iterations) 32085422 GFLOPS
480000^4194304+1 23828853 digits 3 days 2.1 hours (3.37 ms/mul, 79157734 iterations) 33201160 GFLOPS

14^8388608+1 9614419 digits 2 days 15.3 hours (7.14 ms/mul, 31938406 iterations) 27863552 GFLOPS
36^8388608+1 13055212 digits 3 days 13.6 hours (7.11 ms/mul, 43368473 iterations) 37835316 GFLOPS
100^8388608+1 16777217 digits 4 days 13.8 hours (7.10 ms/mul, 55732704 iterations) 48622060 GFLOPS

Profile chip
Joined: 12 Apr 11
Posts: 128
ID: 94709
Credit: 164,082,201
RAC: 5,606
321 LLR Amethyst: Earned 1,000,000 credits (1,081,800)ESP LLR Amethyst: Earned 1,000,000 credits (1,958,365)PPS LLR Ruby: Earned 2,000,000 credits (3,000,162)PSP LLR Amethyst: Earned 1,000,000 credits (1,022,562)SoB LLR Ruby: Earned 2,000,000 credits (2,015,539)SR5 LLR Ruby: Earned 2,000,000 credits (2,000,481)SGS LLR Ruby: Earned 2,000,000 credits (2,000,014)TRP LLR Ruby: Earned 2,000,000 credits (3,000,865)321 Sieve Ruby: Earned 2,000,000 credits (2,000,357)Cullen/Woodall Sieve (suspended) Ruby: Earned 2,000,000 credits (4,000,096)PPS Sieve Emerald: Earned 50,000,000 credits (50,000,422)TRP Sieve (suspended) Ruby: Earned 2,000,000 credits (2,000,500)AP 26/27 Sapphire: Earned 20,000,000 credits (20,000,721)GFN Sapphire: Earned 20,000,000 credits (20,000,117)PSA Emerald: Earned 50,000,000 credits (50,000,198)
Message 68517 - Posted: 17 Aug 2013 | 22:02:31 UTC
Last modified: 17 Aug 2013 | 22:03:01 UTC

Husu*, you forgot to abort the task on your previous host 400849, and now it is lost. That is very bad.

Scott BrownProject donor
Volunteer moderator
Project administrator
Volunteer tester
Project scientist
Joined: 17 Oct 05
Posts: 1909
ID: 1178
Credit: 6,286,595,170
RAC: 167,469
Discovered the World's First base 116 Generalized Cullen prime!!!Discovered 13 mega primesEliminated 7 conjecture "k"sDiscovered 1 Sophie Germain pairDiscovered 1 Fermat divisor2012 Tour de Primes highest prime count2012 Tour de Primes most Mountain Stage primes2015 Tour de Primes highest prime count2016 Tour de Primes highest prime countFound 23 primes in the 2018 Tour de PrimesFound 1 mega prime in the 2018 Tour de PrimesFound 2 primes in the 2018 Tour de Primes Mountain Stage2019 Tour de Primes highest prime countFound 22 primes in the 2019 Tour de Primes321 LLR Double Bronze: Earned 100,000,000 credits (100,829,118)Cullen LLR Double Bronze: Earned 100,000,000 credits (103,870,990)ESP LLR Double Bronze: Earned 100,000,000 credits (114,443,502)Generalized Cullen/Woodall LLR Double Bronze: Earned 100,000,000 credits (108,461,080)PPS LLR Double Silver: Earned 200,000,000 credits (343,575,390)PSP LLR Double Bronze: Earned 100,000,000 credits (108,003,110)SoB LLR Double Bronze: Earned 100,000,000 credits (135,747,083)SR5 LLR Double Silver: Earned 200,000,000 credits (201,210,377)SGS LLR Double Bronze: Earned 100,000,000 credits (160,908,586)TPS LLR (retired) Silver: Earned 100,000 credits (235,439)TRP LLR Double Bronze: Earned 100,000,000 credits (121,443,822)Woodall LLR Double Bronze: Earned 100,000,000 credits (101,447,725)321 Sieve Double Silver: Earned 200,000,000 credits (203,510,966)Cullen/Woodall Sieve (suspended) Emerald: Earned 50,000,000 credits (83,794,448)Generalized Cullen/Woodall Sieve Double Silver: Earned 200,000,000 credits (285,139,652)PPS Sieve Double Ruby: Earned 2,000,000,000 credits (2,107,819,760)Sierpinski (ESP/PSP/SoB) Sieve (suspended) Double Silver: Earned 200,000,000 credits (203,523,358)TRP Sieve (suspended) Double Silver: Earned 200,000,000 credits (201,489,157)AP 26/27 Double Bronze: Earned 100,000,000 credits (119,562,131)GFN Double Amethyst: Earned 1,000,000,000 credits (1,222,501,936)PSA Double Silver: Earned 200,000,000 credits (259,058,048)
Message 68518 - Posted: 17 Aug 2013 | 22:13:00 UTC
Last modified: 17 Aug 2013 | 22:14:35 UTC

Slower overall than geneferCUDA on GT-440 OEM card (64-bit Vista in i7-920):

Command line: geneferocl-windows.exe -b

Running on platform 'NVIDIA CUDA', device 'GeForce GT 440', version 'OpenCL 1.1 CUDA' and driver '314.07'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 120 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 169 us/mul. Err: 0.2344 102481 digits
1471094^32768+1 Time: 311 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 583 us/mul. Err: 0.2813 398482 digits
984108^131072+1 Time: 1.05 ms/mul. Err: 0.2295 785521 digits
804904^262144+1 Time: 2.24 ms/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 4.47 ms/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 9.36 ms/mul. Err: 0.2188 6009544 digits
440400^2097152+1 Time: 20 ms/mul. Err: 0.2188 11836006 digits
360204^4194304+1 Time: 42.5 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 92.7 ms/mul. Err: 0.2070 45879398 digits
Genefer Mark = 10.



Command line: genefercuda.exe -b

Generalized Fermat Number Bench
2009574^8192+1 Time: 192 us/mul. Err: 0.1719 51636 digits
1632282^16384+1 Time: 194 us/mul. Err: 0.1563 101791 digits
1325824^32768+1 Time: 328 us/mul. Err: 0.1563 200622 digits
1076904^65536+1 Time: 583 us/mul. Err: 0.1602 395325 digits
874718^131072+1 Time: 1.13 ms/mul. Err: 0.1563 778813 digits
710492^262144+1 Time: 2.04 ms/mul. Err: 0.1641 1533952 digits
577098^524288+1 Time: 4.29 ms/mul. Err: 0.1875 3020555 digits
468750^1048576+1 Time: 8.71 ms/mul. Err: 0.1563 5946413 digits
380742^2097152+1 Time: 18.4 ms/mul. Err: 0.1484 11703432 digits
309258^4194304+1 Time: 36.7 ms/mul. Err: 0.1719 23028076 digits
100^8388608+1 Time: 76.4 ms/mul. Err: 0.0000 16777217 digits
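
For anyone comparing these two runs: they do not use the same bases, which is why the "digits" column differs between them. The digit count can be cross-checked by hand, since b^N has floor(N * log10(b)) + 1 decimal digits and adding 1 does not change the length. Here is a minimal Python sketch of that check; the function name `gfn_digits` is my own and is not part of Genefer.

```python
import math

def gfn_digits(b, n):
    """Number of decimal digits of b^n + 1, via floor(n * log10(b)) + 1."""
    return math.floor(n * math.log10(b)) + 1

# Compare against the "digits" column of the bench lines above.
print(gfn_digits(2199064, 8192))
print(gfn_digits(100, 8388608))
```

The formula uses floating point, so it could be off by one if N * log10(b) lands extremely close to an integer.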


EDIT: I should note that all 8 threads (HT is on) were loaded, running a combination of TRP sieve and PPS LLR.

Scott Brown
Project donor
Volunteer moderator
Project administrator
Volunteer tester
Project scientist
Joined: 17 Oct 05
Posts: 1909
ID: 1178
Credit: 6,286,595,170
RAC: 167,469
Message 68519 - Posted: 17 Aug 2013 | 22:22:54 UTC

...But faster on a GTX 650 Ti (AMD 1100T, 64-bit Win7, with all cores under the same load as on my i7-920):


Command line: geneferocl-windows.exe -b

Running on platform 'NVIDIA CUDA', device 'GeForce GTX 650 Ti', version 'OpenCL 1.1 CUDA' and driver '311.06'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 130 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 147 us/mul. Err: 0.2344 102481 digits
1471094^32768+1 Time: 201 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 318 us/mul. Err: 0.2813 398482 digits
984108^131072+1 Time: 652 us/mul. Err: 0.2295 785521 digits
804904^262144+1 Time: 1.29 ms/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 2.51 ms/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 5 ms/mul. Err: 0.2188 6009544 digits
440400^2097152+1 Time: 10.2 ms/mul. Err: 0.2188 11836006 digits
360204^4194304+1 Time: 20.9 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 43.1 ms/mul. Err: 0.2070 45879398 digits
Genefer Mark = 20.

Command line: genefercuda-windows.exe -b

Generalized Fermat Number Bench
2199064^8192+1 Time: 189 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 236 us/mul. Err: 0.2188 102481 digits
1471094^32768+1 Time: 266 us/mul. Err: 0.2500 202102 digits
1203210^65536+1 Time: 416 us/mul. Err: 0.2352 398482 digits
984108^131072+1 Time: 791 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 1.37 ms/mul. Err: 0.2227 1548156 digits
658332^524288+1 Time: 2.55 ms/mul. Err: 0.2500 3050541 digits
538452^1048576+1 Time: 5.46 ms/mul. Err: 0.2031 6009544 digits
440400^2097152+1 Time: 10.7 ms/mul. Err: 0.2051 11836006 digits
360204^4194304+1 Time: 21.5 ms/mul. Err: 0.2167 23305854 digits
294612^8388608+1 Time: 47.5 ms/mul. Err: 0.1797 45879398 digits
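
To put a number on "faster", the two bench outputs can be compared exponent by exponent. Below is a rough Python sketch that does so; the regular expression and the helper names `parse_bench` and `speed_ratio` are my own assumptions about the line format shown in this thread, not anything shipped with Genefer. The two sample strings are copied from the GTX 650 Ti runs above.

```python
import re

# Matches bench entries such as "538452^1048576+1 Time: 5.46 ms/mul."
BENCH = re.compile(r"(\d+)\^(\d+)\+1 Time: ([\d.]+) (us|ms)/mul\.")
UNIT_SECONDS = {"us": 1e-6, "ms": 1e-3}

def parse_bench(text):
    """Map exponent N -> seconds per multiplication."""
    return {int(n): float(t) * UNIT_SECONDS[u]
            for _base, n, t, u in BENCH.findall(text)}

def speed_ratio(ocl_text, cuda_text):
    """CUDA time divided by OpenCL time; above 1 means OpenCL is faster."""
    ocl, cuda = parse_bench(ocl_text), parse_bench(cuda_text)
    return {n: cuda[n] / ocl[n] for n in sorted(ocl.keys() & cuda.keys())}

ocl = ("538452^1048576+1 Time: 5 ms/mul. Err: 0.2188 6009544 digits "
       "294612^8388608+1 Time: 43.1 ms/mul. Err: 0.2070 45879398 digits")
cuda = ("538452^1048576+1 Time: 5.46 ms/mul. Err: 0.2031 6009544 digits "
        "294612^8388608+1 Time: 47.5 ms/mul. Err: 0.1797 45879398 digits")

for n, ratio in speed_ratio(ocl, cuda).items():
    print(n, round(ratio, 3))
```

The comparison only makes sense when both runs test the same b^N+1, as they do here.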



Andrew Dicker
Joined: 29 Jun 13
Posts: 17
ID: 237644
Credit: 69,047,503
RAC: 0
Message 68520 - Posted: 17 Aug 2013 | 23:02:38 UTC - in response to Message 68480.

If anyone has BOTH an Nvidia and an AMD GPU in the same system, I'd like to know which one GeneferOCL chooses.


In a week's time (hopefully), I'll be able to answer you. I am building a machine with an Nvidia card for Genefer and an AMD card for other projects, and the parts are still on their way to me.
____________

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68521 - Posted: 17 Aug 2013 | 23:08:15 UTC

A new version of OclGenefer with self-tuning is available on assembla [...]\branches\yves\2013\OclGenefer.

OpenCL lets us enable profiling of the commands in a command queue. I used it to time the kernels for the tuning: it's fast and easy to use.

I print the two tuned parameters (localWorkSize0 and localWorkSize1), just out of curiosity.
According to the NVidia guide they should be a multiple of 32 (the warp size), and according to the ATI guide a multiple of 64 (the wavefront size)... but that's just the theory.
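
For readers who have not looked at the source, the general idea of this kind of self-tuning can be sketched in a few lines of Python. This is not the OclGenefer code, only a minimal illustration under my own assumptions: the candidates are powers of two, the two sizes are searched jointly, and `measure` is a hypothetical callback that runs the kernels once and returns the elapsed time. In a real OpenCL program that time would come from event profiling (a queue created with `CL_QUEUE_PROFILING_ENABLE`, then `clGetEventProfilingInfo` with `CL_PROFILING_COMMAND_START` and `CL_PROFILING_COMMAND_END`).

```python
import statistics

def candidates(max_size):
    """Powers of two from 1 up to max_size (inclusive)."""
    size = 1
    while size <= max_size:
        yield size
        size *= 2

def tune(measure, max_size, repeats=5):
    """Return ((local0, local1), time) for the pair with the lowest median time.

    `measure(local0, local1)` is a hypothetical callback returning seconds.
    """
    best_pair, best_time = None, float("inf")
    for local0 in candidates(max_size):
        for local1 in candidates(max_size):
            t = statistics.median(measure(local0, local1) for _ in range(repeats))
            if t < best_time:
                best_pair, best_time = (local0, local1), t
    return best_pair, best_time

# Stand-in cost model for demonstration only: cheapest at (32, 32).
def fake_measure(local0, local1):
    return 1e-3 * (1 + abs(local0 - 32) / 32 + abs(local1 - 32) / 32)

print(tune(fake_measure, max_size=1024))
```

Taking the median of a few repeats is one way to keep a single noisy measurement from deciding the winner.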

Benches on Tahiti and Fermi are welcome.

Thanks, Yves

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68522 - Posted: 17 Aug 2013 | 23:51:06 UTC - in response to Message 68521.

A new version of oCLgenefer with self-tuning is available on assembla [...]\branches\yves\2013\OclGenefer.

I ran the test several times because I was getting inconsistent results at low N.

GTX 460:

OclGenefer 2013-08-17, Copyright (C) 2001-2013, Yves Gallot.

Options: -q "b^N+1" Test expression.

Platform 'NVIDIA CUDA': GPU device 'GeForce GTX 460' found.
Platform 'AMD Accelerated Parallel Processing': CPU device 'Intel(R) Core(TM)2 Quad CPU @ 2.40GHz' found.


Running on platform 'NVIDIA CUDA', device 'GeForce GTX 460', version 'OpenCL 1.1 CUDA' and driver '320.57'.
Clock frequency = 1350 MHz, compute units = 7.
Global mem size = 1024 MB, cache size = 112 kB (ReadWrite), cache line size = 128 Bytes.
Local mem size = 48 kB (dedicated), Constant mem size = 64 kB.
Max workgroup size = 1024.

localWorkSize0 = 4, localWorkSize1 = 4.
2199064^8192+1 Time: 91.3 us/mul. Err: 0.2188 51956 digits
localWorkSize0 = 16, localWorkSize1 = 16.
1798620^16384+1 Time: 134 us/mul. Err: 0.2266 102481 digits
localWorkSize0 = 16, localWorkSize1 = 16.
1471094^32768+1 Time: 176 us/mul. Err: 0.2344 202102 digits
localWorkSize0 = 32, localWorkSize1 = 32.
1203210^65536+1 Time: 277 us/mul. Err: 0.2188 398482 digits
localWorkSize0 = 64, localWorkSize1 = 64.
984108^131072+1 Time: 507 us/mul. Err: 0.2422 785521 digits
localWorkSize0 = 32, localWorkSize1 = 32.
804904^262144+1 Time: 1.08 ms/mul. Err: 0.2178 1548156 digits
localWorkSize0 = 32, localWorkSize1 = 32.
658332^524288+1 Time: 2.05 ms/mul. Err: 0.2256 3050541 digits
localWorkSize0 = 32, localWorkSize1 = 32.
538452^1048576+1 Time: 4.28 ms/mul. Err: 0.2031 6009544 digits
localWorkSize0 = 32, localWorkSize1 = 32.
440400^2097152+1 Time: 8.9 ms/mul. Err: 0.2305 11836006 digits
localWorkSize0 = 32, localWorkSize1 = 32.
360204^4194304+1 Time: 18.7 ms/mul. Err: 0.1953 23305854 digits
localWorkSize0 = 32, localWorkSize1 = 32.
294612^8388608+1 Time: 39.7 ms/mul. Err: 0.1973 45879398 digits


localWorkSize0 = 4, localWorkSize1 = 4.
2199064^8192+1 Time: 91.1 us/mul. Err: 0.2188 51956 digits
localWorkSize0 = 16, localWorkSize1 = 16.
1798620^16384+1 Time: 136 us/mul. Err: 0.2266 102481 digits
localWorkSize0 = 16, localWorkSize1 = 16.
1471094^32768+1 Time: 178 us/mul. Err: 0.2344 202102 digits
localWorkSize0 = 32, localWorkSize1 = 2.
1203210^65536+1 Time: 292 us/mul. Err: 0.2188 398482 digits
localWorkSize0 = 64, localWorkSize1 = 64.
984108^131072+1 Time: 510 us/mul. Err: 0.2422 785521 digits
localWorkSize0 = 32, localWorkSize1 = 32.
804904^262144+1 Time: 1.08 ms/mul. Err: 0.2178 1548156 digits
localWorkSize0 = 32, localWorkSize1 = 32.
658332^524288+1 Time: 2.06 ms/mul. Err: 0.2256 3050541 digits
localWorkSize0 = 32, localWorkSize1 = 32.
538452^1048576+1 Time: 4.29 ms/mul. Err: 0.2031 6009544 digits
localWorkSize0 = 32, localWorkSize1 = 32.
440400^2097152+1 Time: 8.89 ms/mul. Err: 0.2305 11836006 digits
localWorkSize0 = 32, localWorkSize1 = 32.
360204^4194304+1 Time: 18.6 ms/mul. Err: 0.1953 23305854 digits
localWorkSize0 = 32, localWorkSize1 = 32.
294612^8388608+1 Time: 39.7 ms/mul. Err: 0.1973 45879398 digits


localWorkSize0 = 4, localWorkSize1 = 1.
2199064^8192+1 Time: 92.3 us/mul. Err: 0.2188 51956 digits
localWorkSize0 = 16, localWorkSize1 = 4.
1798620^16384+1 Time: 135 us/mul. Err: 0.2266 102481 digits
localWorkSize0 = 16, localWorkSize1 = 32.
1471094^32768+1 Time: 175 us/mul. Err: 0.2344 202102 digits
localWorkSize0 = 32, localWorkSize1 = 32.
1203210^65536+1 Time: 276 us/mul. Err: 0.2188 398482 digits
localWorkSize0 = 64, localWorkSize1 = 64.
984108^131072+1 Time: 504 us/mul. Err: 0.2422 785521 digits
localWorkSize0 = 32, localWorkSize1 = 32.
804904^262144+1 Time: 1.07 ms/mul. Err: 0.2178 1548156 digits
localWorkSize0 = 32, localWorkSize1 = 32.
658332^524288+1 Time: 2.03 ms/mul. Err: 0.2256 3050541 digits
localWorkSize0 = 32, localWorkSize1 = 32.
538452^1048576+1 Time: 4.25 ms/mul. Err: 0.2031 6009544 digits
localWorkSize0 = 32, localWorkSize1 = 32.
440400^2097152+1 Time: 8.84 ms/mul. Err: 0.2305 11836006 digits
localWorkSize0 = 32, localWorkSize1 = 32.
360204^4194304+1 Time: 18.6 ms/mul. Err: 0.1953 23305854 digits
localWorkSize0 = 32, localWorkSize1 = 32.
294612^8388608+1 Time: 39.7 ms/mul. Err: 0.1973 45879398 digits


localWorkSize0 = 4, localWorkSize1 = 4.
2199064^8192+1 Time: 90 us/mul. Err: 0.2188 51956 digits
localWorkSize0 = 16, localWorkSize1 = 2.
1798620^16384+1 Time: 136 us/mul. Err: 0.2266 102481 digits
localWorkSize0 = 16, localWorkSize1 = 2.
1471094^32768+1 Time: 180 us/mul. Err: 0.2344 202102 digits
localWorkSize0 = 32, localWorkSize1 = 64.
1203210^65536+1 Time: 276 us/mul. Err: 0.2188 398482 digits
localWorkSize0 = 64, localWorkSize1 = 64.
984108^131072+1 Time: 505 us/mul. Err: 0.2422 785521 digits
localWorkSize0 = 32, localWorkSize1 = 32.
804904^262144+1 Time: 1.07 ms/mul. Err: 0.2178 1548156 digits
localWorkSize0 = 32, localWorkSize1 = 32.
658332^524288+1 Time: 2.03 ms/mul. Err: 0.2256 3050541 digits
localWorkSize0 = 32, localWorkSize1 = 32.
538452^1048576+1 Time: 4.25 ms/mul. Err: 0.2031 6009544 digits
localWorkSize0 = 32, localWorkSize1 = 32.
440400^2097152+1 Time: 8.85 ms/mul. Err: 0.2305 11836006 digits
localWorkSize0 = 32, localWorkSize1 = 32.
360204^4194304+1 Time: 18.6 ms/mul. Err: 0.1953 23305854 digits
localWorkSize0 = 32, localWorkSize1 = 32.
294612^8388608+1 Time: 39.7 ms/mul. Err: 0.1973 45879398 digits
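
In the four runs above the tuner picks different work sizes for N up to 65536, while the measured times differ by roughly 6% or less; from N = 131072 upward every run picks the same pair. A quick way to spot this in a pasted log is sketched below in Python. The regular expression and the helper names are my own assumptions about the output format, and the sample log is an excerpt of the first two runs.

```python
import re
from collections import defaultdict

# A "localWorkSize" line followed by the bench line it applies to.
PAIR = re.compile(
    r"localWorkSize0 = (\d+), localWorkSize1 = (\d+)\.\s+"
    r"\d+\^(\d+)\+1 Time: ([\d.]+ (?:us|ms))/mul\."
)

def choices_by_exponent(log):
    """Map N -> list of ((local0, local1), time string), one entry per run."""
    seen = defaultdict(list)
    for l0, l1, n, t in PAIR.findall(log):
        seen[int(n)].append(((int(l0), int(l1)), t))
    return dict(seen)

def unstable(log):
    """Exponents for which the runs did not all pick the same pair."""
    return sorted(n for n, runs in choices_by_exponent(log).items()
                  if len({pair for pair, _ in runs}) > 1)

log = """
localWorkSize0 = 32, localWorkSize1 = 32.
1203210^65536+1 Time: 277 us/mul. Err: 0.2188 398482 digits
localWorkSize0 = 64, localWorkSize1 = 64.
984108^131072+1 Time: 507 us/mul. Err: 0.2422 785521 digits
localWorkSize0 = 32, localWorkSize1 = 2.
1203210^65536+1 Time: 292 us/mul. Err: 0.2188 398482 digits
localWorkSize0 = 64, localWorkSize1 = 64.
984108^131072+1 Time: 510 us/mul. Err: 0.2422 785521 digits
"""

print(unstable(log))
```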

____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

Roger
Volunteer moderator
Project administrator
Volunteer developer
Volunteer tester
Project scientist
Joined: 27 Nov 11
Posts: 1112
ID: 120786
Credit: 261,530,863
RAC: 1,408
Message 68524 - Posted: 18 Aug 2013 | 4:49:44 UTC - in response to Message 68521.

A new version of oCLgenefer with self-tuning is available on assembla [...]\branches\yves\2013\OclGenefer.

Full test run with self-tuning geneferocl, revision 400. HD7970GHz GPU, CPU is 3.5 GHz AMD Phenom II X6 1100T:

OclGenefer 2013-08-17, Copyright (C) 2001-2013, Yves Gallot.

Options: -q "b^N+1" Test expression.

Platform 'AMD Accelerated Parallel Processing': GPU device 'Tahiti' found.
Platform 'AMD Accelerated Parallel Processing': CPU device 'AMD Phenom(tm) II X6 1100T Processor' found.

Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1124.2)' and driver '1124.2 (VM)'.
Clock frequency = 1050 MHz, compute units = 32.
Global mem size = 2048 MB, cache size = 16 kB (ReadWrite), cache line size = 64 Bytes.
Local mem size = 32 kB (dedicated), Constant mem size = 64 kB.
Max workgroup size = 256.

localWorkSize0 = 64, localWorkSize1 = 128.
2199064^8192+1 Time: 583 us/mul. Err: 0.2188 51956 digits
localWorkSize0 = 1, localWorkSize1 = 2.
1798620^16384+1 Time: 303 us/mul. Err: 0.2227 102481 digits
localWorkSize0 = 1, localWorkSize1 = 2.
1471094^32768+1 Time: 526 us/mul. Err: 0.2383 202102 digits
localWorkSize0 = 2, localWorkSize1 = 4.
1203210^65536+1 Time: 621 us/mul. Err: 0.2305 398482 digits
localWorkSize0 = 8, localWorkSize1 = 8.
984108^131072+1 Time: 625 us/mul. Err: 0.2188 785521 digits
localWorkSize0 = 16, localWorkSize1 = 32.
804904^262144+1 Time: 944 us/mul. Err: 0.2266 1548156 digits
localWorkSize0 = 4, localWorkSize1 = 4.
658332^524288+1 Time: 711 us/mul. Err: 0.2109 3050541 digits
localWorkSize0 = 8, localWorkSize1 = 4.
538452^1048576+1 Time: 1.15 ms/mul. Err: 0.2134 6009544 digits
localWorkSize0 = 4, localWorkSize1 = 2.
440400^2097152+1 Time: 2.24 ms/mul. Err: 0.2266 11836006 digits
localWorkSize0 = 8, localWorkSize1 = 2.
360204^4194304+1 Time: 4.64 ms/mul. Err: 0.1953 23305854 digits
localWorkSize0 = 8, localWorkSize1 = 2.
294612^8388608+1 Time: 9 ms/mul. Err: 0.2109 45879398 digits

localWorkSize0 = 1, localWorkSize1 = 1.
102^64+1 is a probable prime. (0.9 sec., err = 1.46e-011)
localWorkSize0 = 1, localWorkSize1 = 1.
15000250^64+1 is a probable prime. (1.6 sec., err = 0.375)
localWorkSize0 = 2, localWorkSize1 = 2.
120^128+1 is a probable prime. (1.2 sec., err = 5.09e-011)
localWorkSize0 = 2, localWorkSize1 = 2.
10000038^128+1 is a probable prime. (2.6 sec., err = 0.344)
localWorkSize0 = 1, localWorkSize1 = 2.
278^256+1 is a probable prime. (2.0 sec., err = 3.49e-010)
localWorkSize0 = 2, localWorkSize1 = 2.
5684328^256+1 is a probable prime. (4.5 sec., err = 0.164)
localWorkSize0 = 2, localWorkSize1 = 2.
46^512+1 is a probable prime. (2.9 sec., err = 1.73e-011)
localWorkSize0 = 1, localWorkSize1 = 2.
4619000^512+1 is a probable prime. (6.1 sec., err = 0.174)
localWorkSize0 = 2, localWorkSize1 = 4.
824^1024+1 is a probable prime. (6.0 sec., err = 8.5e-009)
localWorkSize0 = 2, localWorkSize1 = 2.
3752220^1024+1 is a probable prime. (7.3 sec., err = 0.188)
localWorkSize0 = 4, localWorkSize1 = 2.
150^2048+1 is a probable prime. (7.5 sec., err = 4.37e-010)
localWorkSize0 = 2, localWorkSize1 = 2.
3066672^2048+1 is a probable prime. (9.8 sec., err = 0.217)
localWorkSize0 = 2, localWorkSize1 = 2.
1534^4096+1 is a probable prime. (10.2 sec., err = 7.08e-008)
localWorkSize0 = 2, localWorkSize1 = 64.
2485064^4096+1 is a probable prime. (13.7 sec., err = 0.211)
localWorkSize0 = 128, localWorkSize1 = 128.
30406^8192+1 is a probable prime. (18.8 sec., err = 4.58e-005)
localWorkSize0 = 2, localWorkSize1 = 1.
2030234^8192+1 is a probable prime. (20.6 sec., err = 0.209)
localWorkSize0 = 64, localWorkSize1 = 128.
67234^16384+1 is a probable prime. (32.6 sec., err = 0.000351)
localWorkSize0 = 64, localWorkSize1 = 128.
1651902^16384+1 is a probable prime. (40.3 sec., err = 0.219)
localWorkSize0 = 64, localWorkSize1 = 64.
70906^32768+1 is a probable prime. (53.5 sec., err = 0.000648)
localWorkSize0 = 128, localWorkSize1 = 128.
1277444^32768+1 is a probable prime. (69.6 sec., err = 0.203)
localWorkSize0 = 2, localWorkSize1 = 4.
48594^65536+1 is a probable prime. (128.6 sec., err = 0.000458)
localWorkSize0 = 2, localWorkSize1 = 4.
857678^65536+1 is a probable prime. (155.1 sec., err = 0.137)
localWorkSize0 = 8, localWorkSize1 = 8.
62722^131072+1 is a probable prime. (293.9 sec., err = 0.00116)
localWorkSize0 = 8, localWorkSize1 = 128.
572186^131072+1 is a probable prime. (352.2 sec., err = 0.0957)
localWorkSize0 = 16, localWorkSize1 = 32.
24518^262144+1 is a probable prime. (1310.7 sec., err = 0.000259)
localWorkSize0 = 8, localWorkSize1 = 32.
676754^262144+1 is a probable prime. (1636.6 sec., err = 0.215)
localWorkSize0 = 4, localWorkSize1 = 4.
75898^524288+1 is a probable prime. (4649.8 sec., err = 0.00391)
localWorkSize0 = 4, localWorkSize1 = 4.
475856^524288+1 is a probable prime. (5369.9 sec., err = 0.16)


I compared the results to revision 386 from 14/08/2013 (the previous fastest for me).
N values of 524288 and higher are definitely faster.
N values of 131072 and lower are definitely slower.
The auto-tuned transform shows a lot of promise but needs tweaking.

NeoMetal*
Volunteer tester
Joined: 25 Mar 11
Posts: 418
ID: 92179
Credit: 1,747,428,303
RAC: 0
Message 68525 - Posted: 18 Aug 2013 | 4:55:55 UTC
Last modified: 18 Aug 2013 | 5:07:31 UTC

Here are more Fermi benches. I've listed the new 32-bit version alongside the old 64-bit one. It looks like the 32-bit version is slightly faster on the GTX 560 Ti (GF114 chip) but slightly slower on the 'Big Fermis' (GF100, GF110 chips).

EDIT: I forgot to set the 470 to stock before testing. Shaders were at 1415 MHz for the test instead of the stock 1215 MHz.

GTX 470 on 1100T

C:\>geneferocl-windows_2.exe -b
geneferocl 3.1.2-3 (Windows 32-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows_2.exe -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX 470', version 'OpenCL 1.1 CUDA' and driver '320.18'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 86.1 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 92.8 us/mul. Err: 0.2344 102481 digits
1471094^32768+1 Time: 126 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 177 us/mul. Err: 0.2813 398482 digits
984108^131072+1 Time: 271 us/mul. Err: 0.2295 785521 digits
804904^262144+1 Time: 605 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 1.11 ms/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 2.25 ms/mul. Err: 0.2188 6009544 digits
440400^2097152+1 Time: 4.61 ms/mul. Err: 0.2188 11836006 digits
360204^4194304+1 Time: 9.69 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 20.6 ms/mul. Err: 0.2070 45879398 digits
Genefer Mark = 43.

C:\>geneferocl-windows.exe -b
geneferocl 3.1.2-2 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX 470', version 'OpenCL 1.1 CUDA' and driver '320.18'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 86.7 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 96.4 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 127 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 182 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 293 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 591 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 1.06 ms/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 2.13 ms/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 4.34 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 9.14 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 19.5 ms/mul. Err: 0.1895 45879398 digits
Genefer Mark = 554.


GTX 580 on 1100T

C:\>geneferocl-windows_2.exe -b
geneferocl 3.1.2-3 (Windows 32-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows_2.exe -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX 580', version 'OpenCL 1.1 CUDA' and driver '314.22'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 72.4 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 77.2 us/mul. Err: 0.2344 102481 digits
1471094^32768+1 Time: 80.3 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 122 us/mul. Err: 0.2813 398482 digits
984108^131072+1 Time: 199 us/mul. Err: 0.2295 785521 digits
804904^262144+1 Time: 438 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 844 us/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 1.73 ms/mul. Err: 0.2188 6009544 digits
440400^2097152+1 Time: 3.54 ms/mul. Err: 0.2188 11836006 digits
360204^4194304+1 Time: 7.52 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 16 ms/mul. Err: 0.2070 45879398 digits
Genefer Mark = 56.

C:\>geneferocl-windows.exe -b
geneferocl 3.1.2-2 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX 580', version 'OpenCL 1.1 CUDA' and driver '314.22'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 73.5 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 78.6 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 81.3 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 123 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 204 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 433 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 822 us/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 1.68 ms/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 3.43 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 7.28 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 15.6 ms/mul. Err: 0.1895 45879398 digits
Genefer Mark = 699.


GTX 570 on 2600K

C:\>geneferocl-windows_2.exe -b
geneferocl 3.1.2-3 (Windows 32-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows_2.exe -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX 570', version 'OpenCL 1.1 CUDA' and driver '326.41'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 81.8 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 88.2 us/mul. Err: 0.2344 102481 digits
1471094^32768+1 Time: 117 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 168 us/mul. Err: 0.2813 398482 digits
984108^131072+1 Time: 249 us/mul. Err: 0.2295 785521 digits
804904^262144+1 Time: 537 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 1.01 ms/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 2.03 ms/mul. Err: 0.2188 6009544 digits
440400^2097152+1 Time: 4.14 ms/mul. Err: 0.2188 11836006 digits
360204^4194304+1 Time: 8.75 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 18.4 ms/mul. Err: 0.2070 45879398 digits
Genefer Mark = 48.

C:\>geneferocl-windows.exe -b
geneferocl 3.1.2-2 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX 570', version 'OpenCL 1.1 CUDA' and driver '326.41'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 82.4 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 88.8 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 118 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 168 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 273 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 527 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 967 us/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 1.93 ms/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 3.91 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 8.2 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 17.5 ms/mul. Err: 0.1895 45879398 digits
Genefer Mark = 615.


GTX 560 Ti on 2600K

C:\>geneferocl-windows_2.exe -d 1 -b
geneferocl 3.1.2-3 (Windows 32-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows_2.exe -d 1 -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX 560 Ti', version 'OpenCL 1.1 CUDA' and driver '326.41'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 69 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 73.2 us/mul. Err: 0.2344 102481 digits
1471094^32768+1 Time: 107 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 194 us/mul. Err: 0.2813 398482 digits
984108^131072+1 Time: 352 us/mul. Err: 0.2295 785521 digits
804904^262144+1 Time: 757 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 1.5 ms/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 3.13 ms/mul. Err: 0.2188 6009544 digits
440400^2097152+1 Time: 6.6 ms/mul. Err: 0.2188 11836006 digits
360204^4194304+1 Time: 13.8 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 29.1 ms/mul. Err: 0.2070 45879398 digits
Genefer Mark = 30.

C:\>geneferocl-windows.exe -d 1 -b
geneferocl 3.1.2-2 (Windows 64-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -d 1 -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX 560 Ti', version 'OpenCL
1.1 CUDA' and driver '326.41'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 69.6 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 74.2 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 107 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 195 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 376 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 811 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 1.59 ms/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 3.34 ms/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 7.03 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 14.6 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 30.6 ms/mul. Err: 0.1895 45879398 digits
Genefer Mark = 349.


More tweaking needs to be done.
____________
Largest Primes to Date:
As Double Checker: SR5 109208*5^1816285+1 Dgts-1,269,534
As Initial Finder: SR5 243944*5^1258576-1 Dgts-879,713


chip
Joined: 12 Apr 11
Posts: 128
ID: 94709
Credit: 164,082,201
RAC: 5,606
Message 68526 - Posted: 18 Aug 2013 | 5:08:07 UTC
Last modified: 18 Aug 2013 | 5:08:39 UTC

Windows 32-bit OpenGL
Windows 64-bit OpenGL

Maybe OpenCL?

Husu*
Joined: 16 Jan 12
Posts: 15
ID: 127298
Credit: 165,338,156
RAC: 0
Message 68528 - Posted: 18 Aug 2013 | 6:58:17 UTC - in response to Message 68517.
Last modified: 18 Aug 2013 | 7:04:03 UTC

Husu*, you forgot to abort the task on previous host 400849 and now it is lost - it is very bad.


It has been aborted now.

------

First regular task done on the Titan earlier: http://www.primegrid.com/result.php?resultid=474003987
The second one is coming shortly.

Run time 16,600.85
CPU time 16,596.48

The OpenCL CPU requirement comes from Nvidia's bad implementation, though :(

It seems the new drivers have fixed the crashes; now I can complete full runs.

DaveB
Project donor
Joined: 20 Jun 09
Posts: 351
ID: 42198
Credit: 11,898,570
RAC: 0
Message 68529 - Posted: 18 Aug 2013 | 7:59:25 UTC
Last modified: 18 Aug 2013 | 8:10:32 UTC

Under CUDA there is a limit on the size of numbers the GPU can handle. Does this limit still apply in the OpenCL version, or can we go higher?
____________
Member team AUSTRALIA
My lucky number is 9291*2^1085585+1

Roger
Volunteer moderator
Project administrator
Volunteer developer
Volunteer tester
Project scientist
Joined: 27 Nov 11
Posts: 1112
ID: 120786
Credit: 261,530,863
RAC: 1,408
Message 68530 - Posted: 18 Aug 2013 | 8:26:58 UTC - in response to Message 68529.

It seems the limits can go higher or lower just by adding auto-tuning:

geneferocl 3.1.2-3 (Windows 32-bit OpenGL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -l


Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1124.2)' and driver '1124.2 (VM)'.

Generalized Fermat Number b Limits
The upper bound m = 8192, b = 2720000, Err = 0.2969
The upper bound m = 16384, b = 2210000, Err = 0.2969
The upper bound m = 32768, b = 1830000, Err = 0.3008
The upper bound m = 65536, b = 1490000, Err = 0.2969
The upper bound m = 131072, b = 1235000, Err = 0.3008
The upper bound m = 262144, b = 1015000, Err = 0.2891
The upper bound m = 524288, b = 840000, Err = 0.3047
The upper bound m = 1048576, b = 690000, Err = 0.3008
The upper bound m = 2097152, b = 565000, Err = 0.3066
The upper bound m = 4194304, b = 470000, Err = 0.3125
The upper bound m = 8388608, b = 385000, Err = 0.3125


geneferocl 3.1.2-1 and geneferocl 3.1.2-2:

Generalized Fermat Number b Limits
The upper bound m = 8192, b = 2670000, Err = 0.2910
The upper bound m = 16384, b = 2210000, Err = 0.2969
The upper bound m = 32768, b = 1780000, Err = 0.2969
The upper bound m = 65536, b = 1505000, Err = 0.2969
The upper bound m = 131072, b = 1240000, Err = 0.2969
The upper bound m = 262144, b = 1015000, Err = 0.3047
The upper bound m = 524288, b = 825000, Err = 0.3057
The upper bound m = 1048576, b = 680000, Err = 0.3047
The upper bound m = 2097152, b = 555000, Err = 0.2969
The upper bound m = 4194304, b = 455000, Err = 0.2813
The upper bound m = 8388608, b = 385000, Err = 0.3125

These values can be compared against CUDA "B" limits given in this thread:
http://www.primegrid.com/forum_thread.php?id=4152

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68531 - Posted: 18 Aug 2013 | 9:30:36 UTC - in response to Message 68529.

Under CUDA there is a limit on the size of numbers the GPU can handle. Does this limit still apply in the OpenCL version, or can we go higher?

A limit? The limit is GPU memory size... and a reasonable amount of time!

We can extend the tests:
360204^4194304+1 Err: 0.1953 23305854 digits
294612^8388608+1 Err: 0.1973 45879398 digits
240964^16777216+1 Err: 0.1914 90294174 digits
197084^33554432+1 Err: 0.1907 177659020 digits

If you test 100000^33554432+1 on a Titan, the computation time is about log2(100000) * 2^25 * 30 ms ~ 200 days.
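
The estimate above can be reproduced with a short Python sketch. It assumes, as the post does, that a test of b^N+1 needs about log2(b) * N multiplications, and it takes the 30 ms per multiplication at N = 2^25 on a Titan from the post rather than from a measurement. The helper names gfn_digits and estimate_days are made up for illustration and are not part of genefer. The digit count uses double-precision logarithms, so for the largest exponents it should be read as an estimate that could in principle be off by one.

```python
import math

def gfn_digits(b: int, n: int) -> int:
    """Decimal digits of b^n + 1, computed from logarithms (hypothetical helper)."""
    return math.floor(n * math.log10(b)) + 1

def estimate_days(b: int, n: int, ms_per_mul: float) -> float:
    """Rough run time in days, assuming about log2(b) * n multiplications."""
    multiplications = math.log2(b) * n
    seconds = multiplications * ms_per_mul / 1000.0
    return seconds / 86400.0

# Smallest entry of the genefer bench: 2199064^8192+1.
print(gfn_digits(2199064, 8192))

# 100000^33554432+1 at an assumed 30 ms per multiplication.
print(round(estimate_days(100000, 2**25, 30.0), 1))
```

The second value comes out a little under 200 days, consistent with the rounded figure in the post.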

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68532 - Posted: 18 Aug 2013 | 10:02:15 UTC - in response to Message 68524.

A new version of oCLgenefer with self-tuning is available on assembla [...]

Full test run with self-tuning geneferocl, revision 400. HD7970GHz
N values 524288 and higher are definitely having quicker times.
N values 131072 and lower are definitely having slower times.
The auto tuned transform is showing lots of promise but needs tweaking.

Thanks, very good news!
538452^1048576+1 Time: 1.52 => 1.15 ms/mul.
360204^4194304+1 Time: 5.61 => 4.64 ms/mul.
That's the most important for the Generalized Fermat Prime Search.

Compare with what the "AMD Accelerated Parallel Processing, OpenCL Programming Guide, August 2013" recommends:
The fundamental unit of work on AMD GPUs is called a wavefront. Each wavefront consists of 64 work-items; thus, the optimal local work size is an integer multiple of 64 (specifically 64, 128, 192, or 256) work-items per workgroup.

Yet the bench shows that the optimal settings here are local work sizes of 8 and 2!

For small N, the problem is probably measurement accuracy: a call takes only a few microseconds. But the timer resolution of an OpenCL device can be queried, so the measurement error is known.
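
The self-tuning idea can be sketched as a brute-force search: measure the same work for each candidate pair of local work sizes and keep the fastest. The Python sketch below is only an illustration under stated assumptions, not genefer's code. The names time_call, autotune, and modelled_cost are hypothetical, and the cost model is invented so that the example runs without a GPU. In real OpenCL code the timer resolution mentioned above is available from clGetDeviceInfo with CL_DEVICE_PROFILING_TIMER_RESOLUTION, reported in nanoseconds.

```python
import time
from itertools import product

def time_call(fn, *args, repeats=5):
    """Best-of-`repeats` wall-clock time of fn(*args), in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        # Keeping the minimum damps noise from a coarse timer.
        best = min(best, time.perf_counter() - start)
    return best

def autotune(measure, candidates):
    """Return (pair, cost) for the candidate with the smallest measured cost."""
    timings = {pair: measure(*pair) for pair in candidates}
    best = min(timings, key=timings.get)
    return best, timings[best]

def modelled_cost(lws0, lws1):
    # Made-up cost model standing in for a real measurement. It is chosen
    # so that (8, 2) wins, mirroring the setting reported in the post.
    return 1.0 + abs(lws0 - 8) + abs(lws1 - 2)

sizes = [1, 2, 4, 8, 16]
pair, cost = autotune(modelled_cost, list(product(sizes, sizes)))
print(pair, cost)
```

With a real kernel, the measure argument would wrap an actual launch, for example `lambda a, b: time_call(launch, a, b)`, where `launch` is a hypothetical function that enqueues the kernel and waits for it to finish.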

Crystal Pellet
Project donor
Joined: 9 Nov 08
Posts: 180
ID: 31494
Credit: 77,230,917
RAC: 390
Message 68533 - Posted: 18 Aug 2013 | 10:08:43 UTC - in response to Message 68514.

Michael Goetz wrote:
GeneferOCL 3.1.2-3 is now available for download. You DO want this new version if you're running GeneferOCL because it's faster!

It's got a new and improved faster transform.

I gave the latest GeneferOCL (3.1.2-3) a chance, but it is not faster on my machine.
With 7% done, the estimated run time is 29+ hours, with higher CPU usage (96.6%) than v3.1.2-2.
A result with that earlier version was returned in 27 hours with CPU usage of 82.1%.
____________

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68534 - Posted: 18 Aug 2013 | 11:01:23 UTC - in response to Message 68533.

Michael Goetz wrote:
GeneferOCL 3.1.2-3 is now available for download. You DO want this new version if you're running GeneferOCL because it's faster!

I gave the latest GeneferOCL (3.1.2-3) a chance, but it is not faster on my machine.
With 7% done, the estimated run time is 29+ hours, with higher CPU usage (96.6%) than v3.1.2-2.
A result with that earlier version was returned in 27 hours with CPU usage of 82.1%.

Yes, it's slower on ATI cards, sorry for that.
The 3.1.2-4 will solve this problem with a "self-tuning transform".

Roger
Volunteer moderator
Project administrator
Volunteer developer
Volunteer tester
Project scientist
Joined: 27 Nov 11
Posts: 1112
ID: 120786
Credit: 261,530,863
RAC: 1,408
Message 68535 - Posted: 18 Aug 2013 | 11:10:58 UTC - in response to Message 68534.

I tried a bench of the new 3.1.2-3 geneferocl with HD7970Ghz:

538452^1048576+1 Time: 1.78 ms/mul. Err: 0.2188 6009544 digits
360204^4194304+1 Time: 6.8 ms/mul. Err: 0.1953 23305854 digits
Genefer Mark = 60.

Definitely slower. Then I tried revision 400 of the assembla algorithm with "self-tuning transform" (again):
538452^1048576+1 Time: 1.1 ms/mul. Err: 0.2134 6009544 digits
360204^4194304+1 Time: 4.45 ms/mul. Err: 0.1953 23305854 digits

New best runs! I can see you're working on 3.1.2-4 of geneferocl; let's see what that brings.

Crystal Pellet
Project donor
Joined: 9 Nov 08
Posts: 180
ID: 31494
Credit: 77,230,917
RAC: 390
Message 68536 - Posted: 18 Aug 2013 | 12:21:52 UTC - in response to Message 68534.

Yes, it's slower on ATI cards, sorry for that.
The 3.1.2-4 will solve this problem with a "self-tuning transform".

No worries, Yves. This thread is called "....available for testing" ;)

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68537 - Posted: 18 Aug 2013 | 12:34:12 UTC - in response to Message 68532.

Thanks, very good news!
538452^1048576+1 Time: 1.52 => 1.15 ms/mul.
360204^4194304+1 Time: 5.61 => 4.64 ms/mul.
That's the most important for the Generalized Fermat Prime Search.


I'd include N=2097152 in the "most important" list. At most, we're only a few years away from exhausting N=1048576, and it could be a lot shorter than that. GeneferOCL likely will greatly increase the number of computers crunching GFN, and if we do a GFN challenge next year there's a good chance we'll be crunching 2097152 in 2014.
____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68538 - Posted: 18 Aug 2013 | 12:47:11 UTC - in response to Message 68535.

New best runs! I can see you're working on 3.1.2-4 of geneferocl; let's see what that brings.

I've just committed OclGenefer 2013-08-18 (assembla rev 402).

Roger, if it's OK on your HD7970Ghz, I will update real genefer with it.
Just the result of the bench is necessary.

It also prints "Profiling timer resolution": 1 microsecond on NVidia.

Thanks!

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68540 - Posted: 18 Aug 2013 | 13:00:28 UTC - in response to Message 68537.

I'd include N=2097152 in the "most important" list. At most, we're only a few years away from exhausting N=1048576, and it could be a lot shorter than that.

I would be interested in some statistics, but I can't find them on the Genefer subproject status page:
What is the range for N=1048576? [2-600000]?
How many candidates pass through the sieve and should be tested?
38545 were tested.
Today, how many of them are tested (and double checked) per day?

Husu*
Joined: 16 Jan 12
Posts: 15
ID: 127298
Credit: 165,338,156
RAC: 0
Message 68541 - Posted: 18 Aug 2013 | 13:50:03 UTC

One of my tasks just got validated, so roughly:
A Titan (16,600 sec) on OpenCL is about 2x faster than a 580 on CUDA (31,596 sec).

OpenCL currently uses as much CPU time as GPU time, though. The workunit:
http://www.primegrid.com/workunit.php?wuid=350124153

Also, for comparison, a 670 takes 45,393 sec on OpenCL (I ran two WUs): http://www.primegrid.com/workunit.php?wuid=350124165

I'll leave the Titan to continue with OpenCL Genefer, as it's way faster than CUDA per workunit.

Roger
Volunteer moderator
Project administrator
Volunteer developer
Volunteer tester
Project scientist
Joined: 27 Nov 11
Posts: 1112
ID: 120786
Credit: 261,530,863
RAC: 1,408
Message 68542 - Posted: 18 Aug 2013 | 13:52:01 UTC - in response to Message 68538.
Last modified: 18 Aug 2013 | 13:53:07 UTC

New best runs! I can see you're working on 3.1.2-4 of geneferocl; let's see what that brings.

I've just committed OclGenefer 2013-08-18 (assembla rev 402).

Roger, if it's OK on your HD7970Ghz, I will update real genefer with it.
Just the result of the bench is necessary.

It also prints "Profiling timer resolution": 1 microsecond on NVidia.

Thanks!

402 benchies:
OclGenefer 2013-08-18, Copyright (C) 2001-2013, Yves Gallot.
Options: -q "b^N+1" Test expression.
Platform 'AMD Accelerated Parallel Processing': GPU device 'Tahiti' found.
Platform 'AMD Accelerated Parallel Processing': CPU device 'AMD Phenom(tm) II X6 1100T Processor' found.
Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1124.2)' and driver '1124.2 (VM)'.
Clock frequency = 1050 MHz, compute units = 32.
Global mem size = 2048 MB, cache size = 16 kB (ReadWrite), cache line size = 64 Bytes.
Local mem size = 32 kB (dedicated), Constant mem size = 64 kB.
Max workgroup size = 256, Profiling timer resolution = 0.0 usec.
2199064^8192+1 Time: 76.5 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 73.7 us/mul. Err: 0.2227 102481 digits
1471094^32768+1 Time: 73.2 us/mul. Err: 0.2383 202102 digits
1203210^65536+1 Time: 87.6 us/mul. Err: 0.2305 398482 digits
984108^131072+1 Time: 130 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 305 us/mul. Err: 0.2266 1548156 digits
localWorkSize0 = 4, localWorkSize1 = 4.
658332^524288+1 Time: 670 us/mul. Err: 0.2109 3050541 digits
localWorkSize0 = 4, localWorkSize1 = 2.
538452^1048576+1 Time: 1.1 ms/mul. Err: 0.2134 6009544 digits
localWorkSize0 = 4, localWorkSize1 = 2.
440400^2097152+1 Time: 2.34 ms/mul. Err: 0.2266 11836006 digits
localWorkSize0 = 8, localWorkSize1 = 2.
360204^4194304+1 Time: 4.45 ms/mul. Err: 0.1953 23305854 digits
localWorkSize0 = 8, localWorkSize1 = 2.
294612^8388608+1 Time: 9.34 ms/mul. Err: 0.2109 45879398 digits

Awesome!
Profiling timer resolution a bit dodgy on AMD.

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68543 - Posted: 18 Aug 2013 | 13:53:28 UTC - in response to Message 68538.

New best runs! I can see you're working on 3.1.2-4 of geneferocl; let's see what that brings.

I've just committed OclGenefer 2013-08-18 (assembla rev 402).

assembla rev 403.

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68544 - Posted: 18 Aug 2013 | 13:54:12 UTC - in response to Message 68540.

I'd include N=2097152 in the "most important" list. At most, we're only a few years away from exhausting N=1048576, and it could be a lot shorter than that.

I would be interested in some statistics, but I can't find them on the Genefer subproject status page:
What is the range for N=1048576? [2-600000]?


The range is 6 to a little bit below the "-l" B limit. Exactly where we'll stop will depend on a number of factors which may change between now and then.

2 and 4 can't be tested with Genefer.

How many candidates pass through the sieve and should be tested?


We've sieved N=20, 21, and 22 to an incredible depth due to 2 unexpected occurrences. If I remember correctly, our original long-term plan was to sieve to about 61P, which was the limit of the sieving software. It was expected to take many years to get there.

Then we had a new GPU-based sieving program. That was followed up by another person using a lot of AWS GPU-servers (each with dual Tesla GPUs) to do a prodigious amount of sieving. All three N ranges have now been sieved beyond 33E (33000P), which is 500 times higher than our original long term goal. N=22 has been sieved to beyond 100E!!! (The sieve goes from b=2 to b=100M.)

Although it's still beneficial to continue sieving, the overall ratio of candidates removed to candidates remaining isn't going to change much.

Jim can probably give you better numbers than I can. For n=20 and b=6 through 199986, there are approximately 36175 candidates that were not removed by the sieve and were tested with Genefer. That is, of course, 36% remaining, and 64% removed by the sieve (counting just even b's).
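
The 36% / 64% split can be checked with a few lines of Python. This is only a sketch of the arithmetic: it counts the even b values from 6 through 199986 and uses the approximate figure of 36175 surviving candidates quoted above. The helper name even_count is made up for illustration.

```python
def even_count(lo: int, hi: int) -> int:
    """Number of even integers in [lo, hi]; both bounds are assumed even."""
    return (hi - lo) // 2 + 1

total = even_count(6, 199986)   # even b values in the tested range
remaining = 36175               # approximate count quoted in the post
fraction = remaining / total

print(total, f"{fraction:.1%} remaining, {1 - fraction:.1%} removed")
```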

38545 were tested.
Today, how many of them are tested (and double checked) per day?


Over the last 21 days, we've been averaging 75 short GFN's per day. Extrapolated back to the beginning of the year, that's about 18000 candidates. The difference between 18000 and 38545 is mostly due to the GFN challenge. :)
____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68545 - Posted: 18 Aug 2013 | 13:57:25 UTC - in response to Message 68542.

Profiling timer resolution = 0.0 usec.
Profiling timer resolution a bit dodgy on AMD.

Please wait for rev 404 (I hope it's the last one).
Maybe the profiling timer resolution is < 0.1 usec.

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68547 - Posted: 18 Aug 2013 | 14:07:09 UTC - in response to Message 68545.

Profiling timer resolution = 0.0 usec.
Profiling timer resolution a bit dodgy on AMD.

Please wait for rev 404 (I hope it's the last one).
Maybe the profiling timer resolution is < 0.1 usec.


I committed rev 404.
It seems to work, but I would like to understand why!

Roger, a final bench, please.

1998golfer
Project donor
Volunteer moderator
Volunteer tester
Joined: 4 Dec 12
Posts: 1049
ID: 183129
Credit: 1,029,008,261
RAC: 32,043
Message 68548 - Posted: 18 Aug 2013 | 14:12:19 UTC
Last modified: 18 Aug 2013 | 14:12:48 UTC

Here is geneferocl on my NVIDIA GT 430. Running GeneferCUDA gives me an error: "The application was unable to start correctly (0xc0000013). Click OK to close the application."

Running on platform 'NVIDIA CUDA', device 'GeForce GT 430', version 'OpenCL 1.1 CUDA' and driver '326.19'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 101 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 177 us/mul. Err: 0.5000 102481 digits
1471094^32768+1 Time: 335 us/mul. Err: 0.5000 202102 digits
1203210^65536+1 Time: 691 us/mul. Err: 0.5000 398482 digits
984108^131072+1 Time: 1.28 ms/mul. Err: 0.5000 785521 digits
804904^262144+1 Time: 2.78 ms/mul. Err: 0.5000 1548156 digits
658332^524288+1 Time: 5.72 ms/mul. Err: 0.5000 3050541 digits
538452^1048576+1 Time: 12 ms/mul. Err: 0.5000 6009544 digits
440400^2097152+1 Time: 25.4 ms/mul. Err: 0.5000 11836006 digits
360204^4194304+1 Time: 53.9 ms/mul. Err: 0.5000 23305854 digits
294612^8388608+1 Time: 119 ms/mul. Err: 0.5000 45879398 digits
Genefer Mark = 8.

____________

275*2^3585539+1 is prime!!! (1079358 digits)

Proud member of Aggie the Pew

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68550 - Posted: 18 Aug 2013 | 14:15:27 UTC

I've moved several posts from the "New genefer (3.x.x) apps now available for testing" thread to this thread so that all testing information can be consolidated in a single place.
____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

Roger
Volunteer moderator
Project administrator
Volunteer developer
Volunteer tester
Project scientist
Joined: 27 Nov 11
Posts: 1112
ID: 120786
Credit: 261,530,863
RAC: 1,408
Message 68552 - Posted: 18 Aug 2013 | 14:28:41 UTC - in response to Message 68547.
Last modified: 18 Aug 2013 | 14:38:16 UTC

I committed the 404.
It seems to work but I would like to understand why!

Roger, a final, please.

404:
OclGenefer 2013-08-18, Copyright (C) 2001-2013, Yves Gallot.
Options: -q "b^N+1" Test expression.
Platform 'AMD Accelerated Parallel Processing': GPU device 'Tahiti' found.
Platform 'AMD Accelerated Parallel Processing': CPU device 'AMD Phenom(tm) II X6 1100T Processor' found.
Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1124.2)' and driver '1124.2 (VM)'.
Clock frequency = 1050 MHz, compute units = 32.
Global mem size = 2048 MB, cache size = 16 kB (ReadWrite), cache line size = 64 Bytes.
Local mem size = 32 kB (dedicated), Constant mem size = 64 kB.
Max workgroup size = 256, Profiling timer resolution = 0.001 usec.
2199064^8192+1 Time: 74.2 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 73.3 us/mul. Err: 0.2227 102481 digits
1471094^32768+1 Time: 74.3 us/mul. Err: 0.2383 202102 digits
1203210^65536+1 Time: 79.8 us/mul. Err: 0.2305 398482 digits
984108^131072+1 Time: 122 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 335 us/mul. Err: 0.2266 1548156 digits
localWorkSize0 = 4, localWorkSize1 = 4.
658332^524288+1 Time: 670 us/mul. Err: 0.2109 3050541 digits
localWorkSize0 = 4, localWorkSize1 = 2.
538452^1048576+1 Time: 1.1 ms/mul. Err: 0.2134 6009544 digits
localWorkSize0 = 4, localWorkSize1 = 2.
440400^2097152+1 Time: 2.31 ms/mul. Err: 0.2266 11836006 digits
localWorkSize0 = 8, localWorkSize1 = 2.
360204^4194304+1 Time: 4.64 ms/mul. Err: 0.1953 23305854 digits
localWorkSize0 = 8, localWorkSize1 = 2.
294612^8388608+1 Time: 9.28 ms/mul. Err: 0.2109 45879398 digits

AMD Profiling timer resolution better than expected, but believable?

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68554 - Posted: 18 Aug 2013 | 14:43:42 UTC

404 on GTX460:

OclGenefer 2013-08-18, Copyright (C) 2001-2013, Yves Gallot.
Options: -q "b^N+1" Test expression.
Platform 'NVIDIA CUDA': GPU device 'GeForce GTX 460' found.
Platform 'AMD Accelerated Parallel Processing': CPU device 'Intel(R) Core(TM)2 Quad CPU @ 2.40GHz' found.
Running on platform 'NVIDIA CUDA', device 'GeForce GTX 460', version 'OpenCL 1.1 CUDA' and driver '320.57'.
Clock frequency = 1350 MHz, compute units = 7.
Global mem size = 1024 MB, cache size = 112 kB (ReadWrite), cache line size = 128 Bytes.
Local mem size = 48 kB (dedicated), Constant mem size = 64 kB.
Max workgroup size = 1024, Profiling timer resolution = 1.000 usec.
2199064^8192+1 Time: 84 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 127 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 171 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 309 us/mul. Err: 0.2188 398482 digits
984108^131072+1 Time: 563 us/mul. Err: 0.2422 785521 digits
804904^262144+1 Time: 1.14 ms/mul. Err: 0.2178 1548156 digits
localWorkSize0 = 32, localWorkSize1 = 32.
658332^524288+1 Time: 2.06 ms/mul. Err: 0.2256 3050541 digits
localWorkSize0 = 32, localWorkSize1 = 32.
538452^1048576+1 Time: 4.29 ms/mul. Err: 0.2031 6009544 digits
localWorkSize0 = 32, localWorkSize1 = 32.
440400^2097152+1 Time: 8.89 ms/mul. Err: 0.2305 11836006 digits
localWorkSize0 = 32, localWorkSize1 = 32.
360204^4194304+1 Time: 18.6 ms/mul. Err: 0.1953 23305854 digits
localWorkSize0 = 32, localWorkSize1 = 32.
294612^8388608+1 Time: 39.7 ms/mul. Err: 0.1973 45879398 digits

____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68555 - Posted: 18 Aug 2013 | 14:54:15 UTC - in response to Message 68552.

Clock frequency = 1050 MHz,
[...] Profiling timer resolution = 0.001 usec.
AMD Profiling timer resolution better than expected, but believable?

Possible if counter = program counter (GPU clock).

It's time for the 3.1.2-4.
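
A quick check of the arithmetic behind the timer-resolution answer above: one cycle of a 1050 MHz clock lasts a little under a nanosecond, which prints as 0.001 usec at three decimals. This is only a sketch; the function name is illustrative and is not taken from the Genefer source.

```python
def clock_period_usec(freq_mhz):
    """Duration of one clock cycle in microseconds (1 MHz means 1 usec per cycle)."""
    return 1.0 / freq_mhz


if __name__ == "__main__":
    # 1050 MHz is the Tahiti clock frequency reported in the log quoted above.
    period = clock_period_usec(1050.0)
    print(f"one cycle      : {period:.6f} usec")
    print(f"three decimals : {period:.3f} usec")
```

So a reported resolution of 0.001 usec is consistent with a counter that ticks once per GPU clock cycle, although the log alone does not prove that this is how the driver derives it.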

chip
Joined: 12 Apr 11
Posts: 128
ID: 94709
Credit: 164,082,201
RAC: 5,606
Message 68556 - Posted: 18 Aug 2013 | 15:05:14 UTC

Is it possible to reduce the CPU usage? A value of 0.01 or less would be nice.

DeleteNull
Project donor
Volunteer tester
Joined: 6 Apr 06
Posts: 226
ID: 2663
Credit: 5,102,624,386
RAC: 136,622
Message 68558 - Posted: 18 Aug 2013 | 16:18:16 UTC - in response to Message 68556.

Here are the runtimes for the different versions of geneferocl.
device 0: HD7970 clock 1000/1425 (gpu/memory)
device 1: HD7970 clock 1050/1500 (gpu/memory)

geneferocl 3.1.2-1 (Windows 64-bit OpenGL) --device 1: 26666/947
geneferocl 3.1.2-1 (Windows 64-bit OpenGL) --device 0: 27710/935

geneferocl 3.1.2-2 (Windows 64-bit OpenGL) --device 1: 25643/960
geneferocl 3.1.2-2 (Windows 64-bit OpenGL) --device 0: 27638/945

geneferocl 3.1.2-3 (Windows 32-bit OpenGL) --device 1: 29554/966
geneferocl 3.1.2-3 (Windows 32-bit OpenGL) --device 0: 32407/971

So geneferocl 3.1.2-2 has the best performance on the AMD/ATI 7970.
____________
DeleteNull

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68560 - Posted: 18 Aug 2013 | 16:48:47 UTC - in response to Message 68555.

Clock frequency = 1050 MHz,
[...] Profiling timer resolution = 0.001 usec.
AMD Profiling timer resolution better than expected, but believable?

Possible if counter = program counter (GPU clock).

It's time for the 3.1.2-4.


3.1.2-4 is available for download from the beta thread http://www.primegrid.com/forum_thread.php?id=4889.

Results on my GTX 460 are unchanged at the higher Ns we're interested in.
____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

Husu*
Joined: 16 Jan 12
Posts: 15
ID: 127298
Credit: 165,338,156
RAC: 0
Message 68562 - Posted: 18 Aug 2013 | 17:00:24 UTC
Last modified: 18 Aug 2013 | 17:05:10 UTC

Tested the new version on a Titan. I'll replace my app_info executable to get a view of a full run.

-----

geneferocl 3.1.2-4 (Windows 32-bit OpenCL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -b


Running on platform 'NVIDIA CUDA', device 'GeForce GTX TITAN', version 'OpenCL 1.1 CUDA' and driver '326.41'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 78.1 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 78.7 us/mul. Err: 0.2344 102481 digits
1471094^32768+1 Time: 83 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 97.7 us/mul. Err: 0.2813 398482 digits
984108^131072+1 Time: 137 us/mul. Err: 0.2295 785521 digits
804904^262144+1 Time: 293 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 469 us/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 898 us/mul. Err: 0.2188 6009544 digits
440400^2097152+1 Time: 1.64 ms/mul. Err: 0.2188 11836006 digits
360204^4194304+1 Time: 3.28 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 7.19 ms/mul. Err: 0.2070 45879398 digits
Genefer Mark = 120.
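
The digit counts in the last column of the bench output can be reproduced from b and N alone, since b^N+1 has floor(N*log10(b)) + 1 decimal digits whenever b^N is not a power of ten. A small sketch, with a hypothetical helper name that does not come from Genefer:

```python
import math


def gfn_digits(b, n):
    """Number of decimal digits of b^n + 1 (assumes b^n is not a power of ten)."""
    return math.floor(n * math.log10(b)) + 1


if __name__ == "__main__":
    # Values taken from one line of the bench above.
    print(gfn_digits(538452, 1048576))
```

For 538452^1048576+1 this gives 6009544, matching the bench line. The helper relies on double-precision logarithms, so it is only a sanity check and could be off by one if N*log10(b) landed extremely close to an integer.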

------

EDIT:

Had to abort the current WU because the earlier version reported a different name; this probably won't affect anything after this change :D

Checkpoint saved by genefer Windows 32-bit OpenGL, expected Windows 32-bit OpenCL

chip
Joined: 12 Apr 11
Posts: 128
ID: 94709
Credit: 164,082,201
RAC: 5,606
Message 68563 - Posted: 18 Aug 2013 | 17:07:28 UTC

I found a very interesting result with geneferocl on Nvidia. If a task (TRP Sieve) is running on the processor alongside it, geneferocl-windows.exe does not consume CPU time, but the GPU load drops to ~85%.

Peciak
Joined: 21 Jul 09
Posts: 17
ID: 43788
Credit: 349,954,843
RAC: 261
Message 68564 - Posted: 18 Aug 2013 | 17:16:52 UTC


GeneferOCL 3.1.2-4 is now available for testing with app_info, and can be downloaded via the first post in this thread.


invalid

<core_client_version>7.0.64</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -1073741819 (0xc0000005)
</message>
<stderr_txt>
geneferocl 3.1.2-4 (Windows 32-bit OpenCL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider
Command line: projects/www.primegrid.com/geneferocl-windows.exe -boinc -q 201920^1048576+1 --device 0
Priority change succeeded.
Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1214.3)' and driver '1214.3 (VM)'.
Starting initialization...
Initialization complete (5.692 seconds).
Testing 201920^1048576+1...
</stderr_txt>

http://www.primegrid.com/result.php?resultid=475051861
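
For readers puzzled by the two numbers in the error message: the negative exit code and the hexadecimal value are the same 32-bit pattern, and 0xC0000005 is the Windows status code for an access violation. The sketch below only illustrates that conversion; the helper name and the one-entry lookup table are assumptions made for this example, not part of BOINC or Genefer.

```python
# One well-known Windows status code; the table is deliberately minimal.
KNOWN_STATUS = {0xC0000005: "STATUS_ACCESS_VIOLATION"}


def exit_code_to_hex(exit_code):
    """Reinterpret a signed 32-bit exit code as the unsigned hex value Windows shows."""
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


if __name__ == "__main__":
    code = -1073741819  # exit code from the stderr output above
    print(exit_code_to_hex(code), KNOWN_STATUS.get(code & 0xFFFFFFFF, "unknown"))
```

An access violation means the process touched memory it was not allowed to; the message by itself does not say whether the application, the OpenCL runtime, or the driver was at fault.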

Yves Gallot
Volunteer developer
Project scientist
Send message
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68565 - Posted: 18 Aug 2013 | 17:23:52 UTC - in response to Message 68562.

Tested the new version on Titan, [...] geneferocl 3.1.2-4 (Windows 32-bit OpenCL)

It was faster with some previous versions. But since that is also true for some exponents for which the code is similar, is the reason "Windows 64-bit -> 32-bit" or "driver '320.49' => '326.41'"?

VictordeHollander
Project donor
Joined: 13 Jan 11
Posts: 25
ID: 81079
Credit: 300,776,184
RAC: 37
Message 68566 - Posted: 18 Aug 2013 | 17:32:42 UTC
Last modified: 18 Aug 2013 | 17:36:21 UTC

So far my 7950 has completed 6 'short' Genefer OpenCL tasks without any errors. All were validated by wingmen using CUDA Genefer versions.

GPU: ASUS HD7950 DC2T (Factory OC: 900MHz)
OS: Windows 7 64bit
AMD Catalyst: 13.4
Genefer OpenCL version 3.1.2-2
CPU load: <1%
GPU load: 97-99%
Runtimes: 26,516 - 28,080 sec
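
As a rough plausibility check on these runtimes, the per-multiplication bench times can be scaled up to a whole test. The sketch assumes that a test of b^N+1 needs about N*log2(b) modular multiplications (one squaring per bit of the exponent b^N) and ignores initialization, checkpoints and error checking. The candidate 201920^1048576+1 is borrowed from an earlier post in this thread and is not necessarily one of the tasks listed here; the function name is hypothetical.

```python
import math


def estimated_runtime_s(b, n, ms_per_mul):
    """Approximate test time in seconds, assuming about n*log2(b) multiplications."""
    multiplications = n * math.log2(b)
    return multiplications * ms_per_mul / 1000.0


if __name__ == "__main__":
    # 1.48 ms/mul is the 3.1.2-2 bench figure for N = 1048576 on Tahiti.
    print(f"{estimated_runtime_s(201920, 1048576, 1.48):.0f} s")
```

With these inputs the estimate comes out a little above 27,000 seconds, which sits inside the 26,516 - 28,080 sec range reported above. Treat it as an order-of-magnitude check only, since the real b values and clock behaviour vary per task.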

For the record:
http://www.primegrid.com/workunit.php?wuid=339797861
http://www.primegrid.com/workunit.php?wuid=342700453
http://www.primegrid.com/workunit.php?wuid=343200843
http://www.primegrid.com/workunit.php?wuid=345650390
http://www.primegrid.com/workunit.php?wuid=345651275
http://www.primegrid.com/workunit.php?wuid=345653871

Some benchmarks:

3.1.2-2

Command line: geneferocl-windows.exe -b

Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1124.2)' and driver '1124.2 (VM)'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 83.3 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 113 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 122 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 127 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 155 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 364 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 768 us/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 1.48 ms/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 2.81 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 5.55 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 11 ms/mul. Err: 0.1895 45879398 digits


3.1.2-4

Command line: geneferocl-windows3.1.2-4.exe -b

Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1124.2)' and driver '1124.2 (VM)'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 77.8 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 78.1 us/mul. Err: 0.2344 102481 digits
1471094^32768+1 Time: 80.2 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 103 us/mul. Err: 0.2813 398482 digits
984108^131072+1 Time: 148 us/mul. Err: 0.2295 785521 digits
804904^262144+1 Time: 354 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 635 us/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 1.17 ms/mul. Err: 0.2188 6009544 digits
440400^2097152+1 Time: 2.37 ms/mul. Err: 0.2188 11836006 digits
360204^4194304+1 Time: 4.84 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 9.78 ms/mul. Err: 0.2070 45879398 digits


Edit:
So 3.1.2-4 is definitely faster
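
To put a number on "faster", the two bench outputs can be parsed and compared exponent by exponent. The following is a minimal sketch, not a tool that ships with Genefer: the regular expression and function names are assumptions based only on the line format shown in this thread, and just the four largest exponents from the two runs above are included.

```python
import re

# Matches lines such as
# "538452^1048576+1 Time: 1.48 ms/mul. Err: 0.2266 6009544 digits"
BENCH_RE = re.compile(r"\d+\^(\d+)\+1\s+Time:\s+([\d.]+)\s+(us|ms)/mul\.")


def parse_bench(text):
    """Return {N: microseconds per multiplication} for every bench line found."""
    times = {}
    for n, value, unit in BENCH_RE.findall(text):
        scale = 1000.0 if unit == "ms" else 1.0
        times[int(n)] = float(value) * scale
    return times


def speedups(old, new):
    """Return {N: old_time / new_time} for exponents present in both runs."""
    return {n: old[n] / new[n] for n in sorted(old) if n in new}


OLD = """
538452^1048576+1 Time: 1.48 ms/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 2.81 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 5.55 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 11 ms/mul. Err: 0.1895 45879398 digits
"""

NEW = """
538452^1048576+1 Time: 1.17 ms/mul. Err: 0.2188 6009544 digits
440400^2097152+1 Time: 2.37 ms/mul. Err: 0.2188 11836006 digits
360204^4194304+1 Time: 4.84 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 9.78 ms/mul. Err: 0.2070 45879398 digits
"""

if __name__ == "__main__":
    for n, ratio in speedups(parse_bench(OLD), parse_bench(NEW)).items():
        print(f"N = {n:>8}: {ratio:.2f}x")
```

On these figures 3.1.2-4 is faster than 3.1.2-2 at every listed exponent on this card. Single bench runs are noisy, so small differences between versions should be confirmed over several runs.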

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68567 - Posted: 18 Aug 2013 | 17:35:51 UTC - in response to Message 68564.

invalid
[...]
http://www.primegrid.com/result.php?resultid=475051861


Did you try using interactive mode and 1. bench ?

Husu*
Joined: 16 Jan 12
Posts: 15
ID: 127298
Credit: 165,338,156
RAC: 0
Message 68568 - Posted: 18 Aug 2013 | 18:10:39 UTC - in response to Message 68565.
Last modified: 18 Aug 2013 | 18:10:58 UTC

Tested the new version on Titan, [...] geneferocl 3.1.2-4 (Windows 32-bit OpenCL)

It was faster with some previous versions. But since that is also true for some exponents for which the code is similar, is the reason "Windows 64-bit -> 32-bit" or "driver '320.49' => '326.41'"?


I made re-runs with the earlier versions I have available and current settings, Driver version '326.41'.

3.1.2-2 (Windows 64-bit OpenGL)
Generalized Fermat Number Bench
2199064^8192+1 Time: 79.3 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 78.7 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 84.2 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 97.7 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 137 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 283 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 488 us/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 898 us/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 1.64 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 3.44 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 6.88 ms/mul. Err: 0.1895 45879398 digits


3.1.2-3 (Windows 32-bit OpenGL)
Generalized Fermat Number Bench
2199064^8192+1 Time: 79.3 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 78.7 us/mul. Err: 0.2344 102481 digits
1471094^32768+1 Time: 83 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 97.7 us/mul. Err: 0.2813 398482 digits
984108^131072+1 Time: 137 us/mul. Err: 0.2295 785521 digits
804904^262144+1 Time: 273 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 488 us/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 898 us/mul. Err: 0.2188 6009544 digits
440400^2097152+1 Time: 1.64 ms/mul. Err: 0.2188 11836006 digits
360204^4194304+1 Time: 3.44 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 7.19 ms/mul. Err: 0.2070 45879398 digits


3.1.2-4 (Windows 32-bit OpenCL)
Generalized Fermat Number Bench
2199064^8192+1 Time: 79.3 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 78.7 us/mul. Err: 0.2344 102481 digits
1471094^32768+1 Time: 84.2 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 97.7 us/mul. Err: 0.2813 398482 digits
984108^131072+1 Time: 137 us/mul. Err: 0.2295 785521 digits
804904^262144+1 Time: 283 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 488 us/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 898 us/mul. Err: 0.2188 6009544 digits
440400^2097152+1 Time: 1.64 ms/mul. Err: 0.2188 11836006 digits
360204^4194304+1 Time: 3.44 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 7.19 ms/mul. Err: 0.2070 45879398 digits


-----

I did more test runs and there is a slight variation in the numbers from run to run. This may be due to the "boost" behaviour of 1) the CPU and 2) the GPU, so for the Titan I'd read the numbers as averages rather than to the letter.

The Titan "boosts" itself depending on temperature and load; I can't make it run at a fixed speed, and I can't disable the feature either. Double precision "slows the boost down" a bit, so it won't boost much over the default GPU clock.

Example:
The default GPU clock on my Titan is 837MHz according to GPU-Z; at idle it's 324MHz.

On a double precision -b run it's 849.2MHz (at 48C); when hotter (79C) it's 836.1MHz.

On a single precision -b run it's 1006MHz (regardless of temperature); under other GPU load it's 992.9MHz - 1006MHz below 78C, 940.6MHz at 79C, and so on.

So it really depends on the load and temperatures.

NOTE: This is without any overclocking or meddling with the GPU, this is how it works as-is out of the box.

Anyway, the latest 32-bit version is more stable in terms of its output; the 64-bit 3.1.2-2 version has a larger variance.

For example, I occasionally get this on 3.1.2-4; usually it's the one I posted before:

Generalized Fermat Number Bench
2199064^8192+1 Time: 79.3 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 78.7 us/mul. Err: 0.2344 102481 digits
1471094^32768+1 Time: 84.2 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 97.7 us/mul. Err: 0.2813 398482 digits
984108^131072+1 Time: 132 us/mul. Err: 0.2295 785521 digits
804904^262144+1 Time: 283 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 488 us/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 859 us/mul. Err: 0.2188 6009544 digits
440400^2097152+1 Time: 1.72 ms/mul. Err: 0.2188 11836006 digits
360204^4194304+1 Time: 3.44 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 6.88 ms/mul. Err: 0.2070 45879398 digits

Peciak
Joined: 21 Jul 09
Posts: 17
ID: 43788
Credit: 349,954,843
RAC: 261
Message 68569 - Posted: 18 Aug 2013 | 18:17:36 UTC - in response to Message 68567.
Last modified: 18 Aug 2013 | 18:23:19 UTC

invalid
[...]
http://www.primegrid.com/result.php?resultid=475051861


Did you try using interactive mode and 1. bench ?

Command line: geneferocl-windows.exe -b

Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1214.3)' and driver '1214.3 (VM)'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 70.2 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 68.4 us/mul. Err: 0.2344 102481 digits
1471094^32768+1 Time: 69.6 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 78.1 us/mul. Err: 0.2813 398482 digits
984108^131072+1 Time: 116 us/mul. Err: 0.2295 785521 digits
804904^262144+1 Time: 296 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 551 us/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 1.05 ms/mul. Err: 0.2188 6009544 digits
440400^2097152+1 Time: 2.19 ms/mul. Err: 0.2188 11836006 digits
360204^4194304+1 Time: 4.56 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 9.06 ms/mul. Err: 0.2070 45879398 digits
Genefer Mark = 93.

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68570 - Posted: 18 Aug 2013 | 19:08:25 UTC - in response to Message 68569.

invalid
[...]
http://www.primegrid.com/result.php?resultid=475051861

Did you try using interactive mode and 1. bench ?

Command line: geneferocl-windows.exe -b [...]

It seems to work. Maybe it tried to resume from a previous version?

Peciak
Joined: 21 Jul 09
Posts: 17
ID: 43788
Credit: 349,954,843
RAC: 261
Message 68571 - Posted: 18 Aug 2013 | 19:22:42 UTC - in response to Message 68570.

It seems to work. Maybe it tried to resume from a previous version?

No

http://www.primegrid.com/results.php?hostid=396466

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68572 - Posted: 18 Aug 2013 | 20:35:47 UTC - in response to Message 68571.

It seems to work. Maybe it tried to resume from a previous version?

No
http://www.primegrid.com/results.php?hostid=396466

I think I found the bug and fixed it.
It can occur when the Windows app is a 32-bit application and the GPU device address space is 64 bits.
The address space of NVidia GPUs is 32 bits, which is why it works on NVidia cards.
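
As an illustration of the kind of mismatch described above, here is a minimal C++ sketch. It is not the actual fix committed to Genefer, and the function names are hypothetical. In OpenCL C, size_t is 32 bits wide when the device reports CL_DEVICE_ADDRESS_BITS = 32 and 64 bits wide when it reports 64, so a 32-bit host build and a 64-bit device can disagree on the size of size_t. To keep the sketch compilable without an OpenCL SDK, the device value is passed in as a plain parameter; a real host program would query it with clGetDeviceInfo.

```cpp
#include <cstddef>

// Width in bits of size_t in the host build (32 for a Win32 app, 64 for x64).
constexpr unsigned host_address_bits() {
    return static_cast<unsigned>(sizeof(std::size_t) * 8);
}

// Hypothetical helper: true when host and device agree on the width of size_t.
// device_address_bits stands for the value of CL_DEVICE_ADDRESS_BITS.
constexpr bool size_types_match(unsigned host_bits, unsigned device_address_bits) {
    return host_bits == device_address_bits;
}

// Hypothetical helper: size in bytes of size_t as seen by device code.
constexpr std::size_t device_size_t_bytes(unsigned device_address_bits) {
    return device_address_bits / 8;
}
```

A host program that detects a mismatch, for example with size_types_match(host_address_bits(), bits), could restrict itself to fixed-width types such as cl_uint or cl_ulong for every value it shares with the device.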

I committed the fix, so I hope that 3.1.2-5 will solve your problem.

Yves

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68573 - Posted: 18 Aug 2013 | 21:06:12 UTC - in response to Message 68572.



I committed the fix, so I hope that 3.1.2-5 will solve your problem.

Yves


3.1.2-5 is available for download from the beta download thread.
____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

Peciak
Joined: 21 Jul 09
Posts: 17
ID: 43788
Credit: 349,954,843
RAC: 261
Message 68575 - Posted: 18 Aug 2013 | 21:21:30 UTC

Unfortunately, it is also invalid:
http://www.primegrid.com/results.php?userid=43788
http://www.primegrid.com/result.php?resultid=475080186
http://www.primegrid.com/result.php?resultid=475053577

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68576 - Posted: 18 Aug 2013 | 21:30:00 UTC - in response to Message 68575.

Unfortunately, it is also invalid:
http://www.primegrid.com/results.php?userid=43788
http://www.primegrid.com/result.php?resultid=475080186
http://www.primegrid.com/result.php?resultid=475053577

:o(
Anyone else with an ATI 7970?

Please, could you run the test using interactive mode and 2. test?

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68577 - Posted: 18 Aug 2013 | 21:48:32 UTC - in response to Message 68569.

invalid
538452^1048576+1 Time: 1.05 ms/mul. Err: 0.2188 6009544 digits
440400^2097152+1 Time: 2.19 ms/mul. Err: 0.2188 11836006 digits
360204^4194304+1 Time: 4.56 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 9.06 ms/mul. Err: 0.2070 45879398 digits
Genefer Mark = 93.

But your card runs faster than an HD 7970 GHz!
Is it overclocked?

DeleteNull
Project donor
Volunteer tester
Joined: 6 Apr 06
Posts: 226
ID: 2663
Credit: 5,102,624,386
RAC: 136,622
Message 68578 - Posted: 18 Aug 2013 | 21:54:44 UTC - in response to Message 68576.

I have tested it with my 7950, .....and it wasted all WU's.

So, back to 3.1.2-2

Unfortunately, it is also invalid:
http://www.primegrid.com/results.php?userid=43788
http://www.primegrid.com/result.php?resultid=475080186
http://www.primegrid.com/result.php?resultid=475053577

:o(
Anyone else with an ATI 7970?

Please, could you run the test using interactive mode and 2. test?

____________
DeleteNull

Michael Goetz
Project donor
Volunteer moderator
Project administrator
Project scientist
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68579 - Posted: 18 Aug 2013 | 22:10:18 UTC
Last modified: 19 Aug 2013 | 21:54:11 UTC

If you want, you can try a 64-bit version of 3.1.2-5: click here

EDIT: 3.1.2-6 should fix the 32-bit problem, so the 64-bit app is no longer necessary.
____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68580 - Posted: 18 Aug 2013 | 22:17:48 UTC - in response to Message 68578.

I have tested it with my 7950, .....and it wasted all WU's.

I don't understand :o(

Please, could someone run the test of the win32 3.1.2-5, using interactive mode, on a HD 79x0?

Roger, could you compile a win32 version of OclGenefer 2013-08-18, revision 406 and run the full test?

Thanks, Yves

Plomos
Joined: 17 Jun 11
Posts: 19
ID: 102558
Credit: 8,016,726
RAC: 0
Message 68582 - Posted: 18 Aug 2013 | 22:39:32 UTC

I am trying to get this to run on my 6870, but when I tell it to run a benchmark I get this:

C:\Users\Plomos\Downloads>geneferocl-windows.exe -b
geneferocl 3.1.2-5 (Windows 32-bit OpenCL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -b

No OpenCL device found.

What am I missing here?

Roger
Volunteer moderator
Project administrator
Volunteer developer
Volunteer tester
Project scientist
Joined: 27 Nov 11
Posts: 1112
ID: 120786
Credit: 261,530,863
RAC: 1,408
Message 68583 - Posted: 18 Aug 2013 | 22:57:14 UTC - in response to Message 68580.
Last modified: 18 Aug 2013 | 23:38:20 UTC

I have tested it with my 7950, .....and it wasted all WU's.

I don't understand :o(

Please, could someone run the test of the win32 3.1.2-5, using interactive mode, on a HD 79x0?

Roger, could you compile a win32 version of OclGenefer 2013-08-18, revision 406 and run the full test?

Thanks, Yves

406 (x64) with HD7970Ghz:
OclGenefer 2013-08-18, Copyright (C) 2001-2013, Yves Gallot.
Options:
-q "b^N+1" Test expression.

Platform 'AMD Accelerated Parallel Processing': GPU device 'Tahiti' found.
Platform 'AMD Accelerated Parallel Processing': CPU device 'AMD Phenom(tm) II X6 1100T Processor' found.

Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1124.2)' and driver '1124.2 (VM)'.
Clock frequency = 1050 MHz, compute units = 32, 32-bit.
Global mem size = 2048 MB, cache size = 16 kB (ReadWrite), cache line size = 64 Bytes.
Local mem size = 32 kB (dedicated), Constant mem size = 64 kB.
Max workgroup size = 256, Profiling timer resolution = 0.001 usec.

2199064^8192+1 Time: 76.3 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 73.4 us/mul. Err: 0.2227 102481 digits
1471094^32768+1 Time: 75.4 us/mul. Err: 0.2383 202102 digits
1203210^65536+1 Time: 81.1 us/mul. Err: 0.2305 398482 digits
984108^131072+1 Time: 129 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 396 us/mul. Err: 0.2266 1548156 digits
658332^524288+1 Time: 639 us/mul. Err: 0.2109 3050541 digits
538452^1048576+1 Time: 1.16 ms/mul. Err: 0.2134 6009544 digits
440400^2097152+1 Time: 2.24 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 4.69 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 8.72 ms/mul. Err: 0.2109 45879398 digits


When I try to compile the Win32 version I get linker errors:
51 unresolved externals

I probably need to link against 32-bit libraries rather than 64-bit ones.
Job for Rebirther. I am time poor for the next few days.

geneferocl 3.1.2-5 (Windows 32-bit OpenCL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -b

Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1124.2)' and driver '1124.2 (VM)'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 75.1 us/mul. Err: 0.2188 51956 digits
1798620^16384+1 Time: 73 us/mul. Err: 0.2344 102481 digits
1471094^32768+1 Time: 73.4 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 84.2 us/mul. Err: 0.2813 398482 digits
984108^131072+1 Time: 132 us/mul. Err: 0.2295 785521 digits
804904^262144+1 Time: 368 us/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 701 us/mul. Err: 0.2266 3050541 digits
538452^1048576+1 Time: 1.07 ms/mul. Err: 0.2188 6009544 digits
440400^2097152+1 Time: 2.24 ms/mul. Err: 0.2188 11836006 digits
360204^4194304+1 Time: 4.66 ms/mul. Err: 0.1953 23305854 digits
294612^8388608+1 Time: 9.25 ms/mul. Err: 0.2070 45879398 digits
Genefer Mark = 91.


geneferocl 3.1.2-5 (Windows 32-bit OpenCL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows.exe -b3

Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1124.2)' and driver '1124.2 (VM)'.

14^32768+1 37557 digits 0 days 0.0 hours (0.09 ms/mul, 124758 iterations) 294 GFLOPS
75898^32768+1 159916 digits 0 days 0.0 hours (0.08 ms/mul, 531226 iterations) 1253 GFLOPS
700000^32768+1 191533 digits 0 days 0.0 hours (0.08 ms/mul, 636255 iterations) 1501 GFLOPS
5000000^32768+1 219512 digits 0 days 0.0 hours (0.08 ms/mul, 729201 iterations) 1720 GFLOPS
14^65536+1 75113 digits 0 days 0.0 hours (0.09 ms/mul, 249517 iterations) 1243 GFLOPS
75898^65536+1 319831 digits 0 days 0.0 hours (0.08 ms/mul, 1062453 iterations) 5292 GFLOPS
710000^65536+1 383469 digits 0 days 0.0 hours (0.09 ms/mul, 1273852 iterations) 6345 GFLOPS
2500000^65536+1 419296 digits 0 days 0.0 hours (0.10 ms/mul, 1392868 iterations) 6938 GFLOPS
14^131072+1 150226 digits 0 days 0.0 hours (0.14 ms/mul, 499036 iterations) 5233 GFLOPS
75898^131072+1 639662 digits 0 days 0.0 hours (0.14 ms/mul, 2124908 iterations) 22281 GFLOPS
700000^131072+1 766129 digits 0 days 0.0 hours (0.13 ms/mul, 2545023 iterations) 26687 GFLOPS
1000000^131072+1 786432 digits 0 days 0.0 hours (0.13 ms/mul, 2612469 iterations) 27394 GFLOPS
14^262144+1 300451 digits 0 days 0.0 hours (0.33 ms/mul, 998074 iterations) 21978 GFLOPS
75898^262144+1 1279324 digits 0 days 0.4 hours (0.36 ms/mul, 4249818 iterations) 93581 GFLOPS
468750^262144+1 1486604 digits 0 days 0.4 hours (0.33 ms/mul, 4938388 iterations) 108744 GFLOPS
815000^262144+1 1549575 digits 0 days 0.5 hours (0.37 ms/mul, 5147574 iterations) 113350 GFLOPS
14^524288+1 600902 digits 0 days 0.3 hours (0.70 ms/mul, 1996149 iterations) 92097 GFLOPS
75898^524288+1 2558647 digits 0 days 1.7 hours (0.73 ms/mul, 8499637 iterations) 392151 GFLOPS
468750^524288+1 2973207 digits 0 days 1.9 hours (0.72 ms/mul, 9876777 iterations) 455688 GFLOPS
710000^524288+1 3067745 digits 0 days 2.0 hours (0.73 ms/mul, 10190825 iterations) 470178 GFLOPS
14^1048576+1 1201803 digits 0 days 1.2 hours (1.13 ms/mul, 3992299 iterations) 385133 GFLOPS
75898^1048576+1 5117293 digits 0 days 5.2 hours (1.10 ms/mul, 16999276 iterations) 1639903 GFLOPS
468750^1048576+1 5946413 digits 0 days 6.1 hours (1.13 ms/mul, 19753555 iterations) 1905606 GFLOPS
700000^1048576+1 6129030 digits 0 days 6.2 hours (1.10 ms/mul, 20360194 iterations) 1964127 GFLOPS
14^2097152+1 2403605 digits 0 days 5.1 hours (2.31 ms/mul, 7984600 iterations) 1607512 GFLOPS
75898^2097152+1 10234585 digits 0 days 21.2 hours (2.25 ms/mul, 33998553 iterations) 6844813 GFLOPS
380742^2097152+1 11703432 digits 1 days 0.4 hours (2.26 ms/mul, 38877955 iterations) 7827166 GFLOPS
570000^2097152+1 12070945 digits 1 days 1.1 hours (2.26 ms/mul, 40098808 iterations) 8072956 GFLOPS
14^4194304+1 4807210 digits 0 days 20.7 hours (4.67 ms/mul, 15969202 iterations) 6697969 GFLOPS
1248^4194304+1 12986466 digits 2 days 6.7 hours (4.57 ms/mul, 43140102 iterations) 18094270 GFLOPS
10000^4194304+1 16777217 digits 2 days 22.9 hours (4.58 ms/mul, 55732704 iterations) 23375990 GFLOPS
50000^4194304+1 19708909 digits 3 days 10.6 hours (4.55 ms/mul, 65471576 iterations) 27460769 GFLOPS
150000^4194304+1 21710101 digits 3 days 19.3 hours (4.56 ms/mul, 72119391 iterations) 30249065 GFLOPS
309258^4194304+1 23028076 digits 4 days 1.8 hours (4.61 ms/mul, 76497608 iterations) 32085422 GFLOPS
480000^4194304+1 23828853 digits 4 days 4.7 hours (4.58 ms/mul, 79157734 iterations) 33201160 GFLOPS
14^8388608+1 9614419 digits 3 days 10.3 hours (9.28 ms/mul, 31938406 iterations) 27863552 GFLOPS
36^8388608+1 13055212 digits 4 days 15.6 hours (9.27 ms/mul, 43368473 iterations) 37835316 GFLOPS
100^8388608+1 16777217 digits 5 days 22.7 hours (9.22 ms/mul, 55732704 iterations) 48622060 GFLOPS

No errors.

DeleteNull
Project donor
Volunteer tester
Joined: 6 Apr 06
Posts: 226
ID: 2663
Credit: 5,102,624,386
RAC: 136,622
Message 68586 - Posted: 19 Aug 2013 | 5:43:08 UTC - in response to Message 68579.
Last modified: 19 Aug 2013 | 5:57:12 UTC

Started, ..... this version is running ..... very fast.
Edit: less than 6 hours for a HD7950.

If you want, you can try a 64-bit version of 3.1.2-5: click here

____________
DeleteNull

Crystal Pellet
Project donor
Joined: 9 Nov 08
Posts: 180
ID: 31494
Credit: 77,230,917
RAC: 390
Message 68587 - Posted: 19 Aug 2013 | 6:38:49 UTC - in response to Message 68586.

I don't know how to interpret it, but the developers should know whether this output from my ATI 7770 is OK:

geneferocl 3.1.2-5 (Windows 64-bit OpenCL)
Copyright 2001-2013, Yves Gallot
Copyright 2009, Mark Rodenkirch, David Underbakke
Copyright 2010-2012, Shoichiro Yamada, Ken Brazier
Copyright 2011-2013, Iain Bethune, Michael Goetz, Ronald Schneider

Command line: geneferocl-windows-x64.exe -b


Running on platform 'AMD Accelerated Parallel Processing', device 'Capeverde', version 'OpenCL 1.2 AMD-APP (1084.4)' and driver '1084.4 (VM)'.

Generalized Fermat Number Bench
2199064^8192+1 Time: 118 us/mul. Err: 0.2344 51956 digits
1798620^16384+1 Time: 119 us/mul. Err: 0.2266 102481 digits
1471094^32768+1 Time: 144 us/mul. Err: 0.2344 202102 digits
1203210^65536+1 Time: 264 us/mul. Err: 0.2656 398482 digits
984108^131072+1 Time: 522 us/mul. Err: 0.2188 785521 digits
804904^262144+1 Time: 1.24 ms/mul. Err: 0.2188 1548156 digits
658332^524288+1 Time: 2.35 ms/mul. Err: 0.2188 3050541 digits
538452^1048576+1 Time: 5.13 ms/mul. Err: 0.2266 6009544 digits
440400^2097152+1 Time: 10.3 ms/mul. Err: 0.2266 11836006 digits
360204^4194304+1 Time: 21.5 ms/mul. Err: 0.2031 23305854 digits
294612^8388608+1 Time: 46.3 ms/mul. Err: 0.1895 45879398 digits
Genefer Mark = 19.
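
As a rough guide to reading the last column: the digit count of b^N+1 is floor(N * log10(b)) + 1, apart from rare edge cases such as 3^2+1 = 10. The sketch below uses a hypothetical function name and plain double-precision arithmetic, so it is only an estimate and not necessarily how Genefer computes the figure. It does reproduce, for example, the 2199064^8192+1 and 538452^1048576+1 rows above.

```cpp
#include <cmath>
#include <cstdint>

// Estimated number of decimal digits of b^n + 1 for b >= 2 and n >= 1.
// Uses digits = floor(n * log10(b)) + 1; floating-point rounding can be off
// by one when n * log10(b) lies extremely close to an integer.
std::uint64_t gfn_digits(std::uint64_t b, std::uint64_t n) {
    const double log10_value =
        static_cast<double>(n) * std::log10(static_cast<double>(b));
    return static_cast<std::uint64_t>(std::floor(log10_value)) + 1;
}
```

For instance, gfn_digits(538452, 1048576) gives 6009544, the value printed in the benchmark.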

Roger
Volunteer moderator
Project administrator
Volunteer developer
Volunteer tester
Project scientist
Joined: 27 Nov 11
Posts: 1112
ID: 120786
Credit: 261,530,863
RAC: 1,408
Message 68589 - Posted: 19 Aug 2013 | 11:36:35 UTC - in response to Message 68580.

Roger, could you compile a win32 version of OclGenefer 2013-08-18, revision 406 and run the full test?

Thanks, Yves

Rebirther furnished me with a Visual Studio project that I can now use to build Win32 versions.
I am starting a full run with revision 406.

DeleteNull
Project donor
Volunteer tester
Joined: 6 Apr 06
Posts: 226
ID: 2663
Credit: 5,102,624,386
RAC: 136,622
Message 68590 - Posted: 19 Aug 2013 | 11:55:34 UTC - in response to Message 68589.

64-bit version of 3.1.2-5: 21,411.56 seconds for a HD7950 (1,986.30 [s] CPU).

That's very fast, CPU usage is about 10%
____________
DeleteNull

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68591 - Posted: 19 Aug 2013 | 12:08:13 UTC - in response to Message 68589.

Rebirther furnished me with a Visual Studio project that I can now use to build Win32 versions.
I am starting a full run with revision 406.

The status is:
the 3.1.2-4/5 32-bit runs on NVidia Titan
http://www.primegrid.com/result.php?resultid=475125032
the 3.1.2-5 64-bit runs on ATI HD 79x0
http://www.primegrid.com/result.php?resultid=475087259

The benchmark of the 3.1.2-4/5 32-bit is ok on ATI but the test fails.

I extended the command queue to 32k (it was 1k). Could it be too large on ATI, so that the allocated memory exceeds 2 GB and the allocation then fails on Win32?

Could you check the memory allocated by the application when a large number is tested?

Thanks, Yves

Yves Gallot
Volunteer developer
Project scientist
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
Message 68592 - Posted: 19 Aug 2013 | 12:31:54 UTC - in response to Message 68582.

I am trying to get this to run on my 6870, but when I tell it to run a benchmark I get this:
No OpenCL device found.
What am I missing here?

This card does not have double precision capability.
The HD 58x0 and 69x0 series have DP.

Yves Gallot
Volunteer developer
Project scientist
Send message
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
GFN Double Silver: Earned 200,000,000 credits (295,254,118)
Message 68593 - Posted: 19 Aug 2013 | 12:36:31 UTC - in response to Message 68563.
Last modified: 19 Aug 2013 | 12:37:09 UTC

I found a very interesting result with genefercl on Nvidia. When a task is also running on the processor (TRP sieve), genefercl-windows.exe does not consume CPU time, but GPU load drops to ~85%.

That's interesting.
Does anybody know where I can find the source code of The Riesel Problem Sieve application?

Profile Michael GoetzProject donor
Volunteer moderator
Project administrator
Project scientist
Avatar
Send message
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627
Message 68594 - Posted: 19 Aug 2013 | 13:09:51 UTC - in response to Message 68593.
Last modified: 19 Aug 2013 | 13:11:28 UTC

I found a very interesting result with genefercl on Nvidia. When a task is also running on the processor (TRP sieve), genefercl-windows.exe does not consume CPU time, but GPU load drops to ~85%.

That's interesting.
Does anybody know where I can find the source code of The Riesel Problem Sieve application?


That is sr2sieve, and can be found here.

The sr2sieve page only has binaries, and refers you to the source for sr5sieve with instructions to change one constant. There's a link there.

This is not a native BOINC application and therefore runs inside a wrapper. I'd have to do some research to see where the wrapper comes from if you need that too. (I can point you to the Android version of the wrapper pretty easily.)
____________
Please do not PM me with support questions. Ask on the forums instead. Thank you!

My lucky number is 75898^524288+1

Profile chip
Avatar
Send message
Joined: 12 Apr 11
Posts: 128
ID: 94709
Credit: 164,082,201
RAC: 5,606
Message 68595 - Posted: 19 Aug 2013 | 13:19:54 UTC
Last modified: 19 Aug 2013 | 13:20:08 UTC

With linpack_xeon64.exe running, CPU usage is ~0%, GPU usage stays at 97-99%, and the estimated calculation time drops by ~4%. GeneferCL 3.1.2-5 64-bit, GTX580 driver 326.58, Win2008R2.

Profile Roger
Volunteer moderator
Project administrator
Volunteer developer
Volunteer tester
Project scientist
Avatar
Send message
Joined: 27 Nov 11
Posts: 1112
ID: 120786
Credit: 261,530,863
RAC: 1,408
Message 68596 - Posted: 19 Aug 2013 | 13:20:15 UTC - in response to Message 68589.
Last modified: 19 Aug 2013 | 14:19:01 UTC

Roger, could you compile a win32 version of OclGenefer 2013-08-18, revision 406 and run the full test?

Thanks, Yves

Rebirther furnished me with a Visual Studio project that I can now use to build Win32 versions.
I am starting a full run with revision 406.

I started the Win32 Rev 406 full test with the HD7970Ghz but it crashes when it gets to the N=524288 tests. A shortcut to reproduce it is this:
C:\Users\Roger\Downloads\oclgenefer_v4\oclgenefer\oclgtest>genefer32.exe -q "75898^524288+1"
OclGenefer 2013-08-18, Copyright (C) 2001-2013, Yves Gallot.
Options: -q "b^N+1" Test expression.
Platform 'AMD Accelerated Parallel Processing': GPU device 'Tahiti' found.
Platform 'AMD Accelerated Parallel Processing': CPU device 'AMD Phenom(tm) II X6 1100T Processor' found.
Running on platform 'AMD Accelerated Parallel Processing', device 'Tahiti', version 'OpenCL 1.2 AMD-APP (1124.2)' and driver '1124.2 (VM)'.
Clock frequency = 1050 MHz, compute units = 32, 32-bit.
Global mem size = 2048 MB, cache size = 16 kB (ReadWrite), cache line size = 64 Bytes.
Local mem size = 32 kB (dedicated), Constant mem size = 64 kB.
Max workgroup size = 256, Profiling timer resolution = 0.001 usec.
Testing 75898^524288+1...
Unhandled Exception: System.Runtime.InteropServices.SEHException: External component has thrown an exception.
   at clEnqueueNDRangeKernel(_cl_command_queue* , _cl_kernel* , UInt32 , UInt32* , UInt32* , UInt32* , UInt32 , _cl_event** , _cl_event** )
   at Program.Execute(Program* , _cl_kernel* kernel, UInt32 globalWorkSize, UInt32 localWorkSize)
   at Program.BaseMod(Program* , Int32 g, Int32 base)
   at OclTransform.SquareAndMul(OclTransform* , Boolean mul)
   at Genefer.Check(Genefer* , UInt32 b, UInt32 m)
   at main(Int32 argc, SByte** argv)
   at _mainCRTStartup()


Memory usage right before the exception is thrown:


It is not behaving like a memory leak, as usage is stable at that level for 30 seconds before the crash. Maybe it is just too close to the 32-bit limit. Testing at N=131072 uses only 60 MB.

Yves Gallot
Volunteer developer
Project scientist
Send message
Joined: 19 Aug 12
Posts: 513
ID: 164101
Credit: 295,254,118
RAC: 5,845
GFN Double Silver: Earned 200,000,000 credits (295,254,118)
Message 68597 - Posted: 19 Aug 2013 | 13:50:07 UTC - in response to Message 68596.

I started the Win32 Rev 406 full test with the HD7970Ghz but it crashes when it gets to the N=524288 tests.

OK, maybe the command queue size is the problem.
I committed revision 407 with max command queue size = 1k (the value that was defined for GeneferCL 3.1.2-2).
It now also prints the parameters at the end of initialisation.
Thanks, Yves

Profile chip
Avatar
Send message
Joined: 12 Apr 11
Posts: 128
ID: 94709
Credit: 164,082,201
RAC: 5,606
Message 68598 - Posted: 19 Aug 2013 | 13:51:13 UTC

geneferocl 3.1.2-5 (Windows 64-bit OpenCL)
Command line: geneferocl.exe -q 1000000^65536+1

Free CPU core:
1000000^65536+1 is a probable composite. (RES=7a18066afaab61a9) (393216 digits) (err = 0.1875) (time = 0:02:23) cpu time=0:00:34

With all-core linpack:
1000000^65536+1 is a probable composite. (RES=7a18066afaab61a9) (393216 digits) (err = 0.1875) (time = 0:02:27) cpu time=0:00:08

With all-core trp sieving:
1000000^65536+1 is a probable composite. (RES=7a18066afaab61a9) (393216 digits) (err = 0.1875) (time = 0:02:27) cpu time=0:00:07

GPU load with CPU projects is 97% average.

Profile Michael GoetzProject donor
Volunteer moderator
Project administrator
Project scientist
Avatar
Send message
Joined: 21 Jan 10
Posts: 12669
ID: 53948
Credit: 184,131,428
RAC: 10,627