I tried the Proth Prime Search Sieve CUDA app on a ZOTAC ZT-20109-10P GeForce GTS 250 1GB GPU. The machine passes Prime 95 Torture test, Burn Test, Memtest86+, Furmark, and OCCT GPU test. It runs Win 7 64-bit. GPU runs at 73C with fan automatically at 50% when running PPS CUDA.
The PPS CUDA app has returned a "Computation Error" result for 9 work units in a row, always at 19:01 to 19:09 elapsed time. Should I keep crunching anyway?
Rick
|
|
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
Joined: 17 Oct 05 Posts: 2165 ID: 1178 Credit: 8,777,295,508 RAC: 0
|
Update your driver to 195.xx or newer.
____________
141941*2^4299438-1 is prime!
|
|
|
|
I looked in device manager and the NVIDIA driver was 8.16.11.9107 from 9/27/2009.
I let Windows Update do the recommended NVIDIA update and now device manager shows 8.17.12.5896 from 7/9/2010.
Then the PPS CUDA app did 11 successful WUs in a row!
Thanks.
|
|
|
|
I checked: all of your failed tasks ended with the message:
Cuda error: cudaEventCreate: out of memory
In my other thread it was explained to me that this happens when some other process eats up all the video RAM while the program is running.
____________
|
|
|
Ken_g6 Volunteer developer
Joined: 4 Jul 06 Posts: 915 ID: 3110 Credit: 183,164,814 RAC: 0
|
Actually, I've since learned that cudaEventCreate: out of memory is caused by a memory leak on the GPU. The current version in testing fixes this, and I'm moving toward getting it into BOINC soon.
____________
|
|
|
Scott Brown Volunteer moderator Project administrator Volunteer tester Project scientist
|
Ken_g6 wrote: "Actually, I've since learned that cudaEventCreate: out of memory is caused by a memory leak on the GPU. The current version in testing fixes this, and I'm moving toward getting it into BOINC soon."
I am a bit confused, then, as to how the driver updates fix this issue. I know that a memory checker was added in CUDA 3.0 (one of the main feature differences between the 195.xx drivers and their predecessor 191.xx drivers, which had CUDA 2.3), but I am not sure how that (or other driver changes) is correcting the issue... or are the updates just delaying the memory leak problem through some slower build-up?
____________
141941*2^4299438-1 is prime!
|
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 13513 ID: 53948 Credit: 237,712,514 RAC: 0
|
Ken_g6 wrote: "Actually, I've since learned that cudaEventCreate: out of memory is caused by a memory leak on the GPU. The current version in testing fixes this, and I'm moving toward getting it into BOINC soon."
I knew that gigabyte of video ram would come in useful someday!
:)
____________
My lucky number is 75898^524288+1
|
|
|
|
When I'm running PPS CUDA, GPU-z shows 98% GPU load, 1% memory controller load, 0% video engine load, and 62MB memory used when the task is 75% done. It looks like the memory slowly counts up. When it starts a new one the memory use starts at 47MB.
I have DEP turned on for all programs, not the default setting. I doubt that applies to the GPU though.
|
|
|
|
I have a problem with the
Proth Prime Search (Sieve) 1.29 (cuda 23) work units.
I'm running Kubuntu Maverick on a 4-core AMD CPU,
with a GeForce GTX 260² and NVIDIA driver 260.19.06.
They all end in a computation error after 2 to 6 seconds.
Any help, please?
|
|
|
Ken_g6 Volunteer developer
|
The error happens when trying to get the results from the GPU. "unknown error" is very unhelpful, but I'll try.
First, I'll steal these questions from a SETI thread about a similar error:
Q -- Did you overclock the GPU (or the memory) and if so by how much?
Q -- Did you try resetting the GPU to default clock speeds, and do you still see the problem?
Q -- Did you set the fan speed manually to anything other than the default?
Q -- Which program(s) do you use to overclock the GPU, fan etc.?
Q -- Which program(s) do you use to keep track of the GPU?
And throw in one more:
Q -- Can you run any other CUDA apps successfully, like Collatz?
____________
|
|
|
|
Thank you for your help!
So:
I haven't overclocked the CPU or GPU, and I didn't change anything else.
I can crunch GPUGRID CUDA work units without any problems.
A friend of mine has the same problem with these work units.
He says that for him the problem started with the update of:
"Ubuntu to maverick 10.10 64-bit with installation Boinc (6.10.58)"
In both cases we hadn't computed these units for a while (I was on the Calendula challenge).
So the changes are:
the update to BOINC 6.10.58
the update of Ubuntu or Kubuntu 64-bit to Maverick.
As for me, I can't tell whether I have ever crunched these units successfully on Linux
(I started crunching with this computer on 24 Sep 2010).
Have any changes been made to these work units recently?
|
|
|
|
I have the same problem on a GTX 260 with nvidia-drivers-260.19.06 and the latest version of BOINC (using Gentoo).
I'm now trying to downgrade to the non-beta driver.
|
|
|
Ken_g6 Volunteer developer
|
That does sound like a driver problem. Can you successfully run the test procedure?
____________
|
|
|
Ken_g6 Volunteer developer
|
FYI, now that I have a GTX 460, I have been able to reproduce the "unknown error". It particularly happens when the GPU is already loaded. For instance, when running a Project Staging Area app on the GPU, it might be a good idea not to run BOINC apps on the same GPU.
The bad news is that I haven't found any way to solve it. Worse, when I tried, I got computation errors. Perhaps the final release of the 3.2 nVIDIA drivers will be better?
____________
|
|
|
Michael Goetz Volunteer moderator Project administrator
|
Ken_g6 wrote: "FYI, now that I have a GTX 460, I have been able to reproduce the "unknown error". It particularly happens when the GPU is already loaded. For instance, when running a Project Staging Area app on the GPU, it might be a good idea not to run BOINC apps on the same GPU.
The bad news is that I haven't found any way to solve it. Worse, when I tried, I got computation errors. Perhaps the final release of the 3.2 nVIDIA drivers will be better?"
The way *I* solve that -- or, more accurately, avoid it -- is to use the <exclusive_gpu_app> option in cc_config.xml to stop BOINC from running GPU apps when something is running that I know is likely to crash the BOINC app.
____________
My lucky number is 75898^524288+1
|
|
|
Ken_g6 Volunteer developer
|
<exclusive_gpu_app> doesn't appear to work - or at least to work correctly - in Linux 64. :(
____________
|
|
|
Michael Goetz Volunteer moderator Project administrator
|
Ken_g6 wrote: "<exclusive_gpu_app> doesn't appear to work - or at least to work correctly - in Linux 64. :("
Just checking to be sure -- you are giving it the name of the app you want to exclude, right?
Example:
<exclusive_gpu_app>game1</exclusive_gpu_app>
<exclusive_gpu_app>psa_app2</exclusive_gpu_app>
<exclusive_gpu_app>something_else</exclusive_gpu_app>
Also, make sure you don't have "Use GPU Always" set in the BOINC manager. That will override the <exclusive_gpu_app> settings and run the GPU app anyway. It needs to be set to "Use GPU according to preferences" for the exclusion mechanics to work.
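For reference, here is a minimal sketch of where those lines would sit in a complete cc_config.xml, assuming the usual layout in which client options live inside an <options> section. The executable names are the same placeholders as in the example above, not real applications.

```xml
<!-- cc_config.xml, placed in the BOINC data directory -->
<cc_config>
  <options>
    <!-- One entry per executable name; BOINC suspends GPU work while any of them is running -->
    <exclusive_gpu_app>game1</exclusive_gpu_app>
    <exclusive_gpu_app>psa_app2</exclusive_gpu_app>
    <exclusive_gpu_app>something_else</exclusive_gpu_app>
  </options>
</cc_config>
```

After editing the file, the client typically has to re-read it (or be restarted) before the change takes effect.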
____________
My lucky number is 75898^524288+1
|
|
|
Ken_g6 Volunteer developer
|
Yes, and yes. The BOINC FAQ even admits it:
"Third, you can have a cc_config.xml in your BOINC Data directory that will suspend your BOINC (under Windows only, I'm afraid) automatically when any of the exclusive applications you told it to look out for is detected in main memory."
____________
|
|
|