Author |
Message |
|
Hey. Just a heads up to everyone!
I just updated my GPU drivers to 397.31 on two machines. Since then I've been getting driver crashes, after which GPU tasks won't run anymore.
Currently I'm mostly running GFN17low on my two GPUs, but once the problem starts, AP or PPS Sieve tasks don't work either. The error message printed in stderr is "The storage control blocks were destroyed. (0x7) - exit code 7 (0x7)".
I'm not 100% certain the drivers are the problem, but it's the only recent change on both machines. |
|
|
|
Found the culprit and probably the fix: https://forums.geforce.com/default/topic/1051755/geforce-drivers/announcing-geforce-hotfix-driver-397-55/
"Windows 10: Driver may get removed after PC has been left idle for extended period of time."
My machine is not exactly idle, but hopefully this fixes it anyway. I'm not home until evening, so I just have to hope it does not crash again. |
|
|
|
Thanks for the info, just started having issues myself.
Really hope this fixes it. |
|
|
|
I have been having this issue for a bit now. I have just rolled back to 391.35 and hope this is OK.
If the hotfix works, maybe you could give another heads up, but until then I will hang onto the old driver to be sure.
MAGPIE |
|
|
|
I installed the hotfix last night, and for the last 12 hours it seems to have worked fine. I will post if the problem is reintroduced at some point. |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 13513 ID: 53948 Credit: 237,712,514 RAC: 0
|
This chart shows the number of PPS-Sieve tasks completed successfully each day over the last month. There's a noticeable dip of about one third in the middle, although it has since picked up again. I suspect the dip corresponds with the release of 397.31 and the uptick at the end is the 397.55 hotfix.
This is a live chart:
____________
My lucky number is 75898524288+1 |
|
|
|
I noticed that one could get the driver to start functioning again temporarily by restarting the Nvidia services in the Windows Services panel in the Microsoft Management Console. I do not know which one of the three had crashed, so I restarted all three of them. With the hotfix driver, figuring out which of those services had crashed became moot. |
|
|
|
The hotfix worked fine for my GTX 1070, but on my GTX 1060 BOINC did not work at all with it, even after a reboot.
I've read that GTX 1060s in particular have had some extra problems with the new driver. Reverting to 391.35 from here: http://www.nvidia.com/download/driverResults.aspx/132845/en-us fixed the problems for me. |
|
|
|
391.35 has solved all my problems as well |
|
|
|
Wondering if this was also causing the issue I was having:
http://www.primegrid.com/forum_thread.php?id=8040 |
|
|
Michael Goetz Volunteer moderator Project administrator
|
Wondering if this was also causing the issue I was having:
http://www.primegrid.com/forum_thread.php?id=8040
If you were using 397.31, then yes.
____________
My lucky number is 75898524288+1 |
|
|
|
397.64 WHQL was just released. It looks like it replaces the 397.55 hotfix driver for general use. Haven't tested it in anger yet. |
|
|
|
397.64 WHQL was just released. It looks like it replaces the 397.55 hotfix driver for general use. Haven't tested it in anger yet.
One of the reasons I never update my GPU drivers. If it ain't broke, don't fix it .... and I suspect the odds of greater efficiency from a new version at best would be super trivial (if at all). I am still using 388.13. |
|
|
Rafael Volunteer tester
Joined: 22 Oct 14 Posts: 885 ID: 370496 Credit: 334,085,845 RAC: 0
|
One of the reasons I never update my GPU drivers. If it ain't broke, don't fix it .... and I suspect the odds of greater efficiency from a new version at best would be super trivial (if at all). I am still using 388.13.
Ironically, that's exactly why I made sure to keep drivers up to date, to find potential bugs as fast as possible. |
|
|
|
I'm mixed on drivers. For gaming systems I tend to keep on bleeding edge for support. For crunchers, I update less frequently but not never either. |
|
|
|
Update your drivers now, because the version you are running has several security holes. See http://nvidia.custhelp.com/app/answers/detail/a_id/4649 to see which driver is the minimum version needed to patch those holes, and https://www.nvidia.com/en-us/product-security/ to see which security bulletin is the latest one for your product. Version 388.13 has unpatched security holes. |
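For anyone who wants to check an installed version against the minimum listed in a bulletin, here is a minimal Python sketch. It assumes driver versions are plain dotted numbers like the ones in this thread. The names `parse_driver_version`, `is_at_least` and `example_minimum` are made up for illustration, and the minimum used here is only a placeholder; look up the real value for your product in the bulletin.

```python
def parse_driver_version(text):
    """Turn a version string like '397.64' into a tuple of ints, e.g. (397, 64)."""
    return tuple(int(part) for part in text.strip().split("."))


def is_at_least(installed, minimum):
    """Return True if the installed version is the same as or newer than minimum."""
    return parse_driver_version(installed) >= parse_driver_version(minimum)


# Driver versions mentioned in this thread, in no particular order.
thread_versions = ["397.31", "388.13", "397.64", "391.35", "397.55"]

# Sort numerically per component rather than as plain strings.
sorted_versions = sorted(thread_versions, key=parse_driver_version)

# Placeholder minimum (an assumption): replace it with the value from the
# security bulletin that applies to your product.
example_minimum = "397.64"
needs_update = not is_at_least("388.13", example_minimum)
```

Comparing tuples of integers avoids the pitfalls of comparing version strings character by character.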
|
|
|
Thanks, Jesse. I will proceed with a driver update. |
|
|
|
Thanks, Jesse. I will proceed with a driver update.
With the new driver, I am running about 3 degrees cooler and, by extension, with slightly faster run times for my work units.
Thanks for the heads-up! |
|
|
|
Well, it took a few hours, but the latest driver is also problematic for me, as confirmed by a number of other users running Collatz: new work now cannot be downloaded for some reason.
I do have PPS Sieve as my back-up project and hopefully I will not have such problems once my Collatz queue dries up. I was planning to switch to PG this weekend anyway.
I will reacquire 388.13 in the meantime in the event I have a rude surprise tomorrow morning. |
|
|
|
I really appreciate the info/warning. That said, I'm receiving the message one to two times a day, so how do I get off the merry-go-round? |
|
|
|
I'm running 397.64 on my NVIDIA systems and I'm not aware of any problems with it myself. I've only been crunching PPS-Sieve on them for a few days, and it still works. On one system gaming seems OK too. |
|
|
|
I also just upgraded to 397.64 after 397.31 and the 397.55 hotfix did not work for me, but I found a problem: my 940MX shows CUDA support but no OpenCL support in BOINC and GPU-Z.
Does your card show CUDA and OpenCL support in BOINC?
Part of my BOINC log:
15-5-2018 22:57:55 | | CUDA: NVIDIA GPU 0: GeForce 940MX (driver version 397.64, CUDA version 9.2, compute capability 5.0, 2048MB, 1690MB available, 953 GFLOPS peak)
15-5-2018 22:57:55 | | OpenCL: Intel GPU 0: Intel(R) HD Graphics 620 (driver version 23.20.16.4982, device version OpenCL 2.1 NEO, 3244MB, 3244MB available, 192 GFLOPS peak)
15-5-2018 22:57:55 | | OpenCL CPU: Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz (OpenCL driver vendor: Intel(R) Corporation, driver version 7.6.0.691, device version OpenCL 2.1 (Build 691))
15-5-2018 22:57:55 | Milkyway@Home | Application uses missing NVIDIA GPU
15-5-2018 22:57:55 | Milkyway@Home | Missing coprocessor for task
15-5-2018 22:57:55 | | App version needs OpenCL but GPU doesn't support it
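A log like the one above can be checked mechanically. This is a rough Python sketch that assumes the startup lines keep the `timestamp | project | message` layout and the `CUDA: <vendor> GPU <n>: <name> (` wording shown here. The pattern and the function name `detected_gpu_apis` are my own, not part of BOINC.

```python
import re
from collections import defaultdict

# Startup lines copied from the log above.
LOG = """\
15-5-2018 22:57:55 | | CUDA: NVIDIA GPU 0: GeForce 940MX (driver version 397.64, CUDA version 9.2, compute capability 5.0, 2048MB, 1690MB available, 953 GFLOPS peak)
15-5-2018 22:57:55 | | OpenCL: Intel GPU 0: Intel(R) HD Graphics 620 (driver version 23.20.16.4982, device version OpenCL 2.1 NEO, 3244MB, 3244MB available, 192 GFLOPS peak)
15-5-2018 22:57:55 | | OpenCL CPU: Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz (OpenCL driver vendor: Intel(R) Corporation, driver version 7.6.0.691, device version OpenCL 2.1 (Build 691))
15-5-2018 22:57:55 | Milkyway@Home | Application uses missing NVIDIA GPU
"""

# Matches GPU detection messages only; "OpenCL CPU:" lines are skipped on purpose.
GPU_LINE = re.compile(r"^(CUDA|OpenCL): (\S+) GPU (\d+): (.+?) \(")


def detected_gpu_apis(log_text):
    """Map (vendor, index, name) to the set of compute APIs reported for that GPU."""
    apis = defaultdict(set)
    for line in log_text.splitlines():
        fields = line.split("|", 2)  # timestamp, project, message
        if len(fields) != 3:
            continue
        match = GPU_LINE.match(fields[2].strip())
        if match:
            api, vendor, index, name = match.groups()
            apis[(vendor, int(index), name)].add(api)
    return dict(apis)


gpus = detected_gpu_apis(LOG)
# GPUs that were detected but never reported with OpenCL support.
missing_opencl = [gpu for gpu, found in gpus.items() if "OpenCL" not in found]
```

With the lines above, the 940MX ends up in `missing_opencl`, which matches the "App version needs OpenCL but GPU doesn't support it" message.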
____________
|
|
|
|
BOINC only reports CUDA when I start it. I tried getting MilkyWay GPU work but it won't give me any. No clue why. |
|
|
|
BOINC only reports CUDA when I start it. I tried getting MilkyWay GPU work but it won't give me any. No clue why.
This is the explanation we got after Collatz finally started to resend GPU work a few days ago:
"I re-ran the update versions which inserts the server records required for the scheduler to send the work. I'm not 100% sure why it got screwed up but I think it had to do with opencl_nvidia vs opencl_nvidia_gpu plan class stuff that happened last Thursday. Once again, more people weighing in on what might be wrong help me get it back on track faster. Thanks guys! I also found a bug in the BOINC error reporting from this so I'll be sure to forward that to the BOINC developers as well." |
|
|
|
Thanks for the fast response. I do not think (but am not sure) these problems are related. GPU-Z and BOINC don't show any OpenCL driver or capabilities, while they do with older drivers, so I think this is a driver problem. I have reported this problem to Nvidia but have not received any response yet.
The problem could also be that the latest drivers introduced some experimental OpenCL 2.0 features; maybe BOINC and GPU-Z do not detect the 2.0 capabilities correctly.
More info: http://us.download.nvidia.com/Windows/397.64/397.64-win10-win8-win7-notebook-release-notes.pdf
Experimental OpenCL 2.0 Features
Select features in OpenCL 2.0 are available in the driver for evaluation purposes only. The
following are the features as well as a description of known issues with these features in
the driver:
• Device side enqueue
•The current implementation is limited to 64-bit platforms only.
•OpenCL 2.0 allows kernels to be enqueued with global_work_size larger than the
compute capability of the NVIDIA GPU. The current implementation supports only
combinations of global_work_size and local_work_size that are within the compute
capability of the NVIDIA GPU.
The maximum supported CUDA grid and block size of NVIDIA GPUs is available
at http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#computecapabilities.
For a given grid dimension, the global_work_size can be determined by
CUDA grid size x CUDA block size.
•For executing kernels (whether from the host or the device), OpenCL 2.0 supports
non-uniform ND-ranges where global_work_size does not need to be divisible by
the local_work_size. This capability is not yet supported in the NVIDIA driver, and
therefore not supported for device side kernel enqueues.
• Shared virtual memory
•The current implementation of shared virtual memory is limited to 64-bit platforms
only.
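To make the arithmetic in that excerpt concrete, here is a small Python sketch. It shows global_work_size as grid size times block size. It also shows one common workaround for the missing non-uniform ND-range support: padding the work size up to a multiple of local_work_size. The padding approach and the numbers are my own assumptions for illustration; the release notes do not prescribe them.

```python
def global_work_size(grid_size, block_size):
    """Global work size for one dimension: CUDA grid size x CUDA block size."""
    return grid_size * block_size


def round_up_to_multiple(work_items, local_work_size):
    """Pad work_items up to the next multiple of local_work_size."""
    groups = -(-work_items // local_work_size)  # ceiling division
    return groups * local_work_size


# Illustrative values only, not limits taken from the CUDA programming guide.
local_size = 256   # work-items per work-group (CUDA block size)
wanted = 1000      # work-items the kernel actually needs

padded = round_up_to_multiple(wanted, local_size)  # uniform ND-range size
grid = padded // local_size                        # number of work-groups
```

A kernel launched with the padded size would need to ignore work-items whose global id is `wanted` or higher.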
____________
|
|
|