Author |
Message |
valterc Volunteer tester
Joined: 30 May 07 Posts: 119 ID: 8810 Credit: 5,673,154,322 RAC: 0
|
I noticed that for the past few days all my TRP sieve tasks (from different computers) have required doublechecking. Is there any particular reason for this? |
|
|
|
I noticed that for the past few days all my TRP sieve tasks (from different computers) have required doublechecking. Is there any particular reason for this?
Yes, there is. Here's a link to the thread that explains what happened; I asked the same thing about PPS sieves. My bad, I asked my question in a thread that wasn't really on that subject.
http://www.primegrid.com/forum_thread.php?id=4881
Rick
____________
@AggieThePew
|
|
|
valterc Volunteer tester
Joined: 30 May 07 Posts: 119 ID: 8810 Credit: 5,673,154,322 RAC: 0
|
Oh... well... damn... I agree that double checking is probably the only quick solution against this kind of cheating, even if it slows things down.... |
|
|
|
IIRC, the sieves will use adaptive replication, so things will be slowed down, but they will not be slowed down by a factor of 2.
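To put a rough number on "not by a factor of 2", here is a minimal sketch of the arithmetic. It assumes a simplified model in which each result is double-checked with some fixed probability; the function name and the probability values are made up for illustration and are not PrimeGrid's actual settings.

```python
def expected_tasks_per_workunit(check_probability: float) -> float:
    """Expected number of tasks sent per workunit.

    Simplified model (an assumption, not the real server logic):
    every workunit needs one task, and a second task is sent
    with probability `check_probability`.
    """
    if not 0.0 <= check_probability <= 1.0:
        raise ValueError("check_probability must be between 0 and 1")
    return 1.0 + check_probability


# Full double checking: every workunit is run twice.
full = expected_tasks_per_workunit(1.0)
# Hypothetical adaptive case: only a quarter of results get a second task.
adaptive = expected_tasks_per_workunit(0.25)
```

Under this model the slowdown sits somewhere between 1x and 2x, depending on how often a second task is actually sent.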
____________
|
|
|
valterc Volunteer tester
Joined: 30 May 07 Posts: 119 ID: 8810 Credit: 5,673,154,322 RAC: 0
|
OK, I just noticed that some of the TRP sieve WUs I crunched ended with the "Completed, can't validate" status. You can check them here http://www.primegrid.com/results.php?userid=8810&offset=0&show_names=0&state=4&appid=14. You may notice that the errors are produced by different hosts, usually reliable ones....
Is this something related to the new validator? Any hints about why the WUs cannot be validated?
thank you in advance |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 13513 ID: 53948 Credit: 237,712,514 RAC: 0
|
OK, I just noticed that some of the TRP sieve WUs I crunched ended with the "Completed, can't validate" status. You can check them here http://www.primegrid.com/results.php?userid=8810&offset=0&show_names=0&state=4&appid=14. You may notice that the errors are produced by different hosts, usually reliable ones....
Is this something related to the new validator? Any hints about why the WUs cannot be validated?
thank you in advance
The specific reasons why some tasks may or may not need to be doublechecked won't be discussed in public. These are steps we're taking to prevent cheating.
EDIT: The fact that a particular task may be doublechecked does not necessarily imply that there's anything suspicious about that task. The converse is also true.
____________
My lucky number is 75898^524288+1 |
|
|
|
@valterc: I had a couple TRP Sieve tasks go to "can't validate" state a few days ago. There were wingman-tasks still running. When those finished, my work changed to "completed and validated".
Most likely, nothing for you to worry about.
--Gary |
|
|
valterc Volunteer tester
Joined: 30 May 07 Posts: 119 ID: 8810 Credit: 5,673,154,322 RAC: 0
|
Well, all the tasks are indeed waiting for a good return; the other wingmen returned errors.
If this is the reason, a message like 'waiting for something' would probably be less scary than 'cannot validate'.... (which seems more 'definitive' to me).
@Michael: I agree that the validation mechanism should not be public. |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 13513 ID: 53948 Credit: 237,712,514 RAC: 0
|
As you've noticed, the "can't validate" state means -- on PrimeGrid -- "can't validate yet".
On other BOINC systems it may be a more permanent error, but here we have an automated process that resolves this and eventually everything finishes the way it's supposed to.
That being said, this should not have been occurring in this situation, and Jim and I have taken the necessary steps to correct the problem. New TRP sieve workunits will be set to be more permissive, which is needed with the new validator. You shouldn't see this problem, or at least not nearly as often, once you start getting the new workunits. Note, however, that all existing workunits, and the 500 or so tasks currently waiting to be sent out are still susceptible to getting that error message. Fortunately, you can simply ignore the message.
Thanks for bringing this to our attention!
____________
My lucky number is 75898^524288+1 |
|
|
valterc Volunteer tester
Joined: 30 May 07 Posts: 119 ID: 8810 Credit: 5,673,154,322 RAC: 0
|
And thank you for the really quick answer! |
|
|
Tyler Project administrator Volunteer tester
Joined: 4 Dec 12 Posts: 1077 ID: 183129 Credit: 1,280,170,555 RAC: 0
|
So this is why my TRP Sieve tasks have pending credit for so long? |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 13513 ID: 53948 Credit: 237,712,514 RAC: 0
|
So this is why my TRP Sieve tasks have pending credit for so long?
Yes.
____________
My lucky number is 75898^524288+1 |
|
|
Dave
Joined: 13 Feb 12 Posts: 2829 ID: 130544 Credit: 954,793,678 RAC: 0
|
Trying to find somewhere appropriate to post this, but hopefully I'm helping this effort by dialling-in a 3rd core of my Note II in an ambient temp of about 8C! CPU temp therefore down to early-50s.
If it's going to be crap weather on my holiday I might as well make good use of it. |
|
|
Dave
Joined: 13 Feb 12 Posts: 2829 ID: 130544 Credit: 954,793,678 RAC: 0
|
Question (here rather than dig up an old thread): why do some CPU tasks take minutes to move off 0.000%? 20 or 30 mins sometimes. |
|
|
Michael Goetz Volunteer moderator Project administrator
Joined: 21 Jan 10 Posts: 13513 ID: 53948 Credit: 237,712,514 RAC: 0
|
Question (here rather than dig up an old thread): why do some CPU tasks take minutes to move off 0.000%? 20 or 30 mins sometimes.
That's a good question. Most likely, you're referring to LLR tasks.
There are two possible answers:
Answer #1:
There are essentially two kinds of applications (apps) that can be run by BOINC servers: native apps that were specially written to run with BOINC (or were modified to run with BOINC), and foreign apps that were written for some other purpose.
Native BOINC tasks communicate with the BOINC client on the host computer, and are able to tell the client how much progress has been made, when the last checkpoint was done, etc. The app can be written to send this information to the BOINC client as frequently as desired, so the display can be updated frequently.
Foreign apps don't know anything about BOINC and don't send it any information. In fact, they can't be run directly by BOINC. You need a "wrapper" program that can talk to BOINC, and the wrapper, in turn, runs the foreign app. Sometimes the foreign app prints status messages that the wrapper can see, and the wrapper uses these to figure out how far the app has progressed. This is what we do with LLR. It's a foreign app, and we use a wrapper to read what it displays and use that to tell BOINC how far we've gone. But we have no control over how frequently LLR displays that information. It's not very frequent, which accounts for the long intervals between updates in the BOINC display. Also, the interval depends somewhat on the size of the number, so PPS-LLR tasks should update more frequently than SoB-LLR tasks.
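As a rough illustration of the wrapper idea described above, here is a minimal sketch of how a wrapper might turn a foreign app's status lines into a progress fraction. This is not PrimeGrid's actual wrapper: the status-line format (`bit: done / total`) and the function names are assumptions made up for the example.

```python
import re
from typing import Iterable, Iterator, Optional

# Hypothetical status-line format; the real app's output may differ.
PROGRESS_RE = re.compile(r"bit:\s*(\d+)\s*/\s*(\d+)")


def fraction_done(line: str) -> Optional[float]:
    """Parse one status line; return progress in [0, 1], or None if absent."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    done, total = int(match.group(1)), int(match.group(2))
    if total <= 0:
        return None
    return min(done / total, 1.0)


def track_progress(lines: Iterable[str]) -> Iterator[float]:
    """Yield a new fraction each time the app prints a status line.

    Between status lines the wrapper has nothing new to report,
    which is why the displayed percentage can sit still for a while.
    """
    for line in lines:
        parsed = fraction_done(line)
        if parsed is not None:
            yield parsed


# Example: only two of these four lines carry progress information.
sample_output = ["starting", "bit: 1000 / 4000", "noise", "bit: 4000 / 4000"]
updates = list(track_progress(sample_output))
```

In this sketch the display can only change when a status line arrives, so an app that prints rarely gives a progress bar that updates rarely.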
Answer #2:
Some applications have a substantial initialization phase, during which the percentage completed remains at zero. Old CPU GFN-short tasks were like that. Technically, all GFN tasks are *still* like that, but the initialization phase on the CPU version is a lot faster now.
20 or 30 mins sometimes.
I didn't address this, specifically, because I can't think of a good answer. Can you give me an example of a task that doesn't update for 20 or 30 minutes?
____________
My lucky number is 75898^524288+1 |
|
|
Dave
Joined: 13 Feb 12 Posts: 2829 ID: 130544 Credit: 954,793,678 RAC: 0
|
Thanks.
Yes probably http://www.primegrid.com/result.php?resultid=453225074.
Not bad for having to type the url out manually because I'm lazy and in bed with my fone (no that sounds bad...)!
Also try 453288164. |
|
|