Message boards : Graphics cards (GPUs) : pausing BOINC for 1 hour (but resuming it after a few minutes) gives computation error?

3zQBWZLvN5j2CdrKcJkDs8E5n...
Joined: 12 Apr 20
Posts: 2
Credit: 33,925,000
RAC: 312,540
Message 61495 - Posted: 10 May 2024 | 21:11:16 UTC

Don't know if this is just a one-time thing or if it happens to other people too.

Got a BOINC WU which says
0,9 CPU + 1 nVIDIA GPU.
It had been running for a few hours and had about 20 hours remaining.
I PAUSED BOINC for 1 hour (from the task bar) and BOINC correctly said
"paused by user" on all my tasks.
After a few minutes I unchecked the "pause for 1 hour" menu item and all projects initially resumed correctly,
BUT the nVIDIA GPU WU was uploaded and then cancelled with
"computation error".
Sad.
Does this mean that pausing generally does not work or is not reliable, or could this be a one-time thing?
In any case, the system now seems offended :-) because I get no new GPU work :-)
Greetings.

Keith Myers
Joined: 13 Dec 17
Posts: 1340
Credit: 7,649,467,459
RAC: 13,314,674
Message 61503 - Posted: 14 May 2024 | 20:04:15 UTC - in response to Message 61495.
Last modified: 14 May 2024 | 20:05:40 UTC

Most of the GPUGrid apps are not capable of being stopped or paused without causing instant errors upon resume.

So don't do it.

AFAIK, at this stage of the project, only the old acemd3 tasks actually checkpoint and can be stopped and resumed.

Also if I remember correctly, the Quantum Chemistry tasks can be stopped and resumed, but all progress is lost and the tasks restart from scratch each time they are stopped.

The Python apps, Python on GPU and ATM tasks just error out and report, if I remember correctly.

It's been too long since I've seen any of those, so I may not remember whether this is still the case.
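
For readers unfamiliar with the term, here is a minimal Python sketch of what "checkpointing" means in this context. It is a toy loop, not GPUGrid's or BOINC's actual code: `run_task`, the JSON checkpoint layout and the `stop_after` parameter are all made up for illustration. A task that saves its state like this can pick up where it left off after a pause; a task without checkpoints has to start from scratch, or may simply fail.

```python
import json
import os
import tempfile


def run_task(total_steps, checkpoint_path, stop_after=None):
    """Sum the integers 0..total_steps-1, saving progress after every step.

    stop_after (hypothetical) simulates the client suspending the task
    after that many steps in this run. Returns the final sum, or None if
    the run was interrupted before finishing.
    """
    # Resume from the checkpoint if one exists, otherwise start from scratch.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
    else:
        state = {"step": 0, "acc": 0}

    done_this_run = 0
    while state["step"] < total_steps:
        if stop_after is not None and done_this_run >= stop_after:
            return None  # interrupted; progress so far is already on disk
        state["acc"] += state["step"]
        state["step"] += 1
        done_this_run += 1
        # Write to a temp file and rename it, so an interruption mid-write
        # cannot leave a half-written checkpoint behind.
        tmp_path = checkpoint_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, checkpoint_path)
    return state["acc"]
```

Calling `run_task(10, path, stop_after=4)` and then `run_task(10, path)` with the same path finishes the sum in the second call without redoing the first four steps.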

Emmanuel Mar
Joined: 14 May 24
Posts: 10
Credit: 36,680,000
RAC: 405,840
Message 61516 - Posted: 18 May 2024 | 22:12:07 UTC - in response to Message 61495.


Instead of suspending the specific task, suspend the project in which the task is located and allow a few minutes to pass before disconnecting the computer.

In BOINC, under Options > Processing, uncheck the "Leave tasks in memory suspended" option.
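
If you prefer the command line, suspending a whole project can also be done with `boinccmd`, the command-line tool that ships with the BOINC client. Below is a small Python sketch that builds and runs that call. The helper names `project_cmd` and `run` are made up for illustration, and the project URL you pass must be the one your client is actually attached to.

```python
import subprocess


def project_cmd(project_url, op):
    """Build the boinccmd call that suspends or resumes one project."""
    if op not in ("suspend", "resume"):
        raise ValueError("op must be 'suspend' or 'resume'")
    return ["boinccmd", "--project", project_url, op]


def run(cmd):
    """Run the command; needs boinccmd on PATH and a running BOINC client."""
    return subprocess.run(cmd, check=True)
```

For example, `run(project_cmd("https://www.gpugrid.net/", "suspend"))` would suspend GPUGrid, assuming that is the URL shown for the project in your client; the same call with `"resume"` brings it back.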