Message boards : Graphics cards (GPUs) : WUs with double runtime?

Author Message
Profile Kokomiko
Joined: 18 Jul 08
Posts: 190
Credit: 24,093,690
RAC: 0
Message 12486 - Posted: 13 Sep 2009 | 16:51:58 UTC
Last modified: 13 Sep 2009 | 16:52:19 UTC

What has changed so that the WUs now take double the runtime for the same credit? On my GTX295 a runtime of 37,000 seconds is normal, but the current WUs are running for 75,000 seconds and give the same credit. The wall clock time has gone from 9.5 hours to over 20 hours.

1243984 with 75270.441 s

1235901 with 37362.348 s
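
To put a number on "double", here is a minimal Python sketch of the arithmetic using only the two runtimes quoted above. The function names are made up for illustration.

```python
def runtime_ratio(slow_seconds: float, fast_seconds: float) -> float:
    """Return how many times longer the slow task ran than the fast one."""
    if fast_seconds <= 0:
        raise ValueError("fast_seconds must be positive")
    return slow_seconds / fast_seconds


def to_hours(seconds: float) -> float:
    """Convert a runtime in seconds to hours."""
    return seconds / 3600.0


# Reported runtimes for tasks 1243984 (slow) and 1235901 (normal).
slow, normal = 75270.441, 37362.348
ratio = runtime_ratio(slow, normal)
print(f"slow task ran {ratio:.2f}x as long "
      f"({to_hours(normal):.1f} h vs {to_hours(slow):.1f} h)")
```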
____________

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 12487 - Posted: 13 Sep 2009 | 16:54:35 UTC - in response to Message 12486.

These two WUs should take the same time to complete.
Maybe the GPU was in use or something like that.

gdf

Profile Kokomiko
Joined: 18 Jul 08
Posts: 190
Credit: 24,093,690
RAC: 0
Message 12488 - Posted: 13 Sep 2009 | 17:01:24 UTC - in response to Message 12487.

No, there was no game playing during this time. The next WU (616-GIANNI_BIND001-28-100-RND7903_1) has already been running for 22:25 hours and still has 4:24 to go. The card is running under Vista 64 with driver 190.62. The BOINC version had been 6.10.4 since Friday evening; I have just downgraded to 6.10.3.
____________

Profile Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Message 12492 - Posted: 13 Sep 2009 | 22:05:33 UTC

There is an intermittent issue where tasks take double the normal time or more to complete. It showed up in the early 6.6.? series and has so far proved impossible to pin down. It can affect both GPU and CPU tasks. USUALLY a reboot clears up the situation.

There was nothing interesting in the log files the last time I had one of those, and I had just about every log option on. Visual proof, like pictures of long-running tasks, is of course not enough to convince UCB that there is a potential issue.

Though lord knows I tried ...

Profile robertmiles
Joined: 16 Apr 09
Posts: 503
Credit: 758,085,692
RAC: 313,138
Message 12508 - Posted: 15 Sep 2009 | 5:52:51 UTC - in response to Message 12492.
Last modified: 15 Sep 2009 | 5:59:34 UTC

I encountered a similar problem with Rosetta@home CPU workunits, but for those the normal log files referred to a lockfile problem, so you might want to make sure you checked for this.

First, a workunit failed for some reason that kept it from deleting its lockfile. BOINC then failed to recognize this as sufficient reason to stop using that slot, and kept using it until something like a reboot deleted the lockfile. In the meantime, all following workunits, apparently from any project, that tried to write a checkpoint in the directory for that slot failed because the lockfile prevented this from working.

I don't know if this problem occurred only when running under Vista SP2, but that's where I saw it. The problem was more likely to occur if BOINC was set to use less than 100% of the CPU time, such as 90%. However, since installing BOINC 6.6.36, I don't remember seeing this problem any more. As best I can tell, BOINC 6.6.36 is more likely to create extra slots, so there's some chance they have already tried to fix it with new code in that version that tells it not to reuse any slot with a lockfile left over from a previous workunit.

Just in case they didn't, you may want to add some code to GPUGRID that checks for a lockfile it didn't create at the start of the workunit, and reports whether it found one in a way that will be returned even if the workunit is unable to write any checkpoints.
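
A minimal Python sketch of such a check, to make the idea concrete. It rests on several assumptions that are not confirmed here: that the slot lockfile is named `boinc_lockfile`, that a lockfile older than the task's start time counts as "not created by this workunit", and that anything written to stderr is returned with the result. The function names are hypothetical, and the real application would need the equivalent in its own language.

```python
import os
import sys

# Assumption: name of the lockfile BOINC keeps in a slot directory.
LOCKFILE_NAME = "boinc_lockfile"


def find_stale_lockfile(slot_dir, task_start_time, lockfile_name=LOCKFILE_NAME):
    """Return the lockfile path if one exists that predates this task, else None.

    task_start_time is a Unix timestamp taken when the workunit started.
    """
    path = os.path.join(slot_dir, lockfile_name)
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return None  # no lockfile present, or it cannot be read
    return path if mtime < task_start_time else None


def report_stale_lockfile(slot_dir, task_start_time, out=sys.stderr):
    """Write a warning to `out` if a leftover lockfile is found; return its path."""
    stale = find_stale_lockfile(slot_dir, task_start_time)
    if stale is not None:
        # Reported via stderr rather than a checkpoint, since checkpoint
        # writes are exactly what fails in the scenario described above.
        print(f"WARNING: leftover lockfile found at task start: {stale}", file=out)
    return stale
```

Calling `report_stale_lockfile(".", time.time())` once at startup, before the application creates its own lock, would be enough to flag an affected slot.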

Another idea for something to add: fairly frequently, check the clock speeds of the GPU and the number of processors it has available, and report any significant changes.
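
The comparison part of that idea could look roughly like this in Python. How the snapshots are obtained (driver query, vendor tool, and so on) is deliberately left out, and the key names, the 5% tolerance, and the sample numbers are all invented for illustration.

```python
def significant_changes(previous, current, tolerance=0.05):
    """Compare two snapshots of GPU properties and describe notable changes.

    Each snapshot is a dict mapping a property name to a number. A change
    is reported when a property appears or disappears, or when its value
    moves by more than `tolerance` (as a fraction of the previous value).
    """
    changes = []
    for key in sorted(set(previous) | set(current)):
        old, new = previous.get(key), current.get(key)
        if old is None or new is None:
            changes.append(f"{key}: {old} -> {new}")
        elif old == 0:
            if new != 0:
                changes.append(f"{key}: {old} -> {new}")
        elif abs(new - old) / abs(old) > tolerance:
            changes.append(f"{key}: {old} -> {new}")
    return changes


# Made-up snapshots: the core clock drops sharply, the processor count does not.
baseline = {"core_clock_mhz": 600, "multiprocessors": 30}
later = {"core_clock_mhz": 300, "multiprocessors": 30}
for line in significant_changes(baseline, later):
    print("GPU change detected:", line)
```

A clock that had silently dropped to a power-saving level would show up in such a report, which is one conceivable cause of a task taking twice as long.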
