Problem - Tasks error when exiting/resuming using 334.67 drivers

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 34967 - Posted: 10 Feb 2014 \| 13:41:49 UTC Last modified: 10 Feb 2014 \| 13:43:06 UTC
	MJH / Admins:
	I'm getting several task errors (Windows 8.1 x64, 334.67 drivers) that say:
	<core_client_version>7.2.39</core_client_version>
	<![CDATA[
	<message>
	The file exists. (0x50) - exit code 80 (0x50)
	</message>
	and the last line in the stderr.txt file is:
	# BOINC suspending at user request (exit)
	I think that suspending/resuming tasks isn't working very well. Tasks are erroring out when being resumed.
	http://www.gpugrid.net/result.php?resultid=7747671
	http://www.gpugrid.net/result.php?resultid=7749480
	http://www.gpugrid.net/result.php?resultid=7750550
	http://www.gpugrid.net/result.php?resultid=7751319
	Can you please look into this? I'm not sure if it's the application, or if it's the new BETA drivers, or if it's an issue that has always been there. But I would like it fixed!
	Hoping you agree, and available to help,
	Jacob
	PS: I originally posted this in the 8.15 app thread, but decided to create a new thread here. Also, I'm not the only one having this problem.
	ID: 34967 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 35019 - Posted: 13 Feb 2014 \| 16:21:53 UTC - in response to Message 34967. Last modified: 13 Feb 2014 \| 16:22:31 UTC
	MJH: Have you noticed an increase in instability, with 334.67 drivers, when suspending/resuming tasks, or shutting down and restarting BOINC quickly? If so, is there any way to determine if the application is the problem, or if the driver is the problem?
	ID: 35019 \| Rating: 0 \| rate: / Reply Quote

Killersocke Send message Joined: 18 Oct 13 Posts: 53 Credit: 406,647,419 RAC: 0 Level Scientific publications	Message 35105 - Posted: 17 Feb 2014 \| 21:17:56 UTC Last modified: 17 Feb 2014 \| 21:18:20 UTC
	I can confirm the same problems here with the 332.21 driver.
	Name 589x-SANTI_MAR422cap310-12-32-RND9315_0, Workunit 5177762
	Name 369x-SANTI_MAR422cap310-8-32-RND5608_0, Workunit 5175511
	Simulation unstable. Flag 9 value 375
	# The simulation has become unstable. Terminating to avoid lock-up
	ID: 35105 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 35242 - Posted: 23 Feb 2014 \| 2:13:36 UTC
	This happened again: suspending the task, then closing BOINC, resulted in the task erroring:
	http://www.gpugrid.net/result.php?resultid=7810949
	Can an admin please help to resolve this issue, or will it go unanswered? I'm willing to offer whatever it takes to help test to get it resolved. MJH?
	ID: 35242 \| Rating: 0 \| rate: / Reply Quote

Mumak Send message Joined: 7 Dec 12 Posts: 92 Credit: 225,897,225 RAC: 0 Level Scientific publications	Message 35314 - Posted: 24 Feb 2014 \| 9:26:34 UTC
	Same issue here too...
	ID: 35314 \| Rating: 0 \| rate: / Reply Quote

lukeu Send message Joined: 14 Oct 11 Posts: 31 Credit: 81,420,504 RAC: 269 Level Scientific publications	Message 35315 - Posted: 24 Feb 2014 \| 9:34:09 UTC
	Snap! GTX660, Win7-64, Driver 311.06
	ID: 35315 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 35336 - Posted: 25 Feb 2014 \| 13:04:40 UTC
	Anyone at GPUGrid care to fix this, like we did the previous suspend/resume problems? I'm willing to help test.
	ID: 35336 \| Rating: 0 \| rate: / Reply Quote

Richard Haselgrove Send message Joined: 11 Jul 09 Posts: 1626 Credit: 9,384,566,723 RAC: 19,075,423 Level Scientific publications	Message 35347 - Posted: 25 Feb 2014 \| 19:22:02 UTC - in response to Message 35336.
	Have we any more complete idea of the cause yet?
	I've recently upgraded to the WHQL version of the driver (334.89) for my GTX 670: no crashes yet, but then I don't routinely suspend tasks once they've started. What I have noticed is the reduced CPU demand, and a welcome reduction in the runtime of the SIMAP tasks running at the same time.
	I note that stderr says
	The file exists. (0x50) - exit code 80 (0x50)
	but MJH's FAQ says
	* -80 Failed to recover after an access violation (Win32)
	Any signs of an access violation from Windows, Jacob?
	I'd be interested if the problem could be narrowed down to a more immediate cause. Candidates are
	- Windows (I see Jacob using v8.1 - I have 7 here)
	- Driver
	- BOINC client (I see Jacob using alpha client v7.3.2)
	- BOINC API (linked into application)
	- Application
	and of course any combination of the above, plus probably more besides. My instinctive reaction on seeing the thread title was 'API', but I'm not so sure having looked at the full error messages.
	ID: 35347 \| Rating: 0 \| rate: / Reply Quote
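
The mismatch Richard points out (exit code 80 versus the FAQ's -80) is easier to reason about when the codes are written both ways. The sketch below is only an illustration, with hypothetical helper names `as_hex32` and `as_signed32`: it shows how a signed exit code maps to the 32-bit hexadecimal form printed in the messages quoted in this thread. As far as I know, 80 (0x50) is also the Windows system error number `ERROR_FILE_EXISTS`, whose standard text is "The file exists.", so that wording may simply be the operating system's description of the number rather than evidence about a particular file.

```python
def as_hex32(code: int) -> str:
    """Format a signed exit code as 32-bit two's-complement hex, as the logs do."""
    return f"0x{code & 0xFFFFFFFF:x}"


def as_signed32(value: int) -> int:
    """Interpret an unsigned 32-bit value as a signed exit code."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


# Codes that appear in this thread, plus the FAQ's -80 for comparison.
for code in (80, -80, -97, -59):
    print(f"{code:>4} -> {as_hex32(code)}")
```

Written this way, 80 and -80 are clearly different values, so the FAQ entry for -80 does not necessarily describe the failures reported here.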

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 35348 - Posted: 25 Feb 2014 \| 21:02:54 UTC - in response to Message 35347. Last modified: 25 Feb 2014 \| 21:05:37 UTC
	I was able to get a task to fail by:
	- Run BOINC such that the GPU Task is processing
	- Right-click tray, choose "Snooze GPU"
	- Verify task now says "GPU suspended"
	- Right-click tray, "Exit", with "Stop running tasks" checked, click OK
	- Start BOINC
	They don't fail all the time, but... if you try those exact steps over and over, eventually you might get a failure.
	I'd like this thread to focus on failures that are a result of those steps above. I hope we can solve it, but we'll need help from MJH.
	ID: 35348 \| Rating: 0 \| rate: / Reply Quote
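
For anyone who wants to repeat Jacob's steps many times, the following is a rough sketch of how the snooze and exit steps might be driven from a script instead of the tray menu. It assumes `boinccmd` is on the PATH and can talk to the local client, and it assumes that `--set_gpu_mode never` and `--quit` are close equivalents of "Snooze GPU" and "Exit"; that mapping is my assumption, not something confirmed in this thread. Restarting the client is left out because the launch command depends on the installation. The function names are hypothetical.

```python
import subprocess
import time


def repro_commands(snooze_seconds: int = 3600) -> list[list[str]]:
    """Build the boinccmd calls that approximate the snooze and exit steps."""
    return [
        # Roughly "Snooze GPU": stop GPU work for a limited time.
        ["boinccmd", "--set_gpu_mode", "never", str(snooze_seconds)],
        # Roughly "Exit": ask the client to shut down.
        ["boinccmd", "--quit"],
    ]


def run_cycle(dry_run: bool = True, pause: float = 10.0) -> None:
    """Print the commands, or run them with a pause between steps."""
    for argv in repro_commands():
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)
            time.sleep(pause)


# Dry run by default, so nothing is sent to a real client.
run_cycle(dry_run=True)
```

After each cycle the client would be started again by hand and the task's result page checked for the exit code 80 error.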

Dagorath Send message Joined: 16 Mar 11 Posts: 509 Credit: 179,005,236 RAC: 0 Level Scientific publications	Message 35352 - Posted: 25 Feb 2014 \| 23:04:03 UTC - in response to Message 35348.
	"They don't fail all the time, but... if you try those exact steps over and over, eventually you might get a failure."
	I have caused GPUGrid tasks to fail on restart by stopping and restarting BOINC quickly 3 or 4 times in a row on Linux, but that was last year, not with the current app and drivers. If I think of it I'll try to replicate it on a newly started task, but I'm not going to try it on a task I've put an hour into.
	If a single stop-and-restart cycle of BOINC is causing crashes then that's worth fixing, but if it happens only after several stop-and-restart cycles in quick succession then I wonder if it's worth fixing, as that is not a likely operating scenario.
	____________
	BOINC <<--- credit whores, pedants, alien hunters
	ID: 35352 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 35353 - Posted: 25 Feb 2014 \| 23:05:37 UTC - in response to Message 35352.
	I run applications that I have set up as "exclusive applications" in BOINC. And sometimes I shut down BOINC. These, even in combination, should be supported by the projects. And I hope to have this issue resolved eventually. :)
	ID: 35353 \| Rating: 0 \| rate: / Reply Quote

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 35354 - Posted: 25 Feb 2014 \| 23:29:42 UTC
	I've scheduled some time to sort this out in a week or so, when I'll also be putting out Maxwell support. Matt
	ID: 35354 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 35356 - Posted: 26 Feb 2014 \| 1:34:09 UTC
	Thank you. I'm not sure how exactly to help, but I'm definitely willing. Last time, we iterated app versions with debug text to solve it, right? We might have to do something similar here.
	ID: 35356 \| Rating: 0 \| rate: / Reply Quote

Beyond Send message Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 Level Scientific publications	Message 35457 - Posted: 2 Mar 2014 \| 17:05:00 UTC
	I don't think this is a driver issue. I'd been error free for a long time but in the last 5 days have been seeing errors in SANTI_MAR WUs only. Some of them occur whenever BOINC is exited (gracefully, by exit dialog) for any reason. No other WU types are affected. At first I thought the exit error was only on 1GB cards but now I see from other users that it's happening on 660 Ti cards also. The SANTI_MAR WUs also seem to be particularly sensitive to other conditions and are failing at too high a rate IMO.
	ID: 35457 \| Rating: 0 \| rate: / Reply Quote

skgiven Volunteer moderator Volunteer tester Send message Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level Scientific publications	Message 35478 - Posted: 3 Mar 2014 \| 8:26:52 UTC - in response to Message 35457. Last modified: 3 Mar 2014 \| 8:29:26 UTC
	I've had 10 SANTI_MAR failures on the same Linux system in the past 3 weeks:
	http://www.gpugrid.net/results.php?hostid=159186&offset=0&show_names=1&state=5&appid=
	Other than that there has only been the one SANTI_bax2 failure and 2 WUs I aborted since Nov. They are all:
	Exit status 255 (0xff) Unknown error number
	____________
	FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help
	ID: 35478 \| Rating: 0 \| rate: / Reply Quote

TJ Send message Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0 Level Scientific publications	Message 35497 - Posted: 4 Mar 2014 \| 12:15:47 UTC
	Almost every other day I get an error on a Santi WU on my 660. On the 770 and 780Ti no errors (yet). I agree with Beyond (nice new picture of dog) that it is not the drivers. Santis seem to be "special". Coincidentally, I found a cruncher's task list with a Titan where all Santis failed on that system, but the recent Noelias finished okay.
	____________
	Greetings from TJ
	ID: 35497 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 35566 - Posted: 8 Mar 2014 \| 23:57:01 UTC - in response to Message 35354. Last modified: 9 Mar 2014 \| 0:00:51 UTC
	Matt:
	It has been a while -- have you made any progress on this?
	I'm still regularly failing tasks during suspend and resume operations, especially SANTI_MAR tasks. It's especially painful to see 2 tasks fail simultaneously, which happens to me because I have 2 GPUs dedicated to GPUGrid computing. When the tasks fail, 10-20 hours of work are instantly dead, to "Computation Error". Frustrating.
	We need a fix! Please help!
	"Posted: 25 Feb 2014 \| 23:29:42 UTC
	I've scheduled some time to sort this out in a week or so, when I'll also be putting out Maxwell support.
	Matt"
	ID: 35566 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 35583 - Posted: 10 Mar 2014 \| 15:08:22 UTC Last modified: 10 Mar 2014 \| 15:13:12 UTC
	I just had another one fail. I had 19 hours invested into it, and needed to restart my machine. I had suspended the task, I had closed BOINC, I restarted the machine, I resumed the task, and poof, Computation Error. 19 hours, wasted.
	This is very very frustrating.
	Stderr output
	<core_client_version>7.3.10</core_client_version>
	<![CDATA[
	<message>
	The file exists. (0x50) - exit code 80 (0x50)
	</message>
	<stderr_txt>
	# GPU [GeForce GTX 460] Platform [Windows] Rev [3203M] VERSION [42]
	# SWAN Device 1 :
	# Name : GeForce GTX 460
	# ECC : Disabled
	# Global mem : 1024MB
	# Capability : 2.1
	# PCI ID : 0000:08:00.0
	# Device clock : 1526MHz
	# Memory clock : 1900MHz
	# Memory width : 256bit
	# Driver version : r334_00 : 33489
	# GPU 0 : 67C
	# GPU 1 : 66C
	# GPU 2 : 78C
	# GPU 1 : 67C
	# GPU 1 : 68C
	# GPU 0 : 68C
	# GPU 1 : 69C
	# GPU 1 : 70C
	# GPU 0 : 69C
	# GPU 2 : 79C
	# BOINC suspending at user request (exit)
	# GPU [GeForce GTX 460] Platform [Windows] Rev [3203M] VERSION [42]
	# SWAN Device 1 :
	# Name : GeForce GTX 460
	# ECC : Disabled
	# Global mem : 1024MB
	# Capability : 2.1
	# PCI ID : 0000:08:00.0
	# Device clock : 1526MHz
	# Memory clock : 1900MHz
	# Memory width : 256bit
	# Driver version : r334_00 : 33489
	# GPU 0 : 66C
	# GPU 1 : 65C
	# GPU 2 : 73C
	# GPU 1 : 66C
	# GPU 2 : 75C
	# GPU 0 : 67C
	# BOINC suspending at user request (exit)
	</stderr_txt>
	]]>
	ID: 35583 \| Rating: 0 \| rate: / Reply Quote
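
Several posts here rely on reading stderr.txt by eye to see how many times the app started and whether the log ends on the suspend message. A minimal sketch of that check follows. It assumes the log has one entry per line, that each app start prints a line beginning with `# GPU [`, and that the suspend line begins with `# BOINC suspending at user request`, which matches the excerpts in this thread; the function name `summarize_stderr` is hypothetical.

```python
SUSPEND_PREFIX = "# BOINC suspending at user request"
START_PREFIX = "# GPU ["


def summarize_stderr(text: str) -> dict:
    """Count app starts and suspend requests, and report how the log ends."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    last = lines[-1] if lines else ""
    return {
        "app_starts": sum(1 for ln in lines if ln.startswith(START_PREFIX)),
        "suspend_requests": sum(1 for ln in lines if ln.startswith(SUSPEND_PREFIX)),
        "ended_on_suspend": last.startswith(SUSPEND_PREFIX),
    }


# Shortened sample in the shape of the log quoted above.
sample = """\
# GPU [GeForce GTX 460] Platform [Windows] Rev [3203M] VERSION [42]
# GPU 0 : 67C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 460] Platform [Windows] Rev [3203M] VERSION [42]
# GPU 0 : 66C
# BOINC suspending at user request (exit)
"""

print(summarize_stderr(sample))
```

A log that ends on a suspend request with no later start line suggests the task failed at the next launch, before the app wrote anything to stderr.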

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 35709 - Posted: 17 Mar 2014 \| 15:26:50 UTC
	And another one today.
	The file exists. (0x50) - exit code 80 (0x50)
	MJH?
	ID: 35709 \| Rating: 0 \| rate: / Reply Quote

Beyond Send message Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 Level Scientific publications	Message 35792 - Posted: 21 Mar 2014 \| 20:59:19 UTC - in response to Message 35583.
	"I just had another one fail. I had 19 hours invested into it, and needed to restart my machine. I had suspended the task, I had closed BOINC, I restarted the machine, I resumed the task, and poof, Computation Error. 19 hours, wasted. This is very very frustrating."
	This same thing happens here on every SANTI_MAR WU when I have to exit BOINC and reboot for an update or whatever. 100% chance of error. Frustrating is the word.
	ID: 35792 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 35927 - Posted: 27 Mar 2014 \| 11:47:41 UTC Last modified: 27 Mar 2014 \| 11:49:41 UTC
	MJH:
	Please please please help. I just threw away another several hours of GPUGrid work, because I had to restart BOINC, and the 2 GPUGrid tasks died. :(
	This time, I didn't suspend the tasks, I just exited BOINC normally. Then, upon restart, both tasks died. Surely this is fixable?!?!?
	Name: 1211-GIANNI_ntl-1-4-RND3734_0
	Workunit: 5485267
	Created: 26 Mar 2014 \| 21:32:15 UTC
	Sent: 27 Mar 2014 \| 0:06:01 UTC
	Received: 27 Mar 2014 \| 11:46:13 UTC
	Server state: Over
	Outcome: Computation error
	Client state: Compute error
	Exit status: 80 (0x50) Unknown error number
	Computer ID: 153764
	Report deadline: 1 Apr 2014 \| 0:06:01 UTC
	Run time: 0.00
	CPU time: 0.00
	Validate state: Invalid
	Credit: 0.00
	Application version: Long runs (8-12 hours on fastest card) v8.15 (cuda42)
	Stderr output
	<core_client_version>7.3.11</core_client_version>
	<![CDATA[
	<message>
	The file exists. (0x50) - exit code 80 (0x50)
	</message>
	<stderr_txt>
	# GPU [GeForce GTX 460] Platform [Windows] Rev [3203M] VERSION [42]
	# SWAN Device 1 :
	# Name : GeForce GTX 460
	# ECC : Disabled
	# Global mem : 1024MB
	# Capability : 2.1
	# PCI ID : 0000:08:00.0
	# Device clock : 1526MHz
	# Memory clock : 1900MHz
	# Memory width : 256bit
	# Driver version : r334_89 : 33523
	# GPU 0 : 58C
	# GPU 1 : 47C
	# GPU 2 : 67C
	# GPU 0 : 60C
	# GPU 1 : 50C
	# GPU 2 : 69C
	# GPU 0 : 61C
	# GPU 1 : 52C
	# GPU 0 : 62C
	# GPU 1 : 55C
	# GPU 2 : 70C
	# GPU 0 : 63C
	# GPU 1 : 56C
	# GPU 2 : 71C
	# GPU 1 : 57C
	# GPU 0 : 64C
	# GPU 1 : 59C
	# GPU 2 : 72C
	# GPU 0 : 65C
	# GPU 1 : 61C
	# GPU 1 : 62C
	# GPU 1 : 63C
	# GPU 2 : 73C
	# GPU 0 : 66C
	# GPU 1 : 64C
	# GPU 2 : 74C
	# GPU 1 : 65C
	# GPU 0 : 67C
	# GPU 1 : 66C
	# GPU 2 : 75C
	# GPU 0 : 68C
	# GPU 2 : 76C
	# GPU 1 : 67C
	# BOINC suspending at user request (exit)
	</stderr_txt>
	]]>
	Name: 1733-GIANNI_ntl-3-4-RND9094_0
	Workunit: 5485140
	Created: 26 Mar 2014 \| 21:06:01 UTC
	Sent: 27 Mar 2014 \| 6:35:37 UTC
	Received: 27 Mar 2014 \| 11:46:13 UTC
	Server state: Over
	Outcome: Computation error
	Client state: Compute error
	Exit status: 80 (0x50) Unknown error number
	Computer ID: 153764
	Report deadline: 1 Apr 2014 \| 6:35:37 UTC
	Run time: 0.00
	CPU time: 0.00
	Validate state: Invalid
	Credit: 0.00
	Application version: Long runs (8-12 hours on fastest card) v8.15 (cuda55)
	Stderr output
	<core_client_version>7.3.11</core_client_version>
	<![CDATA[
	<message>
	The file exists. (0x50) - exit code 80 (0x50)
	</message>
	<stderr_txt>
	# GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3203M] VERSION [55]
	# SWAN Device 0 :
	# Name : GeForce GTX 660 Ti
	# ECC : Disabled
	# Global mem : 3072MB
	# Capability : 3.0
	# PCI ID : 0000:09:00.0
	# Device clock : 1124MHz
	# Memory clock : 3004MHz
	# Memory width : 192bit
	# Driver version : r334_89 : 33523
	# GPU 0 : 64C
	# GPU 1 : 65C
	# GPU 2 : 74C
	# GPU 2 : 75C
	# GPU 0 : 65C
	# GPU 1 : 66C
	# GPU 0 : 66C
	# GPU 2 : 76C
	# GPU 1 : 67C
	# BOINC suspending at user request (exit)
	</stderr_txt>
	]]>
	ID: 35927 \| Rating: 0 \| rate: / Reply Quote

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 35928 - Posted: 27 Mar 2014 \| 13:09:29 UTC - in response to Message 35927.
	Jacob,
	Try the acemdshort app 820. It should fix the problem.
	Matt
	ID: 35928 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 35934 - Posted: 27 Mar 2014 \| 18:12:43 UTC - in response to Message 35928. Last modified: 27 Mar 2014 \| 18:13:05 UTC
	What was the problem, and what was the fix? When do you think it will land on the Long queue? I will try to monitor application version numbers more closely, as I usually get a variety of Long/Short tasks.
	ID: 35934 \| Rating: 0 \| rate: / Reply Quote

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 35941 - Posted: 27 Mar 2014 \| 19:56:23 UTC - in response to Message 35934.
	The problem, I think, is a false positive from the test to see if the WU has got stuck in a crash loop, as introduced in 815. I fixed that a while ago but only rolled it out with 820. Let's see...
	Matt
	ID: 35941 \| Rating: 0 \| rate: / Reply Quote
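
Matt does not say how the crash-loop test works, so the following is only a hypothetical illustration of how such a test could produce a false positive, not a description of the GPUGrid application. It assumes a design where the app counts its own starts in a small state file and gives up after too many; every name here (`on_start`, `on_clean_exit`, `MAX_RESTARTS`, the state file) is invented for the example. If clean exits requested by the user are counted like crashes, a few suspend/exit/resume cycles are enough to trip the limit.

```python
import json
import tempfile
from pathlib import Path

MAX_RESTARTS = 3  # hypothetical threshold


def on_start(state_file: Path, count_clean_exits: bool) -> bool:
    """Record an app start; return True if it would be judged a crash loop."""
    state = {"restarts": 0, "clean_exit": True}
    if state_file.exists():
        state = json.loads(state_file.read_text())
    # Naive variant: every start counts. Stricter variant: only starts that
    # follow an unclean shutdown count.
    if count_clean_exits or not state["clean_exit"]:
        state["restarts"] += 1
    state["clean_exit"] = False
    state_file.write_text(json.dumps(state))
    return state["restarts"] > MAX_RESTARTS


def on_clean_exit(state_file: Path) -> None:
    """Mark that the app shut down at the user's request, not by crashing."""
    state = json.loads(state_file.read_text())
    state["clean_exit"] = True
    state_file.write_text(json.dumps(state))


def simulate(count_clean_exits: bool, cycles: int = 5) -> bool:
    """Run several start / clean-exit cycles; True means a false positive."""
    with tempfile.TemporaryDirectory() as tmp:
        state_file = Path(tmp) / "restart_state.json"
        for _ in range(cycles):
            if on_start(state_file, count_clean_exits):
                return True
            on_clean_exit(state_file)
    return False


print("naive counter trips:", simulate(count_clean_exits=True))
print("clean-exit aware counter trips:", simulate(count_clean_exits=False))
```

A real implementation would presumably also reset the counter once the task makes checkpoint progress; that part is omitted here.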

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36010 - Posted: 30 Mar 2014 \| 22:42:14 UTC - in response to Message 35941.
	When do you plan on deploying the 8.20 app, to Long-queue? People are still getting the "File already exists" error, losing tons of work, daily. If you were still testing it, then why was it not contained to the Beta-queue? Since it's already on Short, I think it should already be on Long too. Sick of losing work because of this . . .
	ID: 36010 \| Rating: 0 \| rate: / Reply Quote

Stefan Project administrator Project developer Project tester Project scientist Send message Joined: 5 Mar 13 Posts: 348 Credit: 0 RAC: 0 Level Scientific publications	Message 36016 - Posted: 31 Mar 2014 \| 9:08:49 UTC - in response to Message 36010.
	Bugs get through beta-queue testing from time to time. So it's obviously better if we only lose the work on the short queue and not the work from both queues. But I guess at this point 820 looks stable enough, so I will suggest to Matt to push it to long.
	ID: 36016 \| Rating: 0 \| rate: / Reply Quote

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36017 - Posted: 31 Mar 2014 \| 10:37:37 UTC - in response to Message 36010.
	Jacob, 820 for cuda6 is on long now. Matt
	ID: 36017 \| Rating: 0 \| rate: / Reply Quote

Richard Haselgrove Send message Joined: 11 Jul 09 Posts: 1626 Credit: 9,384,566,723 RAC: 19,075,423 Level Scientific publications	Message 36018 - Posted: 31 Mar 2014 \| 11:05:34 UTC - in response to Message 36017.
	"Jacob, 820 for cuda6 is on long now. Matt"
	Have you been able to find a way of preventing the server from allocating cuda55 or cuda42 to Maxwell (CC 5.0) cards yet? It doesn't waste any actual computing time, but the downloads are a bit of a pain - and having several hours of expected crunching suddenly disappear rather confuses BOINC's scheduler. :-D
	ID: 36018 \| Rating: 0 \| rate: / Reply Quote

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36019 - Posted: 31 Mar 2014 \| 11:20:11 UTC - in response to Message 36018.
	"Have you been able to find a way of preventing the server from allocating cuda55 or cuda42 to Maxwell (CC 5.0) cards yet?"
	No idea, although I haven't looked deeply into it yet.
	Matt
	ID: 36019 \| Rating: 0 \| rate: / Reply Quote

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36020 - Posted: 31 Mar 2014 \| 11:20:15 UTC - in response to Message 36018.
	"Have you been able to find a way of preventing the server from allocating cuda55 or cuda42 to Maxwell (CC 5.0) cards yet?"
	No idea, although I haven't looked deeply into it yet.
	Matt
	ID: 36020 \| Rating: 0 \| rate: / Reply Quote

Richard Haselgrove Send message Joined: 11 Jul 09 Posts: 1626 Credit: 9,384,566,723 RAC: 19,075,423 Level Scientific publications	Message 36021 - Posted: 31 Mar 2014 \| 11:31:40 UTC - in response to Message 36020.
	"Have you been able to find a way of preventing the server from allocating cuda55 or cuda42 to Maxwell (CC 5.0) cards yet?"
	"No idea, although haven't looked deeply into it yet. Matt"
	It should be possible, by setting a maximum compute_capability for the two unwanted plan_classes.
	ID: 36021 \| Rating: 0 \| rate: / Reply Quote
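
Richard's suggestion amounts to giving each plan class an upper bound as well as a lower bound on compute capability. The sketch below only illustrates that selection logic; it is not BOINC server code or configuration. The `PlanClass` type, the `eligible` function, the encoding of compute capability as major*100 + minor, and all of the bounds are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlanClass:
    name: str
    min_cc: int                   # compute capability as major*100 + minor
    max_cc: Optional[int] = None  # None means no upper bound


def eligible(plan_classes: list, cc: int) -> list:
    """Return the names of plan classes whose bounds admit a card with this CC."""
    return [
        p.name
        for p in plan_classes
        if p.min_cc <= cc and (p.max_cc is None or cc <= p.max_cc)
    ]


# Hypothetical bounds: cap the two older plan classes below CC 5.0.
PLAN_CLASSES = [
    PlanClass("cuda42", min_cc=200, max_cc=399),
    PlanClass("cuda55", min_cc=200, max_cc=399),
    PlanClass("cuda60", min_cc=200),
]

print("CC 3.0:", eligible(PLAN_CLASSES, 300))
print("CC 5.0:", eligible(PLAN_CLASSES, 500))
```

With bounds like these, a Maxwell card at CC 5.0 would only be offered the cuda60 class, which is the outcome Richard is asking for.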

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36022 - Posted: 31 Mar 2014 \| 12:09:55 UTC - in response to Message 36017. Last modified: 31 Mar 2014 \| 12:11:22 UTC
	"Jacob, 820 for cuda6 is on long now. Matt"
	Finally!! I noticed that it was only deployed for the cuda6 plan classes; are there any plans to update the app for the other plan classes?
	Also, please continue to make stability a priority. It is so very frustrating to lose progress. Some of the tasks that fail say they only had a couple seconds of run-time, where I believe they may have actually had several hours invested. Perhaps that masked the severity of the issue to you guys, not sure. But I hope bug-fixing becomes a high(er) priority.
	Regards,
	Jacob
	ID: 36022 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36083 - Posted: 4 Apr 2014 \| 2:33:01 UTC
	Had to chime in again to say THANK YOU for fixing this. BOINC Task Stability is obviously very important to me, and this bug had been plaguing me for weeks. The new 8.20 app seems to be suspending/exiting/resuming much better for me thus far. Thank you!
	ID: 36083 \| Rating: 0 \| rate: / Reply Quote

Wdethomas Send message Joined: 6 Feb 10 Posts: 38 Credit: 274,204,838 RAC: 0 Level Scientific publications	Message 36143 - Posted: 7 Apr 2014 \| 19:03:44 UTC
	This has not been fixed. All my WUs are cuda55, and if the power goes out, the work units are lost.
	ID: 36143 \| Rating: 0 \| rate: / Reply Quote

Variable Send message Joined: 20 Nov 13 Posts: 21 Credit: 452,041,709 RAC: 547,846 Level Scientific publications	Message 36145 - Posted: 7 Apr 2014 \| 19:43:02 UTC
	It looks like I've started getting some errors on my machine as well over the last few days. It's not running overly hot, not sure what's going on. This is the output from the last one:
	Stderr output
	<core_client_version>7.2.33</core_client_version>
	<![CDATA[
	<message>
	(unknown error) - exit code -97 (0xffffff9f)
	</message>
	<stderr_txt>
	# GPU [GeForce GTX 760] Platform [Windows] Rev [3301M] VERSION [60]
	# SWAN Device 0 :
	# Name : GeForce GTX 760
	# ECC : Disabled
	# Global mem : 2048MB
	# Capability : 3.0
	# PCI ID : 0000:01:00.0
	# Device clock : 1084MHz
	# Memory clock : 3404MHz
	# Memory width : 256bit
	# Driver version : r334_00 : 33489
	# GPU 0 : 44C
	# GPU 0 : 45C
	# GPU 0 : 47C
	# GPU 0 : 48C
	# GPU 0 : 49C
	# GPU 0 : 50C
	# GPU 0 : 51C
	# GPU 0 : 52C
	# GPU 0 : 53C
	# GPU 0 : 54C
	# GPU 0 : 55C
	# GPU 0 : 56C
	# GPU 0 : 57C
	# The simulation has become unstable. Terminating to avoid lock-up (1)
	# Attempting restart (step 76000)
	# GPU [GeForce GTX 760] Platform [Windows] Rev [3301M] VERSION [60]
	# SWAN Device 0 :
	# Name : GeForce GTX 760
	# ECC : Disabled
	# Global mem : 2048MB
	# Capability : 3.0
	# PCI ID : 0000:01:00.0
	# Device clock : 1084MHz
	# Memory clock : 3404MHz
	# Memory width : 256bit
	# Driver version : r334_00 : 33489
	# GPU 0 : 56C
	# GPU 0 : 57C
	# The simulation has become unstable. Terminating to avoid lock-up (1)
	# Attempting restart (step 174000)
	# GPU [GeForce GTX 760] Platform [Windows] Rev [3301M] VERSION [60]
	# SWAN Device 0 :
	# Name : GeForce GTX 760
	# ECC : Disabled
	# Global mem : 2048MB
	# Capability : 3.0
	# PCI ID : 0000:01:00.0
	# Device clock : 1084MHz
	# Memory clock : 3404MHz
	# Memory width : 256bit
	# Driver version : r334_00 : 33489
	# GPU 0 : 56C
	# GPU 0 : 57C
	# The simulation has become unstable. Terminating to avoid lock-up (1)
	# Attempting restart (step 175000)
	# GPU [GeForce GTX 760] Platform [Windows] Rev [3301M] VERSION [60]
	# SWAN Device 0 :
	# Name : GeForce GTX 760
	# ECC : Disabled
	# Global mem : 2048MB
	# Capability : 3.0
	# PCI ID : 0000:01:00.0
	# Device clock : 1084MHz
	# Memory clock : 3404MHz
	# Memory width : 256bit
	# Driver version : r334_00 : 33489
	# The simulation has become unstable. Terminating to avoid lock-up (1)
	</stderr_txt>
	]]>
	ID: 36145 \| Rating: 0 \| rate: / Reply Quote

Jim1348 Send message Joined: 28 Jul 12 Posts: 819 Credit: 1,591,285,971 RAC: 0 Level Scientific publications	Message 36146 - Posted: 7 Apr 2014 \| 21:46:59 UTC - in response to Message 36145. Last modified: 7 Apr 2014 \| 21:50:15 UTC
	"It looks like I've started getting some errors on my machine as well over the last few days. It's not running overly hot, not sure what's going on."
	I have been seeing that too recently on one of my previously stable GTX 660s. But the other one that I had previously underclocked from 993 MHz to 967 MHz has been stable. So it appears that the work units have just gotten a little harder, and now I am underclocking both of them.
	I would suggest reducing your GPU clock to 1000 MHz or so. (It is not a heat issue; mine were around 66 C).
	ID: 36146 \| Rating: 0 \| rate: / Reply Quote

petnek Send message Joined: 30 May 09 Posts: 3 Credit: 35,191,012 RAC: 0 Level Scientific publications	Message 36219 - Posted: 11 Apr 2014 \| 4:54:24 UTC
	I have the same issue on two different GPUs with different drivers.
	On GTX 275:
	<core_client_version>7.2.39</core_client_version>
	<![CDATA[
	<message>
	(unknown error) - exit code -59 (0xffffffc5)
	On Quadro FX 3800:
	<core_client_version>6.10.18</core_client_version>
	<![CDATA[
	<message>
	The file exists. (0x50) - exit code 80 (0x50)
	I'm running short tasks on both. Please fix these failures!
	ID: 36219 \| Rating: 0 \| rate: / Reply Quote

TJ Send message Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0 Level Scientific publications	Message 36220 - Posted: 11 Apr 2014 \| 8:23:12 UTC Last modified: 11 Apr 2014 \| 8:24:05 UTC
	Perhaps this helps a little. Yesterday I needed to reboot all my systems for the necessary Windows updates after running for 26 days. The first thing I do is set BOINC to accept no new work so the queue can empty. Eventually I needed to go to bed, but WUs were still running. I suspended all work in BOINC Manager and then did a cold boot (install updates, then power off the system). After starting the PCs I went to BOINC Manager again and resumed work. Everything worked fine without error.
	I know this is not the option Jacob, the original poster, wants, but at least in my case it did not result in loss of work.
	Edit: I need to mention I am still using the 331.82 graphics driver.
	____________
	Greetings from TJ
	ID: 36220 \| Rating: 0 \| rate: / Reply Quote

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36223 - Posted: 11 Apr 2014 \| 10:11:57 UTC - in response to Message 36083.
	"Thank you!"
	Thank you too, for your help in diagnosing it. On to the next problem!
	Matt
	ID: 36223 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36252 - Posted: 12 Apr 2014 \| 14:38:14 UTC - in response to Message 36223.
	"Thank you!"
	"Thank you too, for your help in diagnosing it. On to the next problem! Matt"
	I thought this problem was fixed -- why are we still receiving 8.15 tasks? I just had 2 more fail, losing several hours of work, presumably because they were 8.15 instead of 8.20. Upsetting.
	ID: 36252 \| Rating: 0 \| rate: / Reply Quote

Variable Send message Joined: 20 Nov 13 Posts: 21 Credit: 452,041,709 RAC: 547,846 Level Scientific publications	Message 36279 - Posted: 14 Apr 2014 \| 13:48:03 UTC
	I downclocked my card slightly (~50MHz), or more precisely reduced the overclock, and haven't gotten any more errors since. Not sure if that's causal or coincidental since I haven't bumped it back up yet to test.
	ID: 36279 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36280 - Posted: 14 Apr 2014 \| 14:05:15 UTC - in response to Message 36279. Last modified: 14 Apr 2014 \| 14:05:49 UTC
	Variable: Your issue(s) are different than the one posted in this thread (see post 1). If you continue to have problems, please create a new thread. Thanks, Jacob
	ID: 36280 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36427 - Posted: 19 Apr 2014 \| 11:16:21 UTC - in response to Message 36252. Last modified: 19 Apr 2014 \| 11:16:40 UTC
	And... another 8.15 task crashed just now, losing tons of work. Why are we still using 8.15?!?
	"Thank you!"
	"Thank you too, for your help in diagnosing it. On to the next problem! Matt"
	"I thought this problem was fixed -- why are we still receiving 8.15 tasks? I just had 2 more fail, losing several hours of work, presumably because they were 8.15 instead of 8.20. Upsetting."
	ID: 36427 \| Rating: 0 \| rate: / Reply Quote

Wdethomas Send message Joined: 6 Feb 10 Posts: 38 Credit: 274,204,838 RAC: 0 Level Scientific publications	Message 36439 - Posted: 19 Apr 2014 \| 16:23:50 UTC
	Power went out yesterday, I lost work units. Power went out today, I lost work units. This needs to get fixed!!!!!
	ID: 36439 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36707 - Posted: 28 Apr 2014 \| 12:34:28 UTC Last modified: 28 Apr 2014 \| 12:35:36 UTC
	MJH:
	Although the 8.41 app appears to have improved the situation, I am still occasionally getting what appears to be the same error. I think the scenario is suspending activity, then restarting BOINC. Can you see if there's some scenario/condition that still causes the task to fail?
	Error summary:
	Exit status 80 (0x50) Unknown error number
	The file exists. (0x50) - exit code 80 (0x50)
	Last message logged in stderr.txt:
	# BOINC suspending at user request (exit)
	Task results and stderr.txt:
	http://www.gpugrid.net/result.php?resultid=9339200
	Name: I188-NATHAN_RPS1_adapt4-1-5-RND2310_0
	Workunit: 6566597
	Created: 25 Apr 2014 \| 22:04:16 UTC
	Sent: 26 Apr 2014 \| 11:18:41 UTC
	Received: 27 Apr 2014 \| 4:06:56 UTC
	Server state: Over
	Outcome: Computation error
	Client state: Compute error
	Exit status: 80 (0x50) Unknown error number
	Computer ID: 153764
	Report deadline: 1 May 2014 \| 11:18:41 UTC
	Run time: 38,039.02
	CPU time: 6,213.84
	Validate state: Invalid
	Credit: 0.00
	Application version: Long runs (8-12 hours on fastest card) v8.41 (cuda60)
	Stderr output
	<core_client_version>7.3.15</core_client_version>
	<![CDATA[
	<message>
	The file exists. (0x50) - exit code 80 (0x50)
	</message>
	<stderr_txt>
	# GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3301M] VERSION [60]
	# SWAN Device 0 :
	# Name : GeForce GTX 660 Ti
	# ECC : Disabled
	# Global mem : 3072MB
	# Capability : 3.0
	# PCI ID : 0000:09:00.0
	# Device clock : 1124MHz
	# Memory clock : 3004MHz
	# Memory width : 192bit
	# Driver version : DM337_50 : 33761
	# GPU 0 : 68C
	# GPU 1 : 69C
	# GPU 2 : 77C
	# GPU 0 : 69C
	# GPU 1 : 70C
	# BOINC suspending at user request (exit)
	# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60]
	# SWAN Device 1 :
	# Name : GeForce GTX 460
	# ECC : Disabled
	# Global mem : 1024MB
	# Capability : 2.1
	# PCI ID : 0000:08:00.0
	# Device clock : 1526MHz
	# Memory clock : 1900MHz
	# Memory width : 256bit
	# Driver version : DM337_50 : 33761
	# GPU 0 : 70C
	# GPU 1 : 69C
	# GPU 2 : 78C
	# GPU 1 : 70C
	# GPU 1 : 71C
	# GPU 2 : 79C
	# GPU 0 : 71C
	# GPU 1 : 72C
	# GPU 0 : 72C
	# GPU 0 : 73C
	# GPU 2 : 80C
	# BOINC suspending at user request (exit)
	# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60]
	# SWAN Device 1 :
	# Name : GeForce GTX 460
	# ECC : Disabled
	# Global mem : 1024MB
	# Capability : 2.1
	# PCI ID : 0000:08:00.0
	# Device clock : 1526MHz
	# Memory clock : 1900MHz
	# Memory width : 256bit
	# Driver version : DM337_50 : 33761
	# GPU 0 : 70C
	# GPU 1 : 65C
	# GPU 2 : 74C
	# GPU 0 : 71C
	# GPU 1 : 68C
	# GPU 2 : 75C
	# GPU 0 : 72C
	# GPU 1 : 70C
	# GPU 1 : 72C
	# GPU 2 : 76C
	# GPU 2 : 77C
	# GPU 1 : 73C
	# GPU 2 : 79C
	# GPU 1 : 74C
	# GPU 1 : 75C
	# GPU 1 : 76C
	# GPU 1 : 77C
	# GPU 1 : 78C
	# GPU 1 : 79C
	# GPU 1 : 80C
	# GPU 0 : 74C
	# GPU 0 : 75C
	# BOINC suspending at user request (exit)
	# GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60]
	# SWAN Device 1 :
	# Name : GeForce GTX 460
	# ECC : Disabled
	# Global mem : 1024MB
	# Capability : 2.1
	# PCI ID : 0000:08:00.0
	# Device clock : 1526MHz
	# Memory clock : 1900MHz
	# Memory width : 256bit
	# Driver version : DM337_50 : 33761
	# GPU 0 : 60C
	# GPU 1 : 58C
	# GPU 2 : 55C
	# GPU 0 : 65C
	# GPU 1 : 63C
# GPU 2 : 70C # GPU 0 : 68C # GPU 1 : 67C # GPU 2 : 74C # GPU 0 : 70C # GPU 1 : 69C # GPU 2 : 75C # GPU 0 : 71C # GPU 1 : 70C # GPU 2 : 76C # GPU 0 : 72C # GPU 1 : 73C # GPU 0 : 73C # GPU 2 : 77C # GPU 2 : 78C # GPU 2 : 79C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 59C # GPU 1 : 58C # GPU 2 : 55C # GPU 0 : 61C # GPU 1 : 63C # GPU 0 : 63C # GPU 1 : 67C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 61C # GPU 1 : 64C # GPU 2 : 55C # GPU 0 : 63C # GPU 1 : 68C # GPU 0 : 64C # GPU 1 : 70C # GPU 0 : 65C # GPU 1 : 71C # GPU 2 : 56C # GPU 0 : 66C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 63C # GPU 1 : 66C # GPU 2 : 62C # GPU 0 : 65C # GPU 1 : 70C # GPU 0 : 66C # GPU 1 : 74C # GPU 0 : 67C # GPU 1 : 78C # GPU 0 : 69C # GPU 0 : 71C # GPU 0 : 73C # GPU 2 : 71C # GPU 2 : 73C # GPU 2 : 74C # GPU 2 : 75C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # 
Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 61C # GPU 1 : 63C # GPU 2 : 57C # GPU 0 : 63C # GPU 1 : 67C # GPU 0 : 64C # GPU 1 : 70C # GPU 0 : 66C # GPU 1 : 71C # GPU 1 : 73C # GPU 0 : 67C # GPU 1 : 74C # GPU 1 : 75C # GPU 0 : 71C # GPU 0 : 72C # GPU 2 : 69C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 59C # GPU 1 : 59C # GPU 2 : 56C # GPU 0 : 65C # GPU 1 : 64C # GPU 2 : 64C # GPU 0 : 68C # GPU 1 : 67C # GPU 2 : 69C # GPU 0 : 69C # GPU 1 : 69C # GPU 2 : 71C # GPU 0 : 71C # GPU 2 : 72C # GPU 0 : 73C # GPU 1 : 72C # GPU 2 : 73C # GPU 1 : 73C # GPU 2 : 74C # GPU 2 : 75C # GPU 2 : 76C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 62C # GPU 1 : 65C # GPU 2 : 57C # GPU 0 : 63C # GPU 1 : 68C # BOINC suspending at user request (exit) </stderr_txt> ]]>
	ID: 36707 \| Rating: 0 \| rate: / Reply Quote

Jozef J Send message Joined: 7 Jun 12 Posts: 112 Credit: 1,140,895,172 RAC: 17,863 Level Scientific publications	Message 36835 - Posted: 14 May 2014 \| 16:20:05 UTC
	I have been having a problem crunching on my GTX 680 cards for about a month now. Every day, tasks collapse in a weird way: the Windows system slows down and, according to GPU-Z, the card stops computing. The entire system behaves as if in slow motion. The only thing that helps is suspending computation on the graphics card, aborting every task, and downloading new ones. After about 6-12 aborted tasks, another 3 start and run normally. These weird errors affect only the NVIDIA 600-series cards; on the 700-series cards crunching goes perfectly. I have been fighting this problem for months, while other projects compute without problems. It's not overheating cards or a weak PSU. I'm not able to crunch GPUGRID normally on these 680s, so I'm considering selling them or moving them to another project.
	ID: 36835 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36856 - Posted: 17 May 2014 \| 8:01:05 UTC Last modified: 17 May 2014 \| 8:02:41 UTC
	MJH: The v8.41 version of the application still has the occasional "The file exists. (0x50) - exit code 80 (0x50)" error, trashing loads of work :( Can you please invest some time to fix it? http://www.gpugrid.net/result.php?resultid=10318262 Name A2ART4Ex05x95-GERARD_A2ART4E-13-14-RND0991_0 Workunit 7496762 Created 14 May 2014 \| 5:52:04 UTC Sent 16 May 2014 \| 13:57:32 UTC Received 17 May 2014 \| 3:24:11 UTC Server state Over Outcome Computation error Client state Compute error Exit status 80 (0x50) Unknown error number Computer ID 153764 Report deadline 21 May 2014 \| 13:57:32 UTC Run time 24,161.19 CPU time 6,302.88 Validate state Invalid Credit 0.00 Application version Long runs (8-12 hours on fastest card) v8.41 (cuda60) Stderr output <core_client_version>7.3.19</core_client_version> <![CDATA[ <message> The file exists. (0x50) - exit code 80 (0x50) </message> <stderr_txt> # GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 0 : # Name : GeForce GTX 660 Ti # ECC : Disabled # Global mem : 3072MB # Capability : 3.0 # PCI ID : 0000:09:00.0 # Device clock : 1124MHz # Memory clock : 3004MHz # Memory width : 192bit # Driver version : DM337_50 : 33761 # GPU 0 : 67C # GPU 1 : 75C # GPU 2 : 74C # GPU 0 : 68C # GPU 1 : 76C # GPU 0 : 69C # GPU 0 : 70C # GPU 1 : 77C # GPU 0 : 71C # GPU 0 : 72C # GPU 2 : 75C # BOINC suspending at user request (exit) # GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 0 : # Name : GeForce GTX 660 Ti # ECC : Disabled # Global mem : 3072MB # Capability : 3.0 # PCI ID : 0000:09:00.0 # Device clock : 1124MHz # Memory clock : 3004MHz # Memory width : 192bit # Driver version : DM337_50 : 33761 # GPU 0 : 66C # GPU 1 : 71C # GPU 2 : 58C # GPU 0 : 67C # GPU 2 : 62C # GPU 2 : 66C # GPU 2 : 67C # GPU 0 : 68C # GPU 1 : 72C # GPU 2 : 68C # GPU 2 : 69C # GPU 2 : 70C # GPU 0 : 69C # GPU 1 : 73C # GPU 2 : 71C # BOINC suspending at user request (exit) # GPU [GeForce GTX 660 Ti] Platform 
[Windows] Rev [3301M] VERSION [60] # SWAN Device 0 : # Name : GeForce GTX 660 Ti # ECC : Disabled # Global mem : 3072MB # Capability : 3.0 # PCI ID : 0000:09:00.0 # Device clock : 1124MHz # Memory clock : 3004MHz # Memory width : 192bit # Driver version : DM337_50 : 33761 # GPU 0 : 66C # GPU 1 : 71C # GPU 2 : 65C # GPU 0 : 67C # GPU 1 : 72C # GPU 2 : 67C # GPU 2 : 68C # GPU 0 : 68C # GPU 2 : 69C # GPU 1 : 73C # GPU 0 : 69C # GPU 2 : 70C # GPU 2 : 71C # GPU 1 : 74C # BOINC suspending at user request (exit) # GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 0 : # Name : GeForce GTX 660 Ti # ECC : Disabled # Global mem : 3072MB # Capability : 3.0 # PCI ID : 0000:09:00.0 # Device clock : 1124MHz # Memory clock : 3004MHz # Memory width : 192bit # Driver version : DM337_50 : 33761 # GPU 0 : 68C # GPU 1 : 73C # GPU 2 : 68C # GPU 2 : 69C # GPU 2 : 70C # GPU 0 : 69C # GPU 1 : 74C # GPU 2 : 71C # GPU 0 : 70C # GPU 1 : 75C # GPU 2 : 72C # GPU 2 : 73C # GPU 1 : 76C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 57C # GPU 1 : 68C # GPU 2 : 61C # GPU 0 : 61C # GPU 1 : 69C # GPU 0 : 64C # GPU 1 : 70C # GPU 0 : 65C # GPU 1 : 71C # GPU 0 : 66C # GPU 1 : 72C # GPU 0 : 67C # GPU 1 : 73C # GPU 0 : 69C # GPU 0 : 70C # GPU 2 : 67C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform 
[Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 61C # GPU 1 : 53C # GPU 2 : 67C # GPU 0 : 64C # GPU 1 : 58C # GPU 2 : 69C # GPU 0 : 66C # GPU 1 : 61C # GPU 0 : 67C # GPU 1 : 64C # GPU 2 : 70C # GPU 0 : 68C # GPU 1 : 65C # GPU 1 : 67C # GPU 2 : 71C # GPU 0 : 69C # GPU 1 : 69C # GPU 0 : 70C # GPU 1 : 70C # GPU 1 : 71C # GPU 2 : 72C # GPU 0 : 71C # GPU 1 : 72C # GPU 2 : 73C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 61C # GPU 1 : 53C # GPU 2 : 67C # GPU 0 : 64C # GPU 1 : 57C # GPU 2 : 68C # GPU 0 : 66C # GPU 1 : 60C # GPU 2 : 69C # GPU 0 : 67C # GPU 1 : 63C # GPU 2 : 70C # GPU 0 : 68C # GPU 1 : 64C # GPU 0 : 69C # GPU 1 : 67C # GPU 1 : 68C # GPU 2 : 71C # GPU 1 : 69C # GPU 0 : 70C # GPU 1 : 70C # GPU 1 : 72C # GPU 2 : 72C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 54C # GPU 1 : 58C # GPU 2 : 59C # GPU 1 : 62C # GPU 1 : 64C # GPU 0 : 60C # GPU 1 : 66C # GPU 0 : 62C # BOINC suspending at user request (exit) # GPU [GeForce GTX 460] Platform [Windows] Rev [3301M] VERSION [60] # SWAN Device 1 : # Name : GeForce GTX 460 # ECC : Disabled # Global mem : 1024MB # Capability : 2.1 # PCI ID : 0000:08:00.0 # Device clock : 1526MHz # Memory 
clock : 1900MHz # Memory width : 256bit # Driver version : DM337_50 : 33761 # GPU 0 : 58C # GPU 1 : 53C # GPU 2 : 58C # GPU 0 : 60C # GPU 1 : 58C # GPU 0 : 63C # GPU 1 : 62C </stderr_txt> ]]>
	ID: 36856 \| Rating: 0 \| rate: / Reply Quote

Retvari Zoltan Send message Joined: 20 Jan 09 Posts: 2356 Credit: 16,377,575,319 RAC: 3,451,801 Level Scientific publications	Message 36857 - Posted: 17 May 2014 \| 8:46:09 UTC - in response to Message 36856.
	MJH: The v8.41 version of the application still has the occasional "The file exists. (0x50) - exit code 80 (0x50)" error, trashing loads of work :( Can you please invest some time to fix it? http://www.gpugrid.net/result.php?resultid=10318262 +2 http://www.gpugrid.net/result.php?resultid=10328606 http://www.gpugrid.net/result.php?resultid=10328572 These failed after a simple system restart.
	ID: 36857 \| Rating: 0 \| rate: / Reply Quote

Wdethomas Send message Joined: 6 Feb 10 Posts: 38 Credit: 274,204,838 RAC: 0 Level Scientific publications	Message 36984 - Posted: 1 Jun 2014 \| 20:03:13 UTC
	Every time the lights go out I lose all the units that are being worked on. If I restart the system using the proper procedures, no problem. This has been going on for months and I am really getting sick of it. Bought a UPS, now let's see.
	ID: 36984 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36985 - Posted: 2 Jun 2014 \| 3:51:56 UTC - in response to Message 36984.
	Is your error "The file exists. (0x50) - exit code 80 (0x50)"? If not, then create a new thread please. This thread is about that error.
	ID: 36985 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 37242 - Posted: 7 Jul 2014 \| 20:07:12 UTC Last modified: 7 Jul 2014 \| 20:09:57 UTC
	This is STILL an issue. When can we finally get it fully fixed? :( http://www.gpugrid.net/result.php?resultid=12800989 Outcome Computation error Client state Compute error Exit status 80 (0x50) Unknown error number Run time 32,087.62 Stderr output <core_client_version>7.4.8</core_client_version> <![CDATA[ <message> The file exists. (0x50) - exit code 80 (0x50) </message> <stderr_txt> ... ... ... # BOINC suspending at user request (exit) </stderr_txt> ]]> http://www.gpugrid.net/result.php?resultid=12796113 Outcome Computation error Client state Compute error Exit status 80 (0x50) Unknown error number Run time 2,221.71 Stderr output <core_client_version>7.4.8</core_client_version> <![CDATA[ <message> The file exists. (0x50) - exit code 80 (0x50) </message> <stderr_txt> ... ... ... # BOINC suspending at user request (exit) </stderr_txt> ]]>
	ID: 37242 \| Rating: 0 \| rate: / Reply Quote

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 37325 - Posted: 20 Jul 2014 \| 21:08:19 UTC - in response to Message 37242.
	Jacob, You are in luck. It's time for another round of GPUGRID development. Remind me, please, the circumstance under which this is occurring. Matt
	ID: 37325 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 37327 - Posted: 20 Jul 2014 \| 22:34:15 UTC
	I'm on the road, but will be home tonight. I'll try to re-review, probably tomorrow. Thanks!
	ID: 37327 \| Rating: 0 \| rate: / Reply Quote

TJ Send message Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0 Level Scientific publications	Message 37338 - Posted: 21 Jul 2014 \| 14:39:34 UTC - in response to Message 37325.
	Jacob, You are in luck. It's time for another round of GPUGRID development. Remind me, please, the circumstance under which this is occurring. Matt Hi Matt, I don't know if we need to make a new post for this, but I have a request. Is it possible, in the Stderr output file, to show only the temperature of the GPU that did the job? Now the temperature changes from every card are shown. Thank you. ____________ Greetings from TJ
	ID: 37338 \| Rating: 0 \| rate: / Reply Quote

GPUGRID Role account Send message Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0 Level Scientific publications	Message 37339 - Posted: 21 Jul 2014 \| 15:22:47 UTC - in response to Message 37338.
	Tricky - the GPU ordering from the temperature query interface doesn't correspond to the CUDA ordering.
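
One possible way around the ordering mismatch, sketched in Python under the assumption that both interfaces can report each card's PCI bus ID (the stderr output does print a PCI ID for each SWAN device). The function name is hypothetical, and the sensor-side ordering in the example is made up for illustration:

```python
def map_cuda_to_sensor(cuda_devices, sensor_devices):
    """Match devices across two orderings by PCI bus ID.

    cuda_devices:   dict of CUDA device index -> PCI ID string
    sensor_devices: dict of temperature-sensor index -> PCI ID string
    Returns a dict of CUDA index -> sensor index for every card seen by both.
    """
    # Index the sensor side by normalised PCI ID for direct lookup.
    sensor_by_pci = {pci.lower(): idx for idx, pci in sensor_devices.items()}
    mapping = {}
    for cuda_idx, pci in cuda_devices.items():
        sensor_idx = sensor_by_pci.get(pci.lower())
        if sensor_idx is not None:
            mapping[cuda_idx] = sensor_idx
    return mapping


if __name__ == "__main__":
    # CUDA-side PCI IDs as printed in the stderr output of this thread.
    cuda = {0: "0000:09:00.0", 1: "0000:08:00.0"}
    # Hypothetical sensor-side ordering (illustrative values only).
    sensors = {0: "0000:08:00.0", 1: "0000:09:00.0", 2: "0000:0A:00.0"}
    print(map_cuda_to_sensor(cuda, sensors))
```

With such a mapping, the application could log only the sensor that belongs to the CUDA device running the task.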
	ID: 37339 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 37353 - Posted: 22 Jul 2014 \| 13:21:25 UTC
	MJH: I've reviewed the notes in the thread. The main posts that detail the problem are: http://www.gpugrid.net/forum_thread.php?id=3621&nowrap=true#35348 http://www.gpugrid.net/forum_thread.php?id=3621&nowrap=true#37242 It is not easy to reproduce on demand. I suspect that your best bet is to investigate/walk the code, to find an area that could result in: <message> The file exists. (0x50) - exit code 80 (0x50) </message> It seems to happen more frequently when the task is suspended before BOINC is shut down, but suspending the task might not be a requirement of the bug. Testing should involve suspending BOINC, and then shutting BOINC down, and then starting BOINC back up. Also, to test the "power outage" scenario, I think testing could involve right clicking boincmgr.exe in Task Manager, and clicking "End process tree". I hope this helps. The focus should be on code areas that could result in that error message. Regards, Jacob
	ID: 37353 \| Rating: 0 \| rate: / Reply Quote

GPUGRID Role account Send message Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0 Level Scientific publications	Message 37357 - Posted: 22 Jul 2014 \| 15:00:47 UTC - in response to Message 37353.
	That exit circumstance is the failsafe exit that stops a WU getting stuck in an endless cycle of abort - resume, without making any progress. It should only trigger if the machine has been up for a few minutes (from which we infer that the WU crashed the machine). Matt
	ID: 37357 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 37359 - Posted: 22 Jul 2014 \| 15:28:51 UTC
	Perhaps you could give me even more clues on how to reproduce the error on demand? It seems that it is currently too stringent, causing otherwise-healthy tasks to fail when starting BOINC.
	ID: 37359 \| Rating: 0 \| rate: / Reply Quote

Vagelis Giannadakis Send message Joined: 5 May 13 Posts: 187 Credit: 349,254,454 RAC: 0 Level Scientific publications	Message 37361 - Posted: 22 Jul 2014 \| 15:46:46 UTC - in response to Message 37359.
	He said: "It should only trigger if the machine has been up for a few minutes." So you could try suspending/closing BOINC and then resuming it, both without shutting down the machine in between and with shutting it down.
	ID: 37361 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 37362 - Posted: 22 Jul 2014 \| 16:00:32 UTC Last modified: 22 Jul 2014 \| 16:04:45 UTC
	Matt, Could you please give me more details about that exit algorithm? Maybe even pseudocode or something, please? Details, like "If it restarts x times without saving a checkpoint" or "If it restarts x times during a computer-uptime-session" or "If it restarts x times during the course of the task", etc. ... Just so I can easily reproduce the issue on demand, and thus help you test/solve it.
	ID: 37362 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 37387 - Posted: 24 Jul 2014 \| 12:34:05 UTC - in response to Message 37362.
	Matt, Could you please give me more details about that exit algorithm? Maybe even pseudocode or something, please? Details, like "If it restarts x times without saving a checkpoint" or "If it restarts x times during a computer-uptime-session" or "If it restarts x times during the course of the task", etc. ... Just so I can easily reproduce the issue on demand, and thus help you test/solve it. I was able to get another task to error for that reason... so it is still possible, if enough testing is done. Again, could you provide details on the exit algorithm?
	ID: 37387 \| Rating: 0 \| rate: / Reply Quote

GPUGRID Role account Send message Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0 Level Scientific publications	Message 37388 - Posted: 24 Jul 2014 \| 13:09:35 UTC - in response to Message 37387.
	Jacob, When the simulation starts computing, ACEMD puts a file called "canary" in the slot directory, which it then removes the first time it writes a restart file set. When ACEMD is starting up, it looks for the "canary" file. If it finds it, that means the simulation aborted for some reason very soon after it started, before making significant progress. In this case, if the system has been booted for less than 10 minutes, we interpret this as meaning that the last instance of ACEMD crashed the machine, and so abort the WU as bad. Matt
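
The check described above can be sketched roughly as follows. This is a Python illustration only, not ACEMD's code: the function names, the way uptime is passed in, and the use of exit code 80 for the abort are assumptions drawn from the posts in this thread.

```python
import os
import tempfile

CANARY_NAME = "canary"          # file name given in the post
BOOT_WINDOW_SECONDS = 10 * 60   # "booted for less than 10 minutes"
EXIT_CODE_BAD_WU = 80           # exit code reported in this thread (0x50)


def startup_check(slot_dir, uptime_seconds):
    """Return an exit code if the WU should be aborted, otherwise None.

    uptime_seconds is supplied by the caller (assumption: the real
    application queries the operating system for it).
    """
    canary = os.path.join(slot_dir, CANARY_NAME)
    if os.path.exists(canary) and uptime_seconds < BOOT_WINDOW_SECONDS:
        # The last run never wrote restart files and the machine booted
        # recently: infer that the WU crashed the machine.
        return EXIT_CODE_BAD_WU
    # Otherwise place (or refresh) the canary and start computing.
    open(canary, "w").close()
    return None


def on_first_restart_write(slot_dir):
    """Remove the canary once the first restart file set has been written."""
    canary = os.path.join(slot_dir, CANARY_NAME)
    if os.path.exists(canary):
        os.remove(canary)


if __name__ == "__main__":
    slot = tempfile.mkdtemp()
    print(startup_check(slot, uptime_seconds=120))  # first start: canary placed
    print(startup_check(slot, uptime_seconds=120))  # canary + recent boot: abort
```

Note that the sketch cannot tell a crash from a normal BOINC exit: in both cases the canary is still present on the next start.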
	ID: 37388 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 37389 - Posted: 24 Jul 2014 \| 13:30:33 UTC - in response to Message 37388. Last modified: 24 Jul 2014 \| 13:35:42 UTC
	Alright.... So, it looks like the slot directory does get the canary file when the tasks are started within the session. And, by utilizing the <checkpoint_debug> flag in cc_config.xml, I believe I see the file being removed whenever the task's first checkpoint of the session is performed. So, I've tried closing BOINC (normally) about 2 seconds after startup, which leaves the canary files in my slot directories. But, upon starting BOINC, with those files in the directories, it does not fail the tasks. How can I get these tasks to easily fail on-demand? Is there more to the logic that decides when to fail them? EDIT: I just re-read your post... I see "if the system has been booted for less than 10 minutes".... hmm... Let me restart Windows, and perform the same test.
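
For reference, checkpoint logging is switched on through the log_flags section of cc_config.xml in the BOINC data directory. A minimal fragment, assuming no other options are set in the file:

```xml
<cc_config>
  <log_flags>
    <!-- Log a message each time an application completes a checkpoint -->
    <checkpoint_debug>1</checkpoint_debug>
  </log_flags>
</cc_config>
```

The client picks the flag up after a restart, or after being told to re-read its config files.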
	ID: 37389 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 37391 - Posted: 24 Jul 2014 \| 13:52:29 UTC Last modified: 24 Jul 2014 \| 13:54:05 UTC
	Hurray! I've been able to make all 3 of my tasks fail, essentially on-demand! All of them with error: "The file exists. (0x50) - exit code 80 (0x50)" ... This genuinely excites me! Here's what I did: - restarted my computer - monitored Task Manager's Performance tab on the CPU selection, to make sure "Up time" was less than 10 minutes - started BOINC - saw the canary files - exited BOINC - confirmed the canary files were still present - started BOINC again - ...and watched the tasks fail. Good thing I didn't mind failing them :) Next thing I'll do (later today if I find time) will be to test whether it is "must see canary on task start within 10 minutes of up-time" or "must see canary on task start within 10 minutes of logged-in time" Either way, though... This algorithm doesn't jive well. Are you able to make changes to it? Perhaps we could work together to develop a better algorithm that hopefully still accomplishes your goals, without killing tasks? Let me know, Thanks, Jacob
	ID: 37391 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 37392 - Posted: 24 Jul 2014 \| 14:36:00 UTC - in response to Message 37391. Last modified: 24 Jul 2014 \| 14:36:20 UTC
	Either way, though... This algorithm doesn't jive well. Are you able to make changes to it? Perhaps we could work together to develop a better algorithm that hopefully still accomplishes your goals, without killing tasks? It might be a matter of: 1) Removing the canary file on a normal shutdown of BOINC (this could solve the majority of the issues!) 2) Consider removing the 10-minute limit, since... Maybe the machine restarted, and had been sitting at a login screen for several hours, before user logged in to start BOINC Thoughts?
	ID: 37392 \| Rating: 0 \| rate: / Reply Quote

GPUGRID Role account Send message Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0 Level Scientific publications	Message 37393 - Posted: 24 Jul 2014 \| 16:09:11 UTC - in response to Message 37392.
	Jacob, Can you explain exactly the circumstances under which you are getting a false activation of the trap? It sounds to me something like: * You've stopped BOINC because you want the machine for something else. Some of the WUs have only just started running, and haven't reached their first checkpoint, so leave canary files. * You turn off the machine * Later, you turn it back on again and the WUs that had barely started are incorrectly assumed to have been defective and aborted. Is this really such a common occurrence? The window of vulnerability for a WU is pretty narrow - the interval between starting and first checkpoint should only be a few minutes. Anyway, you've hit on a reasonable improvement - to remove the canary if the tasks are responding to a suspend request from the client. Matt
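
The improvement agreed on here might look something like the sketch below. It is a Python illustration with a hypothetical handler name, and it assumes the application gets a chance to run cleanup code when the client asks it to suspend or exit:

```python
import os
import tempfile

CANARY_NAME = "canary"  # file name given earlier in the thread


def on_suspend_request(slot_dir):
    """Drop the canary before exiting in response to a client request.

    Returns True if a canary was removed, False if none was present.
    """
    canary = os.path.join(slot_dir, CANARY_NAME)
    try:
        os.remove(canary)
        return True
    except FileNotFoundError:
        return False


if __name__ == "__main__":
    slot = tempfile.mkdtemp()
    open(os.path.join(slot, CANARY_NAME), "w").close()
    print(on_suspend_request(slot))  # canary removed on graceful exit
    print(on_suspend_request(slot))  # nothing left to remove
```

A process that is killed outright never reaches such a handler, so the canary would still be left behind in that case.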
	ID: 37393 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 37394 - Posted: 24 Jul 2014 \| 16:28:09 UTC - in response to Message 37393. Last modified: 24 Jul 2014 \| 16:39:30 UTC
	Matt, I do all sorts of crazy fun stuff with my computer. Sometimes, I suspend BOINC, because I need the CPUs for something else. A lot of times, I actually close BOINC, because I want the CPUs and the memory, for my main game, iRacing. :) But I think the culprit scenario is likely a bit different. Here goes. The "triggering" scenario goes something like this: - I'm doing something that requires a restart. Maybe I'm installing new software. Go with that as the assumption. Let's say Windows Update required a restart, and I clicked OK to restart Windows. Canary files are not present, because tasks checkpointed before I clicked OK. - I restart, log in, and immediately pause or exit BOINC (bolded for emphasis as the condition that doesn't jive well with the current canary implementation), because I want resources available. Maybe I realized I have to update additional software, that I know will require a restart, and I want to make this installation go quicker. Or maybe I HAVE A RACE RIGHT NOW (and so, close BOINC, to give me resources for iRacing). So, BOINC gets closed. Canary files are present, because tasks started before I closed. Right? - So, later, I start BOINC. And then cry. Because all my GPUGrid work is lost. I have 3 GPUs, and all 3 tasks (which could have been up to 30 hours of work) are lost. I weep the tears of a thousand kernels, swept away in an erroneous exit condition. :) Personally, I think the exit condition might not be needed at all. Have you seen a reason to require it? I assume you want to keep it. If the tasks are responding to a suspend request from the client (ie: BOINC is closed normally, right? That's what you meant, right?), then... Yes, removing the canary file should solve the problem for my scenario above. 
It won't solve all the problems (as, I could kill BOINC in Task Manager, and then canary files would still be present, and also I think upgrading BOINC causes the tasks to be killed ungracefully), but it should solve the normal scenarios (normal shutdowns). Can you implement it? I'd love to test it.
	ID: 37394 \| Rating: 0 \| rate: / Reply Quote

Retvari Zoltan Send message Joined: 20 Jan 09 Posts: 2356 Credit: 16,377,575,319 RAC: 3,451,801 Level Scientific publications	Message 37395 - Posted: 24 Jul 2014 \| 18:39:11 UTC - in response to Message 37393. Last modified: 24 Jul 2014 \| 18:42:27 UTC
	I've noticed that GPUGrid tasks fail with the "file exists" error when I restart my PC immediately after a previous restart. I thought that I should wait until the workunits had made their first checkpoint to avoid this error, but I didn't think that it was a protective algorithm. Two (or more) fast system restarts are needed (for me) when the USB controllers on my motherboard become unusable in Windows XP after a Windows 7 session on that PC, and I have to physically switch off the power to the PC to fix it. Fast system restarts are also needed when updating different drivers / software in succession, or when fixing other hardware-related problems (for example: I have a PCIe ethernet controller card in this motherboard. At some point the ethernet card disappeared from Device Manager, so there was no network connectivity on this PC, which is crucial. I had to restart the PC several times, and make changes in the BIOS, to fix it). So this problem can be solved by making this protective algorithm complete: it should delete the canary file during a graceful shutdown. EDIT: an additional safety algorithm could be this: the workunit should abort itself when it's progressing very slowly (for example: if it couldn't finish in 5 days).
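
The slow-progress safeguard suggested in the EDIT could be sketched like this. It is a Python illustration only: the function names, the one-hour grace period, and the linear extrapolation from the fraction done are all assumptions, not anything the application is known to do.

```python
FIVE_DAYS_SECONDS = 5 * 24 * 3600   # limit taken from the example in the post
GRACE_PERIOD_SECONDS = 3600         # assumed: do not judge a task too early


def projected_total_seconds(elapsed_seconds, fraction_done):
    """Linearly extrapolate total run time; None if no progress yet."""
    if fraction_done <= 0.0:
        return None
    return elapsed_seconds / fraction_done


def should_abort_slow(elapsed_seconds, fraction_done,
                      limit_seconds=FIVE_DAYS_SECONDS,
                      grace_seconds=GRACE_PERIOD_SECONDS):
    """Return True if the task looks unable to finish within the limit."""
    if elapsed_seconds < grace_seconds:
        return False  # too early to draw a conclusion
    projected = projected_total_seconds(elapsed_seconds, fraction_done)
    if projected is None:
        return True   # no progress at all after the grace period
    return projected > limit_seconds


if __name__ == "__main__":
    print(should_abort_slow(7200, 0.5))    # healthy task
    print(should_abort_slow(7200, 0.001))  # would need far more than 5 days
```

Unlike the canary check, a rule of this kind looks at measured progress, so a quick restart on its own would not trigger it.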
	ID: 37395 \| Rating: 0 \| rate: / Reply Quote

skgiven Volunteer moderator Volunteer tester Send message Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level Scientific publications	Message 37396 - Posted: 24 Jul 2014 \| 21:26:22 UTC - in response to Message 37395.
	Despite having a primary SSD and secondary Boinc data drive on my main Win7 system, I still use a 30sec cc_config start delay, <options> <start_delay>30</start_delay> </options> After system installations or updates, followed by a system restart, there is still a bit to be done, so if Boinc immediately tries to start loading and running numerous tasks the WU's are competing for resources with each other and the system. If you restart within 30sec of a previous restart tasks might be forcibly shut down even before they start running never mind checkpoint. ____________ FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help
	ID: 37396 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 37418 - Posted: 26 Jul 2014 \| 3:51:05 UTC
	Any progress?
	ID: 37418 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 37585 - Posted: 16 Aug 2014 \| 3:53:51 UTC - in response to Message 37418.
	Matt, Has there been any progress on improving the canary-file detection? I almost got bitten by it again when I installed a round of Windows updates, logged into Windows (which launches BOINC), and immediately exited BOINC so I could install round 2 of updates. Good thing I remembered the canary issue, and remembered to wait until it deleted the files before closing BOINC. But closing BOINC normally should have deleted the canary files. Please fix this.
	ID: 37585 \| Rating: 0 \| rate: / Reply Quote

GPUGRID Role account Send message Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0 Level Scientific publications	Message 37588 - Posted: 16 Aug 2014 \| 8:55:21 UTC - in response to Message 37585.
	Jacob, It's on the todo list. It'll get done early September, after vacaciones. Matt
	ID: 37588 \| Rating: 0 \| rate: / Reply Quote

Beyond Send message Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 Level Scientific publications	Message 37867 - Posted: 9 Sep 2014 \| 16:12:57 UTC
	Sure hope this gets fixed. Updating my machines from 7.4.8 to 7.4.18, and carefully shutting down 7.4.8 before installing the new client, yielded 3 aborted GPUGrid WUs out of 7. This happens only with GPUGrid WUs; no other projects that I run (many) behave in this way.
	ID: 37867 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 38134 - Posted: 28 Sep 2014 \| 15:13:53 UTC - in response to Message 37588.
	"Jacob, It's on the todo list. It'll get done early September, after vacaciones. Matt" Early September? 2014?
	ID: 38134 \| Rating: 0 \| rate: / Reply Quote

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 38137 - Posted: 28 Sep 2014 \| 21:08:46 UTC - in response to Message 38134.
	coming with the 6.5 app under testing on beta now
	ID: 38137 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 38139 - Posted: 28 Sep 2014 \| 22:14:43 UTC - in response to Message 38137.
	Thank you. Are there minimum requirements for getting tasks on that beta app?
	ID: 38139 \| Rating: 0 \| rate: / Reply Quote
