Message boards : Graphics cards (GPUs) : IBUCH_TRYP WU Errors

K1atOdessa
Joined: 25 Feb 08
Posts: 249
Credit: 392,702,681
RAC: 1,417,376
Message 13561 - Posted: 15 Nov 2009 | 11:52:39 UTC

Is there something particularly different about the IBUCH_TRYP WU's? I have had a 100% error rate on these WU's over the past several days, while I have had very few issues with other types of WU's.

It seems odd that this affects one type of WU more than others. Historically I do get an error every now and then, but the past week has been terrible. I removed all overclocking, but that hasn't helped with the IBUCH_TRYP WU's. Any ideas? I'd prefer not to have to abort all of these just because they do not appear to work on my system (2x 8800GT, 1x 9500GT).

Checking the WU's, it appears others have had issues with a lot of those same WU's as well. Some have finished successfully, but there are errors on 8800/9800 GT's and GTX2xx cards (though the GTX2xx cards seem to fare a little better on these WU's).

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 8,894,687,378
RAC: 19,752,053
Message 13563 - Posted: 15 Nov 2009 | 12:12:02 UTC

Now you come to mention it, four of the seven errors currently visible across my three cards are IBUCH_TRYP, though all are different sub-types (kickout, kickout1, kickin and reverse). The reverse one was particularly irritating: it ran for 52,500 seconds before erroring, whereas most of the others have been relatively quick and painless.

On the other hand, I can see three successful runs too (repro, kickin and 169).

All my cards are 9800GT variants from the same manufacturer: two have both successes and failures, so no obvious explanation there.

K1atOdessa
Message 13568 - Posted: 15 Nov 2009 | 23:45:31 UTC

Earlier this afternoon I had all WU's error out (at least those running on the 9500GT). No change in my system, no new drivers, no new BOINC, no OC'ing. Temps are lower than normal given we're getting into fall. Not sure what happened, but I rebooted. Now I have 3 WU's in process and no errors after about an hour (all the errors today were after a few seconds). There are two IBUCH_TRYP WU's in the queue, so we'll see if it was just some weird thing with my system that gave much higher than normal error rates. Hopefully that is the case, though I don't have an explanation.

Richard Haselgrove
Message 13613 - Posted: 19 Nov 2009 | 11:08:41 UTC

Looking at a fuller log, six out of my last 13 errors have been IBUCH_TRYP, but OTTO_HERG are almost as bad: 4 out of 13. I've had errors with OTTO_HERG4, 5, 7 and 8.
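
A tally like this is easy to script. The sketch below is only an illustration: it assumes the WU type is the first run of capital letters joined by an underscore in a task name (true for a name such as D370-TONI_HERGdof5-1-40-RND4152_0, but not a documented naming rule), and the function names are made up. The sample list just restates the counts given above, with the remaining 3 of 13 errors lumped together as "other".

```python
import re
from collections import Counter


def wu_type(task_name):
    """Return the first UPPER_UPPER token in a task name, or None."""
    match = re.search(r"[A-Z]+_[A-Z]+", task_name)
    return match.group(0) if match else None


def error_share(error_types):
    """Map each WU type to (count, fraction of all errors)."""
    counts = Counter(error_types)
    total = sum(counts.values())
    return {wu: (n, n / total) for wu, n in counts.items()}


# Counts from the post: 6 IBUCH_TRYP, 4 OTTO_HERG, 3 of other types.
errors = ["IBUCH_TRYP"] * 6 + ["OTTO_HERG"] * 4 + ["other"] * 3

shares = error_share(errors)
for wu, (n, frac) in sorted(shares.items(), key=lambda kv: -kv[1][0]):
    print(f"{wu}: {n} of {len(errors)} ({frac:.0%})")
```

Feeding it real task names would mean calling wu_type() on each failed task first and passing the results to error_share().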

Richard Haselgrove
Message 13619 - Posted: 19 Nov 2009 | 15:10:39 UTC

It's particularly annoying when it happens almost 80% of the way through a long task: compare the last two tasks for host 43404.

Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 13621 - Posted: 19 Nov 2009 | 17:42:24 UTC

The IBUCH_TRYP WUs are failing on lesser cards and succeeding on the GTX 260 and above.
Here are 5 that failed on multiple "below 260" cards and then succeeded on a GTX 260 or higher:

http://www.gpugrid.net/workunit.php?wuid=947203
http://www.gpugrid.net/workunit.php?wuid=939689
http://www.gpugrid.net/workunit.php?wuid=936708
http://www.gpugrid.net/workunit.php?wuid=939220
http://www.gpugrid.net/workunit.php?wuid=941652

It also happens on some other WU types but not as often:

http://www.gpugrid.net/workunit.php?wuid=934230
http://www.gpugrid.net/workunit.php?wuid=939398

In general there seems to be a growing trend toward WUs not running on sub-GTX 260 GPUs. A very bad trend IMO.

Beyond
Message 13641 - Posted: 22 Nov 2009 | 0:14:10 UTC - in response to Message 13621.
Last modified: 22 Nov 2009 | 0:14:57 UTC

Another WU that's failed on 4 machines, including a GTX 295:

http://www.gpugrid.net/workunit.php?wuid=956237

Beyond
Message 13665 - Posted: 23 Nov 2009 | 18:53:57 UTC - in response to Message 13641.

Another WU that's failed on 4 machines, including a GTX 295:

http://www.gpugrid.net/workunit.php?wuid=956237

This one too was finally successfully completed by a GTX 280.

Andrew
Joined: 9 Dec 08
Posts: 29
Credit: 18,754,468
RAC: 0
Message 13703 - Posted: 26 Nov 2009 | 10:52:54 UTC

Cheers for this thread guys.

I just looked at my previous 5 failures, which happened mysteriously with no OC: they were all on either HERG or TRYP tasks. Each of these tasks also had another 8800GT/9800GT fail on it before being successfully completed by a GTX 260.

Is there a way for the server to detect the graphics card model and send different types of task? I don't want to be wasting resources on tasks which are going to fail.
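
Purely as a sketch of what such a filter could look like (this is not how the GPUGRID scheduler works; the table, the function name and the per-batch requirement are all hypothetical): the server could map the reported card model to its CUDA compute capability and hold back batches that need a GT200-class GPU. G92/G96-based cards such as the 8800 GT, 9800 GT, 9500 GT and GTS 250 report compute capability 1.1, while the GTX 260/280/295 report 1.3.

```python
# Compute capability (major, minor) of cards mentioned in this thread.
# The exact model strings are assumed to follow the "GeForce ..." pattern.
COMPUTE_CAPABILITY = {
    "GeForce 9500 GT": (1, 1),
    "GeForce 8800 GT": (1, 1),
    "GeForce 8800 GTS 512": (1, 1),
    "GeForce 9800 GT": (1, 1),
    "GeForce GTS 250": (1, 1),
    "GeForce GTX 260": (1, 3),
    "GeForce GTX 280": (1, 3),
    "GeForce GTX 295": (1, 3),
}

# Hypothetical requirement per batch; the real requirements are not
# stated anywhere in this thread.
MIN_FOR_BATCH = {"TONI_HERG": (1, 3), "other": (1, 1)}


def can_send(card_model, min_capability):
    """Return True if the card is known and meets the minimum capability."""
    capability = COMPUTE_CAPABILITY.get(card_model)
    if capability is None:
        return False  # assumed policy: never send to an unknown card
    return capability >= min_capability  # tuples compare major, then minor


for card in ("GeForce GTS 250", "GeForce GTX 260"):
    ok = can_send(card, MIN_FOR_BATCH["TONI_HERG"])
    print(f"{card}: {'send' if ok else 'hold back'} TONI_HERG")
```

Comparing (major, minor) tuples keeps the check correct for any later capability numbers without string parsing.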

K1atOdessa
Message 13704 - Posted: 26 Nov 2009 | 13:13:27 UTC - in response to Message 13703.

Is there a way for the server to detect the graphics card model and send different types of task? I don't want to be wasting resources on tasks which are going to fail.


That would be nice.

Beyond
Message 13712 - Posted: 28 Nov 2009 | 3:08:20 UTC - in response to Message 13704.

Is there a way for the server to detect the graphics card model and send different types of task? I don't want to be wasting resources on tasks which are going to fail.

That would be nice.

Now the new TONI_HERG WUs are failing on the sub GTX 260 cards :-(

MarkJ
Volunteer moderator
Volunteer tester
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Message 13713 - Posted: 28 Nov 2009 | 4:19:34 UTC - in response to Message 13712.

Is there a way for the server to detect the graphics card model and send different types of task? I don't want to be wasting resources on tasks which are going to fail.

That would be nice.

Now the new TONI_HERG WUs are failing on the sub GTX 260 cards :-(


They are suggesting a G200-based card as the minimum here these days, so it's quite likely that none of them will run properly unless you have a GTX2xx card.

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 13750 - Posted: 1 Dec 2009 | 18:15:36 UTC - in response to Message 13713.

My GTX260 is doing fine. The other cards are not faring so well. Both compute capability 1.1 cards are struggling: one is a GTS250 and the other an 8800 GTS 512. The problems seem to have accelerated around 28th Nov for some reason. Prior to that I was getting about a 25% failure rate, with most tasks failing early on. Now they are failing at any stage, often after about 10h of work!

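Task 1564214, workunit 979457
Sent 27 Nov 2009 19:22:11 UTC, reported 28 Nov 2009 14:48:35 UTC
Error while computing; run time 34,574.43 s, CPU time 1,953.66 s, claimed credit 4,503.74

Task 1573342, workunit 985406
Sent 30 Nov 2009 2:15:52 UTC, reported 1 Dec 2009 0:47:42 UTC
Error while computing; run time 39,950.28 s, CPU time 2,110.41 s, claimed credit 4,503.74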


Name D370-TONI_HERGdof5-1-40-RND4152_0
Workunit 979457
Created 27 Nov 2009 18:35:19 UTC
Sent 27 Nov 2009 19:22:11 UTC
Received 28 Nov 2009 14:48:35 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 51279
Report deadline 2 Dec 2009 19:22:11 UTC
Run time 34574.428879
CPU time 1953.663
stderr out

<core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTS 250"
# Clock rate: 1.85 GHz
# Total amount of global memory: 1073741824 bytes
# Number of multiprocessors: 16
# Number of cores: 128
MDIO ERROR: cannot open file "restart.coor"
Cuda error: Kernel [pme_fill_charges_accumulate] failed in file 'fillcharges.cu' in line 73 : unknown error.

</stderr_txt>
]]>

Validate state Invalid
Claimed credit 4503.73958333333
Granted credit 0
application version Full-atom molecular dynamics v6.71 (cuda)
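
For anyone collecting these reports, the two useful facts in a stderr block like the one above are the card name and the kernel that failed. A minimal sketch that pulls them out with regular expressions, assuming the line formats shown above stay the same (the function name and the returned dictionary layout are made up for illustration):

```python
import re

# Matches lines such as:  # Device 0: "GeForce GTS 250"
DEVICE_RE = re.compile(r'^# Device \d+: "([^"]+)"', re.MULTILINE)

# Matches the kernel failure line printed by the application.
KERNEL_RE = re.compile(
    r"Cuda error: Kernel \[(\w+)\] failed in file '([^']+)' in line (\d+)"
)


def summarize_stderr(stderr_text):
    """Extract device names and the first failed kernel from stderr text."""
    devices = DEVICE_RE.findall(stderr_text)
    failure = KERNEL_RE.search(stderr_text)
    if failure is None:
        return {"devices": devices, "kernel": None, "file": None, "line": None}
    kernel, source_file, line = failure.groups()
    return {
        "devices": devices,
        "kernel": kernel,
        "file": source_file,
        "line": int(line),
    }


# Lines copied from the stderr output quoted above.
sample = '''# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTS 250"
MDIO ERROR: cannot open file "restart.coor"
Cuda error: Kernel [pme_fill_charges_accumulate] failed in file 'fillcharges.cu' in line 73 : unknown error.
'''

summary = summarize_stderr(sample)
print(summary)
```

Grouping the summaries by card and kernel would show quickly whether the same kernel is failing on every sub-GTX 260 card.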

Beyond
Message 13778 - Posted: 4 Dec 2009 | 0:53:22 UTC - in response to Message 13713.

Is there a way for the server to detect the graphics card model and send different types of task? I don't want to be wasting resources on tasks which are going to fail.

That would be nice.

Now the new TONI_HERG WUs are failing on the sub GTX 260 cards :-(

They are suggesting a G200-based card as the minimum here these days, so it's quite likely that none of them will run properly unless you have a GTX2xx card.

Actually on the front page they say this:

Graphics card:

* (one or more)Recommended: Geforce GTX 250-275-280-285-295, Tesla10

What the heck is a GTX 250? Are they now not supporting the GTX 260?
Any information would sure be useful...


skgiven
Message 13779 - Posted: 4 Dec 2009 | 1:51:52 UTC - in response to Message 13778.
Last modified: 4 Dec 2009 | 1:55:10 UTC

The Home page has a mistake, and they have been told about it.
The same mistake has been made many times by many people, including me.

There is a GTX260 and a GTS250, but no GTX250!

The GTX260 cards can have either 192 or 216 shaders. Usually a card with 216 shaders will have 216sp on the box.

The GTS250 does not use a G200 core.

Beyond
Message 13783 - Posted: 4 Dec 2009 | 20:03:12 UTC - in response to Message 13779.

The Home page has a mistake, and they have been told about it.
The same mistake has been made many times by many people, including me.

There is a GTX260 and a GTS250, but no GTX250!

The GTX260 cards can have either 192 or 216 shaders. Usually a card with 216 shaders will have 216sp on the box.

The GTS250 does not use a G200 core.

Exactly, so why don't they bother to correct the main page? That was my point :-)
So is the GTX 260 on the approved list or not? Mine runs all WUs fine.


skgiven
Message 13787 - Posted: 5 Dec 2009 | 0:03:13 UTC - in response to Message 13783.
Last modified: 5 Dec 2009 | 0:11:49 UTC

The GTX260 216sp is on My List!
My one works perfectly - 100% success during the last 2 weeks running 24/7.

It's a Palit GTX260 216sp, with 2 fans, and it sits at about 62 degrees C.
They have GT200 55nm cores.

People should note that the GTS 250 uses a G92 core which is 65nm.
The one I have is struggling at the minute, with about a 25% failure rate. But I will keep it attached for now.

I had to stop trying to support the project with my 8800 GTS 512, which was not far off the performance of the GTS 250 - until recently that is, when most tasks started to fail.

It might go towards the deposit for a G300 card, if they ever turn up!

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 13789 - Posted: 5 Dec 2009 | 0:49:03 UTC - in response to Message 13787.

We have corrected the main page for the 250.

gdf

skgiven
Message 13796 - Posted: 5 Dec 2009 | 14:29:20 UTC - in response to Message 13789.

Thanks,

Beyond
Message 13800 - Posted: 5 Dec 2009 | 19:57:25 UTC - in response to Message 13787.

The GTX260 216sp is on My List!
My one works perfectly - 100% success during the last 2 weeks running 24/7.

It's a Palit GTX260 216sp, with 2 fans, and it sits at about 62 degrees C.
They have GT200 55nm cores.

People should note that the GTS 250 uses a G92 core which is 65nm.
The one I have is struggling at the minute, with about a 25% failure rate. But I will keep it attached for now.

I had to stop trying to support the project with my 8800 GTS 512, which was not far off the performance of the GTS 250 - until recently that is, when most tasks started to fail.

It might go towards the deposit for a G300 card, if they ever turn up!

My GTX 260 works perfectly too, but it's not on THEIR list. My G92-based cards work for most but not all types of WUs. When the 5 NVidia cards I have stop working here, I'm gone. There doesn't seem to be much of an effort to keep the project running on hardware that ran fine up until the last month.

The ATI initiative is going in the wrong direction too IMO. They're only supporting a VERY few top-end cards. If OpenCL is so limited, why not use CAL, which is used effectively by much smaller projects? So 2 codebases would have to be supported, big deal. Smaller projects support many more. I've been transitioning toward ATI cards (the 40nm HD 4770, for its high energy efficiency and double precision support), but they probably won't work here, so I'll go/stay somewhere they will, like MilkyWay and Collatz. Simple as that, no hard feelings...

skgiven
Message 13838 - Posted: 8 Dec 2009 | 18:29:36 UTC - in response to Message 13800.

My GTX260 will be here for a while and my GTS250 is hanging in there for now.

I would have liked to be able to add my ATI 4850 to the project, but I can't see that happening. For now it works away on Folding@home tasks.
The 8800 GTS 512 is now also on Folding@home, as it did not seem to like too many of the recent GPUGrid tasks and got to the point where it was sitting idle.

When I looked into it, TONI_HERG was the main culprit for my GTS250 and the 8800, but the 8800 was also failing other tasks, IBUCH and some GIANNI, that the GTS250 was getting through. Fortunately there are other tasks now that are keeping my GTS productive.

canardo
Joined: 11 Feb 09
Posts: 4
Credit: 8,675,472
RAC: 0
Message 14139 - Posted: 6 Jan 2010 | 8:15:18 UTC - in response to Message 13789.

Just finished an IBUCH .... erev
http://www.gpugrid.net/result.php?resultid=1707999
on a 250
http://www.gpugrid.net/show_host_detail.php?hostid=26091 .
Looks like you found a way around the bug. Congrats.
Now TONI_HERG.
Ciao