Message boards : Graphics cards (GPUs) : continuus errors

veebee
Joined: 12 Oct 08
Posts: 12
Credit: 77,149,797
RAC: 0
Message 10955 - Posted: 2 Jul 2009 | 22:31:10 UTC

One of my machines (i7 920 with GTS 250) had been working fine until the last 48 hrs or so.

Every WU that DOES download errors out almost immediately, and then gives the message "output file xxxxxxxxxxxxx absent".
From what I can see, they are all "IBUCH" WUs.

Any ideas or help appreciated, as having that card out of action is affecting my output a bit.

Cheers
Veebee

MarkJ
Volunteer moderator
Volunteer tester
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Message 10957 - Posted: 3 Jul 2009 | 6:15:47 UTC - in response to Message 10955.

I believe this bug was fixed in 6.6.36; you need to update your BOINC to the current version (which also has a few fixes for CUDA).
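A quick way to check whether a client predates that release is to compare the dotted version numbers as integer tuples rather than as strings. This is only a minimal sketch: the function names are made up for illustration, and 6.6.36 is simply the version named above.

```python
def parse_version(text):
    """Turn a dotted version string such as '6.4.5' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_update(installed, fixed_in="6.6.36"):
    """True when the installed client is older than the release said to carry the fix."""
    # Tuples compare element by element, so (6, 4, 5) < (6, 6, 36),
    # and (6, 10, 0) is correctly treated as newer than (6, 6, 36).
    return parse_version(installed) < parse_version(fixed_in)


if __name__ == "__main__":
    for version in ("6.4.5", "6.6.36"):
        print(version, "needs update:", needs_update(version))
```

Comparing tuples avoids the string-comparison trap where "6.10.0" would sort before "6.6.36".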
____________
BOINC blog

Hydropower
Joined: 3 Apr 09
Posts: 70
Credit: 6,003,024
RAC: 0
Message 10958 - Posted: 3 Jul 2009 | 6:41:02 UTC - in response to Message 10955.
Last modified: 3 Jul 2009 | 6:44:29 UTC

Hi Mark, veebee. The BOINC client veebee uses, and has been using successfully, is 6.4.5; the error in the logs is "SIGSEGV: segmentation violation". Strange that this started only in the past 48 hours. I would think a new driver was installed, or another program that interferes with the client. I do see a clock rate which seems high to me: 1836000 kilohertz. Could it be that the card overheated?

Given that you are running Linux, were any of the file permissions changed for BOINC? Or any of the resource allocations?
____________
Join team Bletchley Park, the innovators.

veebee
Joined: 12 Oct 08
Posts: 12
Credit: 77,149,797
RAC: 0
Message 10963 - Posted: 4 Jul 2009 | 11:40:08 UTC - in response to Message 10958.

I may well have re-installed the NVIDIA driver, as the card seems to stop being recognised by BOINC whenever the machine restarts.

I have just tried going back to driver 180.51 and it is now running. Thanks for mentioning the new driver install (though it hasn't affected the other two machines, which is strange).

Thanks again guys

Veebee

ironcold
Joined: 22 May 08
Posts: 2
Credit: 1,507,793
RAC: 0
Message 10968 - Posted: 4 Jul 2009 | 22:16:28 UTC - in response to Message 10963.

Got the same problem here with the new driver (185.18.14). I'm running SuSE 11.1.

Wang Solutions
Joined: 23 Feb 09
Posts: 10
Credit: 20,238,048
RAC: 0
Message 10977 - Posted: 6 Jul 2009 | 5:20:06 UTC - in response to Message 10968.

I have also had continuous errors with the new drivers, but they have all been resolved by reverting to the 180.x drivers.
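To compare an installed driver against the versions reported in this thread, one option is to pull the version number out of whatever text the system reports and look it up. The sketch below is illustrative only: the function names are hypothetical, the sample line merely imitates the style of /proc/driver/nvidia/version on Linux (its exact layout is an assumption), and the "working" and "failing" labels reflect nothing more than the Linux reports posted above.

```python
import re

# Assumption: these labels come only from the Linux reports in this thread
# (180.51 / "180.x" working, 185.18.14 failing); they are not an official list.
THREAD_REPORTS = {180: "reported working", 185: "reported failing"}


def driver_version(text):
    """Return the first token shaped like an NVIDIA driver version, or None."""
    match = re.search(r"\b(\d{3}\.\d+(?:\.\d+)?)\b", text)
    return match.group(1) if match else None


def thread_report(version):
    """Look up the driver's major version among the reports in this thread."""
    major = int(version.split(".")[0])
    return THREAD_REPORTS.get(major, "no Linux report in this thread")


if __name__ == "__main__":
    # Illustrative line only; the real file contents may differ.
    sample = "NVRM version: NVIDIA UNIX x86_64 Kernel Module  185.18.14  Wed May 27 01:23:47 PDT 2009"
    version = driver_version(sample)
    print(version, "->", thread_report(version))
```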
____________
Proud member of BOINC@AUSTRALIA

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 11048 - Posted: 8 Jul 2009 | 20:37:41 UTC

BTW: 1.83 GHz is standard for 9800GTX / GTS 250.

MrS
____________
Scanning for our furry friends since Jan 2002

Zebra3
Joined: 2 Jul 09
Posts: 1
Credit: 276,594
RAC: 0
Message 11155 - Posted: 16 Jul 2009 | 11:25:58 UTC

Name: p290000-IBUCH_36_pYEEI_com_0907-2-3-RND9481_0
Workunit: 626489
Created: 14 Jul 2009 23:41:14 UTC
Sent: 15 Jul 2009 11:10:22 UTC
Received: 16 Jul 2009 5:28:35 UTC
Server state: Over
Outcome: Client error
Client state: Compute error
Exit status: 1 (0x1)
Computer ID: 43807
Report deadline: 20 Jul 2009 11:10:22 UTC
CPU time: 4910.375
stderr out:

<core_client_version>6.6.36</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 9600 GSO"
# Clock rate: 1458000 kilohertz
# Total amount of global memory: 804978688 bytes
# Number of multiprocessors: 12
# Number of cores: 96
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 0
# Device 0: "GeForce 9600 GSO"
# Clock rate: 1458000 kilohertz
# Total amount of global memory: 804978688 bytes
# Number of multiprocessors: 12
# Number of cores: 96

</stderr_txt>
]]>

Validate state: Invalid
Claimed credit: 3977.21064814815
Granted credit: 0
Application version: 6.64
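
For anyone comparing several failed tasks, the stderr text above can be summarised with a few regular expressions. This is a rough sketch, not part of BOINC or the project application: the function name is hypothetical, and treating each repeated "# Device" header as one application start is an assumption based only on the output shown here.

```python
import re

# Sample copied from the stderr output quoted above.
SAMPLE = '''# Using CUDA device 0
# Device 0: "GeForce 9600 GSO"
# Clock rate: 1458000 kilohertz
# Total amount of global memory: 804978688 bytes
# Number of multiprocessors: 12
# Number of cores: 96
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 0
# Device 0: "GeForce 9600 GSO"
# Clock rate: 1458000 kilohertz
# Total amount of global memory: 804978688 bytes
# Number of multiprocessors: 12
# Number of cores: 96
'''


def summarise_stderr(stderr_text):
    """Collect the device name, clock rate in GHz, and error lines from app stderr."""
    devices = re.findall(r'# Device \d+: "([^"]+)"', stderr_text)
    clocks_khz = re.findall(r"# Clock rate: (\d+) kilohertz", stderr_text)
    errors = [line.strip() for line in stderr_text.splitlines() if "ERROR" in line]
    return {
        "device": devices[0] if devices else None,
        # Assumption: one device header is printed each time the app starts.
        "starts": len(devices),
        # 1 GHz = 1,000,000 kHz
        "clock_ghz": int(clocks_khz[0]) / 1_000_000 if clocks_khz else None,
        "errors": errors,
    }


summary = summarise_stderr(SAMPLE)

if __name__ == "__main__":
    for key, value in summary.items():
        print(key, "=", value)
```

The same conversion turns the 1836000 kilohertz mentioned earlier in the thread into 1.836 GHz.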

Thanks, guys, for the option to crunch on my GPU here when my other project is down, but that's all for me. I had this WU run almost to completion on my C2D and it borked. I had another 4 go bad on my Quad, with one running right now whose outcome I don't know yet. If this one errors, I won't be returning to waste my GPU time. Nothing has changed in my hardware, so I am presuming it's on your end. I had the 186.18 drivers installed when I did my previous successful WUs, so I can't even blame that. Cheers