I have tried to run the Nvidia client several times.
I have 1 x 8800 GTX and 1 x 8800 GT, non-SLI (obviously),
running under Vista 64-bit with the latest WHQL drivers.
I have tried several BOINC Windows clients.
The first time, the WU ran for several hours and then errored in the final few percent. Now every WU I run errors out.
How do I get this to work?
Seti@Home runs great with CUDA, so why not GPUGrid?
This is an example of what is posted against the WU:
<core_client_version>6.6.12</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 8800 GTX"
# Clock rate: 1350000 kilohertz
# Total amount of global memory: 805306368 bytes
# Number of multiprocessors: 16
# Number of cores: 128
# Device 1: "GeForce 8800 GT"
# Clock rate: 1500000 kilohertz
# Total amount of global memory: 268435456 bytes
# Number of multiprocessors: 14
# Number of cores: 112
Cuda error in file 'nonbonded.cu' in line 189 : invalid device symbol.
</stderr_txt>
]]> |
|
|
|
GPU FAQ: Overview of cards that run Cuda 2.0 compiled applications
The 8800GTX is not supported by GPUGRID - the 8800GT is supported...
It seems BOINC switched between the two cards during computation of the task and that's probably why it errored out...
____________
pixelicious.at - my little photoblog |
|
|
uBronan
Joined: 1 Feb 09 Posts: 139 Credit: 575,023 RAC: 0
|
Well, I have the same error and I only have one card, so that's not it.
Until now I never had an error with the CUDA applications, but today I got my first ever.
<core_client_version>6.5.0</core_client_version>
<![CDATA[
<message>
Onjuiste functie. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 9600 GT"
# Clock rate: 1800000 kilohertz
# Total amount of global memory: 536543232 bytes
# Number of multiprocessors: 8
# Number of cores: 64
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 0
# Device 0: "GeForce 9600 GT"
# Clock rate: 1800000 kilohertz
# Total amount of global memory: 536543232 bytes
# Number of multiprocessors: 8
# Number of cores: 64
</stderr_txt>
]]>
|
|
|
|
Actually it's not the same error. ;)
Yours is also "Incorrect function (0x1)", exit code 1, but in Fatbob's stderr.out there's also:
Cuda error in file 'nonbonded.cu' in line 189 : invalid device symbol.
____________
pixelicious.at - my little photoblog |
|
|
|
Same problem on my new GTX260-216.
<core_client_version>6.4.7</core_client_version>
<![CDATA[
<message>
Unzulässige Funktion. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce GTX 260"
# Clock rate: 1350000 kilohertz
# Total amount of global memory: 939196416 bytes
# Number of multiprocessors: 27
# Number of cores: 216
MDIO ERROR: cannot open file "restart.coor"
</stderr_txt>
]]>
Boinc-Client: 6.4.7
OS: MS Windows XP Pro/32Bit, SP3 (05.01.2600.00)
Coprocessor: Gainward GeForce GTX260-216/895MB (620MHz Core, 1242MHz Shader Clock, 896MB 2200MHz GDDR3 Memory)
Nvidia driver: 182.08
ALL WUs crashed on this machine. On the others, with GTX 280s, there is currently no problem. The machine has no problems when playing games.
____________
|
|
|
uBronan
Joined: 1 Feb 09 Posts: 139 Credit: 575,023 RAC: 0
|
OK, agreed, but it seems that now every unit ends in an error right after it starts. Does anyone have any idea what is going on or how to fix it?!
<core_client_version>6.5.0</core_client_version>
<![CDATA[
<message>
Onjuiste functie. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 9600 GT"
# Clock rate: 1800000 kilohertz
# Total amount of global memory: 536543232 bytes
# Number of multiprocessors: 8
# Number of cores: 64
MDIO ERROR: cannot open file "restart.coor"
</stderr_txt>
]]>
|
|
|
|
Reboot? And/or power the machine off and remove the power cord for >10 min.
Your computers are hidden, so I can't check it myself. Did you post the entire error message or just part of it? The line "Onjuiste functie. (0x1) - exit code 1 (0x1)" is the general error category and doesn't tell us what's happening. For example, in fatbob's case "Cuda error in file 'nonbonded.cu' in line 189 : invalid device symbol." was the actual error message.
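To make the distinction between the general error category and the actual error message concrete, here is a minimal sketch in Python. It is not part of BOINC or GPUGRID; the function name `specific_errors` and the pattern list are illustrative assumptions built only from lines that appear in this thread's task output.

```python
import re
from typing import List

# Specific error lines seen in this thread's task output. Illustrative only,
# not an official catalogue of GPUGRID or CUDA error messages.
SPECIFIC_ERROR_PATTERNS = [
    re.compile(r"^Cuda error in file '[^']+' in line \d+ : .+$"),
    re.compile(r"^SIGSEGV: .+$"),
]

# Reported in this thread as normal at the start of a fresh WU.
BENIGN_LINES = {'MDIO ERROR: cannot open file "restart.coor"'}


def specific_errors(stderr_txt: str) -> List[str]:
    """Return the stderr lines that carry the actual error message."""
    found = []
    for raw in stderr_txt.splitlines():
        line = raw.strip()
        # Skip blanks, device-info comments and the benign restart notice.
        if not line or line.startswith("#") or line in BENIGN_LINES:
            continue
        if any(p.match(line) for p in SPECIFIC_ERROR_PATTERNS):
            found.append(line)
    return found


sample = """# Using CUDA device 0
# Device 0: "GeForce 8800 GTX"
Cuda error in file 'nonbonded.cu' in line 189 : invalid device symbol.
"""
print(specific_errors(sample))
```

An empty result means the output holds only the generic "exit code 1 (0x1)" category, which by itself gives no clue about the cause.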
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
uBronan
Joined: 1 Feb 09 Posts: 139 Credit: 575,023 RAC: 0
|
Changed the setting back to show my computers :)
I'll try to reboot and see what happens. |
|
|
|
OK, you did post all relevant information and there's nothing else in the task output. But it was worth taking a look anyway.
MrS
|
|
uBronan
Joined: 1 Feb 09 Posts: 139 Credit: 575,023 RAC: 0
|
I guess Stefan is right. I think that somehow it tried to switch to the nonbonded device; it's probably like switching between different computers.
Meanwhile I took the advice, rebooted, and shut down my machine for a few minutes to see if that solves the issue.
If so, we must reboot once every few days, probably because of memory leaks in the applications.
But which one is not clear to me: it could be both, or it could be the combination of GPUGrid and SETI.
Strangely, I haven't had any problems with any previous units and never needed a reboot, other than about 2 months ago; my machine normally runs 24/7.
Thanks for trying to help anyway, but I fear it's out of our hands now.
PS: sadly no solution, the newly received unit crashed within 30 seconds.
So it's something other than my machine or BOINC, since all other projects run without errors. |
|
|
|
Normally we don't need to reboot for GPU-Grid, it runs just fine. But if *something* happened and WUs error out in a row, it could be that the PC went into some strange state (which the reboot / power off would solve).
uBronan wrote: I guess Stefan is right i think that somehow it tried to switch to the nonbonded device, its probably like switching between different computers
Sorry to tell you, but you're on the completely wrong track here. The error message involving the file "nonbonded.cu" appeared because fatbob's BOINC tried to run GPU-Grid on a card which does not support the features it needs, i.e. the card cannot recognize some commands which the GPU-Grid team put into nonbonded.cu. This is not something you could trigger (even if you wanted to), or which just happens; it only happens if you use *incapable* hardware. That's why your errors are completely different from fatbob's, except for the general error code 0x1.
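The "incapable hardware" point can be sketched as a capability check. This is a rough Python illustration, not how BOINC or the GPU-Grid application actually selects devices. The compute capability values are the published ones for these two cards (8800 GTX: 1.0, 8800 GT: 1.1); the minimum of 1.1 is an assumption inferred from the statement earlier in this thread that the 8800 GTX is unsupported while the 8800 GT is supported, and `usable_devices` is a hypothetical helper.

```python
from typing import Dict, List, Tuple

# Compute capability (major, minor) of the two cards in fatbob's machine.
COMPUTE_CAPABILITY: Dict[str, Tuple[int, int]] = {
    "GeForce 8800 GTX": (1, 0),
    "GeForce 8800 GT": (1, 1),
}

# Assumption: the application needs at least compute capability 1.1.
MIN_REQUIRED: Tuple[int, int] = (1, 1)


def usable_devices(devices: List[str],
                   minimum: Tuple[int, int] = MIN_REQUIRED) -> List[int]:
    """Return the indices of devices that meet the minimum capability."""
    return [
        index
        for index, name in enumerate(devices)
        # Unknown cards are treated as incapable in this sketch.
        if COMPUTE_CAPABILITY.get(name, (0, 0)) >= minimum
    ]


# Device order as listed in fatbob's stderr output.
print(usable_devices(["GeForce 8800 GTX", "GeForce 8800 GT"]))
```

Under these assumptions only device 1 qualifies, while the failing task reported "Using CUDA device 0".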
MrS
|
|
uBronan
Joined: 1 Feb 09 Posts: 139 Credit: 575,023 RAC: 0
|
Lol, you misunderstood me. I probably stink at English, but you're saying exactly what I meant.
As for my personal problem with GPUGrid, I did something nasty :(
I downclocked my GPU speeds drastically to see what happens with the new units; now my perfect record of error-free runs is over.
I think the admin from GPUGrid could not stand me being error-free ;)
Anyway, no clue why it keeps crashing. If it crashes again, I'll leave for a while for well-running projects until the problems are solved.
I don't want to spend another 24 hours and then get nothing because it errors out again. |
|
|
uBronan
Joined: 1 Feb 09 Posts: 139 Credit: 575,023 RAC: 0
|
Not sure if anybody wants to know, but when I had both SETI and GPUGrid active as CUDA projects, GPUGrid crashed.
And only 3 cores out of 4 were active in BOINC.
I now have only GPUGrid active on CUDA and it seems to run without errors.
So let's see if it stays running. |
|
|
|
Did they try to run at the same time on your single card, or was it just that you had both projects activated and BOINC would switch between them normally?
In that case we can say for sure that seti is not leaving the machine in a "clean" state.
MrS
|
|
uBronan
Joined: 1 Feb 09 Posts: 139 Credit: 575,023 RAC: 0
|
MrS wrote: Did they try to run at the same time on your single card or was it just that you had projects activated and BOINC would switch between them normally?
In that case we can say for sure that seti is not leaving the machine in a "clean" state.
MrS
Well, after testing and removing SETI from my projects (also because I am not very happy with that project (probably fake results)),
I must admit that GPUGrid runs nicely again, although I also changed some hardware settings.
First of all, I slowed down my DRAM. It was set to run at 4-4-4-12 CR1, but I feared that was too fast, so I switched it back to CR2; I am not sure this was needed.
Second, I downclocked my video card to below the default OC settings of this card and now run it at 100% stock speeds for the 9600 GT model.
By default the card was overclocked by EVGA to 675 MHz.
I also saw that sometimes 2 or 3 SETI CUDA units ran simultaneously with GPUGrid, but for a while this gave NO errors.
After all, I have 4 cores ;)
After I reset SETI CUDA and installed the KWSN optimized app, it ran 1 CUDA unit together with GPUGrid, also without any errors.
But then suddenly all the errors came. I cleaned everything up and set it back to low/standard settings, and now the project runs "normally" (slowly) again.
It finishes units like it should, but I am not sure whether all these changes were needed, because the weird thing is that it worked for weeks without problems and then all of a sudden everything started to fail.
It's of course kind of hard to find the real culprit in these situations. Maybe the units themselves were nasty; I have no clue yet, but I am glad it runs normally again.
Or maybe the damn Windows updates were ;) Who knows :D
|
|
|
Joe
Joined: 1 Sep 08 Posts: 37 Credit: 5,864,088 RAC: 0
|
I have a new Win XP install with all updates and the newest Nvidia driver for my GTX 295, but I can't complete any WU with GPUGrid. I tried resetting the project and using the GTX in SLI and in two-core mode... nothing works... It's the same error every time:
MDIO ERROR: cannot open file "restart.coor"
<core_client_version>6.6.15</core_client_version>
<![CDATA[
<message>
Unzulässige Funktion. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce GTX 295"
# Clock rate: 1242000 kilohertz
# Total amount of global memory: 939261952 bytes
# Number of multiprocessors: 30
# Number of cores: 240
# Device 1: "GeForce GTX 295"
# Clock rate: 1242000 kilohertz
# Total amount of global memory: 939196416 bytes
# Number of multiprocessors: 30
# Number of cores: 240
MDIO ERROR: cannot open file "restart.coor"
</stderr_txt>
]]>
The GTX is ok, in a Vista system I can complete the WUs...
Can you help me with this problem, please? |
|
|
Alain Maes
Joined: 8 Sep 08 Posts: 63 Credit: 1,650,962,839 RAC: 2,204,221
|
This is actually not a problem. The message [MDIO ERROR: cannot open file "restart.coor"] always occurs at the start of a new WU for the simple reason that it has to start from scratch and can not fall back on a previously saved restart.coor, such as will be the case after a shutdown and restart of the PC in the middle of crunching a WU.
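The behaviour described here, trying the saved restart file first and starting from scratch when it is missing, can be sketched roughly as follows in Python. This is not the actual GPUGRID application code: the function `load_start_state` and the file name `input.coor` are hypothetical; only `restart.coor` and the message text come from the task output.

```python
import os
from typing import Tuple


def load_start_state(workdir: str,
                     restart_name: str = "restart.coor",
                     initial_name: str = "input.coor") -> Tuple[str, str]:
    """Pick the file a work unit would start from.

    Returns ("restart", path) if a saved checkpoint exists, otherwise logs
    the notice seen in the task output and returns ("fresh", path) for the
    (hypothetically named) initial coordinates file.
    """
    restart_path = os.path.join(workdir, restart_name)
    if os.path.exists(restart_path):
        # A previous run was interrupted: resume from its checkpoint.
        return ("restart", restart_path)
    # No checkpoint yet: report it and start the WU from scratch.
    print(f'MDIO ERROR: cannot open file "{restart_name}"')
    return ("fresh", os.path.join(workdir, initial_name))
```

In this sketch the message is only a notice about which branch was taken, which is why it shows up once at the start of every new WU.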
Your machine also reports two devices as it should be for a GTX 295. So two GPUGRID WUs will be running together.
Please check your result status and you should see that everything is fine.
PS - and if you want any more help, try unhiding your PCs so that other fellow crunchers can have a look at them.
Kind regards.
Alain |
|
|
Joe
Joined: 1 Sep 08 Posts: 37 Credit: 5,864,088 RAC: 0
|
Hi Alain,
this is the machine: http://www.gpugrid.net/results.php?hostid=29092 It's unhidden now.
The PC runs the whole day, there is no restart. Sometimes the error comes a few seconds after starting a new WU, sometimes after a few hours of work... The result ends with a client error and a compute error...
Hope you can help me a little more...
Kind regards
Joe
PS: Here are some facts about the machine:
13.03.2009 14:37:09 Starting BOINC client version 6.6.15 for windows_intelx86
13.03.2009 14:37:09 log flags: task, file_xfer, sched_ops
13.03.2009 14:37:09 Libraries: libcurl/7.19.4 OpenSSL/0.9.8j zlib/1.2.3
13.03.2009 14:37:09 Data directory: D:\BOINC\Data
13.03.2009 14:37:09 Running under account Jörg
13.03.2009 14:37:09 Milkyway@home Found app_info.xml; using anonymous platform
13.03.2009 14:37:09 SETI@home Found app_info.xml; using anonymous platform
13.03.2009 14:37:09 Processor: 2 GenuineIntel Intel(R) Core(TM)2 Duo CPU E6750 @ 2.66GHz [x86 Family 6 Model 15 Stepping 11]
13.03.2009 14:37:09 Processor features: fpu tsc sse sse2 mmx
13.03.2009 14:37:09 OS: Microsoft Windows XP: Professional x86 Editon, Service Pack 3, (05.01.2600.00)
13.03.2009 14:37:09 Memory: 2.00 GB physical, 3.85 GB virtual
13.03.2009 14:37:09 Disk: 107.42 GB total, 100.79 GB free
13.03.2009 14:37:09 Local time is UTC +1 hours
13.03.2009 14:37:09 CUDA device: GeForce GTX 295 (driver version 18208, CUDA version 1.3, 896MB, est. 106GFLOPS)
13.03.2009 14:37:09 Not using a proxy
13.03.2009 14:37:09 GPUGRID URL: http://www.gpugrid.net/; Computer ID: 29092; location: (none); project prefs: default
13.03.2009 14:37:09 Reading preferences override file
13.03.2009 14:37:09 Preferences limit memory usage when active to 1023.46MB
13.03.2009 14:37:09 Preferences limit memory usage when idle to 1842.23MB
13.03.2009 14:37:09 Preferences limit disk usage to 53.71GB
...
13.03.2009 14:45:13 GPUGRID Sending scheduler request: To fetch work.
13.03.2009 14:45:13 GPUGRID Requesting new tasks
13.03.2009 14:45:41 GPUGRID Computation for task sM24328-SH2_US_8-0-10-SH2_US_8620000_0 finished
13.03.2009 14:45:41 GPUGRID Output file sM24328-SH2_US_8-0-10-SH2_US_8620000_0_1 for task sM24328-SH2_US_8-0-10-SH2_US_8620000_0 absent
13.03.2009 14:45:41 GPUGRID Output file sM24328-SH2_US_8-0-10-SH2_US_8620000_0_2 for task sM24328-SH2_US_8-0-10-SH2_US_8620000_0 absent
13.03.2009 14:45:41 GPUGRID Output file sM24328-SH2_US_8-0-10-SH2_US_8620000_0_3 for task sM24328-SH2_US_8-0-10-SH2_US_8620000_0 absent
...
|
|
|
Alain Maes
Joined: 8 Sep 08 Posts: 63 Credit: 1,650,962,839 RAC: 2,204,221
|
OK, there is indeed a problem. All your results have the error code [Unzulässige Funktion. (0x1) - exit code 1 (0x1)]. Unfortunately this tells us little and gives no real clues.
Things worth trying in such cases:
1. Check the version of your video driver. Make sure you have the latest one, currently 180.29 if I am not mistaken.
2. Verify your GPU temperature.
3. Did you overclock your video card? If so, try easing back.
4. Also, did you try a simple restart?
Hope one of these helps, sorry I cannot be more specific.
Kind regards
Alain
|
|
|
|
I'd add another question:
5. Do you have GPU tasks ticked in your Seti options on your account @ Seti? The default is yes.
In theory it should be possible to get GPU projects to share resources with other GPU projects on the same machine, but I've not been able to get that to work yet.
Phoneman1 |
|
|
Joe
Joined: 1 Sep 08 Posts: 37 Credit: 5,864,088 RAC: 0
|
1. Check the version of your video driver.
GeForce Release 182.08 WHQL
2. Verify your GPU temperature
without GPU work:
GPU1: Graphics processor (GPU) 53 °C (127 °F)
GPU1: GPU memory 45 °C (113 °F)
GPU1: GPU ambient 48 °C (118 °F)
GPU1: GPU VRM 50 °C (122 °F)
GPU2: Graphics processor (GPU) 56 °C (133 °F)
GPU2: GPU ambient 50 °C (122 °F)
GPU2: GPU VRM 51 °C (124 °F)
with GPU work:
GPU1: Graphics processor (GPU) 78 °C (172 °F)
GPU1: GPU ambient 65 °C (149 °F)
GPU1: GPU VRM 61 °C (142 °F)
GPU2: Graphics processor (GPU) 76 °C (169 °F)
GPU2: GPU memory 68 °C (154 °F)
GPU2: GPU ambient 67 °C (153 °F)
GPU2: GPU VRM 60 °C (140 °F)
3. Did you overclock your videocard?
No, it's at stock:
GPU clock (geometric domain) 576 MHz (original: 576 MHz)
GPU clock (shader domain) 1242 MHz (original: 1242 MHz)
Bus width 448 bit
Real clock 999 MHz (DDR) (original: 999 MHz)
Effective clock 1998 MHz
Bandwidth 109.3 GB/s
4. Also, did you try a simple restart?
Yes, but it had no positive effect.
5. Do you have GPU tasks ticked in your Seti options on your account @ Seti?
No, it's disabled, because I use CUDA only for GPUGrid. It works on all my other machines...
Kind regards
Joe
|
|
|
Alain Maes
Joined: 8 Sep 08 Posts: 63 Credit: 1,650,962,839 RAC: 2,204,221
|
Ohoh. If all that does not work, maybe your GPU is broken?
Check it out
1. with e.g. GPU-Z
2. in another machine if possible
OK, have to drive back home to Belgium now.
Hope to find some positive news on this after the WE
Take care
Kind regards
Alain |
|
|
Joe
Joined: 1 Sep 08 Posts: 37 Credit: 5,864,088 RAC: 0
|
There are no problems with Everest or GPU-Z.
I tested this card in an XP machine: the same problem with GPUGrid. Then in a Vista PC: all OK. Put it in the next XP PC, and the card won't work... A problem with XP???
Kind regards
Joe |
|
|
|
Hi Joe,
that looks really strange! It's not that GPU-Grid wouldn't run under XP in general. I installed 182.08 myself yesterday and it's running just fine. Since it works under Vista your hardware is fine, so you seem to have a weird software problem. Did you customize your install via nLite? Or is there some rather uncommon software present on all your machines?
Edit: do you use remote desktop to access the machines?
MrS
|
|
Joe
Joined: 1 Sep 08 Posts: 37 Credit: 5,864,088 RAC: 0
|
Hi MrS,
yes, the hardware is OK. I believe that XP is the problem... It's a 2-day-old XP install with all MS, Nvidia and board updates. And on my other (older) XP PC I have the same problems... I tried GPUGrid with the GTX 295 on a bare XP install: nothing but problems...
No, there is no remote desktop, only a normal network. Only XP, Office and Photoshop CS3 are installed.
I tested another GTX 295 in both XP PCs: still the same problem. In Vista it is all OK!!!
I tried the SETI CUDA work on the XP machines and it works very well, no problems.
Perhaps the next Nvidia driver will work better with my XP...
Kind regards
Joe
PS: If you have any idea or solution for me, please write to me!!!!!
|
|
|
Joe
Joined: 1 Sep 08 Posts: 37 Credit: 5,864,088 RAC: 0
|
PS: I can't switch to Vista, because I have problems working with Photoshop CS on Vista when the network print server runs on XP. Photoshop is VERY slow in some functions... |
|
|
Zydor
Joined: 8 Feb 09 Posts: 252 Credit: 1,309,451 RAC: 0
|
Are the XP PCs updated to SP3 and the latest .NET ?
Regards
Zy |
|
|
Joe
Joined: 1 Sep 08 Posts: 37 Credit: 5,864,088 RAC: 0
|
Yes, SP3 incl. updates and .Net3.5 Sp1...
There are no more updates for my XP...
Kind regards
Joe |
|
|
|
I also got a corrupted WU, the first one for me. Could anyone please tell me what's wrong?
QW11771-SH2_US_8-4-10-SH2_US_8880000_0
Workunit 305555
Created 13 Mar 2009 17:28:14 UTC
Sent 13 Mar 2009 19:04:44 UTC
Received 14 Mar 2009 19:36:59 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 21408
Report deadline 17 Mar 2009 19:04:44 UTC
CPU time 6095.469
stderr out <core_client_version>6.6.15</core_client_version>
<![CDATA[
<message>
Felaktig funktion. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
</stderr_txt>
]]>
Validate state Invalid
Claimed credit 4214.27546296296
Granted credit 0
application version 6.62
|
|
|
|
Also got a corrupted WU, first one for me, could anyone please tell whats wrong!
Sorry, no. There's no useful information in the task output, and if there are no further errors I'd just forget about it. You could keep an eye on the failed WU, though, and see if others are successful.
MrS
|
|
jrobbio
Joined: 13 Mar 09 Posts: 59 Credit: 324,366 RAC: 0
|
http://www.gpugrid.net/result.php?resultid=401624
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce GTS 250"
# Clock rate: 1890000 kilohertz
# Total amount of global memory: 1073414144 bytes
# Number of multiprocessors: 16
# Number of cores: 128
MDIO ERROR: cannot open file "restart.coor"
Failed to set low-cpu sync mode
# Using CUDA device 0
Cuda error in file 'deviceQuery.cu' in line 59 : initialization error.
</stderr_txt>
]]>
My first WU worked perfectly, but the second one failed on my Windows XP Home SP3 when I changed profile, whilst the other was logged in.
Regards,
Rob |
|
|
|
One of my computers that has been crunching GPUGrid quite happily suddenly started giving errors. Suggestions? Here is a typical error:
http://www.gpugrid.net/result.php?resultid=410378
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 9600 GSO"
# Clock rate: 1350000 kilohertz
# Total amount of global memory: 402325504 bytes
# Number of multiprocessors: 12
# Number of cores: 96
Cuda error in file '..\cuda/cutil.h' in line 305 : out of memory.
Memory usage: host: bytes device: bytes
Assertion failed: 0, file ..\cuda/cutil.h, line 305
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information. |
|
|
uBronan
Joined: 1 Feb 09 Posts: 139 Credit: 575,023 RAC: 0
|
Yesterday it all started over again: all GPUGrid units error out.
It's not the VC, since S@H runs fine. The only other thing active is the CPDN project on 3 cores.
But I now also have a problem getting new units. It's somehow not accepting new downloads or something; not sure what is going on.
The sad thing is that a unit seems to run fine for a while and then suddenly crashes. |
|
|
Jeremy
Joined: 15 Feb 09 Posts: 55 Credit: 3,542,733 RAC: 0
|
After doing a good bit of work to re-ensure the stability and reliability of my machine, I set it working last night. My first GPUgrid task finished with an error, all other projects are completing without issue. Seems as though there might be something up with the WUs from the project atm, but I have no way to confirm that. |
|
|
GDF (Volunteer moderator, Project administrator, Project developer, Project tester, Volunteer developer, Volunteer tester, Project scientist)
Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0
|
Most of the errors are usually due to overclocking either by the factory or by the user.
In your case the GTX 260-216 should be clocked at 1242 MHz, so you might be too high.
http://en.wikipedia.org/wiki/GeForce_200_Series
Try to reduce it to standard clock rates and see if it works.
gdf |
|
|
|
David,
Cuda error in file '..\cuda/cutil.h' in line 305 : out of memory.
This is a typical error, which is resolved by a reboot. It is caused either by a memory leak in older drivers under XP64 or by an app / game that reserved lots of memory while GPU-Grid was running in the background.
webbie wrote: Its not the vc since s@h runs fine
That's definitely wrong, take a look at what Paul and I wrote here. You are ~175 MHz above stock speed. It's naive to assume that this could not possibly cause your errors.
MrS
|
|
uBronan
Joined: 1 Feb 09 Posts: 139 Credit: 575,023 RAC: 0
|
I tested with lower speeds. I set the card to the lowest settings possible, even lower than those reported for the Nvidia 9600 GT on the Nvidia site.
The EVGA default settings for this card are indeed much higher than what Nvidia reports as default, but nevertheless I lowered them all.
But while checking I also stopped the CPDN units and saw something I had not seen before: after stopping the CPDN units, the drive LEDs stopped flashing.
So I checked what happened when I turned the units on again, and the LEDs started flashing again without stopping.
The answer came from the CPDN site: these units use 1.5 GB of memory apiece, so running 3 or 4 of them seems to be too much.
Even if I had booted into my Win7 x64 install, there would not have been enough memory to run them all, since 8 GB isn't enough to run 4 of them.
Hence the continuous use of swap space.
After I turned off the CPDN units and let the other projects run as usual on 3 cores, everything seems to run fine again.
So I guess it's a good idea to check which other projects run together with GPUGrid.
I'm starting to wonder whether it is safe to run S@H or CPDN together with GPUGrid.
I'll set CPDN to fetch only low-memory units after these ones are done. |
|
|
jrobbio
Joined: 13 Mar 09 Posts: 59 Credit: 324,366 RAC: 0
|
I think I may have fixed this issue:
If you have Windows XP Home and you want to switch profiles whilst using Boinc, follow the steps below found here:
http://setiathome.berkeley.edu/forum_thread.php?id=50929
Windows XP Home, huh? Then you don't have the User Account control panel, only that stupid (sorry ;-)) simple file sharing.
What you could try is the following:
- Restart computer into Windows Safe Mode (Keep hammering F8 after you cleared the BIOS and before you see the Windows Logo, when upon the menu choose Safe Mode).
- Select Start > Run
- Type %allusersprofile% in the Open text box. Then click OK.
- Right-click the Shared Documents folder, and select Properties.
- Click the Security tab.
- Select Everyone in Group Or User Names, and then select Allow next to Full Control in the Permissions. (if possible, check that the admin account you're using is a member of the boinc_admins, boinc_projects and boinc_users groups. Normal accounts should only be members of the latter two groups)
- Click Advanced, and select Replace Permission Entries On All Child Objects With Entries Shown Here That Apply To Child Objects.
- Click OK, click Yes, and then click OK to close the Shared Documents Properties dialog box.
- Select Start > Run.
- Type msconfig in the Open text box. Then click OK. The System Configuration Utility will open.
- In the System Configuration Utility, select the General tab.
- Select Normal Startup - Load All Device Drivers And Services. Then click OK.
- Restart the computer.
- Reinstall BOINC.
I didn't even know you could get to the NTFS permissions on XP Home.
Regards,
Rob |
|
|
uBronan
Joined: 1 Feb 09 Posts: 139 Credit: 575,023 RAC: 0
|
Speeds set by the EVGA Precision tool:
525 MHz core speed, shaders linked to core showing 1302 MHz
memory set at 900 MHz
Nvidia stock speeds: core 650 MHz, shaders not mentioned,
memory 900 MHz
But I guess I can set them higher again, since all works fine when S@H or CPDN aren't jumping in as running projects. |
|
|
|
There are issues when running s@h and GPU-Grid together. I don't know everything, but it seems that if seti errors out, the PC will need a reboot to use CUDA again (GPU-Grid also errors). But there may be more.
Stock shaders for the 9600GT are 1.625 GHz. I'd put the clocks at NV stock for one or two WUs and switch to the old values afterwards [if successful ;) ].
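As a worked example of the clock comparison, the following Python sketch reads the "Clock rate" line from a task's stderr output and subtracts the stock value quoted above. It assumes that the reported clock rate is the shader clock, which the thread does not state explicitly; the helper names are hypothetical.

```python
import re
from typing import Optional

# Stock shader clock for the 9600 GT as quoted in the post (1.625 GHz).
STOCK_SHADER_MHZ = 1625.0


def reported_clock_mhz(stderr_txt: str) -> Optional[float]:
    """Extract the first '# Clock rate: N kilohertz' value, in MHz."""
    match = re.search(r"# Clock rate: (\d+) kilohertz", stderr_txt)
    return int(match.group(1)) / 1000 if match else None


def overclock_mhz(stderr_txt: str,
                  stock_mhz: float = STOCK_SHADER_MHZ) -> Optional[float]:
    """Return how far the reported clock is above (+) or below (-) stock."""
    clock = reported_clock_mhz(stderr_txt)
    return None if clock is None else clock - stock_mhz


# Line taken from the 9600 GT task output posted earlier in the thread.
print(overclock_mhz("# Clock rate: 1800000 kilohertz"))
```

With the 1800000 kilohertz from the 9600 GT output, the difference comes out at 175 MHz, which matches the "~175 MHz above stock" figure given earlier.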
MrS
|
|
|
Hello.
A friend has just tried the 185.13 drivers on Debian 64b
And all his WU are in computation error.
The error given is :
<core_client_version>6.6.15</core_client_version>
<![CDATA[
<message>
process exited with code 193 (0xc1, -63)
</message>
<stderr_txt>
# Using CUDA device 0
SIGSEGV: segmentation violation
Stack trace (17 frames):
acemd_6.59_x86_64-pc-linux-gnu__cuda[0x4baac9]
/lib/libc.so.6[0x7f48eab1af60]
/usr/lib/libcuda.so.1[0x7f48eba0bce0]
/usr/lib/libcuda.so.1[0x7f48eba11a44]
/usr/lib/libcuda.so.1[0x7f48eb9d79df]
/usr/lib/libcuda.so.1[0x7f48eb65f9cb]
/usr/lib/libcuda.so.1[0x7f48eb6702cb]
/usr/lib/libcuda.so.1[0x7f48eb6580c1]
/usr/lib/libcuda.so.1(cuCtxCreate+0xaa)[0x7f48eb65224a]
../../projects/www.gpugrid.net/libcudart.so.2[0x7f48ebc8cd58]
../../projects/www.gpugrid.net/libcudart.so.2[0x7f48ebc8d2a9]
../../projects/www.gpugrid.net/libcudart.so.2(cudaThreadSynchronize+0x1d)[0x7f48ebc7374d]
acemd_6.59_x86_64-pc-linux-gnu__cuda[0x414253]
acemd_6.59_x86_64-pc-linux-gnu__cuda(sin+0x16ac)[0x408a3c]
acemd_6.59_x86_64-pc-linux-gnu__cuda(sin+0x31b)[0x4076ab]
/lib/libc.so.6(__libc_start_main+0xe6)[0x7f48eab071a6]
acemd_6.59_x86_64-pc-linux-gnu__cuda(sinh+0x49)[0x407489]
Exiting...
</stderr_txt>
]]>
Any ideas ? |
|
|
uBronan
Joined: 1 Feb 09 Posts: 139 Credit: 575,023 RAC: 0
|
MrS wrote: There are issues when running s@h and GPU-Grid together. I don't know everything, but it seems that if seti errors out the PC will need a reboot to use CUDA again (GPU-Grid also errors). But there may be more.
Stock shaders for 9600GT are 1.625 GHz. I'd put the clocks at NV stock for one WU or 2 and switch to the old values afterwards [if successful ;) ].
MrS
Thanks, I've set it to these values to see if it helps.
I am slowly raising the core clock, but I guess it won't matter much until it reaches the normal clock of 650 MHz.
I'll keep stepping up until stock speeds are reached on the shaders as well. As I understood that memory speed doesn't do much, I keep it close to stock.
I should have started BOINC under x64 Vista instead of under normal XP :D
Then I could have used the full 8 GB of memory xD
Or I have to run the machine totally dry, but that's going to take a while since CPDN has downloaded a large one. |
|
|
jrobbio
Joined: 13 Mar 09 Posts: 59 Credit: 324,366 RAC: 0
|
jrobbio wrote: I think I may have fixed this issue:
If you have Windows XP Home and you want to switch profiles whilst using Boinc, follow the steps below found here:
http://setiathome.berkeley.edu/forum_thread.php?id=50929
I didn't even know you could get to the NTFS permissions on XP Home.
Regards,
Rob
Turns out that this didn't work for me. The GPU task runs for a few seconds and then fails out followed by any queued tasks for that day.
I have reinstalled Boinc as a service so changing profile should not interfere with the tasks. I'll update this thread with my findings.
Regards,
Rob |
|
|
Joe
Joined: 1 Sep 08 Posts: 37 Credit: 5,864,088 RAC: 0
|
jrobbio wrote: I think I may have fixed this issue:
If you have Windows XP Home and you want to switch profiles whilst using Boinc, follow the steps below found here:
http://setiathome.berkeley.edu/forum_thread.php?id=50929
I didn't even know you could get to the NTFS permissions on XP Home.
Regards,
Rob
Turns out that this didn't work for me. The GPU task runs for a few seconds and then fails out followed by any queued tasks for that day.
I have reinstalled Boinc as a service so changing profile should not interfere with the tasks. I'll update this thread with my findings.
Regards,
Rob
It's the same for me... Perhaps we have to wait for the next CUDA version???
Kind regards
Joe
|
|
|
|
jrobbio wrote: I have reinstalled Boinc as a service so changing profile should not interfere with the tasks. I'll update this thread with my findings.
I don't think you can run the GPU tasks as a service (a limitation of CUDA itself I believe).
|
|
|
|
I don't think you can run the GPU tasks as a service (a limitation of CUDA itself I believe).
That's correct. The Windows installation process for both 6.4.5 and 6.4.7 explicitly states that you can not install BOINC as a service if you want to use CUDA.
I have BOINC running as a service on all my Windows machines *except* for the one where I'm running CUDA. Can't install it as a service there.
Mike |
|
|
Joe Send message
Joined: 1 Sep 08 Posts: 37 Credit: 5,864,088 RAC: 0 Level
Scientific publications
|
I'm testing 6.6.16 Beta http://boinc.berkeley.edu/dl/boinc_6.6.16_windows_intelx86.exe and it seems to work now with XP Prof 32 Bit. It has been running for 4 hours without an error... I'll see more tomorrow.
Kind regards
Joe |
|
|
jrobbio Send message
Joined: 13 Mar 09 Posts: 59 Credit: 324,366 RAC: 0 Level
Scientific publications
|
>I don't think you can run the GPU tasks as a service (a limitation of CUDA itself I believe).
>That's correct. The Windows installation process for both 6.4.5 and 6.4.7 explicitly states that you can not install BOINC as a service if you want to use CUDA.
>I have BOINC running as a service on all my Windows machines *except* for the one where I'm running CUDA. Can't install it as a service there.
>Mike
Mike,
I am running 6.6.15 and BOINC is running CUDA whilst installed as a service.
They must have resolved whatever issue there was with the stable editions.
Rob |
|
|
|
The issue is only with Vista.
On XP it worked all the time with BOINC installed as a service...
|
|
|
>I am running 6.6.15 and BOINC is running CUDA whilst installed as a service.
Sweet!
In that case, I'm certainly looking forward to the release of a stable 6.6.x client. I much prefer running BOINC as a service.
I prefer not to run the beta versions of the client, so I'm sticking to the stable release versions. I like running CPDN, and with the length of those work units, too much work gets lost if they error out. Yeah, I know I can backup/restore those WUs, but then I have to remember to back them up, and restoration is less than a pleasant process when you're running lots of projects.
Mike
Edit:
>The issue is only with Vista.
>On XP it worked all the time with BOINC installed as a service...
Well, maybe not so sweet then... |
|
|
Joe Send message
Joined: 1 Sep 08 Posts: 37 Credit: 5,864,088 RAC: 0 Level
Scientific publications
|
A few minutes before finishing the WU I got "incorrect function (0x1) exit code 1" with 6.6.16... Now I'm testing 6.6.17... |
|
|
|
Any ideas for the error 193 with the 185.13 drivers on Debian 64b ? |
|
|
|
Just had 4 errors on my Vista machine... looks like my fellow crunchers have errored out too. Is there a bad batch of WUs out there atm? Just thought it's worth checking, as my GTX 295 is a factory-overclocked card and that could possibly cause some errors.
http://www.gpugrid.net/workunit.php?wuid=318412
http://www.gpugrid.net/workunit.php?wuid=317639
http://www.gpugrid.net/workunit.php?wuid=317457
This one was completed by someone else but had an error at the same time as one of the other WUs... possibly killed when the other one went... I don't know
http://www.gpugrid.net/workunit.php?wuid=317194 |
|
|
jrobbio Send message
Joined: 13 Mar 09 Posts: 59 Credit: 324,366 RAC: 0 Level
Scientific publications
|
>>I am running 6.6.15 and BOINC is running CUDA whilst installed as a service.
>Sweet!
>In that case, I'm certainly looking forward to the release of a stable 6.6.x client. I much prefer running BOINC as a service.
I thought installing it as a service would fix the problem, but my tasks again failed out when logging into the machine as two users simultaneously on XP Home.
20/03/2009 16:34:22 GPUGRID Computation for task up108704-pYEpYI_US530000-0-10-ignasi_1 finished
20/03/2009 16:34:22 GPUGRID Starting WS10117-SH2_US_8-0-10-SH2_US_8270000_0
20/03/2009 16:34:23 GPUGRID Starting task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 using acemd version 662
20/03/2009 16:34:24 GPUGRID Computation for task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 finished
20/03/2009 16:34:24 GPUGRID Output file WS10117-SH2_US_8-0-10-SH2_US_8270000_0_1 for task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 absent
20/03/2009 16:34:24 GPUGRID Output file WS10117-SH2_US_8-0-10-SH2_US_8270000_0_2 for task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 absent
20/03/2009 16:34:24 GPUGRID Output file WS10117-SH2_US_8-0-10-SH2_US_8270000_0_3 for task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 absent
20/03/2009 16:34:25 GPUGRID Started upload of up108704-pYEpYI_US530000-0-10-ignasi_1_0
20/03/2009 16:34:25 GPUGRID Started upload of up108704-pYEpYI_US530000-0-10-ignasi_1_1
Any suggestions how I can fix this? Should I disable the option on install to allow all users to run Boinc? Anything else?
Rob
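The pattern in the log above (a task starts, "finishes" about a second later, and then reports its output files absent) can be picked out of a longer message log automatically. Below is a minimal Python sketch, not part of BOINC itself. The function name `find_instant_failures`, the 5-second threshold and the regular expressions are assumptions based only on the line format shown in this post.

```python
import re
from datetime import datetime

# Sample lines copied from the log in the post above.
LOG = """\
20/03/2009 16:34:22 GPUGRID Computation for task up108704-pYEpYI_US530000-0-10-ignasi_1 finished
20/03/2009 16:34:23 GPUGRID Starting task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 using acemd version 662
20/03/2009 16:34:24 GPUGRID Computation for task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 finished
20/03/2009 16:34:24 GPUGRID Output file WS10117-SH2_US_8-0-10-SH2_US_8270000_0_1 for task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 absent
20/03/2009 16:34:24 GPUGRID Output file WS10117-SH2_US_8-0-10-SH2_US_8270000_0_2 for task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 absent
20/03/2009 16:34:24 GPUGRID Output file WS10117-SH2_US_8-0-10-SH2_US_8270000_0_3 for task WS10117-SH2_US_8-0-10-SH2_US_8270000_0 absent
"""

# Assumed line layout: "DD/MM/YYYY HH:MM:SS <project> <message>"
LINE = re.compile(r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) (\S+) (.*)$")
START = re.compile(r"^Starting task (\S+) using")
DONE = re.compile(r"^Computation for task (\S+) finished")
ABSENT = re.compile(r"^Output file \S+ for task (\S+) absent")


def find_instant_failures(text, max_seconds=5):
    """Map task name -> (runtime in seconds, number of absent output files)
    for tasks that finished within max_seconds and lost output files."""
    started, runtime, missing = {}, {}, {}
    for raw in text.splitlines():
        parsed = LINE.match(raw.strip())
        if not parsed:
            continue
        when = datetime.strptime(parsed.group(1), "%d/%m/%Y %H:%M:%S")
        message = parsed.group(3)
        hit = START.match(message)
        if hit:
            started[hit.group(1)] = when
            continue
        hit = DONE.match(message)
        if hit and hit.group(1) in started:
            task = hit.group(1)
            runtime[task] = (when - started[task]).total_seconds()
            continue
        hit = ABSENT.match(message)
        if hit:
            missing[hit.group(1)] = missing.get(hit.group(1), 0) + 1
    return {
        task: (runtime[task], count)
        for task, count in missing.items()
        if task in runtime and runtime[task] <= max_seconds
    }


if __name__ == "__main__":
    for task, (seconds, files) in find_instant_failures(LOG).items():
        print(f"{task}: ran {seconds:.0f}s, {files} output file(s) absent")
```

Tasks without a matching "Starting task" line in the excerpt (like the first one here) are ignored rather than guessed at.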
|
|
|
|
>Any ideas for the error 193 with the 185.13 drivers on Debian 64b ?
Just a check on your environment. I just brought up GPUGRID on my Ubuntu 8.10 64-bit machine. Brand new install. And every WU would error out in seconds. No error 193, but it gave me the "output file absent" message.
Check to see if the 32-bit runtime libraries are installed. If not, try that on the next set of WU's. After I installed the IA32 libraries, and the "microcode.ctl" package, the next set of WU's are running fine.
I'm not saying this will fix your problem, but it's something to check.
I'm running nvidia 180.11 drivers, since that's what Ubuntu "likes". And trying to upgrade to the released versions from Nvidia's website just wasn't going too well, so I reverted to "stock".
Also, I'm running an 8600GTS card, and BOINC 6.6.15.
Mark |
|
|
|
A long string of compute errors to report (task ids):
426788 4 errors
426924 3 errors
427001 4 errors, 1 canceled
427056 2 errors
427583 3 errors
425600 1 error
The error counts are as of the posting of this note ... obviously these may rise by tomorrow or whenever you all at the project look at these ...
These all look like "new" and "improved" tasks so ... maybe back to the drawing board ... |
|
|
ignasi Send message
Joined: 10 Apr 08 Posts: 254 Credit: 16,836,000 RAC: 0 Level
Scientific publications
|
Certainly one of the batches sent yesterday was corrupted.
*pYIpYV1*
I am canceling them out.
sorry about that,
ignasi |
|
|
Zydor Send message
Joined: 8 Feb 09 Posts: 252 Credit: 1,309,451 RAC: 0 Level
Scientific publications
|
The aborts came through ok - it picked up an additional two bad ones in the queue lurking from the same batch as the three that bombed out earlier this morning UTC time.
Many Thanks - Crunch On :)
Regards
Zy |
|
|
|
Help! I have 3 identical winXP AMD 4200 x2 systems with evga 9600gso 384mb cards that have been running GPUGRID happily for at least a month, and suddenly about a week ago ALL of them started giving errors simultaneously. All WUs now crash within seconds of starting. I have tried lots of fixes: going back to stock clocking, underclocking, rebooting, updating drivers, etc. Nothing makes any difference. There have been no changes to my system other than WinXP auto updates. My conclusion is that there was a change in GPUGRID WUs and the problem is not on my side. Here is my typical crash:
http://www.gpugrid.net/result.php?resultid=410378
<core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 9600 GSO"
# Clock rate: 1350000 kilohertz
# Total amount of global memory: 402325504 bytes
# Number of multiprocessors: 12
# Number of cores: 96
Cuda error in file '..\cuda/cutil.h' in line 305 : out of memory.
Memory usage: host: bytes device: bytes
Assertion failed: 0, file ..\cuda/cutil.h, line 305
This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.
----
TIA for any help. |
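When comparing failures across several hosts, it can help to pull the device name, the reported global memory and the CUDA error out of the stderr text instead of reading each result page by hand. The following is a minimal Python sketch; the function name `summarize` and the regular expressions are assumptions derived only from the stderr lines quoted in this thread, not from any official format description.

```python
import re

# stderr excerpt copied from the result above.
STDERR = r"""# Using CUDA device 0
# Device 0: "GeForce 9600 GSO"
# Clock rate: 1350000 kilohertz
# Total amount of global memory: 402325504 bytes
# Number of multiprocessors: 12
# Number of cores: 96
Cuda error in file '..\cuda/cutil.h' in line 305 : out of memory.
"""

DEVICE = re.compile(r'^# Device (\d+): "(.+)"$')
MEMORY = re.compile(r"^# Total amount of global memory: (\d+) bytes$")
# Also accepts the "Cuda error: Kernel [...] failed in file ..." variant.
ERROR = re.compile(r"^Cuda error(?:: .*?)? in file '(.+?)' in line (\d+) : (.+?)\.?$")


def summarize(stderr_text):
    """Return (devices, errors) parsed from a task's stderr text."""
    devices = {}   # device index -> {"name": ..., "memory_bytes": ...}
    errors = []    # list of {"file": ..., "line": ..., "message": ...}
    current = None
    for raw in stderr_text.splitlines():
        line = raw.strip()
        hit = DEVICE.match(line)
        if hit:
            current = int(hit.group(1))
            # A restarted task prints the device list again; keep one entry.
            devices.setdefault(current, {"name": hit.group(2), "memory_bytes": None})
            continue
        hit = MEMORY.match(line)
        if hit and current is not None:
            devices[current]["memory_bytes"] = int(hit.group(1))
            continue
        hit = ERROR.match(line)
        if hit:
            errors.append(
                {"file": hit.group(1), "line": int(hit.group(2)), "message": hit.group(3)}
            )
    return devices, errors


if __name__ == "__main__":
    found_devices, found_errors = summarize(STDERR)
    for index, info in found_devices.items():
        mib = info["memory_bytes"] / 2**20
        print(f"device {index}: {info['name']}, {mib:.1f} MiB")
    for err in found_errors:
        print(f"{err['file']}:{err['line']}: {err['message']}")
```

For the excerpt above this reports a 9600 GSO with roughly 384 MiB and an "out of memory" error, which makes it easy to line up against the memory sizes of the hosts that completed the same WU.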
|
|
ignasi Send message
Joined: 10 Apr 08 Posts: 254 Credit: 16,836,000 RAC: 0 Level
Scientific publications
|
>My conclusion is that there was a change in GPUGRID WUs and the problem is not on my side. Here is my typical crash:
>http://www.gpugrid.net/result.php?resultid=410378
Nope. These WUs haven't changed at all...
i |
|
|
|
Why would it say "out of memory" even after a fresh reboot? That's very strange. Do you know how to use RivaTuner to watch the video memory usage? It's been discussed here a long time ago. If the search engine can't find it I could write it down again.
Edit: others are able to complete your WUs just fine. That tells us that the error should be on the side of your machines. Among your wingmen there was no one with less than 512 MB, so their results don't tell us whether the card's memory size matters.
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
Joe Send message
Joined: 1 Sep 08 Posts: 37 Credit: 5,864,088 RAC: 0 Level
Scientific publications
|
WOW! Now I have 3 complete WUs with 6.6.17/WinXP and only one error...
Kind regards
Joe |
|
|
|
Did you change anything?
MrS
|
|
Joe Send message
Joined: 1 Sep 08 Posts: 37 Credit: 5,864,088 RAC: 0 Level
Scientific publications
|
>Did you change anything?
Yes, I have a second GPU (GeForce 8800 GT) inside and updated to 6.6.17. But at the moment it's 3:3... See here: http://www.gpugrid.net/results.php?hostid=29092
I believe the problem is inside XP! The same card works in Vista http://www.gpugrid.net/results.php?hostid=24499 without problems, and the card that works in Vista produces errors in XP...
Kind regards
Joe |
|
|
|
>I believe the problem is inside XP!
XP itself is not the problem, as thousands (?) of users are running with it just fine. However, something you do on your XP is special and causes these issues. Too bad I have run out of ideas..
MrS
|
|
uBronan Send message
Joined: 1 Feb 09 Posts: 139 Credit: 575,023 RAC: 0 Level
Scientific publications
|
Hmm, sadly another error, and a nasty one: the task had finished, but when the upload started it gave an error.
No reason was given for the crash either.
this one
So I guess there are still some problems |
|
|
jrobbio Send message
Joined: 13 Mar 09 Posts: 59 Credit: 324,366 RAC: 0 Level
Scientific publications
|
>>Did you change anything?
>Yes, I have a second GPU (GeForce 8800 GT) inside and updated to 6.6.17. But at the moment it's 3:3... See here: http://www.gpugrid.net/results.php?hostid=29092
>I believe the problem is inside XP! The same card works in Vista http://www.gpugrid.net/results.php?hostid=24499 without problems, and the card that works in Vista produces errors in XP...
>Kind regards
>Joe
Thoughts:
- Have you used driver cleaner before installing a new driver? Driver Cleaner Professional Free. Newer paid version is here: http://www.drivercleaner.net/
- If it is XP home log in to safe mode and check the security permissions to your Boinc folders
- I'm a bit confused as to why the system is reporting that you have 2 x GTX295's when you say it is a Geforce 8800GT. Is that normal?
- Run a CPU/GPU stress test on your XP machine and see how it fares. The new one that people are recommending is OCCT
- Check your memory
Regards,
Rob
|
|
|
Joe Send message
Joined: 1 Sep 08 Posts: 37 Credit: 5,864,088 RAC: 0 Level
Scientific publications
|
Now I have completed 2 more WUs with XP.
No driver cleaner, because the newest Nvidia driver was the first driver installed on a fresh XP.
>- If it is XP home log in to safe mode and check the security permissions to your Boinc folders
It's XP Prof. 32Bit. The permissions are ok and the dir is excluded from Antivir.
>- I'm a bit confused as to why the system is reporting that you have 2 x GTX295's when you say it is a Geforce 8800GT. Is that normal?
Yes, there is a GTX295 (not in SLI mode) and a 8800GT in this system.
>- Run a CPU/GPU stress test on your XP machine and see how it fares. The new one that people are recommending is OCCT
>- Check your memory
All components in this system are ok...
But in the Vista system I had a new error:
Cuda error: Kernel [scale_coordinate_kernel] failed in file 'ComputePme_kernels.cu' in line 23 : unknown error.
See more here: http://www.gpugrid.net/result.php?resultid=435480
Kind regards
Joe
|
|
|