Message boards : News : acemdshort application 8.15 - discussion
Author | Message |
---|---|
Dear all, | |
ID: 32034 | Rating: 0 | rate: / Reply Quote | |
As Thomas Edison said | |
ID: 32035 | Rating: 0 | rate: / Reply Quote | |
Got a 7.01 app for Linux and it failed. | |
ID: 32036 | Rating: 0 | rate: / Reply Quote | |
"As Thomas Edison said" - That task used "Long runs (8-12 hours on fastest card) v6.18 (cuda42)", not "ACEMD beta version v7.02 (cuda42)". ____________ Reno, NV Team: SETI.USA | |
ID: 32054 | Rating: 0 | rate: / Reply Quote | |
2 of 2 failed: | |
ID: 32055 | Rating: 0 | rate: / Reply Quote | |
Out of 6 that ran, one completed and validated. | |
ID: 32057 | Rating: 0 | rate: / Reply Quote | |
Is there a particular version of driver equipped? Stupid autocorrect. That should be, "Is there a particular version of the driver required?" ____________ Reno, NV Team: SETI.USA | |
ID: 32059 | Rating: 0 | rate: / Reply Quote | |
703 is now built with CUDA 5.5, so driver 319.17 is the minimum required to run it. | |
ID: 32068 | Rating: 0 | rate: / Reply Quote | |
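For anyone unsure whether their installed driver meets these minimums, a quick check from a terminal (a sketch; it assumes nvidia-smi is available, which ships with both the Linux and the Windows driver packages):
nvidia-smi                                                     # the banner includes a "Driver Version: xxx.xx" field
nvidia-smi --query-gpu=driver_version --format=csv,noheader   # query form, if your nvidia-smi build supports it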
Looks like you get crashes even with 326.41. So maybe it is a problem with the app? | |
ID: 32070 | Rating: 0 | rate: / Reply Quote | |
Maybe post this also in the proper thread, sorry for double post: | |
ID: 32072 | Rating: 0 | rate: / Reply Quote | |
Several 7.03 MJHARVEY ran successfully: | |
ID: 32074 | Rating: 0 | rate: / Reply Quote | |
Ok for me, | |
ID: 32075 | Rating: 0 | rate: / Reply Quote | |
Ditto: | |
ID: 32076 | Rating: 0 | rate: / Reply Quote | |
I see that there is a new app for windows just posted today (7.03), so I decided to give it another shot. This time success! | |
ID: 32078 | Rating: 0 | rate: / Reply Quote | |
Good, 703 worked ok. There's a new revision 704; please try that out. | |
ID: 32080 | Rating: 0 | rate: / Reply Quote | |
"Good, 703 worked ok." Indeed, my Titan did 440 (!) beta v7.03 workunits while I slept and not a single error.
"There's a new revision 704; please try that out." Some validate ok, some error out after a few seconds.
http://www.gpugrid.net/result.php?resultid=7176884 failed
http://www.gpugrid.net/result.php?resultid=7176874 ok
http://www.gpugrid.net/result.php?resultid=7176867 failed
http://www.gpugrid.net/result.php?resultid=7176799 ok
http://www.gpugrid.net/result.php?resultid=7176798 failed
http://www.gpugrid.net/result.php?resultid=7176782 failed
I switched to DP mode for the last three workunits (which also limits the clock of the Titan, as you know), but no difference apparently (one ok, two failed).
http://www.gpugrid.net/result.php?resultid=7176773 ok
http://www.gpugrid.net/result.php?resultid=7176914 failed
http://www.gpugrid.net/result.php?resultid=7176734 failed
This is with nVidia driver 326.41, BOINC 7.2.11 and on Windows 7 SP1 64bit. | |
ID: 32081 | Rating: 0 | rate: / Reply Quote | |
Is anyone seeing failure of 704 on a card that IS NOT a Titan or 780? | |
ID: 32082 | Rating: 0 | rate: / Reply Quote | |
"Good, 703 worked ok. There's a new revision 704; please try that out." Overnight, my TITAN ran 50x 7.03 tasks. All validated. It also ran 3x 7.04 tasks. All failed. Here is a sample: http://www.gpugrid.net/result.php?resultid=7177032 ____________ Reno, NV Team: SETI.USA | |
ID: 32083 | Rating: 0 | rate: / Reply Quote | |
"Is anyone seeing failure of 704 on a card that IS NOT a Titan or 780?" Just got 7 of them on my GTX460SE (314.22, WinXP); all ended and validated, no errors. ____________ | |
ID: 32084 | Rating: 0 | rate: / Reply Quote | |
"Is anyone seeing failure of 704 on a card that IS NOT a Titan or 780?" It's working fine on my GTX 670. Is it a CUDA 5.5 app? (According to its name in the BOINC manager it's a CUDA 4.2 app, but the BOINC manager downloaded a couple of CUDA 5.5 DLLs with this beta app.) | |
ID: 32085 | Rating: 0 | rate: / Reply Quote | |
I've run 10+ tasks on my machine with a 590 (dual GPU) and a 580. No problems at all. 320.49 driver. Should I try the 326.41 BETA driver too? | |
ID: 32086 | Rating: 0 | rate: / Reply Quote | |
7.04 all failed for my Titan | |
ID: 32087 | Rating: 0 | rate: / Reply Quote | |
3 failed, and 1 completed on 780 w/ latest drivers | |
ID: 32088 | Rating: 0 | rate: / Reply Quote | |
"Is anyone seeing failure of 704 on a card that IS NOT a Titan or 780?" No failures on a GT650M (mobile, Kepler, 384 CUDA cores), BOINC 7.0.64, driver 320.49. | |
ID: 32089 | Rating: 0 | rate: / Reply Quote | |
Ok, thanks. Looks like we'll need two apps, CUDA 5.5 (cc 3.5 or driver > 325) and CUDA 4.2 for everything else. | |
ID: 32090 | Rating: 0 | rate: / Reply Quote | |
Do you guys think there will be any performance improvements with 5.5? | |
ID: 32091 | Rating: 0 | rate: / Reply Quote | |
"Is anyone seeing failure of 704 on a card that IS NOT a Titan or 780?"
GTX 560Ti, W7 64bit, driver 320.19: no failures.
GTX 260 rev A.2 (65nm), W7 64bit, driver 326.80: no failures! That is great. | |
ID: 32092 | Rating: 0 | rate: / Reply Quote | |
Only for cc 3.5, in the sense that those cards will now work! | |
ID: 32093 | Rating: 0 | rate: / Reply Quote | |
acemdbeta version 705 now has two variants - | |
ID: 32094 | Rating: 0 | rate: / Reply Quote | |
7.05 OK for my TITAN. | |
ID: 32095 | Rating: 0 | rate: / Reply Quote | |
Heading to work so can't test them, but nvidia today released driver 326.80. | |
ID: 32097 | Rating: 0 | rate: / Reply Quote | |
Two 7.05(cuda55) downloaded and ran ok. | |
ID: 32098 | Rating: 0 | rate: / Reply Quote | |
"Unfortunately, users with a cc 3.5 card, Windows, and a driver >315.25 and <326.41 will get work but should expect frequent crashes." MJH, understood, and IMHO good enough for beta. But can't you take the OS into the equation and send 705-55 only to Windows hosts with cc >= 1.3 and driver >= 326.41? (And 705-42 on Windows, of course, only to hosts with cc >= 1.3 and < 3.5 and driver between 295.41 and 326.41?)
"Can I re-select all the GPUGRID computing preference options in my account so I receive all types of units?" Zarck, I don't think so, at least not for a Titan or 780, until a CUDA 5.5 application is released for the short and long runs (non-beta): http://www.gpugrid.net/apps.php | |
ID: 32099 | Rating: 0 | rate: / Reply Quote | |
Unfortunately, the 7.05 beta still silently fails here (http://www.gpugrid.net/forum_thread.php?id=3437). Also, my attempts to attach tracing to the short-lived process when it becomes available,
while true; do pid=$(pgrep -f acemd) && pstree -alcpsU $(pidof boinc) && strace64 -vfp $pid; done
are failing:
strace64: attach: ptrace(PTRACE_ATTACH, ...): Operation not permitted
Same with
while true; do pid=$(pgrep -f acemd) && gdb -p $pid; done
GNU gdb (GDB) 7.6 (Debian 7.6-5)
This GDB was configured as "i486-linux-gnu".
Attaching to process 8561
warning: process 8561 is a zombie - the process has already terminated
ptrace: Operation not permitted.
I understand that you cannot put effort into supporting every conceivable system, but it must be a small problem, and if I understood the cause better it would be easy to fix, IMHO. Is there any possibility of putting a tiny bit more debugging output in the beta versions? | |
ID: 32101 | Rating: 0 | rate: / Reply Quote | |
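A side note on the ptrace failures above (a sketch, assuming a Debian/Ubuntu kernel with the Yama security module enabled): with kernel.yama.ptrace_scope set to 1, strace/gdb may only attach to their own child processes, so attaching to an already running acemd by PID is refused unless the tracer runs as root. And once gdb reports the process as a zombie, it has already exited, so there is nothing left to trace anyway.
cat /proc/sys/kernel/yama/ptrace_scope        # 1 means attach is restricted to child processes
sudo sysctl -w kernel.yama.ptrace_scope=0     # temporarily lift the restriction (set it back to 1 afterwards)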
Ubuntu 13.04 64 bit Titan. 325 driver got 7.05 cuda55. | |
ID: 32102 | Rating: 0 | rate: / Reply Quote | |
Dear eMPee584, | |
ID: 32103 | Rating: 0 | rate: / Reply Quote | |
Zarck, | |
ID: 32105 | Rating: 0 | rate: / Reply Quote | |
So far, all the ACEMD beta v7.05 (cuda55) WUs I've received have been OK: http://www.gpugrid.net/results.php?hostid=157253 | |
ID: 32106 | Rating: 0 | rate: / Reply Quote | |
70 ACEMD beta version 7.05(cuda55) downloaded and ran ok. No errors. | |
ID: 32107 | Rating: 0 | rate: / Reply Quote | |
120+ beta v7.05 (cuda55) and no errors. | |
ID: 32108 | Rating: 0 | rate: / Reply Quote | |
7.05 works fine on both my TITAN and on my machine with the 590 and 580. | |
ID: 32109 | Rating: 0 | rate: / Reply Quote | |
I wonder what will happen when I eventually move them all into a single machine? I wouldn't expect any problems with the app. MJH | |
ID: 32110 | Rating: 0 | rate: / Reply Quote | |
The acemd 7.05-55 app is running fine on my GTX 670 (WinXPx64, v326.80). | |
ID: 32111 | Rating: 0 | rate: / Reply Quote | |
Yes, TEST10 are 100x longer. | |
ID: 32112 | Rating: 0 | rate: / Reply Quote | |
I suppose I'll get the 7.05 app when BOINC requests new beta work? | |
ID: 32113 | Rating: 0 | rate: / Reply Quote | |
You need to be requesting beta work and subscribed to application "acemdbeta" | |
ID: 32114 | Rating: 0 | rate: / Reply Quote | |
"I wonder what will happen when I eventually move them all into a single machine?" Yes, I believe the apps work. What I am questioning is BOINC: is it smart enough to request both kinds of work, and keep straight the two virtual queues of downloaded tasks? I know it works for ATI vs. nVidia vs. CPU tasks. But what about two different apps, based on slightly different CUDA version requirements? Will it always assign the right task/app to the right GPU? ____________ Reno, NV Team: SETI.USA | |
ID: 32115 | Rating: 0 | rate: / Reply Quote | |
Will it always assign the right task/app to the right GPU? You'll get the cuda-55 app for all of them, assuming the driver is >=326.41 Matt | |
ID: 32116 | Rating: 0 | rate: / Reply Quote | |
My Titan has run 4 TEST10's - all failed in around 61 seconds. | |
ID: 32117 | Rating: 0 | rate: / Reply Quote | |
"Will it always assign the right task/app to the right GPU?" Ah! I was thinking that the 590/580 were only 4.2. But I see they are 5.5, like the TITAN. Got it. ____________ Reno, NV Team: SETI.USA | |
ID: 32118 | Rating: 0 | rate: / Reply Quote | |
660Ti / driver 310.90, XP Pro SP1 32 bit. CUDA 4.2 version 7.05 betas are now completing without errors. 9 in a row and counting. Great work, techs. | |
ID: 32119 | Rating: 0 | rate: / Reply Quote | |
Okay, I have all three cards in a single machine, and everything seems to be working fine. Just FYI, running into a 31 tasks/day limit. | |
ID: 32120 | Rating: 0 | rate: / Reply Quote | |
http://www.gpugrid.net/workunit.php?wuid=4700082 | |
ID: 32121 | Rating: 0 | rate: / Reply Quote | |
Overnight, 89 tasks with only 2 errors on TEST10, but the wingman with several cards also has errors there. | |
ID: 32122 | Rating: 0 | rate: / Reply Quote | |
I noticed a few things in the last hour. | |
ID: 32123 | Rating: 0 | rate: / Reply Quote | |
Driver nVidia 326.80 Beta, | |
ID: 32124 | Rating: 0 | rate: / Reply Quote | |
I now have two TEST10s that finished okay. One on the 770 with driver 320.49 (cuda42) and one on the 660 with driver 326.41 (cuda55). | |
ID: 32126 | Rating: 0 | rate: / Reply Quote | |
Just received 1 that had failed on 2 other crunchers after about 1 minute. I'm 6 minutes in with only 5% completed but it is still progressing. All the ones before this one finished in about 1 minute. At the present rate this one will take close to 2 hours if it completes. Should be interesting. Task errored out at 50 minutes. Seems like these beta tasks that run beyond a minute or so end in an error. | |
ID: 32127 | Rating: 0 | rate: / Reply Quote | |
"Just received 1 that had failed on 2 other crunchers after about 1 minute. I'm 6 minutes in with only 5% completed but it is still progressing. All the ones before this one finished in about 1 minute. At the present rate this one will take close to 2 hours if it completes. Should be interesting." No, not all tasks - the TEST10 are longer. MJH said in this thread that TEST10 is 100 times longer. Indeed, they are around 2 hours on my rigs. Some errored out, some finished fine. Your 310.90 driver seems to be the issue for the long ones; read MJH's post a way back in this thread for more information. ____________ Greetings from TJ | |
ID: 32128 | Rating: 0 | rate: / Reply Quote | |
After I updated to driver 326.41: | |
ID: 32129 | Rating: 0 | rate: / Reply Quote | |
Another task (cuda55) with the same error: http://www.gpugrid.net/result.php?resultid=7194889 | |
ID: 32138 | Rating: 0 | rate: / Reply Quote | |
Well, that shouldn't be happening. Could you PM me you app_info.xml please? | |
ID: 32142 | Rating: 0 | rate: / Reply Quote | |
Hello MJH, | |
ID: 32143 | Rating: 0 | rate: / Reply Quote | |
Interesting. I do not have one. My system in this case is: i7, Windows 7 64bit, BOINC 7.2.5, NVIDIA GeForce GTX 680 (2048MB), driver 326.41, and TThrottle to protect my CPUs from overheating. ____________ | |
ID: 32145 | Rating: 0 | rate: / Reply Quote | |
OK, now I see why it's committing hara kiri. 706 coming soon.. MJH | |
ID: 32147 | Rating: 0 | rate: / Reply Quote | |
I can find 3 types of error messages; this one seems to be the most common: | |
ID: 32151 | Rating: 0 | rate: / Reply Quote | |
707 is now live. Should be an end to the "max time elapsed" problem. | |
ID: 32152 | Rating: 0 | rate: / Reply Quote | |
You broke something: | |
ID: 32153 | Rating: 0 | rate: / Reply Quote | |
Indeed I did. 708. | |
ID: 32154 | Rating: 0 | rate: / Reply Quote | |
Indeed I did. 708. I had no luck with 7.08..... | |
ID: 32155 | Rating: 0 | rate: / Reply Quote | |
Indeed I did. 708. Three more in a row is more than bad luck.... | |
ID: 32157 | Rating: 0 | rate: / Reply Quote | |
All 7.08 WU's are failing: | |
ID: 32158 | Rating: 0 | rate: / Reply Quote | |
Didn't work for me either: | |
ID: 32159 | Rating: 0 | rate: / Reply Quote | |
All 7.07 and 7.08 in errors, | |
ID: 32160 | Rating: 0 | rate: / Reply Quote | |
709. | |
ID: 32161 | Rating: 0 | rate: / Reply Quote | |
709. Still runs to an error right at the start. | |
ID: 32162 | Rating: 0 | rate: / Reply Quote | |
710 for 55 with extra debug. | |
ID: 32163 | Rating: 0 | rate: / Reply Quote | |
I decided to grab a couple of these 7.10 debug versions. | |
ID: 32164 | Rating: 0 | rate: / Reply Quote | |
It would be AWESOME if you could keep the "GPU Name" listed in the stderr.txt for the tasks. Maybe consider printing it each time the task is started/restarted (since it could restart on a different GPU!) Currently, I don't think I have any way of knowing which GPU(s) worked on the task. This is a very good idea. I've requested that in the "Wish List" topic before. | |
ID: 32165 | Rating: 0 | rate: / Reply Quote | |
I decided to grab a couple of these 7.10 debug versions. The same happened on my host. | |
ID: 32166 | Rating: 0 | rate: / Reply Quote | |
Same on my Titan - stopped at mark 15. | |
ID: 32167 | Rating: 0 | rate: / Reply Quote | |
Thanks guys. 711 has even more debug. | |
ID: 32168 | Rating: 0 | rate: / Reply Quote | |
Sweet news that the GPU printout is staying - hopefully you print it each time the task is restarted too, as I have 3 total GPUs in my system. Things that might prove useful: | |
ID: 32169 | Rating: 0 | rate: / Reply Quote | |
Same on the Titan - at mark 13: | |
ID: 32170 | Rating: 0 | rate: / Reply Quote | |
712. Almost at the bottom turtle now. | |
ID: 32171 | Rating: 0 | rate: / Reply Quote | |
For v7.12, for me, they got as far as: | |
ID: 32172 | Rating: 0 | rate: / Reply Quote | |
"And yes, the GPU+version debug will be staying. If you want something else there too, now is the time to ask for it!" It would be nice to have the following in the stderr output file:
- the manufacturer (vendor) ID
- the type of the GPU
- the number of the GPU
- the memory size of the GPU
- the clock rate of the GPU
Something like in this post. | |
ID: 32173 | Rating: 0 | rate: / Reply Quote | |
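Until something like that appears in the application's stderr, most of these details can already be read locally with nvidia-smi; a sketch (the exact field names depend on the driver release, so check the supported list first):
nvidia-smi --help-query-gpu                                    # lists the query fields your driver supports
nvidia-smi --query-gpu=index,name,pci.bus_id,memory.total,clocks.sm,temperature.gpu,driver_version --format=csv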
yup - CPI MRK 8 | |
ID: 32174 | Rating: 0 | rate: / Reply Quote | |
713 now | |
ID: 32175 | Rating: 0 | rate: / Reply Quote | |
7.13 appears to start (and make progress) on each of my 2 GPUs (GTX 660 Ti, GTX 460). It does not crash immediately. | |
ID: 32176 | Rating: 0 | rate: / Reply Quote | |
Better kill that - it will make way too much debug. | |
ID: 32177 | Rating: 0 | rate: / Reply Quote | |
You weren't kidding - 5 minutes created 140MB debug :) | |
ID: 32178 | Rating: 0 | rate: / Reply Quote | |
For me, 7.14 is running/progressing at an appropriate GPU Usage, without any immediate crash. | |
ID: 32179 | Rating: 0 | rate: / Reply Quote | |
One 7.13 finished ok in about 7 minutes on my Titan. | |
ID: 32180 | Rating: 0 | rate: / Reply Quote | |
One 7.13 finished ok in about 12 minutes on my gtx 680 | |
ID: 32181 | Rating: 0 | rate: / Reply Quote | |
Three 7.14's completed ok. | |
ID: 32182 | Rating: 0 | rate: / Reply Quote | |
My 7.14 task finished successfully too, but I don't know what GPU was used - that debug info has now disappeared :) | |
ID: 32183 | Rating: 0 | rate: / Reply Quote | |
The webpage only shows the last few lines of the debug. | |
ID: 32184 | Rating: 0 | rate: / Reply Quote | |
One 7.14 finished ok in about 1 minute. GPU utilization ~ 72% | |
ID: 32185 | Rating: 0 | rate: / Reply Quote | |
One short 7.14 finished fine. | |
ID: 32186 | Rating: 0 | rate: / Reply Quote | |
One short 7.14 finished fine. The 7.14 has 10% lower GPU utilization than the 7.15, I'm going to abort the 7.14.... | |
ID: 32187 | Rating: 0 | rate: / Reply Quote | |
I've received a short 7.16 :) | |
ID: 32188 | Rating: 0 | rate: / Reply Quote | |
I've received a short 7.16 :) It's finished ok, but the GPU usage is still 10% lower than the 7.15. | |
ID: 32189 | Rating: 0 | rate: / Reply Quote | |
I've received a short 7.16 :) 7.17 is available :) I've aborted the 7.16.... | |
ID: 32190 | Rating: 0 | rate: / Reply Quote | |
I've received a short 7.16 :) The GPU utilization of the 7.17 seems to be normal (95%) | |
ID: 32191 | Rating: 0 | rate: / Reply Quote | |
On my eVGA GTX 660 Ti FTW (with the CPU fully loaded by CPU tasks), the 7.17 MJHARVEY_TEST10 task doesn't even reach 80% GPU utilization. Also, the drivers don't think the GPU is under enough load to ramp up the clock via GPU Boost. So, at 79% GPU utilization and 74% power utilization, the GPU stays at my normal 3D clock of 1045 Mhz, instead of the 3D boost clock of 1241 Mhz that I expect (and get from other GPUGrid tasks). | |
ID: 32192 | Rating: 0 | rate: / Reply Quote | |
On my eVGA GTX 660 Ti FTW (with the CPU fully loaded by CPU tasks), the 7.17 MJHARVEY_TEST10 task doesn't even reach 80% GPU utilization. Also, the drivers don't think the GPU is under enough load to ramp up the clock via GPU Boost. So, at 79% GPU utilization and 74% power utilization, the GPU stays at my normal 3D clock of 1045 Mhz, instead of the 3D boost clock of 1241 Mhz that I expect (and get from other GPUGrid tasks). You should leave one CPU thread free per GPU (so on your i7-965x with 3 GPUs you should set the multiprocessor setting to 63% CPU). | |
ID: 32193 | Rating: 0 | rate: / Reply Quote | |
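For reference, the 63% figure presumably comes from keeping one CPU thread free per GPU: the i7-965 has 4 cores / 8 threads and the host has 3 GPUs, so BOINC would use 5 of the 8 threads. A trivial check:
echo "scale=1; (8 - 3) * 100 / 8" | bc    # 62.5, i.e. roughly the 63% suggested above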
Nope. That's not how I roll. | |
ID: 32194 | Rating: 0 | rate: / Reply Quote | |
No problem with GPU boost functioning correctly for me. | |
ID: 32195 | Rating: 0 | rate: / Reply Quote | |
No problem with GPU boost functioning correctly for me. Give the 7.19 a try. One of my 7.17 long has finished, 3 more are near to completion. | |
ID: 32196 | Rating: 0 | rate: / Reply Quote | |
No problem with GPU boost functioning correctly for me. There goes the 7.20 :) It seems to be running fine on my hosts. But I'm going to sleep in 10 minutes, and I'm going to resume the non-beta tasks I've suspended to give the beta units priority. | |
ID: 32197 | Rating: 0 | rate: / Reply Quote | |
A 7.20 has been running ok so far on my Titan - 25 minutes in. Looks like it wants to run about 90 minutes. | |
ID: 32198 | Rating: 0 | rate: / Reply Quote | |
MJH, you must be getting sleepy: | |
ID: 32199 | Rating: 0 | rate: / Reply Quote | |
A 7.20 has been running ok so far on my Titan - 25 minutes in. Looks like it wants to run about 90 minutes. That's good news indeed! How much is the GPU usage? (on Titan and on GTX780) | |
ID: 32200 | Rating: 0 | rate: / Reply Quote | |
Titan's GPU Load is 72%. | |
ID: 32201 | Rating: 0 | rate: / Reply Quote | |
Titan's GPU Load is 72%. This should be higher at least by 20% with real workunits. | |
ID: 32202 | Rating: 0 | rate: / Reply Quote | |
Agreed, unless it turns out we can run multiple instances? | |
ID: 32203 | Rating: 0 | rate: / Reply Quote | |
80% on w7 | |
ID: 32204 | Rating: 0 | rate: / Reply Quote | |
Completed and validated my first MJH TEST10 | |
ID: 32205 | Rating: 0 | rate: / Reply Quote | |
GTX460SE, WinXP, 314.11, Boinc 6.10.58 | |
ID: 32206 | Rating: 0 | rate: / Reply Quote | |
Are you saying it failed? | |
ID: 32207 | Rating: 0 | rate: / Reply Quote | |
I’m seeing 85% GPU usage on a GTX660 and 80% usage on a GTX660Ti. ACEMD beta version 7.20 (cuda55).
<core_client_version>7.0.64</core_client_version>
<![CDATA[
<stderr_txt>
# GPU [GeForce GTX 660] Platform [Windows] Rev [3170M] VERSION [55]
# SWAN Device 1 :
#   Name : GeForce GTX 660
#   ECC : Disabled
#   Global mem : 2048MB
#   Capability : 3.0
#   PCI ID : 0000:02:00.0
#   Device clock : 1032MHz
#   Memory clock : 3004MHz
#   Memory width : 192bit
# Driver version : r325_00
# Time per step (avg over 250000 steps): 3.487 ms
# Approximate elapsed time for entire WU: 871.837 s
called boinc_finish
</stderr_txt>
]]>
| |
ID: 32208 | Rating: 0 | rate: / Reply Quote | |
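As a sanity check, the "Approximate elapsed time" in that stderr is consistent with the reported step timing, i.e. roughly the average time per step multiplied by the number of steps:
echo "scale=2; 250000 * 3.487 / 1000" | bc    # 871.75 s, close to the reported 871.837 s (the gap is rounding of the 3.487 ms average)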
Are you saying it failed? Nope, completed and validated - just the GPU utilization was low. ____________ | |
ID: 32209 | Rating: 0 | rate: / Reply Quote | |
Morning all. | |
ID: 32210 | Rating: 0 | rate: / Reply Quote | |
The test WUs I care about are now: NATHAN_s1p_test_titans1 | |
ID: 32211 | Rating: 0 | rate: / Reply Quote | |
The test WUs I care about are now: NATHAN_s1p_test_titans1 I've received a couple of NATHAN_s1p_test_titans2 workunits. They have by far the highest GPU usage on GTX670 and GTX680 (97-99%). | |
ID: 32216 | Rating: 0 | rate: / Reply Quote | |
Mine are almost complete, the titan WUs, currently @ 89% usage, which is where they should be for W7. Time to complete will be about 1:10 total. | |
ID: 32219 | Rating: 0 | rate: / Reply Quote | |
The test WUs I care about are now: NATHAN_s1p_test_titans1 As some of you have noted, after the server/database error, they are now NATHAN_s1p_test_titans2 | |
ID: 32220 | Rating: 0 | rate: / Reply Quote | |
"The test WUs I care about are now: NATHAN_s1p_test_titans1" Yes http://www.gpugrid.net/result.php?resultid=7205032 @+ *_* Credit 60,900.00 ____________ | |
ID: 32221 | Rating: 0 | rate: / Reply Quote | |
I have one finished too in 2h43m. A new one is running at 94% GPU load and estimated time by BOINC 46h20m? 5% done in 8m, so BOINC needs arithmetic lessons :) | |
ID: 32222 | Rating: 0 | rate: / Reply Quote | |
The test WUs I care about are now: NATHAN_s1p_test_titans1 You should fine tune your system (for example: leave a core free for GPUGrid), because 5pot's GTX780 finished a similar workunit in 4200 seconds, while it took 5000 seconds on your Titan. | |
ID: 32224 | Rating: 0 | rate: / Reply Quote | |
Finished two ok in about 1 hr. 24 min.: | |
ID: 32225 | Rating: 0 | rate: / Reply Quote | |
"The test WUs I care about are now: NATHAN_s1p_test_titans1" I changed the speed of my Titan by enabling and disabling double precision mode while it was calculating; this probably explains the slowness. @+ *_* ____________ | |
ID: 32226 | Rating: 0 | rate: / Reply Quote | |
"I changed the speed of my Titan by enabling and disabling the double precision mode" Didn't read the entire thread, so just in case it hasn't been mentioned before: enabling double precision mode disables turbo completely and hence locks the clock speed at the base level, independently of whether double precision is being used or not. This is a really crude solution from nVidia; they should have just let turbo cap the power as usual (from my point of view... but I don't design these cards). MrS ____________ Scanning for our furry friends since Jan 2002 | |
ID: 32238 | Rating: 0 | rate: / Reply Quote | |
"The test WUs I care about are now: NATHAN_s1p_test_titans1" Should those who don't have a Titan or GTX780 opt out of beta units? | |
ID: 32243 | Rating: 0 | rate: / Reply Quote | |
I have completed 3 of the I4R9-NATHAN_s1p_test_titans2 tasks successfully. The test WUs I care about are now: NATHAN_s1p_test_titans1 I think we can stay in the beta. The test, as far as I know, is if the application will work for any of the supported GPUs, including the Titan and GTX780. | |
ID: 32244 | Rating: 0 | rate: / Reply Quote | |
No, it's correct. It is reporting the SM clock, which on pre-Kepler cards is double the main clock. cf http://www.nvidia.es/object/product-geforce-gtx-460-es.html
Yeah, that didn't look right. I'll see if it's possible. MJH | |
ID: 32245 | Rating: 0 | rate: / Reply Quote | |
"1) I noticed that it seems that my GTX 460's clock is erroneously doubled -- Take a look at the GTX 460 result below. Device clock should have been 763 Mhz, not 1526 Mhz. Both GPU-Z and Precision-X show 763 Mhz. Is your detection algorithm bugged somehow?" There are two clock rates in Fermi based cards (besides the memory clock): the core clock and the shader clock. You are talking about the core clock, while the debug info shows the shader clock. The shader (CUDA core) frequency is fixed by hardware at double the core clock frequency on Fermi based cards. "2) Also, for showing the driver version, although knowing the 'branch' (r325_00) is handy, it'd be much better to show the actual driver version (326.80 in my case). Could you please make a change to include that?" +1 | |
ID: 32246 | Rating: 0 | rate: / Reply Quote | |
No, please stay in. It's important that it is tested as widely as possible before I push it out to the production queues. MJH | |
ID: 32247 | Rating: 0 | rate: / Reply Quote | |
Then could you please consider having it say something better than "Device clock"? I currently think "GPU Core clock" when you say "Device clock". Maybe "Processor Clock (MHz)", per the English specs found here: http://www.geforce.com/hardware/desktop-gpus/geforce-gtx-460/specifications
Thank you -- Knowing the real driver version would be tremendously useful I think! | |
ID: 32248 | Rating: 0 | rate: / Reply Quote | |
"You should fine tune your system (for example: leave a core free for GPUGrid), because 5pot's GTX780 finished a similar workunit in 4200 seconds, while it took 5000 seconds on your Titan." | |
ID: 32249 | Rating: 0 | rate: / Reply Quote | |
That's its default boost clock. Different cards have different boost clocks depending on the quality of the chip. | |
ID: 32250 | Rating: 0 | rate: / Reply Quote | |
"You should fine tune your system (for example: leave a core free for GPUGrid), because 5pot's GTX780 finished a similar workunit in 4200 seconds, while it took 5000 seconds on your Titan." Thanks to the detailed debug info, I've noticed the overclocking of the GTX780 too, still your running times should be less (even less than that overclocked GTX780's running times). I'll explain why: The GTX 780's clock rate is higher by 16.457% (1019/875) than your Titan's, but your Titan has 16.666% (2688/2304) more shaders (CUDA cores) than a GTX780. In other words: If we multiply the number of shaders and their frequency the result is a theoretical performance index: Your standard GTX Titan: 875MHz*2688=2352000 The overclocked GTX780:1019MHz*2304=2347776 As you can see, in theory your GTX Titan should be a little (by 1.8%) faster than the overclocked GTX780. Possibly the GPUGrid application can't feed that much CUDA cores. The question is why can't the application do that? Maybe your CPU is too busy to do that; Or this is software-related. To 5pot: Please share your secret with us: How could your overclocked GTX780 be faster than a factory clocked GTX Titan? What is the exact type of your GTX780? Are they water cooled? | |
ID: 32251 | Rating: 0 | rate: / Reply Quote | |
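A quick recomputation of that shaders-times-clock comparison (a sketch; the expected difference works out to only about 0.2%, well within run-to-run noise):
echo $(( 875 * 2688 ))                    # GTX Titan: 2352000
echo $(( 1019 * 2304 ))                   # overclocked GTX780: 2347776
echo "scale=4; 2352000 / 2347776" | bc    # ~1.002, i.e. about 0.2% in the Titan's favour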
No secret. I do a lot of research before I spend this type of $$$. The best GPU I've found, and apparently others have as well, is the EVGA 780 ACX. | |
ID: 32253 | Rating: 0 | rate: / Reply Quote | |
FWIW, my TITAN does them in 4300-4600 with no threads reserved. I will try reserving a thread to see if there is any difference. This is not in the DP-enhanced mode, so it will automatically OC up as much as temps allow. | |
ID: 32255 | Rating: 0 | rate: / Reply Quote | |
I checked my Titan; the fan was full of dust and the card was running at 836 MHz. Once cleaned, it now runs at 928 MHz, so the performance should be better. | |
ID: 32257 | Rating: 0 | rate: / Reply Quote | |
Should make quite a large difference. Please post a new update with latest times. | |
ID: 32264 | Rating: 0 | rate: / Reply Quote | |
Gainward GTX 780 Phantom (GLH) | |
ID: 32265 | Rating: 0 | rate: / Reply Quote | |
In the last 24 hours my GTX660 got 10 ACEMD v8.00 WUs and they all finished fine. | |
ID: 32267 | Rating: 0 | rate: / Reply Quote | |
That's not a beta app | |
ID: 32269 | Rating: 0 | rate: / Reply Quote | |
"FWIW, my TITAN does them in 4300-4600 with no threads reserved. I will try reserving a thread to see if there is any difference. This is not in the DP-enhanced mode, so it will automatically OC up as much as temps allow." After more testing, I see that my TITAN takes 4520 seconds with or without a reserved thread. No difference at all. ____________ Reno, NV Team: SETI.USA | |
ID: 32271 | Rating: 0 | rate: / Reply Quote | |
That is good news. I have done the same test with my 660 and saw the GPU load fluctuating from 88% to 1-3% and longer run time. | |
ID: 32272 | Rating: 0 | rate: / Reply Quote | |
Correct! | |
ID: 32273 | Rating: 0 | rate: / Reply Quote | |
"FWIW, my TITAN does them in 4300-4600 with no threads reserved. This is not in the DP-enhanced mode, so it will automatically OC up as much as temps allow. After more testing, I see that my TITAN takes 4520 seconds with or without a reserved thread. No difference at all." That's good news. Zarck's Titan still needs 4700 secs, so maybe the AMD architecture is to blame for that. AMD FX CPUs don't have an integrated PCIe controller; they use a HyperTransport link to the North Bridge. The AMD 990FX NB has "only" 2x PCIe 2.0 x16 support, while the Intel i7-3770 and 4770 have (only one) integrated PCIe 3.0 x16. PCIe 2.0 x16 is quite enough for the GK104 (up to the GTX 680 and 770), however it could be hindering the performance of the GK110 based cards (GTX780 and Titan), because they have 50% and 75% (respectively) more CUDA cores than a GK104 based card. | |
ID: 32274 | Rating: 0 | rate: / Reply Quote | |
Can we continue the discussion about CPU Integrated controller vs AMD chipset/CPU in the CPU Comparisons - general open discussion thread? | |
ID: 32276 | Rating: 0 | rate: / Reply Quote | |
I've had 4 NATHAN_s1p_test_titans2 errors on a GTX660Ti and 2 successes. Errors are of the form, | |
ID: 32278 | Rating: 0 | rate: / Reply Quote | |
The Beta app is now live as ACEMD-Short. Version 800 | |
ID: 32290 | Rating: 0 | rate: / Reply Quote | |
Sweet! Did you get the "Driver version" thing figured out so that it'll show 326.80? | |
ID: 32291 | Rating: 0 | rate: / Reply Quote | |
The Beta app is now live as ACEMD-Short. Version 800 Super. My Titan is crunching first Nathan's. | |
ID: 32292 | Rating: 0 | rate: / Reply Quote | |
Not yet. That'll follow later. MJH | |
ID: 32293 | Rating: 0 | rate: / Reply Quote | |
Wow even on Fermi the new app is ~120secs faster ^^ | |
ID: 32294 | Rating: 0 | rate: / Reply Quote | |
Shouldn't there be a tick box for CUDA 5.5 under short runs, or am I missing something? | |
ID: 32295 | Rating: 0 | rate: / Reply Quote | |
It's selected automatically, based on client driver version. | |
ID: 32301 | Rating: 0 | rate: / Reply Quote | |
Looking like 90 min for the short runs. | |
ID: 32303 | Rating: 0 | rate: / Reply Quote | |
I think that renaming threads is not nice. | |
ID: 32305 | Rating: 0 | rate: / Reply Quote | |
In a week or so. MJH | |
ID: 32306 | Rating: 0 | rate: / Reply Quote | |
OK, it's good for cc 1.3 cards too. Tried the GTX 285, which I had normally retired from GPUGrid, but as a test it's good enough ^^ 26,241.59 secs = 7.3 h | |
ID: 32310 | Rating: 0 | rate: / Reply Quote | |
The load on the graphics card was at 0% and progress stayed at 0%, so I abandoned the unit after 45 minutes. | |
ID: 32466 | Rating: 0 | rate: / Reply Quote | |
You need app 8.01 and then the NOELIAs run as smooth as ever. | |
ID: 32469 | Rating: 0 | rate: / Reply Quote | |
8.02 makes NOELIA tasks run even smootherer | |
ID: 32483 | Rating: 0 | rate: / Reply Quote | |
There is a 8.04 app in the Beta queue. | |
ID: 32559 | Rating: 0 | rate: / Reply Quote | |
There is a new acemdshort application, version 8.11 (Windows only). Since this is (hopefully!) the last app revision, now's a good time to summarise the changes in the 800 series over the older app: | |
ID: 32681 | Rating: 0 | rate: / Reply Quote | |
Here is a list of compute error codes for the 8xx series applications and their meanings. If you encounter a new one, or have a question or observation about the circumstances of an error, please PM me. | |
ID: 32700 | Rating: 0 | rate: / Reply Quote | |
The betas are gone and the SANTIs are starting to error again on my GTX660. | |
ID: 32719 | Rating: 0 | rate: / Reply Quote | |
Short runs (2-3 hours on fastest card) v8.11 (cuda55)
"* -97 "Simulation has become unstable". This indicates that the scientific simulation that the application performs has gone wrong. If this happens as soon as the WU starts, there may be a problem with the WU. If it happens frequently or after the WU has made some progress, a hardware problem is strongly indicated. Check GPU temperatures (now reported in stderr) and for the presence of other GPU-using programs (eg games) - MJH"
Your temps look reasonable (mostly around 66°C). I suggest you restart the system, and if errors continue to occur look into what else might be causing this problem (games, video programs, antivirus scans, updates...). You might want to note the failure time and check your logs to see what was happening at that time or just before. Both times you had the error, the stderr log ends in:
# GPU 0 Current Temp: 64 C
# The simulation has become unstable. Terminating to avoid lock-up (1)
The slight GPU temperature drop from 66°C to 64°C might indicate resource consumption by something else on your system just before the WU was ended, or the GPU temperature might just have dropped as the WU was ending. ____________ FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help | |
ID: 32720 | Rating: 0 | rate: / Reply Quote | |
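If you want to follow that advice and watch what the card is doing around the failure time, a simple logging loop is one option (a sketch; it assumes nvidia-smi is installed with the driver and that these query fields are supported by your build):
nvidia-smi --query-gpu=timestamp,temperature.gpu,clocks.sm,utilization.gpu --format=csv -l 5 >> gpu_log.csv
# samples every 5 seconds; compare the timestamps against the task failure time in the BOINC event log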
The new app does a much better job at determining when a WU has gone bad and aborting. Previously this might have manifested itself as non-specific crash/driver reset. In some circumstances it may be possible to attempt recovery from this failure. Expect a new beta trying an idea out later today. MJH | |
ID: 32721 | Rating: 0 | rate: / Reply Quote | |
Thanks for the information skgiven (and moving my post). | |
ID: 32723 | Rating: 0 | rate: / Reply Quote | |
Hello: I finished an 8.11 task fine, but I want to comment on some weird stuff. | |
ID: 32735 | Rating: 0 | rate: / Reply Quote | |
"The execution times look too long to me compared with other short tasks. I think something is not working very well. Greetings." That is correct, Carlesa25, there are problems with it. There is a lengthy thread about this; it's this one: http://www.gpugrid.net/forum_thread.php?id=3450 If you start reading at the first post, you will quickly understand what is wrong. ____________ Greetings from TJ | |
ID: 32737 | Rating: 0 | rate: / Reply Quote | |
I just had an 8.13 task result in: | |
ID: 32743 | Rating: 0 | rate: / Reply Quote | |
Side effect of having more code bracketed by critical sections. As the article you found indicates, prompt termination on suspend requires the monitoring thread to wake up while the app thread is outside a critical region. The only way this is going to get fixed is to change the dumb way the BOINC client lib bludgeons the app process to death, and give the app an opportunity to close down gracefully. This will take a bit of work, but it's high on the to-do list. MJH | |
ID: 32745 | Rating: 0 | rate: / Reply Quote | |
Thanks. | |
ID: 32746 | Rating: 0 | rate: / Reply Quote | |
Jacob, | |
ID: 32747 | Rating: 0 | rate: / Reply Quote | |
Well, for my situation there, it was an 8.11 that caused the problem. | |
ID: 32749 | Rating: 0 | rate: / Reply Quote | |
I have tested it on a NOELIA_INS task in order to get a beta. The suspending and starting again worked. (Not getting beta WU as it knew that a task was suspended :( ) | |
ID: 32755 | Rating: 0 | rate: / Reply Quote | |
The only way this is going to get fixed to change the dumb way the boinc client lib blugeons the app process to death, and give the app opportunity to close down gracefully. You have allies in the BOINC community. Eric Korpela of SETI@home wrote (on 13 Nov 2008 - unfortunately in a private forum I can't link): Yes, the terminate with no mercy policy sucks and we should find if there is a way around it, or at least a way to allow I/O to finish. About time we got round to fixing that... | |
ID: 32757 | Rating: 0 | rate: / Reply Quote | |
First MJHarvey_Crash beta just finished. | |
ID: 32760 | Rating: 0 | rate: / Reply Quote | |
Your card mostly ran at 58°C, so it wasn't overly taxed by the Noelia_Klebe WU. | |
ID: 32766 | Rating: 0 | rate: / Reply Quote | |
acemdshort is now updated to 8.14. This version has improved stability during suspend/resume. | |
ID: 32815 | Rating: 0 | rate: / Reply Quote | |
My GTX660Ti also clocks up to ~1200MHz. That said, I also get the odd error from it and other similar cards. Wish it were just an odd now-and-then error. I haven't had one NOELIA_KLEBE beta complete and validate yet; they all end with the time exceeded error after running for an hour or so. | |
ID: 32817 | Rating: 0 | rate: / Reply Quote | |
I had one CRASH test overnight that took 22,493.99 seconds to complete. Checking the system showed that it was the one running on half the GPU's core clock, so that is the explanation. However, there is no sign of it in the stderr report; the core clock was reported there as it should be, 1058MHz. I rebooted the system and all is normal again. | |
ID: 32826 | Rating: 0 | rate: / Reply Quote | |
Hello: 8.14 tasks are running with low GPU load (<60%), which is also very unstable, varying by more than ±10%, on my GTX 770. | |
ID: 32846 | Rating: 0 | rate: / Reply Quote | |
TJ, I guess you are referring to this WU, | |
ID: 32848 | Rating: 0 | rate: / Reply Quote | |
Hello: That task has completed, but the one now running, SANTI_MARwt2-4-25-uan RND4912_0, has a GPU load of <65%. | |
ID: 32849 | Rating: 0 | rate: / Reply Quote | |
"That task has completed, but the one now running, SANTI_MARwt2-4-25-uan RND4912_0, has a GPU load of <65%." Hello: Regarding the issue of low GPU usage: if this is simply how these tasks work, the solution would be to run two tasks on the GPU to reach maximum load. It would be interesting if those responsible could confirm this, in order to decide how to handle these tasks. NOTE: I tried running two 8.14 tasks on the same GPU (GTX770) and the total load went from 55% to 70% ±5%, with 777 MB of FB memory used, 22% and 8% FB/bus usage, and the GPU at 1254 MHz. | |
ID: 32850 | Rating: 0 | rate: / Reply Quote | |
"TJ, I guess you are referring to this WU," Yes, skgiven, that is the one. Later this morning I had one error, but that one did not downclock the core clock. But as these CRASH tests are Santi SRs, and I had a lot of errors with those, my error rate has dropped significantly. ____________ Greetings from TJ | |
ID: 32851 | Rating: 0 | rate: / Reply Quote | |
"That task has completed, but the one now running, SANTI_MARwt2-4-25-uan RND4912_0, has a GPU load of <65%." Hello: Sorry... my 8.14 problems with no load on the GPU turned out to be caused by a corrupted driver. After reinstalling it the issue is solved and GPU load is normal, around 85%. | |
ID: 32860 | Rating: 0 | rate: / Reply Quote | |
My Asus 770 runs at a steady 91-92% GPU load, with a core clock of 1097MHz, although I have it set to 1060MHz. This is for a Nathan WU and obviously no 8.14 app yet. | |
ID: 32861 | Rating: 0 | rate: / Reply Quote | |
The server should now once again be dishing out Short tasks to Linux clients. | |
ID: 33143 | Rating: 0 | rate: / Reply Quote | |
I've promoted the 8.15 beta application to the short queue. | |
ID: 34154 | Rating: 0 | rate: / Reply Quote | |
Unfortunately it does not remedy the current sudden reboots I am experiencing, pls. see report here. | |
ID: 34157 | Rating: 0 | rate: / Reply Quote | |
The same type of WU that was restarting my system (SANTI_baxbimSPW2) now fails without restarting the system: | |
ID: 34158 | Rating: 0 | rate: / Reply Quote | |
C:\ProgramData\BOINC\projects\www.gpugrid.net\acemd.815-55.exe | |
ID: 34160 | Rating: 0 | rate: / Reply Quote | |
Heard the fan on my GTX770 roar, quickly opened MSI Afterburner and saw the GPU power usage at over 4000%!! A few seconds later and the system restarted, blue screen... I guess the app has found a driver bug. | |
ID: 34162 | Rating: 0 | rate: / Reply Quote | |
That's interesting. As mentioned in the other thread, I tested a v8.14 short run with my GT 650M. The workunit did finish ok, but while crunching, the subnotebook shut down gracefully once. Very likely to prevent overheating, since the GPU temp. acc. to stderr.txt was at 85 °C at that time, which is unusual, esp. at the current room temperatures. I activated the notebook-cooler after that... | |
ID: 34165 | Rating: 0 | rate: / Reply Quote | |
Upgraded drivers to latest WHQL but still got systems restarts and GPUGrid errors. Some fail with error messages, others just say aborted: | |
ID: 34171 | Rating: 0 | rate: / Reply Quote | |