Message boards : News : acemdlong application 815 updated for Maxwell
Author | Message |
---|---|
The cuda60 version of application version 815 is now present on acemdlong for those of you with the latest driver or a Maxwell card. | |
ID: 35741 | Rating: 0
Getting a lot of failures on app 815(Cuda60) | |
ID: 35751 | Rating: 0
I realize the effort and focus of 8.15 and cuda60 is probably on Maxwell, but I just wanted to make sure you knew that WinXP seems to error out on every one of these tasks, while Win7/8 is OK. | |
ID: 35791 | Rating: 0
Jeremy, could you try a project reset on the affected XP machines, please? The error suggests the application's files haven't downloaded correctly. | |
ID: 35795 | Rating: 0
Different security architecture? | |
ID: 35796 | Rating: 0
Hi, | |
ID: 35802 | Rating: 0
Downgrade video driver and you won't get Cuda 6 | |
ID: 35804 | Rating: 0
"Downgrade video driver and you won't get Cuda 6" That is a way around it, but surely there must be a bug in the application that is causing the failures on Windows XP machines. Some participants prefer the performance advantage of XP for crunching GPUGRID. | |
ID: 35805 | Rating: 0
Matt, | |
ID: 35807 | Rating: 0
Thanks for the error message. Should be enough to track down the problem. I've disabled the acemdlong cuda60 app for the moment. | |
ID: 35808 | Rating: 0
Would it be possible to add a checkbox to the 'Run only the selected application' named 'Cuda 6' and one named 'Cuda < 6' ? | |
ID: 35809 | Rating: 0
I got my first cuda60 tasks on March 19 and all of them were calculated flawlessly. About 20+ have already completed okay. | |
ID: 35811 | Rating: 0
"Would it be possible to add a checkbox to the 'Run only the selected application' named 'Cuda 6' and one named 'Cuda < 6'?" I agree. However, it looks like they want to use CUDA 6.0 on all the new work, so that would be only a temporary solution. | |
ID: 35812 | Rating: 0
Try acemdbeta 812 or acemdshort 813 - these should fix the crash on 32bit Win XP. | |
ID: 35813 | Rating: 0
Will this also be compatible with WinXP 64-bit? According to the leader list, the top machines are using it. | |
ID: 35814 | Rating: 0
Yep | |
ID: 35815 | Rating: 0
Hi, | |
ID: 35818 | Rating: 0
Update - rebooted and performed a project reset to no avail. | |
ID: 35825 | Rating: 0
Try the new version 820 | |
ID: 35827 | Rating: 0
I have just completed my first 8.20 short run on a GTX 660 under WinXP 32-bit. Name 990x-SANTI_MAR423cap310-45-84-RND7785_1 | |
ID: 35834 | Rating: 0
I can confirm. 8.20 cuda 6 short runs: ran one each on a GTX770 and a GTX780 under WinXP Pro 64-bit with the 335.28 driver, and all went fine. GPU run time was only slightly longer than observed with 5.5 and the 332.21 driver. | |
ID: 35841 | Rating: 0
"Try the new version 820" Moved one of the XP Pro32 machines back to 335.28 last night and it just completed an 8.20 Cuda60 WU this morning. This one was a short on a GTX460. http://www.gpugrid.net/result.php?resultid=8012977 I'll move the other XP machine back to 335.28 now. Thank you. | |
ID: 35849 | Rating: 0
"PS: I have this wild dream that one day Nvidia comes out with a new driver and it actually shortens GPU run time! DARE TO DREAM :)" That doesn't really happen with compute applications the way it does with games. I suspect that the larger performance increases boasted about for individual games come from shipping manually-optimised versions of a game's shader kernels along with the driver. For compute applications (for ours at least), that's not possible. We'll only get a benefit from the driver if it addresses system-level issues; for example, the launch overhead of WDDM. Matt | |
ID: 35851 | Rating: 0
Finally got some of the new 820(CUDA60) units on my XP32 cruncher and they started ok without erroring out :-) | |
ID: 35872 | Rating: 0
Update on 820 Cuda60 on XP Pro32 | |
ID: 35874 | Rating: 0
I'm still getting the occasional v8.15 allocated to my Maxwell - which is a bit of a waste for all concerned. | |
ID: 35882 | Rating: 0
Don't know what's going wrong there. Try a project reset? Matt | |
ID: 35886 | Rating: 0
Same here: my 750 Ti is getting Cuda55 long work units and they all error out... the short units are Cuda60 and are successful. A few days ago I got some long run Cuda60 units which were also successful, but I'm not getting those anymore. | |
ID: 35887 | Rating: 0
Oh, you are talking about long. The app isn't on acemdlong yet. Not until I'm happy with the results I get on short. | |
ID: 35888 | Rating: 0
"I'm still getting the occasional v8.15 allocated to my Maxwell" I don't see how that would help? I started testing the card with short run tasks, then switched to long tasks when the cuda60 app was made available. I got a long task error when the highest version dropped back to cuda55, so I switched to short tasks again. Mostly, the short tasks have been tagged as cuda60 and have run successfully. It's just that single short task allocated at 10:47 this morning which failed because of the cuda55 tagging: since then, I've completed one cuda60 task without error, and I'm in the middle of another one now. All tasks for host 170387 Which plan_class a task is allocated to use (cuda42, cuda55, cuda60) isn't something either we or our BOINC clients can control (though some wish....). Currently, it's a server decision only, which is why I drew attention to it. Referring to Specifying plan classes in XML, it looks as if you may need to filter the lower plan_classes (cuda42, cuda55) with Fields for NVIDIA GPU apps: or something like that - I'm not 100% certain of my translation of "<5.0" into MMmm format. | |
ID: 35889 | Rating: 0
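The "<5.0" to MMmm translation mentioned in the post above can be sketched as follows. This is only an illustration under the assumption that MMmm means major version times 100 plus minor version (so compute capability 5.0 becomes 500); the function names are hypothetical and are not part of BOINC or the project's scheduler.

```python
def compcap_to_mmmm(version: str) -> int:
    """Convert a compute capability string such as "5.0" to MMmm form.

    Assumption: MMmm = major * 100 + minor, so "5.0" -> 500 and "3.5" -> 305.
    """
    major, _, minor = version.partition(".")
    return int(major) * 100 + int(minor or 0)


def allowed_below(compcap: str, limit: str = "5.0") -> bool:
    """Hypothetical filter: True if a card's compute capability is below
    the limit, i.e. the card is a pre-Maxwell GPU when the limit is 5.0."""
    return compcap_to_mmmm(compcap) < compcap_to_mmmm(limit)


print(compcap_to_mmmm("5.0"))   # Maxwell (e.g. a 750 Ti)
print(allowed_below("3.0"))     # a Kepler card would pass the "<5.0" filter
print(allowed_below("5.0"))     # a Maxwell card would not
```

Under that assumption, restricting cuda42 and cuda55 to values below 500 would keep those plan classes away from Maxwell cards while leaving older cards unaffected.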
I've had 6 App820(CUDA60) units successfully complete on my XP32 cruncher... Looking good!!! | |
ID: 35890 | Rating: 0
Agree; 820 now looking good on win x86 - gtx 650ti | |
ID: 35892 | Rating: 0
"I'm still getting the occasional v8.15 allocated to my Maxwell - which is a bit of a waste for all concerned." This is still an issue. I only just attached to the project and my first unit was cuda55, which errored. My 2nd unit is cuda60. Task 8203849 | |
ID: 35980 | Rating: 0
Should we detach until this is fixed? Half my units are erroring due to 8.15. | |
ID: 36009 | Rating: 0
I guess no one cares enough. I'll detach, back to PrimeGrid with me. | |
ID: 36062 | Rating: 0
I would keep running the WU's, but I would abort any 5.5 I see downloading. The WU's that fail do so very quickly but there is a lot of wasted Internet usage. | |
ID: 36063 | Rating: 0
"I would keep running the WU's, but I would abort any 5.5 I see downloading. The WU's that fail do so very quickly but there is a lot of wasted Internet usage." And, at the moment, there is a lot of idle time due to the slow server ... | |
ID: 36065 | Rating: 0
It would be incorrect to say I understand why the scheduler is fickle in giving out cuda55 and cuda60 to Maxwell machines, but I think I can see how to stop it happening. Scheduler restart shortly. | |
ID: 36067 | Rating: 0
One day, my heterogeneous PC will have: 1 Maxwell GPU, 1 Kepler GTX 660 Ti, and 1 Fermi GTX 460... while running the latest NVIDIA drivers. I hope that, whatever scheduler changes you make, will still allow for heterogeneous scenarios such as mine. | |
ID: 36068 | Rating: 0
I think the scheduler is written to 'test' sub-optimal plan_classes randomly, to avoid a client getting stuck indefinitely in a one-off rogue allocation. | |
ID: 36070 | Rating: 0
Should be fixed now. | |
ID: 36074 | Rating: 0
I don't receive any long tasks on my 750 Ti. Should I be getting them ? | |
ID: 36238 | Rating: 0
short and long.. | |
ID: 36239 | Rating: 0
@Mumak, I haven't received any either (Ubuntu 12.04.4, GTX750Ti) for about two days, but the server status always shows only 15, 21, or 27 available, so I ***guess*** there are simply far fewer long runs available than there is demand. | |
ID: 36262 | Rating: 0
Doesn't seem that long cuda60 WUs are being sent out any more. Any particular reason? Has the bug with the cuda55 app that causes WU crashes when a machine is rebooted or BOINC is restarted been fixed? | |
ID: 36389 | Rating: 0
My research indicates that I am still being given 8.15 apps that still infuriatingly crash. It seems that they never updated the non-cuda6 applications to the fixed 8.20 version :( | |
ID: 36402 | Rating: 0
Please try suspending all work first and then rebooting. That works for me: no errors when I start all projects again after booting. I am only getting 8.15 apps, as I haven't upgraded my 331.82 drivers yet. | |
ID: 36436 | Rating: 0
That's correct for Linux. Too many clients were not correctly reporting the Nvidia driver version, which makes correct scheduling difficult. Matt | |
ID: 36437 | Rating: 0
For Windows, I am still regularly getting 8.15 tasks. It's almost as if the non-cuda60 app versions were not rebuilt for 8.20. Any hopes of seeing it get fixed (since the 8.15's have the restart/resume bug?) | |
ID: 36443 | Rating: 0
"For Windows, I am still regularly getting 8.15 tasks. It's almost as if the non-cuda60 app versions were not rebuilt for 8.20. Any hopes of seeing it get fixed (since the 8.15's have the restart/resume bug?)" You can always check the current build status on the applications page. We do appear to be in a transitional state at the moment, with a number of imbalances between the long and short queues again. | |
ID: 36447 | Rating: 0
To clarify what I meant: There was an 8.20 cuda60 app on the Long Queue (proof pasted below), but now that app is gone, leaving only the buggy 8.15 apps. The applications page doesn't even indicate 8.20 on Long at all, and is a bit misleading. | |
ID: 36448 | Rating: 0
Worse yet, I think I managed, at some point, to get an 8.20 task to error out with the "The file exists. (0x50) - exit code 80 (0x50)" error I had been seeing with the 8.15's. Saddening and maddening. | |
ID: 36449 | Rating: 0
At present the demand for all types of WU outstrips supply (see server status). The project's current GigaFLOPs is 1,359,099. With 3450 GPU WU's in the wild, and a maximum of 2 per GPU, that means there are at least 1725 GPU's attached to the project. A mere 1,401 CPU WU's is even more limiting. Clearly the project is struggling to maintain WU supply, never mind honing the apps, developing new research models and introducing server side fixes. | |
ID: 36458 | Rating: 0
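The GPU estimate in the post above is a lower bound: if every GPU holds the maximum of two work units, 3450 outstanding work units need at least 3450 / 2 GPUs. A minimal sketch of that arithmetic (the function name is hypothetical):

```python
import math


def min_gpus(outstanding_wus: int, max_wus_per_gpu: int) -> int:
    """Smallest number of GPUs that could hold the outstanding work units
    if every GPU held its maximum allocation."""
    return math.ceil(outstanding_wus / max_wus_per_gpu)


print(min_gpus(3450, 2))
```

Any GPU holding fewer than two work units pushes the real count higher, which is why the figure is a minimum rather than an exact count.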
Thanks skgiven, | |
ID: 36464 | Rating: 0
jacob, | |
ID: 36466 | Rating: 0
Hurray!! Thanks!! [I'll be sure to test the normal scenarios of exiting/restarting BOINC, and suspending/resuming tasks without restarting... as I rely on those scenarios all of the time!] | |
ID: 36468 | Rating: 0
"There will be an update for the older versions of the windows application coming tomorrow." Received one GERRARD cuda60 long WU about 4 hours ago. Hopeful that the app results will be good. The Maxwells will be happy and so will the rest of us :-) | |
ID: 36485 | Rating: 0
cuda-42 and cuda-55 are updated now. | |
ID: 36488 | Rating: 0
Thank you so much, Matt. I know this will help to improve stability, although I think there is still some lingering issue, even in 8.20. :) I'm glad we are moving forward. Is there any way you would consider including additional debug output in stderr.txt, especially during suspends/resumes, so we might be better able to figure out why a task might abruptly quit/exit with an error? | |
ID: 36490 | Rating: 0
Jacob, | |
ID: 36491 | Rating: 0
Actually, I have to put out another revision, as I inadvertently missed out G80 support in version 820 (yes, there are still enough GTX280s out there to care!). | |
ID: 36492 | Rating: 0
I'd want anything that would indicate WHY a task ended prematurely. I'd want it printed in stderr.txt. | |
ID: 36495 | Rating: 0
"Actually, I have to put out another revision, as I inadvertently missed out G80 support in version 820 (yes, there are still enough GTX280s out there to care!)." I know what you mean, but to be precise the G80's (CC1.0) are no longer supported. Ditto for the next incarnations, G92... (CC1.1). It's predominantly the GT200 (and mostly 55nm) models of the high end GF200 range that still just about, occasionally work (CC1.3). Even the CC1.2 cards no longer work AFAIK. ____________ FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help | |
ID: 36498 | Rating: 0
Quite right, G2xx/sm13 is what I meant. | |
ID: 36501 | Rating: 0
Apps updated to fix some numbering discrepancies and reintroduce (pointless) sm13 support. Version # is now 840 across all Windows applications. | |
ID: 36502 | Rating: 0
Matt, thanks for the support and effort you put in, it's much appreciated. | |
ID: 36504 | Rating: 0
"Quite right, G2xx/sm13 is what I meant." Every little contribution should be appreciated. This user, for example, will surely have suffered a bit :) http://www.gpugrid.net/hosts_user.php?userid=67028 No pun intended, your continuous dedication to the volunteers is very much appreciated and in my view unequalled in the distributed projects world. | |
ID: 36507 | Rating: 0
Mayday mayday mayday | |
ID: 36512 | Rating: 0
Same here - I have churned out 2 pages of failed v8.20's and 8.40's. Looks like it started with units sent around 1140 UTC. Wingmen are also erroring out. | |
ID: 36513 | Rating: 0
Same here! 8.40 does not work on w7 and driver 332.21. | |
ID: 36514 | Rating: 0
Same here. All 8.40 failing on Win7-64 with driver 331.82. | |
ID: 36516 | Rating: 0
Yes, don't know what's gone wrong here. I've reverted back to 815 for now. | |
ID: 36517 | Rating: 0
"Yes, don't know what's gone wrong here. I've reverted back to 815 for now." I'm still receiving the 8.40 Cuda 5.5 app. | |
ID: 36518 | Rating: 0
See if you can get a new 841. | |
ID: 36519 | Rating: 0
Me too! now as a Cuda 4.2. | |
ID: 36520 | Rating: 0
"See if you can get a new 841." I got an 8.15, and then an 8.41 on another host of mine. | |
ID: 36522 | Rating: 0
"See if you can get a new 841." The 8.41 is working on my GTX 680. | |
ID: 36524 | Rating: 0
Is 8.41 the same as 8.15? | |
ID: 36525 | Rating: 0
I put the 335.23 driver on my Win7-64 machine, and it is now crunching the 8.40 Cuda55 tasks. They are short and not long, but now crunching and not getting errors. Will leave this driver on for a while. I had a soft spot for the 331.82 driver. Was able to get the better utilization with 331.82 according to Precision as compared to the 335.23. | |
ID: 36526 | Rating: 0
841, 840 and 820 are all the same code, I'd just made a mistake in the deployment (calling a cuda60 version cuda55), which only showed up when it hit hosts that weren't cuda 6 capable. | |
ID: 36527 | Rating: 0
Hang in there, Matt. Thanks for the prompt response, for keeping us informed, and for moving forward with the apps. Much respect. | |
ID: 36528 | Rating: 0
Remember SWAN_SYNC? Next time I rev the app, I'll introduce the ability to force busy waiting, which ought to give best performance, at the cost of CPU. Matt | |
ID: 36529 | Rating: 0
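The trade-off behind "busy waiting, at the cost of CPU" can be illustrated with a small sketch. This is a generic illustration only, not the application's actual code: a timer thread stands in for the GPU finishing a kernel, and all names are hypothetical.

```python
import threading


def wait_for_completion(done: threading.Event, busy_wait: bool) -> int:
    """Wait until `done` is set and return how many times we polled."""
    polls = 0
    if busy_wait:
        # Spin: reacts as soon as the flag flips, but keeps a CPU core busy.
        while not done.is_set():
            polls += 1
    else:
        # Block: the OS wakes this thread when signalled; the core stays free.
        done.wait()
    return polls


def simulate(busy_wait: bool, work_seconds: float = 0.05) -> int:
    """Run one wait against a timer that stands in for GPU work."""
    done = threading.Event()
    timer = threading.Timer(work_seconds, done.set)
    timer.start()
    polls = wait_for_completion(done, busy_wait)
    timer.join()
    return polls


print("blocking polls:", simulate(busy_wait=False))
print("spinning polls:", simulate(busy_wait=True))
```

The blocking variant polls zero times and leaves the core idle, while the spinning variant polls continuously for the whole wait, which is the CPU cost referred to in the post.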
"Remember SWAN_SYNC? Next time I rev the app, I'll introduce the ability to force busy waiting, which ought to give best performance, at the cost of CPU." All thumbs up for providing it as an option, for those who really want or need it! MrS ____________ Scanning for our furry friends since Jan 2002 | |
ID: 36531 | Rating: 0
"Remember SWAN_SYNC? Next time I rev the app, I'll introduce the ability to force busy waiting, which ought to give best performance, at the cost of CPU." In the other thread you said 8.41 would already have it. I have such a CUDA 6 WU running under Win 8.1. I created a user or system environment variable called SWAN_SYNC, set it to 0 and rebooted. However, CPU usage remains at 1.4 - 1.8% (8 threads), so it's obviously not working yet. Did I do anything wrong? MrS ____________ Scanning for our furry friends since Jan 2002 | |
ID: 36558 | Rating: 0
Try setting SWAN_SYNC to 1. | |
ID: 36564 | Rating: 0
"Remember SWAN_SYNC? Next time I rev the app, I'll introduce the ability to force busy waiting, which ought to give best performance, at the cost of CPU." How do we set the environment variable for SWAN_SYNC? If that helps performance I might try updating to the latest beta driver and see how the 780Ti performs then. Edit: Sorry for spamming, I found it here: http://www.gpugrid.net/forum_thread.php?id=2123&nowrap=true#16463 I searched the forum but found nothing; a Google search turned up the answer. ____________ Greetings from TJ | |
ID: 36566 | Rating: 0
"Remember SWAN_SYNC? Next time I rev the app, I'll introduce the ability to force busy waiting, which ought to give best performance, at the cost of CPU."
It is working for me, sort of, on my GTX 660 Ti, using the 8.41 cuda60 app. But it seems it is bugged - read on. Note: You don't have to restart Windows in order to make this work. I'm using Process Explorer to monitor the CPU usage.
- If I close BOINC, go to Control Panel -> System -> Advanced -> Environment Variables, set a System variable SWAN_SYNC to value 0, and restart BOINC, I see the process use a full core.
- If I close BOINC, go to Control Panel -> System -> Advanced -> Environment Variables, set a System variable SWAN_SYNC to value 1, and restart BOINC, I see the process use a full core.
- If I close BOINC, remove the System variable SWAN_SYNC, and restart BOINC, I see the process use a partial (approximately 1/12th of a) core.
This leads me to believe the feature is bugged. I would have thought the application would act differently when the variable is set to 1 vs. set to 0. It's only acting differently based on the EXISTENCE of the variable, not the SETTING of the variable. Bug? Now I get to stay up all night, contemplating whether I want to actually use this. Sigh. Thanks for making it an option at least. We appreciate options, we really do. | |
ID: 36569 | Rating: 0
I was so happy receiving just cuda55/cuda42 wus for the GTX660Ti and now I'm receiving cuda60 units that are all failing since my driver is old but was perfect for my cards and ubuntu system. | |
ID: 36570 | Rating: 0
Hello guys, | |
ID: 36571 | Rating: 0
I had SWAN_SYNC=0 set as a User Variable, from way back, and it worked as soon as I used the 8.41 app version. It was the case that it should be set as an environment variable and should be set to 0, but it used to work when set to other numbers, including 1. However, I remember Gianni or Toni wasn't happy with it being set to other numbers - don't know why? | |
ID: 36573 | Rating: 0
skgiven, | |
ID: 36575 | Rating: 0
"In the other thread you said 8.41 would already have it. I have such a CUDA 6 WU running under Win 8.1. I created a user or system environment variable called SWAN_SYNC, set it to 0 and rebooted. However, CPU usage remains at 1.4 - 1.8% (8 threads), so it's obviously not working yet. Did I do anything wrong?" Guys, sorry for the confusion. After sleeping on it I changed nothing and came back just to find SWAN_SYNC working, making the app use a full core again. Can't quantify any performance gains yet, though. Edit: for a statistically insufficient sample size of "1" I saw no change in performance (GTX660Ti, 335.23, Win 8.1, CPU not completely saturated). Which makes sense, since otherwise I would have stuck to earlier drivers. MrS ____________ Scanning for our furry friends since Jan 2002 | |
ID: 36580 | Rating: 0
Have had SWAN_SYNC set to 0 on all my machines. Also running with the <cpu_usage>1.0</cpu_usage> line in my app_config. Woke up today to see each machine has completed tasks on the 335 driver (XP and Win7) where CPU time is nearly equal to GPU time. | |
ID: 36588 | Rating: 0
All tasks v8.21(cuda60) are failing in my Linux host with driver 304.88. My cards are GTx 660Ti. | |
ID: 36595 | Rating: 0
See the "Important news for Linux crunchers" post in the News. | |
ID: 36596 | Rating: 0
Thanks, major issue I see. no time for it now. | |
ID: 36599 | Rating: 0
Thank you skgiven, | |
ID: 36602 | Rating: 0
So now it seems established that SWAN_SYNC reserves a whole CPU core. But is it any faster? If so, how much? | |
ID: 36608 | Rating: 0
It's very easy to test, just close BOINC, create or remove the System Variable, and restart BOINC. | |
ID: 36610 | Rating: 0
"It's very easy to test, just close BOINC, create or remove the System Variable, and restart BOINC." Thanks for the reply. I try to load the CPUs up to the point just before GPU performance starts degrading. At the moment the CPU tasks I'm running have long periods between checkpoints, so restarting BOINC is not something I want to do. A 3% speedup doesn't sound like much if sacrificing a CPU task is the result. Since WU types show large differences in GPU usage, I wonder if SWAN_SYNC would have widely varying results? | |
ID: 36612 | Rating: 0
"So now it seems established that SWAN_SYNC reserves a whole CPU core. But is it any faster? If so, how much?" In my case (GTX660Ti, 335.23, Win 8.1, CPU not completely saturated) I am not seeing any performance increase due to setting SWAN_SYNC, whereas something like 3% should have been visible during the 4 WUs I've crunched with this setting now. Switching back. Generally the benefit should increase if CPU interaction is needed more often, which happens for smaller molecules / systems and for faster cards. If anyone profits from this it's going to be high-end GK110 users first. MrS ____________ Scanning for our furry friends since Jan 2002 | |
ID: 36614 | Rating: 0
"So now it seems established that SWAN_SYNC reserves a whole CPU core. But is it any faster? If so, how much?"
I've experimented with some settings, and it seems that the latest drivers and CUDA 6.0 tasks are pretty fast without SWAN_SYNC set. The gain depends on many factors:
1. The GPU: high-end GPUs (GTX 660Ti, 670, 680, 760, 770, 780, 780Ti, Titans) can gain a little; lesser GPUs can gain less.
2. Operating system: Windows XP is faster than other versions of Windows, but it can still be up to 3% faster with SWAN_SYNC set.
3. The type of the workunit: some workunits use more CPU and utilize the GPU less, and these can gain more by setting SWAN_SYNC (I'm using WinXPx64).
4. The speed and saturation of the CPU cores: the lower the CPU usage, the higher the GPU utilization. It also depends on the CPU app. It is good to know that hyperthreading means that 1 core can handle 2 threads, but these 2 threads stay out of each other's way only as long as they don't try to simultaneously access the same resource (e.g. the FPU) of the core they are running on.
5. The CPU affinity of the tasks: the GPUGrid application can gain up to 3% if it runs on the same thread of the CPU all the time (and no other application is using the same core). | |
ID: 36651 | Rating: 0
So, would you say that the following is true, regarding GPUGrid SWAN_SYNC: | |
ID: 36652 | Rating: 0
It really depends on what you want to prioritise and your system. The number of GPU's you have is important as it will apply to them all, and of course what type they are. | |
ID: 36653 | Rating: 0
"So, would you say that the following is true, regarding GPUGrid SWAN_SYNC:" Yes.
"- If you are only working on the GPUGrid project, then use it" Yes.
"- If you are also working on other CPU projects, then do not use it" I would say you can use it, if you reduce the number of usable CPUs at least by the number of the GPUs in the system.
"Those are the guidelines I'd recommend, at least." | |
ID: 36655 | Rating: 0
"- If you are also working on other CPU projects, then do not use it" I've always hated that advice, because if GPUGrid runs out of work, then you're hopefully working on some other GPU project, but now you've unnecessarily taken out one or more CPUs. It's much better (in my opinion) to define an appropriate <cpu_usage> value in a GPUGrid app_config.xml file, instead of changing the "X% of the processors" setting (which is admittedly easier). To each their own. Options are indeed good. Your approach is good, though: if you are going to use SWAN_SYNC, then you should somehow make sure that, for each GPUGrid task that is actively running, you "budget" a full core to it. :) | |
ID: 36656 | Rating: 0
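The app_config.xml approach described above could look roughly like the output of the sketch below. It assumes the standard BOINC app_config.xml layout (`<app>`, `<name>`, `<gpu_versions>`, `<gpu_usage>`, `<cpu_usage>`) and uses "acemdlong" as the application name because that is how the queue is named in this thread; check the project's applications page for the actual short name before relying on it. The helper function is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, cpu_usage: float, gpu_usage: float = 1.0) -> str:
    """Build an app_config.xml string that budgets `cpu_usage` CPU cores
    and `gpu_usage` GPUs for each running task of `app_name`."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    versions = ET.SubElement(app, "gpu_versions")
    ET.SubElement(versions, "gpu_usage").text = str(gpu_usage)
    ET.SubElement(versions, "cpu_usage").text = str(cpu_usage)
    return ET.tostring(root, encoding="unicode")


# Budget one full core per GPU task, as suggested when SWAN_SYNC is in use.
print(build_app_config("acemdlong", cpu_usage=1.0))
```

The resulting file would go in the GPUGrid project directory, and the client needs to re-read its config files (or be restarted) before the new budget takes effect.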
Waited until I had a couple different WU's to compare the SWAN_SYNC impact. Wanted a low and high utilization WU to review. Image scales

| Num_Threads | GPU1 | GPU2 |
|---|---|---|
| 2 | 2.0 | 6.1 |
| 3 | 2.3 | 5.3 |
| 4 | 2.6 | 5.4 |
| 5 | 2.9 | 4.4 |
| 6 | 3.1 | 5.1 |
| 7 | 2.7 | 3.7 |
| 8 | 2.3 | 2.5 |

I was surprised to see higher variation and larger impacts to utilization with a WU which starts with a much lower utilization. | |
ID: 36682 | Rating: 0
Thanks, Jeremy! | |
ID: 36685 | Rating: 0
SWAN_SYNC on means setting it to 1? | |
ID: 36692 | Rating: 0
Doesn't matter what it is set to. It just needs to be set. | |
ID: 36694 | Rating: 0
What Matt means is: It doesn't matter what value it has; it only matters that the variable exists. To use it, just create a system variable called SWAN_SYNC, set it to some value (like 1, doesn't matter, may not even need a value, but just set it to 1 to be sure), then restart BOINC. | |
ID: 36695 | Rating: 0
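The behaviour described above (only the existence of the variable matters, not its value) corresponds to a presence check like the one sketched here. This models what posters observed with the 8.41 app; it is not taken from the application's source, and the function name is hypothetical.

```python
import os


def swan_sync_enabled(environ=None) -> bool:
    """Return True if SWAN_SYNC is defined at all, whatever its value."""
    env = os.environ if environ is None else environ
    return "SWAN_SYNC" in env


# The value is ignored: "0" and "1" both count as enabled.
print(swan_sync_enabled({"SWAN_SYNC": "0"}))
print(swan_sync_enabled({"SWAN_SYNC": "1"}))
print(swan_sync_enabled({}))
```

This matches the earlier report that setting the variable to 0 or to 1 both produced full-core usage, while removing it restored the low CPU usage.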
Jeremy Zimmerman, more good work. | |
ID: 36701 | Rating: 0
"1. The GPU: high-end GPUs (GTX 660Ti, 670, 680, 760, 770, 780, 780Ti, Titans) can gain a little, lesser GPUs can gain less." I tried SWAN_SYNC on a couple of slower cards with a 128-bit memory bus: a 650TI and a 750TI. No noticeable difference in WU completion time even though the GPU utilization increased by a percent or so. The only real-world difference on those cards in Win7-64 was that they now grabbed a whole CPU core so that one less CPU WU could be run. SWAN_SYNC was a losing proposition at least on those GPUs and Win7-64. Edit: Decided to try SWAN_SYNC on a box with a 750Ti in a PCI 2.0 X4 slot. Will report back with results. | |
ID: 36741 | Rating: 0
"1. The GPU: high-end GPUs (GTX 660Ti, 670, 680, 760, 770, 780, 780Ti, Titans) can gain a little, lesser GPUs can gain less." Did an extended SWAN_SYNC test on 3 machines. Two showed no improvement and one yielded a 1 to 1.5% decrease in run time. All machines are also running an AMD GPU in PCIe slot 0 and 3-4 CPU WUs on Phenom X6 CPUs. SWAN_SYNC at least on these machines is definitely a waste of resources IMO. | |
ID: 36882 | Rating: 0
"Did an extended SWAN_SYNC test on 3 machines. Two showed no improvement and one yielded a 1 to 1.5% decrease in run time. All machines also are running an AMD GPU in PCIe slot 0 and 3-4 CPU WUs on Phenom X6 CPUs. SWAN_SYNC at least on these machines is definitely a waste of resources IMO." I do agree. SWAN_SYNC can make the crunching a little bit faster only under Windows XP. (My previous post wasn't that straightforward about this.) I assume you did your tests on your computers under Windows 7 (x64). It is known that OSes more recent than Windows XP have a new Windows Display Driver Model which makes the OS more stable, but it comes with an overhead, which makes crunching on the GPU slower, and this overhead makes the gain from SWAN_SYNC negligible. However, the recent Windows 7 (8, Vista) drivers are faster than the older (CUDA 3.1) versions. One of my hosts (using Windows XP x64) has been doing some unintended testing of SWAN_SYNC, as this host sometimes receives CUDA4.2 tasks, which don't use SWAN_SYNC. This comparison is not fully adequate, as I'm comparing CUDA6.0 tasks to CUDA4.2 tasks, and the CUDA 6.0 app is a little bit faster on its own. This host has two GTX780Ti's: the faster one (3500MHz RAM clock) is in a PCIe3.0x16 slot, and the slower one (2700MHz RAM clock) is in a PCIe2.0x4 slot.
NOELIA_BI_3 workunits:
- Faster GPU: without SWAN_SYNC: 17.353, 17.243 (+5.26%); with SWAN_SYNC: 16.483, 16.384
- Slower GPU: without SWAN_SYNC: 18.435, 18.426, 18.382, 18.373 (+8.56%); with SWAN_SYNC: 16.992, 16.958, 16.925, 16.935
SDOERR_BARNA5 workunits:
- Faster GPU: without SWAN_SYNC: 16.041, 16.060 (+6.5%); with SWAN_SYNC: 15.104, 15.045
- Slower GPU: without SWAN_SYNC: 16.980, 16.975 (+9.2%); with SWAN_SYNC: 15.545, 15.550
GERARD_A2ARNUL_adapt3 workunits:
- Slower GPU: without SWAN_SYNC: 15.685,
GERARD_A2ART4E_adapt workunits:
- Faster GPU: without SWAN_SYNC: 10.977, 10.966 (+6.2%); with SWAN_SYNC: 10.328, 10.324
SANTI_marsalWTbound2 workunits:
- Slower GPU: without SWAN_SYNC: 18.686 (+11.3%); with SWAN_SYNC: 16.781
- Faster GPU: with SWAN_SYNC: 15.586
NATHAN_RPS1_adapt5 workunits:
- Slower GPU: without SWAN_SYNC: 14.415 (+6.9%); with SWAN_SYNC: 13.484
- Faster GPU: without SWAN_SYNC: 13.387 (+4%); with SWAN_SYNC: 12.862 | |
ID: 36887 | Rating: 0
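The percentages in the post above appear to be the relative difference between the mean run time without SWAN_SYNC and the mean run time with it. The sketch below assumes that interpretation (the post does not state the formula) and reproduces the two NOELIA_BI_3 figures from the posted run times; the function name is hypothetical.

```python
from statistics import mean


def percent_slower_without(without_sync, with_sync) -> float:
    """How much longer (in percent) the mean run without SWAN_SYNC takes,
    relative to the mean run with SWAN_SYNC."""
    return (mean(without_sync) / mean(with_sync) - 1.0) * 100.0


# NOELIA_BI_3 run times as posted (units as given in the post).
faster_gpu = percent_slower_without([17.353, 17.243], [16.483, 16.384])
slower_gpu = percent_slower_without(
    [18.435, 18.426, 18.382, 18.373],
    [16.992, 16.958, 16.925, 16.935],
)
print(f"Faster GPU: +{faster_gpu:.2f}%")
print(f"Slower GPU: +{slower_gpu:.2f}%")
```

With only one to four samples per case, these figures are indicative rather than statistically solid, and the comparison also mixes CUDA 4.2 and CUDA 6.0 apps, as the post itself notes.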