Author |
Message |
Beyond
Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 Level
Scientific publications
|
Saw some new NATHAN WUs coming through a couple hours ago and said OH BOY because the last NATHANs ran so well on all my cards. So I dumped the other projects from my GTX 460/768MB GPUs and grabbed the new WUs. Not sure if it's that they won't work properly on < 1GB or what, but they are SLOW and the projected time is well over 24 hours on all 4 GPUs. What happened? The previous NATHANs ran in 10-11 hours on the GTX 460/768MB. Ouch :-( |
|
|
|
Hm, I will need around 14 hours on 570s with min. 96% GPU load. So these are really long units ^^
____________
DSKAG Austria Research Team: http://www.research.dskag.at
|
|
|
Mumak
Joined: 7 Dec 12 Posts: 92 Credit: 225,897,225 RAC: 0 Level
Scientific publications
|
I'm currently running one such on 660 Ti - 75% after 10 hours. |
|
|
skgiven (Volunteer moderator, Volunteer tester)
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level
Scientific publications
|
Long runs (8-12 hours on fastest card)
Presently, the fastest single GPU card is the GTX680.
On XP the GTX680 takes the following times for different Long WU's:
SDOERR_2HDQd ~25,700sec (7.1h) 135,000
NOELIA_klebe ~27,500sec (7.6h) 127,000 (yes, there is a slight credit disparity, but you could argue that the SDOERR WU's are the ones that overpay)
NATHAN_KIDc22 ~36,500sec (10.1h) 167,550 (smack in the middle of the 8-12h estimate)
NATHAN_dhfr36 ~13,600sec (3.8h) 70,800 {these are the old NATHAN WU's and would normally have been in the short queue if there wasn't a shortage of WU's}
____________
FAQ's
HOW TO:
- Opt out of Beta Tests
- Ask for Help |
|
|
Mumak
Joined: 7 Dec 12 Posts: 92 Credit: 225,897,225 RAC: 0 Level
Scientific publications
|
My results on 660 Ti (Boost @ 1228 MHz):
NATHAN_dhfr36: 19,700 s
SDOERR_2HDQd: ~36,700 s
NOELIA_klebe_run2: ~39,300 s
NATHAN_KIDc22: ~48,100 s |
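A rough way to put these 660 Ti times next to the GTX680 figures posted earlier is to divide one by the other. This is only a sketch: it assumes NOELIA_klebe_run2 is comparable to NOELIA_klebe, and it ignores OS and clock differences between the two machines. The dictionary and function names are made up for illustration.

```python
# Run times in seconds, copied from the posts above.
gtx680_xp = {
    "NATHAN_dhfr36": 13_600,
    "SDOERR_2HDQd": 25_700,
    "NOELIA_klebe": 27_500,   # assumed comparable to NOELIA_klebe_run2
    "NATHAN_KIDc22": 36_500,
}
gtx660ti = {
    "NATHAN_dhfr36": 19_700,
    "SDOERR_2HDQd": 36_700,
    "NOELIA_klebe": 39_300,
    "NATHAN_KIDc22": 48_100,
}


def slowdown(task: str) -> float:
    """How many times longer the 660 Ti takes than the GTX680 on one task."""
    return gtx660ti[task] / gtx680_xp[task]


ratios = {task: slowdown(task) for task in gtx680_xp}

for task, ratio in ratios.items():
    hours = gtx660ti[task] / 3600
    print(f"{task:15s} {hours:5.1f} h on 660 Ti, x{ratio:.2f} vs GTX680")
```

On these numbers the 660 Ti takes roughly 1.3 to 1.45 times as long as the GTX680, with the new NATHAN_KIDc22 showing the smallest gap.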
|
|
klepel
Joined: 23 Dec 09 Posts: 189 Credit: 4,736,973,079 RAC: 579,440 Level
Scientific publications
|
My first result on EVGA GTX 670 SC @ AMD 8150 (One Core reserved for the GPU):
http://www.gpugrid.net/result.php?resultid=6892235
I13R6-NATHAN_KIDc22_2-0-8-RND0204_0
Time GPU: 52042.51
Time CPU: 51079.20
It seems to me quite long in comparison to other times posted.
On the other hand the new task on the same machine:
http://www.gpugrid.net/result.php?resultid=6892718
I60R3-NATHAN_KIDc22_2-0-8-RND5457_0 (Advanced 36% of total task, 5 h 7 m)
In EVGA OC Scanner X, GPU power sits at ca. 96% TDP after a restart; when I checked this morning it was at ca. 64% TDP. |
|
|
Beyond
Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 Level
Scientific publications
|
Presently, the fastest single GPU card is the GTX680.
On XP the GTX680 takes the following times for different Long WU's:
NATHAN_KIDc22 ~36,500sec (10.1h) 167,550 (smack in the middle of the 8-12h estimate)
Realistically, how many real-world people are running the fastest GPU on the fastest but now unsupported (arguably obsolete) operating system? Only a few crunchers specializing in this one project. When they finally get the Titan running, are we going to see another huge leap in WU times? Wouldn't it be better for total work speed/throughput to include more crunchers and keep them happier? Keeping the WU sizes reasonable and perhaps relaxing the 24hr time a bit would go a long way towards doing that IMO. |
|
|
skgiven (Volunteer moderator, Volunteer tester)
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level
Scientific publications
|
Linux is the fastest OS, then XP. 11 of the top 20 machines use Linux or XP, so many of the top/elite crunchers appear to go out of their way to accommodate the project (and their credits). Perhaps this will continue.
The Linux/XP share drops when you expand to the top 250 systems: it's 65/250, so 26%. However, it stays at about that level even past the 1000th system; from 1000 to 1100 it's 27% Linux and XP.
I have no doubt XP usage will continue to fall with time, but Linux seems to be picking up a lot of the slack (vaguely remember looking at this in the past).
Crunchers outside the project elite tend to use the less productive operating systems, and this continues through to the occasional cruncher and to crunchers with bad setups or entry-level cards.
These lesser systems still belong to crunchers, but how much work they can do, how important it is to accommodate them, and how much effort that takes are questions the project team has to keep asking itself.
Ideally we could all crunch what we wanted, without problems, but this is more of a hands-on project that is always changing. While I don't have the answers (just opinions), I do have some data that's worth noting:
GPUGrid's total daily RAC is 191,926,972.
The top 20 systems contribute 20,507,084 (just over 10%)
The top 20 crunchers contribute 43,518,798 (22.7%)
The top 100 crunchers contribute 92,956,103 (48%)
The 1000th cruncher has a RAC of ~22,000 and by the time you get to 2000, the RAC is only ~1200. You could get a RAC of >22K by just running one long WU every 6th day.
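For anyone who wants to check the shares, here is a small sketch using only the RAC figures quoted above. The "one long WU every 6th day" line is approximated as a simple average (credit divided by days); real RAC is a decaying average, so treat that part as a ballpark. The 167,550 credit is the NATHAN_KIDc22 award mentioned earlier in the thread.

```python
# Daily RAC figures quoted in the post above.
total_rac = 191_926_972
groups = {
    "top 20 systems": 20_507_084,
    "top 20 crunchers": 43_518_798,
    "top 100 crunchers": 92_956_103,
}

# Share of the project's total RAC, in percent.
shares = {name: 100 * rac / total_rac for name, rac in groups.items()}

for name, pct in shares.items():
    print(f"{name:18s} {pct:5.1f}% of total RAC")

# Ballpark RAC from one long WU every 6th day, treated as a plain average
# (an assumption; BOINC's RAC is an exponentially decaying average).
credit_per_long_wu = 167_550
steady_rac = credit_per_long_wu / 6
print(f"one long WU per 6 days: ~{steady_rac:,.0f} credits/day")
```

With the figures quoted, the shares come out at roughly 10.7%, 22.7% and 48.4%, and one long WU every six days averages about 27,900 credits a day.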
There are two ways to look at this data, depending on human resources: either we don't have a big enough team to accommodate everybody, so it's best to concentrate on those that help the most, or our team is big enough to develop and expand to better accommodate more crunchers and new crunchers.
I doubt that the researchers will immediately go by the non-mainstream GK110 Titan when setting task steps against complexity (which equates to task duration on the top GPU), though it is a GeForce GPU. At present the number of crunchers attached to this project with a Titan is likely to be in single figures. It would make more sense to go by the GK104 GTX680 and then the GK104 GTX770 (which might be 10 or 20% faster), once plenty of crunchers have that GPU.
|
|
Beyond
Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 Level
Scientific publications
|
On XP the GTX680 takes the following times for different Long WU's:
SDOERR_2HDQd ~25,700sec (7.1h) 135,000
NOELIA_klebe ~27,500sec (7.6h) 127,000 (yes, there is a slight credit disparity, but you could argue that the SDOERR WU's are the ones that overpay)
NATHAN_KIDc22 ~36,500sec (10.1h) 167,550 (smack in the middle of the 8-12h estimate)
NATHAN_dhfr36 ~13,600sec (3.8h) 70,800 {these are the old NATHAN WU's and would normally have been in the short queue if there wasn't a shortage of WU's}
Then again, maybe the SDOERR WUs are actually longer than the NOELIAs but run more efficiently. Looking at my GPUs the NOELIA WUs are running at 88-89% usage and the SDOERR WUs are running at 92-94% usage. |
|
|
skgiven (Volunteer moderator, Volunteer tester)
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level
Scientific publications
|
Then again, maybe the SDOERR WUs are actually longer than the NOELIAs but run more efficiently. Looking at my GPUs the NOELIA WUs are running at 88-89% usage and the SDOERR WUs are running at 92-94% usage.
That's the case; the more complex the model the less efficient it becomes. Of course the credit argument could be expanded to entail WU complexity, or lack of. Anyway, their model of assigning credit is reasonably accurate, ~14% difference throughout the range of long WU's, which includes 4 different research lines/models.
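The ~14% figure can be reproduced from the GTX680 table quoted earlier in the thread by comparing credit per hour across the four long WU types. This is just a sketch of that arithmetic; the variable names are made up for illustration.

```python
# (run time in seconds, credit) on a GTX680 under XP, from the table above.
long_wus = {
    "SDOERR_2HDQd": (25_700, 135_000),
    "NOELIA_klebe": (27_500, 127_000),
    "NATHAN_KIDc22": (36_500, 167_550),
    "NATHAN_dhfr36": (13_600, 70_800),
}

# Credit earned per hour of GPU time for each WU type.
credit_per_hour = {
    name: credit / (seconds / 3600) for name, (seconds, credit) in long_wus.items()
}

best = max(credit_per_hour, key=credit_per_hour.get)
worst = min(credit_per_hour, key=credit_per_hour.get)

# Relative gap between the best- and worst-paying WU types.
spread = credit_per_hour[best] / credit_per_hour[worst] - 1

for name, rate in sorted(credit_per_hour.items(), key=lambda kv: -kv[1]):
    print(f"{name:15s} {rate:8,.0f} credits/h")
print(f"spread: {spread:.1%} ({best} vs {worst})")
```

On these numbers SDOERR_2HDQd pays best per hour and NATHAN_KIDc22 worst, with a gap of about 14%.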
|
|
Beyond
Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 Level
Scientific publications
|
Or they could award credits alphabetically. |
|
|
matlock
Joined: 12 Dec 11 Posts: 34 Credit: 86,423,547 RAC: 0 Level
Scientific publications
|
My last NATHAN_KIDc22 WU completed in 48,862 seconds:
http://www.gpugrid.net/workunit.php?wuid=4476442
This is with an Asus GTX 660 and Linux using the 304.88 drivers.
From looking at other results with Win7, it seems you would need a 660 Ti to get this running time.
|
|
|
skgiven (Volunteer moderator, Volunteer tester)
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level
Scientific publications
|
XP is 11% faster than Vista, W7 and probably W8 (though I haven't measured it).
Linux is faster than XP (up to 5% if it's set up correctly, though it might vary a lot depending on the system).
I've seen a GTX660Ti on Linux that is 25% faster than my FOC GTX660Ti on W7.
|
|
Stefan (Project administrator, Project developer, Project tester, Project scientist)
Joined: 5 Mar 13 Posts: 348 Credit: 0 RAC: 0 Level
Scientific publications
|
skgiven, do you maybe have an idea if the performance difference is project dependent? So do other projects (and other gpu projects) also run that much faster on XP, in which case it would be an OS issue (I know that Win 7 is in general a bit slower than XP), or does this only happen with GPUgrid? |
|
|
Beyond
Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 Level
Scientific publications
|
skgiven, do you maybe have an idea if the performance difference is project dependent? So do other projects (and other gpu projects) also run that much faster on XP, in which case it would be an OS issue (I know that Win 7 is in general a bit slower than XP), or does this only happen with GPUgrid?
If I may jump in. Sometimes there is a small difference between XP and W7 on other projects, but generally not and the difference is not always in the favor of XP. Between Linux and W7: it's a mixed bag on other projects with W7 coming out on top as often as not. As far as the 25% listed above, no way IMO. You can generally tell the GPU efficiency by the percent of utilization (87-88% for me in W7-64 on NOELIA & NATHAN WUs). A 25% difference would most likely be due to something like an extreme OC on an exceptional card and perhaps good liquid cooling. Many other projects running CAL or CUDA run at 99% usage. OpenCL is often less and tends to need more CPU support than CUDA or CAL. |
|
|
skgiven (Volunteer moderator, Volunteer tester)
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level
Scientific publications
|
Beyond has covered the comparison to other projects.
The issue here is with the Windows Display Driver Model (WDDM) in Vista, W7, W8 and 2008+ servers. XP and Linux are not affected. It might be the case that there is even more of a gap now between XP and Linux when using CUDA 4.2.
From using MSI Afterburner it's obvious that Windows basically grabs a chunk of the display GPU's GDDR and keeps that for itself. It may control other GPU resources too. I don't know whether this is a fixed amount or depends on GPU model, CC or GDDR amount. It may be the case that if there are 2 GPUs in use, the one supporting the monitor underperforms, but you would really need two identical GPUs and careful WU monitoring to determine that. Perhaps using the on-die GPU for the display prevents this 11% loss, but I can't test that (you need a motherboard that supports it).
It was the Vista WDDM that introduced the idea of being able to restart the display driver, without restarting the OS.
http://msdn.microsoft.com/en-us/library/aa480220.aspx
As for relative performance, I have a reasonable factory overclock (FOC) on my GTX660Ti. Sometimes it runs at 1202MHz, but at present it's operating at 1189MHz. If Linux is 5% faster than XP and XP is 11% faster than W7, then that's 16.55% faster, so to get to 25% faster you just need to up that by 7% (1267 or 1280MHz, without a GDDR increase). Readily achievable using liquid cooling, and possible with a GDDR tweak on air, with a good GPU. Another possibility is that on Linux the GPU utilization and power target are higher; 95% GPU usage over 88% is an 8% increase.
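The compounding in that estimate can be written out as a short sketch. It assumes the OS speedups multiply and that throughput scales linearly with core clock, which is a simplification; the names are illustrative only.

```python
# Speedup factors quoted in the post above.
linux_over_xp = 1.05   # Linux up to 5% faster than XP
xp_over_w7 = 1.11      # XP 11% faster than W7

# Speedups compound multiplicatively: 1.05 * 1.11 = 1.1655 (16.55% faster).
linux_over_w7 = linux_over_xp * xp_over_w7

# Extra factor still needed to reach a card that is 25% faster overall.
target = 1.25
extra_needed = target / linux_over_w7

# Core clock needed if speed scaled linearly with clock (an assumption).
for base_clock_mhz in (1189, 1202):
    needed = base_clock_mhz * extra_needed
    print(f"{base_clock_mhz} MHz -> ~{needed:.0f} MHz")

# Alternative explanation: higher GPU utilization on Linux.
utilization_gain = 95 / 88
print(f"OS factor: {linux_over_w7:.4f}, extra needed: {extra_needed:.3f}, "
      f"utilization gain: {utilization_gain:.3f}")
```

Under those assumptions the extra factor is a little over 7%, which lands in the same range as the clocks mentioned above, and the 95% vs 88% utilization gap alone is worth about 8%.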
|
|
Mumak
Joined: 7 Dec 12 Posts: 92 Credit: 225,897,225 RAC: 0 Level
Scientific publications
|
Higher latency can be caused by WDDM, since drivers are split between kernel and user mode (unlike XP, where all drivers were in kernel only). But exact numbers depend on particular implementation (of CUDA especially).
Another performance loss might be caused by using the Aero interface. |
|
|
|
I suspect it can be called a difference of being "closer to the metal" with XP and Linux, whereas more abstraction in WDDM allows advanced features at the expense of performance.
SK wrote: the more complex the model the less efficient it becomes
I think it's the other way around. That's why GPU utilization on short queue tasks drops. The more atoms / pixels the task contains, the easier it is to keep more shaders busy concurrently.
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
|
Hi guys,
I have completed two NATHAN_KIDc22 WUs and wanted to ask you guys how you find my GTX 650Ti's times:
Received: 25 May 2013 11:18:10 UTC WU: I65R8-NATHAN_KIDc22_2-1-8-RND5776_0 Run Time: 81,090.74 Credit: 167,550.00
Received: 26 May 2013 09:50:00 UTC WU: I29R3-NATHAN_KIDc22_2-2-8-RND5999_2 Run Time: 81,103.17 Credit: 139,625.00
Note: Boinc 7.0.65 on Ubuntu 12.04 x86_64 with Nvidia driver 319.17
Looking at some more powerful cards' times, I believe the times are just about where they should be, am I correct?
Also, isn't the credit difference a little bit strange? The WUs look to be the same type and the run times are virtually the same, shouldn't the credit be the same as well? Except if they just adjusted the credit gain. |
|
|
flashawk
Joined: 18 Jun 12 Posts: 297 Credit: 3,572,627,986 RAC: 0 Level
Scientific publications
|
Received: 26 May 2013 09:50:00 UTC WU: I29R3-NATHAN_KIDc22_2-2-8-RND5999_2 Run Time: 81,103.17 Credit: 139,625.00
I've never seen that point award before for any NATHAN_KIDc22's; 135,000 is the amount I get for completing a SDOERR_2HDQd. That's strange, is it from the short queue perhaps? I haven't done any of those in quite some time.
Edit: I took a look at the wu and it was sent on the 25th, I thought you had still managed to complete it within the 24 hour bonus period, you made the 48 hour bonus though. |
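The two credit values fit a simple bonus model. The sketch below assumes a 50% bonus for results returned within 24 hours and 25% within 48 hours; those percentages are not stated in this thread, but they are consistent with the 167,550 and 139,625 awards posted above. The function name and the base credit are derived for illustration.

```python
BONUS_24H = 0.50  # assumed bonus for results returned within 24 hours
BONUS_48H = 0.25  # assumed bonus for results returned within 48 hours


def awarded_credit(base_credit: float, turnaround_hours: float) -> float:
    """Credit after applying the assumed turnaround bonus."""
    if turnaround_hours <= 24:
        return base_credit * (1 + BONUS_24H)
    if turnaround_hours <= 48:
        return base_credit * (1 + BONUS_48H)
    return base_credit


# Base credit implied by the full 24-hour award for NATHAN_KIDc22.
base = 167_550 / (1 + BONUS_24H)

print(f"implied base credit: {base:,.0f}")
print(f"returned in 23 h:    {awarded_credit(base, 23):,.0f}")
print(f"returned in 26 h:    {awarded_credit(base, 26):,.0f}")
```

Under that assumption the base credit is 111,700, and a result reported a couple of hours past the 24-hour mark gets exactly the 139,625 seen here.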
|
|
Mumak
Joined: 7 Dec 12 Posts: 92 Credit: 225,897,225 RAC: 0 Level
Scientific publications
|
My runtime on a 650Ti (slightly OCed) is 76,764s, credit 167,550.
On a 660Ti they complete in ~48,100s, credit is always the same. |
|
|
|
I've never seen that point award before for any NATHAN_KIDc22's; 135,000 is the amount I get for completing a SDOERR_2HDQd. That's strange, is it from the short queue perhaps? I haven't done any of those in quite some time.
Edit: I took a look at the wu and it was sent on the 25th, I thought you had still managed to complete it within the 24 hour bonus period, you made the 48 hour bonus though.
Yes, I figured that out myself after posting. I missed the 24h bonus by 2 hours. The WU had finished, but the client didn't report it in time. Maybe the 0.2+0.2 cache rule is a bit on the edge for my 650Ti. I adjusted it to 0.12+0.28, let's see how that goes.
The WU was from the long queue, I only take from the long queue. |
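To see how tight the 24-hour window is on a 650Ti, here is a rough slack calculation using run times posted in this thread. It assumes the clock starts when the WU is sent and stops when the result is reported, so any time a WU spends waiting in the cache eats into the slack; `wait_hours` is a hypothetical parameter for that.

```python
BONUS_WINDOW_HOURS = 24


def slack_hours(run_time_s: float, wait_hours: float = 0.0) -> float:
    """Hours left for download, queueing and reporting inside the bonus window."""
    return BONUS_WINDOW_HOURS - wait_hours - run_time_s / 3600


# Run times in seconds, from posts in this thread.
run_times = {
    "GTX 650Ti (stock)": 81_090.74,
    "GTX 650Ti (slightly OCed)": 76_764,
    "GTX 660Ti": 48_100,
}

for card, seconds in run_times.items():
    print(f"{card:26s} slack: {slack_hours(seconds):5.1f} h")

# Example: a WU that sat in the cache for 2 hours before starting.
print(f"stock 650Ti after a 2 h wait: {slack_hours(81_090.74, wait_hours=2):.1f} h")
```

With a stock 650Ti there is only about an hour and a half to spare, so even a short wait in the queue is enough to miss the 24-hour bonus.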
|
|
|
My runtime on a 650Ti (slightly OCed) is 76,764s, credit 167,550.
On a 660Ti they complete in ~48,100s, credit is always the same.
My 650Ti is stock-clocked. Can your OC be responsible for the ~4500sec difference?
Not that I'm dying to squeeze the last ounce of performance out of the card, I just want it to perform as it should. |
|
|
flashawk
Joined: 18 Jun 12 Posts: 297 Credit: 3,572,627,986 RAC: 0 Level
Scientific publications
|
That should do the trick; others say they set theirs to 0.1 and it works well too. The NATHAN_KIDc22 are the longest running at this time (for me anyway), so you should be able to do any of the 3 in under 24 hours with your 650Ti. |
|
|
Mumak
Joined: 7 Dec 12 Posts: 92 Credit: 225,897,225 RAC: 0 Level
Scientific publications
|
My 650Ti is stock-clocked. Can your OC be responsible for the ~4500sec difference?
Not that I'm dying to squeeze the last ounce of performance out of the card, I just want it to perform as it should.
Indeed, it can. Moreover, the 650Ti is running on XP, which performs slightly better. |
|
|