
Message boards : Number crunching : Limit on GPUs per system?

Operator
Joined: 15 May 11
Posts: 108
Credit: 297,176,099
RAC: 0
Message 28468 - Posted: 10 Feb 2013 | 19:37:39 UTC

Is there a limit on how many GPUs can be run in one system?

As an example, given dual 6-core Xeons and 48 GB of RAM, let's say you had two GPUs running in the system and another 4 GPUs running in an expansion chassis.

Anybody ever tried that?

I've been getting the idea (from blog articles, etc.) that there is somehow a limit of 4 GPUs max, regardless of the capabilities of the rest of the system. I don't know whether it's a driver limitation or a PCIe bus limitation, but I'm getting the impression that there is a limit.

Is this true?

Operator
____________

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 28469 - Posted: 10 Feb 2013 | 20:34:34 UTC - in response to Message 28468.
Last modified: 10 Feb 2013 | 20:38:26 UTC

Good timing as I've been wondering about similar things.

I've never heard of an expansion chassis but I would consider trying one depending on what the gurus here say about it. For now, do you have a link to an expansion chassis in an online catalogue? I'm curious about price, specs, etc. And for purposes of discussion it's always nice to have an example to point to.

One of the admins sort of promised some specs on the 4 GPU systems they run in the project lab. I would still love to see those specs and possibly some photos.
____________
BOINC <<--- credit whores, pedants, alien hunters

MarkJ
Volunteer moderator
Volunteer tester
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Message 28470 - Posted: 10 Feb 2013 | 21:58:51 UTC

I believe Nvidia had an 8-card limit in the GeForce drivers, although this may have been removed in later drivers. I have run an expansion chassis with 4 x GTS450SP plus another two cards in the machine itself without any issues. I believe BOINC also had a limit (8 or 16) and we asked for it to be changed some time back. Not sure if it's actually been done.

You'll have to run a 64-bit OS, as you run out of address space with that many cards.
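(Rough arithmetic on why, from memory rather than any official doc: a 32-bit OS has only 4 GB of address space in total, and each card typically claims a 256 MB MMIO window in it on top of system RAM, so half a dozen cards plus 4 GB of RAM simply won't fit. 64-bit removes that ceiling.)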
____________
BOINC blog

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 28473 - Posted: 11 Feb 2013 | 0:58:43 UTC

Found an expansion chassis. Now I see. Hmm, there's a bottleneck in the cables, and it looks to me like the slots will run at x8 if they're all occupied.

____________
BOINC <<--- credit whores, pedants, alien hunters

Operator
Joined: 15 May 11
Posts: 108
Credit: 297,176,099
RAC: 0
Message 28474 - Posted: 11 Feb 2013 | 16:07:22 UTC - in response to Message 28473.

I was originally considering something like the Dell C410x chassis but it's complete overkill.

The downsides, as I see them: everything is air-cooled and noisy; there are 16 slots that each take a special card carrier; and yet you can only connect 4 cards to a host over one PCIe iPass connection.

I guess this means that up to 4 hosts can connect to the expansion chassis, but that's not what I had in mind. I don't know whether you could populate 8 GPU slots and connect all of that through 2 iPass cables to the same host machine.

The smaller expansion chassis from outfits like Cyclone Microsystems and Dynapower Netstor are pretty spendy to start with, even before you mod them for water cooling to keep the noise down and make it easier to bump the clocks.

I see a lot of folks here with GTX 690s as the maximum solution (even the current top host has 3 GTX 690s, but I bet they're all internal to the system chassis).

So that's why I wondered if it was doable at all or if anybody had successfully tried to create a 'mini Titan'.

Operator
____________

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 28475 - Posted: 11 Feb 2013 | 23:18:44 UTC - in response to Message 28474.

I've run Tyan FT48-B7025 units with 16 Tesla C1070s, and also seen them configured with 8x K10s. It's not cost-effective, in terms of $/GPU, because the more exotic host system has a significant price premium.

The C410xs are... disappointing. Expensive, bad PCIe topology, and custom sleds.

Dagorath - I did promise specs on our lab systems, didn't I? We build machines with the following spec:

4xGTX680
Asus P9X79WS motherboard
Xeon E5 1620
16GB RAM
1500W Silverstone PSU
2TB hard disk.

We've our own custom case.

MJH

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 28477 - Posted: 12 Feb 2013 | 3:35:17 UTC - in response to Message 28474.

Aye, they're pricey, and I don't like the bottleneck inherent in the cable. The PSU in the one I linked to didn't have enough capacity for the GPUs I'm considering.

I'm all for jamming as many GTX 690s as I can onto a single mobo, but I was told in another thread here recently that 690s have driver and heat problems. There wasn't much elaboration on those problems (perhaps because I didn't ask for it), but the way I see it, if one person can make it work then so can I. If it's just a matter of time and effort, that's not a problem. I guess the question is: are the reported problems insurmountable or merely difficult, and when it's finally working, does it work reasonably well, or is it something you forever wish you had not done?

I'm looking favorably at the 660 Ti too, thanks to skgiven's advice. Good performance at a reasonable price.

At newegg.ca I found this ASUS P6T7 WS SuperComputer mobo with 4 PCI-E x16 slots spaced so that they would accept 4 double-height cards. $470, LGA 1366, Intel X58. The specs at newegg say:

6 x PCIe 2.0 x16 (triple x16 or six x8)
1 x PCIe 2.0 x16


To me that means I can put 4 cards in the 4 blue slots and they'll all run at x16 - is that right?

I would much rather have Gen 3 PCI-E, even at additional expense, for future-proofing, but word is that GPUGrid tasks don't need that much bandwidth and do just fine on Gen 2, so maybe I'd compromise on that point. I'm considering starting out with this mobo and one card, then adding one more card per month/paycheque.

For the CPU, probably a low-cost Xeon like this one with 8 MB L2 cache. I like a big cache for multitasking.

For the power supply: 4 x GTX 690 needs at least 2800 watts by my reckoning, so I am seriously considering 2 of these 1500-watt EVGA SuperNOVA NEX1500 Gold-certified models. They have a 10-year warranty, and one review says this model can be configured via onboard DIP switches to run when not connected to a mobo. So the plan would be to buy the mobo + CPU + 1 GTX 690 + 1 PSU for now; that PSU has enough capacity for a second GTX 690, which I'll purchase later. On purchase of the third 690 (or perhaps one of the rumored Titans) I would buy another of the same PSU and configure it to run without the mobo connection; it will also power the fourth GTX 690/Titan/whatever.

I haven't looked hard, but the only 2800-watt PSUs I can find are these 2 models by Cisco, and at $1675 they're too rich for me. Two of the 1500-watt models mentioned above are only $920, and the pair delivers 200 watts more.

Buying 2 x 1500-watt PSUs has the advantage of being modular: if one fails I will still have the CPU and 2 GPUs running. There is also the advantage of splitting the cost over 2 payments. And I can't find a 3,000-watt PSU anywhere!

Another PSU option is to have 1 regular PSU to power the mobo and 1 card, plus another that puts out only the +12 V required for the other cards. A high-wattage +12 V-only PSU is less complicated than a normal PSU, so it ought to be cheaper. It would probably also be fairly simple and cheap to build one.

____________
BOINC <<--- credit whores, pedants, alien hunters

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 28478 - Posted: 12 Feb 2013 | 4:02:54 UTC - in response to Message 28475.


> Dagorath - I did promise specs on our lab systems, didn't I? We build machines with the following spec:
>
> 4xGTX680
> Asus P9X79WS motherboard
> Xeon E5 1620
> 16GB RAM
> 1500W Silverstone PSU
> 2TB hard disk.


Thanks, MJH!

I'm interested mainly in that motherboard. Newegg.ca stocks them and here it is for $377, $90 cheaper than the mobo I mentioned in my previous post. The thing about it that gives me concern is the PCI-E specs. It's PCI-E Gen 3 but the slots run at x8 if they're all occupied. The mobo I mentioned in my previous post has 4 slots that run at x16 even when all occupied but they're only Gen 2 not Gen 3. Which mobo would give the best performance on GPUgrid tasks? My hunch is x16 is more important than Gen 3 but I really don't know for sure.



> We've our own custom case.


The best case is no case, IMHO, and I won't waste money on another one. A cardboard box is good enough for me.


____________
BOINC <<--- credit whores, pedants, alien hunters

Gattorantolo [Ticino]
Joined: 29 Dec 11
Posts: 44
Credit: 251,211,525
RAC: 0
Message 28479 - Posted: 12 Feb 2013 | 7:48:21 UTC - in response to Message 28478.
Last modified: 12 Feb 2013 | 7:51:55 UTC

With Windows it's possible to have "only" 4 GPUs, for example 4x GTX 680 or 2x GTX 690 (dual-GPU cards).
Another example:
[3] NVIDIA GeForce GTX 690 (2047MB) driver: 310.33

This is actually one GTX 690 and another GPU (a GTX 670 or something). BOINC reports each individual GPU: a GTX 690 has two GPUs, so BOINC reports two GTX 690s! It doesn't report the card count, just the GPU count, and it has to call each one something, so GTX 690 it is. BOINC also reports the first (or biggest/most powerful) GPU and then the number of GPUs, so if you have a GTX 680 in there, it will be reported as another GTX 690.
____________
Member of Boinc Italy.

werdwerdus
Joined: 15 Apr 10
Posts: 123
Credit: 1,004,473,861
RAC: 0
Message 28480 - Posted: 12 Feb 2013 | 9:12:27 UTC

PCIe 3.0 is double the bandwidth of 2.0, so x16 at 2.0 == x8 at 3.0. No difference in bandwidth there.
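To put rough numbers on it (mine, not anything official): PCIe 2.0 signals at 5 GT/s with 8b/10b encoding, about 500 MB/s per lane each way, while 3.0 signals at 8 GT/s with 128b/130b encoding, about 985 MB/s per lane. So x16 Gen 2 is ~8 GB/s and x8 Gen 3 is ~7.9 GB/s - effectively identical.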
____________
XtremeSystems.org - #1 Team in GPUGrid

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 28481 - Posted: 12 Feb 2013 | 15:12:41 UTC - in response to Message 28478.
Last modified: 12 Feb 2013 | 15:43:05 UTC

> I'm interested mainly in that motherboard. Newegg.ca stocks them and here it is for $377, $90 cheaper than the mobo I mentioned in my previous post. The thing about it that gives me concern is the PCI-E specs. It's PCI-E Gen 3 but the slots run at x8 if they're all occupied. The mobo I mentioned in my previous post has 4 slots that run at x16 even when all occupied but they're only Gen 2 not Gen 3. Which mobo would give the best performance on GPUgrid tasks? My hunch is x16 is more important than Gen 3 but I really don't know for sure.


Dagorath - For GPUGRID tasks, PCIe speed isn't tremendously important, except in the occasional experiment where we add customisations. In that case, having a fast CPU is just as important, if not more. Also, for best performance, run Linux.

For GPUGRID tasks alone, check out the Asus P8Z77-WS board. It'll give the same performance at a lower price point. Similarly, 16GB is overkill - 1GB per GPU is adequate.

MJH

Dagorath
Joined: 16 Mar 11
Posts: 509
Credit: 179,005,236
RAC: 0
Message 28484 - Posted: 12 Feb 2013 | 17:14:27 UTC

Thanks for all the info guys, Gattorantolo, werdwerdus, MJH. It helps us GPU newbies a lot.

____________
BOINC <<--- credit whores, pedants, alien hunters

underwater
Joined: 16 May 13
Posts: 12
Credit: 455,242,410
RAC: 0
Message 31734 - Posted: 24 Jul 2013 | 22:04:23 UTC - in response to Message 28479.

> With Windows it's possible to have "only" 4 GPUs, for example 4x GTX 680 or 2x GTX 690 (dual-GPU cards).


This seems the best answer to the OP's question.
But is there a limit to the number of GPUs per machine on GPUGrid?
I ask because I can't get a third GTX 690 to be recognized correctly.
It looks like it will be going into another system, because it's just wasting power sitting in a slot doing sweet FA.
Willing to provide all information required if someone is willing to help me solve this issue - if it is solvable.
Thanks

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2356
Credit: 16,377,930,940
RAC: 3,470,976
Message 31736 - Posted: 24 Jul 2013 | 22:32:49 UTC

Have you ever checked out the top host list?
There is a host with 6 GTX 690s (actually 3 cards) under Win7, and another with 7 GTX 580s under Linux.
If Windows recognizes the 3rd card but BOINC does not, you should put a line into cc_config.xml under the <options> section:
<use_all_gpus>1</use_all_gpus>
If you don't have a cc_config.xml, you should create one. See this post on how to do that.
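If you're unsure of the layout, a minimal cc_config.xml looks like the sketch below (it goes in the BOINC data directory; restart BOINC or use Advanced -> Read config files afterwards). Keep any options you already have; this shows only the one that matters here:

<cc_config>
   <options>
      <!-- let BOINC use every GPU, not just the most capable one -->
      <use_all_gpus>1</use_all_gpus>
   </options>
</cc_config>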

underwater
Joined: 16 May 13
Posts: 12
Credit: 455,242,410
RAC: 0
Message 31737 - Posted: 24 Jul 2013 | 22:48:34 UTC - in response to Message 31736.

Cheers - yep, running XML files.
Currently running 4 GPUGrid WUs simultaneously.
I'm sure it's a Windows problem getting the 3rd card to start playing.
While relatively new to GPUGrid, I've been in the top 20 daily for the past few months - minus maintenance days (like the last few days).
The 3rd 690 won't go to waste, but I would prefer that it stayed in this machine as part of a trio of 690s.
My rig we've been discussing in this thread is No. 56 after 3 months:
http://www.gpugrid.net/show_host_detail.php?hostid=154980
So I'm trying to upgrade to improve performance on the grid.
Any help welcomed.
Thanks all

tbret
Joined: 2 Jan 13
Posts: 5
Credit: 233,329,525
RAC: 0
Message 31746 - Posted: 26 Jul 2013 | 4:59:02 UTC - in response to Message 28468.

What you're going to want to do is investigate "PCIe lanes."

Different processors support different numbers of lanes.

You can get 7 discrete GPU chips (e.g. three 690s and a 680) to work under Windows 7 provided that your CPU, that's with a "C" like CAT, has the "connections" to support that many PCIe "lanes."

You can find the number of lanes supported by various CPUs in the Intel documentation. I never found it for AMD.

So, just because your motherboard may be a bad mother in its own right, that doesn't mean that every CPU you might install in it will support enough "lanes" to the PCIe bus to let you get away with installing three 690s and a 680.

This was all new to me, I read it with interest, and since I wasn't in the market I promptly forgot the details.

I'm afraid that I have just exhausted all of my knowledge. The CPU is the ultimate limiter, although obviously a lousy motherboard would also prevent it from working even if you had the right CPU.
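To make that concrete with the one example I half-remember (check Intel ARK before trusting it): a Sandy Bridge-E chip such as the Xeon E5-1620 exposes 40 PCIe 3.0 lanes from the CPU socket, enough for four cards at x16/x8/x8/x8, while a mainstream LGA 1155 CPU exposes only 16 lanes, so a board maker has to add a PLX switch to fan those out to more slots.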

nanoprobe
Joined: 26 Feb 12
Posts: 184
Credit: 222,376,233
RAC: 0
Message 31747 - Posted: 26 Jul 2013 | 11:59:41 UTC

> You can get 7 discrete GPU chips (e.g. three 690s and a 680) to work under Windows 7 provided that your CPU, that's with a "C" like CAT, has the "connections" to support that many PCIe "lanes."

Do you think Windows Server 2008 HPC would support more than 7 if the CPU supports it?

tbret
Joined: 2 Jan 13
Posts: 5
Credit: 233,329,525
RAC: 0
Message 31751 - Posted: 26 Jul 2013 | 18:02:51 UTC - in response to Message 31747.

> > You can get 7 discrete GPU chips (e.g. three 690s and a 680) to work under Windows 7 provided that your CPU, that's with a "C" like CAT, has the "connections" to support that many PCIe "lanes."
>
> Do you think Windows Server 2008 HPC would support more than 7 if the CPU supports it?


I really did tell you everything I know, and I'm not even sure that information is entirely correct.

It seems like I've seen more than 7 cards running on a system, so someone has figured out something; but I also know that someone else was trying to write a custom BIOS to get 4 GTX 690s to run on a consumer board and found out that it wasn't possible due to CPU hardware limitations.


Retvari Zoltan
Joined: 20 Jan 09
Posts: 2356
Credit: 16,377,930,940
RAC: 3,470,976
Message 31752 - Posted: 26 Jul 2013 | 18:48:53 UTC - in response to Message 31737.

I can see that you've solved this problem.
What have you done to fix it?

underwater
Joined: 16 May 13
Posts: 12
Credit: 455,242,410
RAC: 0
Message 31753 - Posted: 26 Jul 2013 | 20:13:35 UTC - in response to Message 31752.
Last modified: 26 Jul 2013 | 20:14:49 UTC

> I can see that you've solved this problem.
> What have you done to fix it?


It was a driver issue.
I uninstalled everything Nvidia, then ran Driver Sweeper.
When I rebooted and installed the latest Nvidia driver suite, everything was recognized and worked in harmony.
I also updated the SR-2 to the newly released BIOS (thanks EVGA), but that made no difference to the problem I had.
Running 2 x Xeon 5650 CPUs.

This score from yesterday just can't be correct, can it?
25,483,750 from this one host.
I have no idea what the glitch was, but today seems more normal.

werdwerdus
Joined: 15 Apr 10
Posts: 123
Credit: 1,004,473,861
RAC: 0
Message 31764 - Posted: 27 Jul 2013 | 14:22:04 UTC

That happens when you merge computers on the GPUGrid site.
____________
XtremeSystems.org - #1 Team in GPUGrid

underwater
Joined: 16 May 13
Posts: 12
Credit: 455,242,410
RAC: 0
Message 31769 - Posted: 27 Jul 2013 | 16:00:46 UTC - in response to Message 31764.

> That happens when you merge computers on the GPUGrid site.

After I got the 3rd card working, that is exactly what I did.
Thanks

tbret
Joined: 2 Jan 13
Posts: 5
Credit: 233,329,525
RAC: 0
Message 31906 - Posted: 9 Aug 2013 | 7:45:12 UTC

I've been semi-corrected elsewhere, so I just wanted to correct the information I gave here, in an effort not to confuse anyone who might read this thread.

The resource limitation with "lanes" and the like with CPUs is only a factor when trying to run four GTX 690s.

Apparently the GTX 590 is not such a resource hog and can be run four at a time.

That would mean it IS possible to run four dual-GPU cards in one system, just not four GTX 690s, which are resource hogs. I have no idea what "extra" burden a GTX 690 places on a system that a GTX 590 does not. DMA channels? Interrupts?

I don't know, and don't know where to look.

I hope my previous comment hasn't misled anyone.

juan BFP
Joined: 11 Dec 11
Posts: 21
Credit: 145,887,858
RAC: 0
Message 32637 - Posted: 2 Sep 2013 | 16:17:38 UTC - in response to Message 31906.
Last modified: 2 Sep 2013 | 16:21:52 UTC

> I've been semi-corrected elsewhere, so I just wanted to correct the information I gave here, in an effort not to confuse anyone who might read this thread.
>
> The resource limitation with "lanes" and the like with CPUs is only a factor when trying to run four GTX 690s.
>
> Apparently the GTX 590 is not such a resource hog and can be run four at a time.
>
> That would mean it IS possible to run four dual-GPU cards in one system, just not four GTX 690s, which are resource hogs. I have no idea what "extra" burden a GTX 690 places on a system that a GTX 590 does not. DMA channels? Interrupts?
>
> I don't know, and don't know where to look.
>
> I hope my previous comment hasn't misled anyone.


Actually, the reason 4x 690 doesn't work is not related to the CPU/MB; it's down to Windows itself, and that limitation is only in place because of the way the 690 works. 3x 690 + a 590 does actually work (or any other combination of 4 cards), so in theory you could run up to 8 GPUs on a single host. But of course you need the PSU, a way to cool it, a fast CPU and a top-of-the-class MB to feed this monster.

Probably (I don't know anyone who has tried) 4x 690 would work on Linux or another OS.

The information about this is very limited and hard to find, and yes, even with the help of a new BIOS from NVidia, 4x 690 does not work. And one warning: the mate who tried had to send his 690s back to NVidia... for repairs...
____________

zombie67 [MM]
Joined: 16 Jul 07
Posts: 209
Credit: 4,220,036,456
RAC: 11,579,753
Message 32865 - Posted: 10 Sep 2013 | 0:22:58 UTC - in response to Message 28475.

> We build machines with the following spec:
>
> 4xGTX680
> Asus P9X79WS motherboard
> Xeon E5 1620
> 16GB RAM
> 1500W Silverstone PSU
> 2TB hard disk.
>
> We've our own custom case.
>
> MJH


Question: How do you (or anyone) get the bottom card to fit? I have several different models of these "4x dual-wide GPU" boards, and they all have the same flaw: they put a lot of headers under the space of the 4th double-wide GPU, so you can't push the GPU down into the PCIe slot without crushing the cables plugged into those headers.


____________
Reno, NV
Team: SETI.USA

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 32867 - Posted: 10 Sep 2013 | 10:43:57 UTC - in response to Message 32865.

> Question: How do you (or anyone) get the bottom card to fit?


We only connect power and LED switches. The flying leads are bent sharply but do fit. See:

https://www.youtube.com/watch?v=0FIRG6H0sSI


MJH

nanoprobe
Joined: 26 Feb 12
Posts: 184
Credit: 222,376,233
RAC: 0
Message 32868 - Posted: 10 Sep 2013 | 11:50:17 UTC - in response to Message 32865.

> > We build machines with the following spec:
> >
> > 4xGTX680
> > Asus P9X79WS motherboard
> > Xeon E5 1620
> > 16GB RAM
> > 1500W Silverstone PSU
> > 2TB hard disk.
> >
> > We've our own custom case.
> >
> > MJH
>
> Question: How do you (or anyone) get the bottom card to fit? I have several different models of these "4x dual-wide GPU" boards, and they all have the same flaw: they put a lot of headers under the space of the 4th double-wide GPU, so you can't push the GPU down into the PCIe slot without crushing the cables plugged into those headers.


FWIW, maybe one could buy a couple of these and experiment with removing the cables and cutting down the plastic and pins so they don't stick up so high. Just a thought.

zombie67 [MM]
Joined: 16 Jul 07
Posts: 209
Credit: 4,220,036,456
RAC: 11,579,753
Message 32873 - Posted: 10 Sep 2013 | 15:41:50 UTC

Ah. That explains it. Thanks!
____________
Reno, NV
Team: SETI.USA
