Tesla

Message boards : Graphics cards (GPUs) : Tesla

Author	Message
Mumak | Joined: 7 Dec 12 | Posts: 92 | Credit: 225,897,225 | RAC: 0	Message 40419 - Posted: 10 Mar 2015 | 21:11:00 UTC
	Just in case anybody is interested in how some exotic GPUs perform, I did a run on a Tesla K20c:
	e4s166_e3s4f169-NOELIA_27x3-0-2-RND0924_0
	https://www.gpugrid.net/result.php?resultid=13938855
	Runtime: 24,097 s
	GPU load: ~95%
	Temperature: ~60 C
	Power: ~120 W
	For comparison, my 750 Ti does the same task in 40,300 s. Sure, Teslas are better at DPFP (double-precision floating point). The only such app I tried was Milkyway, which however uses OpenCL, so it's not ideal. The performance there was comparable to a Radeon HD 7970/280X.
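For anyone who wants to put the two runtimes side by side, here is a minimal sketch of the arithmetic. The runtimes and the ~120 W figure are the ones quoted above; treating 120 W as a constant draw over the whole run is an assumption, and the function names are purely illustrative.

```python
# Figures quoted in the post above.
K20C_RUNTIME_S = 24_097      # Tesla K20c, NOELIA_27x3 task
GTX750TI_RUNTIME_S = 40_300  # GTX 750 Ti, same task
K20C_POWER_W = 120           # approximate reported GPU power


def speedup(slower_s: float, faster_s: float) -> float:
    """Return how many times shorter the faster runtime is."""
    return slower_s / faster_s


def energy_kwh(power_w: float, runtime_s: float) -> float:
    """Energy for one task, ASSUMING constant power draw for the whole run."""
    return power_w * runtime_s / 3.6e6  # 1 kWh = 3.6e6 J


if __name__ == "__main__":
    print(f"K20c vs 750 Ti speedup: {speedup(GTX750TI_RUNTIME_S, K20C_RUNTIME_S):.2f}x")
    print(f"K20c energy per task:   {energy_kwh(K20C_POWER_W, K20C_RUNTIME_S):.2f} kWh")
```

With the numbers above this comes out at roughly 1.67x, and about 0.80 kWh of GPU energy per task under the constant-power assumption.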

[CSF] Thomas H.V. DUPONT | Joined: 20 Jul 14 | Posts: 732 | Credit: 126,845,366 | RAC: 156,524	Message 40420 - Posted: 11 Mar 2015 | 6:40:09 UTC - in response to Message 40419.
	Thanks Mumak! Really interesting :)
	I expected better performance from a GPU of this quality...
	____________
	[CSF] Thomas H.V. Dupont
	Founder of the team CRUNCHERS SANS FRONTIERES 2.0
	www.crunchersansfrontieres

Mumak | Joined: 7 Dec 12 | Posts: 92 | Credit: 225,897,225 | RAC: 0	Message 40421 - Posted: 11 Mar 2015 | 9:45:36 UTC
	Oh, I had ECC enabled on the Tesla. Switching it off and giving it another run.

eXaPower | Joined: 25 Sep 13 | Posts: 293 | Credit: 1,897,601,978 | RAC: 0	Message 40422 - Posted: 11 Mar 2015 | 12:39:44 UTC
	Excellent power/performance ratio for a 13-SMX (2496 CUDA cores) part computing FP32. Your GK110's 120 W operating point works out to about 0.048 W per core, or 9.23 W per 192-core SMX. 120 W is the domain of the 1024-core GTX 960. A 12-SMX GTX 780 can sip 145 W at 836 MHz (reference base clock) or lower. Not counting Maxwell's GM204 GTX 970, cut-down GK110s are the best eco-tuners NVidia has produced. Cut-down GK104s are also able eco-tuners, as the GTX 660 Ti and GTX 760 have proven.
	Will a full GK110 see ~150 W at 95% core load? The lowest is about 165 W or so. This is really good for an eco-tune, even though GK110s are capable of using every available ounce of power at 1.2 GHz/250 W.
	There are a lot of FP64-enabled GK110s running FP32 ACEMD. Maybe an FP64 ACEMD app will be created for those specific high-performance DPFP GPUs?
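The per-core and per-SMX figures are simple divisions; a small sketch of how they fall out of the numbers in this post (120 W, 2496 cores, 13 SMX, and the 1024-core GTX 960 at the same 120 W). The helper name is made up for illustration, and board power is treated as if it were spent entirely on the cores, which is a simplification.

```python
def watts_per_unit(board_power_w: float, units: int) -> float:
    """Board power divided evenly over `units` (cores or SMX blocks)."""
    return board_power_w / units


K20C_POWER_W = 120   # operating point reported earlier in the thread
K20C_CORES = 2496    # 13 SMX x 192 CUDA cores
K20C_SMX = 13
GTX960_CORES = 1024  # compared here at the same 120 W, as in the post

if __name__ == "__main__":
    print(f"K20c cores per SMX: {K20C_CORES // K20C_SMX}")
    print(f"K20c W per core:    {watts_per_unit(K20C_POWER_W, K20C_CORES):.3f}")
    print(f"K20c W per SMX:     {watts_per_unit(K20C_POWER_W, K20C_SMX):.2f}")
    print(f"GTX 960 W per core: {watts_per_unit(K20C_POWER_W, GTX960_CORES):.3f}")
```

That gives about 0.048 W per core and 9.23 W per SMX for the K20c, against roughly 0.117 W per core for a 1024-core card at the same 120 W.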

GPUGRID | Role account | Joined: 15 Feb 07 | Posts: 134 | Credit: 1,349,535,983 | RAC: 0	Message 40423 - Posted: 11 Mar 2015 | 13:50:11 UTC
	You can use nvidia-smi to increase the application clocks. This will extract another 10% or so.
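As a rough sketch of what that looks like in practice, the snippet below only builds and prints the nvidia-smi command lines; it does not run them. The `-q -d SUPPORTED_CLOCKS`, `-ac <memory,graphics>` and `-rac` options are standard nvidia-smi switches, but the 2600 MHz memory clock is an assumption for the K20c and should be checked against the supported-clocks listing on the actual card. Changing application clocks normally needs administrator rights.

```python
def supported_clocks_cmd(gpu_index: int = 0) -> list[str]:
    """Command that lists the memory/graphics clock pairs the card accepts."""
    return ["nvidia-smi", "-i", str(gpu_index), "-q", "-d", "SUPPORTED_CLOCKS"]


def set_app_clocks_cmd(mem_mhz: int, gfx_mhz: int, gpu_index: int = 0) -> list[str]:
    """Command that sets application clocks as a <memory,graphics> pair in MHz."""
    return ["nvidia-smi", "-i", str(gpu_index), "-ac", f"{mem_mhz},{gfx_mhz}"]


def reset_app_clocks_cmd(gpu_index: int = 0) -> list[str]:
    """Command that restores the default application clocks."""
    return ["nvidia-smi", "-i", str(gpu_index), "-rac"]


if __name__ == "__main__":
    ASSUMED_MEM_MHZ = 2600  # ASSUMPTION: verify with the SUPPORTED_CLOCKS query
    for cmd in (
        supported_clocks_cmd(),
        set_app_clocks_cmd(ASSUMED_MEM_MHZ, 758),
        reset_app_clocks_cmd(),
    ):
        print(" ".join(cmd))
```

The lists can be passed to `subprocess.run(cmd, check=True)` once the clock pair has been confirmed for the card in question.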

Mumak | Joined: 7 Dec 12 | Posts: 92 | Credit: 225,897,225 | RAC: 0	Message 40424 - Posted: 11 Mar 2015 | 14:15:35 UTC
	Thanks for the hint. Current/default clock is 705 MHz, max should be 758. Will first finish the current WU to see the difference between ECC/Non-ECC, then will try some OC ;-)

Mumak | Joined: 7 Dec 12 | Posts: 92 | Credit: 225,897,225 | RAC: 0	Message 40425 - Posted: 11 Mar 2015 | 16:05:31 UTC
	Decided to try the max GPU clock (758 MHz); the WU has not finished yet. Just for comparison (running NOELIA_PO now):
	705 MHz - 133 W
	758 MHz - 150 W
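To see what the two operating points imply, here is a small sketch using only the clock and power readings above. The performance-per-watt line assumes throughput scales linearly with core clock, which is an assumption and not something measured in this thread.

```python
STOCK = {"clock_mhz": 705, "power_w": 133}
MAXED = {"clock_mhz": 758, "power_w": 150}


def ratio(new: float, old: float) -> float:
    """Plain ratio of the new value to the old one."""
    return new / old


clock_ratio = ratio(MAXED["clock_mhz"], STOCK["clock_mhz"])
power_ratio = ratio(MAXED["power_w"], STOCK["power_w"])

# ASSUMPTION: throughput is proportional to core clock.
perf_per_watt_ratio = clock_ratio / power_ratio

if __name__ == "__main__":
    print(f"Clock:         {100 * (clock_ratio - 1):+.1f}%")
    print(f"Power:         {100 * (power_ratio - 1):+.1f}%")
    print(f"Perf per watt: {100 * (perf_per_watt_ratio - 1):+.1f}% (if perf ~ clock)")
```

On these readings the clock goes up about 7.5% while power goes up about 12.8%, so under the linear-scaling assumption efficiency drops by a little under 5%.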

Mumak | Joined: 7 Dec 12 | Posts: 92 | Credit: 225,897,225 | RAC: 0	Message 40434 - Posted: 12 Mar 2015 | 9:13:17 UTC
	NOELIA_ot, 758 MHz, ECC off: 27k seconds
	http://www.gpugrid.net/result.php?resultid=13950350

ExtraTerrestrial Apes | Volunteer moderator | Volunteer tester | Joined: 17 Aug 08 | Posts: 2705 | Credit: 1,311,122,549 | RAC: 0	Message 40589 - Posted: 22 Mar 2015 | 23:19:16 UTC - in response to Message 40420.
	"I expected better performance from a GPU of this quality..."
	There's no reason to expect that. It's just the same GK110 chip as in the GTX 780/Ti and Titan/Z. It gains energy efficiency by being run at very low clock speeds and voltages. To some extent this could easily be done on other cards as well, although most people will prefer a high-performance state anyway. A comparison to a stock GTX 960 sounds impressive, but that GPU is driven quite hard, up to maximum voltages around 1.20 V, and has a lot of room to run more efficiently for a minor performance loss (down to 1.10 - 1.00 V).
	"Maybe a FP64 ACEMD app will be created for those specific high performance DPFP GPU's?"
	Why? Even the super-expensive Titan loses 2/3 of its maximum throughput in DP mode. If the app can get by with 32 bits, it's always better to use only 32 bits. That's why "mixed precision" with 16-bit FP enhancements will become a topic for nVidia with Pascal. A valid reason would be to use new physical models which might not be possible in FP32. But I don't think precision is the limit; it's probably more often the flow control that makes such tasks better suited to CPUs.
	MrS
	____________
	Scanning for our furry friends since Jan 2002
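A rough way to see how much headroom lower voltage gives is the usual first-order CMOS approximation that dynamic power scales with V^2 * f. That model is an assumption brought in here for illustration (it ignores leakage and was not measured in this thread); the voltages are the ones mentioned in the post.

```python
def relative_dynamic_power(v_new: float, v_old: float, f_ratio: float = 1.0) -> float:
    """Dynamic power relative to the old operating point.

    ASSUMPTION: P_dyn is proportional to V^2 * f (first-order CMOS model,
    leakage ignored). `f_ratio` is new clock / old clock.
    """
    return (v_new / v_old) ** 2 * f_ratio


if __name__ == "__main__":
    for v in (1.10, 1.00):
        rel = relative_dynamic_power(v, 1.20)
        print(f"1.20 V -> {v:.2f} V at the same clock: {100 * rel:.0f}% of the power")
```

Under that model, dropping from 1.20 V to 1.10 V at the same clock leaves about 84% of the dynamic power, and 1.00 V about 69%; any accompanying clock reduction can be folded in through `f_ratio`.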
