Message boards : Graphics cards (GPUs) : NVIDIA BigKepler

Profile Carlesa25
Avatar
Send message
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 0
Level
Thr
Message 24499 - Posted: 20 Apr 2012 | 19:41:31 UTC
Last modified: 20 Apr 2012 | 19:45:18 UTC

Hi, NVIDIA plans to introduce the GK100 (or GK110), known as Big Kepler, at the upcoming GPU Technology Conference in San Jose (California, USA), May 14-17, 2012. It reportedly has 7,000 million (7 billion) transistors. Greetings.

NVIDIA-GPU Technology Conference

5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 24500 - Posted: 20 Apr 2012 | 21:14:32 UTC
Last modified: 20 Apr 2012 | 21:20:33 UTC

Well, that's good news!! I wonder how the design will look with regard to the FP32 vs. FP64 core layout, meaning I'm wondering whether it will just be a monster at FP32 and only modestly so at FP64 (like the 680), or whether they will make this one a more well-rounded GPU for crunching purposes.

I hate to think how long it will take for any to actually become available though!! The 680 has been out for over a month, and they're still hard to get. Maybe a Christmas gift for me? lol.

7 billion though, EXCITING!!

EDIT: Hmmmm.... I read some other forums, and many are discussing that this may be ONLY for their compute-oriented chips, Quadro and Tesla, which would make sense from a business standpoint :( GPU compute conference = expensive series, meant for scientists with grant money......

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 24504 - Posted: 20 Apr 2012 | 22:30:39 UTC

Suggestion: they handle it just as they did with Fermi. The big chip gets FP64 at 1/2 speed. It's active on Tesla and Quadro, but restricted on Geforce. Probably not to 1/24 (as on GK104) though, more likely the previous 1/8.
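For a rough sense of what those rate fractions mean in raw throughput, here is a small Python sketch (peak numbers only; the GTX 580/680 figures are the published specs, while the "big Kepler" shader count and clock are placeholders made up for illustration, not announced values):

# Peak FP64 throughput = shaders * 2 FLOPs (FMA) * clock (GHz) * FP64 rate.
def peak_fp64_gflops(shaders, clock_ghz, fp64_rate):
    return shaders * 2 * clock_ghz * fp64_rate

print(peak_fp64_gflops(512, 1.544, 1/8))    # GTX 580 (GF110 GeForce, 1/8 rate): ~198 GFLOPS
print(peak_fp64_gflops(1536, 1.006, 1/24))  # GTX 680 (GK104, 1/24 rate):        ~129 GFLOPS
print(peak_fp64_gflops(2304, 0.85, 1/2))    # hypothetical big chip, 1/2 rate:   ~1958 GFLOPS
print(peak_fp64_gflops(2304, 0.85, 1/8))    # same chip restricted to 1/8:        ~490 GFLOPS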

MrS
____________
Scanning for our furry friends since Jan 2002

5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 24505 - Posted: 20 Apr 2012 | 22:51:44 UTC

I mean, the 680 isn't restricted like other cards though, if I'm not mistaken. The 680 only has 8 FP64 cores, which run at 1/1. They're not throttled, they simply don't exist, and aren't even added into the core count.

5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 24506 - Posted: 21 Apr 2012 | 0:23:09 UTC

This should be interesting. It should probably be somewhere between 25-30% faster than the 680? My guess anyway. Hope they have better yields compared to the 680! The rollout has been way too slow.

5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 24507 - Posted: 21 Apr 2012 | 5:08:58 UTC

Now that I think about it, with 7B transistors, couldn't this be the 690?

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 24508 - Posted: 21 Apr 2012 | 10:25:51 UTC - in response to Message 24507.
Last modified: 21 Apr 2012 | 10:26:24 UTC

I am surprised to see 7,000M. A GTX 680 has only 3,500M, while the number of cores is only 50% more (2,304) on the GK110, according to some websites.

Maybe they will actually have 2,048 cores and not 2,304, but cores which are more similar to a 110-style chip than to a 104, and so better performing for us. That would be great, because then I would expect almost a factor of 2x in performance against a GTX 680 at the same clock.
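For what it's worth, a toy calculation of where a factor of ~2 at the same clock could come from, assuming the rumored 2,048 cores and assuming GPUGrid can only use about 2/3 of GK104's superscalar shaders (both of these are assumptions discussed later in this thread, not confirmed figures):

# Illustrative only: effective shaders usable by GPUGrid at equal clocks.
gk104_shaders = 1536
gk104_usable  = gk104_shaders * 2 // 3   # 1024 if ~1/3 sit idle (superscalar issue)

gk110_shaders = 2048                      # rumored, GF110-style, all usable
print(gk110_shaders / gk104_usable)       # -> 2.0, roughly the factor mentioned above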

gdf

5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 24510 - Posted: 21 Apr 2012 | 13:33:11 UTC
Last modified: 21 Apr 2012 | 14:25:48 UTC

If it is a 685, and if it does have 2,304 cores instead of 2,048, could those extra cores be the FP64 cores? There seems to be no way they could fit this on a 520 mm² die though, IMHO. I also saw the release date rumored as August to September, which at the current rate of 680 availability would probably make them AVAILABLE around Xmas.

EDIT: Further, if this is the case, this would begin to bring it dangerously close to Maxwell, if I'm not mistaken. Guess I'll have to wait and see.

frankhagen
Send message
Joined: 18 Sep 08
Posts: 65
Credit: 3,037,414
RAC: 0
Level
Ala
Message 24511 - Posted: 21 Apr 2012 | 14:48:39 UTC - in response to Message 24510.

If it is a 685, and if it does have 2,304 cores instead of 2,048, could those extra cores be the FP64 cores?


One thing is for sure: they will come up with a chip that has decent DP capabilities for the Quadro/Tesla line.

Did you read the announcement?

"Low-overhead ECC" will definitely mean something not to be seen on consumer cards....

...so most likely it's not a GTX-something.

5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 24512 - Posted: 21 Apr 2012 | 15:48:06 UTC

Very true. It seems weird to me that they would post it on their Twitter feed if it wasn't consumer-based, anyway. I have seen developer betas 301.26 and 301.27 for them to build their tools. My 680 is using 301.25, which isn't even on their website yet; it has Quadro features in it, as well as 670 or 660 Ti specs.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2343
Credit: 16,217,465,968
RAC: 1,257,790
Level
Trp
Message 24519 - Posted: 21 Apr 2012 | 20:43:47 UTC

I still think nVidia won't release BigKepler as a consumer card (i.e. GeForce). Apart from crunchers, there is no need for such a card in this market segment. We crunchers are a minority among consumer card buyers, so we don't create enough demand for nVidia to release a cheap cruncher card built on an expensive chip.
GDF's reaction was something of a confirmation of my opinion, naming the GTX 680 as the flagship product.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 24521 - Posted: 21 Apr 2012 | 23:47:13 UTC - in response to Message 24519.

Non-professional video editors might disagree, as might their favourite software developer.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 24530 - Posted: 22 Apr 2012 | 11:37:43 UTC

7B transistors fits the typical nVidia strategy: build it as large as TSMC will allow. And 2x the transistors for 1.5x the shaders makes sense considering these will have to be cores which can do FP64 at 1/2 the FP32 rate, just like on GF100 and GF110, which requires more transistors per shader. I also wouldn't be surprised if they moved away from superscalar execution again, just like on GF100 and GF110, which improves the utilization of the shaders in typical HPC code but again requires more control logic (i.e. more transistors) per shader. Support for ECC memory may add some transistors. There may also be larger caches and other tweaks we do not yet know about.

And considering the market for Quadros is still large compared to the market for pure Teslas, I'm sure these chips will still have graphics capabilities. This way they can be used for both markets, lowering design cost.

And if there's graphics hardware in there and the chip is faster in games than GK104 (which it should be at this transistor count, although by far less than a factor of 2), they will introduce consumer Geforce cards based on them. The profit margin on these high-end cards is huge, and since they already have the chip, it doesn't cost them much. And the high-end GPUs will be bought, just like in previous generations, despite the fact that GK104 should be much more efficient (power and price) for games.

Name of the product: who knows, maybe even nVidia themselves have not decided this yet. Maybe straight GTX780 (which would make the GTX680 look bad), or GTX685 (which would make big Kepler look weak), or "GTX680XT XXX Ultra-Monster-Core Golden Sample Edition" (which would make their other names look pretty good). Personally I'd bet the name of my sister's first-born on the latter ;)

Regarding yields: they'll be bad for such a huge chip, but there'll be plenty of units to deactivate. No big deal. And the scarce availability of current 28 nm cards is not primarily a yield issue (otherwise we'd already be seeing more cut-down versions of the chip and no fully activated ones), but rather an issue of overall 28 nm capacity. This capacity will improve as TSMC converts more fabs or lines within each fab, but demand for 28 nm chips will also increase as more designs transition to the newer process. Anyway, TSMC expects supplies of all 28 nm chips to be tight until the end of the year.

MrS
____________
Scanning for our furry friends since Jan 2002

GPUGRID Role account
Send message
Joined: 15 Feb 07
Posts: 134
Credit: 1,349,535,983
RAC: 0
Level
Met
Message 24533 - Posted: 22 Apr 2012 | 13:00:43 UTC - in response to Message 24508.

I am surprised to see 7,000M. A GTX 680 has only 3,500M


Nowhere does it say that all those 7B transistors are on a single die. In marketing-speak, at least, a dual-GK104 card would satisfy the description "7B transistor GPU".

MJH

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 24535 - Posted: 22 Apr 2012 | 13:49:21 UTC - in response to Message 24533.
Last modified: 22 Apr 2012 | 13:50:48 UTC

Sure, but GK104 would be far worse than an unlocked GF110 for FP64 compute. You can't offer a compute chip without good FP64 performance; the use cases would be far too few. And I don't think it has ECC either, as that is not needed for gaming.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2343
Credit: 16,217,465,968
RAC: 1,257,790
Level
Trp
Message 24536 - Posted: 22 Apr 2012 | 14:41:09 UTC - in response to Message 24530.

7B transistors fits the typical nVidia strategy: build it as large as TSMC will allow.

They learned a lesson from the Fermi-fiasco not to come out with the largest chip first.

And 2x the transistors for 1.5x the shaders makes sense considering these will have to be cores which can do FP64 at 1/2 the FP32 rate, just like on GF100 and GF110, which requires more transistors per shader. I also wouldn't be surprised if they moved away from superscalar execution again, just like on GF100 and GF110, which improves the utilization of the shaders in typical HPC code but again requires more control logic (i.e. more transistors) per shader. Support for ECC memory may add some transistors. There may also be larger caches and other tweaks we do not yet know about.

I agree. However if all of this is true, I would be surprised if the BigKepler had more than 1024 cores.

And considering the market for Quadros is still large compared to the market for pure Teslas, I'm sure these chips will still have graphics capabilities. This way they can be used for both markets, lowering design cost.

Or they return to the design of the GT200, and use a discrete chip for this purpose.

And if there's graphics hardware in there and the chip is faster in games than GK104 (which it should be at this transistor count, although by far less than a factor of 2), they will introduce consumer Geforce cards based on them.

I hope that's right, because then we could have a nice cruncher card.

The profit margin on these high-end cards is huge, and since they already have the chip, it doesn't cost them much.

The profit margin is high on Teslas and Quadros, but it's low on the top-end GeForces (like the GTX 295, the GTX 590, or even the GTX 580).

And the high-end GPUs will be bought, just like in previous generations, despite the fact that GK104 should be much more efficient (power and price) for games.

It's true, but they still could build a dual GPU card on the GK104, which would be very fast and very efficient at the same time.

Name of the product: who knows, maybe even nVidia themselves have not decided this yet. Maybe straight GTX780 (which would make the GTX680 look bad), or GTX685 (which would make big Kepler look weak), or "GTX680XT XXX Ultra-Monster-Core Golden Sample Edition" (which would make their other names look pretty good). Personally I'd bet the name of my sister's first-born on the latter ;)

:) I'm sure they will find a fully satisfying name for this product, if there is a product to name. What I meant was that if the GTX 680 is the flagship (GeForce) product, then we won't have a better (single-chip) GeForce this time.

5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 24537 - Posted: 22 Apr 2012 | 14:56:08 UTC

Safe to say they gave us just enough to get excited, didn't they!!!

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 24539 - Posted: 22 Apr 2012 | 16:03:40 UTC - in response to Message 24536.

They learned a lesson from the Fermi-fiasco not to come out with the largest chip first.

This may be wise and intentional, although some rumors were floating around that they tried to introduce the chip earlier but couldn't, due to some problems.

I agree. However if all of this is true, I would be surprised if the BigKepler had more than 1024 cores.

If they just expanded GF110 to 7B transistors, that would indeed yield ~1024 shaders. However, by getting rid of the hot clock they can save some transistors, and they can reuse everything already implemented in "little Kepler".

Starting from GK104 we could remove the superscalar capability, i.e. 1/3 of the shaders and, as a first-order approximation, 1/3 of the transistors. That yields 1024 shaders for 2.33 billion transistors, so for 2048 shaders only 4.7 billion would be necessary, and ~2400 could be possible at 5.5 billion transistors. These would be FP32 only, so now add some of the stuff I mentioned in my post above and I think it works out.
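As a quick sketch, the same back-of-the-envelope arithmetic in Python (the transistor and shader counts are the rumored/published figures used above, so treat the outputs as rough estimates):

# First-order estimate: strip GK104's superscalar third, derive a transistor
# cost per FP32 shader, then scale back up. Illustrative numbers only.
gk104_transistors = 3.5e9   # GTX 680 (GK104)
gk104_shaders     = 1536

transistors_left = gk104_transistors * 2 / 3            # ~2.33 billion
shaders_left     = gk104_shaders * 2 // 3               # 1024
per_shader       = transistors_left / shaders_left      # ~2.28 million per shader

print(2048 * per_shader / 1e9)   # ~4.7 -> billions of transistors for 2048 shaders
print(5.5e9 / per_shader)        # ~2414 -> shaders possible within 5.5B transistors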

Or they return to the design of the GT200, and use a discrete chip for this purpose.

Wasn't that the same chip for all of them?

And the dual GK104 will come, I'm sure. It will rock for gaming and it will sell. The purpose of building GK104 fast & efficient, yet not so big to reach 250 W again was probably to have a decent dual chip GPU again (without heavy binning and downclocking to stay below 300 W).

MrS
____________
Scanning for our furry friends since Jan 2002

frankhagen
Send message
Joined: 18 Sep 08
Posts: 65
Credit: 3,037,414
RAC: 0
Level
Ala
Message 24543 - Posted: 22 Apr 2012 | 18:04:10 UTC - in response to Message 24539.
Last modified: 22 Apr 2012 | 18:04:42 UTC

And the dual GK104 will come, I'm sure. It will rock for gaming and it will sell. The purpose of building GK104 fast & efficient, yet not so big to reach 250 W again was probably to have a decent dual chip GPU again (without heavy binning and downclocking to stay below 300 W).


SIGNED!


Until now they had a single chip design for consumer and professional purposes and simply limited the consumer cards' DP performance.

Is it such a wild guess that this will no longer be the case?

They will have absolutely no problem scaling down from the GK104 to feed every range they want, and they will come up with something to replace the current Quadro/Tesla line.

But we will know by mid-May...

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2343
Credit: 16,217,465,968
RAC: 1,257,790
Level
Trp
Message 24546 - Posted: 22 Apr 2012 | 20:03:43 UTC - in response to Message 24539.

If they just expanded GF110 to 7B transistors, that would indeed yield ~1024 shaders. However, by getting rid of the hot clock they can save some transistors, and they can reuse everything already implemented in "little Kepler".

Starting from GK104 we could remove the superscalar capability, i.e. 1/3 of the shaders and, as a first-order approximation, 1/3 of the transistors. That yields 1024 shaders for 2.33 billion transistors, so for 2048 shaders only 4.7 billion would be necessary, and ~2400 could be possible at 5.5 billion transistors. These would be FP32 only, so now add some of the stuff I mentioned in my post above and I think it works out.

From what you say I get the feeling that the GF100 and GF110 are very wasteful designs in terms of transistor count.
I don't think nVidia would develop their professional product line from the consumer product line, which was itself derived from the previous professional product line.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 24547 - Posted: 22 Apr 2012 | 22:04:47 UTC - in response to Message 24546.

GF100 and GF110 have been about as wasteful as GF104 compared to GK104. It's actually the other way around: Kepler is a huge step forward in efficiency, both power-consumption and transistor-wise.

Rest assured that "big Kepler" will borrow a lot of these tricks from "small Kepler". It wouldn't make sense to develop 2 different architectures. What they need to change, though, is the balance of execution units. GK104 is heavily tilted towards gaming, with OK FP32 compute and FP64 just for development. By changing the shader design somewhat (the internals; externally they'll act pretty much the same way, so the same scheduling hardware etc. can be used) and maybe making other tweaks, they can provide good FP32 and massive FP64 (1/2 rate) compute power.

You can't draw a straight line through these designs and generations. Both Fermis, the compute and the gamer versions, have been developed in parallel. They're just as much different incarnations of the same basic architecture as the Keplers will be.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Zarck
Send message
Joined: 16 Aug 08
Posts: 145
Credit: 328,473,995
RAC: 0
Level
Asp
Message 24575 - Posted: 24 Apr 2012 | 10:56:25 UTC - in response to Message 24547.
Last modified: 24 Apr 2012 | 10:56:56 UTC

http://www.hardware.fr/news/12254/nvidia-gk110-7-milliards-transistors-gtc.html?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+hardware%2Ffr%2Fnews+%28HardWare.fr+-+News%29

"As we noted in our review of the GeForce GTX 680, it is not based on the largest GPU family Kepler, which was delayed or replaced for unknown reasons, probably related to the problem of making the 28 nanometer process.

If Nvidia has decided to introduce initially the GeForce GTX 600 premium based on the "small" GK104, originally intended for the lower segment, this does not mean that the largest family of GPU was canceled. If it is not strictly necessary in terms of performance involved, Nvidia can not do without it for the professional market, especially the high performance computing. Indeed, the GK104 made many compromises at this level, and if it can offer excellent performance in video games, this is no longer the case when it comes to the use as an accelerator massively parallel.

Like GK104 (GTX 680) is actually the successor of GF104/114 (GTX 460/560 Ti), GK110 is the true successor of GF100/110 (GTX 480/580). This task will be to drive the point in terms of raw performance, which will also benefit the graphics, but also to continue the evolution towards greater flexibility for massively parallel computing.

Apart from its obvious purpose, little information flowing to date on this GPU. As a teaser, however Nvidia has slipped a little information in the description of the sessions of the GPU Technology Conference (GTC) to be held in San José from 14 to 17 May:

In this talk, Individuals from the GPU architecture and CUDA software Will Dive Into groups the features of the architecture for compute "Kepler" - NVIDIA's new 7-billion transistor GPU. From the processing cores reorganized with new instructions and processing capabilities, to An improved memory systems with faster processing and low-atomic ECC overhead, We Will explores how the Kepler GPU Achieves world leading performance and efficiency, and how it Enables Wholly new kinds of parallel To Be solved problems.
Without the GK110 is directly appointed, a new GPU Nvidia mentions Kepler no less than 7 billion transistors! To recap, the GK104 is satisfied with 3.5 billion while the largest GPU from AMD, Tahiti shows "only" 4.3 billion to the counter.

Compared to GK104, many features strictly related to professional computing will be introduced. Nvidia mentions here a memory subsystem for improved atomic operations more efficient and reduce overhead for ECC, already on the GF100/110 but with a relatively high cost. The new GPU will also have a capacity of double counting very high accuracy, while the holder is anecdotal GK104 with performance equivalent to 1/24th that of the single precision. Previously, Nvidia also put forward the explosion of energy efficiency in double precision calculation (x3) to position relative to the Fermi Kepler.

The GTX 680 already replaced after 2 months? Not really: except huge surprise, the GeForce GTX 690 to be presented shortly will be a bi-GPU card based on GK104. GeForce based GK110 undoubtedly emerge (GTX 685? GTX 780?), But probably not before school starts. In the immediate Nvidia probably wants above all to prepare professionals arrival, development cycles are very long, but also convince them of the benefits of GPU computing, at a time when Intel's response to Knights Corner and MIC architecture is no longer far away.

In all cases, we will be at the GTC and we will certainly bring you all the information related to the new GPUs and the CUDA platform 5!"
____________

5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 24594 - Posted: 26 Apr 2012 | 13:47:46 UTC
Last modified: 26 Apr 2012 | 13:55:39 UTC

I MAY have found something interesting, unless this is old news... But what do you notice that is interesting about this pic from Microcenter (probably old, and they forgot to change it)?


http://www.microcenter.com/storefronts/nvidia/GTX_680%20promo/index.html

EDIT: It's either a 680 4GB reference version or............

2nd EDIT: http://www.techspot.com/news/47898-nvidia-reclaims-performance-crown-with-geforce-gtx-680.html

Same pic, but can you spot the difference?

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2343
Credit: 16,217,465,968
RAC: 1,257,790
Level
Trp
Message 24596 - Posted: 26 Apr 2012 | 16:10:24 UTC - in response to Message 24594.

There is an 8+6-pin PCIe power connector arrangement in the first image, while there is only 6+6-pin in the second one.
Neither of these pictures is an actual photograph of a real product; they are just rendered images to tease your mind :)
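For reference, the board power those connector layouts allow under the PCIe specification (75 W from the x16 slot, 75 W per 6-pin plug, 150 W per 8-pin plug):

# PCIe power budget per the spec limits.
SLOT, SIX_PIN, EIGHT_PIN = 75, 75, 150   # watts

print("8+6 pin:", SLOT + EIGHT_PIN + SIX_PIN, "W")  # 300 W
print("6+6 pin:", SLOT + SIX_PIN + SIX_PIN, "W")    # 225 W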

5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 24599 - Posted: 27 Apr 2012 | 3:44:11 UTC

And tease they did.... I knew it was a 680, but with all the talk of them switching the 670 to the 680, or whatever the "rumor mill" was saying when it first came out, I still found it interesting nonetheless. Still looking forward to this release date though. :)

5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 24616 - Posted: 28 Apr 2012 | 14:03:44 UTC

In 12 hours, it appears that they will be releasing (a paper release at the 680's rate) the lower-end Keplers. Could be the 690, but I doubt it. Should be interesting to see the specs.

Profile Zarck
Send message
Joined: 16 Aug 08
Posts: 145
Credit: 328,473,995
RAC: 0
Level
Asp
Message 25100 - Posted: 15 May 2012 | 22:21:20 UTC

http://smooth-las-akam.istreamplanet.com/live/demo/nvid1/player.html
____________

5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 25104 - Posted: 16 May 2012 | 4:29:32 UTC

Well, here's GK110, Tesla-style: http://www.brightsideofnews.com/news/2012/5/15/nvidia-tesla-k20-ie-gk110-is-71-billion-transistors2c-300w-tdp2c-384-bit-interface.aspx

For whoever is good with math and figuring out specs based on the known Kepler config: I would love to know how something like this might be configured, its "guessed" specs, etc. It won't be coming out till 4Q 2012, so a 7xx-series GK110 is possible WAY WAY down the road (next spring maybe), I would ASSUME. Enjoy.

5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 25121 - Posted: 17 May 2012 | 3:46:43 UTC

And here's the whitepaper. I would love to know whether any of those interesting new features (I doubt we'll get them) would be beneficial to this project. Also, what do you guys think, could this be what they release for us as a 780 next year?
http://www.nvidia.com/content/PDF/kepler/NVIDIA-Kepler-GK110-Architecture-Whitepaper.pdf

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2343
Credit: 16,217,465,968
RAC: 1,257,790
Level
Trp
Message 25123 - Posted: 17 May 2012 | 7:09:22 UTC - in response to Message 25121.

...what do you guys think, could this be what they release for us as a 780 next year?

They released a modified GTX 690 as a compute card, so I don't think they will do the opposite with the GK110.
BTW, looking at the GK110 architecture, I can see that it is superscalar (in single precision) just like the GK104, so only 960 of its 2880 shaders could be utilized by GPUGrid.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 25124 - Posted: 17 May 2012 | 7:09:26 UTC - in response to Message 25121.

I would not be surprised if we get a 2x speedup compared to a GTX 680.

gdf

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 25125 - Posted: 17 May 2012 | 7:33:56 UTC - in response to Message 25121.
Last modified: 17 May 2012 | 7:50:57 UTC

GK110 looks like it's basically GK104 but with an increase in registers per thread, Hyper-Q and Dynamic Parallelism.
I think this GPU will offer a lot to researchers generally, but my fear is that it might not turn up in a GeForce GTX card, or that if it does, too much will be trimmed from it.
Anyone for 'Big Data'?
Massive scalability through network clustering is arriving. Let's hope the rest of the technology can keep up. Of course this would be aimed more at data centers and studios than research centers, but for those with the resources, still very useful.

To more fully utilize its resources (and GK104's), GPUGrid would need to redesign the apps/research methods (around Hyper-Q for the GK110).

I don't see any reason, other than supply, that these could not be released this year, and I think the autumn suggestions sound reasonable.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 25126 - Posted: 17 May 2012 | 9:53:48 UTC - in response to Message 25123.

If I were nVidia I would surely put this chip on a Geforce. Even if it wouldn't make much sense, some people would buy it to have the latest and greatest. Castrate DP, as usual, leave out ECC memory, but give it all the rest. Make it expensive, if you must.

BTW, looking at the GK110 architecture, I can see that it is superscalar (in single precision) just like the GK104, so only 960 of its 2880 shaders could be utilized by GPUGrid.

Isn't that "1920 out of 2880"? Assuming the superscalar capabilities still can't be used (newer architecture, newer software... not sure this still holds true for Kepler).

MrS
____________
Scanning for our furry friends since Jan 2002

5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 25165 - Posted: 19 May 2012 | 0:18:38 UTC
Last modified: 19 May 2012 | 0:24:54 UTC

From Amorphous@NVIDIA: he saw us discussing the K10 and K20 with regard to how this "confirmed" that the GK104 was originally meant to be the 660 Ti (released as the 680) and the 660 (released as the 670).

His response, "Pssst. I've been saying that the GK104 has always been intended as our flagship GPU since launch. :thumbup: You're only willing to accept insider information when it confirms your belief."

So with this in mind, I personally still do not see a 685 coming out this year. Maybe next year (780), but not 2012. They're going to be making WAY too much money off selling K20s.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 25166 - Posted: 19 May 2012 | 7:31:29 UTC - in response to Message 25165.
Last modified: 19 May 2012 | 7:41:33 UTC

Well, maybe the cloud will gobble up all the K20s, leaving us research enthusiasts floundering. It will be interesting to see how these perform, with 1/3 (I think) FP64/DP performance, against the HD 7970s and probably some dual-GPU AMD version.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 25178 - Posted: 20 May 2012 | 17:08:00 UTC
Last modified: 20 May 2012 | 17:08:18 UTC

The K20s will not be for us. They may be a lot more efficient at Milkyway than other nVidia chips... but they're still way too expensive to consider over AMDs.

However, GK110 based Geforce should arrive in 2013.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2343
Credit: 16,217,465,968
RAC: 1,257,790
Level
Trp
Message 25180 - Posted: 20 May 2012 | 23:15:11 UTC - in response to Message 25178.

GK110 based Geforce should arrive in 2013.

I wouldn't take this source as trustworthy. AMD could perhaps force nVidia to release BigKepler as a GeForce card by releasing a new series of GPUs, but it would be a prestige card only (to keep the "fastest single-chip GPU card manufacturer" title at nVidia), because it wouldn't be much faster than a GTX 690 (while producing a GK110 chip costs more than producing two GK104 chips). I think there is no sense in releasing a card which is not faster than the GTX 690 while it costs more to produce (and, last but not least, it shrinks the supply of Teslas and Quadros). There is only one way for nVidia to top the GTX 690: releasing a GeForce card with two GK110s on it. Now that would be a miracle.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 25183 - Posted: 21 May 2012 | 11:09:15 UTC - in response to Message 25180.

You can trust that source as far as it says "someone at the conference (presumably nVidia) said they're planning to introduce GK110-based Geforces in 2013". Obviously nVidia could change their mind any day.

Whether a GK110-based Geforce makes sense is difficult to tell now. Performance depends on yield (= the number of functional units active) and final clock speed, both of which nVidia can't know exactly yet. The raw gaming power will not be higher than 2 GK104 chips. However, dual-chip SLI doesn't yield a 100% benefit, let alone 3- and 4-chip configurations. If you compare 4 GK104s to 2 GK110s, such a card starts to make sense: you get better performance (due to the scaling issue with 4 smaller chips), reduce the amount of micro-stutter, and can use 50% of the installed memory rather than 25%.

If history has told us anything: if they introduce such cards there will be people buying them.

Regarding "taking chips away from the Tesla and Quadro supply": that's why we're talking about anytime in 2013. The Geforce won't arrive before they can satisfy demand for the more expensive versions. Or need to papaer-launch a new flag ship product ;)

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Carlesa25
Avatar
Send message
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 0
Level
Thr
Message 25187 - Posted: 21 May 2012 | 17:58:32 UTC

Hello: The truth is that, in light of these results, it is interesting to consider using Radeon cards in GPUGRID. Greetings.

From GTC 2012: AMD R7970 Preferred over NVIDIA Kepler in Real GPGPU Deployments?

Read more: http://vr-zone.com/articles/from-gtc-2012-amd-r7970-preferred-over-nvidia-kepler-in-real-gpgpu-deployments-/15903.html#ixzz1vWnW4AJW


5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 25188 - Posted: 21 May 2012 | 18:55:14 UTC

It's well known that Radeon is faster. That's not the issue; the issue is coding in OpenCL. AMD (Radeon) does not help researchers code their work the way NVIDIA (CUDA) does. From what I've read and heard, you're on your own. This is one reason why many prefer CUDA: they get help when they ask.

Again, from what I've heard.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 25191 - Posted: 21 May 2012 | 19:46:07 UTC - in response to Message 25187.

The article says it all: for simple code AMD gives you more bang for the buck. However, GPU-Grid does not fit into this category (as far as I see it).

They already built an AMD app... however, performance was quite bad (on VLIW chips, no GCN yet), and it wasn't stable due to bugs in the driver or SDK. In short: they're trying to support AMDs, but it's not as easy as the article might make one believe.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Carlesa25
Avatar
Send message
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 0
Level
Thr
Message 25205 - Posted: 22 May 2012 | 14:22:15 UTC

Hi, What I find is that AMD inmegable is getting the batteries. Greetings.

http://blogs.amd.com/developer/2012/05/21/opencl%E2%84%A2-1-2-and-c-static-kernel-language-now-available/

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 25214 - Posted: 23 May 2012 | 20:16:47 UTC - in response to Message 25205.

Hi, What I find is that AMD inmegable is getting the batteries.

Is that a machine translation? Sorry, I can't figure out any sense in this sentence.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Carlesa25
Avatar
Send message
Joined: 13 Nov 10
Posts: 328
Credit: 72,619,453
RAC: 0
Level
Thr
Message 25220 - Posted: 24 May 2012 | 13:27:01 UTC - in response to Message 25214.

Hi, What I find is that AMD inmegable is getting the batteries.

Is that a machine translation? Sorry, I can't figure out any sense in this sentence.

MrS


Hi, I'm sorry, I mean that AMD is getting better and pushing Nvidia. Greetings.
____________
http://stats.free-dc.org/cpidtagb.php?cpid=b4bdc04dfe39b1028b9c5d6fef3082b8&theme=9&cols=1

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2343
Credit: 16,217,465,968
RAC: 1,257,790
Level
Trp
Message 25379 - Posted: 31 May 2012 | 14:54:06 UTC - in response to Message 25126.

BTW, looking at the GK110 architecture, I can see that it is superscalar (in single precision) just like the GK104, so only 960 of its 2880 shaders could be utilized by GPUGrid.

Isn't that "1920 out of 2880"? Assuming the superscalar capabilities still can't be used (newer architecture, newer software... not sure this still holds true for Kepler).

MrS

Looking at the performance of the beta workunits, I came to the conclusion that Kepler can utilize only 1/3rd of its shaders.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 25385 - Posted: 31 May 2012 | 18:55:10 UTC - in response to Message 25379.

You're referring to this post? If so (I don't want to read the entire thread now, it's quite long and the forum says I haven't read it yet)... you forgot an important factor of 2!

GTX680 is 150ns/115ns = 1.30434 times faster than GTX580. It's got 3 times as many shaders, so we might have expected a performance increase by a factor of 3. That would be 1.3/3 = 43% shader utilization by now. However, GTX580 runs its core at 772 MHz and its shaders at 1544 MHz. GTX680 runs the shaders at 1006 MHz, so we should actually expect a performance increase of 3*1006/1544 = 1.95. So the shader utilization on Kepler appears to be 1.3/1.95 = 66.5%.

That's 2/3 rather than 1/3, and it fits very well with the assumption that the superscalar capability still can't be used by GPU-Grid, despite the changes nVidia made to the compiler and scheduling (which were actually meant to save power, not increase performance).
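The same arithmetic as a small script, using the numbers quoted above (the step times come from the beta results mentioned earlier, so treat them as illustrative):

# Shader-utilization estimate for GK104 from observed GPUGrid step times.
gtx580_time_per_step = 150.0   # ns, GTX 580 (GF110)
gtx680_time_per_step = 115.0   # ns, GTX 680 (GK104)
speedup_observed = gtx580_time_per_step / gtx680_time_per_step   # ~1.30

gtx580_shaders, gtx580_shader_mhz = 512, 1544.0    # hot clock
gtx680_shaders, gtx680_shader_mhz = 1536, 1006.0   # no hot clock

# If every Kepler shader were usable, throughput should scale with shaders * clock:
speedup_expected = (gtx680_shaders * gtx680_shader_mhz) / (gtx580_shaders * gtx580_shader_mhz)  # ~1.95

print(speedup_observed / speedup_expected)  # ~0.67, i.e. about 2/3 of the shaders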

BTW: the lower clock speeds of Kepler chips are key to their power efficiency.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2343
Credit: 16,217,465,968
RAC: 1,257,790
Level
Trp
Message 25398 - Posted: 31 May 2012 | 21:33:53 UTC - in response to Message 25385.

I got your point.
I assumed that nVidia had redesigned their shaders in such a way that a Kepler shader can do as much work as a Fermi shader does at doubled clock speed (to compensate for the eliminated hot clocks). Apparently I wasn't right about that (they put in 3 times as many shaders to compensate for the lack of hot clocks).
My 'revised' conclusion is that while it can utilize 2/3 of its shaders, Kepler performs as if it could utilize only 1/3 of them, because of the eliminated hot clocks. On the other hand, it's much more power efficient for the same reason.
Scaling the (theoretical) performance indices of the Kepler (GK104) and the Fermi (GF110) looks like:
(1536*2/3*1006)/(512*772*2)=1.3031
Apparently we get the same result, if we take out the "*2" from both performance indices at the same time:
(1536/3*1006)/(512*772)=1.3031

BTW: the lower clock speeds of Kepler chips are key to their power efficiency.

Yes, it's the key for every chip. That's why Intel couldn't reach 10 GHz with the Pentium 4, as was planned when the NetBurst architecture was introduced.

I wonder how much of an overclock Kepler could take?

5pot
Send message
Joined: 8 Mar 12
Posts: 411
Credit: 2,083,882,218
RAC: 0
Level
Phe
Message 25400 - Posted: 31 May 2012 | 22:05:37 UTC
Last modified: 31 May 2012 | 22:06:38 UTC

Depends on the chip, but not a lot. 1300 MHz seems to be about the limit; at that point, the locked voltage becomes an issue. Realistically, though, most are seeing 1200-1250.

My 680 can go up to about 1275, but I tend to keep it at 1200 with +100 memory (3100).

With a flashed BIOS and extreme cooling, they're getting into "hot clock" territory, around 1400+.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 25404 - Posted: 31 May 2012 | 22:41:07 UTC - in response to Message 25400.

It won't be until next year that we see the full potential of the Keplers, in any format.
I'm looking forward to seeing a full-fat Kepler, or at least the results. I'm not saving up for one, however.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Message 25446 - Posted: 2 Jun 2012 | 12:48:55 UTC - in response to Message 25398.

My 'revised' conclusion is that while it can utilize 2/3 of its shaders, Kepler performs as if it could utilize only 1/3 of them, because of the eliminated hot clocks. On the other hand, it's much more power efficient for the same reason.

While we're talking about the same numbers, I prefer to use the "real" clock speed and say "2/3 of the shaders", as this corresponds directly to not being able to use the superscalar ones. This is more straightforward and easier to understand. Saying it "can use only 1/3 of its shaders" makes it seem like a really, really bad design - which it isn't.

MrS
____________
Scanning for our furry friends since Jan 2002
