
Message boards : Graphics cards (GPUs) : Nvidia GT300

a
Send message
Joined: 29 Aug 08
Posts: 3
Credit: 2,514,384
RAC: 0
Level
Ala
Scientific publications
watwatwatwatwat
Message 12937 - Posted: 30 Sep 2009 | 14:37:49 UTC
Last modified: 30 Sep 2009 | 15:05:51 UTC

GPU specifications
This is the meaty part you always want to read first. So, here is how it goes:

* 3.0 billion transistors
* 40nm TSMC
* 384-bit memory interface
* 512 shader cores [renamed to CUDA Cores]
* 32 CUDA cores per Shader Cluster
* 1MB L1 cache memory [divided into 16KB Cache - Shared Memory]
* 768KB L2 unified cache memory
* Up to 6GB GDDR5 memory
* Half Speed IEEE 754 Double Precision

The nVidia GT300 chip is a computational beast like you have never seen before.

The memory controller is a GDDR5 native controller, which means it can take advantage of built-in ECC features inside the GDDR5 SDRAM memory.

The Fermi architecture natively supports C [CUDA], C++, DirectCompute, DirectX 11, Fortran, OpenCL, OpenGL 3.1 and OpenGL 3.2. Now, you've read that correctly - Fermi comes with support for native execution of C++. For the first time in history, a GPU can run C++ code with no major issues or performance penalties, and when you add Fortran or C to that, it is easy to see that, GPGPU-wise, nVidia has done a huge job.
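To give a flavour of the C++ point (a minimal sketch of my own, not nVidia sample code; CUDA already allowed templated kernels, and Fermi extends this towards fuller C++ with virtual functions and device-side new/delete):

// Toy sketch: a templated kernel, instantiated for float here; a double
// version would simply be axpy<double>. Data is left uninitialised because
// this only demonstrates the syntax.
#include <cuda_runtime.h>

template <typename T>
__global__ void axpy(int n, T a, const T* x, T* y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];   // y = a*x + y
}

int main() {
    const int n = 1 << 20;
    float *x, *y;
    cudaMalloc(&x, n * sizeof(float));
    cudaMalloc(&y, n * sizeof(float));
    axpy<float><<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);
    cudaDeviceSynchronize();
    cudaFree(x);
    cudaFree(y);
    return 0;
}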


Guru3d

Brightsideofnews

MarkJ
Volunteer moderator
Volunteer tester
Send message
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 12938 - Posted: 30 Sep 2009 | 16:43:57 UTC

It all sounds good but no news yet on when it will be out, other than the "in the next few weeks" line.
____________
BOINC blog

a
Send message
Joined: 29 Aug 08
Posts: 3
Credit: 2,514,384
RAC: 0
Level
Ala
Scientific publications
watwatwatwatwat
Message 12941 - Posted: 30 Sep 2009 | 22:20:44 UTC - in response to Message 12938.

From anandtech

Widespread availability won't be until at least Q1 2010.


Anandtech

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 12961 - Posted: 1 Oct 2009 | 16:22:28 UTC - in response to Message 12941.
Last modified: 1 Oct 2009 | 18:22:34 UTC

Some more technical data on the G300 from Nvidia:

http://www.nvidia.com/content/PDF/fermi_white_papers/NVIDIAFermiArchitectureWhitepaper.pdf

It is likely to be at least 3 times faster than a GTX285 for GPUGRID.
We will probably compile specifically for it, with multiple application versions, so that people can get the maximum performance.
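Purely as an illustration of what compiling for several architectures can mean in practice (a sketch of the generic nvcc fat-binary mechanism, not GPUGRID's actual build; the file name is made up):

// One "fat" binary can carry machine code for several GPU generations; the
// CUDA runtime then loads the image matching the installed card. With a
// Fermi-capable toolkit the build line would look something like:
//
//   nvcc kernels.cu -o app \
//       -gencode arch=compute_13,code=sm_13 \
//       -gencode arch=compute_20,code=sm_20
//
// compute_13 covers GT200-class cards (CC 1.3), compute_20 the GT300/Fermi
// generation (CC 2.0). Older 1.1 cards would need their own -gencode entry.
__global__ void dummy() {}   // placeholder kernel so the file compiles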

gdf

Skip Da Shu
Send message
Joined: 13 Jul 09
Posts: 63
Credit: 2,507,935,249
RAC: 11,301,344
Level
Phe
Scientific publications
watwatwatwatwatwatwat
Message 12969 - Posted: 2 Oct 2009 | 0:21:51 UTC

How big of a loan would one have to apply for to get one of these things? Do I need to be looking at a ten year term?
____________
- da shu @ HeliOS,
"A child's exposure to technology should never be predicated on an ability to afford it."

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 12982 - Posted: 2 Oct 2009 | 17:24:26 UTC - in response to Message 12969.

It is the 25 percent deposit that I am worried about!

MarkJ
Volunteer moderator
Volunteer tester
Send message
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 12988 - Posted: 2 Oct 2009 | 23:04:46 UTC - in response to Message 12969.

How big of a loan would one have to apply for to get one of these things? Do I need to be looking at a ten year term?


I think they will be looking at similar pricing to the HD5870, as that's their main competition.

The GT200-based cards will probably fall in price in the short term, so they can compete against ATI until the GT300 gets into volume production.
____________
BOINC blog

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 12994 - Posted: 3 Oct 2009 | 8:22:41 UTC - in response to Message 12988.

I have heard 20% higher than the current price of the GTX295.

gdf

chumbucket843
Send message
Joined: 22 Jul 09
Posts: 21
Credit: 195
RAC: 0
Level

Scientific publications
wat
Message 13014 - Posted: 4 Oct 2009 | 23:16:50 UTC

The best article is from realworldtech.com. It's a good read; they know what they are talking about. Will GPUGRID take advantage of the double precision capabilities? Does scientific computing cache well, i.e. does it have good data locality?

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 757,773,003
RAC: 363,150
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 13034 - Posted: 6 Oct 2009 | 3:05:28 UTC - in response to Message 13014.

The best article is from realworldtech.com. It's a good read; they know what they are talking about. Will GPUGRID take advantage of the double precision capabilities? Does scientific computing cache well, i.e. does it have good data locality?


You'll have to wait for someone on the GPUGRID project team to check whether their program, and the underlying science, even has a significant need for double precision, which would probably double the amount of GPU board memory needed to make full use of the same number of GPU cores.

In case you're interested in the Milkyway@home project, I've already found that the GPU version of their application requires a card with double precision, and therefore a 200-series chip if the card is an Nvidia one. It looks worth checking whether it's a good use for the many GTX 260 cards with a rather high error rate under GPUGRID, though. They've already said that their underlying science needs double precision to produce useful results.
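For reference, a minimal sketch (illustrative only, not Milkyway or GPUGRID code) of how a host program can check for double precision support, which arrived with compute capability 1.3:

// Sketch: report whether device 0 supports double precision (CC >= 1.3).
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        printf("No CUDA device found\n");
        return 1;
    }
    bool has_dp = (prop.major > 1) || (prop.major == 1 && prop.minor >= 3);
    printf("%s (compute capability %d.%d): double precision %s\n",
           prop.name, prop.major, prop.minor,
           has_dp ? "supported" : "not supported");
    return 0;
}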

As for caching well, I'd expect that to depend highly on how the program was written, and not be the same for all scientific computing.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 13052 - Posted: 6 Oct 2009 | 14:30:54 UTC - in response to Message 13034.

We will take advantage of the 2.5 Tflops single precision.

gdf

zpm
Avatar
Send message
Joined: 2 Mar 09
Posts: 159
Credit: 13,639,818
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 13054 - Posted: 6 Oct 2009 | 17:32:39 UTC - in response to Message 13052.

We will take advantage of the 2.5 Tflops single precision.

gdf


With those flops, that card would finish a current WU in a little under 2 hours, if my math is correct (which it may not be).

Profile Hydropower
Avatar
Send message
Joined: 3 Apr 09
Posts: 70
Credit: 6,003,024
RAC: 0
Level
Ser
Scientific publications
watwatwatwatwatwat
Message 13070 - Posted: 7 Oct 2009 | 11:49:24 UTC - in response to Message 13054.
Last modified: 7 Oct 2009 | 11:56:09 UTC

My confidence in NVidia has dropped significantly.
They presented a mockup Fermi-Tesla card as 'the real thing' without telling anyone beforehand that it was a fake card. It took this article to make them confirm it was a mockup. This would suggest they do not have a real card.

http://www.semiaccurate.com/2009/10/01/nvidia-fakes-fermi-boards-gtc/

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 13093 - Posted: 9 Oct 2009 | 21:07:59 UTC - in response to Message 13070.
Last modified: 9 Oct 2009 | 21:09:27 UTC

Scientific accuracy can be obtained/validated in more than one way.
So, double precision is not the only method!

PS. I doubt the amount of GDDR5 will make any difference to GPUGRID, at least not any time soon. If a card is released with 512MB that costs 30% less than one with, say, 2GB, get the 512MB card. If you want to go from A to B fast and not spend too much getting there, buy the car without the caravan attached!

MarkJ
Volunteer moderator
Volunteer tester
Send message
Joined: 24 Dec 08
Posts: 738
Credit: 200,909,904
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 13101 - Posted: 10 Oct 2009 | 5:20:00 UTC - in response to Message 13070.

My confidence in NVidia has dropped significantly.
They presented a mockup Fermi-Tesla card as 'the real thing' without telling anyone beforehand that it was a fake card. It took this article to make them confirm it was a mockup. This would suggest they do not have a real card.

http://www.semiaccurate.com/2009/10/01/nvidia-fakes-fermi-boards-gtc/


Apparently the GTX260 and GTX275 have been killed off. Better get your last orders in. They expect the GTX295 will be next. Pity they haven't got anything to replace them with (ie the GT300-based cards).
____________
BOINC blog

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 13102 - Posted: 10 Oct 2009 | 5:56:41 UTC - in response to Message 13101.

My confidence in NVidia has dropped significantly.
They presented a mockup Fermi-Tesla card as 'the real thing' without telling anyone beforehand that it was a fake card. It took this article to make them confirm it was a mockup. This would suggest they do not have a real card.

http://www.semiaccurate.com/2009/10/01/nvidia-fakes-fermi-boards-gtc/

Apparently the GTX260 and GTX275 have been killed off. Better get your last orders in. They expect the GTX295 will be next. Pity they haven't got anything to replace them with (ie the GT300-based cards).

Doesn't sound too good for NVidia at the moment.

Nvidia kills GTX285, GTX275, GTX260, abandons the mid and high end market:

http://www.semiaccurate.com/2009/10/06/nvidia-kills-gtx285-gtx275-gtx260-abandons-mid-and-high-end-market/

Current NV high-end cards are dead, Fermi is so far vapor, they are stopping development on all chipsets, and there are legal problems with Intel. Add that to the Sony PS3 no-Linux debacle and I'd think GPUGRID might want to accelerate the development of an ATI client.

Profile Paul D. Buck
Send message
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 13105 - Posted: 10 Oct 2009 | 7:32:48 UTC

I have run across some other discussion threads that don't seem to see things quite so direly ... only time is going to tell if all this is true or not and how it will shake out. Maybe Nvidia is dead or will be dead ... but all the current cards are not going to die tomorrow ... so there will be plenty of support for months to come.

In the PS3 case it was more that any machine updated with the new firmware would not be able to run Linux any longer, and no new PS3 would be able to either ... I don't know how much of the total work was being done by the PS3, but if it was low enough then it is a logical decision to shut it down, because the cost was not worth the benefit, along with the fact that it would be a rapidly shrinking pool ...

The cases, thus, are not parallel. Superficially similar, but nowhere near on the same timescale ...

Showing a mock-up that is not functional at a press briefing? Makes sense to me ... who in their right mind wants to risk one of the few working prototypes to butterfingered execs or press flacks?

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 13107 - Posted: 10 Oct 2009 | 7:56:38 UTC - in response to Message 13105.

Guys the G300 is simply amazing!

I want to test it before saying so, but it could really be the reference architecture for two years. Once it is out, nobody will want to buy a GTX295. That is why the factories are not producing them anymore.
It is different for the GTX285; I am sure that it will just become a low-end card, maybe under another name.

Nvidia has always, as a practice, advertised something before they have it on the market. In this case, it is just a few months away, so quite in time. The G300 should be out between Christmas and New Year. The G300 should then be the fastest GPU out there, ahead of ATI (which, however, will be almost there with the 5870, amazing as well).

When we will support ATI does not depend on us, but on ATI themselves providing a working OpenCL implementation. I believe that this is close now.

gdf

Profile Hydropower
Avatar
Send message
Joined: 3 Apr 09
Posts: 70
Credit: 6,003,024
RAC: 0
Level
Ser
Scientific publications
watwatwatwatwatwat
Message 13112 - Posted: 10 Oct 2009 | 9:47:27 UTC - in response to Message 13105.
Last modified: 10 Oct 2009 | 10:39:54 UTC

Showing a mock-up that is not functional at a press briefing? Makes sense to me ... who in their right mind wants to risk one of the few working prototypes to butterfingered execs or press flacks?


The issue is: the mockup was presented as the real item and it was not at a press briefing, it was at the GPU Technology Conference for developers and investors.

If they had a real card, it would have made sense to me to actually show it, because at that time rumours were already spreading that they did not have a card. A few smudges on a real card, or a damaged one if it were dropped, would be nothing compared to a stain on the NVidia image.

It reminds me of a (freely reproduced) quote by an Intel exec many years ago, when they introduced the microchip: "Someone asked me 'how are they going to service such a small part when one breaks?' I replied 'if one breaks, there are plenty more where that one came from'; they just did not understand the concept of a replaceable chip." If you drop the shell of a real GT300, you replace the shell. If you cannot even show a real shell, what does that say about the egg?

Sounds like the chicken won here.

(And believe me I'd love to have a nest of GT300 today)

Profile Paul D. Buck
Send message
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 13119 - Posted: 10 Oct 2009 | 20:10:03 UTC - in response to Message 13112.

Showing a mock-up that is not functional at a press briefing? Makes sense to me ... who in their right mind wants to risk one of the few working prototypes to butterfingered execs or press flacks?


The issue is: the mockup was presented as the real item and it was not at a press briefing, it was at the GPU Technology Conference for developers and investors.

Still sounds like a dog-and-pony show ...

And I am sorry, why is waving around a "real" card that much more impressive than waving about a "fake" one ... to be honest I just want to see it work in a machine. All else is meaningless ...

As to GDF's comment, yes I would like to buy a GTX300, but if the only option is to buy at the top, I would rather buy a "slower" and less capable GTX295 so I can get the multi-Core aspect for processing two tasks at once in the same box ...at times throughput is not just measured by how fast you can pump out the tasks, but by how many you can have in flight ...

If Nvidia takes out too many levels in the structure they may lose me on my upgrade path, as productivity-wise the HD4870s I have beat all the Nvidia cards I have, and they are cheaper too ... should OpenCL hit and several projects move to support it ... well, I would not hesitate to consider replacing failed Nvidia cards with ATI versions...

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 13122 - Posted: 10 Oct 2009 | 20:27:15 UTC - in response to Message 13119.

Well, I would think that none of your ATI cards could actually run faster than a GTX285 for our application. If you mean that you get more credits, this is just due to overcrediting by the projects. Something that is bound to change long term.

gdf

Profile Hydropower
Avatar
Send message
Joined: 3 Apr 09
Posts: 70
Credit: 6,003,024
RAC: 0
Level
Ser
Scientific publications
watwatwatwatwatwat
Message 13128 - Posted: 10 Oct 2009 | 22:57:29 UTC - in response to Message 13119.
Last modified: 10 Oct 2009 | 22:58:30 UTC

Hi Paul, you wrote:

why is waving around a "real" card that much more impressive than waving about a "fake" one


If I show my Alcoholics Anonymous membership card and proclaim 'Here is my real American Express Gold card' that may not cast such a good impression in my local Gucci store.

If I pass my real American Express Gold card to the cashier and say 'could you please wrap the purse as a gift' in the same Gucci store, I may have more credibility.

If you claim to have something, it can be wise to actually show it. Especially if your audience consists of investors and important business relations. If you do not have it it may be unwise to show a mockup and pretend it is the real thing. I am not against NVidia, mind you, but this was not a wise move.

DJStarfox
Send message
Joined: 14 Aug 08
Posts: 18
Credit: 16,944
RAC: 0
Level

Scientific publications
wat
Message 13132 - Posted: 11 Oct 2009 | 3:59:29 UTC - in response to Message 12961.

Wow, did you read about their Nexus plugin for Visual Studio? Now you can have a machine/GPU state debugger for CUDA applications in an integrated development environment. Heck, I might even take on GPU coding; this stuff may be a good job skill to have in the coming years.

Profile Paul D. Buck
Send message
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 13136 - Posted: 11 Oct 2009 | 5:45:02 UTC - in response to Message 13128.

If you claim to have something, it can be wise to actually show it. Especially if your audience consists of investors and important business relations. If you do not have it it may be unwise to show a mockup and pretend it is the real thing. I am not against NVidia, mind you, but this was not a wise move.

Having worked with electronics for years I can also assure you that waving a "working" version of a card around with no anti-static protection can turn that "working" card into a dead one in moments. The cases you used are interesting but not parallel. There is no risk, other than having your card remotely scanned, in waving it about in a crowd. There is a huge risk to a chunk of electronics.

You see value to the exercise and I do not ... simple as that, and we are never going to agree ... :)

Profile Paul D. Buck
Send message
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Level
Val
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 13137 - Posted: 11 Oct 2009 | 5:52:10 UTC - in response to Message 13122.

Well, I would think that none of your ATI cards could actually run faster than a GTX285 for our application. If you mean that you get more credits, this is just due to overcrediting by the projects. Something that is bound to change long term.

I have not yet tried to compare something that is more apples to apples, but the cards' relative speeds can be guesstimated with a relative comparison of Collatz on the two cards, which would be more SP to SP ...

I know my 4870s are 3-4 times faster than the 260 cards for MW, in part because of the weakness of the 260 cards in DP capability ... but my understanding is that the same carries through, to a lesser extent, with the 260 cards and the 4870s on Collatz. The new 5870 is about 2x faster than the 4870 ... and is shipping now ...

Yes the GTX300 will redress this balance somewhat, how much is still not known as the card is not shipping yet ... then we also get into that whole price to performance thing ...

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 13143 - Posted: 11 Oct 2009 | 12:16:11 UTC - in response to Message 13137.
Last modified: 11 Oct 2009 | 12:55:27 UTC

There is a fundamental flaw with NVidia. They sell GPUs that are excellent for gamers and for crunchers, but they are expensive due to the research and design of cutting-edge technology. Partially as a result of this, they don't sell well against Intel in the low-end market – where most of the sales occur. Unfortunately for NVidia, Intel have proprietary rights on chip design and can therefore hold NVidia to ransom – and have been doing so for some time! With a year of financial instability it is little wonder that the manufacturers of a pinnacle technological design are struggling. When you are faced with competitors that are capable of flexing considerable muscle (buy our GPUs or you won't get our CPUs) and governments who are scared to help NVidia, you are in a difficult position!
It is likely that in Europe Intel could be fined for their present actions, but by the time that happens there might be two CPU manufacturers and the same two GPU manufacturers.
By most accounts the G200 range was difficult and expensive to manufacture, so it would be an unnecessary financial burden on the company to try to keep these production lines running. Now is not the time to produce a card that will not sell at a profit! It seems sensible that they are cutting manufacturing back now, several months before the release of the G300-based GPUs. I expect the new G300 line will require the full attention of NVidia – it could make or break the company. To keep manufacturing lots of pointless old lines of technology, as Intel do with their CPUs, would definitely spell the end for NVidia.

Profile Hydropower
Avatar
Send message
Joined: 3 Apr 09
Posts: 70
Credit: 6,003,024
RAC: 0
Level
Ser
Scientific publications
watwatwatwatwatwat
Message 13162 - Posted: 13 Oct 2009 | 14:33:50 UTC - in response to Message 13128.

This is just a FYI, I do not want to start a new discussion.
Paul, I agree we disagree and join you in crunching :)

http://www.nordichardware.com/news,10006.html
Which eventually links to :
http://www.xbitlabs.com/news/video/display/20091002130844_Nvidia_Admits_Showing_Dummy_Fermi_Card_at_GTC_Claims_First_Graphics_Cards_on_Track_for_Q4_2009.html

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 13165 - Posted: 13 Oct 2009 | 17:50:50 UTC - in response to Message 13162.

So by the end of the year there will be some G300s out (I hope more than a few).

gdf

chumbucket843
Send message
Joined: 22 Jul 09
Posts: 21
Credit: 195
RAC: 0
Level

Scientific publications
wat
Message 13259 - Posted: 22 Oct 2009 | 22:10:03 UTC - in response to Message 13143.

There is a fundamental flaw with NVidia. They sell GPUs that are excellent for gamers and for crunchers, but they are expensive due to the research and design of cutting-edge technology. Partially as a result of this, they don't sell well against Intel in the low-end market – where most of the sales occur. Unfortunately for NVidia, Intel have proprietary rights on chip design and can therefore hold NVidia to ransom – and have been doing so for some time! With a year of financial instability it is little wonder that the manufacturers of a pinnacle technological design are struggling. When you are faced with competitors that are capable of flexing considerable muscle (buy our GPUs or you won't get our CPUs) and governments who are scared to help NVidia, you are in a difficult position!
It is likely that in Europe Intel could be fined for their present actions, but by the time that happens there might be two CPU manufacturers and the same two GPU manufacturers.
By most accounts the G200 range was difficult and expensive to manufacture, so it would be an unnecessary financial burden on the company to try to keep these production lines running. Now is not the time to produce a card that will not sell at a profit! It seems sensible that they are cutting manufacturing back now, several months before the release of the G300-based GPUs. I expect the new G300 line will require the full attention of NVidia – it could make or break the company. To keep manufacturing lots of pointless old lines of technology, as Intel do with their CPUs, would definitely spell the end for NVidia.

That's a little out there, to think a company will fail from one generation. AMD is still with us. The 5870 is not 2x faster; it is bottlenecked by bandwidth, specifically L1 cache. On Milkyway the card gets 2 TFlops, which is a perfect match for the 2 TB/s of bandwidth. Nvidia's L1 bandwidth on Fermi should be 3 TB/s, so the card will be very fast. Something that should be noted about ATI's architecture is that it was designed for DX10, not GPGPU. Not all applications can take full advantage of VLIW or vectors. Nvidia has spent very little on R&D lately. GT200 and G92 were very small tweaks. Fermi is the same basic architecture with more programmability and cache. How GPUGRID performs with ATI is a mystery; it's going to come down to bandwidth and ILP.

zpm
Avatar
Send message
Joined: 2 Mar 09
Posts: 159
Credit: 13,639,818
RAC: 0
Level
Pro
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 13262 - Posted: 23 Oct 2009 | 2:03:04 UTC - in response to Message 13259.

well said.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 13270 - Posted: 24 Oct 2009 | 18:50:53 UTC - in response to Message 13259.

I did not say anyone would fail. I suggested that a plan to save the company was in place, and this does not include selling old lines of GPUs! Nice technical insight, but businesses go bust because they can't generate enough business to pay off their debts in time, not because of the technical details of a future architecture.

Gipsel
Send message
Joined: 17 Mar 09
Posts: 12
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 13429 - Posted: 9 Nov 2009 | 19:27:49 UTC - in response to Message 13052.

We will take advantage of the 2.5 Tflops single precision.

gdf

You expect Fermi to hit more than a 2.4 GHz shader clock? The extra MUL is gone with Fermi. The theoretical peak throughput for single precision is just:
number of SPs * 2 * shader clock
That means for the top model, with 512 SPs and a clock of 1.7 GHz (if nv reaches that), we are talking about 1.74 TFlop/s theoretical peak in single precision.
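Written out as a small sketch (the cores-per-multiprocessor numbers are the usual rule of thumb, 8 for CC 1.x and 32 or 48 for Fermi-class parts; treat the output as a theoretical ceiling, not a benchmark):

// Sketch: theoretical single-precision peak = cores * 2 (MAD/FMA) * shader clock.
#include <cstdio>
#include <cuda_runtime.h>

// Rule-of-thumb mapping from compute capability to cores per multiprocessor.
static int coresPerSM(int major, int minor) {
    if (major == 1) return 8;                        // G80 / GT200 class
    if (major == 2) return (minor == 0) ? 32 : 48;   // Fermi class
    return 32;                                       // fallback guess
}

int main() {
    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    double clockGHz = prop.clockRate / 1.0e6;        // clockRate is in kHz
    int cores = prop.multiProcessorCount * coresPerSM(prop.major, prop.minor);
    double peakGflops = cores * 2.0 * clockGHz;      // 2 flops per fused op
    printf("%s: %d cores @ %.2f GHz -> ~%.0f GFlop/s SP peak\n",
           prop.name, cores, clockGHz, peakGflops);
    // By hand: 512 cores * 2 * 1.7 GHz = 1741 GFlop/s, i.e. ~1.74 TFlop/s.
    return 0;
}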

Gipsel
Send message
Joined: 17 Mar 09
Posts: 12
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 13430 - Posted: 9 Nov 2009 | 19:48:24 UTC - in response to Message 13137.

I have not yet tried to compare something that is more apples to apples, but the cards' relative speeds can be guesstimated with a relative comparison of Collatz on the two cards, which would be more SP to SP ...

Collatz does not run a single floating point instruction on the GPU; it's pure integer math. Nvidia cards are currently slower there because 32-bit integer math is not exactly one of the strengths of Nvidia GPUs, and their memory controller and cache system have more difficulty with the random accesses necessary there (coalescing the memory accesses is simply not possible). But integer operations get a significant speedup with Fermi, and the new cache system as well as the new memory controller should be able to handle their tasks much better than on current Nvidia GPUs. With Fermi I would expect a significant speedup (definitely more than a factor of 2 compared to a GTX285) for Collatz.

How badly Nvidia currently does there (originally I expected them to be faster than ATI GPUs) is maybe clearer when I say that the average utilization of the 5 slots of each VLIW unit of the ATI GPUs is actually only ~50% (MW arrives at ~4.3/5, i.e. 86% on average). That is also the reason the GPUs consume less power on Collatz than with MW.
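To illustrate the coalescing point in generic terms (a toy sketch of my own, nothing to do with the actual Collatz or GPUGRID kernels): the first kernel reads consecutive addresses across a warp and coalesces nicely; the second reads data-dependent table entries, so a warp's accesses scatter.

// Sketch: coalesced vs. data-dependent (table lookup) global memory reads.
#include <cuda_runtime.h>

__global__ void copy_coalesced(const int* in, int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = in[i];         // neighbouring threads hit neighbouring words
}

__global__ void table_lookup(const int* idx, const int* table, int* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = table[idx[i]]; // addresses depend on the data, so a warp's
                                       // reads scatter and cannot be coalesced
}

int main() {
    const int n = 1 << 22;
    int *in, *out, *idx;
    cudaMalloc(&in,  n * sizeof(int));
    cudaMalloc(&out, n * sizeof(int));
    cudaMalloc(&idx, n * sizeof(int));
    cudaMemset(idx, 0, n * sizeof(int));   // dummy indices; a real test would randomise them
    dim3 block(256), grid((n + 255) / 256);
    copy_coalesced<<<grid, block>>>(in, out, n);
    table_lookup<<<grid, block>>>(idx, in, out, n);
    cudaDeviceSynchronize();
    cudaFree(in); cudaFree(out); cudaFree(idx);
    return 0;
}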

Gipsel
Send message
Joined: 17 Mar 09
Posts: 12
Credit: 0
RAC: 0
Level

Scientific publications
wat
Message 13431 - Posted: 9 Nov 2009 | 20:55:36 UTC - in response to Message 13259.

The 5870 is not 2x faster; it is bottlenecked by bandwidth, specifically L1 cache. [..] Nvidia's L1 bandwidth on Fermi should be 3 TB/s, so the card will be very fast.

For some problems it is 2x as fast (or even more), just think of MW and Collatz. And the L1 bandwidth didn't change per unit and clock (each SIMD engine can fetch 64 bytes per clock from L1, and there are twenty of them ;). That means it isn't more of a bottleneck than it was with the HD4800 series. What wasn't scaled as well is the L2 cache bandwidth (only the size doubled) and the memory bandwidth.

I don't know where you got the L1 bandwidth figure for Fermi from, but it is a bit speculative to assume every L/S unit can fetch 8 bytes per clock. Another estimate would be 16 SMs * 16 L/S units per SM * 4 bytes per clock = 1024 Bytes per clock (Cypress stands at 1280 Bytes/clock but at a significantly lower clockspeed and more units). With a clock of about 1.5GHz one would arrive at roughly half of your figure.

On Milkyway the card gets 2 TFlops, which is a perfect match for the 2 TB/s of bandwidth.

As said, a HD5870 has only about 1.1 TB/s of L1 cache bandwidth. The 2 TB/s figure someone came up with is actually adding the L1 cache and shared memory bandwidth. And I can tell you that it is nothing to consider at all for the MW application.
First, the MW ATI application doesn't use the shared memory (the CUDA version does, but I didn't find it useful for ATIs) and second, the MW applications are so severely compute bound that the bandwidth figures don't matter at all. Depending on the problem one has between 5 and 12 floating point operations per fetched byte (not per fetched value), and we are speaking about double precision operations. A HD5870 comes close to about 400 GFlop/s (double precision) over at MW; that means with the longer WUs (consuming less bandwidth than the shorter ones) one needs only about 33 GB/s of L1 bandwidth. Really nothing to write home about.

That is a bit different with Collatz, which is quite bandwidth hungry; not exactly cache bandwidth hungry, but memory bandwidth hungry. A HD5870 peaks somewhere just below 100 GB/s of used bandwidth. And that with virtually random accesses (16 bytes are fetched by each access) to a 16 MB buffer (larger than all caches), which is actually a huge lookup table with 2^20 entries. It is quite amazing that a HD5870 is able to pull that off (from some memory bandwidth scaling experiments with a HD4870 I first thought it would be a bottleneck). Obviously the on-chip buffers are quite deep, so they can find some consecutive accesses to raise the efficiency of the memory controllers. Contrary to Nvidia, the coalescing of the accesses is apparently not that important for ATI cards.

Something that should be noted about ATI's architecture is that it was designed for DX10, not GPGPU.

Actually, Cypress was designed for DX11 with a bit of GPGPU in mind, which is now even part of the DirectX 11 specification (DX compute shader). In fact, DX11 required that the shared memory be doubled compared to what is available on the latest DX10.x compatible cards.

GT200 and G92 were very small tweaks. Fermi is the same basic architecture with more programmability and cache.

I would really oppose this statement, as Fermi is going to be too large a step to be considered the same architecture.

How GPUGRID performs with ATI is a mystery; it's going to come down to bandwidth and ILP.

I guess GDF will best know what GPUGrid stresses most and whether there are particular weaknesses of the architectures. Generally GPUGrid does some kind of molecular dynamics, which has the potential to run fast on ATI hardware if some conditions are met. At the moment ATI's OpenCL implementation is lacking the image extension, which really helps the available bandwidth in quite a few usage scenarios. And OpenCL is far from being mature right now. That means the first ATI applications may very well not show the true potential the hardware is capable of. And whether ATI cards can match Nvidia's offerings here is of course dependent on the details of the actual code and the employed algorithms to the same extent as it is dependent on the hardware itself ;)

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 13432 - Posted: 9 Nov 2009 | 22:25:07 UTC - in response to Message 13431.

As soon as we get an HD5870 I will be able to tell you.
Maybe we can have a chat, in case it is not fast, to see where the problem is.

gdf

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 13454 - Posted: 10 Nov 2009 | 14:29:46 UTC - in response to Message 13432.

We got access to a 4850, and you were right: shared memory is still emulated via global memory, so it is of no use.

gdf

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 13688 - Posted: 24 Nov 2009 | 18:31:46 UTC - in response to Message 13454.

We got access to a 4850, and you were right: shared memory is still emulated via global memory, so it is of no use.
gdf


Not sure I am picking this up right.
Am I right in thinking that the HD4850's bottleneck is due to using System memory?

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 13861 - Posted: 10 Dec 2009 | 12:52:53 UTC - in response to Message 13688.

NVIDIA has released another GT300 card.

They did it in the only way they know how: they re-branded the GT220 and called it a GeForce 315!

Fortunately, it is OEM only, so shoppers need only look out for the existing range of p0rkies.

The first effort was particularly special. The GeForce 310 uses DDR2. At this rate, don't count out an AGP comeback.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 13872 - Posted: 10 Dec 2009 | 21:03:05 UTC - in response to Message 13861.

LOL!!!

I guess this puts nVidia's comment "You'll be surprised once the GeForce 300 lineup is complete" into a whole new light..

MrS
____________
Scanning for our furry friends since Jan 2002

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 13874 - Posted: 11 Dec 2009 | 9:05:28 UTC - in response to Message 13872.

Please nobody buy a G315 ever.

gdf

STE\/E
Send message
Joined: 18 Sep 08
Posts: 368
Credit: 3,400,942,445
RAC: 52,831,077
Level
Arg
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 13882 - Posted: 11 Dec 2009 | 21:24:16 UTC - in response to Message 13874.

Please nobody buy a G315 ever.

gdf


I have some on Pre-Order ... ;)

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 757,773,003
RAC: 363,150
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 13885 - Posted: 12 Dec 2009 | 1:39:03 UTC - in response to Message 13882.

I've noticed that some of the recent high end HP computers offer Nvidia boards that I don't remember seeing on your list of what is recommended and what is not.

notebook:

GT230M 1 GB

GT220 1 GB

desktop:

G210 512 MB

also an older card, already on your list:

GTX 260 1.8 GB

Nothing else from the list that you now recommend, and no information about whether the higher-end GTX cards will even fit.

Since some of us NEED to buy computers only from a computer company that will unpack them and do the initial setup, you might want to make sure that the first three are added to your list of what is suitable and what is not.

Also, Nvidia is rather slow about upgrading their notebook-specific drivers to have the latest capabilities; for example, their web site for downloading drivers says that their 190.* family of drivers are NOT suitable replacements for the driver for the G 105M board in my notebook computer. The closest I've been able to find is the 186.44 driver, and it did NOT come from the Nvidia site. That one provides CUDA 2.2, but not CUDA 2.3.
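For anyone who wants to double-check what a given driver actually exposes, the CUDA runtime can report both the driver's and the runtime's CUDA version (a small sketch for manual checking; BOINC does its own detection):

// Sketch: print which CUDA version the installed driver and runtime support.
// The encoded value 2020 prints as "2.2", 2030 as "2.3", and so on.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int driverVer = 0, runtimeVer = 0;
    cudaDriverGetVersion(&driverVer);    // highest CUDA version the driver supports
    cudaRuntimeGetVersion(&runtimeVer);  // CUDA version of the linked runtime
    printf("Driver supports CUDA %d.%d, runtime is CUDA %d.%d\n",
           driverVer / 1000, (driverVer % 1000) / 10,
           runtimeVer / 1000, (runtimeVer % 1000) / 10);
    return 0;
}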

I did find two sites that offer drivers for SOME Nvidia notebook cards, apparently NOT including a general-purpose one for all such cards:

http://www.nvidia.com/object/notebook_winvista_win7_x64_186.81_whql.html

http://www.laptopvideo2go.com/drivers

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 13891 - Posted: 12 Dec 2009 | 4:03:39 UTC - in response to Message 13885.

robertmiles
"don't worry, be happy" (c) Bob Marley

All drivers are universal, so I do not think that 19x.xx will not work on your video card. Look, how can it be that an older driver supports a new video card, but a newer version does not?

And furthermore, what stops you from trying 195.xx? :-)

About GF100: the latest rumours say it will be available in March.
____________

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 757,773,003
RAC: 363,150
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 13895 - Posted: 12 Dec 2009 | 13:02:25 UTC - in response to Message 13891.

The Nvidia site said otherwise when I asked it which driver was suitable.

My guess is that the 190.* Nvidia drivers are universal ONLY for the desktop graphics cards; they have separate series (186.* and 191.*) for the laptop graphics cards, and mention that even for those series, there are likely to be manufacturer-specific requirements for a specific member of the series.

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 757,773,003
RAC: 363,150
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 13898 - Posted: 12 Dec 2009 | 15:58:18 UTC

Now, Nvidia has changed their driver downloads page since the last time I looked at it. It now says that 195.62 will work on MOST, but not all, of their laptop video cards, at least on most laptops.

They are not very clear on just which board/laptop combinations it will work properly on, but that looks hopeful enough that my laptop is installing 195.62 now.

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 13899 - Posted: 12 Dec 2009 | 16:35:59 UTC

just try :-)
____________

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 13904 - Posted: 13 Dec 2009 | 12:25:25 UTC - in response to Message 13899.

Here is my opinion on the GT210, GT220 and GT315.

The G210 is a resounding 'No, don't go there' card!
The GT220, GT230M and GT315 are not cards I would buy to participate in this project, so stay away from them. They are not good value for money if you want to participate here. That said, the GT220 / GT315 might get through about 1 task a day, if the system is on 24/7.

Mobile devices tend not to be on too much. They tend to use power saving features by default, so the system goes to sleep after a very short time.
I would expect a GT230M to overheat quickly and make the system very noisy.
I found that the graphics card in my laptop caused system instability when running GPUGrid, and rarely finished a job. It was not worth it, and may have caused more bother than it was worth to the project.

As for the GTX 260 1.8GB, the amount of RAM makes no difference. What matters is the core.

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 757,773,003
RAC: 363,150
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 13919 - Posted: 14 Dec 2009 | 7:51:08 UTC - in response to Message 13899.
Last modified: 14 Dec 2009 | 8:04:00 UTC

just try :-)


So far, it's worked successfully for both Collatz and Einstein.

The G105M board in that laptop is listed as not suitable for GPUGRID, so I haven't tried it for GPUGRID.


Recently, I looked at what types of graphics boards are available for the high-end HP desktop computers, using the option to have one built with your choice of options. As far as I could tell, the highest-end graphics boards they offer are the G210 (not the same as a GT210), the GTX 260 (with no indication of which core), and for ATI, part of the HD4800 series. Looks like a good reason NOT to buy an HP computer now, even though they look good for the CPU-only BOINC projects.

Even that much wasn't available until I entered reviews for several of the high-end HP computers, saying that I had thought of buying them until I found that they weren't available with sufficiently high-end graphics cards to use with GPUGRID.

Anyone else with an HP computer (and therefore eligible for an HP Passport account) want to enter more such reviews, to see if that will persuade them even more?

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 13930 - Posted: 14 Dec 2009 | 20:56:22 UTC - in response to Message 13919.

Seriously, notebooks and GPU-crunching don't mix well. Nobody expects a desktop GPU to run under constant load, and even less so a mobile card. It will challenge the cooling system heavily, if the GPU has any horsepower at all (e.g. is not a G210).

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 13933 - Posted: 14 Dec 2009 | 21:51:50 UTC - in response to Message 13930.

MrS, as always you are quite correct, but perhaps he will be fine with Einstein, for a while at least.
It barely uses the GPU. It's not even 50/50 CPU/GPU; more like 90% CPU.
Einstein GPU+CPU tasks compared to CPU tasks only improve turnover by about 30min; from about 8h to 7h 30min and so on. When I tested it, albeit on a desktop, my temps did not even rise above normal usage. Not even by one degree! GPUGrid puts on about 13 degrees (mind you my GTX260 has 2 fans, and the system has a front fan, side fan, exhaust fan, and PSU fan).

I would not recommend using it with Collatz however!

RM, Keep an eye on the temps just in case they manage to improve on the design, you can use GPU-Z, and see if there are any notable changes.

I used to use a USB laptop cooler that just sat under the laptop blowing cold air around it. I think I bought it for around £10. At the time I was just crunching on the CPU, but it did pull the temps down to a reasonable level.

CTAPbIi
Send message
Joined: 29 Aug 09
Posts: 175
Credit: 259,509,919
RAC: 0
Level
Asn
Scientific publications
watwatwatwatwatwatwatwatwatwat
Message 13970 - Posted: 17 Dec 2009 | 17:50:58 UTC - in response to Message 13919.
Last modified: 17 Dec 2009 | 17:55:50 UTC

robertmiles

While you may have a REALLY strong reason to deal with HP (and other such brands), I personally prefer to keep miles away from them. Just look inside an HP: the very cheapest components money can buy, very often mATX mobos (with no OCing - can you believe it?), the worst cases in terms of air flow, the worst PSUs, stock CPU coolers, the cheapest RAM... The worst thing: you can do nothing, because it is sealed and under warranty. I could go on, but it's clear to me that it's just a piece of crap for a huge amount of money. Imagine: you are paying up to 50% just for the brand...

I personally built my rig early this fall:
- i7-920 D0
- TR IFX-14 (polished base + 2 Scythe Slip Stream running on 1200rpm)
- OCZ Freeze eXtreme thermal interface
- Asus P6T6 WS Revolution
- 6Gb Muskin 998691 running at 6-7-6-18@1600
- eVGA GTX275 OCed 702/1584/1260
- Enermax Infinity 720W
- CM HAF932

Can you imagine how much you would pay if you could get this from HP (which you cannot)?

This rig easily runs at 4200 (200*21) @1.4V, but I was not lucky enough to get a better CPU, so now I'm running at 4009 (21*191) @1.325V, rock solid stable.
____________

Profile robertmiles
Send message
Joined: 16 Apr 09
Posts: 503
Credit: 757,773,003
RAC: 363,150
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 13997 - Posted: 19 Dec 2009 | 21:21:58 UTC - in response to Message 13970.

robertmiles

While you may have a REALLY strong reason to deal with HP (and other such brands), I personally prefer to keep miles away from them. Just look inside an HP: the very cheapest components money can buy, very often mATX mobos (with no OCing - can you believe it?), the worst cases in terms of air flow, the worst PSUs, stock CPU coolers, the cheapest RAM... The worst thing: you can do nothing, because it is sealed and under warranty. I could go on, but it's clear to me that it's just a piece of crap for a huge amount of money. Imagine: you are paying up to 50% just for the brand...


From what I've read, Dell is even worse. Significant reliability problems, even compared to HP.

I'm no longer capable of handling a desktop well enough to build one myself, or even unpack one built elsewhere and then shipped here, so I'll need to choose SOME brand that offers the service of building it and unpacking it for me. Want to suggest one available in the southeast US? There's a local Best Buy, but they do not have any of the recommended Nvidia boards in stock; that's the closest I've found yet.

For the high-end HP products I've been looking at lately, I've found that they offer you SOME choices in what to include, at least for the custom-built models, just not enough; and they don't send them sealed. The rest of your description could fit, though; I don't have a good way of checking.

I sent an email to CyberPower today asking them if they offer the unpacking service, so there's a possibility that I may just have to check more brands to find one that meets my needs.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 14001 - Posted: 20 Dec 2009 | 12:52:51 UTC - in response to Message 13997.
Last modified: 20 Dec 2009 | 12:54:10 UTC

Ask around and find someone who can build a system for you. Pay them $200 and you will still save money compared to buying an OEM HP, Dell...
Alternatively, contact a local dealer (shop) and get them to build one.
If you must get a system with a low-spec card, get a GT240. There is not much difference in price between a GT220 and a GT240, but the GT240 will do twice the work. You could even have two GT240s instead of a GTX260 sp216 or GTX275.
The GT240 cards do not require additional power connectors, so you will not need an expensive PSU.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 14256 - Posted: 20 Jan 2010 | 21:28:11 UTC
Last modified: 20 Jan 2010 | 21:28:35 UTC

The gaming-related features of GT300 (or now GF100) have been revealed (link). Impressive raw power and more flexible than previous chips.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 14263 - Posted: 21 Jan 2010 | 16:19:49 UTC - in response to Message 14256.
Last modified: 21 Jan 2010 | 16:20:14 UTC

Good Link MrS.
The limited article I read this morning had little to offer, that one was much better.

For what it’s worth, this is what I would speculate:
It will be about 512/240 *1.4 times as fast as a GTX285 (3 times as fast).
This would make it over 65% faster than the GTX 295.

The 1.4 is based on a guess that the cores will be faster due to the 40nm GF100 architecture and the use of speedier GDDR5 (which is not confirmed).
Assuming it uses GDDR5, the frequency increase will be advantageous despite the narrower memory bus. The memory temperatures should be lower. I would say 2GB will be the new standard.

With higher transistor count the performance should be higher. With the 40nm core size (and given the low temperatures of cards such as the GT 240 and GT220) core temperatures should also be lower.

Unfortunately I cannot see these cards being sold at low prices. I would speculate that they would be well over the $500 mark; around $720 to $850 – hope I am wrong! If they are released in March at that price I don’t think I will be rushing out to buy one, though many will.
I doubt that they will be significantly more power hungry than a GTX 295, but again I would want to know before I bought one. I don’t fancy paying £400 per year to run one.

Perhaps someone else would care to speculate, or correct me.
I expect that over the coming few weeks there will be many more details released.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 14267 - Posted: 21 Jan 2010 | 19:53:11 UTC - in response to Message 14263.

Unrelated.
We are doing the last tests on the new application. After a lot of work, it is at the moment 60% faster on the same hardware than the previous one.
On Linux it is ready; we are fixing it for Windows.

gdf

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 14273 - Posted: 21 Jan 2010 | 22:53:40 UTC - in response to Message 14263.

Perhaps someone else would care to speculate


Well.. :D

- 512/240 = 2.13 times faster per clock is a good starting point

- 2x the performance also requires 2x the memory bandwidth
- they'll get approximately 2x the clock speed from GDDR5 compared to GDDR3, but the bus width drops from 512 to 384 bit -> ~1.5 times more bandwidth
-> so GDDR5 is not going to speed things up, but I'd suspect it to be sufficiently fast so it doesn't hold back performance

- the new architecture is generally more efficient per clock and has the following benefits: better cache system, much more double precision performance, much more raw geometry power, much more texture filtering power, more ROPs, more flexible "fixed function hardware"
-> the speedup due to these greatly depends on the application

- 40 nm alone doesn't make it much faster: the current ATIs aren't clocked much higher than their 55 and 65 nm brothers
- neither does the high transistor count add anything more: it's already included in the 512 "CUDA cores"

- and neither does 40 nm guarantee a cool card: look at RV790 vs RV870: double the transistor count, same clock speed, almost similar power consumption and significantly reduced voltage
-> ATI needed to lower the voltage considerably (1.3V on RV770 to 1.0V on RV870) to keep power in check

- nVidia is more than doubling transistors (1.4 to 3.0 Billion) and has already been at lower voltages of ~1.1V before (otherwise GT200 would have consumed too much power) and paid a clock speed penalty compared to G92, even at the 65 nm node (1.5 GHz on GT200 compared to 2 GHz on G92)
-> nVidia can't lower their voltage as much as ATI did (without choosing extremly low clocks) and needs to power even more transistors
-> I expect GT300 / GF100 to be heavily power limited, i.e. to run at very low voltages (maybe 0.9V) and to barely reach the same clock speeds as GT200 (it could run faster at higher voltages, but that would blow the power budget)
-> I expect anywhere between 200 and 300W for the single chip flagship.. more towards 250W than 200W
-> silent air cooling will be a real challenge and thus temperatures will be very high.. except people risk becoming deaf

- definitely high prices.. 3 Billion transistors is just f*cking large and expensive

- I think GF100's smaller brother could be a real hit: give it all the features and half the crunching power (256 shaders) together with a 256 bit GDDR5 interface. That's 66% the bandwidth and 50% performance per clock. However, since you'd now be at 1.6 - 1.8 Billion transistors it'd be a little cheaper than RV870 and consume less power. RV870 is already power constrained: it needs 1.0V to keep power in check and thus can't fully exploit the clock speed headroom the design and process have. With fewer transistors the little GF100 could hit the same power envelope at ~1.1V. Compare this to my projected 0.9V for the big GF100 and you'll get considerably higher clock speeds at a reasonable power consumption (by today's standards..) and you'd probably end up at ~66% the performance of a full GF100 at half the die size.

MrS
____________
Scanning for our furry friends since Jan 2002

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 14275 - Posted: 22 Jan 2010 | 2:29:44 UTC - in response to Message 14273.

My rough guess of 512/240 = 2.13, multiplied by an improved-architecture factor (1.4), giving about 3 times the performance of a GTX 285, is only meant as a guide to anyone reading this and interested in one of these cards, as are my price speculation and power consumption guesses!

I broadly incorporated what you more accurately referred to as architecture improvements into what I thought would result in GPU and RAM system gains via their new supporting architectures (not that I understand them well), but I was just looking (stumbling about) for an overall ballpark figure.

I take your point about the 40nm core being packed with 3B transistors; the performance vs power consumption is tricky, so there may not be any direct gain there.
Where the RAM is concerned, I spotted the lower bandwidth, but with GDDR5 being faster and with the cache improvements I guessed there might be some overall gain.
Although my methods are not accurate they might suffice at this stage.
Do you think the new architecture will improve GPUGrid performance in itself by about 40% (my guesstimate of 1.4)?

I like the sound of your little GF100. Buying one of those might just about be possible at some stage. Something around the performance of a GTX 295 or perhaps 25% faster would go down very well, especially if it uses less power.

O/T
GDF – Your’ new, 60% faster application sounds like an excellent achievement. Does it require Cuda Capable 1.3 cards or can 1.1 and 1.2 cards also benefit to this extent?

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Send message
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Scientific publications
watwatwatwatwat
Message 14276 - Posted: 22 Jan 2010 | 9:37:08 UTC - in response to Message 14275.


O/T
GDF – Your new, 60% faster application sounds like an excellent achievement. Does it require CUDA capable 1.3 cards, or can 1.1 and 1.2 cards also benefit to this extent?

It is slightly slower if compiled for 1.1. We are trying to optimize it, the only other solution is to release an application for 1.3 cards alone.
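As a rough sketch of what the 1.1 vs 1.3 split can look like at the source level (illustrative only, not the actual GPUGRID kernels; the kernel and file names are made up): the same file is built once per target, and __CUDA_ARCH__ lets the 1.3 build use features the 1.1 build cannot.

// Sketch: one source, two builds.
//   nvcc -arch=sm_11 kernel.cu -o app_cc11   (CC 1.1 cards, no double precision)
//   nvcc -arch=sm_13 kernel.cu -o app_cc13   (CC 1.3 cards, double precision available)
#include <cuda_runtime.h>

__global__ void scale(int n, float a, float* x) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
#if __CUDA_ARCH__ >= 130
    // 1.3 build: intermediate arithmetic can be carried out in double precision
    x[i] = (float)((double)a * (double)x[i]);
#else
    // 1.1 build: single precision only
    x[i] = a * x[i];
#endif
}

int main() {
    const int n = 1024;
    float* x;
    cudaMalloc(&x, n * sizeof(float));
    scale<<<(n + 127) / 128, 128>>>(n, 0.5f, x);
    cudaDeviceSynchronize();
    cudaFree(x);
    return 0;
}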

gdf

Profile Quinid
Send message
Joined: 11 Jan 10
Posts: 1
Credit: 3,791,364
RAC: 0
Level
Ala
Scientific publications
watwatwatwat
Message 14282 - Posted: 22 Jan 2010 | 17:40:17 UTC - in response to Message 14276.

I can't remember what article I read a couple of days ago, but Nvidia admitted the new cards will run VERY hot. They claimed an average PC case and cooling will NOT handle more than one of these new cards. Just FYI.....

If that's the case, I wonder if waterblock versions will be more common this time around. My GTX 260 (216) already spews air hotter than a hairdryer just running GPUGRID or Milkyway.
____________

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 14285 - Posted: 22 Jan 2010 | 20:07:00 UTC - in response to Message 14276.

It is slightly slower if compiled for 1.1. We are trying to optimize it, the only other solution is to release an application for 1.3 cards alone.

gdf

Can't the BOINC server be set to send separate apps based on compute capability? If not, and if there's only a slight difference, I'd vote for the one most compatible with all cards. Another option would be differently compiled optimized apps available to install via an app_info.xml.

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 14286 - Posted: 22 Jan 2010 | 22:31:46 UTC - in response to Message 14275.

Hi SK,

I'm not saying your factor of 1.4 is wrong. In fact, I couldn't even if I wanted to ;)

The point is that I don't have a "better" number. Just the gut feeling that in typical games the speed up will be lower, because otherwise GT200 would have been quite unbalanced in some way. However, if any of the shiny new features can be used, the speed up can be much larger. At Milkyway I'd expect a factor of 8 per clock due to the massive increase of raw DP number crunching power.

Regarding a speed up of 1.4 for GPU-Grid. Well, I wouldn't be surprised if GPU-Grid could make good use of the new features. But deciding for a specific number would be too bold..

Regarding the memory bandwidth: on GT200 the GDDR3 is clocked between 1000 and 1250 MHz (data rate = frequency * 2), whereas current GDDR5 reaches up to 1200 MHz (data rate = frequency * 4). That's a nice speed-up of 2 due to the frequency. Overall bandwidth should be 1.5 times higher than on GT200 (512 vs 384 bit). To sustain 2x the performance one needs, to a first approximation, 2x the bandwidth, so we're not winning much here.
However, as you also said, the on-chip cache is better and larger now. And GT200 wasn't terribly bandwidth constrained. That's why I think we're not winning much here, but the memory subsystem should still be fast enough.
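The arithmetic behind those figures, written out as a back-of-the-envelope sketch (using the clocks mentioned above; real boards will differ slightly):

// Back-of-the-envelope sketch of the bandwidth comparison above.
// bandwidth (GB/s) = bus_width_bits / 8 * memory_clock_GHz * data_rate_factor
#include <cstdio>

static double bandwidthGBs(int busBits, double clockGHz, int dataRate) {
    return busBits / 8.0 * clockGHz * dataRate;
}

int main() {
    // GT200-class: 512-bit GDDR3 at ~1.25 GHz, 2 transfers per clock
    double gt200 = bandwidthGBs(512, 1.25, 2);
    // Fermi/GT300-class: 384-bit GDDR5 at ~1.2 GHz, 4 transfers per clock
    double gf100 = bandwidthGBs(384, 1.20, 4);
    printf("GT200-class ~%.0f GB/s, GF100-class ~%.0f GB/s, ratio %.2fx\n",
           gt200, gf100, gf100 / gt200);   // ~160 vs ~230 GB/s, ~1.44x
    return 0;
}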

I like the sound of your little GF100.


Me too :D
I just hope nVidia doesn't do it the GT200 way: not releasing a mainstream version of the fat flagship chip at all. They've made quite a few dodgy moves in the past (just remember the endless renaming) .. but I don't think they're stupid or suicidal!

Best regards,
MrS
____________
Scanning for our furry friends since Jan 2002
