Hello folks;
I have been thinking about the recent issues with this project.
As far as I understand, it has to do with problems in adopting the latest CUDA versions and/or drivers, and with adopting the RTX cards as well. Am I right?
Now, as you all know, CUDA is NVidia-only.
And as you may know, Intel is coming out with dedicated high-end GPUs next year.
Will many people buy Intel GPUs?
Yes they will, simply because many OEMs will make PCs with Intel-only inside, taking advantage of Intel's "preferred partner" programmes which favor Intel-inside-only.
Will Intel ever adopt CUDA? Probably not.
Not even if NVidia turns it into an open standard.
Because Intel already puts effort into supporting OpenCL for its current integrated GPUs.
I don't see why they should bother adopting CUDA as well then.
So... which way to go ahead in the future?
I am not a developer myself, so I have no idea how much effort it would cost to rewrite the code for OpenCL.
But with giant Intel coming to play along, I think you should at least consider this.
And with OpenCL, your software will run on all NVidia, AMD and Intel GPUs.
So please keep this option in mind for the future.
And keep an eye on what Intel is doing.
It depends upon how much weight Intel will put on its high-end GPU business.
As Intel needs high-end GPUs for the datacenters to fight AMD, I fear they will put quite a lot of weight on their GPU business in the future.
So, what if CUDA is a dead end?
(I don't know of course.)
Have a nice weekend all of you;
Carl Philip
And if you ever consider this option of migrating to OpenCL:
As Intel needs its high-end GPU business to become a success,
I'm sure that if you ever ask them for help in rewriting your code to OpenCl,
they WILL help you, for free.
Or you could ask AMD the same question, and see which one is prepared to help you out.
As far as I understand, it has to do with problems in adopting the latest CUDA versions and/or drivers, and with adopting the RTX cards as well. Am I right?
Yes and no.
There is a working GPUGrid client for Turing (RTX and GTX 16xx) and previous generation cards, but it hasn't been released for general use for unknown reasons.
with OpenCL, your software will run on all NVidia, AMD and Intel GPUs.
That's right, the advantage of OpenCL is that you can develop an application independently of the actual GPU platform and its manufacturer. But native CUDA apps are more effective (far more effective in some cases) on NVidia cards than their OpenCL counterparts. So while a project could reach a broader user base through an OpenCL app, it would lose a lot of computing power on the NVidia part of its user base at the same time. This effect would be amplified by the fact that knowledgeable NVidia owners would choose projects with recent CUDA apps instead of projects with OpenCL apps. So the project could end up with many AMD (and Intel iGPU) users, and far fewer NVidia users.
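To illustrate the portability point, here is a minimal sketch (not GPUGRID code, just the standard OpenCL host API in C) that lists every GPU the installed OpenCL runtimes expose; the same binary picks up NVidia, AMD and Intel devices through the same calls:

#include <stdio.h>
#include <CL/cl.h>

int main(void) {
    cl_platform_id plats[16];
    cl_uint nplat = 0;
    clGetPlatformIDs(16, plats, &nplat);           /* typically one platform per vendor driver */
    if (nplat > 16) nplat = 16;

    for (cl_uint p = 0; p < nplat; ++p) {
        char pname[256];
        clGetPlatformInfo(plats[p], CL_PLATFORM_NAME, sizeof pname, pname, NULL);

        cl_device_id devs[16];
        cl_uint ndev = 0;
        if (clGetDeviceIDs(plats[p], CL_DEVICE_TYPE_GPU, 16, devs, &ndev) != CL_SUCCESS)
            continue;                              /* this platform has no GPU devices */
        if (ndev > 16) ndev = 16;

        for (cl_uint d = 0; d < ndev; ++d) {
            char dname[256];
            clGetDeviceInfo(devs[d], CL_DEVICE_NAME, sizeof dname, dname, NULL);
            printf("%s : %s\n", pname, dname);     /* e.g. "NVIDIA CUDA : GeForce GT 1030" */
        }
    }
    return 0;
}

Build it against any vendor's OpenCL SDK (on Linux typically just -lOpenCL) and it prints whatever GPUs the installed drivers expose; the vendor-specific part lives entirely in the driver, which is exactly why one OpenCL application can cover all three manufacturers.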
Intel entering the discrete GPU market could tip the balance in favor of OpenCL, but it won't happen overnight. It depends on how big a market share they can reach in, say, about a year. Their present integrated GPUs don't excel at computing in terms of stability and reliability; their discrete GPUs will be based on this iGPU architecture, so I don't expect their first chips to be a serious threat to NVidia (the pioneer of GPU computing) or AMD.
Will many people buy Intel GPUs?
Yes they will, simply because many OEMs will make PCs with Intel-only inside, taking advantage of Intel's "preferred partner" programmes which favor Intel-inside-only.
If they are going to force their product on customers (as they did before for many years), and it does not meet the customers' expectations, then perhaps the rage against them will make it their last attempt to enter the GPGPU market.
Will Intel ever adopt CUDA? Probably not.
Not even if NVidia turns it into an open standard.
Because Intel already puts effort into supporting OpenCL for its current integrated GPUs.
I don't see why they should bother adopting CUDA as well then.
Intel does not have a choice between OpenCL and CUDA. There's no reason for NVidia to turn CUDA into an open standard, as they would have to make their hardware design an open standard along with it. NVidia develops their GPU hardware and CUDA together: as their hardware evolves, so does CUDA. There's no such unity in the development of other GPU hardware and OpenCL. NVidia's method could result in more efficient systems (GPU + GPU app).
Thanks for the reply and for the information!
Just to be sure you understand me correctly:
I am NOT an Intel fan. :-)
If I am a fan of one of these companies, it's probably AMD.
I am usually a fan of the underdog. :-)
But I do have Intel-based computers as well, so I'm not fanatic.
And in the end, when it comes to helping the scientists in the world of healthcare, I will be a fan of any company whose hardware works best.
There is one thing we should not underestimate:
Intel needs discrete GPUs badly, not so much for the gamers among us, but for its own long-term profitability:
Data centers (and supercomputers) are quickly becoming a mix of CPUs and GPUs,
so Intel has a lot of catching up to do on the GPU side.
And even though I'm usually a fan of the underdogs, I am pretty sure the impact of Intel in the discrete GPU market is going to be big.
They cannot afford to fail.
Quote:
"But native CUDA apps are more effective (far more effective in some cases) on NVidia cards than their OpenCL counterparts.
End of quote.
You are probably right.
Let us assume that today CUDA is technically the best solution for GPU computing indeed.
But that is only true TODAY. We don't know what the future will bring.
The problem with GLOBAL standards is that they are almost never closed standards, but open ones. And OpenCL is an open standard, while CUDA is not.
So, global standards are usually not the best from a technical point of view.
It's usually not the best solution that becomes the global standard.
Whether we like it or not.
So I stand by my point of view:
If I were part of the GPUGRID team, I would keep a very close eye on what Intel does in the short term, and on whether and how that affects worldwide OpenCL adoption (or the adoption of any other open standard, if there are others).
Aurum:
And besides, GPUGRID does not even produce enough work to keep a fraction of their crunchers busy. We can easily do several times the amount of work.
Erich56:
And besides, GPUGRID does not even produce enough work to keep a fraction of their crunchers busy. We can easily do several times the amount of work.
+ 1
And besides, GPUGRID does not even produce enough work to keep a fraction of their crunchers busy. We can easily do several times the amount of work.
Makes my point about small work units:
Wouldn't 1000 small work units make more people happy than 100 big work units? :-)
The project people seem to prefer large work units for some reason:
Maybe small work units are less interesting to the scientists somehow.
And besides, GPUGRID does not even produce enough work to keep a fraction of their crunchers busy. We can easily do several times the amount of work.
Makes my point about small work units:
Wouldn't 1000 small work units make more people happy than 100 big work units? :-)
The project people seem to prefer large work units for some reason:
Maybe small work units are less interesting to the scientists somehow.
I appreciate the enthusiasm, but let's not confuse maximum participation with maximum production. I'd rather have 100 backhoes than 1000 men with shovels if I had a lot of digging to do.
____________
Team USA forum | Team USA page
Join us and #crunchforcures. We are now also folding: join team ID 236370!
And besides, GPUGRID does not even produce enough work to keep a fraction of their crunchers busy. We can easily do several times the amount of work.
Makes my point about small work units:
Wouldn't 1000 small work units make more people happy than 100 big work units? :-)
The project people seem to prefer large work units for some reason:
Maybe small work units are less interesting to the scientists somehow.
I appreciate the enthusiasm, but let's not confuse maximum participation with maximum production. I'd rather have 100 backhoes than 1000 men with shovels if I had a lot of digging to do.
I guess you are right. :-)
For those who are interested,
here is an article from Tom's Hardware on Intel's GPU roadmap.
At first they will start with desktop GPUs in June, and then later on with all kinds of GPUs, from cellphones all the way up to datacenter GPUs.
https://www.tomshardware.com/news/intel-announces-ponte-vecchio-graphics-cards-sapphire-rapids-cpus-and-data-center-roadmap
Happy reading :-)
Carl
https://www.tomshardware.com/news/intel-announces-ponte-vecchio-graphics-cards-sapphire-rapids-cpus-and-data-center-roadmap
FYI: There's no mention of OpenCL in this article.
No, but Intel already supports OpenCL for its integrated GPUs.
By the way:
I remember you mentioning that CUDA on NVidia cards is much faster than OpenCL.
I have gathered some interesting data on OpenCL performance, which I will show you in more detail in the coming days:
I've run several work units of Milkyway on a GT730, on a GT1030, and also on an integrated Radeon R7 on an entry-level AMD A8-9600 cpu.
I started the wus on the A8 almost by accident. :-)
I really was not planning to use an integrated GPU.
But then I almost fell off my chair when I saw the results:
The GT730 needs about 25 minutes to finish a Milkyway work unit.
And then I saw the integrated Radeon R7 do the same in... 3 minutes.
When it comes to playing games, the GT730 and R7 are more or less in the same league.
But apparently, in OpenCL, the R7 is about 8 times faster than a GT730...
Could this be true?
I need to look into this in more detail, but it would say a lot about the performance optimization effort NVidia has made for OpenCL: close to zero.
I'll keep you posted.
Cheers;
Carl
Fine, if Intel can reduce the heat production inside their combined CPU+GPU packages enough that their idea of high-end performance does not require more power than either chip in the separate-CPU-and-GPU-packages approach, while still giving as much performance.
So far, however, their method concentrates so much heat into one package that it requires some rather exotic cooling, such as liquid nitrogen, to match the high end of the separate-packages approach without making the lifespan of each CPU+GPU much shorter than expected.
Milkyway workunits appear to use mostly double-precision arithmetic, and are therefore likely to run much faster than expected on hardware that is very good at double-precision. Does GPUGrid also use mostly double-precision arithmetic?
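For what it's worth, here is a rough sketch of how an application could check up front whether a card has any usable double-precision path (plain OpenCL host C again; this is not Milkyway's or GPUGRID's actual code): it queries CL_DEVICE_DOUBLE_FP_CONFIG for each GPU, which comes back as 0 on devices without native FP64 support.

#include <stdio.h>
#include <CL/cl.h>

int main(void) {
    cl_platform_id plats[16];
    cl_uint nplat = 0;
    clGetPlatformIDs(16, plats, &nplat);
    if (nplat > 16) nplat = 16;

    for (cl_uint p = 0; p < nplat; ++p) {
        cl_device_id devs[16];
        cl_uint ndev = 0;
        if (clGetDeviceIDs(plats[p], CL_DEVICE_TYPE_GPU, 16, devs, &ndev) != CL_SUCCESS)
            continue;                              /* no GPU devices on this platform */
        if (ndev > 16) ndev = 16;

        for (cl_uint d = 0; d < ndev; ++d) {
            char name[256];
            cl_device_fp_config fp64 = 0;          /* stays 0 if the query fails or FP64 is absent */
            clGetDeviceInfo(devs[d], CL_DEVICE_NAME, sizeof name, name, NULL);
            clGetDeviceInfo(devs[d], CL_DEVICE_DOUBLE_FP_CONFIG, sizeof fp64, &fp64, NULL);
            printf("%s: %s\n", name, fp64 ? "native FP64" : "no native FP64");
        }
    }
    return 0;
}

Whether FP64 is merely present or actually fast is a separate question: many consumer cards expose it but run it at a small fraction of their single-precision rate, which is exactly the kind of gap the Milkyway timings earlier in the thread hint at.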
Does GPUGrid also use mostly double-precision arithmetic?
No. It does DP on the CPU.
Good idea about Double Precision, but according to the Techpowerup GPU database this cannot be the cause of the R7 being so much faster than the GT730.
Not 100% sure if the data on that site is correct of course, but according to them,
The GT730 has a DP performance of about 230 GFlops,
while the R7 reaches only 63 GFlops.
When it comes to Single Precision though, it is reversed:
GT730: 693 GFlops
R7: 1004 GFlops
I use Techpowerup all the time to check GPU specifications, see:
GT730:
https://www.techpowerup.com/gpu-specs/geforce-gt-730.c1988
R7:
https://www.techpowerup.com/gpu-specs/radeon-r7-graphics.c2532
So I still don't understand why my R7 is 8 times faster in Milkyway compared to the GT730.
To make things even worse for the R7, it sits in a rather old AMD A8-9600 CPU and uses the computer's DDR4-2400 RAM, while my GT730 is paired with a somewhat faster Athlon 200GE and has its own much faster GDDR5 memory.
So, I still think that NVidia's implementation of OpenCL might just be very poor, although I have no proof of that for the moment.
I'll see if I can find more info.
I'll try to compare with my GT1030 too, but it is currently busy doing an ACEMD3 work unit. :-)
Greetings;
Carl
Carl Philip said:
So I still don't understand why my R7 is 8 times faster in Milkyway compared to the GT730.
From what I remember, Milkyway performs double precision calculations on the GPU. A GPU that supports double precision natively will be much faster than one that does not. Depending on the particular model, your R7 is much faster at double precision math than your GT730, according to the ratings on Wikipedia.
I used to run Milkyway on an AMD 6950 and it would run circles around my GTX 970s.
Carl Philip said:
So I still don't understand why my R7 is 8 times faster in Milkyway compared to the GT730.
From what I remember, Milkyway performs double precision calculations on the GPU. A GPU that supports double precision natively will be much faster than one that does not. Depending on the particular model, your R7 is much faster at double precision math than your GT730, according to the ratings on Wikipedia.
I used to run Milkyway on an AMD 6950 and it would run circles around my GTX 970s.
Thanks for the info!
I suppose it plays a role indeed: your GTX 970s are supposed to be much faster (at gaming) than your AMD 6950.
I just discovered that my R7 in that old A8-9600 is also 5 times faster than my GT1030 in Milkyway:
The GT1030 does a wu in about 15 minutes, while the A8-9600 does it in about 3 minutes.
If I extrapolate this, it means that this old and tiny integrated R7 iGPU in my A8-9600 is even faster in Milkyway than a GTX1050... and who knows, maybe even a 1050Ti.
That is what I call fantastic performance per Watt or per Dollar. :-)
Greetings;
Carl