Message boards : Graphics cards (GPUs) : GPUs not being used?
Author | Message |
---|---|
I built a box and put 1 EVGA GeForce GTX 780 in it. It worked fine with GPUGRID for 2 months. I just put 2 more in there, one EVGA and one PNY (yes, completely compatible and working in the system). Now that they are in, the first one is working exactly as it was, at 90-92%. The PNY is also working, at 90%. The BOINC Manager shows 2 tasks being run now. The third card, which shows as properly installed, drives the main monitor, and shows up in NVidiaInspector (like the other 2), is not working on any tasks and is sitting at 0%. This third one will, on occasion, go up to 9% or somewhere below that as I switch windows, but I cannot get it to do a task with GPUGRID. I don't see any relevant settings for the project or in the BOINC Manager itself, so I need some advice or help on how to get this third graphics card working on its own task like the other 2 are. TY | |
ID: 38982 | Rating: 0 | |
Hi, you could try modifying cc_config.xml so that BOINC uses all the GPUs in the system. | |
ID: 38985 | Rating: 0 | |
OK, when I do a search for that file, I can't find it. I found reference to it in stdoutgpudetect.txt where it keeps repeating the message: | |
ID: 38986 | Rating: 0 | |
I went ahead and made that file and started BOINC again. It looks like it has been accepted so I will wait out a full cycle of tasks to see if the other GPU kicks in. | |
ID: 38987 | Rating: 0 | |
"I went ahead and made that file and started BOINC again. It looks like it has been accepted so I will wait out a full cycle of tasks to see if the other GPU kicks in." Hello: cc_config.xml lives in the BOINC data folder (I am on Windows). If you are using a BOINC version newer than 7.2.42, the file will already be there with many options (all set to 0); just look for the ones I pointed out and change them from 0 to 1. If you have an older version, just paste the cc_config.xml file into the BOINC data folder and restart. The Event Log in BOINC Manager will then show whether the configuration file was read and whether all GPUs were detected. Greetings. | |
ID: 38990 | Rating: 0 | |
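For anyone finding this thread later, here is a minimal sketch of the kind of cc_config.xml being discussed, wrapped in a small Python script that writes it out. The `<use_all_gpus>` option is the one named in this thread. The helper name `write_cc_config` is made up for illustration and is not part of BOINC, and the data directory you pass in is an assumption you need to check on your own machine.

```python
from pathlib import Path

# Minimal cc_config.xml asking the BOINC client to use every GPU it detects.
# Only <use_all_gpus> comes from this thread; add other options as needed.
CC_CONFIG = """<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>
"""


def write_cc_config(data_dir):
    """Write cc_config.xml into data_dir and return its path (illustrative helper)."""
    target = Path(data_dir) / "cc_config.xml"
    target.write_text(CC_CONFIG, encoding="utf-8")
    return target
```

On a default Windows install you would call something like `write_cc_config(r"C:\ProgramData\BOINC")`, restart BOINC, and then look in the Event Log for the line "Config: use all coprocessors".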
It looks like this thread needs a link on how to set up cc_config.xml. | |
ID: 38999 | Rating: 0 | |
Carlesa, thank you again.
C:\ProgramData\BOINC\slots\2>C:\ProgramData\BOINC\projects\www.gpugrid.net\acemd.847-65.exe projects/www.gpugrid.net/acemd.847-65.exe --device 2
The output to this is:
# ACEMD Molecular Dynamics Version [3212]
# CUDA Synchronisation mode: BLOCKING
# CUDA Synchronisation mode: BLOCKING
# SWAN: Created context 0 on GPU 2
# SWAN Device 2 :
# Name : GeForce GTX 780
# ECC : Disabled
# Global mem : 3072MB
# Capability : 3.5
# PCI ID : 0000:03:00.0
# Device clock : 993MHz
# Memory clock : 3004MHz
# Memory width : 384bit
# SWAN Device 2 :
# Name : GeForce GTX 780
# ECC : Disabled
# Global mem : 3072MB
# Capability : 3.5
# PCI ID : 0000:03:00.0
# Device clock : 993MHz
# Memory clock : 3004MHz
# Memory width : 384bit
# Driver version : r343_00 : 34475
# SWAN: Configuring Peer Access:
# -
# SWAN NVAPI Version: NVidia Complete Version 1.10
Hopefully I am not corrupting the results, but it then goes and does the one task twice as fast, taking the total GPU usage from 48% to 72.5%. Notwithstanding, it only works per task, and once it is done (twice as fast) the cmd window comes back to a command prompt and BOINC continues with only the 2 tasks running on the first two GPUs. So I still need help. I am just not sure of anything right now when it comes to what is going wrong, but that may be my lack of knowledge about the program.
____________ 1 Corinthians 9:16 "For though I preach the gospel, I have nothing to glory of: for necessity is laid upon me; yea, woe is unto me, if I preach not the gospel!" Ephesians 6:18-20, please ;-) http://tbc-pa.org | |
ID: 39151 | Rating: 0 | |
I actually do see a few corrupted (errored out) tasks in my online task logs, so it looks like cheating does have its consequences. I need to find a solution that actually downloads and works 3 tasks as one per GPU the way the program is supposed to and not 'rigged to blow'. | |
ID: 39152 | Rating: 0 | |
In addition to any help that can be offered on these forums, is anyone who helps people on here willing to directly check my installation, files, settings, etc. via something like TeamViewer? Direct help can save a lot of frustration, both for me trying to make it work and for those helping, by just troubleshooting and doing the fix instead of the back and forth. TYYTYTYTYTYVM in advance for any help or suggestions that are given. | |
ID: 39153 | Rating: 0 | |
Your GPUs are too hot. You need to keep them reasonably cool. Use MSI Afterburner or similar to set fan speeds. | |
ID: 39195 | Rating: 0 | |
I am using nvidiaInspector's Overclocking options to do nothing but up the fan speed, but what would make you think they are too hot? Do hot GPUs cause the BOINC Manager to only load 2 slots and use 2 devices when 3 are noticed by the OS and nvidiaInspector, and can manually be loaded via the command line? I wouldn't think heat is the reason it only loads the top 2 (device 0 and device 1) even if there are 3 slots with units in them, which rarely happens unless I "Suspend" one unit of work and it loads a new third one to run on the one I turned off. But if I "Resume" any '3rd' unit after 2 are already running, it will sit in "Waiting" mode until one of the other 2 finishes, and then it will turn back on and run. I have tried setting <gpu_device_num>2</gpu_device_num> and <gpu_opencl_dev_index>2</gpu_opencl_dev_index>, but as soon as the manager is started again, it changes those values back to 0 or 1 and the same result happens. At this point I have to ask... Does anyone run 3 different GPUs in one computer with all 3 GPUs loading and running work units continuously? Is the program even built to allow that? | |
ID: 39221 | Rating: 0 | |
I mean I honestly bought 2 extra $400 GPUs to run THIS project and it is frustrating that one refuses to be used. I don't even game! | |
ID: 39222 | Rating: 0 | |
"I mean I honestly bought 2 extra $400 GPUs to run THIS project and it is frustrating that one refuses to be used. I don't even game!" Are you leaving any CPU cores free for the GPUs to use? I'm guessing you DID set <use_all_gpus> in the cc_config.xml file too? Does BOINC itself see all 3 GPUs? Look at the 'Event Log' on startup and it should list all 3 GPUs; if not, you may have to load the drivers again for the 3rd card. Windows sometimes requires that for each GPU in the system, other times it doesn't. After that it may come down to the motherboard. What brand and model do you have? | |
ID: 39223 | Rating: 0 | |
The log does see all three GPUs, listed on 3 different lines, and numbers them 0, 1, and 2. | |
ID: 39230 | Rating: 0 | |
"The log does see all three GPUs listed on 3 different lines and numbers them 0, 1, and 2." Try suspending your CPU project and see if the 3rd GPU starts crunching; if so, then yes, it's causing problems. As for "I did change <use_all_gpus> to a value of 1": 1 means yes and 0 means no, so yes, you should be using all 3. There IS a problem at some projects where BOINC won't use two Nvidia cards no matter what the settings are. I wonder if you have found a new problem with 3 cards? The only thing someone can do at those projects is use the <exclude_gpu> line to make one card crunch for a different project. To test that, do you happen to have an AMD card lying around? If so, can you take out the 3rd Nvidia GPU, put in the AMD one, and see if it tries to get work or not? Have you tried using a 'dummy plug' on the cards that do NOT have a monitor plugged into them yet? Windows has a bad habit of disabling things during startup if nothing is plugged into a device; if a GPU is disabled that way, it won't be enabled except through a restart. The only other thing I can think of: have you looked on the Asus message boards to see if there is a problem using 3 cards on that model of motherboard? I do not use TeamViewer so would not feel comfortable using it, sorry. | |
ID: 39241 | Rating: 0 | |
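Since <exclude_gpu> came up: it goes inside the <options> section of cc_config.xml. Below is a rough Python sketch that builds such a file as text. The project URL is the one from this thread, the device number 2 is only an example, and `build_exclude_config` is a hypothetical helper name, not something BOINC ships.

```python
from xml.sax.saxutils import escape

# Template for a cc_config.xml that keeps one GPU away from one project.
# With no <type> or <app> element, the exclusion is not narrowed any further.
EXCLUDE_TEMPLATE = """<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>
    <exclude_gpu>
      <url>{url}</url>
      <device_num>{device_num}</device_num>
    </exclude_gpu>
  </options>
</cc_config>
"""


def build_exclude_config(project_url, device_num):
    """Return cc_config.xml text excluding device_num from project_url."""
    return EXCLUDE_TEMPLATE.format(url=escape(project_url), device_num=int(device_num))


# Example only: keep GPU 2 from being used by GPUGRID.
print(build_exclude_config("http://www.gpugrid.net/", 2))
```

That would leave device 2 free for some other project. It is a workaround rather than a fix for a card that should be crunching here.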
OK, quick update.
12/21/2014 1:03:20 PM | GPUGRID | Sending scheduler request: Requested by user.
12/21/2014 1:03:20 PM | GPUGRID | Not requesting tasks
12/21/2014 1:03:22 PM | GPUGRID | Scheduler request completed
This leads me to believe that the issue is in the program itself and not with the hardware. This may be a false lead, but it is not a big leap to get to that conclusion either. If the program sees 3 GPUs but won't get tasks for them all, then a hardware issue seems less likely than something in the code itself or a setting I am just missing. I do have it set on the site to fetch work for 5 days, but I am not sure if that setting is only valid if you have other connection settings set?
12/21/2014 12:56:04 PM | | CUDA: NVIDIA GPU 0: GeForce GTX 780 (driver version 344.75, CUDA version 6.5, compute capability 3.5, 3072MB, 2779MB available, 4878 GFLOPS peak)
12/21/2014 12:56:04 PM | | CUDA: NVIDIA GPU 1: GeForce GTX 780 (driver version 344.75, CUDA version 6.5, compute capability 3.5, 3072MB, 2809MB available, 4698 GFLOPS peak)
12/21/2014 12:56:04 PM | | CUDA: NVIDIA GPU 2: GeForce GTX 780 (driver version 344.75, CUDA version 6.5, compute capability 3.5, 3072MB, 2809MB available, 4576 GFLOPS peak)
12/21/2014 12:56:04 PM | | OpenCL: NVIDIA GPU 0: GeForce GTX 780 (driver version 344.75, device version OpenCL 1.1 CUDA, 3072MB, 2779MB available, 4878 GFLOPS peak)
12/21/2014 12:56:04 PM | | OpenCL: NVIDIA GPU 1: GeForce GTX 780 (driver version 344.75, device version OpenCL 1.1 CUDA, 3072MB, 2809MB available, 4698 GFLOPS peak)
12/21/2014 12:56:04 PM | | OpenCL: NVIDIA GPU 2: GeForce GTX 780 (driver version 344.75, device version OpenCL 1.1 CUDA, 3072MB, 2809MB available, 4576 GFLOPS peak)
12/21/2014 12:56:04 PM | | Host name: BeastMode
12/21/2014 12:56:04 PM | | Processor: 12 GenuineIntel Intel(R) Core(TM) i7-4960X CPU @ 3.60GHz [Family 6 Model 62 Stepping 4]
12/21/2014 12:56:04 PM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes f16c rdrandsyscall nx lm avx vmx tm2 dca pbe fsgsbase smep
12/21/2014 12:56:04 PM | | OS: Microsoft Windows 8.1: Professional x64 Edition, (06.03.9600.00)
12/21/2014 12:56:04 PM | | Memory: 63.94 GB physical, 107.43 GB virtual
12/21/2014 12:56:04 PM | | Disk: 465.42 GB total, 337.59 GB free
12/21/2014 12:56:04 PM | | Local time is UTC -5 hours
12/21/2014 12:56:04 PM | | Config: report completed tasks immediately
12/21/2014 12:56:04 PM | | Config: use all coprocessors
12/21/2014 12:56:04 PM | | Config: fetch minimal work
12/21/2014 12:56:04 PM | | Config: fetch on update
12/21/2014 12:56:04 PM | GPUGRID | URL http://www.gpugrid.net/; Computer ID xxxxxx; resource share 100
12/21/2014 12:56:04 PM | GPUGRID | General prefs: from GPUGRID (last modified 19-Dec-2014 18:57:09)
12/21/2014 12:56:04 PM | GPUGRID | Computer location: home
12/21/2014 12:56:04 PM | GPUGRID | General prefs: no separate prefs for home; using your defaults
12/21/2014 12:56:04 PM | | Preferences:
12/21/2014 12:56:04 PM | | max memory usage when active: 65470.82MB
12/21/2014 12:56:04 PM | | max memory usage when idle: 65470.82MB
12/21/2014 12:56:04 PM | | max disk usage: 232.71GB
12/21/2014 12:56:04 PM | | max CPUs used: 1
12/21/2014 12:56:04 PM | | (to change preferences, visit a project web site or select Preferences in the Manager)
12/21/2014 12:56:04 PM | | Not using a proxy
I will check the ASUS website for issues with 3 GPUs. If I did find a new bug with using 3 GPUs, how and to whom would I report such a thing? Is this forum enough for them to see that and respond, test, or fix the issue? | |
ID: 39248 | Rating: 0 | |
There is a lot of confusion in this thread. | |
ID: 39250 | Rating: 0 | |
{{{Warning, long answer on its way.}}}
12/21/2014 8:35:17 PM | | Starting BOINC client version 7.4.27 for windows_x86_64
12/21/2014 8:35:17 PM | | log flags: file_xfer, sched_ops, task, sched_op_debug, slot_debug, task_debug
12/21/2014 8:35:17 PM | | log flags: work_fetch_debug
12/21/2014 8:35:17 PM | | Libraries: libcurl/7.33.0 OpenSSL/1.0.1h zlib/1.2.8
12/21/2014 8:35:17 PM | | Data directory: C:\ProgramData\BOINC
12/21/2014 8:35:17 PM | | Running under account Mike
12/21/2014 8:35:17 PM | | CUDA: NVIDIA GPU 0: GeForce GTX 780 (driver version 344.75, CUDA version 6.5, compute capability 3.5, 3072MB, 2665MB available, 4878 GFLOPS peak)
12/21/2014 8:35:17 PM | | CUDA: NVIDIA GPU 1: GeForce GTX 780 (driver version 344.75, CUDA version 6.5, compute capability 3.5, 3072MB, 2809MB available, 4698 GFLOPS peak)
12/21/2014 8:35:17 PM | | CUDA: NVIDIA GPU 2: GeForce GTX 780 (driver version 344.75, CUDA version 6.5, compute capability 3.5, 3072MB, 2809MB available, 4576 GFLOPS peak)
12/21/2014 8:35:17 PM | | OpenCL: NVIDIA GPU 0: GeForce GTX 780 (driver version 344.75, device version OpenCL 1.1 CUDA, 3072MB, 2665MB available, 4878 GFLOPS peak)
12/21/2014 8:35:17 PM | | OpenCL: NVIDIA GPU 1: GeForce GTX 780 (driver version 344.75, device version OpenCL 1.1 CUDA, 3072MB, 2809MB available, 4698 GFLOPS peak)
12/21/2014 8:35:17 PM | | OpenCL: NVIDIA GPU 2: GeForce GTX 780 (driver version 344.75, device version OpenCL 1.1 CUDA, 3072MB, 2809MB available, 4576 GFLOPS peak)
12/21/2014 8:35:17 PM | | Host name: BeastMode
12/21/2014 8:35:17 PM | | Processor: 12 GenuineIntel Intel(R) Core(TM) i7-4960X CPU @ 3.60GHz [Family 6 Model 62 Stepping 4]
12/21/2014 8:35:17 PM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 cx16 sse4_1 sse4_2 popcnt aes f16c rdrandsyscall nx lm avx vmx tm2 dca pbe fsgsbase smep
12/21/2014 8:35:17 PM | | OS: Microsoft Windows 8.1: Professional x64 Edition, (06.03.9600.00)
12/21/2014 8:35:17 PM | | Memory: 63.94 GB physical, 107.43 GB virtual
12/21/2014 8:35:17 PM | | Disk: 465.42 GB total, 337.70 GB free
12/21/2014 8:35:17 PM | | Local time is UTC -5 hours
12/21/2014 8:35:17 PM | | Config: report completed tasks immediately
12/21/2014 8:35:17 PM | | Config: use all coprocessors
12/21/2014 8:35:17 PM | | Config: fetch minimal work
12/21/2014 8:35:17 PM | | Config: fetch on update
12/21/2014 8:35:17 PM | GPUGRID | URL http://www.gpugrid.net/; Computer ID 189656; resource share 100
12/21/2014 8:35:17 PM | GPUGRID | General prefs: from GPUGRID (last modified 21-Dec-2014 12:59:37)
12/21/2014 8:35:17 PM | GPUGRID | Computer location: home
12/21/2014 8:35:17 PM | GPUGRID | General prefs: no separate prefs for home; using your defaults
12/21/2014 8:35:17 PM | | Preferences:
12/21/2014 8:35:17 PM | | max memory usage when active: 65470.82MB
12/21/2014 8:35:17 PM | | max memory usage when idle: 65470.82MB
12/21/2014 8:35:17 PM | | max disk usage: 232.71GB
12/21/2014 8:35:17 PM | | (to change preferences, visit a project web site or select Preferences in the Manager)
12/21/2014 8:35:17 PM | | [work_fetch] Request work fetch: Prefs update
12/21/2014 8:35:17 PM | | [work_fetch] Request work fetch: Startup
12/21/2014 8:35:17 PM | | Not using a proxy
12/21/2014 8:35:18 PM | | [work_fetch] ------- start work fetch state -------
12/21/2014 8:35:18 PM | | [work_fetch] target work buffer: 180.00 + 432000.00 sec
12/21/2014 8:35:18 PM | | [work_fetch] --- project states ---
12/21/2014 8:35:18 PM | GPUGRID | [work_fetch] REC 355175.655 prio -1.000 can request work
12/21/2014 8:35:18 PM | | [work_fetch] --- state for CPU ---
12/21/2014 8:35:18 PM | | [work_fetch] shortfall 5186160.00 nidle 12.00 saturated 0.00 busy 0.00
12/21/2014 8:35:18 PM | GPUGRID | [work_fetch] share 1.000
12/21/2014 8:35:18 PM | | [work_fetch] --- state for NVIDIA GPU ---
12/21/2014 8:35:18 PM | | [work_fetch] shortfall 1296540.00 nidle 3.00 saturated 0.00 busy 0.00
12/21/2014 8:35:18 PM | GPUGRID | [work_fetch] share 1.000
12/21/2014 8:35:18 PM | | [work_fetch] ------- end work fetch state -------
12/21/2014 8:35:18 PM | GPUGRID | [sched_op] Starting scheduler request
12/21/2014 8:35:18 PM | GPUGRID | [work_fetch] request: CPU (1.00 sec, 12.00 inst) NVIDIA GPU (1.00 sec, 3.00 inst)
12/21/2014 8:35:18 PM | GPUGRID | Sending scheduler request: To fetch work.
12/21/2014 8:35:18 PM | GPUGRID | Requesting new tasks for CPU and NVIDIA GPU
12/21/2014 8:35:18 PM | GPUGRID | [sched_op] CPU work request: 1.00 seconds; 12.00 devices
12/21/2014 8:35:18 PM | GPUGRID | [sched_op] NVIDIA GPU work request: 1.00 seconds; 3.00 devices
12/21/2014 8:35:20 PM | GPUGRID | Scheduler request completed: got 0 new tasks
12/21/2014 8:35:20 PM | GPUGRID | [sched_op] Server version 613
12/21/2014 8:35:20 PM | GPUGRID | No tasks sent
12/21/2014 8:35:20 PM | GPUGRID | No tasks are available for Short runs (2-3 hours on fastest card)
12/21/2014 8:35:20 PM | GPUGRID | No tasks are available for ACEMD beta version
12/21/2014 8:35:20 PM | GPUGRID | No tasks are available for Long runs (8-12 hours on fastest card)
12/21/2014 8:35:20 PM | GPUGRID | No tasks are available for the applications you have selected.
12/21/2014 8:35:20 PM | GPUGRID | Project requested delay of 31 seconds
12/21/2014 8:35:20 PM | GPUGRID | [slot] linked projects/www.gpugrid.net/logogpugrid.png to projects/www.gpugrid.net/stat_icon
12/21/2014 8:35:20 PM | GPUGRID | [slot] linked projects/www.gpugrid.net/project_1.png to projects/www.gpugrid.net/slideshow_ga_00
12/21/2014 8:35:20 PM | GPUGRID | [slot] linked projects/www.gpugrid.net/project_1.png to projects/www.gpugrid.net/slideshow_cellmd_00
12/21/2014 8:35:20 PM | GPUGRID | [slot] linked projects/www.gpugrid.net/project_2.png to projects/www.gpugrid.net/slideshow_ga_01
12/21/2014 8:35:20 PM | GPUGRID | [slot] linked projects/www.gpugrid.net/project_2.png to projects/www.gpugrid.net/slideshow_cellmd_01
12/21/2014 8:35:20 PM | GPUGRID | [slot] linked projects/www.gpugrid.net/project_3.png to projects/www.gpugrid.net/slideshow_ga_02
12/21/2014 8:35:20 PM | GPUGRID | [slot] linked projects/www.gpugrid.net/project_3.png to projects/www.gpugrid.net/slideshow_cellmd_02
12/21/2014 8:35:20 PM | GPUGRID | [work_fetch] backing off CPU 580 sec
12/21/2014 8:35:20 PM | GPUGRID | [work_fetch] backing off NVIDIA GPU 312 sec
12/21/2014 8:35:20 PM | GPUGRID | [sched_op] Deferring communication for 00:00:31
12/21/2014 8:35:20 PM | GPUGRID | [sched_op] Reason: requested by project
12/21/2014 8:35:20 PM | | [work_fetch] Request work fetch: RPC complete
12/21/2014 8:35:52 PM | | [work_fetch] Request work fetch: Backoff ended for GPUGRID
As you can tell from the log, there is something I had not previously mentioned but should have: my usage of BOINC. I am not sure I made it clear in all that I have written, or maybe I did but it was scattered across several posts: the ONLY project I run with BOINC is GPUGRID.
I have, since starting to troubleshoot, made all of my computer's resources available to the BOINC client, now knowing that the GPUGRID project has very little use for my CPUs, my memory (virtual or physical), my network resources, or my drive space, and what little it does need, it has plenty to draw from without denting anything else at all. I do have a CPU-intensive distributed project running from distributed.net, and it does NOT use the BOINC client at all, as it is a completely separate install. The only other distributed project I ever worked on before distributed.net was the United Devices project that ran under several names such as grid.org, Intel's Crunch for the Cure, and UD.com/uniteddevices.org. That was pretty much the first ever publicly accessible distributed project and it also was not BOINC, but a stand-alone install. So, this being my first and only BOINC project, I was not aware of project priorities (which adds to your confusion about what might be causing my issue, but your input still helped and hopefully will continue [and conclude] immensely), how the work fetch even works, or why I was asked about my CPU availability when GPUGRID seems to use only about 1% for each running task anyway. So looking at the logs and the results after both debug flags are turned on, it seems that it is asking for work for 3 GPUs and that it sees all 3 GPUs, both with the debug flags and without them. I have not tinkered enough to see how many tasks it has actually stored in memory or on the hard drive to work on, but as I mentioned earlier in this thread, I was once able to "Suspend" or pause one task and another one started. In that one instance, I was not able to pause any more and get more to start. I assumed that was because it knew I only had 3 GPUs, so it would not allow 4 active tasks, even if some were "Suspended", but now I think it may be because it only collected 3 tasks when fetching, so it only had 3 to work with until they were done.
But the question still remains: why will only 2 GPUs work on tasks at one time, even when it has 3 tasks to work on and knows I have 3 GPUs to work them on? Additional: I have also, during the course of the past 2 days, uninstalled and reinstalled the BOINC client completely so as to undo any tinkering and troubleshooting I had done. I know some information gets passed to the servers, which in turn gets passed back down to the client, but those things, I think, relate more to practical use than to my experimental troubleshooting. Also, yeah, I did confirm that manually adding slots, copying files, and running from the command line DOES return tasks that are either a complete error or cannot be validated. So yeah, learned my lesson there on that troubleshooting/tinkering escapade. To answer one of your direct questions, and hopefully you have already figured out the answer: "So, it is finding them. Is the concern that you have downloaded tasks that won't run? Or is the concern that it won't even download 3 tasks?" The answer is no. lol My issue is not tasks that download and never run. My issue is not the client failing to download 3 tasks. The issue is that it may download 3 tasks, but never runs more than 2 tasks at one time. It will load 1 task on one GPU and then a second task on a second GPU, and then not run a third task while those other 2 are running. So most times (when not on holiday) it is running 2 tasks (one each on 2 different GPUs) and it never runs 3, although occasionally it is only running 1, because once the first 2 tasks finish, the third task has to complete before the client gets more tasks. So if it downloads 2, it will run those two until those 2 are done. If it downloads 3, it will finish all 3 before getting any more. If it downloads 1, then it will go get a second one, but will then, in turn, not get more until both are done. I hope that is clear on all the iterations I have witnessed.
I realize now that it would seem my "min" is 1 and my "max" is 3 (based on the amount of resources available when the work_fetch does its evaluations). I also may not have changed (before the holiday slowdown) the report_results_immediately setting, which may or may not have an effect on the work fetch process, or may just affect the way results are reported for 'scoring' purposes. You say you are running BOINC 7.4.36, but according to http://boinc.berkeley.edu/download_all.php?xml=1 the recommended Windows 64-bit version is 7.4.27, and that is the version I am running. Should I find an update, or run the 32-bit version, which seems to have a higher version number, in order to try to fix this? Or a beta version that I don't have? I realize that now, as work units are tough to get out of GPUGRID, may not be the best time to get you in here to troubleshoot, as I am not working any tasks at all. When the holidays are over I will certainly be back all over this and ready to let someone take a personal look. If you think you can figure it out without actual tasks loaded, I am willing to have you take a look. A question sort of off topic: when I was doing the work for UD/grid.org, they would set a minimum (with no max) on how many times any one work unit would be sent out. Many of them were probably run hundreds of times. The reasons for this were error reduction through consistent results from different end users (reducing "jitter"), the fact that some end users would take too long or not return results at all, and that during times (like this holiday) when they would all simply be away, they would let the servers give out copies of the same tasks over and over. Why doesn't GPUGRID do this? The first answer that comes to mind is that BOINC has so many projects running that a great majority of the users could get active tasks from so many other sources that some time off from GPUGRID won't even get noticed.
But I would think that as long as somebody somewhere wants to work on your project, keep feeding them work, even if just for validation and jitter reduction reasons. Mike | |
ID: 39252 | Rating: 0 | |
Ok, nice answer :) I see you have a technical background, that's good. I'll give you answers that are hopefully at the right "level" of making sense for you. PS: Because of the GPUGrid work shortage, I won't be able to conclusively tell you what the problem is. But I am willing to troubleshoot this as long as it takes to help you solve it. | |
ID: 39253 | Rating: 0 | |
"I am using nvidiaInspector's Overclocking options to do nothing but up the fan speed, but what would make you think they are too hot?" Just answering this question: if you click on a task you ran, you can see the logs, which include temperatures: http://www.gpugrid.net/result.php?resultid=13575551
Name I11R36-SDOERR_BARNA5-62-100-RND9235_0
Workunit 10453012
Created 21 Dec 2014 | 11:26:11 UTC
Sent 21 Dec 2014 | 11:26:33 UTC
Received 21 Dec 2014 | 17:17:11 UTC
Server state Over
Outcome Validate error
Client state Done
Exit status 0 (0x0)
Computer ID 189656
Report deadline 26 Dec 2014 | 11:26:33 UTC
Run time 20,692.87
CPU time 2,424.98
Validate state Invalid
Credit 0.00
Application version Long runs (8-12 hours on fastest card) v8.47 (cuda65)
Stderr output
<core_client_version>7.4.27</core_client_version>
<![CDATA[
<stderr_txt>
# GPU [GeForce GTX 780] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 0 :
# Name : GeForce GTX 780
# ECC : Disabled
# Global mem : 3072MB
# Capability : 3.5
# PCI ID : 0000:01:00.0
# Device clock : 1058MHz
# Memory clock : 3104MHz
# Memory width : 384bit
# Driver version : r343_00 : 34475
# GPU 0 : 43C # GPU 1 : 34C # GPU 2 : 27C # GPU 0 : 47C # GPU 0 : 50C # GPU 0 : 53C # GPU 0 : 55C # GPU 0 : 58C # GPU 0 : 60C # GPU 0 : 63C # GPU 0 : 64C # GPU 0 : 66C # GPU 0 : 67C # GPU 0 : 68C # GPU 0 : 70C # GPU 0 : 71C # GPU 0 : 72C # GPU 0 : 73C # GPU 0 : 75C # GPU 0 : 76C # GPU 0 : 77C # GPU 0 : 78C # GPU 1 : 35C # GPU 2 : 28C # GPU 0 : 79C # GPU 0 : 80C # GPU 1 : 36C # GPU 1 : 37C # GPU 0 : 81C # GPU 2 : 29C # GPU 0 : 82C # GPU 1 : 38C # GPU 0 : 83C # GPU 1 : 44C # GPU 2 : 30C # GPU 1 : 46C # GPU 1 : 47C # GPU 0 : 84C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 780] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 0 :
# Name : GeForce GTX 780
# ECC : Disabled
# Global mem : 3072MB
# Capability : 3.5
# PCI ID : 0000:01:00.0
# Device clock : 1058MHz
# Memory clock : 3104MHz
# Memory width : 384bit
# Driver version : r343_00 : 34475
# GPU 0 : 66C # GPU 1 : 42C # GPU 2 : 30C # GPU 0 : 68C # GPU 0 : 70C # GPU 0 : 71C # GPU 0 : 72C # GPU 0 : 73C # GPU 0 : 74C # GPU 0 : 75C # GPU 0 : 76C # GPU 1 : 43C # GPU 0 : 77C # GPU 0 : 78C # GPU 2 : 31C # GPU 0 : 79C # GPU 0 : 80C # GPU 1 : 44C # GPU 2 : 32C # GPU 1 : 45C # GPU 2 : 33C # GPU 2 : 34C # GPU 1 : 46C # GPU 2 : 38C # GPU 2 : 43C # GPU 2 : 46C # GPU 2 : 49C # GPU 2 : 51C # GPU 2 : 54C # GPU 2 : 56C # GPU 2 : 58C # GPU 2 : 60C # GPU 2 : 61C # GPU 2 : 63C # GPU 1 : 49C # GPU 2 : 64C # GPU 0 : 81C # GPU 1 : 57C # GPU 2 : 66C # GPU 1 : 61C # GPU 2 : 67C # GPU 1 : 64C # GPU 2 : 68C # GPU 1 : 68C # GPU 2 : 69C # GPU 1 : 71C # GPU 2 : 70C # GPU 0 : 82C # GPU 1 : 74C # GPU 2 : 71C # GPU 1 : 76C # GPU 2 : 72C # GPU 1 : 77C # GPU 0 : 83C # GPU 1 : 78C # GPU 1 : 79C # GPU 2 : 73C # GPU 0 : 84C # GPU 1 : 80C # GPU 0 : 85C # GPU 0 : 86C # GPU 0 : 87C # GPU 0 : 88C # GPU 0 : 89C # GPU 0 : 90C # GPU 0 : 91C # GPU 0 : 92C # GPU 0 : 93C # GPU 0 : 94C # GPU 0 : 95C # GPU 1 : 81C # GPU 1 : 82C # GPU 0 : 96C # GPU 1 : 83C # GPU 2 : 74C # GPU 1 : 84C # GPU 2 : 75C # GPU 1 : 85C # GPU 2 : 76C # GPU 1 : 86C # GPU 2 : 77C # GPU 1 : 87C # GPU 1 : 88C
# Time per step (avg over 3675000 steps): 5.520 ms
# Approximate elapsed time for entire WU: 20700.631 s
# PERFORMANCE: 87466 Natoms 5.520 ns/day 0.000 ms/step 0.000 us/step/atom
12:15:44 (6544): called boinc_finish
Outcome Validate error
While BOINC might not be seeing all the GPUs, the GPUGrid app clearly sees all 3; GPU 0, 1 and 2 all appear in the log above. Whatever the problem there, you are not sufficiently cooling all the GPUs. 95C or 96C is dangerously high IMO, and my primary concern would be that you could damage your GPUs or other hardware. I suggest you start by working safely; hard drives don't like being cooked, neither do motherboards, RAM modules...
____________ FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help | |
ID: 39254 | Rating: 0 | |
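The stderr output quoted above can be checked without reading every line. Here is a minimal Python sketch, assuming the `# GPU n : xxC` format shown in that log. The helper name `max_temps` is made up for illustration, the sample string is a shortened stand-in for a real log, and the 80C cutoff is only a rule of thumb (an assumption), not an official limit.

```python
import re
from collections import defaultdict

# Matches stderr entries such as "# GPU 0 : 84C". Header lines like
# "# GPU [GeForce GTX 780] Platform ..." do not match: no digit follows "GPU".
TEMP_RE = re.compile(r"#\s*GPU\s+(\d+)\s*:\s*(\d+)C")


def max_temps(stderr_text):
    """Return {gpu_index: highest temperature seen, in degrees C}."""
    peaks = defaultdict(int)
    for gpu, temp in TEMP_RE.findall(stderr_text):
        peaks[int(gpu)] = max(peaks[int(gpu)], int(temp))
    return dict(peaks)


# Shortened sample in the same format as the task log.
sample = "# GPU 0 : 43C # GPU 1 : 34C # GPU 2 : 27C # GPU 0 : 96C # GPU 1 : 88C # GPU 2 : 77C"
for gpu, peak in sorted(max_temps(sample).items()):
    flag = "too hot" if peak >= 80 else "ok"  # 80C is an assumed rule of thumb
    print(f"GPU {gpu}: {peak}C ({flag})")
```

Paste the full stderr text from a task page into `max_temps` to see the peak for each card at a glance.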
I agree with skgiven: your GPUs are getting too hot. Although a GTX 780 is rated to support up to 100°C, that is its "absolute maximum operating temperature" (TjMax). It actually starts thermal downclocking at 80°C, since GPU Boost v2.0 GPUs use an 80°C threshold. And 80°C is generally about the "maximum comfort zone" for a GPU that you want to take care of. | |
ID: 39256 | Rating: 0 | |
Well, I have good news. If the current gold version of the BOINC client does have a bug that stops it from using all 3 GPUs, the .36 version does not. I updated the driver and BOINC, and now it is crunching on 3 GPUs with 3 tasks! | |
ID: 39259 | Rating: 0 | |
Both your GPUs 0 & 1 are hitting 95-96C. I wouldn't take too long to try to solve the temp problem. | |
ID: 39263 | Rating: 0 | |
OK done, using Afterburner. | |
ID: 39265 | Rating: 0 | |
Glad to hear you got things working and your priorities sorted out, hopefully. | |
ID: 39267 | Rating: 0 | |
This seems like a new topic, but it also seems like something those already responding here can answer for me. Sorry if this is answered all over the forums and for my own lack of investigation. I could not find anything with a few searches of the terms I was looking for. | |
ID: 39398 | Rating: 0 | |
Mike, there are four main things to consider here: the system architecture (especially GPU), the WU/app, exactly what GPU load/utilization/usage is, and what else you are doing on the system. | |
ID: 39399 | Rating: 0 | |
To get better GPU utilization, I added a file called app_config.xml in C:\ProgramData\BOINC\projects\www.gpugrid.net. I have 3 GPUs in the machine. This lets you run two WUs per GPU, with each getting half. My GPU utilization runs around 98% on each GPU. | |
ID: 39400 | Rating: 0 | |
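The contents of that app_config.xml did not survive in the post above, so what follows is only a sketch of what such a file typically looks like, written out by a small Python helper. A `<gpu_usage>` of 0.5 is what makes two WUs share one GPU. The app name `acemdlong` is a placeholder assumption (check client_state.xml for the real short name of the application you run), the `<cpu_usage>` value is also just an example, and `write_app_config` is a made-up helper name.

```python
from pathlib import Path

# gpu_usage 0.5 means each task claims half a GPU, so two tasks run per card.
# "acemdlong" is a placeholder app name (assumption), not confirmed by this thread.
APP_CONFIG = """<app_config>
  <app>
    <name>acemdlong</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
"""


def write_app_config(project_dir):
    """Write app_config.xml into the project folder and return its path."""
    target = Path(project_dir) / "app_config.xml"
    target.write_text(APP_CONFIG, encoding="utf-8")
    return target
```

The project folder here would be the one named in the post above, `C:\ProgramData\BOINC\projects\www.gpugrid.net`. Keep an eye on temperatures after enabling it, since two WUs per card push the GPUs harder.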
"I want to know how to push the GPUs to use closer to 100% load. I do have it running cooler and can increase the cooling to make it run even cooler yet, if need be." A simple app_config.xml file will let you run multiple GPU units on one GPU at once, thereby utilizing your GPU to its max. The problem is that since you are then pushing your GPU to work harder, the units will EACH take longer and you may not get any 'bonus' credits for finishing the units within the shorter times you are achieving now. A unit is using about 70% of your GPU right now, which means there is not enough room to load a full 2nd unit, so the two will have to share some of the GPU's resources to run, slowing each one down. At most projects that isn't a problem as there are no 'bonus' credits for finishing units faster, but here there are. I did not even address the heat issue of pushing your GPU harder! All this comes down to: you have a GPU that has more capability than what the programmers designed their software to run on, and it is just cruising through the units, while the rest of us with our older GPUs are struggling. You are at the tippy top of the spear right now. In a few years, when the rest of us upgrade to something even better, we will pass you on by and you will be the one struggling. Enjoy your time out front while you can, it will end! I for one am envious, but shopping! | |
ID: 39401 | Rating: 0 | |
"Is the GPU load a task specific item? I only ask this because I do see some tasks in the 60s and some in the 80s." Yes. It depends on:
- The speed of your GPU. The faster it is, the lower its load (since there are always small pauses where CPU support is needed).
- The number of atoms being simulated, i.e. the complexity of the WU. This can be seen in the task output in your profile.
- The physical model chosen by the scientist. The more work the GPU has to do before it needs CPU support again, the fewer pauses occur per second.
Edit regarding running multiple concurrent WUs: if you're running at 85% GPU load or better, there's little benefit, or even a performance loss, from doing it. Below 80% load, throughput improves. Those numbers are not exact, but the turning point is somewhere between them. MrS
____________ Scanning for our furry friends since Jan 2002 | |
ID: 39402 | Rating: 0 | |
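The rule of thumb in the post above can be written down as a tiny Python helper. This is only a sketch of that advice: the 80 and 85 cutoffs are the approximate figures quoted there, and `concurrent_wu_advice` is a hypothetical name.

```python
def concurrent_wu_advice(gpu_load_percent):
    """Rough advice on running two WUs per GPU, based on single-WU GPU load."""
    if gpu_load_percent >= 85:
        # Little benefit, or even a loss, from doubling up.
        return "stay with one WU per GPU"
    if gpu_load_percent < 80:
        # Throughput tends to improve with a second concurrent WU.
        return "try two WUs per GPU"
    # Between the two figures the turning point is unclear.
    return "borderline, measure both"


for load in (70, 82, 98):
    print(load, "->", concurrent_wu_advice(load))
```

Treat the result as a starting point and compare actual completion times (and credit bonuses) before settling on either setup.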
I found a way to get over 90% of the GPUs working. Since Dnetc is not a BOINC project, I set the Dnetc GPU priority to 2 and it took up the slack from GPUGrid. Unfortunately there is no happy medium and it is slowing down the GPUGrid project: my BOINC stats went from 1,500,000 to roughly 850,000 a day, while my Dnetc stats went from 7,000 a day to over 50,000. I wish I could find some happy place where Dnetc only took the idle capacity that GPUGrid doesn't touch, but any other priority either takes all the GPU or almost none of it. Cancers and other diseases are my moral priority, but getting the GPU as close to 100% as possible is why I spent the money on the cards, making it the financial priority for the moment. And the extra stats on the other side, where I've spent almost 10 years, don't hurt either. Maybe if I can think about it one day, I can figure out a way to make BOINC and Dnetc play nicer with each other. | |
ID: 39491 | Rating: 0 | |
OK, so I see that the distributed.net GPU client can set its GPU runtime priority. I can't find any information out there on how to set GPU priority in Windows the way you can with the CPU. Does BOINC or GPUGrid have any setting that can change its GPU runtime priority? Since I find that I need to run Dnetc at level 2, running BOINC GPUGrid at level 3 would then allow BOINC to use as much of the GPU as it can force, and Dnetc would take the remaining percentage. If BOINC/GPUGrid does not have any way of changing or forcing this, does anyone know how to change it per task at the Windows OS level? | |
ID: 39535 | Rating: 0 | |