Message boards : Graphics cards (GPUs) : How many Pythons does it need to run???

Sandman192
Joined: 26 Oct 09
Posts: 13
Credit: 32,826,859
RAC: 0
Message 58888 - Posted: 8 Jun 2022 | 9:27:10 UTC
Last modified: 8 Jun 2022 | 9:41:51 UTC

I have 40 Python processes running at the same time and my GPUs are not doing a thing. I have a 1080 Ti and a 980, running GPU driver v512.59 with CUDA 11.6.
BOINC shows it's running 2 tasks, one for each GPU.
Both tasks show they would take a day to finish. One ran for an hour, never used either of my GPUs, and never went past 2%.

I checked my logs and have 17 tasks with "Error while computing".

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1451
Credit: 3,576,874,351
RAC: 264,797
Message 58889 - Posted: 8 Jun 2022 | 10:06:53 UTC - in response to Message 58888.

When you drill down into the failure reports, the salient line seems to be:

RuntimeError: [enforce fail at C:\cb\pytorch_1000000000000\work\c10\core\impl\alloc_cpu.cpp:81] data. DefaultCPUAllocator: not enough memory: you tried to allocate 3612672 bytes.

Your cards have too little memory (4095 MB) to run these tasks. Deselect the Python applications in your account settings - ACEMD 3 or 4 tasks should still run OK.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 744
Credit: 4,951,883,494
RAC: 884,245
Message 58890 - Posted: 8 Jun 2022 | 12:08:01 UTC - in response to Message 58889.

a 1080Ti has 11GB of VRAM. that is enough.

but the 4GB on the 980 might not be enough.

further, his error is about CPU memory allocation and not related to the GPU memory.
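
One way to see this is to look at which allocator the message names. The snippet below is only a rough sketch: the regex and the `classify_alloc_error` helper are made up for illustration and are not part of PyTorch or BOINC. It assumes the failure line has the same shape as the one quoted earlier in the thread.

```python
import re

# Illustrative pattern for the allocator failure quoted in this thread.
# It is written for this one message shape, not as a general parser.
ALLOC_ERROR = re.compile(
    r"(?P<allocator>Default\w+Allocator): not enough memory: "
    r"you tried to allocate (?P<bytes>\d+) bytes"
)


def classify_alloc_error(line):
    """Return (memory_kind, requested_bytes), or None if the line does not match."""
    match = ALLOC_ERROR.search(line)
    if match is None:
        return None
    # "DefaultCPUAllocator" means the failed request was for host (system) memory.
    kind = "host RAM" if "CPU" in match.group("allocator") else "other"
    return kind, int(match.group("bytes"))


# The failure line from the task report, copied from the post above.
line = (
    r"RuntimeError: [enforce fail at C:\cb\pytorch_1000000000000\work\c10"
    r"\core\impl\alloc_cpu.cpp:81] data. DefaultCPUAllocator: not enough "
    r"memory: you tried to allocate 3612672 bytes."
)

kind, requested = classify_alloc_error(line)
print(kind, round(requested / 2**20, 2), "MiB")
```

The request that failed is only a few MiB, so a single allocation like this failing suggests host memory was already exhausted by the time it was made, rather than the request itself being large.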

Sandman192
Joined: 26 Oct 09
Posts: 13
Credit: 32,826,859
RAC: 0
Message 58891 - Posted: 8 Jun 2022 | 12:35:51 UTC - in response to Message 58890.
Last modified: 8 Jun 2022 | 12:36:57 UTC

If 4GB is not enough, then the 980 should not have been sent the task in the first place.
I also have an 11GB GPU and the tasks won't run on that either.
Also, I shouldn't be running 40 Python programs at once. Only 2, one per GPU.
Plus, I have 32GB of RAM.

Keith Myers
Joined: 13 Dec 17
Posts: 1073
Credit: 1,452,740,714
RAC: 381,531
Message 58892 - Posted: 8 Jun 2022 | 17:13:09 UTC - in response to Message 58891.

The Python tasks spawn more than two processes. Usually 32 or more.

This is normal and typical of reinforcement learning.

Windows has issues with the Python GPU tasks.

It has no issues with the acemd3 and acemd4 tasks.

I suggest you try those and deselect the Python GPU tasks.

Sandman192
Joined: 26 Oct 09
Posts: 13
Credit: 32,826,859
RAC: 0
Message 58893 - Posted: 8 Jun 2022 | 21:30:24 UTC - in response to Message 58892.

> The Python tasks spawn more than two processes. Usually 32 or more.
> This is normal and typical of reinforcement learning.
> Windows has issues with the Python GPU tasks.
> It has no issues with the acemd3 and acemd4 tasks.
> Suggest you try those and deselect Python on GPU tasks.

Really!? 32 or more? Hmm.

At least the problem is known. Thank you for that.

Python affects Edge to the point where you can't browse with it while Python is running, and the cursor hangs a little at times.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 744
Credit: 4,951,883,494
RAC: 884,245
Message 58894 - Posted: 9 Jun 2022 | 13:16:12 UTC - in response to Message 58893.

to clarify, it's not running 32 GPU processes. it runs 1 GPU process with 32 CPU processes, per task. all BOINC GPU projects will use resources from both the GPU and CPU, but in most cases you only have a single CPU thread (or less) supporting the GPU process. the GPUGRID Python tasks stand out as they are essentially a multi-threaded (mt) CPU app combined with GPU work. part of the problem is that BOINC isn't coded (currently) to handle this combination. so if you are running any other projects, particularly CPU projects, BOINC thinks you have more free resources than you really do, which generally causes problems.
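
a rough back-of-the-envelope version of that mismatch, as a sketch only. the `cpu_budget` helper and the 16-core host are hypothetical, and it assumes the worst case where every worker process keeps a core busy, which won't always be true in practice.

```python
def cpu_budget(logical_cores, gpu_tasks, procs_per_task=32, reserved_per_task=1):
    """Compare the CPU threads budgeted for GPU tasks with the processes they spawn.

    reserved_per_task is the single supporting CPU thread a GPU task is
    normally assumed to need; procs_per_task is the 32 CPU processes
    mentioned in this thread. Both are parameters of this sketch, not
    values read from BOINC.
    """
    budgeted = gpu_tasks * reserved_per_task
    actual = gpu_tasks * procs_per_task
    return {
        "budgeted_threads": budgeted,
        "actual_processes": actual,
        # cores the scheduler would still consider free for CPU projects
        "believed_free": max(logical_cores - budgeted, 0),
        # cores left if every worker process kept one core busy (worst case)
        "really_free": max(logical_cores - actual, 0),
        "oversubscribed": actual > logical_cores,
    }


# hypothetical host: 16 logical cores, one Python task per GPU on two GPUs
report = cpu_budget(logical_cores=16, gpu_tasks=2)
for key, value in report.items():
    print(f"{key}: {value}")
```

with those made-up numbers, 14 cores look free for other projects while the two tasks alone can already occupy more processes than the host has cores.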

if you don't want to run python tasks, just go into your project preferences and uncheck the python tasks.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2335
Credit: 16,178,080,749
RAC: 0
Message 58895 - Posted: 9 Jun 2022 | 17:25:53 UTC - in response to Message 58893.
Last modified: 9 Jun 2022 | 17:27:12 UTC

>> The Python tasks spawn more than two processes. Usually 32 or more.
>> This is normal and typical of reinforcement learning.

> Really!? 32 or more? Hmm.
> At least the problem is known.

This is not a bug. This is a feature.

Sandman192
Joined: 26 Oct 09
Posts: 13
Credit: 32,826,859
RAC: 0
Message 58896 - Posted: 9 Jun 2022 | 22:52:42 UTC - in response to Message 58895.
Last modified: 9 Jun 2022 | 23:24:57 UTC

>>> The Python tasks spawn more than two processes. Usually 32 or more.
>>> This is normal and typical of reinforcement learning.

>> Really!? 32 or more? Hmm.
>> At least the problem is known.

> This is not a bug. This is a feature.

I don't mean the 32 Pythons running. The 15 errors I'm getting are the problem.
