
Message boards : News : More tasks: MDAD*

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Level
Ser
Message 54797 - Posted: 21 May 2020 | 12:47:33 UTC
Last modified: 21 May 2020 | 13:23:42 UTC

I'm filling up the task queue again; the new tasks are named MDAD plus a suffix.
Happy crunching!

T

Killersocke
Joined: 18 Oct 13
Posts: 53
Credit: 406,647,419
RAC: 0
Level
Gln
Message 54799 - Posted: 21 May 2020 | 13:21:28 UTC - in response to Message 54797.

thx
you make the user happy :-)

Lazydude
Joined: 25 Sep 08
Posts: 12
Credit: 161,238,437
RAC: 0
Level
Ile
Message 54801 - Posted: 21 May 2020 | 13:31:47 UTC

Thanks,
but the certificates are not OK:



05/21/20 15:26:56 | GPUGRID | [http] HTTP error: SSL peer certificate or SSH remote key was not OK
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: TLSv1.2 (IN), TLS change cipher, Client hello (1):
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: TLSv1.2 (IN), TLS handshake, Finished (20):
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: ALPN, server did not agree to a protocol
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: Server certificate:
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: subject: CN=www.ps3grid.net
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: start date: May 3 10:33:30 2020 GMT
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: expire date: Aug 1 10:33:30 2020 GMT
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: subjectAltName does not match www.gpugrid.org
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: SSL: no alternative certificate subject name matches target host name 'www.gpugrid.org'
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: Closing connection 133
05/21/20 15:26:56 | GPUGRID | [http] [ID#142] Info: TLSv1.2 (OUT), TLS alert, Client hello (1):
05/21/20 15:26:56 | GPUGRID | [http] HTTP error: SSL peer certificate or SSH remote key was not OK
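
The log above boils down to a name mismatch: the client asked for www.gpugrid.org, but the certificate was issued for www.ps3grid.net. A minimal sketch of that comparison follows. It assumes the certificate's subjectAltName list contains only the name shown as the subject in the log, and the helper names are hypothetical; real TLS libraries apply stricter rules.

```python
def dns_name_matches(hostname: str, pattern: str) -> bool:
    """Compare a host name with one certificate DNS name, ignoring case.

    Only a leading "*" label is treated as a wildcard, and it stands for
    exactly one label. This is a simplification of real TLS matching.
    """
    host_labels = hostname.lower().rstrip(".").split(".")
    pattern_labels = pattern.lower().rstrip(".").split(".")
    if len(host_labels) != len(pattern_labels):
        return False
    if pattern_labels[0] == "*":
        return host_labels[1:] == pattern_labels[1:]
    return host_labels == pattern_labels


def certificate_covers(hostname: str, san_dns_names: list[str]) -> bool:
    """Return True if any certificate DNS name matches the requested host."""
    return any(dns_name_matches(hostname, name) for name in san_dns_names)


# Assumption: the SAN list holds just the subject name seen in the log.
san_names = ["www.ps3grid.net"]
print(certificate_covers("www.gpugrid.org", san_names))
print(certificate_covers("www.ps3grid.net", san_names))
```

Under that assumption the first check fails and the second succeeds, which is the situation the client reported.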

Erich56
Joined: 1 Jan 15
Posts: 1132
Credit: 10,205,482,676
RAC: 29,855,510
Level
Trp
Message 54802 - Posted: 21 May 2020 | 13:35:59 UTC

I, too, can't download anything:


21.05.2020 15:26:41 | | Project communication failed: attempting access to reference site
21.05.2020 15:26:42 | | Internet access OK - project servers may be temporarily down.
21.05.2020 15:27:09 | GPUGRID | Started download of 1b35A00_379_2-TONI_MDADpr4sb-0-conf_file_enc
21.05.2020 15:27:10 | GPUGRID | Temporarily failed download of 1b35A00_379_2-TONI_MDADpr4sb-0-conf_file_enc: transient HTTP error
21.05.2020 15:27:10 | GPUGRID | Backing off 00:11:44 on download of 1b35A00_379_2-TONI_MDADpr4sb-0-conf_file_enc
21.05.2020 15:27:11 | | Project communication failed: attempting access to reference site
21.05.2020 15:27:12 | | Internet access OK - project servers may be temporarily down.
21.05.2020 15:28:25 | GPUGRID | Started download of 1b35A00_379_2-TONI_MDADpr4sb-0-xsc_file
21.05.2020 15:28:26 | GPUGRID | Temporarily failed download of 1b35A00_379_2-TONI_MDADpr4sb-0-xsc_file: transient HTTP error
21.05.2020 15:28:26 | GPUGRID | Backing off 00:07:54 on download of 1b35A00_379_2-TONI_MDADpr4sb-0-xsc_file
21.05.2020 15:28:27 | | Project communication failed: attempting access to reference site
21.05.2020 15:28:28 | | Internet access OK - project servers may be temporarily down.
21.05.2020 15:28:28 | GPUGRID | Started download of 1b35A00_379_2-TONI_MDADpr4sb-0-par_file
21.05.2020 15:28:29 | GPUGRID | Temporarily failed download of 1b35A00_379_2-TONI_MDADpr4sb-0-par_file: transient HTTP error
21.05.2020 15:28:29 | GPUGRID | Backing off 00:05:36 on download of 1b35A00_379_2-TONI_MDADpr4sb-0-par_file
21.05.2020 15:28:30 | | Project communication failed: attempting access to reference site
21.05.2020 15:28:31 | | Internet access OK - project servers may be temporarily down.
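
For anyone who wants to see how long each stalled file is being held back, the back-off lines in a log like the one above can be pulled out with a short script. This is only a sketch: the regular expression is inferred from the lines quoted in this post, not from any documented BOINC log format.

```python
import re
from datetime import timedelta

# Pattern inferred from the "Backing off HH:MM:SS on download of <file>" lines.
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on download of (\S+)")


def parse_backoffs(log_text: str) -> dict[str, timedelta]:
    """Map each file name to the most recent back-off delay found in the log."""
    backoffs: dict[str, timedelta] = {}
    for line in log_text.splitlines():
        match = BACKOFF_RE.search(line)
        if match:
            hours, minutes, seconds = (int(g) for g in match.groups()[:3])
            backoffs[match.group(4)] = timedelta(
                hours=hours, minutes=minutes, seconds=seconds
            )
    return backoffs


sample = """\
21.05.2020 15:27:10 | GPUGRID | Backing off 00:11:44 on download of 1b35A00_379_2-TONI_MDADpr4sb-0-conf_file_enc
21.05.2020 15:28:26 | GPUGRID | Backing off 00:07:54 on download of 1b35A00_379_2-TONI_MDADpr4sb-0-xsc_file
"""

for name, delay in parse_backoffs(sample).items():
    print(name, int(delay.total_seconds()), "seconds")
```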

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Level
Ser
Message 54803 - Posted: 21 May 2020 | 13:38:24 UTC - in response to Message 54801.

Try again please

Lazydude
Joined: 25 Sep 08
Posts: 12
Credit: 161,238,437
RAC: 0
Level
Ile
Message 54805 - Posted: 21 May 2020 | 13:48:42 UTC

Did a reset of the project.
Now I got 2 tasks downloaded without any problems.


Thanks!

[CSF] Thomas H.V. DUPONT
Joined: 20 Jul 14
Posts: 732
Credit: 126,845,366
RAC: 190,805
Level
Cys
Message 54806 - Posted: 21 May 2020 | 14:02:19 UTC

Thanks Toni!
____________
[CSF] Thomas H.V. Dupont
Founder of the team CRUNCHERS SANS FRONTIERES 2.0
www.crunchersansfrontieres

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 8,822,866,430
RAC: 19,442,844
Level
Tyr
Message 54809 - Posted: 21 May 2020 | 14:08:51 UTC

See note in HTTPS thread, message 54807

Windows downloading OK, Linux failing. Users - please mention which OS you are using when reporting.

Erich56
Joined: 1 Jan 15
Posts: 1132
Credit: 10,205,482,676
RAC: 29,855,510
Level
Trp
Message 54812 - Posted: 21 May 2020 | 14:20:21 UTC

After I deleted the task that could not be downloaded and tried a new one, the following message now appears every time I push the "update" button:

21.05.2020 16:17:00 | GPUGRID | update requested by user
21.05.2020 16:17:03 | GPUGRID | Sending scheduler request: Requested by user.
21.05.2020 16:17:03 | GPUGRID | Not requesting tasks: some download is stalled
21.05.2020 16:17:04 | GPUGRID | Scheduler request completed

OS is Windows 10.

thimios
Joined: 10 Jan 09
Posts: 5
Credit: 181,785,833
RAC: 0
Level
Ile
Message 54816 - Posted: 21 May 2020 | 15:05:16 UTC - in response to Message 54812.

After I deleted the task that could not be downloaded and tried a new one, the following message now appears every time I push the "update" button:

21.05.2020 16:17:00 | GPUGRID | update requested by user
21.05.2020 16:17:03 | GPUGRID | Sending scheduler request: Requested by user.
21.05.2020 16:17:03 | GPUGRID | Not requesting tasks: some download is stalled
21.05.2020 16:17:04 | GPUGRID | Scheduler request completed

OS is Windows 10.



If you restart BOINC, the problem will resolve itself.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1069
Credit: 40,231,533,983
RAC: 527
Level
Trp
Message 54822 - Posted: 21 May 2020 | 15:33:40 UTC - in response to Message 54816.

all three of my Linux hosts have downloaded, processed, and reported work successfully. thanks :)
____________

Erich56
Joined: 1 Jan 15
Posts: 1132
Credit: 10,205,482,676
RAC: 29,855,510
Level
Trp
Message 54823 - Posted: 21 May 2020 | 15:35:11 UTC - in response to Message 54812.

before, I wrote:

After I deleted the task that could not be downloaded and tried a new one, the following message now appears every time I push the "update" button:

21.05.2020 16:17:00 | GPUGRID | update requested by user
21.05.2020 16:17:03 | GPUGRID | Sending scheduler request: Requested by user.
21.05.2020 16:17:03 | GPUGRID | Not requesting tasks: some download is stalled
21.05.2020 16:17:04 | GPUGRID | Scheduler request completed

OS is Windows 10.


Since the problem did not vanish, I reset GPUGRID and could then download new tasks :-)

What I was wondering about though: the GPUGRID masterfile in the newly downloaded "account_www.gpugrid.net.xml" (BOINC folder) still says
"...<master_url>http://www.gpugrid.net/</master_url>..."

I would have expected it to read "https" ...
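
A quick way to check which scheme an account file records is to parse it and look at the master URL. The sketch below is hedged: only the `<master_url>` line comes from the post above, while the surrounding `<account>` wrapper in the sample and the function name are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit


def master_url_scheme(account_xml: str) -> str:
    """Return the URL scheme ("http" or "https") of the first <master_url>."""
    root = ET.fromstring(account_xml)
    element = next(root.iter("master_url"), None)
    if element is None or not element.text:
        raise ValueError("no <master_url> element found")
    return urlsplit(element.text.strip()).scheme


# Assumed minimal file layout; only the master_url line is quoted in the thread.
sample = """<account>
    <master_url>http://www.gpugrid.net/</master_url>
</account>"""

print(master_url_scheme(sample))
```

To run it against a real client, the file contents would be read from the BOINC data folder and passed to `master_url_scheme`.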

FrostyDog
Joined: 21 Apr 20
Posts: 4
Credit: 14,057,899
RAC: 2,182
Level
Pro
Message 54826 - Posted: 21 May 2020 | 15:59:55 UTC

Oh what a lovely surprise. Suddenly realized I had a task running for GPUGrid and was surprised thinking it was one of the very last re-runs of an old task in the list.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2343
Credit: 16,206,655,749
RAC: 261,147
Level
Trp
Message 54837 - Posted: 21 May 2020 | 18:54:33 UTC
Last modified: 21 May 2020 | 18:54:53 UTC

There are some bad workunits in the new batch.

EXCEPTIONAL CONDITION: src\mdio\bincoord.c, line 193: "nelems != 1"
https://www.gpugrid.net/workunit.php?wuid=20162190
https://www.gpugrid.net/workunit.php?wuid=20162534
https://www.gpugrid.net/workunit.php?wuid=20009439
https://www.gpugrid.net/workunit.php?wuid=20009346
https://www.gpugrid.net/workunit.php?wuid=20009664
and
ERROR: src\mdsim\trajectory.cpp line 135: Simulation box has to be rectangular!
https://www.gpugrid.net/workunit.php?wuid=20009564

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1069
Credit: 40,231,533,983
RAC: 527
Level
Trp
Message 54838 - Posted: 21 May 2020 | 18:57:57 UTC - in response to Message 54837.

can confirm. I'm seeing a high number of bad WUs coming through here too.
____________

Trotador
Joined: 25 Mar 12
Posts: 103
Credit: 13,920,977,393
RAC: 9,448,390
Level
Trp
Message 54839 - Posted: 21 May 2020 | 19:08:32 UTC

+1

joukohan
Joined: 17 Oct 16
Posts: 5
Credit: 17,032,834
RAC: 0
Level
Pro
Message 54842 - Posted: 21 May 2020 | 19:46:16 UTC
Last modified: 21 May 2020 | 19:51:48 UTC

My Windows machine has already validated one WU OK, but WUs on my Debian machine end up with the same error:

EXCEPTIONAL CONDITION: /home/user/conda/conda-bld/acemd3_1570536635323/work/src/mdio/bincoord.c, line 193: "nelems != 1"


EDIT: Now it seems to be crunching better on Debian too; the done % is actually going up instead of erroring out instantly.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1069
Credit: 40,231,533,983
RAC: 527
Level
Trp
Message 54844 - Posted: 21 May 2020 | 20:28:42 UTC

not sure if something should be done about the really high number of bad WUs. one of my systems just went through like 100 bad ones.
____________

=Lupus=
Joined: 10 Nov 07
Posts: 10
Credit: 12,777,491
RAC: 0
Level
Pro
Message 54848 - Posted: 21 May 2020 | 21:07:32 UTC - in response to Message 54844.
Last modified: 21 May 2020 | 21:07:58 UTC

Ah, and I thought the fault was on my machine's side... seems WUs are slightly shaky

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 8,822,866,430
RAC: 19,442,844
Level
Tyr
Message 54850 - Posted: 21 May 2020 | 21:26:16 UTC

Most of the 0-50 tasks seem to be bad - 0-10 are usually OK.

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1069
Credit: 40,231,533,983
RAC: 527
Level
Trp
Message 54851 - Posted: 21 May 2020 | 22:19:25 UTC

80-90% of what I'm downloading is bombing out. Setting NNT (No New Tasks) until it calms down. All of the errors are kicking me into long backoffs and just wasting time.
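
For hosts managed from a script rather than the BOINC Manager, "No New Tasks" can also be set with the `boinccmd` command-line tool, whose `--project <URL> nomorework` operation has the same effect. The sketch below only builds the command. The project URL is the master URL quoted earlier in this thread, the helper name is hypothetical, and actually running the command assumes `boinccmd` is on the PATH of a host with a running client.

```python
import shlex

# Master URL as quoted earlier in the thread; adjust if your client differs.
PROJECT_URL = "http://www.gpugrid.net/"

# Project operations accepted by boinccmd --project that this sketch allows.
ALLOWED_OPS = {"update", "reset", "suspend", "resume", "nomorework", "allowmorework"}


def boinccmd_project_op(project_url: str, op: str) -> list[str]:
    """Build the argument list for a boinccmd project operation."""
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported project operation: {op}")
    return ["boinccmd", "--project", project_url, op]


cmd = boinccmd_project_op(PROJECT_URL, "nomorework")
print(shlex.join(cmd))
# To execute it for real: subprocess.run(cmd, check=True)
```

Passing `allowmorework` instead would re-enable work fetch once the bad batch has cleared.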
____________

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 8,822,866,430
RAC: 19,442,844
Level
Tyr
Message 54852 - Posted: 21 May 2020 | 22:32:31 UTC

What's more, I'm starting to get resends of the tasks which couldn't negotiate an SSL server name earlier today, and need manual tweaking to download. Night-time here, and I don't want to stop BOINC to muck about, because they're on machines with mixed GPUs and can't be relied on to restart on the right card.

They'll just have to wait it out overnight and I'll sort them out in the morning.

Erich56
Joined: 1 Jan 15
Posts: 1132
Credit: 10,205,482,676
RAC: 29,855,510
Level
Trp
Message 54861 - Posted: 22 May 2020 | 4:49:32 UTC - in response to Message 54848.

... seems WUs are slightly shaky

I've had 3 faulty ones last night, all with "195 (0xc3) EXIT_CHILD_FAILED":

http://www.gpugrid.net/result.php?resultid=25128849
http://www.gpugrid.net/result.php?resultid=25125004
http://www.gpugrid.net/result.php?resultid=25082116

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Level
Ser
Message 54862 - Posted: 22 May 2020 | 7:22:25 UTC - in response to Message 54861.

Confirmed. About 10% of the tasks were created with a missing file, which makes them crash on startup. I'm figuring out the best course of action.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 8,822,866,430
RAC: 19,442,844
Level
Tyr
Message 54863 - Posted: 22 May 2020 | 7:48:40 UTC - in response to Message 54862.

Confirmed. About 10% of the tasks were created with a missing file, which makes them crash on startup. I'm figuring out the best course of action.

OK, so long as you know - I'll carry on burning them off as quickly as I can ;-)

You're going to have a bit of an extra bandwidth bill this month for us downloading the files that were created.

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Level
Ser
Message 54864 - Posted: 22 May 2020 | 8:07:10 UTC - in response to Message 54863.

I'm cancelling them.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 8,822,866,430
RAC: 19,442,844
Level
Tyr
Message 54865 - Posted: 22 May 2020 | 8:28:57 UTC - in response to Message 54864.

I'm cancelling them.

And it's working. All GPUs are either running productive work, or have viable tasks waiting to run after backup projects have finished. Thank you.

BladeD
Joined: 1 May 11
Posts: 9
Credit: 144,358,529
RAC: 0
Level
Cys
Message 54867 - Posted: 22 May 2020 | 9:37:19 UTC

5/22/2020 4:37:15 AM | GPUGRID | Started download of 1a5cA00_379_0-TONI_MDADex7sa-0-pdb_file
5/22/2020 4:37:16 AM | | Project communication failed: attempting access to reference site
5/22/2020 4:37:16 AM | GPUGRID | Temporarily failed download of 1a5cA00_379_0-TONI_MDADex7sa-0-pdb_file: transient HTTP error
5/22/2020 4:37:16 AM | GPUGRID | Backing off 04:46:52 on download of 1a5cA00_379_0-TONI_MDADex7sa-0-pdb_file
5/22/2020 4:37:17 AM | | Internet access OK - project servers may be temporarily down.


____________

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 8,822,866,430
RAC: 19,442,844
Level
Tyr
Message 54868 - Posted: 22 May 2020 | 9:48:39 UTC - in response to Message 54867.

It often happens here - the project server is very busy, and I think constrained for bandwidth. Wait a couple of minutes and try again.

Zirma
Joined: 21 Apr 20
Posts: 13
Credit: 4,411,884
RAC: 0
Level
Ala
Message 54869 - Posted: 22 May 2020 | 10:09:31 UTC

23 of 23 don't work

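(Columns: task name, work unit, computer, sent, time reported, status, run time (sec), CPU time (sec), credit, application)
1ezgA00_320_4-TONI_MDADex1se-0-50-RND5032_5 20179781 543598 22 May 2020 | 7:57:25 UTC 22 May 2020 | 8:47:14 UTC Error while computing 5.30 0.00 --- New version of ACEMD v2.10 (cuda101)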
1ev0A00_450_3-TONI_MDADex1se-0-50-RND6497_4 20179280 543598 22 May 2020 | 7:22:19 UTC 22 May 2020 | 7:24:03 UTC Error while computing 6.17 0.00 --- New version of ACEMD v2.10 (cuda101)
1eu3A02_348_4-TONI_MDADex1se-0-50-RND0288_5 20179102 543598 22 May 2020 | 7:20:15 UTC 22 May 2020 | 7:22:19 UTC Error while computing 6.17 0.02 --- New version of ACEMD v2.10 (cuda101)
1etb200_348_4-TONI_MDADex1se-0-50-RND4587_5 20178954 543598 22 May 2020 | 7:18:06 UTC 22 May 2020 | 7:20:15 UTC Error while computing 6.16 0.00 --- New version of ACEMD v2.10 (cuda101)
1e8gA04_379_2-TONI_MDADex1se-0-50-RND1569_7 20176164 543598 22 May 2020 | 6:56:28 UTC 22 May 2020 | 7:18:06 UTC Error while computing 6.38 0.00 --- New version of ACEMD v2.10 (cuda101)
1e8gA04_450_3-TONI_MDADex1se-0-50-RND0934_6 20176186 543598 22 May 2020 | 6:06:23 UTC 22 May 2020 | 6:08:27 UTC Error while computing 6.25 0.02 --- New version of ACEMD v2.10 (cuda101)
1ba5A00_413_3-TONI_MDADex1sb-0-50-RND8087_6 20163061 543598 22 May 2020 | 6:04:55 UTC 22 May 2020 | 6:06:23 UTC Error while computing 7.15 0.00 --- New version of ACEMD v2.10 (cuda101)
1eb4A02_413_2-TONI_MDADex1se-0-50-RND9618_7 20176742 543598 22 May 2020 | 6:02:37 UTC 22 May 2020 | 6:04:55 UTC Error while computing 9.20 0.02 --- New version of ACEMD v2.10 (cuda101)
1encA00_320_0-TONI_MDADex1se-0-50-RND5173_2 20178305 543598 22 May 2020 | 6:00:23 UTC 22 May 2020 | 6:02:37 UTC Error while computing 6.12 0.00 --- New version of ACEMD v2.10 (cuda101)
1edqA03_348_4-TONI_MDADex1se-0-50-RND8101_5 20177101 543598 22 May 2020 | 5:58:59 UTC 22 May 2020 | 6:00:23 UTC Error while computing 6.07 0.00 --- New version of ACEMD v2.10 (cuda101)
1e8uA00_450_4-TONI_MDADex1se-0-50-RND7534_2 20176331 543598 22 May 2020 | 5:55:17 UTC 22 May 2020 | 5:56:55 UTC Error while computing 7.13 0.00 --- New version of ACEMD v2.10 (cuda101)
1ej6A01_450_0-TONI_MDADex1se-0-50-RND9222_4 20177970 543598 22 May 2020 | 5:53:54 UTC 22 May 2020 | 5:55:17 UTC Error while computing 6.55 0.00 --- New version of ACEMD v2.10 (cuda101)
1e5wA04_413_4-TONI_MDADex1se-0-50-RND9110_7 20175698 543598 22 May 2020 | 5:52:10 UTC 22 May 2020 | 5:53:54 UTC Error while computing 6.57 0.02 --- New version of ACEMD v2.10 (cuda101)
1efpB00_413_3-TONI_MDADex1se-0-50-RND6574_6 20177356 543598 22 May 2020 | 5:50:31 UTC 22 May 2020 | 5:52:10 UTC Error while computing 5.85 0.00 --- New version of ACEMD v2.10 (cuda101)
1e20A00_320_4-TONI_MDADex1se-0-50-RND7445_6 20175171 543598 22 May 2020 | 5:48:44 UTC 22 May 2020 | 5:50:31 UTC Error while computing 6.53 0.02 --- New version of ACEMD v2.10 (cuda101)
1e8uA00_450_0-TONI_MDADex1se-0-50-RND6268_2 20176319 543598 22 May 2020 | 5:46:13 UTC 22 May 2020 | 5:48:44 UTC Error while computing 6.37 0.00 --- New version of ACEMD v2.10 (cuda101)
1e7lA02_450_4-TONI_MDADex1se-0-50-RND8816_6 20175989 543598 22 May 2020 | 5:43:58 UTC 22 May 2020 | 5:46:13 UTC Error while computing 30.73 0.66 --- New version of ACEMD v2.10 (cuda101)
1e6dM01_379_2-TONI_MDADex1se-0-50-RND3537_5 20175771 543598 22 May 2020 | 5:42:03 UTC 22 May 2020 | 5:43:58 UTC Error while computing 6.23 0.00 --- New version of ACEMD v2.10 (cuda101)
1ej5A00_413_0-TONI_MDADex1se-0-50-RND4767_1 20177877 543598 22 May 2020 | 5:40:10 UTC 22 May 2020 | 5:42:03 UTC Error while computing 6.15 0.00 --- New version of ACEMD v2.10 (cuda101)
1e6vA03_348_1-TONI_MDADex1se-0-50-RND5457_5 20175828 543598 22 May 2020 | 5:38:16 UTC 22 May 2020 | 5:40:10 UTC Error while computing 6.56 0.00 --- New version of ACEMD v2.10 (cuda101)

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 8,822,866,430
RAC: 19,442,844
Level
Tyr
Message 54870 - Posted: 22 May 2020 | 10:11:31 UTC - in response to Message 54869.

Read the older posts in this thread. There was a problem, but it's over - they've been cancelled.

Zirma
Joined: 21 Apr 20
Posts: 13
Credit: 4,411,884
RAC: 0
Level
Ala
Message 54871 - Posted: 22 May 2020 | 10:14:20 UTC - in response to Message 54870.

Cancelled? I still get them:

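(Columns: task name, work unit, computer, sent, time reported, status, run time (sec), CPU time (sec), credit, application)
1eb6A00_450_3-TONI_MDADex1se-0-50-RND7847_4 20176835 543598 22 May 2020 | 5:20:15 UTC 22 May 2020 | 5:38:16 UTC Error while computing 6.38 0.00 --- New version of ACEMD v2.10 (cuda101)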
1eokA00_379_1-TONI_MDADex1se-0-50-RND6186_0 20178430 543598 22 May 2020 | 5:56:55 UTC 22 May 2020 | 5:58:59 UTC Error while computing 6.91 0.00 --- New version of ACEMD v2.10 (cuda101)
1a8oA00_348_3-TONI_MDADpr4sa-9-10-RND7509_0 20009637 543598 11 May 2020 | 3:23:47 UTC 11 May 2020 | 7:04:15 UTC Error while computing 6.10 0.02 --- New version of ACEMD v2.10 (cuda101)

Not only ...ex1..., but even pr4sa...
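
Failure lists like the ones above are easier to compare when the task names are grouped by the two numbers before the RND part (the "0-50" and "0-10" labels mentioned earlier in the thread). The sketch below does that grouping. The regular expression, the field names in it, and the function name are assumptions inferred from the names posted here, not an official naming scheme.

```python
import re
from collections import Counter

# Assumed layout: <system>-<OWNER>_<batch>-<step>-<total>-RND<digits>[_<n>]
TASK_RE = re.compile(
    r"^(?P<system>.+?)-(?P<owner>[A-Z]+)_(?P<batch>.+)"
    r"-(?P<step>\d+)-(?P<total>\d+)-RND\d+(?:_\d+)?$"
)


def chain_label(task_name: str) -> str:
    """Return the "<step>-<total>" label of a task name, e.g. "0-50"."""
    match = TASK_RE.match(task_name)
    if match is None:
        raise ValueError(f"unrecognised task name: {task_name}")
    return f"{match['step']}-{match['total']}"


# Three of the failed task names quoted in the posts above.
failed = [
    "1ezgA00_320_4-TONI_MDADex1se-0-50-RND5032_5",
    "1ev0A00_450_3-TONI_MDADex1se-0-50-RND6497_4",
    "1a8oA00_348_3-TONI_MDADpr4sa-9-10-RND7509_0",
]

print(Counter(chain_label(name) for name in failed))
```

A tally like this makes it easier to say which groups of tasks are failing when reporting to the project staff.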

tullio
Joined: 8 May 18
Posts: 190
Credit: 104,426,808
RAC: 0
Level
Cys
Message 54872 - Posted: 22 May 2020 | 10:39:56 UTC

Tasks 0-50 seem to work right now. I had 36 failures.
Tullio
____________

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Level
Ser
Message 54873 - Posted: 22 May 2020 | 10:42:22 UTC - in response to Message 54871.

Cancellation is always flaky. Let them wither.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 8,822,866,430
RAC: 19,442,844
Level
Tyr
Message 54874 - Posted: 22 May 2020 | 11:37:24 UTC - in response to Message 54871.

Cancelled? I still get them:

You got them - past tense. You had already returned them before Toni got into the office this morning and started thinking about what to do.

Zirma
Joined: 21 Apr 20
Posts: 13
Credit: 4,411,884
RAC: 0
Level
Ala
Message 54875 - Posted: 22 May 2020 | 11:51:06 UTC - in response to Message 54874.

Cancelled? I still get them:

You got them - past tense. You had already returned them before Toni got into the office this morning and started thinking about what to do.


Yes, for the first 20 WUs, but I read that he had stopped the bad ones, and I got at least 3 bad ones after that. I didn't know there was some latency when he stops them. But now it looks good and is working fine, so it's no problem. (I posted the work list only so he could see whether there was a system error on some work that he had not noticed. I know you are all working hard on it.) No hard feelings. Thank you for all your support.

Pop Piasa
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Message 54887 - Posted: 22 May 2020 | 21:50:34 UTC

So far I'm batting .500 on this batch.
54 successes and 54 'exceptional condition' errors.

🤔I'm curious if we are purposely "pushing the envelope" here, Toni. It looks like we're exploring the outer boundaries of the acemd3 program viability from under my rock.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 8,822,866,430
RAC: 19,442,844
Level
Tyr
Message 54888 - Posted: 22 May 2020 | 22:01:58 UTC - in response to Message 54862.

Confirmed. About 10% of the tasks were created with a missing file, which makes them crash on startup. I'm figuring out the best course of action.

Just pulling forward what Toni has already written in this thread. The only envelope we're pushing is that of one very tired researcher, who - like all of us - makes mistakes from time to time.

robertmiles
Joined: 16 Apr 09
Posts: 503
Credit: 755,370,933
RAC: 212,472
Level
Glu
Message 54889 - Posted: 22 May 2020 | 22:11:37 UTC
Last modified: 22 May 2020 | 22:15:42 UTC

I've only had one task since the latest changes.

https://www.gpugrid.net/result.php?resultid=25200010

Its output showed some dump sections, but it appears to have downloaded, run, and uploaded correctly otherwise. Marked as Valid.

Pop Piasa
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Message 54892 - Posted: 22 May 2020 | 23:17:27 UTC
Last modified: 22 May 2020 | 23:42:26 UTC

Grosso appears to be feeling the strain of so many WUs failing and hosts requesting replacement downloads. Things are pretty slow on my end, only one host at a time getting anything, and that download is intermittent.

It looks to me like the shutting down of SETI@home triggered an unexpected hardware bottleneck for many other projects.

I wonder if a policy of 2 'spares' per GPU might alleviate this some.

[AF] fansyl
Joined: 26 Sep 13
Posts: 20
Credit: 1,714,356,441
RAC: 0
Level
His
Message 54940 - Posted: 24 May 2020 | 17:14:38 UTC

Thanks for the new work!

All seems to be OK today :)

vonboedefeldt
Joined: 24 Mar 20
Posts: 3
Credit: 370,341,423
RAC: 352,919
Level
Asp
Message 55034 - Posted: 4 Jun 2020 | 5:46:28 UTC

Since June 2nd I have had nothing but computation errors (Berechnungsfehler); each WU ends after a few seconds.
In hope for a solution,

vonboedefeldt

Toni
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 9 Dec 08
Posts: 1006
Credit: 5,068,599
RAC: 0
Level
Ser
Message 55035 - Posted: 4 Jun 2020 | 10:37:31 UTC - in response to Message 55034.

Since June 2nd I have had nothing but computation errors (Berechnungsfehler); each WU ends after a few seconds.
In hope for a solution,

vonboedefeldt


It worked on some other PC... try rebooting?

Lazydude
Joined: 25 Sep 08
Posts: 12
Credit: 161,238,437
RAC: 0
Level
Ile
Message 55036 - Posted: 4 Jun 2020 | 10:39:34 UTC - in response to Message 55034.

Check your task list and then check the work list.
If there are more than 2 others in the work list, then don't worry:
it's something wrong with that barch.

vonboedefeldt
Joined: 24 Mar 20
Posts: 3
Credit: 370,341,423
RAC: 352,919
Level
Asp
Message 55037 - Posted: 4 Jun 2020 | 11:21:32 UTC - in response to Message 55035.

I tried, but the same result as before

vonboedefeldt
Joined: 24 Mar 20
Posts: 3
Credit: 370,341,423
RAC: 352,919
Level
Asp
Message 55038 - Posted: 4 Jun 2020 | 11:23:14 UTC - in response to Message 55036.

Did you mean batch, or what is "barch"?

Lazydude
Joined: 25 Sep 08
Posts: 12
Credit: 161,238,437
RAC: 0
Level
Ile
Message 55039 - Posted: 4 Jun 2020 | 14:50:26 UTC - in response to Message 55038.

Yes, batch of work.

Aurum
Joined: 12 Jul 17
Posts: 401
Credit: 16,755,010,632
RAC: 220,113
Level
Trp
Message 55047 - Posted: 11 Jun 2020 | 12:21:31 UTC
Last modified: 11 Jun 2020 | 12:23:05 UTC

Formula BOINC Sprint on project GPUGrid from 06/12/2020 04:00 (UTC) to 06/15/2020 03:59.
http://Formula-BOINC.org

[PUGLIA] kidkidkid3
Joined: 23 Feb 11
Posts: 98
Credit: 1,278,605,544
RAC: 2,058,455
Level
Met
Message 55049 - Posted: 11 Jun 2020 | 17:43:42 UTC - in response to Message 55047.
Last modified: 11 Jun 2020 | 17:45:38 UTC

Hi,
some WUs abend with this message: "finish file present too long".
I'm running with New version of ACEMD v2.10 (cuda101).
Thanks in advance
K.
____________
Dreams do not always come true. But not because they are too big or impossible. Why did we stop believing.
(Martin Luther King)

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 8,822,866,430
RAC: 19,442,844
Level
Tyr
Message 55050 - Posted: 11 Jun 2020 | 19:51:30 UTC - in response to Message 55049.

Hi,
some WUs abend with this message: "finish file present too long".
I'm running with New version of ACEMD v2.10 (cuda101).
Thanks in advance
K.

Your computers are hidden, so I can't tell what version of BOINC you are running, or what platform it's running on.

But this problem has been solved, or at least much alleviated, in the v7.16.xx range of BOINC clients.
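
Checking whether a client falls in that range is a simple version comparison. The sketch below compares dotted version strings numerically; the two version strings used as inputs are example values, and the function names are hypothetical.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version string such as "7.16.6" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_at_least(version: str, minimum: str) -> bool:
    """Return True if version is the same as or newer than minimum."""
    return parse_version(version) >= parse_version(minimum)


# Example values only: an older client and one from the 7.16 range.
print(is_at_least("7.14.2", "7.16"))
print(is_at_least("7.16.6", "7.16"))
```

Comparing tuples of integers avoids the trap of plain string comparison, where "7.9" would sort after "7.16".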

[PUGLIA] kidkidkid3
Joined: 23 Feb 11
Posts: 98
Credit: 1,278,605,544
RAC: 2,058,455
Level
Met
Message 55051 - Posted: 11 Jun 2020 | 21:05:39 UTC - in response to Message 55050.

Sorry for the hidden computers.
My BOINC version was 7.14.2.
Now I'll upgrade to 7.16 ... thanks for your help.
K.
____________
Dreams do not always come true. But not because they are too big or impossible. Why did we stop believing.
(Martin Luther King)

Pop Piasa
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Message 55073 - Posted: 23 Jun 2020 | 20:38:13 UTC
Last modified: 23 Jun 2020 | 20:39:23 UTC

I believe we are about halfway through this run of MDADex2s jobs. I see I am processing tasks marked as 24-50. Looks like around 20 days left. Hoping for another batch to immediately follow or even overlap this one.
😏 Hey, I can dream, can't I? 🧚‍♀️

Erich56
Joined: 1 Jan 15
Posts: 1132
Credit: 10,205,482,676
RAC: 29,855,510
Level
Trp
Message 55074 - Posted: 24 Jun 2020 | 10:51:21 UTC - in response to Message 55073.

Hoping for another batch to immediately follow or even overlap this one.

maybe one dealing with COVID ?

Jim1348
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Level
His
Message 55075 - Posted: 24 Jun 2020 | 16:42:44 UTC - in response to Message 55074.

maybe one dealing with COVID ?


By the time GPUGrid is out of work, the OPN project at WCG will have a GPU app.
https://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,42561_offset,0

Erich56
Joined: 1 Jan 15
Posts: 1132
Credit: 10,205,482,676
RAC: 29,855,510
Level
Trp
Message 55076 - Posted: 25 Jun 2020 | 4:46:12 UTC - in response to Message 55075.

maybe one dealing with COVID ?


By the time GPUGrid is out of work, the OPN project at WCG will have a GPU app.
https://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,42561_offset,0

well, they say though: No ETA yet

Pop Piasa
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Message 55077 - Posted: 25 Jun 2020 | 5:55:56 UTC
Last modified: 25 Jun 2020 | 6:05:47 UTC

From what I have deduced, our present tasks are relevant to the standardization of the virtual protein folding environment.
If I'm correct, you can watch David Baker's explanation of Rosetta, (https://www.cs.washington.edu/events/colloquia/archive?id=449)
to gain some insight regarding the present attempt here to map the modelling envelope more accurately.

❗Toni or moderator, please correct me if I am mistaken‼

Aurum
Joined: 12 Jul 17
Posts: 401
Credit: 16,755,010,632
RAC: 220,113
Level
Trp
Scientific publications
watwatwat
Message 55101 - Posted: 10 Jul 2020 | 0:05:38 UTC
Last modified: 10 Jul 2020 | 0:05:57 UTC

RELEASE OF ACEMD 3.3
https://www.acellera.com/md-simulation-blog-news/

Do we get to run it? Still wondering what we're doing???

Profile trigggl
Send message
Joined: 6 Mar 09
Posts: 25
Credit: 102,324,681
RAC: 0
Level
Cys
Scientific publications
watwatwatwatwatwatwatwat
Message 55122 - Posted: 25 Jul 2020 | 16:43:31 UTC - in response to Message 55076.

Maybe one dealing with COVID?


By the time GPUGrid is out of work, the OPN project at WCG will have a GPU app.
https://www.worldcommunitygrid.org/forums/wcg/viewthread_thread,42561_offset,0

well, they say though: No ETA yet

Is it much farther Papa Smurf?
Not far now.
Is it much farther Papa Smurf?
Not far now.
Is it much farther Papa Smurf?
Not far now.
...

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 55144 - Posted: 1 Aug 2020 | 23:14:53 UTC

What's this?
MDADpr4s work units?
0 of 10!
Toni, you rule! Can you give us any info on this series?

Gelirhil
Send message
Joined: 19 Dec 08
Posts: 3
Credit: 22,289,033
RAC: 0
Level
Pro
Scientific publications
wat
Message 55146 - Posted: 2 Aug 2020 | 7:28:01 UTC

e1s297_villin_100ns_5-ADRIA_VillinAdaptive100ns-0-1-RND1114

Is it some new project?

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 55148 - Posted: 2 Aug 2020 | 14:46:33 UTC

I just got this.

e1s7_1gen-PABLO_UCB_NMR_KIX_CMYB_8-3-5-RND1204


Looks like maybe we're crunching misc. odds and ends.

Profile ServicEnginIC
Avatar
Send message
Joined: 24 Sep 10
Posts: 581
Credit: 9,770,362,024
RAC: 21,500,013
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 55149 - Posted: 2 Aug 2020 | 16:00:14 UTC - in response to Message 55148.

Looks like maybe we're crunching misc. odds and ends.

If so, it could be what we call in Spain "La traca final" (the grand finale of a fireworks display).

Jim1348
Send message
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 55150 - Posted: 2 Aug 2020 | 17:00:12 UTC - in response to Message 55101.

RELEASE OF ACEMD 3.3
https://www.acellera.com/md-simulation-blog-news/

Do we get to run it? Still wondering what we're doing???


Now this is quite interesting:

Our joint collaborative project...
Today (27/5/2020), thanks to a one-year seed grant from the Chan Zuckerberg Initiative (CZI) Acellera is joining the OpenMM development team, together with lead OpenMM developer Peter Eastman, Tom Markland from Stanford University (whose lab focuses on QM/MM and machine learning for quantum chemistry), John Chodera from the Sloan Kettering Institute (whose lab focuses on free energy calculations). This grant aims to support the continued development of OpenMM to better serve its broad biomolecular modeling community, and its extension to integrate machine learning to enable genomic-scale biomolecular modeling, simulation, and prediction. Our collaborative project aims to secure long-term sustainable federal funding for OpenMM from the National Institutes of Health in a proposal submitted earlier this year.

“Hundreds of thousands of scientists each day use open source software to carry out their research,” said CZI Head of Science Cori Bargmann. “Scientists deserve better tools, and we’re helping to meet that need by supporting open source projects that will advance biomedical science and foster greater access to critical software.”

This new series of year-long grants of the CZI’s Essential Open Source Software for Science program, aims to support open source software projects essential to biomedical research, enabling software maintenance, growth, development, and community engagement. View the full list of grantees. Open source software is crucial to modern scientific research, advancing biology and medicine while providing reproducibility and transparency. Yet even the most widely used research software often lacks dedicated funding.

Prof. Gianni De Fabritiis, head of the Computational Science Laboratory (Universitat Pompeu Fabra) and CEO/CSO at Acellera thinks that “this is the way forward. By joining forces we can have one of the largest development teams in molecular simulations and have the strength to tackle the most challenging research projects ahead to make ACEMD and OpenMM incredibly useful for the research community.”


I always knew there was a connection between GPUGrid and Folding behind the scenes. I usually keep cards on both.

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 55152 - Posted: 3 Aug 2020 | 20:46:59 UTC - in response to Message 55149.

Looks like maybe we're crunching misc. odds and ends.

If so, it could be what we call in Spain "La traca final" (the grand finale of a fireworks display).


That's for sure. Some of these WUs are so huge that my 75-watt cards can't make the 12-hour bonus. My 750 Ti has been crunching for over 23 hours and is still only at 89% on a WU labeled ADRIA_VillinAdaptive100ns.

But wait, here come the first of the MDADex7s tasks that have 50 runs each. If this is a repeat of the ex2s series, we'll be crunching for a good while.

Ian&Steve C.
Avatar
Send message
Joined: 21 Feb 20
Posts: 1069
Credit: 40,231,533,983
RAC: 527
Level
Trp
Scientific publications
wat
Message 55153 - Posted: 3 Aug 2020 | 22:43:10 UTC

The VillinAdaptive WUs also pay a lot less credit reward as compared to the PABLO tasks, per time invested.

They run about the same length of time on my RTX 2070 cards (4-5hrs)
Villin - 82,500 reward
PABLO - 145,500 reward
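
For anyone who wants the per-hour numbers, here is a rough sketch of the arithmetic in Python. It only uses the two rewards and the 4-5 hour run time quoted above; the function and variable names are made up for illustration, and it assumes both task types really do take the same time on the same card.

```python
# Figures quoted in this post; run time is the reported 4-5 hour range.
REWARDS = {"Villin": 82_500, "PABLO": 145_500}
RUNTIME_HOURS = (4.0, 5.0)  # (fastest, slowest) on an RTX 2070


def credit_per_hour(reward, hours):
    """Credits earned per hour of run time."""
    return reward / hours


def rate_range(reward, runtime=RUNTIME_HOURS):
    """Return (lowest, highest) credits/hour over the run time range."""
    fastest, slowest = runtime
    return credit_per_hour(reward, slowest), credit_per_hour(reward, fastest)


for name, reward in REWARDS.items():
    low, high = rate_range(reward)
    print(f"{name}: {low:,.0f} to {high:,.0f} credits/hour")

# With equal run times, the hours cancel and only the rewards matter.
ratio = REWARDS["PABLO"] / REWARDS["Villin"]
print(f"PABLO pays about {ratio:.2f}x the Villin rate at equal run time")
```

So on these figures Villin works out to roughly 16,500 to 20,625 credits/hour and PABLO to roughly 29,100 to 36,375, if the run times really are the same.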
____________

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 55155 - Posted: 4 Aug 2020 | 12:45:23 UTC - in response to Message 55153.

The VillinAdaptive WUs also pay a lot less credit reward as compared to the PABLO tasks, per time invested


I guess that's the "villin" (villain) part. 😜

Remember our motto:
Together we crunch
To test out a hunch
And wish all those points
Could at least buy us lunch.

Pop Piasa
Avatar
Send message
Joined: 8 Aug 19
Posts: 252
Credit: 458,054,251
RAC: 0
Level
Gln
Scientific publications
watwat
Message 55167 - Posted: 10 Aug 2020 | 18:35:44 UTC - in response to Message 55152.
Last modified: 10 Aug 2020 | 18:36:32 UTC

Oops, when I wrote that I saw this task:

But wait, here come the first of the MDADex7s tasks

(1ac5A00_379_0-TONI_MDADex7sa-0-50-RND0255)

...I forgot we already ran TONI_MDADex7s in May.

Keith Myers
Send message
Joined: 13 Dec 17
Posts: 1340
Credit: 7,653,123,724
RAC: 13,404,739
Level
Tyr
Scientific publications
watwatwatwatwat
Message 55168 - Posted: 10 Aug 2020 | 22:28:29 UTC

We will have to keep our GPUs busy with work from other projects for a while, apart from the occasional lucky resend one might get.

Jim1348
Send message
Joined: 28 Jul 12
Posts: 819
Credit: 1,591,285,971
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 55169 - Posted: 11 Aug 2020 | 6:49:23 UTC - in response to Message 55168.
Last modified: 11 Aug 2020 | 6:51:26 UTC

Thanks. It snuck up on me. I am almost done.
Well, Folding is available.
