New systems in Long queue

Author	Message
noelia Send message Joined: 5 Jul 12 Posts: 35 Credit: 393,375 RAC: 0 Level Scientific publications	Message 27733 - Posted: 19 Dec 2012 \| 16:41:25 UTC
	Hi all! A good number of new WUs will be around in the long queue for the next few weeks. The systems are called hfXA_long and will provide around 90000 credits each. Thanks and Merry Christmas to all!! Noelia

TheFiend Send message Joined: 26 Aug 11 Posts: 99 Credit: 2,500,112,138 RAC: 0 Level Scientific publications	Message 27737 - Posted: 19 Dec 2012 \| 19:14:34 UTC
	And a Merry Xmas to you and all the team at GPUGRID

Bedrich Hajek Send message Joined: 28 Mar 09 Posts: 485 Credit: 11,108,783,435 RAC: 15,545,660 Level Scientific publications	Message 27742 - Posted: 19 Dec 2012 \| 22:43:06 UTC - in response to Message 27733. Last modified: 19 Dec 2012 \| 22:50:45 UTC
	My computers just downloaded a couple of them. They seem to be very long (it looks like about 14 to 16 hours before they finish on my computers). GPU usage is 90%+ on both Windows 7 and XP. I hope the upload files aren't too big, so I don't get an error for that. What exactly are we crunching with these WUs, in layman's terms please? Merry Christmas!

dskagcommunity Send message Joined: 28 Apr 11 Posts: 456 Credit: 817,865,789 RAC: 0 Level Scientific publications	Message 27743 - Posted: 19 Dec 2012 \| 23:13:03 UTC - in response to Message 27742. Last modified: 19 Dec 2012 \| 23:42:43 UTC
	Bedrich Hajek wrote: I hope the upload files aren't too big, so I don't get an error for that. Merry Christmas!
	I hope so too. Shortly before Christmas I have a speed issue with the 3G connection of my main BOINC cluster: 2-6 kb/sec of shared upload overall isn't much :/ But yes, 99% GPU load is great :) Merry Christmas to you too! ____________ DSKAG Austria Research Team: http://www.research.dskag.at

Bedrich Hajek Send message Joined: 28 Mar 09 Posts: 485 Credit: 11,108,783,435 RAC: 15,545,660 Level Scientific publications	Message 27744 - Posted: 20 Dec 2012 \| 2:03:29 UTC
	I hope I am wrong on this, but it looks like the output files for these work units are going to be larger than 128 MB. After one of the units was 25% done, the 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_4 file was about 35 MB in size; at 33% done it was about 47 MB. If this projection holds, the file will be about 140 MB when the unit is 100% done, too large to upload unless you raise the size limit for upload files. I hate to see an otherwise successfully completed unit error out like this. This error has occurred before:
	5/8/2012 3:26:52 AM | GPUGRID | Computation for task 1H46_11_8-PAOLA_RNP-0-5-RND3163_0 finished
	5/8/2012 3:26:52 AM | GPUGRID | Output file 1H46_11_8-PAOLA_RNP-0-5-RND3163_0_4 for task 1H46_11_8-PAOLA_RNP-0-5-RND3163_0 exceeds size limit.
	5/8/2012 3:26:52 AM | GPUGRID | File size: 131283476.000000 bytes. Limit: 128000000.000000 byte
	http://www.gpugrid.net/forum_thread.php?id=2970#24795
	But you still have an opportunity to correct this.
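
The projection in the post above is a straight-line extrapolation, and it can be checked with a few lines of Python. This is only a sketch: it assumes the output file grows linearly with task progress and that the quoted sizes are in the same units as the 128 MB limit, neither of which the thread confirms. The name `projected_size_mb` is made up for illustration.

```python
def projected_size_mb(size_mb: float, fraction_done: float) -> float:
    """Linearly extrapolate the final file size from one partial measurement."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return size_mb / fraction_done


LIMIT_MB = 128.0  # the limit reported in the event log, taken as decimal megabytes

# The two measurements quoted in the post: (size in MB, fraction of the task done).
for size_mb, done in [(35.0, 0.25), (47.0, 0.33)]:
    estimate = projected_size_mb(size_mb, done)
    print(f"{size_mb} MB at {done:.0%} done -> about {estimate:.0f} MB, "
          f"over limit: {estimate > LIMIT_MB}")
```

Both measurements land a little above 140 MB, which is consistent with the poster's estimate.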

flashawk Send message Joined: 18 Jun 12 Posts: 297 Credit: 3,572,627,986 RAC: 0 Level Scientific publications	Message 27745 - Posted: 20 Dec 2012 \| 3:32:51 UTC - in response to Message 27744.
	I'm uploading one right now, it's 109.95MB and it took 8 hours 15 minutes to complete.

werdwerdus Send message Joined: 15 Apr 10 Posts: 123 Credit: 1,004,473,861 RAC: 0 Level Scientific publications	Message 27747 - Posted: 20 Dec 2012 \| 4:46:51 UTC Last modified: 20 Dec 2012 \| 4:48:03 UTC
	yep they get compressed before uploaded right? with gzip or something. I have 2 uploading right now. took 9:22 and 9:09 on two GTX 660 Ti gpus. 109.95 MB each. 97-98% gpu utilization on winxp. ____________ XtremeSystems.org - #1 Team in GPUGrid

TheFiend Send message Joined: 26 Aug 11 Posts: 99 Credit: 2,500,112,138 RAC: 0 Level Scientific publications	Message 27748 - Posted: 20 Dec 2012 \| 8:12:47 UTC
	I've had one error out on one of my GTX670's :( 11 hours.. :(
	Name 1x22_18-NOELIA_hfXA_long-0-2-RND2038_0
	Workunit 3964872
	Created 19 Dec 2012 | 15:06:36 UTC
	Sent 19 Dec 2012 | 18:43:06 UTC
	Received 20 Dec 2012 | 7:46:04 UTC
	Server state Over
	Outcome Computation error
	Client state Compute error
	Exit status 0 (0x0)
	Computer ID 109019
	Report deadline 24 Dec 2012 | 18:43:06 UTC
	Run time 39,198.89
	CPU time 38,715.33
	Validate state Invalid
	Credit 0.00
	Application version Long runs (8-12 hours on fastest card) v6.16 (cuda42)
	Stderr output
	<core_client_version>7.0.28</core_client_version>
	<![CDATA[
	<stderr_txt>
	MDIO: cannot open file "restart.coor"
	# Time per step (avg over 6250000 steps): 6.274 ms
	# Approximate elapsed time for entire WU: 39213.953 s
	called boinc_finish
	</stderr_txt>
	<message>
	upload failure: <file_xfer_error>
	<file_name>1x22_18-NOELIA_hfXA_long-0-2-RND2038_0_4</file_name>
	<error_code>-131</error_code>
	</file_xfer_error>
	</message>
	]]>
	Had another complete in 9 hours.

Gattorantolo [Ticino] Send message Joined: 29 Dec 11 Posts: 44 Credit: 251,211,525 RAC: 0 Level Scientific publications	Message 27749 - Posted: 20 Dec 2012 \| 9:07:23 UTC - in response to Message 27748. Last modified: 20 Dec 2012 \| 9:08:27 UTC
	Me too, after 10 hours... the task was already finished :-((( GTX680. What's the problem? ____________ Member of Boinc Italy.

Gattorantolo [Ticino] Send message Joined: 29 Dec 11 Posts: 44 Credit: 251,211,525 RAC: 0 Level Scientific publications	Message 27751 - Posted: 20 Dec 2012 \| 10:48:18 UTC - in response to Message 27749. Last modified: 20 Dec 2012 \| 10:49:10 UTC
	Gattorantolo [Ticino] wrote: Me too, after 10 hours...the task was already finished :-((( GTX680 What`s the problem?
	The second error, after 11 hours... I have stopped the "long run"... I'm crunching ACEMD standard now!!!!! ____________ Member of Boinc Italy.

GDF Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0 Level Scientific publications	Message 27752 - Posted: 20 Dec 2012 \| 11:08:57 UTC - in response to Message 27751.
	I have increased the upload size limit to 200MB now, but this will take effect only on new results. I guess the files are right at the borderline of 128MB (the previous limit), so it depends on the compression. It's our mistake, as the internal test did not complain, but now it's a bit of a problem to cancel them. gdf

Bedrich Hajek Send message Joined: 28 Mar 09 Posts: 485 Credit: 11,108,783,435 RAC: 15,545,660 Level Scientific publications	Message 27753 - Posted: 20 Dec 2012 \| 11:22:12 UTC - in response to Message 27744.
	Here is the reason why they failed, from my event log:
	12/20/2012 6:17:55 AM | GPUGRID | Computation for task 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1 finished
	12/20/2012 6:17:55 AM | GPUGRID | Output file 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_4 for task 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1 exceeds size limit.
	12/20/2012 6:17:55 AM | GPUGRID | File size: 144107276.000000 bytes. Limit: 128000000.000000 bytes
	12/20/2012 6:18:08 AM | GPUGRID | Started upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_0
	12/20/2012 6:18:08 AM | GPUGRID | Started upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_1
	12/20/2012 6:18:16 AM | GPUGRID | Finished upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_0
	12/20/2012 6:18:16 AM | GPUGRID | Started upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_2
	12/20/2012 6:18:45 AM | GPUGRID | Finished upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_1
	12/20/2012 6:18:45 AM | GPUGRID | Started upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_3
	12/20/2012 6:19:03 AM | GPUGRID | Finished upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_2
	12/20/2012 6:19:03 AM | GPUGRID | Started upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_7
	12/20/2012 6:19:04 AM | GPUGRID | Finished upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_3
	12/20/2012 6:19:05 AM | GPUGRID | Finished upload of 1x1_10-NOELIA_hfXA_long-0-2-RND6540_1_7
	Guys, you're supposed to learn from your mistakes, not repeat them.
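
Anyone wanting to see by how much a file missed the limit can pull the two numbers out of an event-log line like the ones above. The following is a rough sketch, not a BOINC tool: the regular expression is written against the message wording shown in this thread only, and the function name `overshoot_bytes` is invented for the example.

```python
import re
from typing import Optional

# Matches the "File size: ... bytes. Limit: ... bytes" message as quoted in this thread.
SIZE_RE = re.compile(r"File size: ([0-9.]+) bytes\. Limit: ([0-9.]+) byte")


def overshoot_bytes(line: str) -> Optional[float]:
    """Return file size minus limit for a size-limit log line, or None if the line does not match."""
    match = SIZE_RE.search(line)
    if match is None:
        return None
    size, limit = float(match.group(1)), float(match.group(2))
    return size - limit


line = ("12/20/2012 6:17:55 AM | GPUGRID | File size: 144107276.000000 bytes. "
        "Limit: 128000000.000000 bytes")
excess = overshoot_bytes(line)
print(f"over the limit by {excess:.0f} bytes")
```

For the line quoted above the excess is 16107276 bytes, so the file is roughly 16 MB over the old 128000000-byte limit.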

GDF Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0 Level Scientific publications	Message 27754 - Posted: 20 Dec 2012 \| 11:29:36 UTC - in response to Message 27753. Last modified: 20 Dec 2012 \| 11:30:44 UTC
	Ok. We have checked in the DB. There are 87 failures like this and 1800 successes for this batch. As I said, the problem is that the submission script did not pick it up. All new results will have a limit of 256MB. gdf
	Starting from the new application in January we expect to upload much smaller files, about 1/3 of the current size.

Bedrich Hajek Send message Joined: 28 Mar 09 Posts: 485 Credit: 11,108,783,435 RAC: 15,545,660 Level Scientific publications	Message 27755 - Posted: 20 Dec 2012 \| 12:18:22 UTC - in response to Message 27754.
	Well, it happened again.
	12/20/2012 6:58:25 AM | GPUGRID | Output file 1x17_3-NOELIA_hfXA_long-0-2-RND7641_0_4 for task 1x17_3-NOELIA_hfXA_long-0-2-RND7641_0 exceeds size limit.
	12/20/2012 6:58:25 AM | GPUGRID | File size: 144107276.000000 bytes. Limit: 128000000.000000 bytes
	12/20/2012 6:58:36 AM | GPUGRID | Starting task 10x12_4-NOELIA_hfXA_long-0-2-RND0606_1 using acemdlong version 616 (cuda42) in slot 2
	http://www.gpugrid.net/result.php?resultid=6222298
	I have three more of these units crunching right now. I hope this doesn't happen again.

Gattorantolo [Ticino] Send message Joined: 29 Dec 11 Posts: 44 Credit: 251,211,525 RAC: 0 Level Scientific publications	Message 27756 - Posted: 20 Dec 2012 \| 13:00:18 UTC - in response to Message 27755.
	What do we have to do? 23 hours of GPU work for nothing... :-( ____________ Member of Boinc Italy.

GDF Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0 Level Scientific publications	Message 27757 - Posted: 20 Dec 2012 \| 13:53:06 UTC - in response to Message 27756.
	Can you manually increase the limit, or at least see how much it is? Is your version of the BOINC client compressing the files? I don't understand why it works for some and not for a few others. gdf

TheFiend Send message Joined: 26 Aug 11 Posts: 99 Credit: 2,500,112,138 RAC: 0 Level Scientific publications	Message 27761 - Posted: 20 Dec 2012 \| 15:31:28 UTC
	I've just had another one fail at upload!!! :(

[AF>Belgique] bill1170 Send message Joined: 4 Jan 09 Posts: 13 Credit: 1,292,573,895 RAC: 3,498,181 Level Scientific publications	Message 27762 - Posted: 20 Dec 2012 \| 15:34:55 UTC - in response to Message 27757.
	Same problem:
	20/12/2012 12:40:09 | GPUGRID | Output file 1x18_5-NOELIA_hfXA_long-0-2-RND1979_0_4 for task 1x18_5-NOELIA_hfXA_long-0-2-RND1979_0 exceeds size limit.
	20/12/2012 12:40:09 | GPUGRID | File size: 144107276.000000 bytes. Limit: 128000000.000000 bytes
	with this one: http://www.gpugrid.net/workunit.php?wuid=3964747 on GTX660Ti XP32 Boinc 7.0.28

Retvari Zoltan Send message Joined: 20 Jan 09 Posts: 2343 Credit: 16,214,765,968 RAC: 1,002,217 Level Scientific publications	Message 27763 - Posted: 20 Dec 2012 \| 15:36:58 UTC
	This is very very bad. I have only one successful NOELIA_hfXA_long task (it was the first one) and 9 failures; 8 of the failures are because the upload limit was exceeded. Maybe the BOINC manager (on Windows) has this 128MB upload limit itself, so it can't be fixed on the server side: I've got this error again since the increase.

dskagcommunity Send message Joined: 28 Apr 11 Posts: 456 Credit: 817,865,789 RAC: 0 Level Scientific publications	Message 27764 - Posted: 20 Dec 2012 \| 16:29:22 UTC
	Oh man.. it's hard knowing that all my fresh WUs will error out too in 14-18 hours, and I can't abort them all... "50M until end of the year", bye-bye -_- ____________ DSKAG Austria Research Team: http://www.research.dskag.at

Rantanplan Send message Joined: 22 Jul 11 Posts: 166 Credit: 138,629,987 RAC: 0 Level Scientific publications	Message 27766 - Posted: 20 Dec 2012 \| 16:44:38 UTC - in response to Message 27763. Last modified: 20 Dec 2012 \| 17:16:22 UTC
	OK, I raised the transfer limit to 1500,00 MB per day!? Will that affect anything? OK, I found this, but will that help? I don't even understand it. http://boinc.berkeley.edu/trac/wiki/JobSubmission

klepel Send message Joined: 23 Dec 09 Posts: 189 Credit: 4,720,236,325 RAC: 1,975,992 Level Scientific publications	Message 27767 - Posted: 20 Dec 2012 \| 17:08:20 UTC
	Two here as well!
	20/12/2012 12:30:29 a.m. | GPUGRID | Output file 1x13_10-NOELIA_hfXA_long-0-2-RND5009_0_4 for task 1x13_10-NOELIA_hfXA_long-0-2-RND5009_0 exceeds size limit.
	20/12/2012 12:30:29 a.m. | GPUGRID | File size: 144107276.000000 bytes. Limit: 128000000.000000 bytes
	20/12/2012 11:49:34 a.m. | GPUGRID | Output file 1x31_12-NOELIA_hfXA_long-0-2-RND4384_1_4 for task 1x31_12-NOELIA_hfXA_long-0-2-RND4384_1 exceeds size limit.
	20/12/2012 11:49:34 a.m. | GPUGRID | File size: 144107276.000000 bytes. Limit: 128000000.000000 bytes
	And for sure one failed on my other computer as well. Is there a way to identify the WUs GDF has corrected? The failed ones took me over 11 hours to crunch.

valterc Send message Joined: 21 Jun 10 Posts: 21 Credit: 8,472,939,672 RAC: 26,323,587 Level Scientific publications	Message 27768 - Posted: 20 Dec 2012 \| 18:09:57 UTC Last modified: 20 Dec 2012 \| 18:44:03 UTC
	ERR_FILE_TOO_BIG -131
	One of the output files is bigger than the maximum set by the project for upload. BOINC will not try to upload this file. Solution: Go to the project's forums and report this behavior. Nothing can be done client-side.
	I don't know if any modifications done server-side (I guess that a restart is needed) will actually be reflected in already distributed workunits...
	[EDIT] I have had a WU crunching for ~7 hours and I hope I will not lose it.... I found inside "client_state.xml" some lines about "name_of_the_wu_4", which is the largest of the output files made while crunching:
	<max_nbytes>128000000.000000</max_nbytes>
	I will change the value to a much larger one, restart BOINC and see what happens...

TheFiend Send message Joined: 26 Aug 11 Posts: 99 Credit: 2,500,112,138 RAC: 0 Level Scientific publications	Message 27769 - Posted: 20 Dec 2012 \| 18:23:26 UTC - in response to Message 27761.
	TheFiend wrote: I've just had another one fail at upload!!! :(
	And a 3rd one :(

Rantanplan Send message Joined: 22 Jul 11 Posts: 166 Credit: 138,629,987 RAC: 0 Level Scientific publications	Message 27770 - Posted: 20 Dec 2012 \| 18:28:36 UTC - in response to Message 27769. Last modified: 20 Dec 2012 \| 18:29:35 UTC
	I went wrong

GDF Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0 Level Scientific publications	Message 27771 - Posted: 20 Dec 2012 \| 18:32:39 UTC - in response to Message 27770. Last modified: 20 Dec 2012 \| 18:35:31 UTC
	So the situation is this. All the new WUs SENT after we made the change this morning are ok. The ones that were already queued in your client before this morning (Barcelona time) have a 50% probability of failing the upload, so cancel them. gdf
	Check this value in client_state.xml: <max_nbytes>128000000.000000</max_nbytes>

valterc Send message Joined: 21 Jun 10 Posts: 21 Credit: 8,472,939,672 RAC: 26,323,587 Level Scientific publications	Message 27772 - Posted: 20 Dec 2012 \| 18:34:55 UTC - in response to Message 27771.
	GDF wrote: So the situation is this one. All the new WUs SENT after we have made the change this morning are ok. The ones that were already queued in your client before this morning in Barcelona have a 50% probability of failing the upload, so cancel them. gdf
	or try what I suggested a few posts ago....

GDF Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0 Level Scientific publications	Message 27773 - Posted: 20 Dec 2012 \| 18:35:56 UTC - in response to Message 27772.
	Yes do that. gdf

Retvari Zoltan Send message Joined: 20 Jan 09 Posts: 2343 Credit: 16,214,765,968 RAC: 1,002,217 Level Scientific publications	Message 27774 - Posted: 20 Dec 2012 \| 19:03:05 UTC - in response to Message 27768. Last modified: 20 Dec 2012 \| 19:12:16 UTC
	valterc wrote: I have a wu crunching since ~7 hours and I hope I will not lose it.... I found inside "client_state.xml" some lines about the "name_of_the_wu_4" which is the larger of the output files made while crunching: <max_nbytes>128000000.000000</max_nbytes> will change the value to a much larger one, restart boinc and see what happens...
	I did it, and it's working!
	1. Exit BOINC manager, stopping the scientific applications as well
	2. Locate the client_state.xml file and open it with a text editor
	-- On Windows XP: notepad.exe "C:\Documents and Settings\All Users\Application Data\BOINC\client_state.xml"
	-- On Windows 7: notepad.exe "C:\ProgramData\BOINC\client_state.xml"
	3. Search for the <max_nbytes>128000000.000000</max_nbytes> value and replace it with <max_nbytes>198000000.000000</max_nbytes>
	4. Save and exit
	5. Restart BOINC manager.
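
The search-and-replace in step 3 above can also be scripted. The sketch below is an illustration under assumptions, not an official BOINC procedure: `raise_limit` and `patch_file` are made-up helper names, the file is assumed to be readable as UTF-8, and, as in the steps above, BOINC must be fully stopped before the file is touched. It keeps a `.bak` copy next to the original in case something goes wrong.

```python
import re
import shutil
from pathlib import Path
from typing import Tuple

OLD_LIMIT = "128000000.000000"
NEW_LIMIT = "198000000.000000"  # the value used in the post above


def raise_limit(xml_text: str, old: str = OLD_LIMIT, new: str = NEW_LIMIT) -> Tuple[str, int]:
    """Return the edited text and the number of <max_nbytes> entries that were changed."""
    pattern = re.compile(r"<max_nbytes>" + re.escape(old) + r"</max_nbytes>")
    return pattern.subn("<max_nbytes>" + new + "</max_nbytes>", xml_text)


def patch_file(path: Path) -> int:
    """Back up the state file, then rewrite it in place. Only run this while BOINC is stopped."""
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    text, count = raise_limit(path.read_text(encoding="utf-8"))
    path.write_text(text, encoding="utf-8")
    return count
```

On Windows 7 this would be called as `patch_file(Path(r"C:\ProgramData\BOINC\client_state.xml"))`, using the path from step 2. Entries that already carry a different limit, such as 256000000.000000, are left alone.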

Rantanplan Send message Joined: 22 Jul 11 Posts: 166 Credit: 138,629,987 RAC: 0 Level Scientific publications	Message 27775 - Posted: 20 Dec 2012 \| 19:05:12 UTC - in response to Message 27773. Last modified: 20 Dec 2012 \| 19:10:59 UTC
	Now it must be right.
	<file>
	<name>1x29_7-NOELIA_hfXA_long-0-2-RND2127_2_4</name>
	<nbytes>0.000000</nbytes>
	<max_nbytes>256000000.000000</max_nbytes>
	<status>0</status>
	<upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url>
	</file>
	Crunching this task; for sure it will work again.

skgiven Volunteer moderator Volunteer tester Send message Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level Scientific publications	Message 27779 - Posted: 20 Dec 2012 \| 22:57:28 UTC - in response to Message 27775.
	Why are you not using <max_nbytes>0.000000</max_nbytes> ? ____________ FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help

GDF Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0 Level Scientific publications	Message 27780 - Posted: 20 Dec 2012 \| 23:07:42 UTC - in response to Message 27779.
	I did not even know about it. gdf

GPUGRID Send message Joined: 12 Dec 11 Posts: 91 Credit: 2,730,095,033 RAC: 0 Level Scientific publications	Message 27781 - Posted: 20 Dec 2012 \| 23:36:44 UTC
	Ok... I'm new on this beautiful project, and I'm not comfortable editing things... But do I understand correctly that I can go back to long runs, because the ones that are being split now will work without any change? I had a lot of 10/11 hr processing wasted and don't want to lose it all again..

Bedrich Hajek Send message Joined: 28 Mar 09 Posts: 485 Credit: 11,108,783,435 RAC: 15,545,660 Level Scientific publications	Message 27783 - Posted: 21 Dec 2012 \| 2:21:41 UTC
	I finally got one to upload successfully, after 4 failures, and with 3 more crunching!!! http://www.gpugrid.net/result.php?resultid=6241005

dskagcommunity Send message Joined: 28 Apr 11 Posts: 456 Credit: 817,865,789 RAC: 0 Level Scientific publications	Message 27785 - Posted: 21 Dec 2012 \| 9:17:08 UTC
	Yes they are back running, you can crunch on. ____________ DSKAG Austria Research Team: http://www.research.dskag.at

GPUGRID Send message Joined: 12 Dec 11 Posts: 91 Credit: 2,730,095,033 RAC: 0 Level Scientific publications	Message 27786 - Posted: 21 Dec 2012 \| 10:42:22 UTC - in response to Message 27785.
	Thank you mate, they are reporting fine now!

ritterm Send message Joined: 31 Jul 09 Posts: 88 Credit: 244,413,897 RAC: 0 Level Scientific publications	Message 27787 - Posted: 21 Dec 2012 \| 12:13:09 UTC - in response to Message 27771. Last modified: 21 Dec 2012 \| 12:31:25 UTC
	On 20 Dec 12, GDF wrote... All the new WUs SENT after we have made the change this morning are ok. So, just to make sure, I should be okay having downloaded WU 6224269 on 21 Dec, even though it was created on 19 Dec? Thanks, MarkR ____________

Toni Volunteer moderator Project administrator Project developer Project tester Project scientist Send message Joined: 9 Dec 08 Posts: 1006 Credit: 5,068,599 RAC: 0 Level Scientific publications	Message 27788 - Posted: 21 Dec 2012 \| 13:24:16 UTC - in response to Message 27787.
	@ritterm - It should be ok. It's the "sent time" that counts for this issue. You can anyway check the <max_nbytes> parameter, as above, to be sure.

ritterm Send message Joined: 31 Jul 09 Posts: 88 Credit: 244,413,897 RAC: 0 Level Scientific publications	Message 27789 - Posted: 21 Dec 2012 \| 14:29:15 UTC - in response to Message 27788.
	Toni wrote: @ritterm - It should be ok. It's the "sent time" that counts for this issue. You can anyway check the <max_nbytes> parameter, as above, to be sure.
	Thanks, Toni. I checked the client_state file and there appear to be numerous files with different <max_nbytes> settings for the WU. Again, just so I'm sure I've got this right, the output file is the one ending in "_4"? If so, I think I'm okay:
	<file>
	<name>10x6_4-NOELIA_hfXA_long-0-2-RND8944_0_4</name>
	<nbytes>0.000000</nbytes>
	<max_nbytes>256000000.000000</max_nbytes>
	<status>0</status>
	<upload_url>http://www.gpugrid.org/PS3GRID_cgi/file_upload_handler</upload_url>
	</file>
	____________
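
The manual check described above (find the entry whose name ends in "_4" and read its <max_nbytes>) can be sketched in Python. This assumes the state file parses as ordinary XML and that upload entries look like the <file> fragment quoted in the post; the `<client_state>` wrapper in the sample and the function name `output_limits` are assumptions made for the example, not something the thread confirms.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def output_limits(xml_text: str, suffix: str = "_4") -> Dict[str, float]:
    """Map each <file> entry whose name ends with `suffix` to its max_nbytes value."""
    root = ET.fromstring(xml_text)
    limits = {}
    for file_el in root.iter("file"):
        name = file_el.findtext("name", default="")
        max_nbytes = file_el.findtext("max_nbytes")
        if name.endswith(suffix) and max_nbytes is not None:
            limits[name] = float(max_nbytes)
    return limits


# Sample built from the fragment in the post; the outer element is a guess.
sample = """<client_state>
<file>
<name>10x6_4-NOELIA_hfXA_long-0-2-RND8944_0_4</name>
<nbytes>0.000000</nbytes>
<max_nbytes>256000000.000000</max_nbytes>
<status>0</status>
</file>
</client_state>"""

for name, limit in output_limits(sample).items():
    print(name, "limit:", limit, "ok" if limit > 144107276 else "too small")
```

The comparison value 144107276 is the file size reported in the failing event logs earlier in the thread, so any entry with a limit below that would be expected to fail in the same way.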

pvh Send message Joined: 17 Mar 10 Posts: 23 Credit: 1,173,824,416 RAC: 0 Level Scientific publications	Message 27791 - Posted: 21 Dec 2012 \| 17:44:20 UTC
	I had one of these failures too today:
	<core_client_version>7.0.28</core_client_version>
	<![CDATA[
	<stderr_txt>
	MDIO: cannot open file "restart.coor"
	# Time per step (avg over 6250000 steps): 15.773 ms
	# Approximate elapsed time for entire WU: 98580.515 s
	03:24:41 (23694): called boinc_finish
	</stderr_txt>
	<message>
	upload failure: <file_xfer_error>
	<file_name>1x3_1-NOELIA_hfXA_long-0-2-RND0218_1_4</file_name>
	<error_code>-131</error_code>
	</file_xfer_error>
	</message>
	]]>
	100,000 seconds of work down the toilet... :( Please fix this!

Vinnidikt Send message Joined: 6 Mar 12 Posts: 1 Credit: 61,673,501 RAC: 0 Level Scientific publications	Message 27792 - Posted: 21 Dec 2012 \| 22:38:49 UTC
	http://www.gpugrid.net/results.php?userid=86706 37 hours wasted...

Beyond Send message Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 Level Scientific publications	Message 27793 - Posted: 22 Dec 2012 \| 2:50:04 UTC
	All 4 of my machines failed these after between 24 and 25 hours each, so about 99 hours wasted. Since no TONI WUs seem to be available, had to switch projects for a while :-(

ritterm Send message Joined: 31 Jul 09 Posts: 88 Credit: 244,413,897 RAC: 0 Level Scientific publications	Message 27794 - Posted: 22 Dec 2012 \| 3:51:39 UTC
	The task I referred to in my previous message finished okay. ____________

Gattorantolo [Ticino] Send message Joined: 29 Dec 11 Posts: 44 Credit: 251,211,525 RAC: 0 Level Scientific publications	Message 27795 - Posted: 22 Dec 2012 \| 6:07:22 UTC - in response to Message 27794. Last modified: 22 Dec 2012 \| 6:07:52 UTC
	Now is working :-) 3xNoelia WU, 135.000 credits each :-) ____________ Member of Boinc Italy.

Jari Pyyluoma Send message Joined: 2 Aug 08 Posts: 12 Credit: 1,165,835,704 RAC: 0 Level Scientific publications	Message 27801 - Posted: 22 Dec 2012 \| 11:51:18 UTC - in response to Message 27795.
	Why do you allow 7 errors? That seems wasteful.
	name 1x40_14-NOELIA_hfXA_long-0-2-RND7079
	application Long runs (8-12 hours on fastest card)
	created 19 Dec 2012 | 15:18:38 UTC
	minimum quorum 1
	initial replication 1
	max # of error/total/success tasks 7, 10, 6
	Task	Computer	Sent	Time reported or deadline	Status	Run time (sec)	CPU time (sec)	Credit	Application
	6222894	107753	19 Dec 2012 | 23:14:47 UTC	20 Dec 2012 | 23:43:07 UTC	Error while computing	47,292.92	47,269.47	---	Long runs (8-12 hours on fastest card) v6.16 (cuda42)
	6245526	141259	21 Dec 2012 | 1:58:49 UTC	21 Dec 2012 | 4:16:55 UTC	Error while computing	2,685.14	100.26	---	Long runs (8-12 hours on fastest card) v6.16 (cuda42)
	6246153	141700	21 Dec 2012 | 7:20:11 UTC	21 Dec 2012 | 9:59:23 UTC	Error while computing	4.46	2.36	---	Long runs (8-12 hours on fastest card) v6.16 (cuda42)
	6246927	140345	21 Dec 2012 | 17:45:40 UTC	26 Dec 2012 | 17:45:40 UTC	In progress	---	---	---	Long runs (8-12 hours on fastest card) v6.16 (cuda42)

Rantanplan Send message Joined: 22 Jul 11 Posts: 166 Credit: 138,629,987 RAC: 0 Level Scientific publications	Message 27803 - Posted: 22 Dec 2012 \| 13:00:08 UTC - in response to Message 27801.
	http://www.gpugrid.net/result.php?resultid=6224928 Absolutely clueless: what happened? 38.000 seconds down the drain.

cciechad Send message Joined: 28 Dec 10 Posts: 13 Credit: 37,543,525 RAC: 0 Level Scientific publications	Message 27809 - Posted: 22 Dec 2012 \| 21:11:58 UTC - in response to Message 27803.
	That's a different error.
	ERROR: file deven.cpp line 1106: # Energies have become nan
	It means the simulation went into a state that was not physically possible and was aborted.

tomba Send message Joined: 21 Feb 09 Posts: 497 Credit: 700,690,702 RAC: 0 Level Scientific publications	Message 27822 - Posted: 24 Dec 2012 \| 12:53:55 UTC - in response to Message 27795.
	Gattorantolo [Ticino] wrote: Now is working :-) 3xNoelia WU, 135.000 credits each :-)
	Just had "Error while computing" 16 hours in... ____________

Chilean Send message Joined: 8 Oct 12 Posts: 98 Credit: 385,652,461 RAC: 0 Level Scientific publications	Message 27823 - Posted: 24 Dec 2012 \| 16:53:19 UTC
	I haven't had a problem with these new WUs, but good lord these WUs are HUGE. It takes my 660M about 30 hours to finish one (and it is heavily overclocked).

skgiven Volunteer moderator Volunteer tester Send message Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level Scientific publications	Message 27824 - Posted: 24 Dec 2012 \| 19:25:03 UTC - in response to Message 27823.
	Careful with the OC's; these tasks might consume slightly more power. ____________ FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help

Bedrich Hajek Send message Joined: 28 Mar 09 Posts: 485 Credit: 11,108,783,435 RAC: 15,545,660 Level Scientific publications	Message 27825 - Posted: 24 Dec 2012 \| 20:56:24 UTC
	Now that we have these units uploading successfully, I noticed that the server status has 22,000 + work units listed as unsent. I have never seen that number that high, and since in the last few days, I have been getting nothing but these Noelia units, putting 2 and 2 together, I ask that question again, what's the big thing that we are crunching here?

TheFiend Send message Joined: 26 Aug 11 Posts: 99 Credit: 2,500,112,138 RAC: 0 Level Scientific publications	Message 27827 - Posted: 24 Dec 2012 \| 23:05:40 UTC - in response to Message 27825.
	Bedrich Hajek wrote: Now that we have these units uploading successfully, I noticed that the server status has 22,000 + work units listed as unsent. I have never seen that number that high, and since in the last few days, I have been getting nothing but these Noelia units, putting 2 and 2 together, I ask that question again, what's the big thing that we are crunching here?
	Maybe they're just keeping stocked up for Xmas!

dskagcommunity Send message Joined: 28 Apr 11 Posts: 456 Credit: 817,865,789 RAC: 0 Level Scientific publications	Message 27829 - Posted: 25 Dec 2012 \| 10:20:25 UTC
	I think so too. Scientists want to have Xmas too ;) ____________ DSKAG Austria Research Team: http://www.research.dskag.at

dskagcommunity Send message Joined: 28 Apr 11 Posts: 456 Credit: 817,865,789 RAC: 0 Level Scientific publications	Message 27830 - Posted: 25 Dec 2012 \| 10:49:37 UTC Last modified: 25 Dec 2012 \| 10:50:59 UTC
	I hope even more that they soon switch to cuda42 only, with the smaller upload files. One of my cards needs only 13 hours to compute, but the 3G connection has upload speed problems and I need over 14 hours to upload. So it blocks new WUs from downloading and I miss the 24h bonus too :( So the card which needs over 18 hours gets the full bonus thanks to high-speed internet, what a shame ^^ ____________ DSKAG Austria Research Team: http://www.research.dskag.at

cciechad Send message Joined: 28 Dec 10 Posts: 13 Credit: 37,543,525 RAC: 0 Level Scientific publications	Message 27831 - Posted: 25 Dec 2012 \| 16:30:45 UTC
	The Cuda31 versions of these tasks are very slow: 24h+ on a slightly factory-overclocked 560 Ti. All of the 4.2 tasks are working much better; at most they take 16 hours (always at least 25% faster, sometimes up to 40% faster). Unfortunately the server is passing out about 25% Cuda31 tasks to 42-capable machines, and I always catch them too late to make aborting worthwhile.
	ID: 27831 \| Rating: 0 \| rate: / Reply Quote

Retvari Zoltan Send message Joined: 20 Jan 09 Posts: 2343 Credit: 16,214,765,968 RAC: 1,002,217 Level Scientific publications	Message 27839 - Posted: 26 Dec 2012 \| 0:45:01 UTC - in response to Message 27824.
	Careful with the OC's; these tasks might consume slightly more power. It's too bad that the credit/time ratio of these workunits doesn't reflect this.
	ID: 27839 \| Rating: 0 \| rate: / Reply Quote

Chilean Send message Joined: 8 Oct 12 Posts: 98 Credit: 385,652,461 RAC: 0 Level Scientific publications	Message 27840 - Posted: 26 Dec 2012 \| 5:26:19 UTC - in response to Message 27839.
	The 30+ hours per WU is ruining my RAC :P I'm crunching longer and getting less credit haha, oh well. At least these WUs don't lag my graphics.
	ID: 27840 \| Rating: 0 \| rate: / Reply Quote

Lazarus-uk Send message Joined: 16 Nov 08 Posts: 29 Credit: 122,821,515 RAC: 0 Level Scientific publications	Message 27841 - Posted: 26 Dec 2012 \| 8:59:29 UTC
	Maybe they would consider extending the 24hr bonus time to 36hrs or even 48hrs for these longer WUs. A lot of people seem to be struggling to finish them in 24hrs.
	ID: 27841 \| Rating: 0 \| rate: / Reply Quote

skgiven Volunteer moderator Volunteer tester Send message Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level Scientific publications	Message 27843 - Posted: 26 Dec 2012 \| 10:59:49 UTC - in response to Message 27841.
	Sounds like a reasonable request given that the tasks are so long (~17h on a GTX470) and the Work Units are only two steps deep. However, all 20,000WU's should be completed by early January, so it's a short experiment, and it would be a lot of work to redo the credit system for a single batch of tasks - something the researchers would not want to be doing for every new batch! Anyway, this 2task-deep research model is a test, and might not be adopted. See Gianni's recent New Year experiment post. ____________ FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help
	ID: 27843 \| Rating: 0 \| rate: / Reply Quote

Chilean Send message Joined: 8 Oct 12 Posts: 98 Credit: 385,652,461 RAC: 0 Level Scientific publications	Message 27846 - Posted: 26 Dec 2012 \| 23:38:11 UTC - in response to Message 27843.
	What do they mean by "2-step deep" ?
	ID: 27846 \| Rating: 0 \| rate: / Reply Quote

Retvari Zoltan Send message Joined: 20 Jan 09 Posts: 2343 Credit: 16,214,765,968 RAC: 1,002,217 Level Scientific publications	Message 27847 - Posted: 27 Dec 2012 \| 1:42:01 UTC - in response to Message 27846. Last modified: 27 Dec 2012 \| 1:51:38 UTC
	What do they mean by "2-step deep" ? In this project every workunit we process is a piece of a "longer" Molecular Dynamics simulation. When a workunit is finished, uploaded and validated, its result is sent to another host, which continues the MD simulation from where the previous one finished, until the whole given timeframe of the MD simulation is completed. For example: if the whole timeframe for the MD simulation is 1µs, it will be divided into 100 workunits of 10ns each (aka "steps"). However, in the case of NOELIA_hfXA_long, there are only 2 such pieces. On the server status page there are 22,000 unsent long workunits right now, and in this case every one of them will be processed twice, while in other cases there are only 2,200 unsent long workunits, but those are usually processed 100 times.
	ID: 27847 \| Rating: 0 \| rate: / Reply Quote
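
The arithmetic in the post above can be checked with a short Python sketch. It only uses the figures quoted in the post (1µs split into 10ns steps, 22,000 workunits at 2 steps, 2,200 workunits at 100 steps); the function name `total_tasks` is made up for illustration and is not part of any GPUGRID or BOINC tool.

```python
# Figures are taken from the post above; the helper name is hypothetical.

def total_tasks(unsent_workunits: int, steps_per_workunit: int) -> int:
    """Number of tasks to crunch if every workunit runs for the given number of steps."""
    return unsent_workunits * steps_per_workunit

# A 1 microsecond simulation (1000 ns) cut into 10 ns pieces.
steps_in_example = 1000 // 10

# NOELIA_hfXA_long: many workunits, each only 2 steps deep.
noelia_tasks = total_tasks(22_000, 2)

# The "other cases" from the post: fewer workunits, about 100 steps each.
typical_tasks = total_tasks(2_200, 100)

print(steps_in_example, noelia_tasks, typical_tasks)
```

So a large "unsent" count with a shallow chain can still represent fewer tasks overall than a small count with a deep chain, which seems to be the point the post is making.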

tomba Send message Joined: 21 Feb 09 Posts: 497 Credit: 700,690,702 RAC: 0 Level Scientific publications	Message 27849 - Posted: 27 Dec 2012 \| 16:15:22 UTC - in response to Message 27841.
	... A lot of people seem to be struggling to finish them in 24hrs. How about this for a down-to-the-wire full bonus WU: And this after a 2hr:40min upload! It does seem a bit daft that upload time should be added to processing time. Doesn't a done WU package include processing-end time?? ____________
	ID: 27849 \| Rating: 0 \| rate: / Reply Quote

skgiven Volunteer moderator Volunteer tester Send message Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level Scientific publications	Message 27850 - Posted: 27 Dec 2012 \| 17:24:32 UTC - in response to Message 27849.
	Credit is applied when a task is reported; it's not based on runtime. While you obviously have a point, report time is the time it takes to download, run, upload and report a task. Tasks aren't of any research use until they are uploaded. Reporting is different; it's a separate tiny upload, mainly for credits, and goes to a different database. Good to see you managed to get that task reported in time. I suggest people lower their cache (where possible) if they are close to missing out on a time bonus. ____________ FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help
	ID: 27850 \| Rating: 0 \| rate: / Reply Quote
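
A rough way to see why a faster card can still miss the bonus is to add up the stages listed in the post above. This is only a sketch: the 13h/14h and 18h figures come from dskagcommunity's earlier post, the 0.5h upload on the fast link and the zero download time are assumptions, and whether the server treats exactly 24h as inside the window is not stated in the thread.

```python
# Sketch only: the 24 h window is the bonus mentioned in the thread.
# Treating "exactly 24 h" as inside the window is an assumption.
BONUS_WINDOW_HOURS = 24.0

def turnaround_hours(download_h: float, run_h: float, upload_h: float) -> float:
    """Time from receiving a task to having its result uploaded (reporting assumed negligible)."""
    return download_h + run_h + upload_h

def makes_bonus(download_h: float, run_h: float, upload_h: float) -> bool:
    """True if the whole turnaround fits inside the bonus window."""
    return turnaround_hours(download_h, run_h, upload_h) <= BONUS_WINDOW_HOURS

# Card on the slow 3G link: 13 h compute, 14 h upload (figures from the thread).
slow_link = makes_bonus(0.0, 13.0, 14.0)

# Slower card on fast internet: 18 h compute, upload assumed to take 0.5 h.
fast_link = makes_bonus(0.0, 18.0, 0.5)

print(slow_link, fast_link)
```

Under these assumptions the 13-hour card misses the window and the 18-hour card makes it, which matches what was reported earlier in the thread.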

Beyond Send message Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 Level Scientific publications	Message 27851 - Posted: 27 Dec 2012 \| 17:44:29 UTC - in response to Message 27825. Last modified: 27 Dec 2012 \| 17:48:41 UTC
	Now that we have these units uploading successfully, I noticed that the server status has 22,000 + work units listed as unsent. I have never seen that number that high, and since in the last few days, I have been getting nothing but these Noelia units, putting 2 and 2 together, I ask that question again, what's the big thing that we are crunching here? I've been able to get nothing but the Noelia WUs either. They take too long to run to meet the 24hr deadline so have moved the NVidias to other projects. The Toni WUs are the only ones that work OK on my 4 cards as the Nathans take too much memory to run efficiently. Will check back from time to time as I would really like to be running GPUGrid...
	ID: 27851 \| Rating: 0 \| rate: / Reply Quote

Chilean Send message Joined: 8 Oct 12 Posts: 98 Credit: 385,652,461 RAC: 0 Level Scientific publications	Message 27852 - Posted: 27 Dec 2012 \| 18:13:28 UTC - in response to Message 27847.
	What do they mean by "2-step deep" ? In this project every workunit we process is a piece of a "longer" Molecular Dynamics simulation. When a workunit is finished, uploaded and validated, its result is sent to another host, which continues the MD simulation from where the previous one finished, until the whole given timeframe of the MD simulation is completed. For example: if the whole timeframe for the MD simulation is 1µs, it will be divided into 100 workunits of 10ns each (aka "steps"). However, in the case of NOELIA_hfXA_long, there are only 2 such pieces. On the server status page there are 22,000 unsent long workunits right now, and in this case every one of them will be processed twice, while in other cases there are only 2,200 unsent long workunits, but those are usually processed 100 times. Would this explain the size of these WUs? (2 steps deep means a longer timeframe of simulation per WU, right?)
	ID: 27852 \| Rating: 0 \| rate: / Reply Quote

werdwerdus Send message Joined: 15 Apr 10 Posts: 123 Credit: 1,004,473,861 RAC: 0 Level Scientific publications	Message 27853 - Posted: 27 Dec 2012 \| 18:22:40 UTC - in response to Message 27851. Last modified: 27 Dec 2012 \| 18:23:10 UTC
	Now that we have these units uploading successfully, I noticed that the server status has 22,000 + work units listed as unsent. I have never seen that number that high, and since in the last few days, I have been getting nothing but these Noelia units, putting 2 and 2 together, I ask that question again, what's the big thing that we are crunching here? I've been able to get nothing but the Noelia WUs either. They take too long to run to meet the 24hr deadline so have moved the NVidias to other projects. The Toni WUs are the only ones that work OK on my 4 cards as the Nathans take too much memory to run efficiently. Will check back from time to time as I would really like to be running GPUGrid... Why don't you switch to the short task queue? AFAIK they still get a 24 hour bonus and are at least 1/2 the length or less of the long tasks. ____________ XtremeSystems.org - #1 Team in GPUGrid
	ID: 27853 \| Rating: 0 \| rate: / Reply Quote

tomba Send message Joined: 21 Feb 09 Posts: 497 Credit: 700,690,702 RAC: 0 Level Scientific publications	Message 27854 - Posted: 27 Dec 2012 \| 18:33:05 UTC - in response to Message 27853.
	Why don't you switch to the short task queue? Pretty please, tell me how to do that. Thanks! ____________
	ID: 27854 \| Rating: 0 \| rate: / Reply Quote

ritterm Send message Joined: 31 Jul 09 Posts: 88 Credit: 244,413,897 RAC: 0 Level Scientific publications	Message 27855 - Posted: 27 Dec 2012 \| 20:06:52 UTC - in response to Message 27854.
	Why don't you switch to the short task queue? Pretty please, tell me how to do that. Thanks! 1) Go to your account area 2) Under "Preferences", select "GPUGRID Preferences" 3) Select "Edit GPUGRID preferences" for setting appropriate for your host (default, home, school, work) 4) Under "Run only the selected applications", check "ACEMD standard" and be sure the others are not checked. Hope that helps! MarkR ____________
	ID: 27855 \| Rating: 0 \| rate: / Reply Quote

tomba Send message Joined: 21 Feb 09 Posts: 497 Credit: 700,690,702 RAC: 0 Level Scientific publications	Message 27856 - Posted: 27 Dec 2012 \| 20:34:49 UTC - in response to Message 27855.
	1) Go to your account area Hi Mark! Thanks for answering. In BOINC I clicked "Your Account". Came here: Me no see "Preferences"!! ____________
	ID: 27856 \| Rating: 0 \| rate: / Reply Quote

Dylan Send message Joined: 16 Jul 12 Posts: 98 Credit: 386,043,752 RAC: 0 Level Scientific publications	Message 27857 - Posted: 27 Dec 2012 \| 20:40:12 UTC Last modified: 27 Dec 2012 \| 20:43:57 UTC
	Tomba, you are viewing your account information that everyone can see. To view your preferences however, click on your name on the top right of the GPUGRID web site. It should be next to server status, and is above the Volunteers and the Science links. This will take you to the correct account area where you can change your preferences. Look at the top of the picture you posted, and click on tomba.
	ID: 27857 \| Rating: 0 \| rate: / Reply Quote

tomba Send message Joined: 21 Feb 09 Posts: 497 Credit: 700,690,702 RAC: 0 Level Scientific publications	Message 27858 - Posted: 27 Dec 2012 \| 20:51:03 UTC - in response to Message 27857.
	Tomba, you are viewing your account information that everyone can see. To view your preferences however, click on your name on the top right of the GPUGRID web site. It should be next to server status, and is above the Volunteers and the Science links. This will take you to the correct account area where you can change your preferences. Look at the top of the picture you posted, and click on tomba. Got it! Thank you!! ____________
	ID: 27858 \| Rating: 0 \| rate: / Reply Quote

Beyond Send message Joined: 23 Nov 08 Posts: 1112 Credit: 6,162,416,256 RAC: 0 Level Scientific publications	Message 27859 - Posted: 28 Dec 2012 \| 1:38:34 UTC - in response to Message 27854.
	Why don't you switch to the short task queue? Pretty please, tell me how to do that. Thanks! It's in your project preferences. Be advised that the credit earned is much lower though.
	ID: 27859 \| Rating: 0 \| rate: / Reply Quote

tomba Send message Joined: 21 Feb 09 Posts: 497 Credit: 700,690,702 RAC: 0 Level Scientific publications	Message 27869 - Posted: 28 Dec 2012 \| 19:08:15 UTC
	Just had another Noelia "Error while computing", after eight hours of crunching. That's the sixth out of a total of 15. Can I please have more Nathan benHPs...? ____________
	ID: 27869 \| Rating: 0 \| rate: / Reply Quote

Dylan Send message Joined: 16 Jul 12 Posts: 98 Credit: 386,043,752 RAC: 0 Level Scientific publications	Message 27870 - Posted: 28 Dec 2012 \| 19:33:27 UTC
	Are you overclocking your card? Is your computer crashing? Things like that can cause these errors.
	ID: 27870 \| Rating: 0 \| rate: / Reply Quote

Retvari Zoltan Send message Joined: 20 Jan 09 Posts: 2343 Credit: 16,214,765,968 RAC: 1,002,217 Level Scientific publications	Message 27872 - Posted: 28 Dec 2012 \| 21:07:59 UTC - in response to Message 27869.
	Just had another Noelia "Error while computing", after eight hours of crunching. That's the sixth out of a total of 15. Can I please have more Nathan benHPs...? Obviously you can't have more of them. Your error is an "Energies have become nan", so see the appropriate thread.
	ID: 27872 \| Rating: 0 \| rate: / Reply Quote

cciechad Send message Joined: 28 Dec 10 Posts: 13 Credit: 37,543,525 RAC: 0 Level Scientific publications	Message 27873 - Posted: 29 Dec 2012 \| 0:37:19 UTC
	Is there any way to stop the old 31 WUs from hitting my machine other than checking and aborting manually? These are insanely slow on my old 560 Ti. The server reports 1000s of WUs ready to send, and when I abort a 31 WU it almost always sends a 42 task, so I'm thinking the server isn't out of 42 WUs. I saw this rarely in the past with the long queue, but it seems much worse with these tasks.
	ID: 27873 \| Rating: 0 \| rate: / Reply Quote

Retvari Zoltan Send message Joined: 20 Jan 09 Posts: 2343 Credit: 16,214,765,968 RAC: 1,002,217 Level Scientific publications	Message 27874 - Posted: 29 Dec 2012 \| 11:01:57 UTC - in response to Message 27873.
	You can try my workaround on Linux: 1. turn off file size checking in cc_config.xml 2. overwrite the cuda31 binary with the cuda42 binary
	ID: 27874 \| Rating: 0 \| rate: / Reply Quote

Stoneageman Send message Joined: 25 May 09 Posts: 224 Credit: 34,057,374,498 RAC: 186 Level Scientific publications	Message 27875 - Posted: 29 Dec 2012 \| 17:49:28 UTC
	We can remove already 3.1 apps on the long queue, but we will not upgrade the application until new year. gdf GDF says yes, server says no!
	ID: 27875 \| Rating: 0 \| rate: / Reply Quote

tomba Send message Joined: 21 Feb 09 Posts: 497 Credit: 700,690,702 RAC: 0 Level Scientific publications	Message 27876 - Posted: 29 Dec 2012 \| 18:44:49 UTC - in response to Message 27872.
	Your error is an "Energies have become nan", so see the appropriate thread. Thank you for responding. I went there. All a bit confusing BUT I think I got the message that I need to under-clock my GPU. In just over two months I've had 16 "errors" out of 94 "good" WUs (yes - I log all my WUs...) I have an ASUS GTX 460 1 gig. Stock settings are Core: 675, Shader: 1350, Mem: 1800. For the 18 months I've had the 460, I've been GPUGRID-ing at: Core: 850, Shader: 1700, Mem 2000. It's only recently the errors have reared their ugly heads... I have 'EVGA Precision', that lets me do changes, though I have nothing that lets me change the voltage... Can I please ask you to recommend which of the three parameters I should change to give me the best chance of avoiding errors. Many thanks, Tom ____________
	ID: 27876 \| Rating: 0 \| rate: / Reply Quote

Retvari Zoltan Send message Joined: 20 Jan 09 Posts: 2343 Credit: 16,214,765,968 RAC: 1,002,217 Level Scientific publications	Message 27877 - Posted: 29 Dec 2012 \| 20:55:15 UTC - in response to Message 27876.
	I have an ASUS GTX 460 1 gig. Stock settings are Core: 675, Shader: 1350, Mem: 1800. ASUS have their own GPU overclocking tool, but I don't recommend it. For the 18 months I've had the 460, I've been GPUGRID-ing at: Core: 850, Shader: 1700, Mem 2000. It's only recently the errors have reared their ugly heads... As the GPU is getting older, it can tolerate less overclocking. Also, the CUDA4.2 tasks tolerate less overclocking than the old CUDA3.1 tasks. I have 'EVGA Precision', that lets me do changes, though I have nothing that lets me change the voltage... I use and recommend MSI Afterburner for that purpose. Can I please ask you to recommend which of the three parameters I should change to give me the best chance of avoiding errors. 1. Cleaning the cooling fins of the GPU and the CPU with high pressure air duster (using a vacuum-cleaner at the same time is recommended) - (also the PSU, as much as you can do it through the rear grille, or the fan) 2. If there is any overclock applied to the PCIe bus, cancel it, and set the PCIe bus frequency to its default of 100MHz 3. GPU memory clock is irrelevant, raising it is just asking for trouble. (set it to factory default) 4. You can raise the GPU voltage by 0.025V increments as long as your GPU temp is below 80°C (and gets stable). 5. Lower your GPU core clock by 10MHz decrements, until it gets stable. (note that not every frequency can be set, so the actual GPU frequency can be ~3MHz lower than the value beside the slider)
	ID: 27877 \| Rating: 0 \| rate: / Reply Quote

skgiven Volunteer moderator Volunteer tester Send message Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level Scientific publications	Message 27878 - Posted: 29 Dec 2012 \| 21:01:10 UTC - in response to Message 27876.
	Firstly, make sure the GPU fan speed is keeping the GPU below 70°C. Sometimes adding a case fan or opening the side panel helps. Start downclocking by reducing the GDDR5 frequency by 10%. If that fails, try 20%. If need be, reduce the GPU clock by 5 or 10%. If these measures fail, get software to increase the voltage, but only by the least amount (typically ~0.025V). Note that Rosetta tasks were causing issues a while back, and might still be; I think some tasks used up too much RAM, which causes memory caching to disk and numerous related errors. ____________ FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help
	ID: 27878 \| Rating: 0 \| rate: / Reply Quote
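
To put numbers on the suggestions above, here is a small Python sketch using the clocks tomba posted (stock 675/1800 MHz core/memory, running at 850/2000 MHz). It assumes the percentage reductions are taken from the currently running clocks rather than from stock, which the posts do not spell out; the helper names are made up for illustration.

```python
# Clocks in MHz, as posted by tomba for the GTX 460.
STOCK_CORE, STOCK_MEM = 675, 1800
RUNNING_CORE, RUNNING_MEM = 850, 2000

def reduce_by_percent(clock_mhz: float, percent: float) -> float:
    """Clock after lowering it by the given percentage (assumed relative to the current value)."""
    return clock_mhz * (100 - percent) / 100

def overclock_percent(running_mhz: float, stock_mhz: float) -> float:
    """How far above stock the running clock is, in percent."""
    return (running_mhz - stock_mhz) / stock_mhz * 100

# Memory lowered by 10% and by 20%.
mem_minus_10 = reduce_by_percent(RUNNING_MEM, 10)
mem_minus_20 = reduce_by_percent(RUNNING_MEM, 20)

# Core lowered by 5% and by 10%.
core_minus_5 = reduce_by_percent(RUNNING_CORE, 5)
core_minus_10 = reduce_by_percent(RUNNING_CORE, 10)

# The 10 MHz decrement approach: the first few candidate core clocks to try.
core_steps = list(range(RUNNING_CORE, RUNNING_CORE - 40, -10))

print(mem_minus_10, mem_minus_20, core_minus_5, core_minus_10, core_steps)
print(round(overclock_percent(RUNNING_CORE, STOCK_CORE), 1))
```

Under that assumption, a 10% memory reduction from 2000 MHz lands on the 1800 MHz stock value, so for this card it coincides with the "set memory to factory default" advice given earlier in the thread.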

tomba Send message Joined: 21 Feb 09 Posts: 497 Credit: 700,690,702 RAC: 0 Level Scientific publications	Message 27880 - Posted: 30 Dec 2012 \| 16:45:34 UTC - in response to Message 27877.
	Again, thank you for responding! I use and recommend MSI Afterburner for that purpose. I installed that... 1. Cleaning the cooling fins of the GPU and the CPU with high pressure air duster (using a vacuum-cleaner at the same time is recommended) - (also the PSU, as much as you can do it through the rear grille, or the fan) I do that religiously every three months. Just did it again. Fans are running quieter. 2. If there is any overclock applied to the PCIe bus, cancel it, and set the PCIe bus frequency to its default of 100MHz I have no idea how to do that so I assume it's at default. 3. GPU memory clock is irrelevant, raising it is just asking for trouble. (set it to factory default) Did that using Afterburner. 4. You can raise the GPU voltage by 0.025V increments as long as your GPU temp is below 80°C (and gets stable). 5. Lower your GPU core clock by 10MHz decrements, until it gets stable. (note that not every frequency can be set, so the actual GPU frequency can be ~3MHz lower than the value beside the slider) 4. and 5. I shall save for later. I now have a dust-free PC and a default GPU memory clock. One step at a time... Thanks again! ____________
	ID: 27880 \| Rating: 0 \| rate: / Reply Quote

tomba Send message Joined: 21 Feb 09 Posts: 497 Credit: 700,690,702 RAC: 0 Level Scientific publications	Message 27881 - Posted: 30 Dec 2012 \| 17:27:56 UTC - in response to Message 27878.
	Firstly make sure the GPU Fan speed is keeping the GPU below 70°C. Sometimes adding a case fan or opening the side panel helps. My GPU temp is a solid 65°C Start downclocking by reducing the GDDR5 frequency by 10% I get REALLY confused by the variations in terminology. What is GDDR5 frequency??? Is that core, shader, memory??? ____________
	ID: 27881 \| Rating: 0 \| rate: / Reply Quote

Dylan Send message Joined: 16 Jul 12 Posts: 98 Credit: 386,043,752 RAC: 0 Level Scientific publications	Message 27882 - Posted: 30 Dec 2012 \| 18:15:29 UTC
	Tomba, GDDR5 is the memory used in the GPU. See this Wikipedia article: http://en.wikipedia.org/wiki/GDDR5
	ID: 27882 \| Rating: 0 \| rate: / Reply Quote

tomba Send message Joined: 21 Feb 09 Posts: 497 Credit: 700,690,702 RAC: 0 Level Scientific publications	Message 27941 - Posted: 5 Jan 2013 \| 18:00:23 UTC
	All well with Noelias this past week. Just getting in under the 24-hour bar. The latest one forecasts 38 hours to complete!!! ____________
	ID: 27941 \| Rating: 0 \| rate: / Reply Quote

tomba Send message Joined: 21 Feb 09 Posts: 497 Credit: 700,690,702 RAC: 0 Level Scientific publications	Message 27945 - Posted: 6 Jan 2013 \| 8:56:21 UTC - in response to Message 27941.
	The latest one forecasts 38 hours to complete!!! Call off the dogs! After 15 hours running, completion is just 5h:40m away. ____________
	ID: 27945 \| Rating: 0 \| rate: / Reply Quote

Retvari Zoltan Send message Joined: 20 Jan 09 Posts: 2343 Credit: 16,214,765,968 RAC: 1,002,217 Level Scientific publications	Message 28028 - Posted: 13 Jan 2013 \| 23:07:15 UTC - in response to Message 27852.
	Would this explain the size of these WU? (2 step deep means longer timeframe of simulation per WU, right?) I guess so, but I'm not the person who has comprehensive knowledge about these workunits.
	ID: 28028 \| Rating: 0 \| rate: / Reply Quote

skgiven Volunteer moderator Volunteer tester Send message Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level Scientific publications	Message 28033 - Posted: 14 Jan 2013 \| 12:03:56 UTC - in response to Message 28028.
	I think it means task generations; so 2 steps would mean completing a batch of tasks and then a second batch, auto-generated from the first batch's results. ____________ FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help
	ID: 28033 \| Rating: 0 \| rate: / Reply Quote
