Message boards : Number crunching : new ADRIA_KIXcMyb_HIP_bandit workunits
Author | Message |
---|---|
I saw a bunch of these went out today. | |
ID: 56951 | Rating: 0 | rate: / Reply Quote | |
Houston, we have a problem. <error_code>-131 (file size too big)</error_code> I expect everyone's will fail the same way. This is a project-side issue. That's a lot of wasted time for no results and no credit. Edit: I've notified Toni, but it probably won't get fixed before hundreds of tasks have failed and been resent :(. | |
ID: 56952 | Rating: 0 | rate: / Reply Quote | |
Well that sucks. I got a bunch of them also. I have about an hour to go on one of them to see if it too bombs out with too big an upload. | |
ID: 56953 | Rating: 0 | rate: / Reply Quote | |
Yes, the same problem of file upload too big error. | |
ID: 56954 | Rating: 0 | rate: / Reply Quote | |
My first round of tasks ran to completion, but all hit a computation error right after hitting 100%. Thank you very much for sharing this issue. In a flash: I have several of these tasks currently running on my hosts. The first to finish is running on a GTX 1660 Ti GPU under Linux. https://www.gpugrid.net/result.php?resultid=32622893 - I've stopped BOINC activity in BOINC Manager. - I've killed all BOINC processes. - I've edited, as administrator, the file /var/lib/boinc-client/client_state.xml - For the mentioned task, I've edited all 21 instances of the <max_nbytes> parameter, adding a leading 10 to the existing value. - I've restarted the boinc-client service, and activity in BOINC Manager. This task is estimated to finish in about 10 hours. Then I'll report whether this bypass has worked. | |
ID: 56955 | Rating: 0 | rate: / Reply Quote | |
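For anyone who would rather not hand-edit 21 entries, the manual steps above could be sketched as a small Python helper. This is only a rough sketch under assumptions that are not confirmed anywhere in this thread: that each output file entry in client_state.xml lists its `<name>` line before its `<max_nbytes>` line, and that the value is a plain number. The function name `bump_max_nbytes`, the sample file names and the 1,000,000,000 limit are made up for illustration. Note that it raises the value to a chosen limit rather than prepending digits. As with the manual edit, stop the BOINC client completely before touching the real file, and keep a backup copy.

```python
import re

# Patterns for the two lines we care about inside each file entry.
NAME_RE = re.compile(r"<name>(.*?)</name>")
MAX_RE = re.compile(r"(<max_nbytes>)\s*([0-9.]+)\s*(</max_nbytes>)")


def bump_max_nbytes(state_text, name_fragment, new_limit):
    """Raise max_nbytes for file entries whose name contains name_fragment.

    Assumes (hypothetically) that <name> appears before <max_nbytes>
    within each file entry. Returns (new_text, number_of_values_changed).
    """
    last_name = ""
    changed = 0
    out = []
    for line in state_text.splitlines(keepends=True):
        name_match = NAME_RE.search(line)
        if name_match:
            last_name = name_match.group(1)
        max_match = MAX_RE.search(line)
        if (max_match and name_fragment in last_name
                and float(max_match.group(2)) < new_limit):
            replacement = f"{max_match.group(1)}{float(new_limit):.6f}{max_match.group(3)}"
            line = line[:max_match.start()] + replacement + line[max_match.end():]
            changed += 1
        out.append(line)
    return "".join(out), changed


# In-memory demo with made-up entries; a real run would read and rewrite
# client_state.xml while the BOINC client is stopped.
sample = (
    "<file>\n<name>taskA_0_9</name>\n"
    "<max_nbytes>256000000.000000</max_nbytes>\n</file>\n"
    "<file>\n<name>other_0_9</name>\n"
    "<max_nbytes>256000000.000000</max_nbytes>\n</file>\n"
)
patched, count = bump_max_nbytes(sample, "taskA", 1_000_000_000)
print(count)
print(patched)
```

Only entries whose name contains the given fragment are touched, so other projects' files keep their original limits.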
That process sounds right. I've been trying to catch a resend myself, but with no success so far (most of the fun seems to have happened overnight, UK time, when I had work fetch suspended). | |
ID: 56956 | Rating: 0 | rate: / Reply Quote | |
Am I late to the party? Had my computer running with the gpugrid project attached but didn't get anything. Checked the logs as well, it just keeps saying the project has no tasks available. | |
ID: 56957 | Rating: 0 | rate: / Reply Quote | |
My first round of tasks ran to completion, but all hit a computation error right after hitting 100%. I wasn't able to save a lot of my tasks, but I applied this change to my existing tasks and it looks like it worked. the trouble file seems to be the one labelled "_9". that file is like 475MB and takes several minutes to upload. not sure if any others are over their limits, since each upload file has its own size limit and they are not all the same. this is quite labor intensive though. you'll have to make this change for each new task that shows up. it needs to be fixed on the project side. | |
ID: 56959 | Rating: 0 | rate: / Reply Quote | |
I was able to edit one task on the host with homogeneous cards. But won't bother with the tasks running on non-homogeneous hosts as they will likely error out from restarting on a different card anyway. | |
ID: 56961 | Rating: 0 | rate: / Reply Quote | |
the trouble file seems to be the one labelled "_9". that file is like 475MB and takes several minutes to upload. not sure if any others are over their limits, since each upload file has its own size limit and they are not all the same. I've just picked up a couple of these tasks. The _9 file seems to have a <max_nbytes> of 512,000,000. That should be just enough, even though the binary equivalent is 488.28 MB. I'll bump it for safety, and try to catch the exact size when it uploads. The tasks I've got are freshly created, at 15:42 UTC this afternoon - not resends. They may have made a fresh batch with the problem fixed. | |
ID: 56965 | Rating: 0 | rate: / Reply Quote | |
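The "binary equivalent" figure quoted above can be reproduced with a couple of lines of Python. This is just a sketch of the arithmetic, assuming the limit is stored in plain bytes and that "binary MB" means 1024 x 1024 bytes; the helper name `bytes_to_mib` is made up for illustration.

```python
MIB = 1024 * 1024  # bytes in one binary megabyte (MiB)


def bytes_to_mib(n_bytes):
    """Convert a byte count to binary megabytes."""
    return n_bytes / MIB


limit = 512_000_000  # the limit reported for the new _9 files
print(f"{limit:,} bytes = {bytes_to_mib(limit):.2f} MiB")
```

The distinction matters because a file manager showing sizes in binary units will display a smaller number than the decimal limit suggests.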
the trouble file seems to be the one labelled "_9". that file is like 475MB and takes several minutes to upload. not sure if any others are over their limits, since each upload file has its own size limit and they are not all the same. yup. i saw that too. they are labelled with "new" and "Adaptive" in the filename. so Toni obviously saw my message to change these units. all of the existing ones should be cancelled and re-sent IMO. if left to their own devices, the tasks will land in the hands of someone who doesn't know to manually fix them. it'll take a long time for these to naturally hit 8 errors since they are so long running. weeks. | |
ID: 56966 | Rating: 0 | rate: / Reply Quote | |
My _9 file currently uploading is 478MB. | |
ID: 56967 | Rating: 0 | rate: / Reply Quote | |
My _9 file currently uploading is 478MB. default is 256,000,000 bytes. they have since changed it on the new files to 512,000,000 bytes | |
ID: 56968 | Rating: 0 | rate: / Reply Quote | |
I suspended BOINC network activity a while before my previously mentioned task finished. | |
ID: 56969 | Rating: 0 | rate: / Reply Quote | |
WHAT a WASTE! | |
ID: 56970 | Rating: 0 | rate: / Reply Quote | |
New tasks have already been reconfigured with a larger file size limit and are being distributed. | |
ID: 56972 | Rating: 0 | rate: / Reply Quote | |
a 1660ti is not powerful enough to complete in under 24hrs. if you want the 24hr bonus, you'll need a faster GPU for these tasks. but what project gives 10x the credit? Collatz? LOL. do you care about credits, or do you care about doing real work? Collatz is doing useless research IMO. and at least one volunteer has pointed out that the project isn't even providing valid results. https://boinc.berkeley.edu/forum_thread.php?id=14159 | |
ID: 56974 | Rating: 0 | rate: / Reply Quote | |
New tasks have already been reconfigured with a larger file size limit and are being distributed. Fixed. Thanks to everybody for reporting. | |
ID: 56976 | Rating: 0 | rate: / Reply Quote | |
new tasks failing again, even with the higher limit: https://www.gpugrid.net/result.php?resultid=32625435 | |
ID: 56987 | Rating: 0 | rate: / Reply Quote | |
My last *_0_9 file size was 509240934 bytes. | |
ID: 56988 | Rating: 0 | rate: / Reply Quote | |
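To put that file size in context against the two limits mentioned earlier in the thread (256,000,000 bytes by default, 512,000,000 bytes on the reconfigured tasks), here is a small Python sketch of the headroom arithmetic. The helper name `headroom` is made up for illustration; whether the server compares sizes in exactly this way is an assumption.

```python
def headroom(file_bytes, max_nbytes):
    """Return (bytes_left, percent_left); negative values mean over the limit."""
    left = max_nbytes - file_bytes
    return left, 100.0 * left / max_nbytes


observed = 509_240_934  # reported size of the *_0_9 file, in bytes
for limit in (256_000_000, 512_000_000):
    left, pct = headroom(observed, limit)
    status = "fits" if left >= 0 else "too big"
    print(f"limit {limit:>11,}: {status}, {left:,} bytes ({pct:.2f}%) to spare")
```

With under 3 MB to spare on the higher limit, a slightly larger output file on another task would be enough to go over it.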
I guess something is in progress toward a definitive solution to the recent problems. | |
ID: 56991 | Rating: 0 | rate: / Reply Quote | |
I applied the fix to all my personal systems also, and you're right, the project couldn't know who did or didn't apply the fix manually. I'm sure there were many more people who didn't apply the fix and would have errored out. While it's unfortunate that some processing time was wasted, I think they made the right call to just cancel them all. | |
ID: 56992 | Rating: 0 | rate: / Reply Quote | |
Cancelled tasks here also. I had edited them to survive. Moot now but at least some of them hadn't started yet so no crunching time lost. | |
ID: 56993 | Rating: 0 | rate: / Reply Quote | |
New tasks have just started to flow. ... | |
ID: 56997 | Rating: 0 | rate: / Reply Quote | |
New tasks have just started to flow. can confirm. hopefully no more issues with these ones :) thanks admins! | |
ID: 56998 | Rating: 0 | rate: / Reply Quote | |
seeing nearly all of these tasks instant failing the past 2 days. not sure what's up. there is no verbose error message associated with them. just...fail. <message> not sure what's happening with these, but I've checked several other computers and it looks like everyone is having the same problem. | |
ID: 57015 | Rating: 0 | rate: / Reply Quote | |
seeing nearly all of these tasks instant failing the past 2 days... Same behavior for mine, and for their resends to other hosts. | |
ID: 57016 | Rating: 0 | rate: / Reply Quote | |
I thought it was me as I got one and the resend hadn't error'd out yet. Good to know. | |
ID: 57017 | Rating: 0 | rate: / Reply Quote | |
They're trying again. I got e3s197_e1s419p0f951-ADRIA_New_KIXcMyb_HIP_AdaptiveBandit-1-2-RND9332_0 - newly created this morning - but sadly it failed after 3 seconds. | |
ID: 57021 | Rating: 0 | rate: / Reply Quote | |
i usually see a few _0's roll through every day. all instant fail for the past several days. I sent a message to Toni, but haven't seen any resolution yet. | |
ID: 57024 | Rating: 0 | rate: / Reply Quote | |
I couldn't get anything out of the captured work spec - the wrapper rather gets in the way of decoding it. Next time, I'll try capturing the files as well, and running them in terminal - that worked for WCG. | |
ID: 57026 | Rating: 0 | rate: / Reply Quote | |
All 7 tasks received on my Linux hosts today, all of them starting with "e4s...", failed after a few seconds. | |
ID: 57027 | Rating: 0 | rate: / Reply Quote | |
I don’t think it’s a licensing issue. The apps were recreated with an updated license last October, about 9 months ago. Both Windows and Linux were created at the same time, presumably with the same licensing period. | |
ID: 57028 | Rating: 0 | rate: / Reply Quote | |
but you're right that it does appear to be a problem with Linux computers. maybe some parameter isn't set right for the Linux app? i see some successful runs on Windows machines. | |
ID: 57029 | Rating: 0 | rate: / Reply Quote | |
May be there is some kind of issue regarding Linux application / license (?) To verify my assumption, I've booted into the Windows 10 partition of this Linux / Windows dual boot system. All of my recent ADRIA_New_KIXcMyb_HIP_AdaptiveBandit tasks received on the Linux host since June 22nd had failed after three to four seconds of execution. I've received this e3s203_e1s419p0f906-ADRIA_New_KIXcMyb_HIP_AdaptiveBandit-1-2-RND8149_1 task on the Windows host, and it has been running for more than one hour by now. The computer is the same; the only difference is the operating system booted. I think I'll let this task run to completion, to check whether it succeeds. About 44 more hours left, estimated for this GTX 1650 SUPER GPU... | |
ID: 57030 | Rating: 0 | rate: / Reply Quote | |
I've received this e3s203_e1s419p0f906-ADRIA_New_KIXcMyb_HIP_AdaptiveBandit-1-2-RND8149_1 task on the Windows host, and it has been running for more than one hour by now. Finally, the mentioned task finished successfully today on my Windows 10 host, after 157,184.28 seconds of total processing time. This task also survived an unwanted system reboot, due to delayed Windows updates after a long time working on the Linux side. It even got the mid bonus for a result returned in less than 48 hours. I continue to think that some action should be taken on the server side to correct a problem affecting tasks generated for the Linux environment. | |
ID: 57039 | Rating: 0 | rate: / Reply Quote | |
I sent another message to Toni about the issue of Linux tasks. Hopefully a quick resolution. | |
ID: 57040 | Rating: 0 | rate: / Reply Quote | |
I sent another message to Toni about the issue of Linux tasks. Hopefully a quick resolution. I did too. | |
ID: 57044 | Rating: 0 | rate: / Reply Quote | |