Between the 10th Dec and 17th Dec my GTS250 had 5 Errors and 9 Successes.
About 19h 15min were lost (69206s), or 11.5% of the time. That is slightly better than the previous week (12.5%) and still much better than the week before (25%).
Since the 13th there has only been one failure, although it did fail after 9h30min!
I suspect that failure was a result of the task running while I was using the system, so I have made sure it does not run GPUGrid when I am using it (which is not too often)!
All the error messages contain the following line:
MDIO ERROR: cannot open file "restart.coor"
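List of tasks undertaken (columns: task ID, workunit ID, sent, reported, status, run time in s, CPU time in s, claimed credit, granted credit, application):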
1633773 1024720 15 Dec 2009 23:57:30 UTC 16 Dec 2009 13:41:29 UTC Completed and validated 47,063.52 4,500.40 3,977.21 5,369.23 Full-atom molecular dynamics v6.71 (cuda23)
1632027 1023349 15 Dec 2009 6:17:34 UTC 15 Dec 2009 23:57:30 UTC Error while computing 34,324.59 1,311.34 4,428.01 --- Full-atom molecular dynamics v6.71 (cuda23)
1629991 1022109 14 Dec 2009 15:42:01 UTC 15 Dec 2009 11:17:44 UTC Completed and validated 52,633.66 3,102.44 4,503.74 6,080.05 Full-atom molecular dynamics v6.71 (cuda23)
1627586 1020487 14 Dec 2009 0:18:19 UTC 14 Dec 2009 20:42:11 UTC Completed and validated 55,474.79 3,033.25 4,531.91 6,118.08 Full-atom molecular dynamics v6.71 (cuda23)
1625604 1007426 13 Dec 2009 11:55:40 UTC 14 Dec 2009 6:21:57 UTC Completed and validated 52,461.03 2,915.99 4,503.74 6,080.05 Full-atom molecular dynamics v6.71 (cuda23)
1624544 1018750 12 Dec 2009 11:21:41 UTC 13 Dec 2009 11:53:58 UTC Completed and validated 55,578.08 3,299.06 4,531.91 6,118.08 Full-atom molecular dynamics v6.71 (cuda23)
1624517 1018739 12 Dec 2009 10:56:14 UTC 12 Dec 2009 11:14:49 UTC Error while computing 1,015.11 54.30 4,022.81 --- Full-atom molecular dynamics v6.71 (cuda23)
1624470 1018708 12 Dec 2009 10:28:30 UTC 12 Dec 2009 10:34:45 UTC Error while computing 265.39 18.05 4,428.01 --- Full-atom molecular dynamics v6.71 (cuda23)
1622530 1013606 11 Dec 2009 20:43:48 UTC 12 Dec 2009 10:28:30 UTC Error while computing 32,402.90 1,903.20 4,531.91 --- Full-atom molecular dynamics v6.71 (cuda23)
1620740 1016195 11 Dec 2009 7:20:21 UTC 12 Dec 2009 5:01:01 UTC Completed and validated 50,106.54 2,207.99 4,022.81 5,430.80 Full-atom molecular dynamics v6.71 (cuda23)
1620010 1015882 12 Dec 2009 10:34:45 UTC 12 Dec 2009 10:56:14 UTC Error while computing 1,199.04 117.81 3,977.21 --- Full-atom molecular dynamics v6.71 (cuda23)
1617985 1014436 10 Dec 2009 10:43:53 UTC 11 Dec 2009 16:01:44 UTC Completed and validated 45,358.59 5,275.05 3,539.96 4,778.94 Full-atom molecular dynamics v6.71 (cuda23)
1616001 1013057 9 Dec 2009 23:10:00 UTC 10 Dec 2009 18:54:39 UTC Completed and validated 56,684.09 3,287.19 4,531.91 6,118.08 Full-atom molecular dynamics v6.71 (cuda23)
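The lost-time figure can be cross-checked from the list by adding up the run times of the five errored tasks. This is only a minimal sketch: the run times are copied from the rows above, and the percentage assumes the lost time is measured against one full week of wall-clock time (my assumption about how the 11.5% was worked out). The sum comes to roughly 69,207 s, or about 11.4% of a week, which is in line with the figures quoted.

```python
# Run times (seconds) of the five "Error while computing" tasks, from the list above.
error_run_times = [34324.59, 1015.11, 265.39, 32402.90, 1199.04]

# Assumption: the percentage is relative to a full 7-day week of wall-clock time.
WEEK_SECONDS = 7 * 24 * 3600


def lost_time(run_times, period_seconds=WEEK_SECONDS):
    """Return (total seconds lost, percentage of the period lost)."""
    total = sum(run_times)
    return total, total / period_seconds * 100


lost_s, lost_pct = lost_time(error_run_times)
hours, remainder = divmod(int(lost_s), 3600)
print(f"{lost_s:.0f} s lost = {hours}h {remainder // 60}min, {lost_pct:.1f}% of the week")
```

Passing a different `period_seconds` (for example, only the hours the card was actually crunching) would give a different percentage, so the result depends on which denominator is meant.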
Failure 1:
________________________________________
Name p270000-IBUCH_2_pYEEI_2011-5-20-RND2486_0
Workunit 1015882
Created 11 Dec 2009 1:24:50 UTC
Sent 12 Dec 2009 10:34:45 UTC
Received 12 Dec 2009 10:56:14 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 51279
Report deadline 17 Dec 2009 10:34:45 UTC
Run time 1199.038244
CPU time 117.812
stderr out <core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTS 250"
# Clock rate: 1.85 GHz
# Total amount of global memory: 1073741824 bytes
# Number of multiprocessors: 16
# Number of cores: 128
MDIO ERROR: cannot open file "restart.coor"
Cuda error: Kernel [pme_fill_charges_overflow] failed in file 'fillcharges.cu' in line 97 : unknown error.
</stderr_txt>
]]>
Validate state Invalid
Claimed credit 3977.21064814815
Granted credit 0
application version Full-atom molecular dynamics v6.71 (cuda23)
Failure 2:
________________________________________
Name 471-GIANNI_BIND_166_119-30-100-RND4009_1
Workunit 1013606
Created 11 Dec 2009 20:08:53 UTC
Sent 11 Dec 2009 20:43:48 UTC
Received 12 Dec 2009 10:28:30 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 51279
Report deadline 16 Dec 2009 20:43:48 UTC
Run time 32402.901728
CPU time 1903.197
stderr out <core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTS 250"
# Clock rate: 1.85 GHz
# Total amount of global memory: 1073741824 bytes
# Number of multiprocessors: 16
# Number of cores: 128
MDIO ERROR: cannot open file "restart.coor"
Cuda error: Kernel [PmeRealSpace_compute_forces] failed in file 'PmeRealSpace.cu' in line 172 : unknown error.
</stderr_txt>
]]>
Validate state Invalid
Claimed credit 4531.90972222222
Granted credit 0
application version Full-atom molecular dynamics v6.71 (cuda23)
Failure 3:
________________________________________
Name 88-KASHIF_HIVPR_n1_for_1hhp_open_ba4-78-100-RND1283_0
Workunit 1018708
Created 12 Dec 2009 9:52:07 UTC
Sent 12 Dec 2009 10:28:30 UTC
Received 12 Dec 2009 10:34:45 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 51279
Report deadline 17 Dec 2009 10:28:30 UTC
Run time 265.390701
CPU time 18.04932
stderr out <core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTS 250"
# Clock rate: 1.85 GHz
# Total amount of global memory: 1073741824 bytes
# Number of multiprocessors: 16
# Number of cores: 128
MDIO ERROR: cannot open file "restart.coor"
Cuda error: Kernel [PmeRealSpace_compute_forces] failed in file 'PmeRealSpace.cu' in line 172 : unknown error.
</stderr_txt>
]]>
Validate state Invalid
Claimed credit 4428.01157407407
Granted credit 0
application version Full-atom molecular dynamics v6.71 (cuda23)
Failure 4:
Name 34-KASHIF_HIVPR_sub_so_ba1-72-100-RND1262_0
Workunit 1018739
Created 12 Dec 2009 10:17:10 UTC
Sent 12 Dec 2009 10:56:14 UTC
Received 12 Dec 2009 11:14:49 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 51279
Report deadline 17 Dec 2009 10:56:14 UTC
Run time 1015.111446
CPU time 54.30395
stderr out <core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTS 250"
# Clock rate: 1.85 GHz
# Total amount of global memory: 1073741824 bytes
# Number of multiprocessors: 16
# Number of cores: 128
MDIO ERROR: cannot open file "restart.coor"
Cuda error: Kernel [PmeRealSpace_compute_forces] failed in file 'PmeRealSpace.cu' in line 172 : unknown error.
</stderr_txt>
]]>
Validate state Invalid
Claimed credit 4022.81481481481
Granted credit 0
application version Full-atom molecular dynamics v6.71 (cuda23)
Failure 5:
Name 89-KASHIF_HIVPR_n1_for_1hhp_open_ba4-78-100-RND7252_1
Workunit 1023349
Created 15 Dec 2009 5:43:05 UTC
Sent 15 Dec 2009 6:17:34 UTC
Received 15 Dec 2009 23:57:30 UTC
Server state Over
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 51279
Report deadline 20 Dec 2009 6:17:34 UTC
Run time 34324.593376
CPU time 1311.344
stderr out <core_client_version>6.10.18</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTS 250"
# Clock rate: 1.85 GHz
# Total amount of global memory: 1073741824 bytes
# Number of multiprocessors: 16
# Number of cores: 128
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTS 250"
# Clock rate: 1.85 GHz
# Total amount of global memory: 1073741824 bytes
# Number of multiprocessors: 16
# Number of cores: 128
Cuda error: Kernel [PmeRealSpace_compute_forces] failed in file 'PmeRealSpace.cu' in line 172 : unknown error.
</stderr_txt>
]]>
Validate state Invalid
Claimed credit 4428.01157407407
Granted credit 0
application version Full-atom molecular dynamics v6.71 (cuda23)
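For anyone wanting to compare their own failed tasks, the two error lines in the stderr blocks above can be pulled out with a short script. This is only a rough sketch: the sample text is copied from Failure 1, the function name `summarize_stderr` is made up for illustration, and the patterns are based solely on the messages shown in this post, so other application versions may word them differently.

```python
import re

# Sample stderr text copied from Failure 1 above (device header shortened).
STDERR_SAMPLE = """\
# Using CUDA device 0
# There is 1 device supporting CUDA
# Device 0: "GeForce GTS 250"
MDIO ERROR: cannot open file "restart.coor"
Cuda error: Kernel [pme_fill_charges_overflow] failed in file 'fillcharges.cu' in line 97 : unknown error.
"""

# Patterns match only the message wording seen in this post.
MDIO_RE = re.compile(r'^MDIO ERROR: cannot open file "([^"]+)"', re.MULTILINE)
KERNEL_RE = re.compile(
    r"^Cuda error: Kernel \[(\w+)\] failed in file '([^']+)' in line (\d+)",
    re.MULTILINE,
)


def summarize_stderr(text):
    """Return the files MDIO could not open and the CUDA kernels that failed."""
    missing_files = MDIO_RE.findall(text)
    kernels = [(name, src, int(line)) for name, src, line in KERNEL_RE.findall(text)]
    return {"missing_files": missing_files, "failed_kernels": kernels}


summary = summarize_stderr(STDERR_SAMPLE)
print(summary)
```

Run over the five failures above, this would show the same missing `restart.coor` in every case, with `PmeRealSpace_compute_forces` as the failing kernel in four of them and `pme_fill_charges_overflow` in the other.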