Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
bormolino Send message Joined: 16 May 13 Posts: 4 Credit: 160,977 RAC: 0 |
PS- Just after posting, I now see that bormolino might be reporting the same issue just above my post. Yes :D Same on my machines. |
EHM-1 Send message Joined: 21 Mar 20 Posts: 23 Credit: 183,782 RAC: 0 |
Follow-up to my earlier post: At the most recent screensaver invocation, the normal behavior resumed. Note: Though subscribed to this thread, I received no notification of bormolino's post. Eric |
rlpm Send message Joined: 23 Mar 20 Posts: 13 Credit: 84 RAC: 0 |
Note: Though subscribed to this thread, I received no notification of bormolino's post. Check your community prefs from your main account page. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2115 Credit: 41,110,625 RAC: 19,736 |
Not sure what this means atm: 30/03/2020 3:17:00 | Rosetta@home | Scheduler request completed: got 0 new tasks. Also, on entering this thread I initially got a message saying the site was down. It came back on a refresh. |
amgthis Send message Joined: 25 Mar 06 Posts: 81 Credit: 203,879,282 RAC: 0 |
Getting a 'temporarily failed upload of (w/u name here xxx) transient http error' message on upload failure and timeout. I'm guessing it's just some new message I've never seen and the project is just getting updated, etc. |
HPE Belgium Send message Joined: 27 Mar 20 Posts: 16 Credit: 367,648,439 RAC: 0 |
Hello, I have some servers that I want to use for R@H. Most of the servers use full CPU and all cores/logical CPUs; however, I have 2 servers that only use half of the available logical processors. Both servers are ProLiant Gen9 servers. One server is a BL660c Gen9 with 32 logical CPUs, but only half of them are working while I still have tasks "ready to start". The other server is a DL380 Gen9, which takes 67% CPU load instead of 100%. My other servers are Gen8 servers, which take full load. Is there something I can do to fix this? Somebody that can help me troubleshoot? All my preferences are set to 100% load in my global preferences and this setting works fine on most of my servers. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1670 Credit: 17,523,845 RAC: 23,480 |
Is there something I can do to fix this? Somebody that can help me troubleshoot? All my preferences are set to 100% load in my global preferences and this setting works fine on most of my servers. Are they "Ready to start" or "Waiting on memory"? They've got enough RAM to support all of those cores & threads? You haven't changed any settings in the BOINC Manager on those systems (local settings override web based ones)? Grant Darwin NT |
HPE Belgium Send message Joined: 27 Mar 20 Posts: 16 Credit: 367,648,439 RAC: 0 |
Thank you for replying. More than enough free memory, and they are really "Ready to Start". What I do see, however, is that I have 32 jobs running with a total of 32 logical CPUs in my server, but it is only using half of the logical CPUs. See here: https://imgur.com/a/3iBM4DO I have this on all Gen9 servers, while I have Gen8 servers with 64 logical CPUs which are all fully used. I am now deploying another Gen9 and will see what that gives. |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1670 Credit: 17,523,845 RAC: 23,480 |
What I do see, however, is that I have 32 jobs running with a total of 32 logical CPUs in my server, but it is only using half of the logical CPUs. Hmm. I had something similar with my GPUs on Seti where the driver install went very weird & it showed double the number of actual GPUs in the BOINC log. I would check the Event log and make sure there is only 1 CPU entry in there (although being a multi-socket system it should probably be 2; make sure there aren't 4 in there). E.g.: 30/03/2020 15:09:34 | | CUDA: NVIDIA GPU 0: GeForce RTX 2060 (driver version 442.59, CUDA version 10.2, compute capability 7.5, 4096MB, 3556MB available, 14054 GFLOPS peak) 30/03/2020 15:09:34 | | CUDA: NVIDIA GPU 1: GeForce GTX 1070 (driver version 442.59, CUDA version 10.2, compute capability 6.1, 4096MB, 3556MB available, 6852 GFLOPS peak) 30/03/2020 15:09:34 | | OpenCL: NVIDIA GPU 0: GeForce RTX 2060 (driver version 442.59, device version OpenCL 1.2 CUDA, 6144MB, 3556MB available, 14054 GFLOPS peak) 30/03/2020 15:09:34 | | OpenCL: NVIDIA GPU 1: GeForce GTX 1070 (driver version 442.59, device version OpenCL 1.2 CUDA, 8192MB, 3556MB available, 6852 GFLOPS peak) 30/03/2020 15:09:34 | | Processor: 12 GenuineIntel Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz [Family 6 Model 158 Stepping 10] 30/03/2020 15:09:34 | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 vmx smx tm2 pbe fsgsbase bmi1 hle smep bmi2 30/03/2020 15:09:34 | | OS: Microsoft Windows 10: Professional x64 Edition, (10.00.18363.00) 30/03/2020 15:09:34 | | Memory: 31.95 GB physical, 36.70 GB virtual 30/03/2020 15:09:34 | | Disk: 930.50 GB total, 823.00 GB free 30/03/2020 15:09:34 | | Local time is UTC +9 hours 30/03/2020 15:09:34 | SETI@home | Found app_config.xml 30/03/2020 15:09:34 | SETI@home Beta Test | Found app_config.xml When my driver issue occurred, the CUDA & OpenCL entries for each video card were doubled up, resulting in 2 Tasks running on only the 1 GPU. Grant Darwin NT |
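The event-log check suggested above can be done by eye, but on several servers it may be quicker to script. The following is a minimal sketch, assuming the log has been saved as plain text and that the startup lines look like the ones quoted in this thread; the function name `processor_entries` and the regular expression are illustrative, not part of BOINC.

```python
import re

# Matches startup lines such as
# "... Processor: 12 GenuineIntel Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz [...]"
# The "Processor features:" line is not matched because of the required colon.
PROCESSOR_RE = re.compile(r"Processor:\s+(\d+)\s+(\S+)\s+(.*)")


def processor_entries(log_text):
    """Return (logical_cpu_count, vendor, model) for each 'Processor:' line."""
    entries = []
    for line in log_text.splitlines():
        match = PROCESSOR_RE.search(line)
        if match:
            count, vendor, model = match.groups()
            entries.append((int(count), vendor, model.strip()))
    return entries


SAMPLE = """\
30/03/2020 15:09:34 | | Processor: 12 GenuineIntel Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz [Family 6 Model 158 Stepping 10]
30/03/2020 15:09:34 | | Processor features: fpu vme de pse tsc msr
30/03/2020 15:09:34 | | OS: Microsoft Windows 10: Professional x64 Edition, (10.00.18363.00)
"""

for count, vendor, model in processor_entries(SAMPLE):
    print(count, vendor, model)
```

More than one tuple for a single client start would be the kind of doubling described above.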
MarkJ Send message Joined: 28 Mar 20 Posts: 72 Credit: 25,238,680 RAC: 0 |
More than enough free memory, and they are really "Ready to Start". What I do see, however, is that I have 32 jobs running with a total of 32 logical CPUs in my server, but it is only using half of the logical CPUs. Do you mean 16 cores/32 threads? What version of BOINC are you running, and is it up to date (7.14 or later)? Do you have local prefs limiting it to 50% of CPU? If so, change it to 100%. What percentage of memory is it allowed to use (it has one setting for in use and another for idle)? If you are logged in then it’s in use as far as BOINC is concerned. Do you have an app_config in the Rosetta project folder limiting the number of tasks? |
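For the last question, note that an app_config.xml lives in the per-project folder under `projects` in the BOINC data directory, not in the data directory itself. A minimal sketch for listing any such files, assuming the data directory path is known (the helper name `find_app_configs` is hypothetical):

```python
from pathlib import Path


def find_app_configs(data_dir):
    """Return every projects/<project>/app_config.xml under a BOINC data directory."""
    projects = Path(data_dir) / "projects"
    if not projects.is_dir():
        return []
    return sorted(projects.glob("*/app_config.xml"))


# Example call; the path is the usual Windows default and may differ on your system.
for path in find_app_configs(r"C:\ProgramData\BOINC"):
    print(path)
```

An empty result means no project on that host has an app_config.xml that could be limiting the number of running tasks.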
HPE Belgium Send message Joined: 27 Mar 20 Posts: 16 Credit: 367,648,439 RAC: 0 |
I checked the event log. I don't see anything special in there.... 30-Mar-2020 10:14:03 [---] Starting BOINC client version 7.14.2 for windows_x86_64 30-Mar-2020 10:14:03 [---] log flags: file_xfer, sched_ops, task 30-Mar-2020 10:14:03 [---] Libraries: libcurl/7.47.1 OpenSSL/1.0.2g zlib/1.2.8 30-Mar-2020 10:14:03 [---] Running as a daemon (GPU computing disabled) 30-Mar-2020 10:14:03 [---] Data directory: C:\ProgramData\BOINC 30-Mar-2020 10:14:03 [---] Running under account boinc_master 30-Mar-2020 10:14:03 [---] No usable GPUs found 30-Mar-2020 10:14:03 [---] Creating new client state file 30-Mar-2020 10:14:03 [---] Processor: 32 GenuineIntel Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz [Family 6 Model 63 Stepping 2] 30-Mar-2020 10:14:03 [---] Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 vmx smx tm2 dca pbe fsgsbase bmi1 smep bmi2 30-Mar-2020 10:14:03 [---] OS: Microsoft Windows Server 2016: Standard x64 Edition, (10.00.14393.00) 30-Mar-2020 10:14:03 [---] Memory: 383.87 GB physical, 423.87 GB virtual 30-Mar-2020 10:14:03 [---] Disk: 1023.45 GB total, 966.99 GB free 30-Mar-2020 10:14:03 [---] Local time is UTC +2 hours 30-Mar-2020 10:14:03 [---] No WSL found. 30-Mar-2020 10:14:03 [---] Last benchmark was 18351 days 08:14:03 ago 30-Mar-2020 10:14:08 [---] No general preferences found - using defaults 30-Mar-2020 10:14:08 [---] Preferences: 30-Mar-2020 10:14:08 [---] max memory usage when active: 196543.06 MB 30-Mar-2020 10:14:08 [---] max memory usage when idle: 353777.50 MB 30-Mar-2020 10:14:08 [---] max disk usage: 921.10 GB 30-Mar-2020 10:14:08 [---] don't use GPU while active 30-Mar-2020 10:14:08 [---] suspend work if non-BOINC CPU load exceeds 25% 30-Mar-2020 10:14:08 [---] (to change preferences, visit a project web site or select Preferences in the Manager) 30-Mar-2020 10:14:08 [---] Setting up project and slot directories 30-Mar-2020 10:14:08 [---] Checking active tasks 30-Mar-2020 10:14:08 [---] Setting up GUI RPC socket 30-Mar-2020 10:14:08 [---] Checking presence of 0 project files 30-Mar-2020 10:14:08 [---] This computer is not attached to any projects 30-Mar-2020 10:43:45 [---] Using proxy info from GUI 30-Mar-2020 10:44:21 [---] Fetching configuration file from https://boinc.bakerlab.org/rosetta/get_project_config.php 30-Mar-2020 10:44:39 [---] Running CPU benchmarks 30-Mar-2020 10:44:39 [---] Suspending computation - CPU benchmarks in progress 30-Mar-2020 10:45:10 [---] Benchmark results: 30-Mar-2020 10:45:10 [---] Number of CPUs: 32 30-Mar-2020 10:45:10 [---] 2933 floating point MIPS (Whetstone) per CPU 30-Mar-2020 10:45:10 [---] 11378 integer MIPS (Dhrystone) per CPU |
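A couple of the numbers in this startup log can be sanity-checked with simple arithmetic. The sketch below only reuses values printed in the log; the variable names are illustrative. It shows that the two memory limits work out to about 50% and 90% of physical memory (in line with "using defaults"), and that 18351 days before 30 March 2020 is the Unix epoch, which presumably means no benchmark had been recorded yet on this fresh install.

```python
from datetime import date, timedelta

PHYSICAL_GB = 383.87       # "Memory: 383.87 GB physical"
MAX_ACTIVE_MB = 196543.06  # "max memory usage when active"
MAX_IDLE_MB = 353777.50    # "max memory usage when idle"


def percent_of_physical(limit_mb, physical_gb):
    """Express a memory limit in MB as a percentage of physical memory in GB."""
    return 100.0 * limit_mb / (physical_gb * 1024)


active_pct = round(percent_of_physical(MAX_ACTIVE_MB, PHYSICAL_GB), 1)
idle_pct = round(percent_of_physical(MAX_IDLE_MB, PHYSICAL_GB), 1)

# "Last benchmark was 18351 days 08:14:03 ago", logged on 30-Mar-2020.
last_benchmark = date(2020, 3, 30) - timedelta(days=18351)

print(active_pct, idle_pct, last_benchmark)  # 50.0 90.0 1970-01-01
```

The same startup block also shows the default "suspend work if non-BOINC CPU load exceeds 25%" in effect at that moment, before the host was attached to a project.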
HPE Belgium Send message Joined: 27 Mar 20 Posts: 16 Credit: 367,648,439 RAC: 0 |
Latest BOINC. Fresh install from today. Global settings in the BOINC profile are: use at most 100% of the CPUs; use at most 100% CPU time. For memory, use at most 90%, but as you can see in the screenshot I attached, there is more than enough free. I have no app_config in the ProgramData\BOINC folder. |
MarkJ Send message Joined: 28 Mar 20 Posts: 72 Credit: 25,238,680 RAC: 0 |
Latest BOINC. Fresh install from today. What about that suspend when non-BOINC load > 25%? Can you try setting it to zero? Are computing options set to “run always”? |
HPE Belgium Send message Joined: 27 Mar 20 Posts: 16 Credit: 367,648,439 RAC: 0 |
"Suspend when non-BOINC load ..." is off in the "Computing Preferences" in BAM. "Activity" in BAM is all set to "Always"... I really don't know what's wrong. I use the exact same settings on all my servers. As I said, Gen8 servers, even with 64 cores, are fully loaded. Gen9 servers only take half... |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1670 Credit: 17,523,845 RAC: 23,480 |
Latest BOINC. Fresh install from today. Those were my next couple of questions, because the startup messages there look good, and some settings in app_config.xml will result in more cores than physically exist. But the number of Tasks you have matches the number of threads available, yet they are doubled up on physical cores. Are all of the Tasks running on just the 1 CPU? Wild speculation: a configuration setting on the OS (boot config/environment variables etc.?) is blocking the use of 1 CPU, but since the OS is reporting all Cores & Threads, that's how many Tasks are running even though half of them aren't actually available for use? Got me scratching my head; hopefully someone else will have come across it before. Anyway, good luck, it's past my bed time. Grant Darwin NT |
MarkJ Send message Joined: 28 Mar 20 Posts: 72 Credit: 25,238,680 RAC: 0 |
"Suspend when non-BOINC load ..." is off in the "Computing Preferences" in BAM Have you shut it down/rebooted after BOINC install? |
HPE Belgium Send message Joined: 27 Mar 20 Posts: 16 Credit: 367,648,439 RAC: 0 |
I installed as a service, so it needs a reboot after install... |
Bryn Mawr Send message Joined: 26 Dec 18 Posts: 389 Credit: 12,070,320 RAC: 12,300 |
"Suspend when non-BOINC load ..." is off in the "Computing Preferences" in BAM Silly question, could hyperthreading be turned off in the bios? |
HPE Belgium Send message Joined: 27 Mar 20 Posts: 16 Credit: 367,648,439 RAC: 0 |
"Suspend when non-BOINC load ..." is off in the "Computing Preferences" in BAM There are no silly questions. But HT is enabled. Here are the other BIOS options about performance (the last value is the current setting): Intel(R) Turbo Boost Technology: Default - Enabled, Enabled; ACPI SLIT: Default - Enabled, Enabled; Node Interleaving: Default - Disabled, Disabled; Intel NIC DMA Channels (IOAT): Default - Enabled, Enabled; HW Prefetcher: Default - Enabled, Enabled; Adjacent Sector Prefetch: Default - Enabled, Enabled; DCU Stream Prefetcher: Default - Enabled, Enabled; DCU IP Prefetcher: Default - Enabled, Enabled; QPI Snoop Configuration: Default - Home Snoop, Home Snoop; QPI Home Snoop Optimization: Default - Directory + OSB Enabled; QPI Bandwidth Optimization (RTID): Default - Balanced, Balanced; Memory Proximity Reporting for I/O: Default - Enabled, Enabled; I/O Non-posted Prefetching: Default - Enabled, Enabled; NUMA Group Size Optimization: Default - Clustered, Clustered; Intel Performance Monitoring Support: Default - Disabled, Disabled |
Falconet Send message Joined: 9 Mar 09 Posts: 353 Credit: 1,222,776 RAC: 4,804 |
I installed as a service, so it needs a reboot after install... Can you create a cc_config.xml file and save it in the BOINC Data Directory (usually C:/ProgramData/BOINC; you can check the event log for the correct path) with this, changing "N" to the number of threads you want to run: <cc_config> <options> <ncpus>N</ncpus> </options> </cc_config> I remember someone at the WCG forums with a 32C/64T AMD CPU that was running only 32 tasks. Once you save the file, go to BOINC-Options-Read Config Files or something like that. |
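If the same file has to go onto several servers, the XML above can be generated rather than typed by hand. This is a minimal sketch that only reproduces the structure shown in the post; the function name `build_cc_config` is hypothetical, and where the output is written is left to the reader.

```python
import xml.etree.ElementTree as ET


def build_cc_config(ncpus):
    """Return cc_config.xml content that sets <ncpus> to the given thread count."""
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "ncpus").text = str(ncpus)
    return ET.tostring(root, encoding="unicode")


print(build_cc_config(32))
# <cc_config><options><ncpus>32</ncpus></options></cc_config>
```

The returned string would then be saved as cc_config.xml in the data directory reported in the event log, followed by re-reading the config files as described above.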
©2024 University of Washington
https://www.bakerlab.org