Message boards : Number crunching : Problems and Technical Issues with Rosetta@home
Author | Message |
---|---|
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1670 Credit: 17,523,845 RAC: 23,480 |
I'm saying it doesn't look productive because the decoys are taking approximately 4 to 6 times longer to process. Whereas the half dozen or so Tasks I've processed so far with the new application actually got more work done in 8 hours than the previous applications did in the same time. Early days yet, with not much work to actually see how things are going. Grant Darwin NT |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1670 Credit: 17,523,845 RAC: 23,480 |
Definitely an issue with Rosetta Mini target run times since the rollout of the new applications. Finally managed to pick up some new work- a whole 2 Tasks. 1 Rosetta Mini and 1 Rosetta 4.12. Rosetta 4.12 is on target for the target CPU time (8hrs). Rosetta Mini is on target for double the target CPU time (16hrs). This has only started since the new applications were released (as far as I can tell; did they also do a fix for the Rosetta Mini Tasks that were paying next to nothing at the same time?) Edit- Finally finished a few of these longer running Rosetta Minis and I've decided this isn't really a problem at all. While the Tasks take twice as long to process, they pay out 4 times more Credit than they usually do. I can live with that. Grant Darwin NT |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2115 Credit: 41,112,600 RAC: 19,835 |
The tasks in progress is incorrect, I reset the project twice this week due to multiple downloads failing so they aren't really there as discussed in a different thread. I remember now - apologies. That's around 2400 'ghost' tasks showing that aren't there for you to run. They'll be part of the 900k+ tasks that look to be in progress and aren't running down in spite of no/little work for days. I'm saying it doesn't look productive because the decoys are taking approximately 4 to 6 times longer to process. If you watch the graphics, it gets to a certain number of steps and then almost stops, taking 30-60 minutes for each additional step. Something else I missed. I don't look at the graphics to see how things are running. I'm not sure it reflects anything too much about the task - just a show. Models, yes, but not steps. Maybe I'm wrong. Ignore me. |
strongboes Send message Joined: 3 Mar 20 Posts: 27 Credit: 5,394,270 RAC: 0 |
The first of the new tasks has just finished, took 4 hours to run the 1 decoy for me, these were definitely running under an hour previously. If you have your runtime set to 4 hours you won't really notice the difference in time, but I'm more concerned with the actual work being done by the program. If points are an accurate indication then with 4.07 I was running at an average of 300pts per hour per core, this just finished task has returned 300 points in 4 hours, which ties in with my thinking they are not running efficiently. Is there a mod reading who can make a comment? Edit: there are 60 of these now finishing so plenty to look at https://boinc.bakerlab.org/rosetta/result.php?resultid=1138591491 |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2115 Credit: 41,112,600 RAC: 19,835 |
I can give a little further info also, my cpu is currently 99% utilised. boinc is running 60 cores, 2 are running gpus for folding, 2 spare for overhead. normally when boinc is running with all cores running the clock speed is approx 3.2ghz, and it will pull as many watts as i let it (doubling the power with an overclock only get me to 3.55ghz) , at the moment it's pulling 15% less power, and the clock speed is up at 4.2ghz for all cores. If each core was being run hard it would be impossible for it to run this speed. This is the speed it normally runs with say 3 or 4 cores loaded. That's very interesting. I also overclock on a FX8370 Piledriver, but prevent any throttling so my base 4.3GHz is running at 4.768GHz which is surprisingly stable. But I know 8 cores is no comparison to 60-64 cores. I pay attention to people with Ryzens because that's where I'm likely to go next once I've run this one into the ground |
strongboes Send message Joined: 3 Mar 20 Posts: 27 Credit: 5,394,270 RAC: 0 |
This is the 3990x so 64 cores, 128 threads, I've turned off smt so only running the 64 so as to give more l3 cache per core which allows the tasks to progress very rapidly. It's a fantastic chip. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2115 Credit: 41,112,600 RAC: 19,835 |
This is the 3990x so 64 cores, 128 threads, I've turned off smt so only running the 64 so as to give more l3 cache per core which allows the tasks to progress very rapidly. It's a fantastic chip. Noted, ta. But also aware you've gone for top spec. I'm more likely to look at 3700/3800 for the price/performance tbh. Cost is an issue. But I'll reassess at the point I'm ready to buy. Not even sure I could afford your RAM, let alone your MB/CPU! |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2115 Credit: 41,112,600 RAC: 19,835 |
the first of the new tasks has just finished, took 4 hours to run the 1 decoy for me, these were definitely running under an hour previously. If you have your runtime to 4 hours you won't really notice the difference in time, but i'm more concerned with the actual work being done by the program. If points are an accurate indication then with 4.07 I was running at an average of 300pts per hour per core, this just finished task has returned 300 points in 4 hours, which ties in with my thinking they are not running efficiently. Are you sure? Looks more like 75/core/hr in the past to me. Sometimes 50. Also, new versions take a little while to get their scoring sorted out iirc. Looks like it started at 150/4hrs and has risen to nearer 300 now. But this isn't my strong suit. Anyway, I only chimed in because I'd be happy with 8 or 16 WUs atm. 11 now here on my 8-core but still nothing for my 2 4-core machines. 60 would be a dream |
Grant (SSSF) Send message Joined: 28 Mar 20 Posts: 1670 Credit: 17,523,845 RAC: 23,480 |
Edit-Well, it was nice while it lasted. Gone from 4 times as much down to 2 times as much- so back on par with Tasks that run for normal Target times. Grant Darwin NT |
nastasache Send message Joined: 24 Feb 07 Posts: 16 Credit: 171,383 RAC: 0 |
Thanks a lot, Robert. I changed all to use 99% of RAM (was 90% as default and 50% for other). And 1% of swap. It looks like there are no out-of-memory errors for now, but memory usage stays as before. For 12 tasks, the total memory usage is about 6GB. It looks like R@H is using less memory per task than the max available for a 32bit app. Here is a task with max mem usage: Application: Rosetta 4.12; Name: 4dy3ga3h_jhr_design1_COVID-19_SAVE_ALL_OUT_903392_1; State: Running; Received: 2020-04-01 21:33:01; Report deadline: 2020-04-09 21:33:00; Estimated computation size: 80,000 GFLOPs; CPU time: 08:11:40; CPU time since checkpoint: 00:04:37; Elapsed time: 15:34:17; Estimated time remaining: 2d 05:56:33; Fraction done: 22.400%; Virtual memory size: 1.12 GB; Working set size: 1.14 GB; Directory: slots/2; Process ID: 14460; Progress rate: 2.520% per hour; Executable: rosetta_4.12_windows_intelx86.exe. Btw a task takes about 2-3 days to finish, from an initial 4 hours estimation; is that normal? Iulian |
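Editor's note: the "Estimated time remaining" in the task listing above is consistent with a plain linear extrapolation from elapsed time and fraction done. The sketch below only checks that arithmetic. It is an assumption that the client computes the figure this way, and the function names are hypothetical, not part of BOINC.

```python
def parse_hms(text: str) -> int:
    """Convert an 'HH:MM:SS' string to seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def remaining_seconds(elapsed_s: float, fraction_done: float) -> float:
    """Linear extrapolation: assume the rest runs at the average pace so far."""
    return elapsed_s * (1.0 - fraction_done) / fraction_done


def format_dhms(seconds: float) -> str:
    """Format seconds as 'Nd HH:MM:SS'."""
    total = int(round(seconds))
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{days}d {hours:02d}:{minutes:02d}:{secs:02d}"


# Figures taken from the task listing in the post above.
elapsed = parse_hms("15:34:17")
estimate = remaining_seconds(elapsed, 0.224)
print(format_dhms(estimate))  # within a few seconds of the reported 2d 05:56:33
```

On this reading, the 2 to 3 day figure simply reflects that only 22.4% was done after more than 15 hours of elapsed time.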
strongboes Send message Joined: 3 Mar 20 Posts: 27 Credit: 5,394,270 RAC: 0 |
see below, there are no 4.07 tasks left showing, there was 9000 yesterday only 400 today, the mini was taking around an hour but gives an idea. the 4.07 were averaging a 40 min runtime, with a rate of 1 credit for 11.5 secs of runtime on average. 3600/11.5 = 313 The last 4.12 is running at 1 credit for 59.95 seconds of runtime. 4.7* slower https://boinc.bakerlab.org/rosetta/results.php?hostid=3800945&offset=340&show_names=0&state=4&appid= |
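Editor's note: the credit arithmetic in the post above can be checked with a few lines. This is a minimal sketch using only the two figures quoted in the post (11.5 and 59.95 seconds of runtime per credit); the function name is hypothetical.

```python
def credits_per_hour(seconds_per_credit: float) -> float:
    """Credits earned per hour of runtime at a given seconds-per-credit rate."""
    return 3600.0 / seconds_per_credit


rate_407 = credits_per_hour(11.5)    # about 313 credits per hour (4.07)
rate_412 = credits_per_hour(59.95)   # about 60 credits per hour (4.12)
slowdown = 59.95 / 11.5              # ratio of seconds per credit

print(f"4.07: {rate_407:.0f}/hr, 4.12: {rate_412:.0f}/hr, ratio: {slowdown:.1f}x")
```

With the rounded figures as quoted, the ratio comes out nearer 5.2 than 4.7; the poster may have been working from unrounded averages.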
JoshuaScholar Send message Joined: 26 Mar 20 Posts: 18 Credit: 232,183 RAC: 0 |
I know this affects so few people that it won't matter much but: I have an older 2 socket Xeon system (Sandy Bridge era e5-2690s). Let me tell you what DOESN'T work properly with the Windows client on my Windows 10 pro setup: 1) NUMA. Having two sockets, the most common way to run Windows is with each processor accessing the memory that's attached to it directly preferentially. This is called NUMA, and it's slightly faster. But with NUMA enabled, the client picks the proper number of threads as if it's going to use both sockets, but then it runs all of the threads on only ONE of the sockets. 2) Hyperthreading with NUMA off. [NUMA off is called "uniform memory access", by the way.] With NUMA off and Hyperthreading enabled, the client creates the right number of threads for using both sockets BUT it allocates both threads to the SAME hyperthread in each core. So each core has one empty hyperthread and one hyperthread shared by two threads. So on this old 2 socket Xeon system running Windows 10 pro, the only efficient way to run the BOINC client is to turn off NUMA and also turn off hyperthreading. Then it works properly. On a machine this old, on a highly parallel workload, turning off hyperthreading is about a 20% throughput hit. On a newer processor it would be a greater hit. I'm not sure if there's any real hit to turning off NUMA, but it isn't a big one. Josh Scholar |
nastasache Send message Joined: 24 Feb 07 Posts: 16 Credit: 171,383 RAC: 0 |
Hi, especially @Grant (SSSF). Where am I going wrong? I need 2x more time to finish the tasks and get 50% of the GFLOPS on a similar i7-8700K CPU. Compare: - https://boinc.bakerlab.org/rosetta/host_app_versions.php?hostid=3933928 - https://boinc.bakerlab.org/rosetta/host_app_versions.php?hostid=3914491 Thanks in advance. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
strongboes, [snip] I'm saying it doesn't look productive because the decoys are taking approximately 4 to 6 times longer to process. If you watch the graphics, it gets to a certain number of steps and then almost stops, taking 30-60 minutes for each additional step. You are assuming that each decoy does an equal amount of work, and that each step does an equal amount of work. I don't expect that to be true. Generally, the first decoy is only for checking that your computer works correctly and is the same every time. The second decoy starts the useful work. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
One thing to watch for when using CPUs with especially high numbers of cores - the bandwidth from the CPU to the memory may not be adequate to run all of the cores very well. This could leave each core in use waiting for access to memory most of the time. If so, it can be useful to reduce the number of cores BOINC is allowed to use and see if that speeds up the work enough to more than compensate for fewer cores in use. |
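Editor's note: the "more than compensate" test in the post above is a comparison of total throughput. Below is a minimal sketch, assuming you have measured an average per-core completion rate in each configuration; the function names are hypothetical and no measurements from this thread are used.

```python
def total_throughput(cores_in_use: int, tasks_per_core_hour: float) -> float:
    """Whole-machine throughput: cores in use times the per-core rate."""
    return cores_in_use * tasks_per_core_hour


def fewer_cores_is_better(full_cores: int, full_rate: float,
                          reduced_cores: int, reduced_rate: float) -> bool:
    """True if the reduced-core setup completes more work per hour overall."""
    reduced = total_throughput(reduced_cores, reduced_rate)
    full = total_throughput(full_cores, full_rate)
    return reduced > full
```

For example, dropping from 64 to 48 cores only pays off if each remaining core gets more than 64/48, about 1.33 times, faster.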
JoshuaScholar Send message Joined: 26 Mar 20 Posts: 18 Credit: 232,183 RAC: 0 |
That might be because of the bugs I noticed. Make sure that every thread is really allocated in its own hyperthread, because BOINC doesn't leave it up to the OS. |
strongboes Send message Joined: 3 Mar 20 Posts: 27 Credit: 5,394,270 RAC: 0 |
One thing to watch for when using CPUs with especially high numbers of cores - the bandwidth from the CPU to the memory may not be adequate to run all of the cores very well. This could leave each core in use waiting for access to memory most of the time, If you read previous posts you will see that I'm not hyper threading and have large l3 cache and ram. I tried running just 10 cores also. It isn't that; they run roughly 4 times slower than 4.07 if they start with rb. It will be obvious soon enough. |
JoshuaScholar Send message Joined: 26 Mar 20 Posts: 18 Credit: 232,183 RAC: 0 |
Oh you're right. I just looked at my task list. Time per WU has jumped from 8 hours to 16 hours! The cores are running cooler than with the last version too, which suggests a bottleneck. Note 2, I just noticed that the most recent few are fast again. Maybe there was just a run of WUs for a harder problem. |
robertmiles Send message Joined: 16 Jun 08 Posts: 1232 Credit: 14,269,631 RAC: 3,846 |
A typical cause here for harder problems is larger proteins. |
Sid Celery Send message Joined: 11 Feb 08 Posts: 2115 Credit: 41,112,600 RAC: 19,835 |
see below, there are no 4.07 tasks left showing, there was 9000 yesterday only 400 today, the mini was taking around an hour but gives an idea. the 4.07 were averaging a 40 min runtime, with a rate of 1 credit for 11.5 secs of runtime on average. 3600/11.5 = 313 I didn't look back that far earlier. What I notice now is that starting today, 2-Apr, the scoring for mini-Rosetta has plunged to 75/hr, down from 300/hr, and 4.12 are 300/4hr - 75/hr too. It looks like something has happened to <all> scoring from today - a step change down - but consistent between the two on validation. Very odd. |
©2024 University of Washington
https://www.bakerlab.org