WLP becomes unresponsive when running several optimizations with MS123 Scorecard
Author: kazuna
Creation Date: 3/3/2019 5:44 PM

kazuna

#1
Does MS123 Scorecard have a scalability limitation?
Whenever I optimize using MS123 Scorecard, as soon as I start optimization in 3 instances, WLP becomes unresponsive.
I see the optimization progress bar moving, but WLP doesn't respond to any UI input.

This issue doesn't happen with the other scorecards, and I have no problem optimizing 8 instances at once.
My machine has 8 cores, CPU usage is around 15%, and there is plenty of memory available (64GB).
This is not a hardware limitation.

Aren't you using a global semaphore or spinlock on the UI thread?

Here are the optimization parameters.

Scale: 1 Minute
Data Range: 1 Year
Position Size: SetShareSize
Optimization Method: Exhaustive
Runs Required: 20000

Eugene

#2
MS123 Scorecard is parallelized with PLINQ and parallel tasks. There are no artificial limitations, but it's not necessarily a good idea to run multiple optimizations. I ran a couple of quick tests (using Daily data) and WLD stayed responsive while running 3 optimizations. As there are no plans to make any adjustments to the otherwise working scorecard, try changing your settings or reducing the number of parallel optimizations until the UI stays responsive.

kazuna

#3
What degree of parallelism do you specify with WithDegreeOfParallelism for PLINQ?
PLINQ doesn't seem like a good idea for running multiple optimization instances.

But at the very least, you should not block the UI thread.
Are you using async methods with PLINQ?

Eugene

#4
QUOTE:
What degree of parallelism do you specify with WithDegreeOfParallelism for PLINQ?

I let .NET handle it at default settings.
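
For illustration, the distinction is just whether the query is capped explicitly (a generic sketch, not the Scorecard source; Compute stands in for a metric calculation):

CODE:
using System;
using System.Linq;

class PlinqDegreeDemo
{
    // Stand-in for one performance metric calculation.
    static double Compute(int n) => Math.Sqrt(n);

    static void Main()
    {
        int[] items = Enumerable.Range(1, 1000000).ToArray();

        // Default: PLINQ chooses the degree of parallelism itself,
        // bounded by Environment.ProcessorCount. This is what we do.
        double[] auto = items.AsParallel()
                             .Select(Compute)
                             .ToArray();

        // An explicit cap would look like this; we don't set one.
        double[] capped = items.AsParallel()
                               .WithDegreeOfParallelism(4)
                               .Select(Compute)
                               .ToArray();

        Console.WriteLine($"{auto.Length} / {capped.Length} results");
    }
}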

QUOTE:
But at the very least, you should not block the UI thread.

Like I said, I couldn't duplicate it, but whatever happens, it's probably not the Scorecard but WL itself.

QUOTE:
PLINQ doesn't seem like a good idea for running multiple optimization instances.

Yes, but it's good for speeding up a single optimization. Running multiple optimizations with MS123 Scorecard has never been on the table.

P.S. Speed-wise I recommend running the "No Closed Equity" version.

kazuna

#5
QUOTE:
Like I said, I couldn't duplicate it, but whatever happens, it's probably not the Scorecard but WL itself.
But it doesn't occur with other scorecards.

QUOTE:
Yes, but it's good for speeding up a single optimization. Running multiple optimizations with MS123 Scorecard has never been on the table.
From what I tested, PLINQ isn't speeding up the optimization at all. Instead, it's actually slower.

When I optimize 20000 runs on 1 year of intraday data with MS123 Scorecard, PLINQ seems to use 43 worker threads (slicing the runs?). However, only a single core is used, with CPU usage hovering around 85-90% and never reaching 100%.

With Extended Scorecard, only a single thread is used, but CPU usage stays at 100% the whole time, and it finished the optimization 10% faster than MS123 Scorecard.

It looks like something (most likely data set access?) is serializing the execution, making PLINQ's parallelism completely meaningless. Even worse, PLINQ's overhead actually makes the optimization slower than the non-PLINQ version.

kazuna

#6
As for the unresponsiveness issue, you can easily duplicate the problem using this code.
Just open 4~8 strategy windows and begin the optimizations one by one.
After you begin 3 optimizations, you cannot start the 4th one, and the UI is completely frozen.
You will realize the issue is much worse than you imagined.

Scale: 1 Minute
Data Range: 1 Year
Position Size: SetShareSize
DataSet: 1 minute scale SPY

CODE:
Please log in to see this code.
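
(If you can't see the code above, any trivial strategy with a single optimizable parameter reproduces it. A rough stand-in, not the original code; the entry/exit logic here is arbitrary:)

CODE:
using System;
using WealthLab;

namespace FreezeRepro
{
    // Minimal stand-in: one optimizable parameter and a time-based exit,
    // just enough to drive an exhaustive optimization.
    public class FreezeReproStrategy : WealthScript
    {
        private StrategyParameter holdBars;

        public FreezeReproStrategy()
        {
            holdBars = CreateParameter("Hold Bars", 10, 1, 200, 1);
        }

        protected override void Execute()
        {
            int hold = holdBars.ValueInt;
            for (int bar = 1; bar < Bars.Count; bar++)
            {
                if (IsLastPositionActive)
                {
                    // Exit after holding the optimized number of bars.
                    if (bar - LastPosition.EntryBar >= hold)
                        SellAtMarket(bar + 1, LastPosition);
                }
                else
                {
                    BuyAtMarket(bar + 1);
                }
            }
        }
    }
}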

Eugene

#7
QUOTE:
From what I tested, PLINQ isn't speeding up the optimization at all. Instead, it's actually slower.

The class (PerformanceEngine) that generates the performance metrics is shared between the MS123 Scorecards and the MS123 Visualizers. So it's not just about optimization; the focus is on backtest performance. In my testing there was a positive impact on backtest performance. For me, parallel optimizations are a borderline case.

QUOTE:
With Extended Scorecard, only a single thread is used, but CPU usage stays at 100% the whole time, and it finished the optimization 10% faster than MS123 Scorecard.

In Portfolio Simulation mode, MS123 Scorecard (full) has to calculate 37 metrics with its parallel tasks while Extended Scorecard calculates only 14. Clearly 10% looks like a good tradeoff!

kazuna

#8
Like "No Closed Equity" version, I wonder if you would consider for "No PLINQ" version.

kazuna

#9
QUOTE:
In Portfolio Simulation mode, MS123 Scorecard (full) has to calculate 37 metrics with its parallel tasks while Extended Scorecard calculates only 14. Clearly 10% looks like a good tradeoff!
If it uses parallel tasks, then why does it use only a single core? CPU utilization is almost identical between MS123 Scorecard and Extended Scorecard. That means the parallelism isn't giving you any benefit. On my 8-core, 16-hyperthread CPU, a single optimization uses around 12~14% CPU with either scorecard.

Eugene

#10
QUOTE:
That means the parallelism isn't giving you any benefit.

Like I said, there are 3x more calculations in a dozen Tasks under the hood of the MS123 scorecards, so they are more computationally intensive.

QUOTE:
Like "No Closed Equity" version, I wonder if you would consider for "No PLINQ" version.

"-PLINQ what?"
"-No parallel? It's like Unparalleled right?"
"-What the heck is multi-threading?"

Adding such a choice would sound very geeky and confusing. Not a good user experience. The thing is, the shared code has been working fine since its redesign in 2012. Investing time and effort into turning off PLINQ in the Scorecards, just because of potential performance concerns in a borderline case like multiple optimizations, looks like a questionable effort to me.

QUOTE:
If it uses parallel tasks, then why does it use only a single core?

That's a question for Wealth-Lab's developer, not for yours truly (the MS123 Scorecard developer). As a 3rd party, I have never had source code access to WLP. Keep in mind that there are thread safety issues with the backtesting engine, which held us back from parallelizing Monte Carlo Lab (for example).

QUOTE:
After you begin 3 optimizations, you cannot start the 4th one, and the UI is completely frozen.
You will realize the issue is much worse than you imagined.

My suggestion would be to spawn as few parallel optimizations as possible so that the UI isn't frozen. If two work for you, great; if one, that's fine with me too.

kazuna

#11
QUOTE:
My suggestion would be to spawn as few parallel optimizations as possible so that the UI isn't frozen. If two work for you, great; if one, that's fine with me too.
In order to maximize the CPU utilization on an 8-core, 16-hyperthread system, I need to spawn at least 16 optimizations.

But again, the problem is not the number of optimizations. The problem is that the UI thread is blocked. I think it's a simple implementation issue. You have to offload all CPU-intensive work from the UI thread. Asynchronous methods would be one solution.

Eugene

#12
QUOTE:
In order to maximize the CPU utilization

Even though I may understand it from a geek standpoint, maximizing your CPU utilization is not a use case. Simply put, it's not advertised that WLP would do it. ;)

QUOTE:
The problem is that the UI thread is blocked. I think it's a simple implementation issue.

Here's what takes place:
CODE:
Please log in to see this code.
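
In outline, it's along these lines (illustrative names only, not the actual source):

CODE:
using System.Threading.Tasks;
using System.Windows.Forms;

class ScorecardFlowSketch
{
    // Stand-in for one group of metric calculations.
    static void ComputeMetricGroup(int group) { /* heavy math */ }

    // WL invokes this on the UI thread. The metric math is fanned out
    // into tasks, but the method then blocks until all of them finish
    // before filling in the ListViewItem shared with the UI.
    public static void PopulateScorecard(ListViewItem item)
    {
        var tasks = new Task[12];
        for (int i = 0; i < tasks.Length; i++)
        {
            int group = i;
            tasks[group] = Task.Run(() => ComputeMetricGroup(group));
        }
        Task.WaitAll(tasks);       // synchronous wait on the calling thread
        item.SubItems.Add("42.0"); // write the results into the shared control
    }
}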

Can you see potential for releasing the UI thread here?

kazuna

#13
I'm not a modern C# expert, so please consider this pseudo-code.
I don't know exactly why the UI thread is blocked, but I would try something like this to see if it mitigates the problem.

CODE:
Please log in to see this code.
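
(In compilable form, the gist of my suggestion is something like this, assuming the caller can be made async:)

CODE:
using System.Threading.Tasks;
using System.Windows.Forms;

class OffloadSketch
{
    static void ComputeAllMetrics() { /* heavy math */ }

    // Run the heavy work on the thread pool and await it; the UI thread
    // keeps pumping messages until the work completes, then resumes here
    // to update the control.
    public static async Task PopulateScorecardAsync(ListViewItem item)
    {
        await Task.Run(() => ComputeAllMetrics());
        item.SubItems.Add("42.0"); // back on the UI thread after the await
    }
}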


Eugene

#14
Thank you for your suggestion. Async/await was added in C# 5 and became mainstream after MS123 Perf.Visualizers was redesigned in 2012. Because await requires the calling method to be marked async, this is currently not an option, as it would require modifying PopulateScorecard() (i.e. WLP source code).

But I may consider experimenting with our implementation instead. Here's what happens inside it (in pseudocode)...
CODE:
Please log in to see this code.
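
The relevant difference in isolation (a sketch, not the production code):

CODE:
using System.Threading.Tasks;

class WaitAllVersusWhenAll
{
    static Task[] SpawnMetricTasks()
    {
        var tasks = new Task[12];
        for (int i = 0; i < tasks.Length; i++)
            tasks[i] = Task.Run(() => { /* one metric group */ });
        return tasks;
    }

    // Current shape: the calling thread is held hostage until every
    // task has finished.
    public static void PopulateBlocking()
    {
        Task.WaitAll(SpawnMetricTasks());
    }

    // Discussed alternative: the calling thread is released at the
    // await and resumes when all tasks finish. Note this forces the
    // method (and its callers) to become async.
    public static async Task PopulateNonBlockingAsync()
    {
        await Task.WhenAll(SpawnMetricTasks());
    }
}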


So do you think that waiting asynchronously with await Task.WhenAll (as opposed to the synchronous wait that Task.WaitAll performs) should keep the WL UI thread from being blocked? Please confirm that I understood you correctly.

kazuna

#15
QUOTE:
So do you think that waiting asynchronously with await Task.WhenAll (as opposed to the synchronous wait that Task.WaitAll performs) should keep the WL UI thread from being blocked? Please confirm that I understood you correctly.
Yes, that's what I suspect. I think it's worth giving it a shot at least.

By the way, did you duplicate the problem? It is pretty bad, isn't it?

Eugene

#16
QUOTE:
By the way, did you duplicate the problem? It is pretty bad, isn't it?

No, I didn't try it. Perhaps I'm not being clear, but multiple optimizations with MS123 Scorecard are too uncommon a scenario to be considered an issue. Nothing is broken in the normal use case.

QUOTE:
Yes, that's what I suspect. I think it's worth giving it a shot at least.

OK, I'll enter a new task in our product backlog to evaluate await Task.WhenAll and look at it eventually.

superticker

#17
QUOTE:
Does MS123 Scorecard have a scalability limitation?
Whenever I optimize using MS123 Scorecard, as soon as I start optimization in 3 instances, WLP becomes unresponsive.
Unresponsive? This sounds like a memory page-faulting problem to me. Your problem has grown so large in virtual memory that the OS is paging it out to disk. Is your disk drive thrashing?

Go into Process Explorer (by Sysinternals) and post what the "10-second delta" page fault rate is on your system. See the attachment. (Use the View » Update Speed menu to set the update rate to 10-second intervals in Process Explorer.)

If you're paging, you need to reduce your problem size dramatically. This should speed you up significantly.

The extended ScoreCard may be using significantly more virtual memory than the standard ScoreCard to compute all its metrics. In that case, reduce the number of optimizations you're doing in parallel to dramatically speed up your system when using the extended ScoreCard.

kazuna

#18
QUOTE:
No, I didn't try it.
Please take a look at it yourself. You'll realize it's pretty bad, I promise.

QUOTE:
Unresponsive? This sounds like a memory page-faulting problem to me.
No. As I mentioned at the beginning, I have 64GB memory and WLP uses just 1GB. The disk drive is an NVMe, one of the fastest you can get today. CPU usage peaks at 15%. If you can't believe it, you can duplicate the problem as I described above. It's obviously a software problem. In fact, you only need 4 optimizations to duplicate it. Meanwhile, other scorecards (e.g. Extended Scorecard) can easily go beyond 20 optimizations.

superticker

#19
QUOTE:
I have 64GB memory and WLP uses just 1GB.
If you look at my attachment above, my WL "working set" is using 0.7GBytes when I have 16GBytes of physical memory installed.

But if you now look at the virtual size of my WL process, it's showing 3.3GBytes. And my WL process is only streaming, not optimizing.

If you're only getting 1GByte of virtual memory for WL when I'm getting 3.3GBytes of virtual memory doing nothing important, then you have OS problems. But my "guess" is that the 1GByte is your "working set" (i.e. the physical memory allocation for WL) and not your virtual memory allocation. I think the virtual memory allocation for your WL process is much bigger. But if it isn't, then you need to have your Windows OS troubleshot, because Windows isn't working right. I should not be getting three times the virtual memory for WL that you're getting when you have a much bigger system (64GBytes) than I do (16GBytes). Look at my attachment (Post #17) again.

When you do a System Info, does Windows acknowledge seeing all 64GBytes of memory? Some older northbridge chipsets may not be able to physically address that much memory. My northbridge chipset can only address 32GBytes.

Post a Process Explorer screenshot of your system when you're having the problems. We can troubleshoot it from there. Perhaps you do have an OS problem, but we aren't there yet. What I'm most interested in is the amount of virtual memory and paging rate your WL process has during the unresponsive problem.

---
On a slightly different topic, if you're asking how to increase the "working set" for a Windows application, that might be a StackOverflow question since that's more about Windows. But I agree, if you're only getting a working set of 1GByte (physical memory) on a 64GByte machine with the OS defaults, that behavior seems odd to me. The StackOverflow guys may know more. Search "increasing the working set on Windows" on StackOverflow.

QUOTE:
... it's not necessarily a good idea to run multiple optimizations.
And I agree. But if the optimization code did increase the working set, it might be possible to run multiple optimizations. That may be the point here. For non-optimization activity, however, the default working set is fine.

kazuna

#20
There is not much difference observed in Process Explorer between MS123 Scorecard and Extended Scorecard while optimizing.
In fact, MS123 Scorecard uses less memory and has a lower paging rate.
This makes sense because MS123 Scorecard is doing less compared to Extended Scorecard.

Please note that Extended Scorecard doesn't exhibit the problem even when running 10 optimizations.

There is no such thing as a northbridge anymore in modern systems. I have a Core i9-9900K; it has an integrated memory controller and supports up to 64GB of memory.

This issue is a typical UI thread problem, often discussed:
https://www.google.com/search?q=C%23+UI+thread+unresponsive&oq=C%23+UI+thread+unresponsive

If you are interested in the problem, please duplicate it and take a look. It's easy.
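
(For reference, the classic shape of the problem in a toy WinForms app, unrelated to WL's actual code:)

CODE:
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Windows.Forms;

class UiFreezeDemo
{
    [STAThread]
    static void Main()
    {
        var form = new Form { Text = "UI thread demo" };
        var block = new Button { Text = "Block UI", Top = 10, Width = 120 };
        var offload = new Button { Text = "Offload", Top = 50, Width = 120 };

        // Freezes the window: the UI thread sleeps instead of pumping messages.
        block.Click += (s, e) => Thread.Sleep(5000);

        // Stays responsive: the wait happens on a thread-pool thread.
        offload.Click += async (s, e) => await Task.Run(() => Thread.Sleep(5000));

        form.Controls.Add(block);
        form.Controls.Add(offload);
        Application.Run(form);
    }
}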

superticker

#21
QUOTE:
Please note that Extended Scorecard doesn't exhibit the problem even when running 10 optimizations.
Sorry, I misread this thread. I thought the larger Extended ScoreCard was having the problems.

So why would the smaller MS123 ScoreCard be having parallel execution problems? Does it employ a .NET datatype that's not thread safe? I'm asking. I haven't used this ScoreCard before. If it's using datatypes that aren't thread safe, then I would do just one optimization at a time.

kazuna

#22
QUOTE:
Does it employ a .NET datatype that's not thread safe?
It's thread-safe, and that's not the problem.

https://docs.microsoft.com/en-us/dotnet/standard/parallel-programming/parallel-linq-plinq

MS123 Scorecard uses PLINQ, which is supposed to distribute the tasks among the available cores and run them in parallel. Despite the parallelism, I cannot observe any distribution across cores when optimizing; only one core is used at a time. Even worse, it seems to make the UI thread unresponsive, and that's a deal breaker for me.

The parallelism issue is probably due to data set access or something else serializing the tasks and killing the parallelism.

The UI unresponsiveness is likely due to the scorecard call (PopulateScorecard) running on the UI thread or blocking some resource shared with the UI thread.

If PLINQ were working successfully, a single optimization would be distributed across all available cores. But I see almost no distribution: one core runs at 85-90% while the other cores are barely used, at less than 5%. Extended Scorecard, the single-core implementation, actually runs faster as far as optimization is concerned.

Eugene

#23
To sum things up: running multiple optimizations has never been a use case for WLP/D. For the PerformanceEngine class that powers the MS123 Scorecards along with the MS123 Visualizers, PLINQ and tasks are a life saver for speeding up its calculations (3x more numbers than the Extended Scorecard returns). With multiple optimizations there's overhead from spawning many parallel tasks, which may be compounded by the use of PLINQ in them. The Scorecard does have to lock some shared resources (the ListViewItem) and runs on the UI thread by design.

@kazuna

It may be a while before I get a chance to experiment with async/await, and I'm still skeptical about it, as the likely obstacle is that we're bound to the Scorecard interface. If all you require from MS123 Scorecards for your optimization is the Max Drawdown % (as you asked recently), the solution may be to build your own Scorecard with that performance metric. You'll also have the freedom to make it single-threaded, as Extended Scorecard is, if that suits your CPU usage better. The open source code can be downloaded from our Wiki: Home - MS123 Visualizers (log in there and click on Attachments). It's outdated (from 2011, I guess) but should be enough to illustrate the idea (see the sketch below). One request if you consider this: please use a closely matching (or new) forum thread for scorecard development.
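
The core of such a one-metric scorecard is just the drawdown math itself; for example (plain C#, independent of the Scorecard interface):

CODE:
using System;

class MaxDrawdownDemo
{
    // Max Drawdown %: the largest peak-to-trough decline of the equity
    // curve, as a percentage of the preceding peak.
    static double MaxDrawdownPercent(double[] equity)
    {
        double peak = equity[0];
        double maxDd = 0.0;
        foreach (double value in equity)
        {
            if (value > peak)
                peak = value;
            double dd = (peak - value) / peak * 100.0;
            if (dd > maxDd)
                maxDd = dd;
        }
        return maxDd;
    }

    static void Main()
    {
        double[] equity = { 100, 110, 95, 120, 90, 130 };
        Console.WriteLine($"{MaxDrawdownPercent(equity):F1}%"); // prints 25.0%
    }
}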

kazuna

#24
QUOTE:
To sum things up: running multiple optimizations has never been a use case.
Yes, I completely understand that.

QUOTE:
PLINQ and tasks are a life saver for speeding up its calculations (3x more numbers than the Extended Scorecard returns).
Probably the number of metrics makes the difference on daily-scale data? At intraday scale, the strategy execution itself is intensive, so PLINQ's improvement on the metric calculations is negligible?

QUOTE:
the solution may be to build your own Scorecard with that performance metric.
That sounds like one solution. Let me take a look. Thank you for your suggestion.

Eugene

#25
Hope it works out for you. If it doesn't, that would disprove the PLINQ idea, proving that the lock on the shared resource is the bottleneck.

Panache

#26
QUOTE:
the solution may be to build your own Scorecard
Another option is to build the optimizer into your strategy and bypass the scorecards entirely. To do that, your strategy has to pick its own trades and do its own position sizing.

Here's an early version of the code I used for the optimization section, just to give you a starting point (it is designed so that the last parameter controls whether or not to optimize):
CODE:
Please log in to see this code.
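
(Stripped of the WealthScript specifics, the shape of the optimization section is a plain parameter sweep; Evaluate is a stand-in for scoring one candidate value:)

CODE:
using System;

class InStrategySweepSketch
{
    // Stand-in: score one candidate parameter value, e.g. the net
    // profit of the trades the strategy would pick with it.
    static double Evaluate(int candidate) => -Math.Abs(candidate - 37);

    static void Main()
    {
        // The last parameter acts as a switch: when true, sweep the
        // candidates and trade with the best one; when false, trade
        // with the given value as-is.
        bool optimize = true;
        int parameter = 10;

        if (optimize)
        {
            double bestScore = double.MinValue;
            for (int candidate = 1; candidate <= 200; candidate++)
            {
                double score = Evaluate(candidate);
                if (score > bestScore)
                {
                    bestScore = score;
                    parameter = candidate;
                }
            }
        }

        Console.WriteLine($"Trading with parameter = {parameter}");
    }
}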

Eugene

#27
Nice!

Panache

#28
I just realized there is a bug in my code. To demonstrate:

CODE:
Please log in to see this code.

The List<T>.Clear method sets the count to 0, but the buy (and sell) still shows up in the trades. I'm confused as to why it behaves differently from the Wealth-Lab method.

I don't think
CODE:
Please log in to see this code.
is causing a problem, but as a result of the behavior of
CODE:
Please log in to see this code.
and the fact that
CODE:
Please log in to see this code.
is not null, I'm concerned. Is there a method available to clear ActivePositions or can you share what else is happening to Positions in the Wealth-Lab method?

Eugene

#29
Neither ActivePositions.Clear() nor Positions.Clear() is supported in WealthScript Strategies. It's expected that calling these methods will not work as intended.

On the other hand, ClearPositions() may work out in single-symbol mode. Please see the QuickRef for the documented method's description and a note regarding its effect on multi-symbol backtests.


QUOTE:
or can you share what else is happening to Positions in the Wealth-Lab method?

Sorry, but we don't have access to the WLP code base, and even if we did, we wouldn't disclose the details of Fidelity's intellectual property.