Neuro-Lab compiles the strategy excessively in MSB / Optimization mode

LenMoz

#1

8/16/2016 6:08 PM

If I run Task Manager (Windows 10) while running an optimization, the "Background Process" category includes "Visual C# Command Line Compiler". This line appears for 3 seconds, then disappears for 3 seconds, then appears, etc. This oscillation continues for the duration of the optimization run.

So, two questions...
1. What is the purpose of this? I'm not modifying the strategy while the optimization runs.
2. If it is not necessary, can the strategy process recognize that it is running under optimization and gain some speed by bypassing the compile step (if that's what it is).

Edit: The 3-second timing is while running a 10-year backtest on 145 symbols with a 500-line strategy. It may be hard to see the phenomenon on a smaller backtest.

Eugene

#2

8/16/2016 6:27 PM

If it ain't broke, don't fix it?

LenMoz

#3

8/16/2016 6:31 PM

Eugene!!!!!! And you wonder why I've stopped posting on these boards.

I'm making a suggestion to save time on an optimization run. This is a topic of interest to several users. Now do your job and answer the questions.

Eugene

#4

8/16/2016 6:53 PM

QUOTE:
Eugene!!!!!! And you wonder why I've stopped posting on these boards.

I'm making a suggestion to save time on an optimization run. This is a topic of interest to several users. Now do your job and answer the questions.

You're barking up the wrong tree Len. I'm not responsible for developing the main Wealth-Lab application and have never had access to its source code. But I trust the Fidelity VP Product Development who designed Wealth-Lab's optimizer so I tried to informally suggest that there must be a solid reason behind something that doesn't appear broken. When it becomes a business critical issue rather than some "topic of interest" I'll stand out and certainly do my job.

LenMoz

#5

8/16/2016 7:02 PM

Can you refer me to the documentation (SLA) between you and Fidelity so I know what is MS123's and what is Fidelity's?

Eugene

#6

8/16/2016 7:08 PM

Our third party company MS123 LLC runs the wealth-lab.com website, is licensed by Fidelity to support and resell Wealth-Lab products to international customers. We are not involved in preparing the Fidelity Data Providers as well as are not responsible for developing the main Wealth-Lab application. We also do not determine what goes into the product. Besides other activities, MS123 takes care of the website, support, Extension development, and acts as an analyst and facilitator to submit problems to Fidelity.

LenMoz

#7

8/16/2016 7:22 PM

You italicized "problems". My problem is that optimization runs too slowly. Others in these forums have seen this as a problem as well. I included a suggestion that may improve performance. So, act as an analyst and facilitate.

Eugene

#8

8/16/2016 7:35 PM

Optimization may perform not as expected for a different reason (which could be determined by the developer with debugger). But "slowly" is akin to the infamous "it doesn't work" (just as useful and descriptive) - yet even worse being subjective.

What is "too slowly" exactly? Compared to a competitor's product? To a smaller backtest? Or it gets progressively slower as it runs? Or maybe it's some particular piece of code that slows down whereas plain vanilla "Moving Average Crossover" is fine in this aspect? etc.

In other words, please help me help you by describing your problem clearly. Give me some facts to reproduce and submit a bug report. Thanks.

vk

#9

8/16/2016 7:49 PM

Hi LenMoz,
Thank you for pointing out this interesting observation. I will bring this to the attention of the right people and hopefully they either have an explanation and good reason for it or they will put it onto their list for future improvements. Right now I would not know why this is happening. Ones again thank you for pointing it out and I will get back to you as soon as I have an answer.

VK

LenMoz

#10

8/16/2016 7:56 PM

I'm simply asking someone (at Fidelity) to look at the optimizer host code to see whether needless compiles are being done, as evidenced by Task Manager. If they are, change the code to compile only once at the start of the optimization. It has nothing to do with competitors, or backtest size, or any particular piece of code. You could simply refer Fidelity to this thread.

Eugene

#11

8/16/2016 8:36 PM

Len, before pointing the finger at csc.exe keep in mind that there always may be other reasons like GC or a bug like QC 55091.

Volker, in addition to knowing "why" wouldn't it be necessary to know "what" exactly is happening? ;) Running Windows 10 (like OP) I've been unable to reproduce any csc.exe popping up on the Background Processes / Resource Monitor while doing an Exhaustive optimization to start with.

LenMoz

#12

8/16/2016 9:04 PM

Another reason that occurs to me for a compile is that the strategy includes a neural network. Perhaps it's compiling the neural network script?

EDIT: It may not be related to optimization, but rather Neuro-Lab. The same phenomenon occurs simply doing a Multiple-Symbol Backtest.

Eugene

#13

8/16/2016 9:46 PM

Good catch Len. NL may be compiling its scripts/indicators during optimizations. We'll have to look into it and determine if it's doing its compilations excessively or this is required.

Eugene

#14

8/17/2016 3:52 PM

Seems like NL must be compiling the various scripts (input, output, indicator) as part of its script execution workload.

Cone

#15

8/18/2016 8:23 AM

In summary, does the the "Visual C# Command Line Compiler" observation occur only for optimizations of strategies that employ a NN?

LenMoz

#16

8/19/2016 1:17 AM

Cone,

That seems to be the case. I see it a lot because I have very few strategies that don't invoke NNIndicator.Series / use a neural network.

Edit: I ran a non-NN strategy and did not see the compile.

(This thread may be mistitled)

Eugene

#17

8/19/2016 6:07 AM

QUOTE:
(This thread may be mistitled)

Added a mention of Neuro-Lab to reflect your findings.

LenMoz

#18

8/19/2016 3:40 PM

QUOTE:
Added a mention of Neuro-Lab to reflect your findings

"Optimization" in the title may not be needed. I think we'll find that it has no role in this. Multi-symbol backtest is sufficient to invoke the compiler multiple times. I thought "optimization" before the later tests. (Edit)Possibly "MSB seems to compile the strategy a lot when Neuro-Lab is used"?

Eugene

#19

8/19/2016 4:31 PM

QUOTE:
(Edit)Possibly "MSB seems to compile the strategy a lot when Neuro-Lab is used"?

But that doesn't exclude optimization as a likely scenario. Therefore the "Optimization / MSB..." in the new title.

QUOTE:
Multi-symbol backtest is sufficient to invoke the compiler multiple times.

Right. Wealth-Lab executes the strategy (including Neuro-Lab's scripts) on each symbol sequentially, and then applies the position sizing overlay.

LenMoz

#20

8/25/2016 7:53 PM

Hi,

There is a *****20-to-1 speed improvement ***** to be had here. I think a redesign and implementation effort to have the NNIndicator parse the Xml and compile only once is worth pursuing, to benefit all users of Neuro-Lab.

Using NNIndicator, a 10-year backtest on 145 symbols takes 64 seconds. The NN uses 11 input DataSeries and a single hidden layer having 4 nodes. The very same backtest, using my own designed procedure that doesn't compile at all, takes 3 seconds, a 21 to 1 improvement. It parses the NeuroLab Xml only once, Signals produced are identical.

Here's the design I used. I created a class having a data structure for the NN topology and weights and two major procedures. The first, ParseNetworkXml, builds the data structure. The second NeuroCalc, calculates the NNIndicator DataSeries. The messy part of my solution is that it requires copying the NeuroLab Input Script into the strategy to build the NN's input DataSeries. Calls to neuroLab.Input are replaced by inputs.Add, where "inputs" is a List of DataSeries. Edit: I forgot to mention. An MSB only parses the Xml once, on the first symbol, and stores results as a Global.

Sidebar: Did you know that the weights are in the Xml twice, having identical values?

Len

LenMoz

#21

9/2/2016 4:21 PM

I called Fidelity today to try to raise some awareness of this performance issue. The person I spoke to didn't give me much hope, other than indicating that this forum is followed by their developers. So, Fidelity developers, any reaction to this? The time in the strategy (i.e. pre-Visualizers) running a multi-symbol backtest truly shows a 20 to 1 improvement when the XML parsing and input script compiling is done only once.

Eugene

#22

9/2/2016 5:15 PM

While the speed improvement you attained is really impressive, "copying the NeuroLab Input Script into the strategy to build the NN's input DataSeries" sounds like an added modification that has to be performed on a by strategy basis. Is this true? If so then from both usability and compatibility standpoint for the commercial product it's a tough call. Disclaimer: I'm not the NL developer.

LenMoz

#23

9/2/2016 5:43 PM

My solution, designed as a proof-of-concept, does indeed require strategy by strategy hand tailoring. It would not be the desired solution. The desired solution would require no change to strategy code. Rather, NNIndicator.Series would have a mechanism to detect whether it had built NN data structures and compiled the input script in this run, so as to build only once and re-use rather than the current rebuild at each symbol. I don't have the code so I can't design the final solution. Does this make sense? Fidelity, are you in there? Hello???

Edit: It's not that is so terribly difficult. The two methods that parse the network and calculate the DataSeries are together only 350 lines of code. For my purposes, I built a free-standing .dll so the hand-tailoring in each strategy is rather simple.

LenMoz

#24

9/16/2016 1:22 PM

Any progress on this? Any response at all?

IMHO, the current design is so unnecessarily slow as to make NeuroLab unfeasible for any meaningful purpose. I'm doing optimizations using my solution that I could not even consider before.

Len

vk

#25

9/18/2016 8:03 PM

Hi Len,
Neuro-Lab was a product developed by MS123, the Wealth-Lab support team that you probably mostly communicate with. The person who developed Neuro-Lab is not working for us anymore. In fact he just stopped working a few weeks before you discovered the "bug". Hence getting it fixed would be a tremendous financial effort. As far as we know you were the only one "discovering" it and/or reporting it. I am not in the position to talk about the Fidelity plans to release WL7, however if it materializes there should be a new NL, which then should definitely consider it.
Finally, I reached out to the developer to get an estimate on the fix, if it is within the scope I will get it done. Does that sound ok?

LenMoz

#26

9/18/2016 9:08 PM

QUOTE:
As far as we know you were the only one "discovering" it and/or reporting it.

That's because the underlying design isn't published. Who would guess at a compile for every MSB symbol? I've used NeuroLab since 2013 and always thought the slowness was because of the time required to build the input DataSeries. I found the real (compile) reason by accident. Unfortunately, NeuroLab doesn't seem to have a very big user community; no one has lent support to my request.

Carova

#27

9/19/2016 1:17 AM

QUOTE:
Unfortunately, NeuroLab doesn't seem to have a very big user community; no one has lent support to my request.

Maybe because it is so slow?? I tried and and decided it was way too slow for my needs.

Vince

LenMoz

#28

10/23/2016 8:57 PM

Any progress on this?

Cone

#29

10/24/2016 10:20 PM

Try version 1.0.3.0 available in extension updates now.

We were able to eliminate the unnecessary compiles, but didn't achieve the order of speed improvement that you did with your solution. Nonetheless, Neuro-Lab operates more than 200% faster now, which is definitely a big improvement!

LenMoz

#30

10/24/2016 11:10 PM

I ran one of my strategies using my solution and yours. Prior to 1.0.3.0, this run would have taken about 65 seconds including visualizers. Using 1.0.3.0, pre-visualizers, the run took 16 seconds. Using my solution, the run took 3 seconds (also pre-visualizers). I wish I had captured the pre-visualizer time before installing 1.0.3.0. Further, I did not see compiler executions in 1.0.3.0.

Bottom line, thanks for the update. There is still room for improvement.

How many times is the XML parsed? I parse only once, on the first symbol, and store the result in global storage. That could be a difference.

Eugene

#31

10/25/2016 6:09 AM

According to the developer, caching requests to the .XML file resulted only in a minimum speed improvement of ~10%. Since this could be a breaking change he considered it's not worth the trouble.

LenMoz

#32

10/25/2016 1:18 PM

I have an object, NeuralModel, that contains data structures and methods to replicate NNIndicator.Series functionality. It is instantiated on the first symbol. The constructor parses the XML into arrays. The object is stored as a Global. Method NeuralCalc constructs a DataSeries equivalent to NNIndicator.Series.

So, the top of my strategy looks like this...

CODE:
Please log in to see this code.

The constructor builds these data structures. My solution handles networks with up to two hidden layers only.

CODE:
Please log in to see this code.

IMHO the calls to parse XML are your (major) bottleneck.

Size:

Color:

I have an object, NeuralModel, that contains data structures and methods to replicate NNIndicator.Series functionality. It is instantiated on the first symbol. The constructor parses the XML into arrays. The object is stored as a Global. Method NeuralCalc constructs a DataSeries equivalent to NNIndicator.Series.

So, the top of my strategy looks like this...
[code]		protected override void Execute()
		{
			NeuralModel m;
			string NnName = "NeuroLabNetworkName"; // Neural network name
			if (DataSetSymbols[0] == Bars.Symbol) {
				m = new NeuralModel(NnName, this); // Constructor parses the XML
				SetGlobal(NnName, m);
			}
			else {
				m = (NeuralModel)GetGlobal(NnName);
				if (m == null) { // not found in Globals (for SSB)
					m = new NeuralModel(NnName, this);  // Constructor parses the XML
					SetGlobal(NnName, m);
				}
			}
			// For every symbol...
			List<DataSeries> NnInputs;
			NnInputs = NNDataPrep(); // NNDataPrep is a local method in the strategy to run the NeuroLab Input Script
			DataSeries dsNN = m.NeuralCalc(NnInputs, this); // replaces NNIndicator.Series(Bars, "NeuroLabNetworkName");
[/code]

The constructor builds these data structures. My solution handles networks with up to two hidden layers only.
[code]		int mintHiddenLayers;
		int mintInputs;
		int mintHiddens;
		int mintHiddens2;
		int mintOutputs;
		double[] mdblFromMin = new double[MAXINPUTS + 1]; // # of inputs plus the dependent variable
		double[] mdblFromMax = new double[MAXINPUTS + 1]; // # of inputs plus the dependent variable
		double[] mdblToMin = new double[MAXINPUTS + 1]; // # of inputs plus the dependent variable
		double[] mdblToMax = new double[MAXINPUTS + 1]; // # of inputs plus the dependent variable
		double[,] mdblHiddenWts = new double[MAXHIDDENS, MAXINPUTS + 1];
		double[,] mdblHiddenWts2 = new double[MAXHIDDENS2, MAXHIDDENS + 1];
		double[] mdblOutputWts = new double[MAXHIDDENS + 1];
[/code]
IMHO the calls to parse XML are your (major) bottleneck.

superticker

#33

10/26/2016 3:22 PM

QUOTE:
How many times is the XML parsed? I parse only once, on the first symbol, and store the result in global storage.

I'm not sure if the "parsing" time is what's slowing it down. My guess is the cause of the slowness maybe in the creation and destruction of all the data structures that follow the parsing. In other words, Garbage Collection (GC).

Try examining the Wealth-Lab process with Process Explorer (from SysInternals, now owned by Microsoft). Take a look at the .NET framework process tasks for the Generation 0,1, and 2 heap activity. If GC activity is over 5%, you have GC problems. From a GC prospective, you're always better off allocating the data structures and reusing them if possible; taking them down, GCing, then recreating them again is really slow.

QUOTE:
According to the developer, caching requests to the .XML file resulted only in a minimum speed improvement of ~10%.

And that maybe true if you're only modeling 2 or 3 parameters and have a really large processor cache. But if you're modeling 14 parameters such that the GC problem will no longer fit in the on-chip processor cache, that will make a factor of 5 difference.

When we (computer engineers) design caching systems, we allow a factor of 5 speed difference between each tier (L1, L2, L3) of the caching architecture. It's part of the "parameter funneling model" of the system design to maximize gain while minimizing chip real estate. So if the GC gets a cache miss on the L2 cache and that memory access is deferred to the L3 cache (or off-chip RAM memory) instead, that's a speed hit of 5x1 (L3) or 5x5 (off-chip RAM) times respectively. Bottom line, as your memory footprint gets bigger, the cache misses really slow you down big time.

QUOTE:
Unfortunately, NeuroLab doesn't seem to have a very big user community;...

QUOTE:
Maybe because it is so slow? I tried and and decided it was way too slow ... Vince

I agree. If it's too slow, no one will use it.

The other problem is the lack of experience users have had with neural networks. Neural nets are ideal for fitting model parameters for nonlinear, discontinuous, fuzzy systems like we have for stock trading. But how many WL users have had a graduate level course in neural networks (either in EE or computer science) to know that? That's your biggest problem.

What Fidelity should do is host a bi-annual symposium for Wealth-Lab users. The sessions at such a symposium can then cover some of these advanced topics. I would host a Wealth-Lab symposium in parallel with an established symposium about stock investing/trading so you get enough critical mass (i.e. attendance) to make it successful.... And it would be nice to meet some of the developers.

Size:

Color:

[quote]How many times is the XML parsed? I parse only once, on the first symbol, and store the result in global storage.[/quote]I'm not sure if the "parsing" time is what's slowing it down.  My [u]guess[/u] is the cause of the slowness maybe in the creation and destruction of all the data structures that follow the parsing.  In other words, Garbage Collection (GC).

Try examining the Wealth-Lab process with Process Explorer (from SysInternals, now owned by Microsoft). Take a look at the .NET framework process tasks for the Generation 0,1, and 2 heap activity. If GC activity is over 5%, you have GC problems.  From a GC prospective, you're always better off allocating the data structures and [u]reusing them[/u] if possible; taking them down, GCing, then recreating them again is really slow.

[quote]According to the developer, caching requests to the .XML file resulted only in a minimum speed improvement of ~10%.[/quote]And that maybe true if you're only modeling 2 or 3 parameters and have a really large processor cache.  But if you're modeling 14 parameters such that the GC problem will no longer fit in the on-chip processor cache, that will make a factor of 5 difference.

When we (computer engineers) design caching systems, we allow a factor of 5 speed difference between each tier (L1, L2, L3) of the caching architecture.  It's part of the "parameter funneling model" of the system design to maximize gain while minimizing chip real estate.  So if the GC gets a cache miss on the L2 cache and that memory access is deferred to the L3 cache (or off-chip RAM memory) instead, that's a speed hit of 5x1 (L3) or 5x5 (off-chip RAM) times respectively.  Bottom line, as your memory footprint gets bigger, the cache misses really slow you down big time.

[quote]Unfortunately, NeuroLab doesn't seem to have a very big user community;...[/quote][quote]Maybe because it is so slow? I tried and and decided it was way too slow ... Vince[/quote]I agree.  If it's too slow, no one will use it.

The other problem is the lack of experience users have had with neural networks.  Neural nets are ideal for fitting model parameters for nonlinear, [u]discontinuous[/u], fuzzy systems like we have for stock trading.  But how many WL users have had a graduate level course in neural networks (either in EE or computer science) to know that?  That's your biggest problem.

What Fidelity should do is host a bi-annual symposium for Wealth-Lab users.  The sessions at such a symposium can then cover some of these advanced topics. I would host a Wealth-Lab symposium in parallel with an established symposium about stock investing/trading so you get enough critical mass (i.e. attendance) to make it successful....  And it would be nice to meet some of the developers.

LenMoz

#34

10/26/2016 3:51 PM

Thanks for your insightful post, superticker. Since the compiles have been removed, I think there is high probability that Xml processing is the biggest remaining culprit. Through a ticket, I've provided MS123/Fidelity with the C# project that builds my object plus a strategy script that compares timing of NIndicator.Series to my solution.

QUOTE:
What Fidelity should do is host a bi-annual symposium for Wealth-Lab users.

QUOTE:
And it would be nice to meet some of the developers.

I couldn't agree more! I wouldn't miss it.

Cone

#35

10/26/2016 4:52 PM

At the Tri-annual Traders' Expo (next one in Las Vegas in 3 weeks) you can meet the developer(s). We've given "Deep Dive" classes at the expo for years, but it turns out to be difficult to attract advanced users to attend and always ends up being closer to a Wealth-Lab 101 class.

LenMoz

#36

10/26/2016 5:45 PM

I've started a "WealthLab at Trade Shows" thread so these off-topic posts might be found.

Wealth-Lab® Links

WealthSignals Links

Disclaimer Policy

Follow Us

A FREE account will allow you access to our knowledge-base resources, customer support, WealthSignals services, and a trial version of our software Wealth Lab

Almost done.