Hello,
I am currently running a backtest on a list of about 500 symbols over 10 years. It has already been running for more than 48 hours and has only reached the symbols starting with D. So I need to optimize the script's performance, but I don't know where to start.
The script does two things that I believe are responsible for its slowness:
- it maintains an evolving watchlist
- it ranks the symbols into bins according to recent volume, for the purpose of position sizing
I guess that both procedures are run over and over again for every symbol, and that this could be done more efficiently.
For simplicity and clarity I will only provide pseudocode:
CODE:
[code not available in this copy of the thread]
After this part there is the signal generation and standard logic which does not take too much time.
I think the volume bin creation could be done just once per bar for the whole dataset. How can I achieve this?
Is there also something I can do about the onWatch procedure?
regards,
dansmo
1. Use a profiler to find the real code bottleneck: see "Is there a way to profile Wealth-Lab 5 Strategies i.e. measure application/script performance?" (scroll to the end). As a simple alternative, use Trace.WriteLine, or a logger like the built-in log4net.
2. If applicable, modify your strategy to call SetContext only prior to entering a Position; otherwise use GetExternalSymbol.
3. The whole volumeBin() + myList part looks unclear and like spaghetti code to me; try optimizing the logic.
Hi Eugene,
re 1: Since I can compare the script's performance with and without those two procedures, I am sure that these are the bottlenecks.
re 2: In this script SetContext is used to get a symbol that contains all the bars in the time period. It has nothing to do with the creation of positions.
re 3: I think the logic itself won't help much. More important is: is there a way to have this run only once for every bar in the data source instead of 500x for every bar? And what do I have to change to get the same results?
3. If you made it iterating 500x per bar, it's likely there's a way to optimize it.
I just don't know how...
WL5 cycles through the data source and executes the script once for every symbol. How can I set it up so that a piece of code is executed only for the very first symbol? And how can the information in the lists be saved so that the remaining 499 runs have access to it?
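The "compute once, reuse for the other symbols" idea can be sketched in plain C#, with no Wealth-Lab calls. This is only an illustration under assumptions: `SharedCache`, `EnsureBuilt`, and `BuildCount` are hypothetical names, and how such an object would be kept alive between symbols inside a WL5 Strategy is not shown here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: holds data that is expensive to compute and is the
// same for every symbol in the run.
public class SharedCache
{
    private readonly Dictionary<DateTime, double> valueByDate =
        new Dictionary<DateTime, double>();
    private bool built;

    // Counts how often the expensive step actually ran (for illustration).
    public int BuildCount { get; private set; }

    // Runs the expensive computation only on the first call;
    // every later call returns immediately.
    public void EnsureBuilt(IList<DateTime> dates, Func<DateTime, double> compute)
    {
        if (built) return;
        foreach (DateTime d in dates)
            valueByDate[d] = compute(d);
        built = true;
        BuildCount++;
    }

    // Cheap lookup used by every symbol after the cache is filled.
    public double Get(DateTime date)
    {
        return valueByDate[date];
    }
}
```

Each of the 500 per-symbol runs would call `EnsureBuilt` first; only the first call pays for the computation, and the rest only perform dictionary lookups.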
That sounds easy. Will try it that way.
Okay, so I run the evolving watchlist and volumeBin code only once, for the first symbol.
How would you suggest storing that information?
I am thinking about setting up a dictionary that holds an info class for every date in the backtest.
This info class then stores a dictionary with symbol keys, recording whether the symbol was on the watchlist
on that date and which volume bin it belongs to.
Do you have better ideas?
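The layout described above could look roughly like the following plain C# sketch. All names (`SymInfo`, `DailyInfoTable`, `AddDate`, `Lookup`) are hypothetical, and the equal-size binning by volume rank is an assumption, since the original binning rule is not shown in the thread.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-symbol record for one date.
public class SymInfo
{
    public bool OnWatch;    // was the symbol on the watchlist that day?
    public int VolumeBin;   // 0 = lowest-volume bin
}

public class DailyInfoTable
{
    // date -> (symbol -> info)
    private readonly Dictionary<DateTime, Dictionary<string, SymInfo>> byDate =
        new Dictionary<DateTime, Dictionary<string, SymInfo>>();

    // Ranks the watchlist symbols of one date by volume (ascending) and
    // splits them into binCount bins of near-equal size.
    public void AddDate(DateTime date, IDictionary<string, double> volumeBySymbol, int binCount)
    {
        List<KeyValuePair<string, double>> ranked =
            new List<KeyValuePair<string, double>>(volumeBySymbol);
        ranked.Sort(delegate(KeyValuePair<string, double> a, KeyValuePair<string, double> b)
        {
            return a.Value.CompareTo(b.Value);
        });

        Dictionary<string, SymInfo> infos = new Dictionary<string, SymInfo>();
        for (int rank = 0; rank < ranked.Count; rank++)
        {
            SymInfo info = new SymInfo();
            info.OnWatch = true;
            info.VolumeBin = rank * binCount / ranked.Count; // integer division
            infos[ranked[rank].Key] = info;
        }
        byDate[date] = infos;
    }

    // Returns null when the symbol was not on the watchlist on that date.
    public SymInfo Lookup(DateTime date, string symbol)
    {
        Dictionary<string, SymInfo> infos;
        SymInfo info;
        if (byDate.TryGetValue(date, out infos) && infos.TryGetValue(symbol, out info))
            return info;
        return null;
    }
}
```

Keying the outer dictionary by date rather than by bar number keeps lookups valid for symbols whose histories have different lengths.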
If it works for you, it's fine with me.
I just thought that maybe you would have an idea of a better/more efficient solution before I start programming it :-)
I am no programmer. However, if your code does any charting, consider commenting it out when you are ready to run against larger watchlists. This should also help with those nasty out-of-memory crashes.
I've made a big step towards better optimization. However, I am running into a problem:
This is the beginning of the strategy:
CODE:
[code not available in this copy of the thread]
So, I initialize a dictionary at the beginning. Does running this script twice without closing the window
use the same dictionary both times? Is this possible? How can I clean this up, so that users don't have to close
the window?
This is a class level variable. When you first execute a Strategy by opening its window or recompiling ("Run the strategy"), avgValue is initialized as a new Dictionary. Subsequently clicking F5 or "Go" will make the code work with the existing Dictionary i.e. it won't cause initialization.
QUOTE:
I just thought that maybe you would have an idea of a better/more efficient solution before I start programming it :-)
I thought it's your work and your solution ;)
QUOTE:
... will make the code work with the existing Dictionary i.e. it won't cause initialization.
Is there any way to clear this Dictionary without recompiling?
By "at the end of the script", do you mean inside the Strategy class?
I tried it this way:
CODE:
[code not available in this copy of the thread]
error: 'avgValue' is a 'field' but is used like a 'type' (translated from German)
Inside the Execute() method, or a method called by Execute().
Running this in a multi-symbol backtest (MSB) clears the Dictionary after every symbol is executed.
But I need the information in the dictionary for the whole MSB. That's why I am filling the dictionary
on the first run, with the first symbol.
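One way to reconcile the two needs (fresh data on every run, but shared across all symbols within a run) is to clear and refill the dictionary only when the first symbol comes up. The sketch below is plain C# under assumptions: `StrategySketch` and its `Execute(currentSymbol, allSymbols)` signature are hypothetical stand-ins, not the Wealth-Lab method, and how a real Strategy learns the current symbol and the symbol list is left out.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for a strategy class. Execute here is NOT the Wealth-Lab method,
// just a plain method that receives what the sketch needs.
public class StrategySketch
{
    // Class-level field: created once per instance, so it survives
    // repeated runs of the same instance.
    private readonly Dictionary<string, double> avgValue =
        new Dictionary<string, double>();

    public int Count { get { return avgValue.Count; } }

    public void Execute(string currentSymbol, IList<string> allSymbols)
    {
        // Reset and refill only when the first symbol of the run comes up,
        // so the data stays available for the remaining symbols.
        if (currentSymbol == allSymbols[0])
        {
            avgValue.Clear();
            foreach (string s in allSymbols)
                avgValue[s] = 0.0; // placeholder for the real calculation
        }

        // ... per-symbol logic would read avgValue here ...
    }
}
```

With this guard, a second run on the same instance starts from an empty dictionary instead of stale entries, while symbols 2 to 500 of each run still see what the first symbol stored.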
See my reply dated 7/22/2010 4:56 AM for a pointer. Hint: read it to the end.
Do you mean this part?
CODE:
[code not available in this copy of the thread]
I am running into another problem here:
After I call SetContext, Bars.Count does not refer to the symbol set in SetContext. How can I get that information?
CODE:
[code not available in this copy of the thread]
Run it on a symbol whose Bars.Count differs from that of the symbol specified in SetContext, and check Bars.Count in the Debug Window.
I expected the Print to show Bars.Count for the symbol set in SetContext,
but it shows Bars.Count for the selected symbol.
This is synchronization in action (DataSeries > Accessing Secondary Symbols > Secondary Series Synchronization in the WealthScript Programming Guide). To disable:
CODE:
[code not available in this copy of the thread]
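To illustrate why a synchronized secondary series reports the primary symbol's bar count, here is a rough plain C# sketch of date alignment in general. It is not Wealth-Lab's actual algorithm: `SyncSketch.Synchronize` is a hypothetical name, and the forward-fill rule for missing dates is an assumption made for the example.

```csharp
using System;
using System.Collections.Generic;

public static class SyncSketch
{
    // Aligns a secondary (date -> value) series to the primary symbol's dates.
    // primaryDates must be in ascending order. Dates missing in the secondary
    // take the last known value; dates before the secondary's first bar get
    // double.NaN. Secondary bars on dates the primary lacks are dropped.
    public static List<double> Synchronize(IList<DateTime> primaryDates,
                                           SortedDictionary<DateTime, double> secondary)
    {
        List<double> aligned = new List<double>(primaryDates.Count);
        IEnumerator<KeyValuePair<DateTime, double>> e = secondary.GetEnumerator();
        bool hasNext = e.MoveNext();
        double last = double.NaN;

        foreach (DateTime d in primaryDates)
        {
            // Consume secondary bars up to and including date d.
            while (hasNext && e.Current.Key <= d)
            {
                last = e.Current.Value;
                hasNext = e.MoveNext();
            }
            aligned.Add(last);
        }
        return aligned;
    }
}
```

Whatever the secondary symbol's own history length, the aligned result always has exactly as many entries as the primary has dates, which matches the behaviour observed above.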
Hi Eugene,
With this code I am trying to fill a class SymInfo which contains symbol-specific information.
The plan was to run it on a symbol that was in the DataSet for the whole duration of the backtest, so that
I only need to run it once, on this symbol, to store the evolving watchlist information.
CODE:
[code not available in this copy of the thread]
Changing synch to false, however, results in Bars.Count being 524276 instead of the real 2542 bars for ALV GY in the data source.
Do you have any idea why Bars.Count is that high?
It must be finding an intraday version of the symbol.
1. Is "ALV GY " in the same Daily DataSet on which you're running the script?
2. Should there really be a whitespace after the Y in "ALV GY "? It's pretty strange to pad symbol names with invisible characters.
That was my thought, too. But there is no intraday version in WLD5.
1. Yes, it is in the same DataSet.
2. The whitespace is wrong. Thanks for spotting it.
Re-running the script: Bars.Count is correct now, so it was the whitespace that caused the problem. However, the only symbol in my datasets that has intraday bars is FDAX.
Actually, SetContext is expected to produce a "Could not load data for symbol" message in a case like this. Question is, what Bars object did it find and load then?
The symbol with the most bars is @FDAX=102XN. But that has "only" about 320,000 bars.
Sorry, I am still having errors with this one. This time it is an ArgumentOutOfRangeException.
CODE:
[code not available in this copy of the thread]
This line throws the exception.
QUOTE:
symbolList sl = ih.WatchList(bar);
What could be the cause? There is a file for this date in the folder. I just don't know what is wrong here or where I have to look.
I don't know what could be the cause, but the easiest way to troubleshoot such errors is to load Visual Studio/SharpDevelop and use the debugger (set a breakpoint, use the F11 key to step into etc.)
All these issues were about synchronization. The error above occurred because I initialized ih on the Bars object of the first symbol, which happened to have only 100 bars or so.
For anyone interested, here is the solution. This works great for large datasets:
CODE:
[code not available in this copy of the thread]
The symbol specified in SetContext should be the symbol that has the most bars in the datasource.
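Picking that symbol does not have to be done by hand. A minimal plain C# sketch, assuming the bar counts per symbol have already been collected into a dictionary (the helper name `LongestHistory.SymbolWithMostBars` is hypothetical, and collecting the counts from Wealth-Lab is not shown):

```csharp
using System;
using System.Collections.Generic;

public static class LongestHistory
{
    // Returns the symbol with the highest bar count, or null for an empty
    // input. On a tie, the symbol enumerated first wins.
    public static string SymbolWithMostBars(IDictionary<string, int> barCountBySymbol)
    {
        string best = null;
        int bestCount = -1;
        foreach (KeyValuePair<string, int> kv in barCountBySymbol)
        {
            if (kv.Value > bestCount)
            {
                best = kv.Key;
                bestCount = kv.Value;
            }
        }
        return best;
    }
}
```

Using the longest history as the reference avoids the out-of-range error described earlier, where the helper object was sized from a symbol with only about 100 bars.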
Edit:
Just wanted to be a bit more specific on performance: the script now runs in about an hour instead of over a week!