Nice to see your interest.
If I understand the logic correctly, SMA computation can be done fast enough synchronously, without any CPU context switching. I don't have deep enough knowledge of it, and my tests with parallelizing the SMA didn't give anything either.
Maybe the data array size needs to be compared against the CPU cache size, and only when the array is bigger will parallelizing in .NET give something. How big should it be? Good question. It will be interesting to compare against CUDA results (more about CUDA later).
I am not a professional programmer, and I spent about 10 years away from quantitative finance. Three years ago I returned to my old interests as a hobby, while I was in hospital with a broken leg)).
So please don't criticize me too much. All I know now is what I can remember from Turbo Pascal and C, plus two years of learning C#.
Here is my code, which gave me my best improvement: 9% (usually 4-5%) on 3,800,000 ticks. No parallelizing - just precise coding).
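In outline it is a rolling-sum SMA: one add and one subtract per bar, a single allocation before the loop, and nothing for the GC to collect mid-computation. A sketch of that shape (the names and exact structure here are illustrative):

CODE:
public sealed class Sma
{
    // Rolling-sum SMA: one add and one subtract per bar,
    // one allocation up front, no garbage created inside the loop.
    public static double[] Compute(double[] price, int period)
    {
        int n = price.Length;             // bound hoisted into a local
        double[] result = new double[n];  // the only allocation
        double sum = 0.0;
        for (int i = 0; i < n; i++)
        {
            sum += price[i];
            if (i >= period)
                sum -= price[i - period];
            result[i] = i >= period - 1 ? sum / period : double.NaN;
        }
        return result;
    }
}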
I have a strong opinion that C# can give superb results and that C++ is not needed. The main idea is to not let the garbage collector work during the computation, and to follow the known recommendations: don't forget to mark the working class sealed, use structs, and use local block variables that can be kept in the CPU's arithmetic registers (on x64 there are sixteen general-purpose ones: RAX, RBX, RCX, RDX, RSI, RDI, RBP, RSP and R8-R15).
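In practice those recommendations look something like this (a sketch - Bar and Indicator are illustrative names, not from my program):

CODE:
// A struct bar lives inline in the array - no per-object heap allocations;
// sealing the working class lets the JIT devirtualize its calls.
public struct Bar
{
    public double Open, High, Low, Close;
}

public sealed class Indicator
{
    public double Range(Bar[] bars)
    {
        double max = double.MinValue, min = double.MaxValue;  // locals: register candidates
        int n = bars.Length;
        for (int i = 0; i < n; i++)
        {
            if (bars[i].High > max) max = bars[i].High;
            if (bars[i].Low < min) min = bars[i].Low;
        }
        return max - min;  // no allocations anywhere, so the GC stays idle
    }
}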
Some recommendations work better on x64. There are many benchmarks where a safe C# build compiled for x64, under the same conditions and with similar code, gives 3-5% better results than unsafe C#, C++ or C.
So I read those tests carefully and implemented their advice.
For example, what surprised me a lot: const int i is better than a plain int i.
In for (int j = 0; j < i; j++) this alone can give about 3-5% of the speed difference between C# and C++!
I don't know why it gives nothing in C++.
Following the same logic, ds.Count in the loop condition is bad style. But in my example some of the improvement came from avoiding certain computation branches, and maybe from avoiding excessive CPU context switching. Two more years and I'll know it).
I think another 2-4% of time can be saved by replacing ds.Count with int num = ds.Count; before the loop, so that num is used in the for condition.
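Put together, the two tricks look like this (a sketch; ds stands in for whatever data-series collection is used):

CODE:
using System.Collections.Generic;

public sealed class LoopDemo
{
    public static double Sum(List<double> ds)
    {
        // The bound is read once and lives in a local that can sit in a register;
        // writing j < ds.Count would evaluate the property on every iteration.
        int num = ds.Count;
        // If the bound is known at compile time, a const int
        // (const int num = 4096;) is reportedly better still - the const-int trick above.
        double sum = 0.0;
        for (int j = 0; j < num; j++)
            sum += ds[j];
        return sum;
    }
}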
In one thread about C# code optimization I saw an interesting idea about loop coding, but I didn't understand what it meant and couldn't reproduce it. The idea was the following:
QUOTE:
for (int i = bigperiod; i > 0; )
{
...a*b*c * matrix[i--];
}
The author called it C# loop optimization. Maybe I missed something - in my case I got a frozen interface.
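My best guess now is that it is the usual count-down idiom - compare against zero and fold the decrement into the index expression - something like this (a sketch, with a, b, c and matrix declared just for illustration):

CODE:
double a = 1.1, b = 0.9, c = 1.05;   // illustrative factors, as in the quote
double[] matrix = new double[1000];
double sum = 0.0;
// Count down toward zero; note the start at Length - 1, so index 0 is
// included and matrix[matrix.Length] is never touched.
for (int i = matrix.Length - 1; i >= 0; )
    sum += a * b * c * matrix[i--];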
As you can see in my SMA example, it is a little more coding, but it works a little better.
As for StdDev and the others - I'm not kidding. I tried to compare my results to AmiBroker - it runs fast enough, and my results are similar. I'm not sure how they measure it - across all strategies, or whether the final figure is the computation time of each strategy. I mean, each strategy there can show its computation time and its rendering time. Every result looks good, but as you know, we can't trust it 100% if we don't see how fairly it is measured and computed.
To be honest, I don't like AmiBroker - I just used my old 5.4 version to check.
Another test I tried two years ago was an optimization of two strategies with many StdDev calls and a price/StdDev normalization. The AmiBroker interface froze for 10-15 minutes and sometimes I couldn't tell what was going on. A single simple computation does seem to run fast enough there. And in general, their optimization interface and internal language are not as comfortable as WealthLab and Visual Studio.
When I optimized the same ideas under WealthLab, it looked better with 8-20 strategies under the genetic algorithm (GA).
The series-computation algorithms in AmiBroker seem fast enough, but as for multithreading, I don't think they fit the Windows threading model well.
They say the new AmiBroker version can use up to 32 CPU cores - but then why was it freezing under the same optimization conditions on 8 cores?
WealthLab looks much more Windows-friendly - natively born).
What I understand now: neither AmiBroker nor WealthLab follows the WPF model! Let's say WealthLab is much closer - and COM-WPF interoperation would explain the black holes in its windows.
I have implemented some ideas following the WPF model's guidance and feel very happy.
What I mean is: graphics as fast as it can be done under WPF. I have tested 5 ms updates in 40 windows using the Visual class - I would say perfect.
No freezing, no big CPU load. Sure, there are some additional tricks - but all on the main thread. A BackgroundWorker (or parallelized computation) with rendering on the "computed" event works great. No freezing at all; WealthLab freezes sometimes. I built it this way because I need to trade on high-frequency intraday data and the order book, and I am trying to use machine learning. Everything seems possible. Even if I switch the graphics off and install the program on a server near the exchange, the C# code will work great.
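The pattern is simple (a sketch - the window, method and series names are illustrative): compute on the worker thread, and touch WPF objects only in RunWorkerCompleted, which is raised back on the UI thread:

CODE:
using System.ComponentModel;
using System.Windows;

public class ChartWindow : Window
{
    private readonly BackgroundWorker worker = new BackgroundWorker();

    public ChartWindow()
    {
        worker.DoWork += (s, e) =>
        {
            // Thread-pool thread: heavy math only, no WPF objects here.
            e.Result = ComputeSeries((double[])e.Argument);
        };
        worker.RunWorkerCompleted += (s, e) =>
        {
            // Back on the UI thread: safe to render.
            Render((double[])e.Result);
        };
    }

    public void Recalculate(double[] prices)
    {
        if (!worker.IsBusy)
            worker.RunWorkerAsync(prices);   // returns immediately, the UI never blocks
    }

    private static double[] ComputeSeries(double[] prices) { /* SMA, StdDev, ... */ return prices; }

    private void Render(double[] series) { /* update DrawingVisuals / chart geometry */ }
}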
And the last additional trick is CUDA, via CUDAfy - a free C# library.
The result is impressive, but you have to find the balance, because PLINQ and parallelizing under .NET are fast enough on their own; the difference begins after roughly 100,000 values. There are some articles on CodeProject with computational examples. In one article by the CUDAfy developer, an NVIDIA 540 (in an $800 Acer notebook) gave a result 300x better than PLINQ on geo tasks. But with arrays below 50,000 elements CUDA programming won't give you anything.
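For scale: the PLINQ side is a one-liner, while the CUDA side needs a kernel plus two copies across the bus - which is exactly why small arrays don't pay off. This is a sketch from memory of the CodeProject CUDAfy samples (the Square example, names and launch sizes are all illustrative):

CODE:
using System;
using System.Linq;
using Cudafy;
using Cudafy.Host;
using Cudafy.Translator;

public static class SquareDemo
{
    // CUDAfy translates this C# method into a CUDA kernel.
    [Cudafy]
    public static void Square(GThread thread, float[] input, float[] output, int n)
    {
        int i = thread.blockIdx.x * thread.blockDim.x + thread.threadIdx.x;
        if (i < n)
            output[i] = input[i] * input[i];
    }

    public static void Main()
    {
        float[] data = Enumerable.Range(0, 1000000).Select(x => (float)x).ToArray();

        // PLINQ version - usually wins below the ~50,000-100,000 element mark.
        float[] cpu = data.AsParallel().AsOrdered().Select(x => x * x).ToArray();

        // CUDAfy version - pays off on big arrays, once the copy cost amortizes.
        CudafyModule km = CudafyTranslator.Cudafy();
        GPGPU gpu = CudafyHost.GetDevice(eGPUType.Cuda, 0);
        gpu.LoadModule(km);

        float[] devIn = gpu.Allocate<float>(data);
        float[] devOut = gpu.Allocate<float>(data.Length);
        gpu.CopyToDevice(data, devIn);
        gpu.Launch(data.Length / 256 + 1, 256).Square(devIn, devOut, data.Length);

        float[] fromGpu = new float[data.Length];
        gpu.CopyFromDevice(devOut, fromGpu);
        gpu.FreeAll();
    }
}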
I studied it to understand how to better plan the future work on my program. A new NVIDIA adapter, even in a notebook, can give results much better than 10 Xeons or any other well-marketed product. A Quadro or Tesla plus a single i7 - and nothing more.
The latest NVIDIA hardware and software (unfortunately only a release candidate) support shared (unified) memory addressing.
If I understand it right, that means unified operators and no big difficulty transferring data arrays between the different memory types - maybe the programming interface will even become the same for the computer's RAM and the NVIDIA card's memory. There are C++ examples, but my card doesn't support this mode. Some video chips with this capability are already available. CUDAfy, which provides the C# interface, doesn't support it yet, but I think in half a year everything will be tested.
As for the database: if I understand right, a file-based database is much faster than any SQL engine, etc.
So simple, well-organized .CSV files are much faster. I haven't tested SQL programming personally, but I had that experience with a Moscow data provider - they supplied a static provider with the data stored in SQL. I didn't like the speed, so I don't even want to experiment with it.
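Reading ticks from a well-organized CSV is trivial and streams straight through the OS file cache - a sketch, with an illustrative column layout (timestamp;price;volume):

CODE:
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

public struct Tick
{
    public DateTime Time;
    public double Price;
    public double Volume;
}

public static class TickFile
{
    // Streams one tick per line, e.g. "20140321 10:00:01.250;1.3812;100000",
    // without ever loading the whole file into memory.
    public static IEnumerable<Tick> Read(string path)
    {
        foreach (string line in File.ReadLines(path))
        {
            string[] f = line.Split(';');
            yield return new Tick
            {
                Time = DateTime.ParseExact(f[0], "yyyyMMdd HH:mm:ss.fff", CultureInfo.InvariantCulture),
                Price = double.Parse(f[1], CultureInfo.InvariantCulture),
                Volume = double.Parse(f[2], CultureInfo.InvariantCulture)
            };
        }
    }
}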
So it seems that following the WPF programming model is comfortable enough. It is not even necessary to run every window on its own thread: the UI thread is enough, in combination with Parallel.For and PLINQ where needed, plus CUDA on top when huge arrays have to be tested.
Right now I am finishing the data-providing part, and I hope to finish a beta assembly in about two weeks.
If you are interested in testing, I can send you a copy.