Can anyone recommend some compact code to automatically edit out data spikes and clean a data set?
Sounds like a promising idea for inclusion in our Data Tool...
How do you define Data Spikes?
I guess that is part of the problem.
A simple-minded way is to take some RMS measure of, say, the High-Low range (H-L), and if a bar is more than n standard deviations away, reset it to the previous bar. There are lots of ideas that can be borrowed. My background is in geophysics, where we wrestle with noisy seismic data, and I'm sure many of the same ideas can be adapted.
I thought maybe somebody had already done the homework, but if not, I'll go to the geophysical literature and gin something up.
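For anyone following along, the standard-deviation rule described above could be sketched roughly like this in Python. This is only an illustration, not tested on real data: the function name `reset_range_spikes`, the bar layout (open, high, low, close) and the default `n=3.0` are all assumptions, and it uses the plain mean and standard deviation of H-L rather than a true RMS.

```python
from statistics import mean, pstdev


def reset_range_spikes(bars, n=3.0):
    """Return a copy of bars with range spikes reset to the previous bar.

    bars is a list of (open, high, low, close) tuples (assumed layout).
    A bar is treated as a spike when its High-Low range exceeds the mean
    range by more than n standard deviations (hypothetical rule).
    """
    ranges = [high - low for _, high, low, _ in bars]
    mu = mean(ranges)
    sigma = pstdev(ranges)

    cleaned = []
    for i, bar in enumerate(bars):
        is_spike = sigma > 0 and (ranges[i] - mu) > n * sigma
        if is_spike and i > 0:
            # Reset to the previous (already cleaned) bar.
            cleaned.append(cleaned[-1])
        else:
            # The first bar has no predecessor, so it is always kept.
            cleaned.append(bar)
    return cleaned
```

One caveat: a big spike inflates the standard deviation it is measured against, so on short data sets n has to be fairly small before anything is flagged. That is one reason median-based editors are popular.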
Thanks.
Standard deviation is a good idea. Meanwhile, what do you think of an analogue of the WL Bad Tick filter?
I do not want to put a lot of work into this. I just want something to get rid of the worst offenders.
It seems that Close is reliable even when High and Low are not, so this scheme assumes the Close is still a good number. It is based on the median, which many spike editors seem to use. The variables ef and mw can be set with a little work reviewing the data set and trying a few numbers. I chose ef and mw based on symbol EEM 5m. Who knows what you should replace a bad H-L with, but being partial to Fibonacci, I used 1.382 times the median High-Low value.
I'm attaching the usual disclaimer: know your data set.
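For readers who cannot see the attached code, here is a rough Python sketch of a median-based editor along the lines described. It is not the original code, and several details are guesses: `ef` is read as an edit factor (how many times the median range counts as a spike), `mw` as the median window length in bars, the defaults are placeholders rather than the values chosen for EEM 5m, and the replacement range is simply centered on the Close.

```python
from statistics import median


def edit_range_spikes(highs, lows, closes, ef=5.0, mw=21):
    """Return (new_highs, new_lows) with spiky High-Low ranges replaced.

    ef: assumed meaning, a bar is a spike when H-L > ef * median(H-L).
    mw: assumed meaning, length of the trailing median window in bars.
    Closes are trusted and never modified.
    """
    new_highs, new_lows = list(highs), list(lows)
    ranges = [h - l for h, l in zip(highs, lows)]

    for i in range(len(ranges)):
        # Trailing window of raw ranges, including the current bar.
        start = max(0, i - mw + 1)
        med = median(ranges[start:i + 1])
        if med > 0 and ranges[i] > ef * med:
            # Replace the bad range with 1.382 * median range,
            # centered on the Close (centering is an assumption).
            half = 1.382 * med / 2.0
            new_highs[i] = closes[i] + half
            new_lows[i] = closes[i] - half
    return new_highs, new_lows
```

Because the median barely moves when one bar in the window is wild, a single spike does not mask itself the way it can with a standard-deviation test. Try a few values of ef and mw against your own data before trusting the result.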
Due to subjectivity and other concerns, I decided to avoid automatic spike corrections. Nonetheless, the upcoming release of our Data Tool will include a data check feature with spike detection.