"x-y" Linear Regression?

Author: Carova

Creation Date: 5/17/2019 10:24 AM

Hi Eugene!

The current linear regression indicator displays the results against time. How do I go about performing an "x-y" regression, i.e. a regression of a series against a series? Thanks!

Vince

Hi Eugene!

I know it is built into .NET, but I am at a loss as to how to access it to make a WL indicator from it. :( (profound C# ignorance)

Vince

Are you sure you're not confusing .NET itself with an external library like ALGLIB or Math.NET? Speaking of built-in things, to my knowledge the Microsoft Chart control (MSChart) supports various regressions (linear, log, polynomial, etc.) of a time series. MS123 Visualizers uses it, for instance. If you have a specific example of what you're trying to accomplish, don't hesitate to point me to it.

I use Math.NET all the time, and they do have some nice regression fitting routines. Check out their examples: https://numerics.mathdotnet.com/Regression.html

Now nothing in Math.NET does plotting, only curve fitting. If you need to plot (Do you?), and you're writing a Performance Visualizer (Are you?), then I would check out: https://code.msdn.microsoft.com/mschart

If you need a plot and you're not writing a Performance Visualizer, then I would use either Excel or R. Both interface with any .NET application like Wealth-Lab. Excel might be easier to get started with, but if you already have R installed, then I would use that instead. And R does regression fitting too, which is a good option if you are __not__ wanting to do fitting as part of your strategy code. (Personal note: The regression fitting I do __is__ part of my strategy execution, so I use Math.NET exclusively for that.)

I can post my WL library routine for fitting y = beta00*1/x + beta0 + beta1*x with Math.NET if that helps, which covers raising x to powers of -1, 0, and 1 as you can see from the equation. I skip including the x² term because that's too sensitive to outlier behavior. You "might" include a (ln x) or exp(x) term in there, but I wouldn't put too many degrees of freedom in your regression model because that would overfit it for a fuzzy system problem.

---

I just had a wild and crazy idea. If it's possible to define a 2D double array, xyScatterPlot, and store it in the WL global cache, then one may be able to write a general-purpose Performance Visualizer to plot some arbitrary 2D thing easily. The Performance Visualizer should probably purge the xyScatterPlot array after plotting so it's not stuck in the cache. There might be some conflicts if two separate strategy instances try writing into the same global array simultaneously. Hmm, that's a downside of employing the WL global cache. Well, you could define your own protected strategy 2D array in the __constructor__ of the strategy that wants to make the xyScatterPlot. But then that strategy instance will have to exist when the Performance Visualizer tries to plot it. Is that a problem?

Thanks Eugene! Yes, I did mean Math.NET. My error!

Hi superticker!

Yes, I had seen that but my limited C# skills prevented me from even beginning to understand an approach to using it. I was attempting to use it in a script, not a Visualizer, which made the task potentially easier, but still beyond my abilities.

I did locate C# code for linear regression (https://gist.github.com/NikolayIT/d86118a3a0cb3f5ed63d674a350d75f2) and attempted to use it to construct a WL Indicator, but I am at a loss for a number of items which are creating errors.

Help Eugene!

Vince

The original code doesn't convince me that you're not trying to reinvent the wheel i.e. standard LinReg indicator:

https://gist.github.com/NikolayIT/d86118a3a0cb3f5ed63d674a350d75f2

At any rate, Wealth-Lab works with DataSeries, which by definition is the basic data structure that represents a historical series (i.e. a List<double> with an associated List<DateTime>).

Vince, please count me out for this topic.

Well, the code you have will probably work okay. Did you need an LR decomposition (your above code)? Are you fitting many x-arrays against a single y-array? If so, then an LR decomposition would be faster. I kind of thought you just wanted to make a single plot, not an array of plots for many different x-arrays. Math.NET supports LR decomposition setups too.

If all you want to do is fit a "single" line, then the *Simple Linear Regression* example at https://numerics.mathdotnet.com/Regression.html will do that; that example is included below. When you run this code, the Execute() statement will loop for every stock in your dataset. So just run it with a __single__ stock, not the entire dataset.

You'll need to copy the MathNet.Numerics.dll library into Wealth-Lab's install directory before you can run this. Hmm; there's a MathNet.Numerics.xml file in there too. I'm not sure if you need that one, but you can include it just to be safe.

There is a Community.Components function to convert a DataSeries to an array if you're interested, but I haven't used it: https://www.wealth-lab.com/Forum/Posts/Convert-DataSeries-to-C-Array-38617

None of this will plot, of course. That's another problem. For off-line work, I just plot with Excel. The WL discussions talk about multiple methods for getting WL data into Excel. Excel can fit regression models too. Go to the Data >> Data Analysis menu and select "Regression". You may have to enable Excel's Analysis ToolPak add-in if you haven't already done so.

Hi Eugene! The WL LinReg indicator does create an x-y indicator where "x"=time. I want the "x" to be some other data series.

Thanks superticker!

How do I convert the code you provided so that it creates an "indicator" of length "l", i.e. it does the fit over a period of "l" bars?

Vince

QUOTE:

How do I convert the code you provided so that it creates an "indicator" of length "l", i.e. it does the fit over a period of "l" bars?

Is the x-array going to be time (in bars)? If so, why don't you just use the WL LinearRegSlope.Value function (http://www2.wealth-lab.com/WL5Wiki/LinearRegSlope.ashx) to get a slope for a given time-period window? You don't need a general xy-plotting routine for this.

I'm not sure where you are going with this. Would you be wanting to cross correlate one time series (say an index) with another (say a stock)? See cross correlation (https://en.wikipedia.org/wiki/Cross-correlation). Some indexes would correlate better with different stocks. If I remember right, I "think" WL does have a correlation visualizer for doing that already, so you

Now if you're looking for

So please describe the goals of your xy-plot? What's on the x-axis, and what's on the y-axis? And what's the overall purpose?

Hi superticker!

I am attempting to get the slope of the regression of two series (an x and a y). For the specific case where x=time we have LinearRegSlope, but for the general case where x != time there is no equivalent. That is what I am trying to create with this indicator. Is it clearer now?

Vince
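
One way to read this request is a rolling least-squares fit: at each bar, regress the last `period` values of y on the matching values of x and record the slope, which gives an x-y slope as a function of time. The following is only a minimal sketch in plain C#, assuming both series are already aligned bar-for-bar and held as double[]. The class and method names are hypothetical and are not part of Wealth-Lab or Math.NET.

```csharp
using System;

public static class RollingXYSlope
{
    // Rolling ordinary least-squares slope of y on x over the last `period` bars.
    // Bars without a full window, or whose window has constant x, are set to NaN.
    public static double[] Compute(double[] x, double[] y, int period)
    {
        if (x.Length != y.Length)
            throw new ArgumentException("x and y must have the same length");
        if (period < 2)
            throw new ArgumentException("period must be at least 2");

        var slope = new double[x.Length];
        for (int bar = 0; bar < x.Length; bar++)
        {
            slope[bar] = double.NaN;
            if (bar < period - 1) continue;   // not enough history yet

            int first = bar - period + 1;

            // Window means of x and y.
            double meanX = 0, meanY = 0;
            for (int i = first; i <= bar; i++) { meanX += x[i]; meanY += y[i]; }
            meanX /= period;
            meanY /= period;

            // slope = sum((x - meanX) * (y - meanY)) / sum((x - meanX)^2)
            double sxy = 0, sxx = 0;
            for (int i = first; i <= bar; i++)
            {
                double dx = x[i] - meanX;
                sxy += dx * (y[i] - meanY);
                sxx += dx * dx;
            }
            if (sxx > 0) slope[bar] = sxy / sxx;
        }
        return slope;
    }
}
```

In a Wealth-Lab script the resulting array would still have to be copied into a DataSeries before it could be plotted or used like an indicator; that wiring is left out here.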

Well then, for x != time, my posted solution (Post# 8) should work for you. You just need to convert your x and y arrays to double[] before calling Fit.Line(). And as mentioned in Post# 8, there's a Community.Components *To.Array* call if you need to convert from DataSeries to double[]. Alternatively, you could just use a for loop (which the compiler can probably optimize better):
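
This is not the code from the original post, only a minimal sketch of the for-loop conversion it describes. An `IList<double>` stands in for a DataSeries, and the closed-form `FitLine` is a placeholder for the `Fit.Line()` call so that the sketch runs without Math.NET. All names below are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class LineFitSketch
{
    // Copy an indexable series into a plain double[] with a for loop.
    public static double[] ToArray(IList<double> series)
    {
        var result = new double[series.Count];
        for (int bar = 0; bar < series.Count; bar++)
            result[bar] = series[bar];
        return result;
    }

    // Ordinary least-squares line: y = intercept + slope * x.
    public static void FitLine(double[] x, double[] y, out double intercept, out double slope)
    {
        if (x.Length != y.Length || x.Length < 2)
            throw new ArgumentException("x and y need the same length, at least 2 points");

        int n = x.Length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) { meanX += x[i]; meanY += y[i]; }
        meanX /= n;
        meanY /= n;

        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++)
        {
            double dx = x[i] - meanX;
            sxy += dx * (y[i] - meanY);
            sxx += dx * dx;
        }
        if (sxx == 0)
            throw new ArgumentException("x must not be constant");

        slope = sxy / sxx;
        intercept = meanY - slope * meanX;
    }
}
```

With x = {10, 20, 30} and y = {15, 20, 25}, this gives an intercept of 10 and a slope of 0.5.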

I can't get more specific than that unless you can offer an example where x != time, because the implementation and regression equation is likely to vary for different cases. Also, if x != time, then you probably want to create a Performance Visualizer, __not__ an indicator (which is based on a __time dependent__ DataSeries).

One implementation comment. Within a WL strategy, I try to use only WL compatible data types. So I'm reluctant to place double[] data types within a WL strategy. What you can do is create a personal code library with Visual Studio and place your double[] types in there instead. Then when you call your personal routines from your strategy, the only arrays present in your strategy will be of type DataSeries. But I would get your regression code debugged in the WL editor first, then move it into a personal *.DLL library.

My strategies are about 450 lines, but my personal libraries are 6 times that size in total. I never call external packages (like Math.NET) from within a strategy because they have WL incompatible data types.

Hi superticker!

I am looking to use the xyLR indicator to track two closely related items (Unleaded Gasoline and Crude Oil Futures) for short to intermediate term hedge trades. I examined price ratios, but they are not too good for this purpose, so I am interested in exploring the slopes of a variety of time periods to see if that works better. I believe that this approach might work well for pairs-trading a number of highly correlated trading vehicles where price ratios are not suitable.

Vince

@Vince,

What's wrong with using Correlation of the ROC of each one of your closely related items, for example?

FYI:

The XML file is not required (I don't think it'd help Wealth-Lab Editor's Autocomplete much anyway), but make sure to uncheck "Downloaded from the internet" in the file's properties before copying or it will not work (Wealth-Lab will act as if it's not there)!

Here's how: How to unblock files downloaded from Internet in Windows 10

Hi Eugene!

Looked at that approach, but since the two instruments are so highly correlated (>0.98) that was not useful.

Thanks! I am still trying to figure out how to get this into indicator form. :(

Vince

QUOTE:

I am looking ... to track two closely related items (Unleaded Gasoline and Crude Oil Futures) for short to intermediate term hedge trades.

Okay. So why not decorrelate one with the other? Or am I missing something?

So if the decorrelatedLine has a

There are an unlimited number of ways to decorrelate something. I don't mean to suggest ROC is the only approach to decorrelation. For example, if you redesign VWAP so it can operate over Daily bars (the current WL version can't do that), you can decorrelate with that instead.

An echo cancellation filter is an example of decorrelation with a "lag time". Engineers use this type of filter in analog landline phone calls so it's possible to carry your two-way call with two wires rather than three; otherwise, one direction would confound the other with an echo.

---

QUOTE:

I am still trying to figure out how to get this into indicator form. :(

Just remember an indicator is for a time varying transform. If the transform isn't a function of time (not x == time), then don't make it an indicator. A DataSeries (which indicators create) is always a function of time.

Hi superticker!

I have tried a number of different formulations attempting to tease out an effective way to separate out the "indicator" that might work. This has included your approach of decorrelation. In all of the cases the trading noise leads to many whipsaws. This was why I am exploring alternatives.

QUOTE:

Just remember an indicator is for a time varying transform. If the transform isn't a function of time (not x == time), then don't make it an indicator. A DataSeries (which indicators create) is always a function of time.

What my indicator would be is the slope as a f(time).

Vince

QUOTE:

In all of the cases the trading noise leads to many whipsaws.

You can take the decorrelatedLine output and run it through some kind of EMA. I would pick one that's adaptive. WL has several. Exactly, what are the ticker symbols you are trying to decorrelate? I would like to look at them myself. You may be looking for an inverse correlation between these pairs that's not really there in the first place.

QUOTE:

What my indicator would be is the slope as a f(time).

So you should be able to write an indicator that produces a DataSeries for what you want to do. Rather than using the ROC(Close,1,"...") in the decorrelation calculation, you could substitute the LinearRegSlope(Close,5,"...") instead. I don't especially like that solution because you should be doing the decorrelation with as

At any rate, I think employing LinearRegSlope(Close,,"...") in your indicator somehow is the way to go for finding a slope. You only need to use the Math.NET Fit.Line() call if x != time, which is

QUOTE:

You can take the decorrelatedLine output and run it through some kind of EMA. I would pick one that's adaptive. WL has several. Exactly, what are the ticker symbols you are trying to decorrelate? I would like to look at them myself.

What is your data provider? The reason I ask is that different providers use slightly different "nomenclature" for the Futures symbols than what the NYMEX uses. Here are charts for the current contract of Unleaded Gasoline (https://www.barchart.com/futures/quotes/RB*0/technical-chart) and WTI Crude Oil (https://www.barchart.com/futures/quotes/CLN19/technical-chart). Perhaps your data provider uses these symbols in some fashion to construct a back-adjusted continuous contract.

QUOTE:

What my indicator would be is the slope as a f(time).

I probably should have said "x-y slope as a function of time".

Vince

QUOTE:

I probably should have said "x-y slope as a function of time".

So think about inserting LinearRegSlope() in your code so it can return that x-y slope behavior over time--which is what it's designed to do. Now you're left with forming the behavior (or equation) of the x-y instrument pairs over time so you can pass that relationship (equation) into LinearRegSlope().

Fidelity is my only data provider. If Fidelity doesn't list it, I don't buy it.

QUOTE:

So think about inserting LinearRegSlope() in your code so it can return that x-y slope behavior over time--which is what it's design to do. Now you're left with forming the behavior (or equation) of the x-y instrument pairs over time so you can pass that relationship (equation) into LinearRegSlope().

That was one of my early attempts at addressing the issue. Way too much lag, which resulted in very poor performance.

Vince

PS. That is because you need to normalize the slope by dividing with a smoothed price series

QUOTE:

That was one of my early attempts at addressing the issue. Way too much lag, which resulted in very poor performance.

If there is a time lag between the two instruments, then you need the cross correlation function (see Post# 10) to determine exactly what that lag is so you can time shift one series relative to the other first

You can simply line the two time series up manually for now and add the cross correlation

Fidelity can't resolve ticker CLN19 or RBN19, so I can't run them on WL. Are there Fidelity symbol equivalents for these two instruments?

I'm just casually looking at the 6-month plots of these two instruments now (Daily bars), and I don't see that they are inversely correlated for arbitrage-pair trading. If anything, they look highly correlated to me. Am I on the wrong time scale?

Or perhaps I'm not understanding the goals here. The idea is to sell one to buy the other--right--because one goes down when the other goes up on a periodic basis? Is this period expected to cycle over days or minutes?
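
For the time-lag question raised in this post, one way to estimate a lag is to scan the Pearson correlation between the two series over a range of shifts and keep the shift with the highest value. The following is a minimal sketch of that idea in plain C#, assuming two bar-aligned double[] series. The names are hypothetical and nothing here comes from Wealth-Lab or Math.NET.

```csharp
using System;

public static class LagSketch
{
    // Pearson correlation between x[t] and y[t + lag] over the overlapping bars.
    public static double CorrelationAtLag(double[] x, double[] y, int lag)
    {
        int start = Math.Max(0, -lag);
        int end = Math.Min(x.Length, y.Length - lag);   // exclusive upper bound
        int n = end - start;
        if (n < 2) return double.NaN;

        double meanX = 0, meanY = 0;
        for (int t = start; t < end; t++) { meanX += x[t]; meanY += y[t + lag]; }
        meanX /= n;
        meanY /= n;

        double sxy = 0, sxx = 0, syy = 0;
        for (int t = start; t < end; t++)
        {
            double dx = x[t] - meanX;
            double dy = y[t + lag] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        if (sxx == 0 || syy == 0) return double.NaN;
        return sxy / Math.Sqrt(sxx * syy);
    }

    // Lag in [-maxLag, maxLag] with the highest correlation.
    // A positive result means y trails x by that many bars.
    public static int BestLag(double[] x, double[] y, int maxLag)
    {
        int best = 0;
        double bestCorr = double.NegativeInfinity;
        for (int lag = -maxLag; lag <= maxLag; lag++)
        {
            double c = CorrelationAtLag(x, y, lag);
            if (!double.IsNaN(c) && c > bestCorr) { bestCorr = c; best = lag; }
        }
        return best;
    }
}
```

Once a lag is found, one series can be shifted by that many bars before any regression or decorrelation step.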

@superticker

These are Futures contracts (energies).

superticker,

There is no time lag between the instruments.

It is a fully-hedged pairs trade - buy the stronger, sell the weaker. Think of it as a mean-reversion trading strategy, or a swing trade for equities. The period can be as short as a few days or as long as a couple or three weeks. But you need to get in at the right time or the profit is lost.

Vince

I don't know how to trade these two instruments. And I never studied futures. I can't help you. Perhaps someone who understands futures knows how to do this. I'm out of my area here.

From Post# 70 of https://www.wealth-lab.com/Forum/Posts/The-future-of-WLD-40455/Page/1#213037

QUOTE:

2) The WL Standard Indicator 'RSquared': Definition - "R-squared explains to what extent the variance of one variable explains the variance of the second variable." As currently implemented it only plots using one series whereas it ought to plot one series (dependent variable) against a second one (independent variable).

Wealth-Lab's RSquared indicator is fitting a simple (1st-degree) linear regression vs time plot to a DataSeries, which works as expected.

What you're asking about is X-Y regression, which is an entirely different thing. See my Math.NET solution in Post# 8 for that. It also includes a line to compute the R-squared of the fit as well.

Moreover, with Math.NET's ... method (not shown), one can fit a regression model of any number of terms and any number of variables. And it works great.

Plotting this is a separate problem. You need to check out https://docs.microsoft.com/en-us/dotnet/api/system.windows.forms.datavisualization.charting for that.
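
On the R-squared of an x-y fit: for a straight-line fit with an intercept, R-squared equals the squared Pearson correlation between the two series, so it can be computed directly from the data. The following is a minimal sketch under that assumption, with hypothetical names. It is not the line from Post# 8, and Math.NET's own goodness-of-fit routines could be used instead.

```csharp
using System;

public static class RSquaredSketch
{
    // R-squared of the least-squares line of y on x,
    // computed as the squared Pearson correlation of x and y.
    public static double Compute(double[] x, double[] y)
    {
        if (x.Length != y.Length || x.Length < 2)
            throw new ArgumentException("x and y need the same length, at least 2 points");

        int n = x.Length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) { meanX += x[i]; meanY += y[i]; }
        meanX /= n;
        meanY /= n;

        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++)
        {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }

        // A constant series has no variance to explain.
        if (sxx == 0 || syy == 0) return double.NaN;
        return (sxy * sxy) / (sxx * syy);
    }
}
```

For x = {1, 2, 3, 4} and y = {1, 3, 2, 4} this returns 0.64; for points lying exactly on a line it returns 1.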