Hi everyone,
I previously posted this question but my message was not well written and did not contain any code so I will try to do a better job this time.
The goal is to perform a non-linear regression on complex-valued data.
I will first give a short description of the data and then describe the complex-valued non-linear function.
***The Data
Obtained from mechanical tests performed at 5 loading frequencies 0.1, 0.25, 1, 0.4 and 12 Hz, and at 5 temeperatures.
The independent variable used in the regression is:
f_data <- rep(c(0.1,0.25,1,4,12),5)
The measured values of the response variable are:
E <- c(289.7411+ 225.0708i , 386.4417+ 303.5021i , 671.5132+ 521.1253i , 1210.8638+ 846.6253i , 1860.9623+1155.4139i ,
862.8984+ 636.2637i , 1159.0436+ 814.5609i , 1919.0369+1186.5679i , 3060.7207+1573.6088i , 4318.1781+1868.4761i ,
2760.7782+1418.5450i , 3306.3013+1612.2712i , 4746.6958+1923.8662i , 6468.5769+2148.9502i , 8072.2642+2198.5344i ,
6757.7680+2061.3110i , 7591.9787+2123.9168i , 9522.9471+2261.8489i , 11255.0952+2166.6411i , 12601.3970+2120.7178i ,
11913.6543+2016.0828i , 12906.8294+2030.0610i , 14343.7693+1893.4877i , 15942.7703+1788.0910i , 16943.2261+1665.9847i)
To visualize the data:
plot(f_data,Re(E),log="xy")
plot(f_data,Im(E),log="xy")
plot(E)
***Non-linear regression function:
Obtained from an analytical model
E_2S2P1D <- function(f,logaT,Eg,Ee,k,h,delta,logbeta,logtau)
Ee+(Eg-Ee)*(
1+delta*(2i*pi*10^(logtau)*f*10^logaT)^-k +
(2i*pi*10^(logtau)*f*10^logaT)^-h +
(2i*pi*10^(logtau)*f*10^logaT*10^(logbeta))^-1
)^-1
E_2S2P1D is a complex-valued function (note the imaginary unit "i") where:
"f" is the real-valued independent variable (i.e. the frequency),
"logaT","Eg","Ee","k","h","delta","logbeta","logtau" are real-valued parameters.
For example:
E_2S2P1D(1,0,27000,200,0.16,0.47,1.97,2.2,0.02)
[1] 9544.759+2204.974i
E_2S2P1D(1,-1,27000,200,0.16,0.47,1.97,2.2,0.02)
[1] 6283.748+2088.473i
In order to find the parameters of the regression function I use ?nls".
Executing
fit <- nls(E ~ E_2S2P1D(f_data,c(rep(loga40,5),rep(loga30,5),rep(loga20,5),rep(0,5),rep(loga0,5)),
Eg,Ee,k,h,delta,logbeta,logtau),
start=list(loga40=-3.76,loga30=-2.63,loga20=-1.39,loga0=1.68,
Eg=27000,Ee=200,k=0.16,h=0.47,delta=1.97,logbeta=2.2,logtau=-0.02) )
results in a lot of warnings (my translation into english):
"In numericDeriv(form[[3L]], names(ind), env) :
imaginary parts removed during the conversion?
I?ve the same error trying:
y <- E_2S2P1D(f_data,c(rep(-3.76,5),rep(-2.63,5),rep(-1.39,5),rep(0,5),rep(1.68,5)),27000,200,0.16,0.47,1.97,2.2,0.02)
plot(E,y)
If I got it right R is not able to handle regression problems on complex-valued functions, correct?
If yes, is there some workaround?
For example using MSExcel I write the Real and Imaginary parts separately, and than I minimize
error = sum [ ( Re(E) - Re(E_E_2S2P1D) ) / Re(E) )^2 ] + sum [ ( Im(E) - Im(E_E_2S2P1D) ) / Im(E) )^2 ]
using the MSExcel solver function.
thank you for your help
Andrea
PS
Thanks to David Winsemius for his suggestions
I have not the mental energy to go through your somewhat complicated
example, but I suspect that your problem is simply the following: The
function nls() is trying to minimize a sum of squares, and that does not
make sense in the context of complex observations. That is, nls() is
trying to minimize a sum of terms of the form
(f(x_i,theta) - z_i)^2
where f(x_i,theta) and z_i are complex quantities (and theta is the
vector of parameters with respect to which minimization is "trying to be
done". But as I said, this makes no sense. Not that
(f(x_i,theta) - z_i)^2 is a complex quantity and *not* a positive real
quantity.
You need to minimize a sum of terms of the form
|f(x_i,theta) - z_i|^2
where |w|^2 equal to w times the complex conjugate of w. You get this
in R via the Mod() function, i.e. |w|^2 is (Mod(w))^2.
Thus it seems to me that nls() cannot handle what you want handled. You
need to code up the function to be minimized yourself, using Mod() and
then use optim() or nlm() to minimize it that function with respect to
"theta".
HTH
cheers,
Rolf Turner
On 10/02/14 11:45, Andrea Graziani wrote:
Hi everyone,
I previously posted this question but my message was not well written and did not contain any code so I will try to do a better job this time.
The goal is to perform a non-linear regression on complex-valued data.
I will first give a short description of the data and then describe the complex-valued non-linear function.
***The Data
Obtained from mechanical tests performed at 5 loading frequencies 0.1, 0.25, 1, 0.4 and 12 Hz, and at 5 temeperatures.
The independent variable used in the regression is:
f_data <- rep(c(0.1,0.25,1,4,12),5)
The measured values of the response variable are:
E <- c(289.7411+ 225.0708i , 386.4417+ 303.5021i , 671.5132+ 521.1253i , 1210.8638+ 846.6253i , 1860.9623+1155.4139i ,
862.8984+ 636.2637i , 1159.0436+ 814.5609i , 1919.0369+1186.5679i , 3060.7207+1573.6088i , 4318.1781+1868.4761i ,
2760.7782+1418.5450i , 3306.3013+1612.2712i , 4746.6958+1923.8662i , 6468.5769+2148.9502i , 8072.2642+2198.5344i ,
6757.7680+2061.3110i , 7591.9787+2123.9168i , 9522.9471+2261.8489i , 11255.0952+2166.6411i , 12601.3970+2120.7178i ,
11913.6543+2016.0828i , 12906.8294+2030.0610i , 14343.7693+1893.4877i , 15942.7703+1788.0910i , 16943.2261+1665.9847i)
To visualize the data:
plot(f_data,Re(E),log="xy")
plot(f_data,Im(E),log="xy")
plot(E)
***Non-linear regression function:
Obtained from an analytical model
E_2S2P1D <- function(f,logaT,Eg,Ee,k,h,delta,logbeta,logtau)
Ee+(Eg-Ee)*(
1+delta*(2i*pi*10^(logtau)*f*10^logaT)^-k +
(2i*pi*10^(logtau)*f*10^logaT)^-h +
(2i*pi*10^(logtau)*f*10^logaT*10^(logbeta))^-1
)^-1
E_2S2P1D is a complex-valued function (note the imaginary unit "i") where:
"f" is the real-valued independent variable (i.e. the frequency),
"logaT","Eg","Ee","k","h","delta","logbeta","logtau" are real-valued parameters.
For example:
E_2S2P1D(1,0,27000,200,0.16,0.47,1.97,2.2,0.02)
[1] 9544.759+2204.974i
E_2S2P1D(1,-1,27000,200,0.16,0.47,1.97,2.2,0.02)
[1] 6283.748+2088.473i
In order to find the parameters of the regression function I use ?nls".
Executing
fit <- nls(E ~ E_2S2P1D(f_data,c(rep(loga40,5),rep(loga30,5),rep(loga20,5),rep(0,5),rep(loga0,5)),
Eg,Ee,k,h,delta,logbeta,logtau),
start=list(loga40=-3.76,loga30=-2.63,loga20=-1.39,loga0=1.68,
Eg=27000,Ee=200,k=0.16,h=0.47,delta=1.97,logbeta=2.2,logtau=-0.02) )
results in a lot of warnings (my translation into english):
"In numericDeriv(form[[3L]], names(ind), env) :
imaginary parts removed during the conversion?
I?ve the same error trying:
y <- E_2S2P1D(f_data,c(rep(-3.76,5),rep(-2.63,5),rep(-1.39,5),rep(0,5),rep(1.68,5)),27000,200,0.16,0.47,1.97,2.2,0.02)
plot(E,y)
If I got it right R is not able to handle regression problems on complex-valued functions, correct?
If yes, is there some workaround?
For example using MSExcel I write the Real and Imaginary parts separately, and than I minimize
error = sum [ ( Re(E) - Re(E_E_2S2P1D) ) / Re(E) )^2 ] + sum [ ( Im(E) - Im(E_E_2S2P1D) ) / Im(E) )^2 ]
using the MSExcel solver function.
thank you for your help
Andrea
PS
Thanks to David Winsemius for his suggestions
Hi Andrea
Although there are theory on linear models for complex valued normal distributions (see e.g. http://www.amazon.com/Linear-Graphical-Models-Multivariate-Distribution/dp/0387945210) to my knowledge there is no function for doing complex-valued regression in R.
However as my reference probably explain (long time since I saw the textbook) you can consider the Im and Re parts separately. But first some questions. I don't know anything about the physics behind your formula. It seems that you want to do non-linear regression for five curves (one for each value of components of the vector logaT). Why are loga40, loga30, loga20, 0 (zero value), and loga0 not independent variables that you control? If so you don't want to estimate those.
If the logaT parameters are not under your control do you still really want to estimate the same set of values of the parameters of the model for each set of your data? And not separate estimates for each curve?
However (putting my head under my shoulder) you can do something like this (Having just noticed Rolf Turner's answer you should absolutely explore that solutions. That would probably the best way to go).
###### R script:
## added some NAs do make separat plot of curves
## then you need to set na.action argument of nls()
## your data
f_data <- rep(c(0.1,0.25,1,4,12, NA),5)
E <- c(289.7411+ 225.0708i , 386.4417+ 303.5021i , 671.5132+ 521.1253i , 1210.8638+ 846.6253i , 1860.9623+1155.4139i,NA,
862.8984+ 636.2637i , 1159.0436+ 814.5609i , 1919.0369+1186.5679i , 3060.7207+1573.6088i , 4318.1781+1868.4761i,NA,
2760.7782+1418.5450i , 3306.3013+1612.2712i , 4746.6958+1923.8662i , 6468.5769+2148.9502i , 8072.2642+2198.5344i,NA,
6757.7680+2061.3110i , 7591.9787+2123.9168i , 9522.9471+2261.8489i , 11255.0952+2166.6411i , 12601.3970+2120.7178i,NA,
11913.6543+2016.0828i , 12906.8294+2030.0610i ,14343.7693+1893.4877i, 15942.7703+1788.0910i, 16943.2261+1665.9847i,NA)
## function
E_2S2P1D <- function(f,logaT,Eg,Ee,k,h,delta,logbeta,logtau) {
Ee+(Eg-Ee)*(
1+delta*(2i*pi*10^(logtau)*f*10^logaT)^-k +
(2i*pi*10^(logtau)*f*10^logaT)^-h +
(2i*pi*10^(logtau)*f*10^logaT*10^(logbeta))^-1
)^-1 }
mycol <- rep(c("red", "green", "blue", "magenta", "black"), each = 6)
png("plot-raw-data.png", height = 3*480)
par(mfrow = c(3,1))
plot(f_data, Re(E), type = "p", col = mycol)# log="xy")
plot(f_data,Im(E), type = "p", col = mycol)# log="xy")
plot(f_data,abs(E), type = "p", col = mycol)# log="xy")
dev.off()
myE <- data.frame(f = f_data, E = E)
## this seems to run
fit.re <- nls(Re(E) ~ Re(E_2S2P1D(f,c(rep(loga40,5),rep(loga30,5),rep(loga20,5),rep(0,5),rep(loga0,5)),
Eg,Ee,k,h,delta,logbeta,logtau)),
data = myE, na.action = "na.exclude",
start=list(loga40=-3.76,loga30=-2.63,loga20=-1.39,loga0=1.68,
Eg=27000,Ee=200,k=0.16,h=0.47,delta=1.97,logbeta=2.2,logtau=-0.02) )
fit.re
## plot fit overlayed observations
png("fitted-Re.png")
plot(f_data, Re(E), type = "p", col = mycol)# log="xy")
lines(f_data, fitted(fit.re), col = "grey80") #, log = "xy"
dev.off()
## this gives an error
## probably due to the 5 curves do not share parameters. See 3rd panel in attached plot-raw-data.png
fit.im <- nls(Im(E) ~ Im(E_2S2P1D(f,c(rep(loga40,5),rep(loga30,5),rep(loga20,5),rep(0,5),rep(loga0,5)),
Eg,Ee,k,h,delta,logbeta,logtau)),
data = myE, na.action = "na.exclude",
start=list(loga40=-3.76,loga30=-2.63,loga20=-1.39,loga0=1.68,
Eg=27000,Ee=200,k=0.16,h=0.47,delta=1.97,logbeta=2.2,logtau=-0.02) )
## Now try abs()
## which works and gives allmost same estimates as for fit.re
fit.abs <- nls(abs(E) ~ abs(E_2S2P1D(f,c(rep(loga40,5),rep(loga30,5),rep(loga20,5),rep(0,5),rep(loga0,5)),
Eg,Ee,k,h,delta,logbeta,logtau)),
data = myE, na.action = "na.exclude",
start=list(loga40=-3.76,loga30=-2.63,loga20=-1.39,loga0=1.68,
Eg=27000,Ee=200,k=0.16,h=0.47,delta=1.97,logbeta=2.2,logtau=-0.02))
fit.abs
png("fitted-abs.png")
plot(f_data, abs(E), type = "p", col = mycol)# log="xy")
lines(f_data, predict(fit.abs), col = "grey80") #, log = "xy"
dev.off()
Yours sincerely / Med venlig hilsen
Frede Aakmann T?gersen
Specialist, M.Sc., Ph.D.
Plant Performance & Modeling
Technology & Service Solutions
T +45 9730 5135
M +45 2547 6050
frtog at vestas.com
http://www.vestas.com
Company reg. name: Vestas Wind Systems A/S
This e-mail is subject to our e-mail disclaimer statement.
Please refer to www.vestas.com/legal/notice
If you have received this e-mail in error please contact the sender.
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Andrea Graziani
Sent: 9. februar 2014 23:46
To: r-help at r-project.org
Subject: [R] Regression of complex-valued functions
Hi everyone,
I previously posted this question but my message was not well written and
did not contain any code so I will try to do a better job this time.
The goal is to perform a non-linear regression on complex-valued data.
I will first give a short description of the data and then describe the complex-
valued non-linear function.
***The Data
Obtained from mechanical tests performed at 5 loading frequencies 0.1, 0.25,
1, 0.4 and 12 Hz, and at 5 temeperatures.
The independent variable used in the regression is:
f_data <- rep(c(0.1,0.25,1,4,12),5)
The measured values of the response variable are:
E <- c(289.7411+ 225.0708i , 386.4417+ 303.5021i , 671.5132+ 521.1253i ,
1210.8638+ 846.6253i , 1860.9623+1155.4139i ,
862.8984+ 636.2637i , 1159.0436+ 814.5609i , 1919.0369+1186.5679i ,
3060.7207+1573.6088i , 4318.1781+1868.4761i ,
2760.7782+1418.5450i , 3306.3013+1612.2712i , 4746.6958+1923.8662i ,
6468.5769+2148.9502i , 8072.2642+2198.5344i ,
6757.7680+2061.3110i , 7591.9787+2123.9168i , 9522.9471+2261.8489i ,
11255.0952+2166.6411i , 12601.3970+2120.7178i ,
11913.6543+2016.0828i , 12906.8294+2030.0610i , 14343.7693+1893.4877i ,
15942.7703+1788.0910i , 16943.2261+1665.9847i)
To visualize the data:
plot(f_data,Re(E),log="xy")
plot(f_data,Im(E),log="xy")
plot(E)
***Non-linear regression function:
Obtained from an analytical model
E_2S2P1D <- function(f,logaT,Eg,Ee,k,h,delta,logbeta,logtau)
Ee+(Eg-Ee)*(
1+delta*(2i*pi*10^(logtau)*f*10^logaT)^-k +
(2i*pi*10^(logtau)*f*10^logaT)^-h +
(2i*pi*10^(logtau)*f*10^logaT*10^(logbeta))^-1
)^-1
E_2S2P1D is a complex-valued function (note the imaginary unit "i") where:
"f" is the real-valued independent variable (i.e. the frequency),
"logaT","Eg","Ee","k","h","delta","logbeta","logtau" are real-valued
parameters.
For example:
E_2S2P1D(1,0,27000,200,0.16,0.47,1.97,2.2,0.02)
[1] 9544.759+2204.974i
E_2S2P1D(1,-1,27000,200,0.16,0.47,1.97,2.2,0.02)
[1] 6283.748+2088.473i
In order to find the parameters of the regression function I use "nls".
Executing
fit <- nls(E ~
E_2S2P1D(f_data,c(rep(loga40,5),rep(loga30,5),rep(loga20,5),rep(0,5),rep(lo
ga0,5)),
Eg,Ee,k,h,delta,logbeta,logtau),
start=list(loga40=-3.76,loga30=-2.63,loga20=-1.39,loga0=1.68,
Eg=27000,Ee=200,k=0.16,h=0.47,delta=1.97,logbeta=2.2,logtau=-
0.02) )
results in a lot of warnings (my translation into english):
"In numericDeriv(form[[3L]], names(ind), env) :
imaginary parts removed during the conversion"
I've the same error trying:
y <- E_2S2P1D(f_data,c(rep(-3.76,5),rep(-2.63,5),rep(-
1.39,5),rep(0,5),rep(1.68,5)),27000,200,0.16,0.47,1.97,2.2,0.02)
plot(E,y)
If I got it right R is not able to handle regression problems on complex-valued
functions, correct?
If yes, is there some workaround?
For example using MSExcel I write the Real and Imaginary parts separately,
and than I minimize
error = sum [ ( Re(E) - Re(E_E_2S2P1D) ) / Re(E) )^2 ] + sum [ ( Im(E) -
Im(E_E_2S2P1D) ) / Im(E) )^2 ]
using the MSExcel solver function.
thank you for your help
Andrea
PS
Thanks to David Winsemius for his suggestions
On Feb 9, 2014, at 2:45 PM, Andrea Graziani wrote:
Hi everyone,
I previously posted this question but my message was not well written and did not contain any code so I will try to do a better job this time.
The goal is to perform a non-linear regression on complex-valued data.
I will first give a short description of the data and then describe the complex-valued non-linear function.
***The Data
Obtained from mechanical tests performed at 5 loading frequencies 0.1, 0.25, 1, 0.4 and 12 Hz, and at 5 temeperatures.
The independent variable used in the regression is:
f_data <- rep(c(0.1,0.25,1,4,12),5)
The measured values of the response variable are:
E <- c(289.7411+ 225.0708i , 386.4417+ 303.5021i , 671.5132+ 521.1253i , 1210.8638+ 846.6253i , 1860.9623+1155.4139i ,
862.8984+ 636.2637i , 1159.0436+ 814.5609i , 1919.0369+1186.5679i , 3060.7207+1573.6088i , 4318.1781+1868.4761i ,
2760.7782+1418.5450i , 3306.3013+1612.2712i , 4746.6958+1923.8662i , 6468.5769+2148.9502i , 8072.2642+2198.5344i ,
6757.7680+2061.3110i , 7591.9787+2123.9168i , 9522.9471+2261.8489i , 11255.0952+2166.6411i , 12601.3970+2120.7178i ,
11913.6543+2016.0828i , 12906.8294+2030.0610i , 14343.7693+1893.4877i , 15942.7703+1788.0910i , 16943.2261+1665.9847i)
To visualize the data:
plot(f_data,Re(E),log="xy")
plot(f_data,Im(E),log="xy")
plot(E)
***Non-linear regression function:
Obtained from an analytical model
E_2S2P1D <- function(f,logaT,Eg,Ee,k,h,delta,logbeta,logtau)
Ee+(Eg-Ee)*(
1+delta*(2i*pi*10^(logtau)*f*10^logaT)^-k +
(2i*pi*10^(logtau)*f*10^logaT)^-h +
(2i*pi*10^(logtau)*f*10^logaT*10^(logbeta))^-1
)^-1
Others have pointed out the theoretical difficulties regarding curve fitting on the complex plane. My concern looking at this is that you have apparently lapse into mathematical notation rather than functional programming notation above:
This
2i*pi*10^(logtau)*f*10^logaT
Would need to be this:
2*i*pi*10^(logtau)*f*10^logaT
You would start getting parser errors since '2i' is not a valid token name. I would also worry about negative powers. They can sometimes be numerically unstable.
And you have not offered the values for temperatures which I suspect are the items in your start list: loga40=-3.76,loga30=-2.63,loga20=-1.39,loga0=1.68. I would think these are fixed rather than something to be optimized. You can leave them in the global environment or build them into the function but I don't think they should be in your start list. (Frede Aakmann T?gersen noted that issue but did not build a solution that incorporated addressed that concern.)
To examine that relationship you would:
loga40=-3.76;loga30=-2.63;loga20=-1.39;loga0=1.68
logaT <- c(rep(loga40,5),rep(loga30,5),rep(loga20,5),rep(0,5),rep(loga0,5))
# To visualize the data:
plot(logaT,Re(E), log="y")
plot(logaT,Im(E), log="y")
David.
>
> E_2S2P1D is a complex-valued function (note the imaginary unit "i") where:
> "f" is the real-valued independent variable (i.e. the frequency),
> "logaT","Eg","Ee","k","h","delta","logbeta","logtau" are real-valued parameters.
>
> For example:
>
>> E_2S2P1D(1,0,27000,200,0.16,0.47,1.97,2.2,0.02)
> [1] 9544.759+2204.974i
>
>> E_2S2P1D(1,-1,27000,200,0.16,0.47,1.97,2.2,0.02)
> [1] 6283.748+2088.473i
>
> In order to find the parameters of the regression function I use ?nls".
>
> Executing
>
> fit <- nls(E ~ E_2S2P1D(f_data,c(rep(loga40,5),rep(loga30,5),rep(loga20,5),rep(0,5),rep(loga0,5)),
> Eg,Ee,k,h,delta,logbeta,logtau),
> start=list(loga40=-3.76,loga30=-2.63,loga20=-1.39,loga0=1.68,
> Eg=27000,Ee=200,k=0.16,h=0.47,delta=1.97,logbeta=2.2,logtau=-0.02) )
>
> results in a lot of warnings (my translation into english):
> "In numericDeriv(form[[3L]], names(ind), env) :
> imaginary parts removed during the conversion?
>
> I?ve the same error trying:
> y <- E_2S2P1D(f_data,c(rep(-3.76,5),rep(-2.63,5),rep(-1.39,5),rep(0,5),rep(1.68,5)),27000,200,0.16,0.47,1.97,2.2,0.02)
> plot(E,y)
>
> If I got it right R is not able to handle regression problems on complex-valued functions, correct?
> If yes, is there some workaround?
> For example using MSExcel I write the Real and Imaginary parts separately, and than I minimize
>
> error = sum [ ( Re(E) - Re(E_E_2S2P1D) ) / Re(E) )^2 ] + sum [ ( Im(E) - Im(E_E_2S2P1D) ) / Im(E) )^2 ]
>
> using the MSExcel solver function.
>
> thank you for your help
> Andrea
>
> PS
> Thanks to David Winsemius for his suggestions
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
David Winsemius
Alameda, CA, USA
On Feb 9, 2014, at 2:45 PM, Andrea Graziani wrote:
Hi everyone,
I previously posted this question but my message was not well written and did not contain any code so I will try to do a better job this time.
The goal is to perform a non-linear regression on complex-valued data.
I will first give a short description of the data and then describe the complex-valued non-linear function.
***The Data
Obtained from mechanical tests performed at 5 loading frequencies 0.1, 0.25, 1, 0.4 and 12 Hz, and at 5 temeperatures.
The independent variable used in the regression is:
f_data <- rep(c(0.1,0.25,1,4,12),5)
The measured values of the response variable are:
E <- c(289.7411+ 225.0708i , 386.4417+ 303.5021i , 671.5132+ 521.1253i , 1210.8638+ 846.6253i , 1860.9623+1155.4139i ,
862.8984+ 636.2637i , 1159.0436+ 814.5609i , 1919.0369+1186.5679i , 3060.7207+1573.6088i , 4318.1781+1868.4761i ,
2760.7782+1418.5450i , 3306.3013+1612.2712i , 4746.6958+1923.8662i , 6468.5769+2148.9502i , 8072.2642+2198.5344i ,
6757.7680+2061.3110i , 7591.9787+2123.9168i , 9522.9471+2261.8489i , 11255.0952+2166.6411i , 12601.3970+2120.7178i ,
11913.6543+2016.0828i , 12906.8294+2030.0610i , 14343.7693+1893.4877i , 15942.7703+1788.0910i , 16943.2261+1665.9847i)
To visualize the data:
plot(f_data,Re(E),log="xy")
plot(f_data,Im(E),log="xy")
plot(E)
***Non-linear regression function:
Obtained from an analytical model
E_2S2P1D <- function(f,logaT,Eg,Ee,k,h,delta,logbeta,logtau)
Ee+(Eg-Ee)*(
1+delta*(2i*pi*10^(logtau)*f*10^logaT)^-k +
(2i*pi*10^(logtau)*f*10^logaT)^-h +
(2i*pi*10^(logtau)*f*10^logaT*10^(logbeta))^-1
)^-1
Others have pointed out the theoretical difficulties regarding curve fitting on the complex plane. My concern looking at this is that you have apparently lapse into mathematical notation rather than functional programming notation above:
This
2i*pi*10^(logtau)*f*10^logaT
Would need to be this:
2*i*pi*10^(logtau)*f*10^logaT
You would start getting parser errors since '2i' is not a valid token name. I would also worry about negative powers. They can sometimes be numerically unstable.
2i is fine: it is the same as 0+2i, which is how it normally prints.
Duncan Murdoch
And you have not offered the values for temperatures which I suspect are the items in your start list: loga40=-3.76,loga30=-2.63,loga20=-1.39,loga0=1.68. I would think these are fixed rather than something to be optimized. You can leave them in the global environment or build them into the function but I don't think they should be in your start list. (Frede Aakmann T?gersen noted that issue but did not build a solution that incorporated addressed that concern.)
To examine that relationship you would:
loga40=-3.76;loga30=-2.63;loga20=-1.39;loga0=1.68
logaT <- c(rep(loga40,5),rep(loga30,5),rep(loga20,5),rep(0,5),rep(loga0,5))
# To visualize the data:
plot(logaT,Re(E), log="y")
plot(logaT,Im(E), log="y")
Dear Rolf,
Thank you for your suggestion.
Based on your remarks I solved my problem using nlm().
Actually there are two quite straightforward ways to split the complex-valued problem into two ?linked? real-valued problems.
### 1. Real part and Imaginary part
# Experimental data
E1_data <- Re(E)
E2_data <- Im(E)
# Model function (same of my previous post only different notations)
E_2S2P1D <- function(f,logaT,logEg,logEe,k,h,delta,logbeta,logtau)
10^logEe+(10^logEg-10^logEe)*(
1+delta*(2i*pi*10^(logtau)*f*10^logaT)^-k +
(2i*pi*10^(logtau)*f*10^logaT)^-h +
(2i*pi*10^(logtau)*f*10^logaT*10^(logbeta))^-1
)^-1
# Distance function
dist <- function(p)
sum( ( (E1_data - Re(E_2S2P1D(f_data,c(rep(p[1],5),rep(p[2],5),rep(p[3],5),rep(0,5),rep(p[4],5)), p[5],p[6],p[7],p[8],p[9],p[10],p[11])) ) / E1_data )^2 ) +
sum( ( (E2_data - Im(E_2S2P1D(f_data,c(rep(p[1],5),rep(p[2],5),rep(p[3],5),rep(0,5),rep(p[4],5)), p[5],p[6],p[7],p[8],p[9],p[10],p[11])) ) / E2_data )^2 )
# fitting
fit <- nlm(dist, p = c(-3.76,-2.63,-1.39,1.68,log10(27000),log10(200),0.16,0.47,1.97,2.2,0.02))
### 2. Modulus and argument
# Experimental data
Emod_data <- Mod(E)
Earg_data <- Arg(E)
# Model function: same as above
# Distance function
dist <- function(p)
sum(((Emod_data - log10(Mod(E_2S2P1D(f_data,c(rep(p[1],5),rep(p[2],5),rep(p[3],5),rep(0,5),rep(p[4],5)), p[5],p[6],p[7],p[8],p[9],p[10],p[11])) ) / Emod_data )^2 ) +
sum( ( (Earg_data - Arg(E_2S2P1D(f_data,c(rep(p[1],5),rep(p[2],5),rep(p[3],5),rep(0,5),rep(p[4],5)), p[5],p[6],p[7],p[8],p[9],p[10],p[11])) ) / Earg_data )^2 )
# fitting: same as above
Using the same starting values, the two approaches bring to slightly different solutions:
### 1. Real part and Imaginary part
[1] -3.8674680 -2.7250640 -1.4574350 1.6447868 4.4355748 1.2400092 0.1574215 0.4731295 2.0624878 3.7526197 0.1161640
Do you believe this is due only to ?numerical" reasons linked to how the nls() function works?
Anyway...Thanks again!
Andrea
Il giorno 11/feb/2014, alle ore 09:42, Rolf Turner <r.turner at auckland.ac.nz> ha scritto:
I have not the mental energy to go through your somewhat complicated example, but I suspect that your problem is simply the following: The function nls() is trying to minimize a sum of squares, and that does not make sense in the context of complex observations. That is, nls() is trying to minimize a sum of terms of the form
(f(x_i,theta) - z_i)^2
where f(x_i,theta) and z_i are complex quantities (and theta is the vector of parameters with respect to which minimization is "trying to be done". But as I said, this makes no sense. Not that
(f(x_i,theta) - z_i)^2 is a complex quantity and *not* a positive real quantity.
You need to minimize a sum of terms of the form
|f(x_i,theta) - z_i|^2
where |w|^2 equal to w times the complex conjugate of w. You get this in R via the Mod() function, i.e. |w|^2 is (Mod(w))^2.
Thus it seems to me that nls() cannot handle what you want handled. You need to code up the function to be minimized yourself, using Mod() and then use optim() or nlm() to minimize it that function with respect to "theta".
HTH
cheers,
Rolf Turner
On 10/02/14 11:45, Andrea Graziani wrote:
Hi everyone,
I previously posted this question but my message was not well written and did not contain any code so I will try to do a better job this time.
The goal is to perform a non-linear regression on complex-valued data.
I will first give a short description of the data and then describe the complex-valued non-linear function.
***The Data
Obtained from mechanical tests performed at 5 loading frequencies 0.1, 0.25, 1, 0.4 and 12 Hz, and at 5 temeperatures.
The independent variable used in the regression is:
f_data <- rep(c(0.1,0.25,1,4,12),5)
The measured values of the response variable are:
E <- c(289.7411+ 225.0708i , 386.4417+ 303.5021i , 671.5132+ 521.1253i , 1210.8638+ 846.6253i , 1860.9623+1155.4139i ,
862.8984+ 636.2637i , 1159.0436+ 814.5609i , 1919.0369+1186.5679i , 3060.7207+1573.6088i , 4318.1781+1868.4761i ,
2760.7782+1418.5450i , 3306.3013+1612.2712i , 4746.6958+1923.8662i , 6468.5769+2148.9502i , 8072.2642+2198.5344i ,
6757.7680+2061.3110i , 7591.9787+2123.9168i , 9522.9471+2261.8489i , 11255.0952+2166.6411i , 12601.3970+2120.7178i ,
11913.6543+2016.0828i , 12906.8294+2030.0610i , 14343.7693+1893.4877i , 15942.7703+1788.0910i , 16943.2261+1665.9847i)
To visualize the data:
plot(f_data,Re(E),log="xy")
plot(f_data,Im(E),log="xy")
plot(E)
***Non-linear regression function:
Obtained from an analytical model
E_2S2P1D <- function(f,logaT,Eg,Ee,k,h,delta,logbeta,logtau)
Ee+(Eg-Ee)*(
1+delta*(2i*pi*10^(logtau)*f*10^logaT)^-k +
(2i*pi*10^(logtau)*f*10^logaT)^-h +
(2i*pi*10^(logtau)*f*10^logaT*10^(logbeta))^-1
)^-1
E_2S2P1D is a complex-valued function (note the imaginary unit "i") where:
"f" is the real-valued independent variable (i.e. the frequency),
"logaT","Eg","Ee","k","h","delta","logbeta","logtau" are real-valued parameters.
For example:
E_2S2P1D(1,0,27000,200,0.16,0.47,1.97,2.2,0.02)
[1] 9544.759+2204.974i
E_2S2P1D(1,-1,27000,200,0.16,0.47,1.97,2.2,0.02)
[1] 6283.748+2088.473i
In order to find the parameters of the regression function I use ?nls".
Executing
fit <- nls(E ~ E_2S2P1D(f_data,c(rep(loga40,5),rep(loga30,5),rep(loga20,5),rep(0,5),rep(loga0,5)),
Eg,Ee,k,h,delta,logbeta,logtau),
start=list(loga40=-3.76,loga30=-2.63,loga20=-1.39,loga0=1.68,
Eg=27000,Ee=200,k=0.16,h=0.47,delta=1.97,logbeta=2.2,logtau=-0.02) )
results in a lot of warnings (my translation into english):
"In numericDeriv(form[[3L]], names(ind), env) :
imaginary parts removed during the conversion?
I?ve the same error trying:
y <- E_2S2P1D(f_data,c(rep(-3.76,5),rep(-2.63,5),rep(-1.39,5),rep(0,5),rep(1.68,5)),27000,200,0.16,0.47,1.97,2.2,0.02)
plot(E,y)
If I got it right R is not able to handle regression problems on complex-valued functions, correct?
If yes, is there some workaround?
For example using MSExcel I write the Real and Imaginary parts separately, and than I minimize
error = sum [ ( Re(E) - Re(E_E_2S2P1D) ) / Re(E) )^2 ] + sum [ ( Im(E) - Im(E_E_2S2P1D) ) / Im(E) )^2 ]
using the MSExcel solver function.
thank you for your help
Andrea
PS
Thanks to David Winsemius for his suggestions
Hi Frede,
Thank you for your accurate answer!
If I understand well, your way to use nls() solves the problem using too many physical parameters.
I solved the problem following the other way that you and Rolf Turner suggested (i.e. splitting the complex-valued problem into two real-valued problems and using nlm() ).
If you are interested in more details, please see my answer to Rolf?s mail.
Why are loga40, loga30, loga20, 0 (zero value), and loga0 not independent variables that you control? If so you don't want to estimate those.
If the logaT parameters are not under your control do you still really want to estimate the same set of values of the parameters of the model for each set of your data? And not separate estimates for each curve?
You noticed a key point of the mechanical problem!
However I?m looking for a ?single curve?, so-called ?master curve?, to fit all experimental data!
However (putting my head under my shoulder) you can do something like this (Having just noticed Rolf Turner's answer you should absolutely explore that solutions. That would probably the best way to go).
;)
Thanks again!
Andrea
## this seems to run
fit.re <- nls(Re(E) ~ Re(E_2S2P1D(f,c(rep(loga40,5),rep(loga30,5),rep(loga20,5),rep(0,5),rep(loga0,5)),
Eg,Ee,k,h,delta,logbeta,logtau)),
data = myE, na.action = "na.exclude",
start=list(loga40=-3.76,loga30=-2.63,loga20=-1.39,loga0=1.68,
Eg=27000,Ee=200,k=0.16,h=0.47,delta=1.97,logbeta=2.2,logtau=-0.02) )
fit.re
## plot fit overlayed observations
png("fitted-Re.png")
plot(f_data, Re(E), type = "p", col = mycol)# log="xy")
lines(f_data, fitted(fit.re), col = "grey80") #, log = "xy"
dev.off()
## this gives an error
## probably due to the 5 curves do not share parameters. See 3rd panel in attached plot-raw-data.png
fit.im <- nls(Im(E) ~ Im(E_2S2P1D(f,c(rep(loga40,5),rep(loga30,5),rep(loga20,5),rep(0,5),rep(loga0,5)),
Eg,Ee,k,h,delta,logbeta,logtau)),
data = myE, na.action = "na.exclude",
start=list(loga40=-3.76,loga30=-2.63,loga20=-1.39,loga0=1.68,
Eg=27000,Ee=200,k=0.16,h=0.47,delta=1.97,logbeta=2.2,logtau=-0.02) )
## Now try abs()
## which works and gives allmost same estimates as for fit.re
fit.abs <- nls(abs(E) ~ abs(E_2S2P1D(f,c(rep(loga40,5),rep(loga30,5),rep(loga20,5),rep(0,5),rep(loga0,5)),
Eg,Ee,k,h,delta,logbeta,logtau)),
data = myE, na.action = "na.exclude",
start=list(loga40=-3.76,loga30=-2.63,loga20=-1.39,loga0=1.68,
Eg=27000,Ee=200,k=0.16,h=0.47,delta=1.97,logbeta=2.2,logtau=-0.02))
fit.abs
png("fitted-abs.png")
plot(f_data, abs(E), type = "p", col = mycol)# log="xy")
lines(f_data, predict(fit.abs), col = "grey80") #, log = "xy"
dev.off()
Yours sincerely / Med venlig hilsen
Frede Aakmann T?gersen
Specialist, M.Sc., Ph.D.
Plant Performance & Modeling
Technology & Service Solutions
T +45 9730 5135
M +45 2547 6050
frtog at vestas.com
http://www.vestas.com
Company reg. name: Vestas Wind Systems A/S
This e-mail is subject to our e-mail disclaimer statement.
Please refer to www.vestas.com/legal/notice
If you have received this e-mail in error please contact the sender.
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Andrea Graziani
Sent: 9. februar 2014 23:46
To: r-help at r-project.org
Subject: [R] Regression of complex-valued functions
Hi everyone,
I previously posted this question but my message was not well written and
did not contain any code so I will try to do a better job this time.
The goal is to perform a non-linear regression on complex-valued data.
I will first give a short description of the data and then describe the complex-
valued non-linear function.
***The Data
Obtained from mechanical tests performed at 5 loading frequencies 0.1, 0.25,
1, 0.4 and 12 Hz, and at 5 temeperatures.
The independent variable used in the regression is:
f_data <- rep(c(0.1,0.25,1,4,12),5)
The measured values of the response variable are:
E <- c(289.7411+ 225.0708i , 386.4417+ 303.5021i , 671.5132+ 521.1253i ,
1210.8638+ 846.6253i , 1860.9623+1155.4139i ,
862.8984+ 636.2637i , 1159.0436+ 814.5609i , 1919.0369+1186.5679i ,
3060.7207+1573.6088i , 4318.1781+1868.4761i ,
2760.7782+1418.5450i , 3306.3013+1612.2712i , 4746.6958+1923.8662i ,
6468.5769+2148.9502i , 8072.2642+2198.5344i ,
6757.7680+2061.3110i , 7591.9787+2123.9168i , 9522.9471+2261.8489i ,
11255.0952+2166.6411i , 12601.3970+2120.7178i ,
11913.6543+2016.0828i , 12906.8294+2030.0610i , 14343.7693+1893.4877i ,
15942.7703+1788.0910i , 16943.2261+1665.9847i)
To visualize the data:
plot(f_data,Re(E),log="xy")
plot(f_data,Im(E),log="xy")
plot(E)
***Non-linear regression function:
Obtained from an analytical model
E_2S2P1D <- function(f,logaT,Eg,Ee,k,h,delta,logbeta,logtau)
Ee+(Eg-Ee)*(
1+delta*(2i*pi*10^(logtau)*f*10^logaT)^-k +
(2i*pi*10^(logtau)*f*10^logaT)^-h +
(2i*pi*10^(logtau)*f*10^logaT*10^(logbeta))^-1
)^-1
E_2S2P1D is a complex-valued function (note the imaginary unit "i") where:
"f" is the real-valued independent variable (i.e. the frequency),
"logaT","Eg","Ee","k","h","delta","logbeta","logtau" are real-valued
parameters.
For example:
E_2S2P1D(1,0,27000,200,0.16,0.47,1.97,2.2,0.02)
[1] 9544.759+2204.974i
E_2S2P1D(1,-1,27000,200,0.16,0.47,1.97,2.2,0.02)
[1] 6283.748+2088.473i
In order to find the parameters of the regression function I use "nls".
Executing
fit <- nls(E ~
E_2S2P1D(f_data,c(rep(loga40,5),rep(loga30,5),rep(loga20,5),rep(0,5),rep(lo
ga0,5)),
Eg,Ee,k,h,delta,logbeta,logtau),
start=list(loga40=-3.76,loga30=-2.63,loga20=-1.39,loga0=1.68,
Eg=27000,Ee=200,k=0.16,h=0.47,delta=1.97,logbeta=2.2,logtau=-
0.02) )
results in a lot of warnings (my translation into english):
"In numericDeriv(form[[3L]], names(ind), env) :
imaginary parts removed during the conversion"
I've the same error trying:
y <- E_2S2P1D(f_data,c(rep(-3.76,5),rep(-2.63,5),rep(-
1.39,5),rep(0,5),rep(1.68,5)),27000,200,0.16,0.47,1.97,2.2,0.02)
plot(E,y)
If I got it right R is not able to handle regression problems on complex-valued
functions, correct?
If yes, is there some workaround?
For example using MSExcel I write the Real and Imaginary parts separately,
and than I minimize
error = sum [ ( Re(E) - Re(E_E_2S2P1D) ) / Re(E) )^2 ] + sum [ ( Im(E) -
Im(E_E_2S2P1D) ) / Im(E) )^2 ]
using the MSExcel solver function.
thank you for your help
Andrea
PS
Thanks to David Winsemius for his suggestions
[1] -3.8674680 -2.7250640 -1.4574350 1.6447868 4.4355748 1.2400092 0.1574215 0.4731295 2.0624878 3.7526197 0.1161640
Do you believe this is due only to ?numerical" reasons linked to how the nls() function works?
I find it a bit surprising that there are differences in the third
decimal place here. I would have thought that the results would be
exactly the same, or ***very*** close to it.
If you have done it right (I have not checked your code at all) the "two
approaches" should amount to the same approach. I.e. Mod(z)^2 is
equal to Re(z)^2 + Im(z)^2.
Check your code carefully.
Also it would be a good idea to try a "toy example", something much less
complicated than your real problem, and see what happens there.
cheers,
Rolf Turner