The fit of the GLM is something like this:
for(i in <iterations>)
{
    while(<data to read>)
    {
        <read data and create model matrix>
        <update the current solution>
        *
        <test if the tolerance is reached>
    }
    <print traces>
}
<return the results>
* This is where I add the code to compute the concordant/discordant pairs.
It is so annoying that I can't send you my whole code...
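For reference, the concordant/discordant computation mentioned at the * step can be sketched as follows. This is only a minimal illustration, not the code added to biglm (which was not posted): the function name concordance_pct is hypothetical, and it assumes the fitted probabilities p and the 0/1 responses y have already been collected in memory. It relies on the left.open argument of findInterval(), so it needs R 3.3.0 or later.

```r
# Hypothetical helper (not part of biglm): percentage of concordant,
# discordant and tied pairs for a binary-response model.
# A pair is one observation with y == 1 and one with y == 0; it is
# concordant when the y == 1 observation has the larger fitted probability.
concordance_pct <- function(p, y) {
  p1 <- p[y == 1]
  s0 <- sort(p[y == 0])
  n_pairs <- as.numeric(length(p1)) * length(s0)

  # For each event, count the non-events with a strictly smaller
  # probability, then those with a smaller-or-equal probability.
  less <- findInterval(p1, s0, left.open = TRUE)
  leq  <- findInterval(p1, s0)

  concordant <- sum(as.numeric(less))
  tied       <- sum(as.numeric(leq - less))
  discordant <- n_pairs - concordant - tied

  100 * c(concordant = concordant, discordant = discordant, tied = tied) / n_pairs
}

# Toy data, invented for illustration only.
p <- c(0.9, 0.8, 0.3, 0.3, 0.1)
y <- c(1, 0, 1, 0, 0)
concordance_pct(p, y)
```

In this toy example 4 of the 6 pairs are concordant, 1 is discordant and 1 is tied. Sorting once and counting with findInterval() avoids building the full events-by-non-events matrix, which could be very large for a table of this size.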
2008/11/4 Jeffrey Horner <jeff.horner at vanderbilt.edu>
christophe dutang wrote on 11/04/2008 07:07 AM:
Hi all,
Here are some details about my code:
- I use a MySQL server where my whole database is stored,
- I'm doing a logistic regression with the package biglm,
- my 'data function' just runs a SQL query for the selected
variables, and then I read part of the result with the function fetch:
MySQLdatafun <- function(reset = FALSE)
{
    if (reset)
    {
        # Restart reading: reset the counters and re-run the query.
        beginRead <<- 0
        endRead <<- 0
        dbClearResult(resSQL)
        resSQL <<- dbSendQuery(con, SQLSelectExplVar)
        return(NULL)
    }
    # No more data: signal the end of the chunks.
    if (endRead >= lengthData)
        return(NULL)
    beginRead <<- endRead + 1
    endRead <<- endRead + min(chunksize, lengthData - endRead)
    mydata <<- data.frame(fetch(resSQL, n = endRead - beginRead + 1),
                          stringsAsFactors = TRUE)
    # BTW the last argument 'stringsAsFactors' does not seem to work,
    # so character columns are converted to factors by hand.
    for (i in seq_len(NCOL(mydata)))
        if (is.character(mydata[[i]]))
            mydata[[i]] <<- factor(mydata[[i]])
    mydata
}
- what's really strange is that the problem occurs when I add some
code to the biglm package to compute the concordant/discordant percentage.
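The data-function protocol used above (return NULL when called with reset=TRUE, then one chunk per call, then NULL when the data are exhausted) can be tried without a database. The following is a minimal sketch under stated assumptions: make_datafun and toy are hypothetical names, and an in-memory data frame stands in for the MySQL result set. It keeps the read position in a closure instead of global variables.

```r
# Hypothetical helper: build a chunked 'data function' over an in-memory
# data frame, following the reset/NULL protocol described in the post.
make_datafun <- function(data, chunksize) {
  pos <- 0L  # index of the last row already handed out
  function(reset = FALSE) {
    if (reset) {
      pos <<- 0L          # start again from the first row
      return(NULL)
    }
    if (pos >= nrow(data)) return(NULL)  # no more chunks
    last <- min(pos + chunksize, nrow(data))
    chunk <- data[seq(pos + 1L, last), , drop = FALSE]
    pos <<- last
    chunk
  }
}

# Toy data, invented for illustration only.
toy <- data.frame(x = 1:7, g = factor(c("a", "b", "a", "c", "b", "a", "c")))
next_chunk <- make_datafun(toy, chunksize = 3)

next_chunk(reset = TRUE)
while (!is.null(chunk <- next_chunk())) print(dim(chunk))

# Such a function could then be passed as the 'data' argument, e.g.
# bigglm(y ~ x, data = next_chunk, family = binomial())   # not run here
```

Because every chunk is cut from the same data frame, factor columns keep the full set of levels in each chunk. Calling factor() separately on each fetched chunk, as in the function above, may give different levels from one chunk to the next, which is worth checking when the model matrix is built chunk by chunk.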
What do you mean when you "add some code to the biglm package"?
Also, can you provide a table schema?
Jeff
I cannot yet upgrade to 2.8.0, since there is no version of the
(d)com server for this version of R.
Thanks in advance
Christophe Dutang
2008/11/3 Jeffrey Horner <jeff.horner at vanderbilt.edu>
christophe dutang wrote on 10/31/2008 03:28 PM:
Hi,
I'm currently experiencing a problem with the combination of
MySQL / RMySQL / R when reading the result of a 'big' query. If I
select only 4 variables of my MySQL table, the result has 56972 rows,
which I read in chunks of 50000, namely a first chunk of 50000 rows
and a second of 6972. In this case I do not get any DBI warning
reporting an error from the MySQL server. But if I read all 21
variables of my table, the result dimension is then 56972 x 21. In R,
the first read of 50000 rows is fine, but the second stops after
reading 2182 rows, and a DBI warning is raised:
RS-DBI driver warning: (error while fetching rows)
This problem was raised in 2003, cf.
https://stat.ethz.ch/pipermail/r-help/2003-April/032708.html
But I found here, http://www.mysqlperformanceblog.com/2007/07/06/, that
"If you do not check for error it can look as you've done with result
set while you only processed a portion of it, which can cause rather
hard to catch errors."
Has anyone experienced this problem, and does anyone know how to
solve it?
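A truncated fetch like the one described above can at least be made to fail loudly by comparing the number of rows actually read with the number expected. The sketch below is only an illustration under stated assumptions: fetch_in_chunks and handle_chunk are hypothetical names, expected_rows would come from a separate SELECT COUNT(*) query, and dbFetch() is the current DBI spelling of the fetch() call used in this thread. It does not address the cause of the truncation.

```r
library(DBI)

# Hypothetical helper: read a result set chunk by chunk, pass each chunk
# to handle_chunk(), and stop with an error if fewer (or more) rows
# arrive than expected instead of silently ending early.
fetch_in_chunks <- function(con, sql, expected_rows, chunksize = 50000,
                            handle_chunk = function(chunk) invisible(NULL)) {
  res <- dbSendQuery(con, sql)
  on.exit(dbClearResult(res), add = TRUE)  # always release the result set

  n_read <- 0L
  repeat {
    chunk <- dbFetch(res, n = chunksize)
    if (nrow(chunk) == 0) break
    n_read <- n_read + nrow(chunk)
    handle_chunk(chunk)
    if (dbHasCompleted(res)) break
  }

  if (n_read != expected_rows)
    stop(sprintf("fetched %s rows but expected %s", n_read, expected_rows))
  n_read
}
```

With the poster's setup this might be called as fetch_in_chunks(con, SQLSelectExplVar, expected_rows = lengthData), where con is the existing RMySQL connection.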
Try upgrading R to 2.8.0.
Can you provide your code to the list... or pseudo code so that we
can troubleshoot? Specifically, are you calling fetch() or dbApply()?
Jeff
Thanks in advance
Christophe Dutang
PS: I use R 2.7.2 on Windows XP Pro with RMySQL_0.6-1 and a
MySQL Community Server 5.0