Hello list members!

I'm trying to load some data into an R session using the source() function with a URL as its argument. The data source is a PHP script on an Apache web server, and the data is a long list generated on the fly. These are the initial lines:

  groups<-list()
  groups[['ENSMUST00000000001']]=c(52611,483683,147952,132170,297514,469248,291525,364037,469915,55472,280220,314688,415650,486875,440898,6781,497785)
  groups[['ENSMUST00000000003']]=c(416911,327120,425495,72272,297529,101933,371418,139034,318872,367204,237702)
  groups[['ENSMUST00000000028']]=c(199311,325400,184761,241988,376845,75052,67724,404240,439543,391057,393816)
  groups[['ENSMUST00000000031']]=c(402587,352900,139030,186068,463553,328881,74942,277085,301431,256149,410846)
  groups[['ENSMUST00000000033']]=c(12700,23908,11140,122358,389908,390084,383903,354007,457965,106395,131876)
  groups[['ENSMUST00000000049']]=c(59336,203239,101077,382882,327374,281549,212042,275594,361523,490934,240275)
  groups[['ENSMUST00000000056']]=c(409571,304584,394332,379699,13785,4260,288889,42538,304075,47734,485512,52501,328509,504846,334607,82566,250088,150240,16422,446551,314484,91878,124752,341638,379512,379890,319764,8019,59221,156508,362524,74001,149400)
  groups[['ENSMUST00000000058']]=c(26511,455190,466368,358528,268486,315461,149260,422804,137641,163718,352555)

The problem: when I execute the command it apparently finishes OK, with no errors printed, but when I check the consistency of the loaded data with length() I get a different figure every time.

More facts: when I source the data from a static file instead of a URL, the data is loaded completely and the length is always the same (20346 list elements). It takes 30 seconds to load. When I source the data dynamically, from a URL, it takes 2 minutes and the data is always truncated.

Tried and miserably failed:
- Changed .Options$timeout from 60 to 300.
- Using R --verbose is of no help; the data is silently truncated.
- Changed the expression in which the data is entered:

  groups<-list(
  'ENSMUST00000000001'=c(52611,483683,147952,132170,297514,469248,291525,364037,469915,55472,280220,314688,415650,486875,440898,6781,497785),
  'ENSMUST00000000003'=c(416911,327120,425495,72272,297529,101933,371418,139034,318872,367204,237702)
  ...
  )

Kind list members, is there some timeout I am missing? Some way to debug the process? Any suggestions?

Sincerely, thank you!

Alberto de Luis
www.cicancer.org
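For readers trying the timeout change mentioned above: the documented way to change the download timeout is options(), not direct assignment to .Options. A minimal sketch follows; the value 300 is the one from the post, and nothing in this thread establishes that the timeout is the cause of the truncation.

```r
# Raise the connection timeout for this session and remember the old value.
old <- options(timeout = 300)

# Confirm the setting actually took effect before retrying the download.
during <- getOption("timeout")
print(during)

# ... retry the source() call against the URL here ...

# Restore the previous setting afterwards.
options(old)
```

If getOption("timeout") does not report the new value, the earlier change never reached the code that opens the connection.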
Problems with source() function
5 messages · Duncan Temple Lang, Al, Seth Falcon
Does

  source(textConnection(readLines(url("http://..."))))

give the correct answer? If not, what is being dropped when you just use readLines() and look at the contents of the download? And how long is the longest line?

The RCurl package (http://www.omegahat.org/RCurl) gives you a lot of control in performing and processing HTTP requests, allowing you to control the request and to read the body and the header of the response. It may be worth a try if things are getting frustrating.

D.
______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
--
Duncan Temple Lang                duncan at wald.ucdavis.edu
Department of Statistics          work: (530) 752-4782
371 Kerr Hall                     fax: (530) 752-7099
One Shields Ave.
University of California at Davis
Davis, CA 95616, USA
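One way to act on the readLines() suggestion is to read the whole response first and evaluate it only if it looks complete. The sketch below assumes the server can append a final marker line to its output; the marker text "# END OF DATA" and the helper name source_if_complete() are hypothetical, and eval(parse(text = ...)) stands in for source(textConnection(...)).

```r
# Hypothetical helper: evaluate downloaded R code only when the last line
# is the agreed end marker, so a truncated download fails loudly.
source_if_complete <- function(lines, sentinel = "# END OF DATA",
                               envir = parent.frame()) {
  n <- length(lines)
  if (n == 0L || !identical(lines[n], sentinel)) {
    stop("download looks truncated: got ", n, " lines and no end marker")
  }
  eval(parse(text = lines), envir = envir)
  invisible(n)
}

# Offline demonstration with made-up lines; in real use the lines would
# come from readLines() on the URL connection.
complete  <- c("groups <- list()",
               "groups[['a']] <- c(1, 2, 3)",
               "# END OF DATA")
truncated <- complete[1:2]

env <- new.env()
source_if_complete(complete, envir = env)

# The truncated copy is rejected instead of being loaded silently.
ok <- tryCatch({
  source_if_complete(truncated, envir = env)
  TRUE
}, error = function(e) FALSE)
```

The marker is a comment, so it is harmless if the same file is passed to source() directly.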
Thank you for your answer :)

I've tested your suggestion, but without success. The remote load is silently truncated using

  source(textConnection(readLines(url("http://..."))))

When I look at the contents there is no fixed break point; it is different each time I execute the command. Therefore the dropped lines are different every time. It seems the only constant is the time of the interruption (1 min 55 secs on my system).

Loading the file in a browser (it always loads completely) and examining the text, there is no apparent malformation at the break points.

The longest line is 669 chars and is loaded perfectly in both remote and local mode:

> lineas <- readLines("transcripts_moe430a.R")
> length(lineas)
[1] 20347
> max(nchar(lineas))
[1] 669
> which(nchar(lineas)==669)
[1] 3241

> lineas <- readLines(url("http://10.10.10.3:83/probefinder/scripts/probegrouper.php?chip=moe430a&mode=transcript"))
> length(lineas)
[1] 7471
> max(nchar(lineas))
[1] 669
> which(nchar(lineas)==669)
[1] 3242

Apparently there is a timeout in url() or some subordinate function.

I will try the RCurl package but, for educational purposes, I would prefer the load to be handled in a simple way, with a source() call for example, so as not to overload students with tricky methods...

Thank you again.

.....................
Alberto de Luis
Bioinformatics and Functional Genomics Lab
Cancer Research Center
Salamanca (Spain)
.....................
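To see exactly which lines a truncated download drops, the local and remote results of readLines() can be compared directly. This is a minimal sketch: compare_downloads() is a hypothetical helper, and the two vectors below are made-up stand-ins for the local file and the URL download.

```r
# Hypothetical helper: report which reference lines are absent from the
# candidate and the first position at which the two vectors disagree.
compare_downloads <- function(reference, candidate) {
  n <- min(length(reference), length(candidate))
  idx <- seq_len(n)
  list(
    n_reference = length(reference),
    n_candidate = length(candidate),
    missing     = setdiff(reference, candidate),
    first_diff  = which(reference[idx] != candidate[idx])[1]
  )
}

# Offline demonstration: the "remote" copy lacks lines 4, 9 and 10.
reference <- paste0("groups[['id", 1:10, "']] <- c(", 1:10, ")")
candidate <- reference[-c(4, 9, 10)]

res <- compare_downloads(reference, candidate)
str(res)
```

first_diff is NA when the shorter vector is a clean prefix of the longer one, which would point to a plain cut-off rather than lines lost in the middle.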
On Thu, 2005-10-27 at 12:35 -0700, Duncan Temple Lang wrote:
Does source(textConnection(readLines(url("http://...")))) give the correct answer? If not, what is being dropped when you just use readLines() and look at the contents of the download? And how long is the longest line? The RCurl package (http://www.omegahat.org/RCurl) gives you a lot of control in performing and processing HTTP requests, allowing you to control the request and to read the body and the header of the response. It may be worth a try if things are getting frustrating. D.
On 28 Oct 2005, aldeluis at usal.es wrote:
Thank you for your answer :) I've tested your suggestion, but without success. The remote load is silently truncated using source(textConnection(readLines(url("http://...")))). When I look at the contents there is no fixed break point; it is different each time I execute the command. Therefore the dropped lines are different every time. It seems the only constant is the time of the interruption (1 min 55 secs on my system).
I'm not sure exactly what you are trying to accomplish, but I wonder if either of the following two ideas would help you:

1. Instead of source(), consider loading the R code on the server side and then using save(..., compress=TRUE) to create a binary image. You can then feed that to the clients and have them use load().

2. What happens if you gzip the code before sending and gunzip it on the client side? It may be less convenient, although supposedly there is a way to do the equivalent of gzfile(url(...)).

HTH,

+ seth
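Both ideas can be tried locally before touching the server. The sketch below uses temporary files and a made-up two-element list as assumptions; for the second idea it wraps the connection in gzcon(), which is base R's way of decompressing a gzip stream read through another connection, so a URL connection opened in binary mode could take the place of file() here.

```r
# Made-up sample data standing in for the real 20346-element list.
groups <- list(ENSMUST00000000001 = c(52611, 483683),
               ENSMUST00000000003 = c(416911, 327120))

# Idea 1: write a compressed binary image, then load() it on the client.
img <- tempfile(fileext = ".RData")
save(groups, file = img, compress = TRUE)
env <- new.env()
load(img, envir = env)

# Idea 2: gzip the R source text, then read it back through gzcon().
gz <- tempfile(fileext = ".R.gz")
out <- gzfile(gz, "w")
writeLines(c("groups2 <- list()",
             "groups2[['ENSMUST00000000001']] <- c(52611, 483683)"), out)
close(out)

con <- gzcon(file(gz, "rb"))  # a binary-mode url() connection could go here
lines <- readLines(con)
close(con)

# Evaluate the decompressed code, as source() would.
eval(parse(text = lines), envir = env)
```

A binary image also avoids parsing tens of thousands of assignment lines on the client, which may matter given the 30-second local load time reported earlier in the thread.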
I'm trying to feed data generated on the fly by a PHP script to the R
function source(), passing the arguments in the URL using the GET method
("http://someserver.com/script.php?a=343&b=873"). If the data is not
generated on the fly, the user has to wait longer and fetch the data in
more than one step.
I'm aiming for a simple one-step method, but for some reason the source()
function silently truncates the data. I will try your suggestion of a
binary file if I can generate the gzip stream on the fly...
Thank you!
.....................
Alberto de Luis
Bioinformatics and Functional Genomics Lab
Cancer Research Center
Salamanca (Spain)
.....................
On Fri, 2005-10-28 at 06:57 -0700, Seth Falcon wrote:
I'm not sure exactly what you are trying to accomplish, but I wonder if either of the following two ideas would help you: 1. Instead of source(), consider loading the R code on the server side and then using save(..., compress=TRUE) to create a binary image. You can then feed that to the clients and have them use load(). 2. What happens if you gzip the code before sending and gunzip it on the client side? It may be less convenient, although supposedly there is a way to do the equivalent of gzfile(url(...)). HTH, + seth