
error message from read.csv in loop

9 messages · Kai Yang, Eric Berger, Migdonio González +3 more

#
Hello List,
I use a for loop to read different csv files into the data frame rr. The data frame rr is deleted after a comparison, and the loop moves on to the next csv file. Below is my code:
for (j in 1:nrow(ora))
{
  mycol  <- ora[j,"fname"]
  mycsv  <- paste0(mycol,".csv'")
  rdcsv  <- noquote(paste0("'w:/project/_Joe.B/Oracle/data/", mycsv))
  rr     <- read.csv(rdcsv)
}
But when I run this code, I get the error message below:
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file ''w:/project/_Joe.B/Oracle/data/ASSAY_DEFINITIONS.csv'': No such file or directory

So I checked rdcsv and printed it out, see below:
[1] 'w:/project/_Joe.B/Oracle/data/ASSAY_DEFINITIONS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/ASSAY_DISCRETE_VALUES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/ASSAY_QUESTIONS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/ASSAY_RUNS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/DATA_ENTRY_PAGES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/DISCRETE_VALUES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/ENTRY_GROUPS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/GEMD_CODELIST_GROUPS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/GEMD_CODELIST_VALUES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/GEMD_LOT_DEFINITIONS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/GEMD_SAMPLES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/MOLECULAR_WAREHOUSE.csv'
[1] 'w:/project/_Joe.B/Oracle/data/QUESTION_DEFINITIONS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/QUESTION_GROUPS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/RESPONDENTS.csv'
[1] 'w:/project/_Joe.B/Oracle/data/RESPONSES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/SAMPLE_LIST.csv'
[1] 'w:/project/_Joe.B/Oracle/data/SAMPLE_LIST_NAMES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/SAMPLE_PLATE_ADDRESSES.csv'
[1] 'w:/project/_Joe.B/Oracle/data/STORAGE_UNITS.csv'
It seems correct. I copied and pasted one of the paths into a line of code:
rr <- read.csv('w:/project/_Joe.B/Oracle/data/RESPONDENTS.csv')
and it works fine.
Can someone help me find the problem in my for loop code?
Thanks,
Kai
#
It complained about ASSAY_DEFINITIONS, not about RESPONDENTS.
Can you try with the ASSAY_DEFINITIONS file?


On Fri, Jul 9, 2021 at 9:10 PM Kai Yang via R-help <r-help at r-project.org>
wrote:

  
  
#
It seems that your problem is that you are using single quotes inside of
the double quotes. This is not necessary. Here is the corrected for-loop:

for (j in 1:nrow(ora))
{
        mycol  <- ora[j,"fname"]
        mycsv  <- paste0(mycol,".csv")
        rdcsv  <- noquote(paste0("w:/project/_Joe.B/Oracle/data/", mycsv))
        rr     <- read.csv(rdcsv)
}
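
One likely reason the printed paths looked fine is that noquote() only changes how a string is printed, not what it contains. The sketch below uses a shortened, made-up path (w:/data/ is purely illustrative) to show that the single quotes pasted into the string are real characters, even though the noquote() print-out looks like an ordinary quoted path:

```r
# Hypothetical, shortened path used only for illustration.
bad  <- paste0("'w:/data/", "RESPONDENTS", ".csv'")  # quotes pasted into the string
good <- paste0("w:/data/", "RESPONDENTS", ".csv")    # plain path

print(noquote(bad))  # prints [1] 'w:/data/RESPONDENTS.csv'
print(good)          # prints [1] "w:/data/RESPONDENTS.csv"

# The embedded quotes count as characters of the file name.
extra_chars <- nchar(bad) - nchar(good)  # 2
first_char  <- substr(bad, 1, 1)         # a single quote character
```

Printing with print(bad) instead of print(noquote(bad)) would have shown the doubled quoting directly.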

Also note that the rr variable will only store the last CSV, not all of them.
You will need to initialize the rr variable as a list to store all CSVs if
that is what you require. Something like this:

# Initialize the rr variable as a list.
rr <- as.list(rep(NA, nrow(ora)))

# Run the for-loop to store all the CSVs in rr.
for (j in 1:nrow(ora))
{
        mycol  <- ora[j,"fname"]
        mycsv  <- paste0(mycol,".csv")
        rdcsv  <- noquote(paste0("w:/project/_Joe.B/Oracle/data/", mycsv))
        rr[[j]]     <- read.csv(rdcsv)
}

Regards
Migdonio G.

On Fri, Jul 9, 2021 at 1:10 PM Kai Yang via R-help <r-help at r-project.org>
wrote:

  
  
#
Hi Migdonio,
I did try your code:
# Initialize the rr variable as a list.
rr <- as.list(rep(NA, nrow(ora)))

# Run the for-loop to store all the CSVs in rr.
for (j in 1:nrow(ora))
{
        mycol  <- ora[j,"fname"]
        mycsv  <- paste0(mycol,".csv")
        rdcsv  <- noquote(paste0("w:/project/_Joe.B/Oracle/data/", mycsv))
        rr[[j]]     <- read.csv(rdcsv)
}

This code works, but rr is not a data frame. R says: Large list ( 20 elements .....). How can I use the data frames in it one by one?
Thank you for your help
Kai
On Friday, July 9, 2021, 11:39:59 AM PDT, Migdonio González <migdonio.gonzalez02 at gmail.com> wrote:
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
Hello Kai,

Just as you did to store the data inside of rr. Try class(rr[[1]]) or
class(rr[[2]]) and so on to explore a bit more. The variable rr is a list
that contains dataframes within it. To access the dataframes you must use
the syntax rr[[i]] where i is the index of the element of the list (or the
number of the dataframe in your case). For example:

df1 <- rr[[1]]
class(df1) # Check if this is class "data.frame".

df2 <- rr[[2]]
class(df2)

You can also try other ways to store the dataframes more efficiently. This
is just a quick-and-dirty solution to the code you provided. I recommend
reading more about lists in R to understand how they work and how they
differ from other data structures like vectors.
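
As a small sketch of working with such a list, the example below builds two made-up data frames (the contents are invented for illustration, only the names echo the files in this thread) and shows the difference between [[ ]] and [ ], plus two ways of going through the data frames one by one:

```r
# Two small stand-in data frames; the values are invented for illustration.
rr <- list(
  RESPONDENTS = data.frame(id = 1:3, age = c(34, 51, 29)),
  RESPONSES   = data.frame(id = c(1, 1, 2, 3), answer = c("a", "b", "a", "c"))
)

class(rr)                   # "list"
class(rr[["RESPONDENTS"]])  # "data.frame"
class(rr["RESPONDENTS"])    # "list": single brackets return a sub-list

# Work through the data frames one at a time.
for (nm in names(rr)) {
  df <- rr[[nm]]
  message(nm, ": ", nrow(df), " rows, ", ncol(df), " columns")
}

# Or compute one value per data frame in a single call.
n_rows <- sapply(rr, nrow)
```

Naming the list elements (for example with names(rr) <- ora$fname, assuming fname holds the file names as in the original loop) makes it easier to tell which data frame came from which file.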

Hope this helps,
Warm regards.
Migdonio G.
On Fri, Jul 9, 2021 at 2:24 PM Kai Yang <yangkai9999 at yahoo.com> wrote:

            

  
  
#
Hello,

1. When there are systematic errors, use ?try or, better yet, ?tryCatch.
Something like the code below will create a list of errors and read in 
the data if none occurred.
The code starts by creating an empty list for tryCatch results. It uses 
?file.path instead of noquote/paste0 to assemble the file name and wraps 
tryCatch around read.csv. Then, after the for loop, it gets the wrong 
reads and displays the error messages.
2. I'm assuming that after processing the data file by file you discard 
the data.frame if it's read without problems and move on to the next 
one.


ok <- vector("list", nrow(ora))
for (j in 1:nrow(ora))
{
  mycol  <- ora[j,"fname"]
  mycsv  <- paste0(mycol, ".csv")   # no stray quote inside the string
  rdcsv  <- file.path("w:/project/_Joe.B/Oracle/data", mycsv)
  rr <- tryCatch(read.csv(rdcsv), error = function(e) e)
  if(inherits(rr, "error"))
    ok[[j]] <- rr                   # index with the loop variable j
  else ok[[j]] <- TRUE
}

i_err <- sapply(ok, inherits, "error")
for(e in ok[i_err]) message(e$message)


3. It is also possible to store them all in a list, together with the 
errors, and to process the data frames that were read without problems 
afterwards.


fun <- function(j, data = ora, mypath = "w:/project/_Joe.B/Oracle/data")
{
  mycol  <- data[j, "fname"]
  mycsv  <- paste0(mycol, ".csv")
  rdcsv  <- file.path(mypath, mycsv)
  tryCatch(read.csv(rdcsv), error = function(e) e)
}

df_list <- lapply(seq_len(nrow(ora)), fun)
i_err <- sapply(df_list, inherits, "error")
df_list[!i_err]  # these are ok

# hypothetical processing strategy
processing_results <- lapply(df_list[!i_err], function(rr) {
  # code goes here
  # ...etc...
})
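

To try the tryCatch pattern without access to the w: drive, here is a self-contained sketch that runs in a temporary directory. The directory, the file names PRESENT and MISSING, and the helper read_one are all made up for this illustration; only the tryCatch(read.csv(...), error = function(e) e) idiom is taken from the code above.

```r
# Throwaway directory with one real file; "MISSING" deliberately has no file.
mypath <- tempfile("csvdemo")
dir.create(mypath)
write.csv(data.frame(x = 1:2), file.path(mypath, "PRESENT.csv"),
          row.names = FALSE)
fnames <- c("PRESENT", "MISSING")

# Hypothetical helper: returns a data.frame, or the error object on failure.
read_one <- function(fname) {
  rdcsv <- file.path(mypath, paste0(fname, ".csv"))
  tryCatch(read.csv(rdcsv), error = function(e) e)
}

results <- lapply(fnames, read_one)
names(results) <- fnames

# Logical vector: TRUE where the read failed.
failed <- sapply(results, inherits, "error")
names(results)[failed]   # the files that could not be read
results[!failed]         # the data frames that were read
```

R still prints its "cannot open file" warning for the missing file; the error itself is captured and stored in the list rather than stopping the run.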


Hope this helps,

Rui Barradas



At 19:01 on 09/07/2021, Kai Yang via R-help wrote:

  
    
#
"It seems that your problem is that you are using single quotes inside of
the double quotes."

That is FALSE. From ?Quotes:
"Single and double quotes delimit character constants. They can be
used interchangeably but double quotes are preferred (and character
constants are printed using double quotes), so single quotes are
normally only used to delimit character constants containing double
quotes."

Of course, pairs of each type of quote must properly match, must not
get confused with quotes in the delineated string, etc. , but they are
otherwise interchangeable. The whole of ?Quotes, especially the
examples, is informative and worth the read (imo).
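
A short sketch of the distinction, using a file name from this thread purely as sample text: the delimiter type does not matter, but quote characters inside the delimiters become part of the string.

```r
# Delimiters are interchangeable: both lines create the same string.
same <- identical("RESPONDENTS.csv", 'RESPONDENTS.csv')  # TRUE

# Quotes *inside* the delimiters are ordinary characters of the string.
plain  <- "RESPONDENTS.csv"
quoted <- "'RESPONDENTS.csv'"
nchar(plain)    # 15
nchar(quoted)   # 17
differs <- !identical(plain, quoted)                     # TRUE
```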

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Sat, Jul 10, 2021 at 8:20 AM Migdonio González
<migdonio.gonzalez02 at gmail.com> wrote:
#
On 10/07/2021 12:30 p.m., Bert Gunter wrote:
I think Migdonio is right.  From the error message the problem is that 
the filename was being specified as 
"'w:/project/_Joe.B/Oracle/data/ASSAY_DEFINITIONS.csv'"

That is not a legal filename:  the single quotes probably tell Windows 
to interpret it as a single filename entry, not drive, path, filename. 
Or maybe the drive is being interpreted as "'w", which isn't a legal 
drive id.

In any case, if you set f to an existing full path, and file.exists(f) 
returns TRUE, you'll find that file.exists(paste0("'", f, "'")) returns 
FALSE.
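
That check can be reproduced anywhere with a temporary file. This is a minimal sketch; the file is created with tempfile(), so the path is whatever R picks for the session.

```r
# Create a real file so that f is an existing full path.
f <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1), f, row.names = FALSE)

plain_ok  <- file.exists(f)                    # TRUE
quoted_ok <- file.exists(paste0("'", f, "'"))  # FALSE: quotes are part of the name

unlink(f)  # clean up the temporary file
```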

Duncan Murdoch
#
Thank you very much for the clarification. I will try to use more precise
language next time.

Warm regards
Migdonio G.
On Sat, Jul 10, 2021 at 11:30 AM Bert Gunter <bgunter.4567 at gmail.com> wrote: