Hello, I am working on a project. New data files arrive as the data collectors gather data; the collectors put these new files in a folder. I need to read the new data files once they are in the folder. So far I have done this manually: each time, I go to the folder, find the new data files, and then use my R program to read them. I am wondering if anyone knows how to do this automatically in R. Thanks, jlm
Read files in a folder when new data files come
5 messages · jim holtman, Carlos J. Gil Bellosta, Barry Rowlingson +1 more
You can read the status of every file in a directory and make the decision to process it. One technique is to create a file in the directory the last time that you processed information from the directory. You could schedule an R script to first read in your 'flag' file and determine the date it was created and then get all the files in the directory that are later than that date to process them. You would then rewrite your flag file to update its modification date for the next round. Does this do what you want?
On Sun, Jan 24, 2010 at 3:05 PM, jlfmssm <jlfmssm at gmail.com> wrote:
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem that you are trying to solve?
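The flag-file technique described in the reply above could be sketched roughly as follows. This is a minimal illustration, not code from the thread: the function name process_new_files(), the flag file name "last_processed.flag", and the handler argument are all assumptions made for the example.

```r
# Sketch of the flag-file approach. All names here are hypothetical:
# process_new_files(), "last_processed.flag", and the 'handler' argument.
process_new_files <- function(data_dir, handler,
                              flag_name = "last_processed.flag") {
  flag <- file.path(data_dir, flag_name)

  # On the first run there is no flag file yet, so every file counts as new.
  last_run <- if (file.exists(flag)) {
    file.info(flag)$mtime
  } else {
    as.POSIXct(0, origin = "1970-01-01", tz = "UTC")
  }

  # List everything in the directory except the flag file itself.
  files <- list.files(data_dir, full.names = TRUE)
  files <- files[basename(files) != flag_name]

  # Keep regular files modified after the flag file was last written.
  info <- file.info(files)
  new_files <- files[!info$isdir & info$mtime > last_run]

  for (f in new_files) handler(f)

  # Rewrite the flag file so its modification time marks this run.
  writeLines(format(Sys.time()), flag)
  invisible(new_files)
}

# Possible usage, with a made-up folder and read.csv() standing in for
# whatever reads your data format:
# process_new_files("C:/data/incoming", function(path) print(head(read.csv(path))))
```

A script containing a call like this could then be run on a schedule. Note that a file arriving while the handlers are still running may have a modification time earlier than the rewritten flag and would then be skipped on the next run; whether that matters depends on how long processing takes.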
Hello, Could you tell us something more about your infrastructure? Windows? Linux? On Unix/Linux you could use cron to have an R process read all the files in the given directory, process them one by one, and archive them in another place. On Windows, no idea. Alternatively, you could perhaps ask your users to use some kind of web interface to upload the data. This interface could then trigger an R process. Best regards, Carlos J. Gil Bellosta http://www.datanalytics.com
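The "process them one by one and archive them" idea could look something like the sketch below, which a scheduler such as cron would run periodically. The function name process_and_archive(), both directory arguments, and the default of read.csv() as the reader are assumptions for illustration only.

```r
# Sketch: read every file in an incoming folder, then move it to an archive
# folder so it is not picked up again. All names are hypothetical.
process_and_archive <- function(incoming_dir, archive_dir, handler = read.csv) {
  dir.create(archive_dir, showWarnings = FALSE, recursive = TRUE)

  files <- list.files(incoming_dir, full.names = TRUE)
  files <- files[!file.info(files)$isdir]   # skip sub-directories

  results <- list()
  for (f in files) {
    results[[basename(f)]] <- handler(f)

    # Copy then delete rather than file.rename(), which can fail when the
    # two folders are on different drives or file systems.
    target <- file.path(archive_dir, basename(f))
    if (file.copy(f, target, overwrite = TRUE)) file.remove(f)
  }
  results
}

# Possible usage (folder names are made up):
# data <- process_and_archive("incoming", "archive")
```

Because processed files leave the incoming folder, anything still in it on the next run is by construction unprocessed, so no timestamps need to be compared.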
Without needing some operating-system specific hackery, the easiest
way would be to use 'list.files()' and look for new files every so
many minutes or seconds (depending on how urgent it is). Or to check
file.info() on your directory and test the modification time. You'd
then write that into a .R file and run that in the background using
your operating system's background job functionality (as a 'service'
in Windows, or as a background process in Unix). Use
Sys.sleep(seconds) to wait in your loop. Something like (totally
untested):
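# 'dumpLocation' is the path of the watched directory and
# 'doSomethingWithStuffIn' is your own processing function.
lastChange <- file.info(dumpLocation)$mtime
while (TRUE) {
  # a directory's mtime typically changes when files are added or removed
  currentM <- file.info(dumpLocation)$mtime
  if (currentM != lastChange) {
    lastChange <- currentM
    doSomethingWithStuffIn(dumpLocation)
  }
  # try again in 10 minutes
  Sys.sleep(600)
}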
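The list.files() variant mentioned at the start of this message might be sketched as below. Unlike the directory-mtime check, it also tells you which files are new. The names poll_once() and watch_dir() and the handler argument are hypothetical, chosen only for this example.

```r
# Sketch: detect new files by comparing directory listings between polls.
# poll_once(), watch_dir() and 'handler' are hypothetical names.
poll_once <- function(dir, seen, handler) {
  # Anything listed now that was not listed before is treated as new.
  fresh <- setdiff(list.files(dir), seen)
  for (f in fresh) handler(file.path(dir, f))
  c(seen, fresh)   # updated record of file names already handled
}

watch_dir <- function(dir, handler, interval = 600) {
  seen <- list.files(dir)   # files present at start-up are not "new"
  repeat {
    Sys.sleep(interval)     # wait before checking again
    seen <- poll_once(dir, seen, handler)
  }
}

# Possible usage (runs until interrupted; the folder name is made up):
# watch_dir("C:/data/incoming", function(path) message("new file: ", path))
```

The record of seen names lives only in memory here, so after a restart every file already in the folder is treated as old; persisting it to disk would be needed to survive restarts.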
There are ways for programs to get directory content change events
when files appear in directories, but they will probably be very
operating system specific. There's also the problem of your code
firing up when a file is only half-uploaded - what do you do then?
Does your data format have an 'end of data' marker?
Barry
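For the half-uploaded file problem raised above, two possible checks are sketched here. Both are assumptions rather than anything confirmed in the thread: is_stable() is only a heuristic based on file size, and has_end_marker() presumes the data format ends with a marker line, for which "#END" is an invented placeholder.

```r
# Heuristic: a file whose size does not change over a short wait is
# probably complete. An upload that stalls for longer than 'wait' seconds
# would still pass, so this is not a guarantee.
is_stable <- function(path, wait = 5) {
  size_before <- file.info(path)$size
  Sys.sleep(wait)
  size_after <- file.info(path)$size
  !is.na(size_before) && !is.na(size_after) && size_before == size_after
}

# More reliable if the format has an 'end of data' marker. The marker
# text "#END" is hypothetical; substitute whatever your format uses.
has_end_marker <- function(path, marker = "#END") {
  lines <- readLines(path, warn = FALSE)
  identical(lines[length(lines)], marker)
}

# Possible usage inside a polling loop (process() is a placeholder):
# if (is_stable(f) && has_end_marker(f)) process(f)
```

Another common arrangement, if the data collectors can cooperate, is to upload under a temporary name and rename the file once the transfer is finished, so the watcher only ever sees complete files.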
Thank you for your reply. Yes, this is what I want to do. I am working on Windows. The data files are located in a folder on a data server. The data collectors put new data on the server as soon as they get it. What I want is for my R program to process the new data files once it finds that new data has arrived in that folder on the data server. Thanks, jlm
On Sun, Jan 24, 2010 at 2:12 PM, jim holtman <jholtman at gmail.com> wrote: