Dear All,
I am pretty new to R and thus my question may sound silly.
Is there a way to automatically generate a series of separate vectors
(so not arranged in a matrix), without typing and changing the values
every time, and to store them as separate *.xlsx files, where the "*" is
replaced by the name of the vector itself?
What I would like to create is a total of 12 vectors, corresponding to
the 12 months (January to December), say for the year 2006; thus the
name of a single resulting vector should be something like
"January2006", and the final file that will be stored in my WD should
have the same name ("January2006.xlsx").
The number of elements in each vector must correspond to the length
in days of the month (considering a non-leap year, 365 days)
multiplied by 2 (e.g. "January2006" will have 31*2=62 elements,
"February2006" will have 28*2=56 elements, and so on).
Finally, the elements of the vectors should be named as:
"010106_aaa","010106_bbb","020106_aaa","020106_bbb", ... ,
"310106_aaa","310106_bbb".
To sum up, at the end of the process I would like to obtain 12 vectors
as follows:
January2006("010106_aaa","010106_bbb","020106_aaa","020106_bbb", ... ,
"310106_aaa","310106_bbb")
.
.
.
.
.
December2006("011206_aaa","011206_bbb","021206_aaa","021206_bbb", ... ,
"311206_aaa","311206_bbb")
Any help would be particularly welcome and appreciated.
Cheers,
NP
creating series of vectors
7 messages · Petr Savicky, ilai, MacQueen, Don +1 more
On Thu, Feb 16, 2012 at 05:32:15PM +0100, Nino Pierantonio wrote:
[...]
Hi.
Try the following function, which creates a list of vectors.
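seqDays <- function(year)
{
    # all days of the year; seq() handles leap years correctly
    x <- seq(as.Date(paste(year, "-01-01", sep="")),
             as.Date(paste(year, "-12-31", sep="")), by="day")
    months <- unique(months(x))  # month names in calendar order (locale-dependent)
    x <- do.call(rbind, strsplit(as.character(x), "-"))  # columns: year, month, day
    x1 <- sprintf("%02d", year %% 100)  # two-digit year
    y <- paste(x[, 3], x[, 2], x1, sep="")  # ddmmyy
    # interleave the "_aaa" and "_bbb" labels for each day
    y <- c(rbind(paste(y, "_aaa", sep=""), paste(y, "_bbb", sep="")))
    x2 <- rep(x[, 2], each=2)  # month number of each element
    out <- split(y, x2)
    names(out) <- paste(months, year, sep="")
    out
}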
out <- seqDays(2006)
out
$January2006
[1] "010106_aaa" "010106_bbb" "020106_aaa" "020106_bbb" "030106_aaa"
[6] "030106_bbb" "040106_aaa" "040106_bbb" "050106_aaa" "050106_bbb"
[11] "060106_aaa" "060106_bbb" "070106_aaa" "070106_bbb" "080106_aaa"
[16] "080106_bbb" "090106_aaa" "090106_bbb" "100106_aaa" "100106_bbb"
[21] "110106_aaa" "110106_bbb" "120106_aaa" "120106_bbb" "130106_aaa"
[26] "130106_bbb" "140106_aaa" "140106_bbb" "150106_aaa" "150106_bbb"
[31] "160106_aaa" "160106_bbb" "170106_aaa" "170106_bbb" "180106_aaa"
[36] "180106_bbb" "190106_aaa" "190106_bbb" "200106_aaa" "200106_bbb"
[41] "210106_aaa" "210106_bbb" "220106_aaa" "220106_bbb" "230106_aaa"
[46] "230106_bbb" "240106_aaa" "240106_bbb" "250106_aaa" "250106_bbb"
[51] "260106_aaa" "260106_bbb" "270106_aaa" "270106_bbb" "280106_aaa"
[56] "280106_bbb" "290106_aaa" "290106_bbb" "300106_aaa" "300106_bbb"
[61] "310106_aaa" "310106_bbb"
$February2006
[1] "010206_aaa" "010206_bbb" "020206_aaa" "020206_bbb" "030206_aaa"
[6] "030206_bbb" "040206_aaa" "040206_bbb" "050206_aaa" "050206_bbb"
[11] "060206_aaa" "060206_bbb" "070206_aaa" "070206_bbb" "080206_aaa"
[16] "080206_bbb" "090206_aaa" "090206_bbb" "100206_aaa" "100206_bbb"
[21] "110206_aaa" "110206_bbb" "120206_aaa" "120206_bbb" "130206_aaa"
[26] "130206_bbb" "140206_aaa" "140206_bbb" "150206_aaa" "150206_bbb"
[31] "160206_aaa" "160206_bbb" "170206_aaa" "170206_bbb" "180206_aaa"
[36] "180206_bbb" "190206_aaa" "190206_bbb" "200206_aaa" "200206_bbb"
[41] "210206_aaa" "210206_bbb" "220206_aaa" "220206_bbb" "230206_aaa"
[46] "230206_bbb" "240206_aaa" "240206_bbb" "250206_aaa" "250206_bbb"
[51] "260206_aaa" "260206_bbb" "270206_aaa" "270206_bbb" "280206_aaa"
[56] "280206_bbb"
$March2006
[1] "010306_aaa" "010306_bbb" "020306_aaa" "020306_bbb" "030306_aaa"
...
Individual vectors may be accessed as out[[i]], their names
as names(out).
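A small illustration of that access pattern, on a made-up two-element list (the list `vecs`, the environment `e`, and the contents are invented for the example). If separate objects are really wanted rather than a list, base R's list2env() can copy the elements into an environment, although the list is usually easier to loop over.

```r
# Toy stand-in for the list returned by seqDays(); contents are illustrative only.
vecs <- list(January2006  = c("010106_aaa", "010106_bbb"),
             February2006 = c("010206_aaa", "010206_bbb"))

names(vecs)               # names of the vectors
vecs[[1]]                 # first vector, by position
vecs[["February2006"]]    # access by name

# Optional: create one separate object per list element in a new environment.
e <- list2env(vecs, envir = new.env())
get("January2006", envir = e)
```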
Storing to text files may be done as follows.
for (i in 1:12) {
    writeLines(out[[i]], con=paste(names(out)[i], ".txt", sep=""))
}
Hope this helps.
Petr Savicky.
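The question asked for xlsx files rather than text files. A possible sketch, assuming the xlsx package (the one used later in this thread) is installed and its Java dependency works. The list `vecs`, the folder `out_dir` and the column name `label` are made up for the example; if the package is not available, the code falls back to CSV files, which Excel can open.

```r
# Toy stand-in for the list of monthly vectors; contents are illustrative only.
vecs <- list(January2006  = c("010106_aaa", "010106_bbb"),
             February2006 = c("010206_aaa", "010206_bbb"))

out_dir <- tempdir()   # assumption: write to a temporary folder for the demo

use_xlsx <- requireNamespace("xlsx", quietly = TRUE)

for (nm in names(vecs)) {
    df <- data.frame(label = vecs[[nm]], stringsAsFactors = FALSE)
    if (use_xlsx) {
        # one workbook per vector, e.g. January2006.xlsx
        xlsx::write.xlsx(df, file.path(out_dir, paste(nm, ".xlsx", sep = "")),
                         row.names = FALSE)
    } else {
        # fallback that needs no extra package
        write.csv(df, file.path(out_dir, paste(nm, ".csv", sep = "")),
                  row.names = FALSE)
    }
}
```

For real use, `out_dir` would be replaced by the working directory or any other existing folder.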
# All days in years 2006 to 2009, by month, in 48 (12x4) files.
days <- seq(as.Date("2006/1/1"), as.Date("2009/12/31"), by="day")  # one long vector
# reformat to the requested style
out <- paste(rep(format(days,'%d%m%y'),each=2), c('aaa','bbb'), sep='_')
# group by month and year; %Y gives names like "January2006"
month <- factor(rep(format(days,'%B%Y'),each=2))
# write one external file per month
for(i in levels(month))
    cat(out[month==i], '\n', file=paste(i,'txt',sep='.'))
Cheers
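The month names produced by format() with %B depend on the locale, and factor levels sort alphabetically. A variant that may be more robust, sketched here for 2006 only: group by month number and take the names from R's built-in month.name constant. The names `labels`, `month_no` and `vecs` are made up for the example.

```r
# One year only, to keep the example small.
days <- seq(as.Date("2006-01-01"), as.Date("2006-12-31"), by = "day")

# ddmmyy labels, two per day: "_aaa" and "_bbb"
labels <- paste(rep(format(days, "%d%m%y"), each = 2), c("aaa", "bbb"), sep = "_")

# Group by month number so the pieces stay in calendar order,
# then name them with the English month names built into R.
month_no <- rep(as.integer(format(days, "%m")), each = 2)
vecs <- split(labels, month_no)
names(vecs) <- paste(month.name, "2006", sep = "")

lengths(vecs)[c("January2006", "February2006")]   # 62 and 56 elements
```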
On Thu, Feb 16, 2012 at 9:32 AM, Nino Pierantonio <nino.p.80 at gmail.com> wrote:
[...]
______________________________________________ R-help at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
4 days later
Thanks Ilai, this helped.
Cheers,
Nino
On 16/02/2012 23:15, ilai wrote:
[...]
Nino Pierantonio Mobile: +39 349.532.9370 Skype: pierantonio_nino
Dear all,
I am using R to work on a large volume of telemetry data, split into one
file per day. Each file (an xlsx file) contains 2 rows, the first one for
sst readings and the second one for chl readings, and 72360 columns, each
corresponding to the centre of a cell in my study area. The columns have
no headings. Many cells hold placeholder readings (-999.0000000). What I
want to do is merge the files together by month and by season, replace
the placeholder values with NA, and then calculate the row averages for
both sst and chl. I have stored the files in the directory C:/TEMP. This
directory contains 12 subfolders, January to December, and each subfolder
contains one file per day of the month (e.g. January 31 files,
February 28 files, and so on).
I already have commands that work properly, but I would really like to
know whether it is possible to reduce their number and maybe run some of
them automatically. I work month by month, as follows (I am aware this is
not the most elegant way to do it; I'm new to R and for the moment
"elegance & style" is not my main goal):
setwd("C:/Temp/January09")  # set my working directory
library(xlsx)  # needed to handle the original *.xlsx files
list.jan09 <- list.files("C:/Temp/January09", full.names=TRUE)
read.all.jan09 <- lapply(list.jan09, read.xlsx, 1, header=FALSE)
# create a data frame containing all my data
daily.all.jan09 <- do.call("cbind", read.all.jan09)
# second data frame with only the sst readings (the first column of each
# daily file); it has 31 columns and 72360 rows
daily.sst.jan09 <- daily.all.jan09[, seq(from=1, to=61, by=2)]
# third data frame with only the chl readings (the second column of each
# daily file); it has 31 columns and 72360 rows
daily.chl.jan09 <- daily.all.jan09[, seq(from=2, to=62, by=2)]
# replace -999.0000000 values with NA
daily.sst.jan09 <- replace(daily.sst.jan09, daily.sst.jan09==-999.0000000, NA)
# vector containing the mean sst value of each row
jan09_avgsst <- rowMeans(daily.sst.jan09)
# store the sst vector
write.xlsx(jan09_avgsst, "C:/Users/AAA/Desktop/Data/january09_avgsst.xlsx")
# replace -999.0000000 values with NA
daily.chl.jan09 <- replace(daily.chl.jan09, daily.chl.jan09==-999.0000000, NA)
# vector containing the mean chl value of each row
jan09_avgchl <- rowMeans(daily.chl.jan09)
# store the chl vector
write.xlsx(jan09_avgchl, "C:/Users/AAA/Desktop/Data/january09_avgchl.xlsx")
I repeat these same commands for all the months and for the seasons
(January-March; April-June; July-September; October-December), so the
whole thing is a bit redundant.
How can I speed up the process, reduce the number of commands, and maybe
automate them? Many thanks for your help.
Cheers,
Nino
1 day later
Are you absolutely certain that the data must be stored in Excel?
In the long run I believe you will find it easier if the data is stored in
an external database, or some other data repository that does not require
you to read so many separate files.
Probably the best you can hope for as it is now is to put these commands
inside a loop, or nested loops, with the input and output file names
constructed from the loop indexes [see help('paste') for constructing file
names].
-Don
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
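A rough sketch of the loop idea described above, not a tested solution for the real data. It assumes each daily file holds two columns (sst, chl) without a header and that -999 marks a missing reading. The function `monthly_means`, its `reader` argument, the folder `base` and the demo files are all made up; the demo uses small CSV files so that it runs without the xlsx package. Note the `na.rm = TRUE`: without it, rowMeans() returns NA for every row that contains at least one NA.

```r
# Mean per row (grid cell) over all daily files in one folder.
# Assumptions: two columns per file (sst, chl), no header, -999 = missing;
# `reader` is any function that takes a file path and returns a data frame.
monthly_means <- function(dir, reader = function(f) read.csv(f, header = FALSE)) {
    files <- list.files(dir, full.names = TRUE)
    all_days <- as.matrix(do.call(cbind, lapply(files, reader)))
    all_days[all_days == -999] <- NA
    odd <- seq(1, ncol(all_days), by = 2)   # sst columns; chl columns are odd + 1
    data.frame(sst = rowMeans(all_days[, odd, drop = FALSE], na.rm = TRUE),
               chl = rowMeans(all_days[, odd + 1, drop = FALSE], na.rm = TRUE))
}

# Demo on made-up data: two month folders with two tiny daily files each.
base <- file.path(tempdir(), "telemetry_demo")
months <- c("January09", "February09")
for (m in months) {
    dir.create(file.path(base, m), recursive = TRUE, showWarnings = FALSE)
    for (d in 1:2) {
        day <- data.frame(sst = c(10 + d, -999, 12), chl = c(0.1 * d, 0.2, -999))
        write.table(day, file.path(base, m, paste("day", d, ".csv", sep = "")),
                    sep = ",", row.names = FALSE, col.names = FALSE)
    }
}

# The loop: input folder and output file name are both built from the month name.
for (m in months) {
    res <- monthly_means(file.path(base, m))
    write.csv(res, file.path(base, paste(m, "_avg.csv", sep = "")), row.names = FALSE)
}
```

With the real files, something like `reader = function(f) xlsx::read.xlsx(f, 1, header = FALSE)` could presumably be passed instead, mirroring the read.xlsx call in the original commands. Rows where every reading is missing come out as NaN.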
On 2/21/12 9:45 AM, "Nino Pierantonio" <nino.p.80 at gmail.com> wrote:
[...]
Thanks, Don, for your suggestion.
I received the original data in Excel xlsx format, so I have to work with
it as it is unless I want to convert thousands of files. I am also saving
my R output in Excel format to keep it compatible with the original files.
After some basic data management, everything will be stored in a proper
database for further analysis.
Nino
On 23/02/2012 00:49, MacQueen, Don wrote:
[...]