
Multiple plots and postscripts using split function

5 messages · fd, David L Carlson, Don McKenzie +2 more

fd
#
Hi,

I'm relatively new to R and I would like to do the following:

I have a .csv file with four columns (NAME, ID, YEAR, VALUE) and would like
to do several xy plots with the year on the x-axis and the data values
(measurements) on the y-axis and after that export the different plots to
postscript.

My .csv file looks something like this (only an example):

NAME                ID      YEAR    VALUE
ADAMS               885     1988    -2
ADAMS               885     1989    0
BAHIA DEL DIABLO    2665    1999    4
BAHIA DEL DIABLO    2665    2000    8
BAHIA DEL DIABLO    2665    2001    19
BAHIA DEL DIABLO    2665    2002    13
BAHIA DEL DIABLO    2665    2003    13
BARTLEY             893     1983    0
BARTLEY             893     1984    -1
BARTLEY             893     1985    0
BARTLEY             893     1988    2
BARTLEY             893     1989    -1
CANADA              877     1972    -1

I have split the different items into groups. I'd like each plot to use the
NAME as its title, while the exported postscript file should use the ID as
its filename.

My code so far:

#Set Working Directory:
setwd("/Users/Desktop/FV")
# Read CSV
dat <- read.csv("FV.csv", sep=";", header=TRUE)
# Split Data
ind <- split(x = dat,f = dat[,'ID'])
nam <- names(ind)

sapply(nam, function(x) {
	postscript(x)
	par(mar=c(6,8,6,5), cex=0.8)
	plot(ind[[x]][,c('YEAR','VALUE')],
		type='b',
		main = x,
		xlab="Time [Years]",
		ylab="Front variation")
	axis(1, at = seq(1800,2100,5), cex.axis=1, labels=FALSE, tcl=-0.3)
	axis(2, at = seq(-100000,100000,500), cex.axis=1, labels=FALSE, tcl=-0.3)
	dev.off()
})

This results in plots with the title and filename of the resulting
postscript being the same. Is there a way to get the plot title out of the
NAME column and the filename out of the ID?

Additionally I'd only like to plot graphs for items with more than 3 data
values. Is this possible to incorporate in the split command?

Another point is that some items have gaps in the time series where no
measurements were taken (in my example: BARTLEY from 1983 to 1985 and 1988
to 1989). I would like to plot using type= 'b' so that the points are
connected with lines, but when doing that, the values between 1985 and 1988
are automatically connected which I don't want. I'd like the plot to start
again at the value where the gap ends (in my example from 1988 onwards). Is
there a solution for this?

Any help is kindly appreciated! Thanks for your help.

Kind regards,
fd



--
View this message in context: http://r.789695.n4.nabble.com/Multiple-plots-and-postscripts-using-split-function-tp4694850.html
Sent from the R help mailing list archive at Nabble.com.
#
This is one of those times when you would do better to just use a loop. It will be easier to debug and to see what is going on. Replace the sapply() call with

for (i in 1:length(ind)) {
	postscript(names(ind[i]))
	par(mar=c(6,8,6,5), cex=0.8)
	plot(ind[[i]][,c('YEAR','VALUE')],
		type='b',
		main = ind[[i]][1, "NAME"],
		. . . other commands . . . )
	dev.off()
}
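
The other two questions (skipping items with 3 or fewer values, and not joining points across gaps) could be handled along the following lines. This is only a sketch under a few assumptions: the data are rebuilt inline from the example in the question instead of being read from FV.csv, measurements are assumed to be yearly, fill_gaps() is a hypothetical helper name, and the ".ps" extension and the use of tempdir() are illustrative choices. It relies on the fact that base graphics break a line wherever a value is NA.

```r
# Example data copied from the question (in practice: read.csv("FV.csv", sep = ";"))
dat <- data.frame(
  NAME  = c(rep("ADAMS", 2), rep("BAHIA DEL DIABLO", 5), rep("BARTLEY", 5), "CANADA"),
  ID    = c(rep(885, 2), rep(2665, 5), rep(893, 5), 877),
  YEAR  = c(1988, 1989, 1999:2003, 1983, 1984, 1985, 1988, 1989, 1972),
  VALUE = c(-2, 0, 4, 8, 19, 13, 13, 0, -1, 0, 2, -1, -1),
  stringsAsFactors = FALSE
)

# Split by ID, then keep only the groups with more than 3 rows
ind <- split(dat, dat$ID)
ind <- ind[sapply(ind, nrow) > 3]

# Hypothetical helper: add a row with VALUE = NA for every missing year,
# so that type = "b" does not join points across a gap
fill_gaps <- function(d) {
  full <- data.frame(YEAR = seq(min(d$YEAR), max(d$YEAR)))
  merge(full, d[, c("YEAR", "VALUE")], all.x = TRUE)
}

out_dir <- tempdir()  # assumption: write to a temporary directory
for (i in seq_along(ind)) {
  d <- ind[[i]]
  # File name comes from the ID, plot title from the NAME column
  postscript(file.path(out_dir, paste0(names(ind)[i], ".ps")))
  plot(fill_gaps(d), type = "b", main = d$NAME[1],
       xlab = "Time [Years]", ylab = "Front variation")
  dev.off()
}
```

With the example data, only IDs 893 and 2665 have more than 3 values, so two files are written. For BARTLEY the years 1986 and 1987 become NA rows, which leaves the 1983-1985 and 1988-1989 stretches as separate line segments.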

-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352



-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of fd
Sent: Thursday, July 31, 2014 9:37 AM
To: r-help at r-project.org
Subject: [R] Multiple plots and postscripts using split function


______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
#
The range vector (1:length(ind)) is evaluated at the start of the loop, so length(ind) is only evaluated once. A separate ind.length variable would be unnecessary.
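
A quick way to check this, as a sketch: counted_length() and the counter environment are hypothetical names introduced only for this illustration. They wrap length() so that every call is recorded.

```r
# Hypothetical counter: records how often the length is computed
counter <- new.env()
counter$n <- 0

counted_length <- function(x) {
  counter$n <- counter$n + 1
  length(x)
}

ind <- list(a = 1, b = 2, c = 3)
iterations <- 0
for (i in 1:counted_length(ind)) {
  iterations <- iterations + 1
}

counter$n    # number of times the length was computed
iterations   # number of loop iterations
```

The loop body runs three times here, but counter$n ends up at 1, because the sequence is built once before the first iteration.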
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.
On July 31, 2014 11:09:59 AM PDT, Don McKenzie <dmck at u.washington.edu> wrote:
#
Even better is to replace
    for(i in 1:length(something)) {}
with
    for(i in seq_along(something)) {}
When length(something) is 0, the former gives you 2 iterations
(1:0 is c(1, 0)), the 2nd probably causing an error.  The latter
always gives one iteration per element of 'something', so none at
all in that case.
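
The difference can be seen with an empty list. This is a small sketch; the counter variables n_colon and n_along are names made up for the illustration.

```r
something <- list()   # length 0

# 1:length(something) is 1:0, i.e. the two values 1 and 0
n_colon <- 0
for (i in 1:length(something)) {
  n_colon <- n_colon + 1
}

# seq_along(something) is integer(0), so the body never runs
n_along <- 0
for (i in seq_along(something)) {
  n_along <- n_along + 1
}

n_colon
n_along
```

Here n_colon ends up at 2 and n_along at 0.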


Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Thu, Jul 31, 2014 at 11:56 AM, Jeff Newmiller
<jdnewmil at dcn.davis.ca.us> wrote: