
Scatterplot of two groups side-by-side?

5 messages · nonunah at yahoo.de, Baptiste Auguie, Stefan Grosse +2 more

#
Hi,

You could do this very easily using ggplot2.
See more examples on Hadley's website: http://had.co.nz/ggplot2/
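
A minimal sketch of what a side-by-side plot might look like in ggplot2, offered as an illustration rather than as the code from the original message. The data frame is simulated; the column names group, Pulls and Resistance follow those used later in this thread, and the group labels "A" and "B" are made up.

```r
library(ggplot2)

set.seed(1)  # simulated data, for illustration only
Data <- data.frame(
  group      = rep(c("A", "B"), each = 20),
  Resistance = sample(1:5, 40, replace = TRUE)
)
Data$Pulls <- Data$Resistance + sample(0:3, 40, replace = TRUE)

# One panel per group, slightly jittered points, and a fitted line per panel
p <- ggplot(Data, aes(x = Resistance, y = Pulls)) +
  geom_point(position = position_jitter(width = 0.1, height = 0.1)) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  facet_wrap(~ group)
print(p)
```

facet_wrap() draws the panels next to each other on a shared scale, which makes the two groups directly comparable.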

Hope this helps,

baptiste
On 26 Apr 2009, at 10:29, nonunah at yahoo.de wrote:

            
_____________________________

Baptiste Auguié

School of Physics
University of Exeter
Stocker Road,
Exeter, Devon,
EX4 4QL, UK

Phone: +44 1392 264187

http://newton.ex.ac.uk/research/emag
#
On Sun, 26 Apr 2009 09:29:39 +0000 (GMT) "nonunah at yahoo.de"
<nonunah at yahoo.de> wrote:
ND> I'm really new to R, so I hope you can help me, as I didn't find any
ND> solution in the common books.

Have a look at the online resources at:
http://cran.r-project.org/other-docs.html
There is also stuff on graphics.

Furthermore the lattice package and book are highly recommended as well.

ND> By the way: It's really necessary to plot the data as scatters and
ND> not as boxplots. With the command "plot", I cannot plot the data
ND> by groups (I tried it with the commands "subset" and "groups", but
ND> obviously, there is no way to do so).

There is always a way. I just don't understand why it is necessary to
plot this as a scatterplot.

Look, your "problem" is that your data have integer values. So the
points will clearly be overplotted, and the reader has no idea at which
points there are many observations, even when you split the data into
groups on the x axis, or even if you make a per-group plot as Baptiste
suggested and as would be possible with lattice as well.
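
For reference, a per-group plot in lattice could be sketched roughly as follows. This is an illustration under assumptions, not part of the original reply: the data frame d is simulated, with the same kind of integer-valued y as in the question.

```r
library(lattice)  # recommended package, ships with standard R installations

set.seed(1)  # simulated data, for illustration only
d <- data.frame(
  y     = sample(1:3, 24, replace = TRUE),
  x     = rep(1:2, each = 12),
  group = rep(1:2, times = 12)
)

# One panel per group; type "p" draws points, "r" adds a regression line.
# jitter.y = TRUE is passed on to panel.xyplot to jitter the points vertically.
p <- xyplot(y ~ x | factor(group), data = d,
            type = c("p", "r"), jitter.y = TRUE,
            layout = c(2, 1))
print(p)  # lattice objects must be printed explicitly inside scripts
```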

I could offer an easy solution. You can split the data into groups
manually by shifting the x values slightly for each group. But you still
don't see how many observations are on each point. You could add some
noise with the jitter function (see ?jitter), so that one sees that
there are many observations at one point. However, it gives the
appearance that you are not dealing with categorical data, which might
not be intended...

daten<-data.frame(y=sample(c(1,2,3),24,replace=TRUE),
x=rep(c(1,2),each=12),group=rep(c(1,2),times=12))

daten


# plot with overplotting, no information gain
plot(daten$x,daten$y)

# plot with jitter

# prepare data
daten$x2<-ifelse(daten$group==1,daten$x-0.02,daten$x+0.02)

# empty plot; you could use the range of the real data instead
plot(c(0.5,2.5),c(0,4),type="n",xlab="x",ylab="y")

# plot points, see ?jitter for options
points(jitter(y)~x2,data=subset(daten,group==1),col="blue",pch=1)
points(jitter(y)~x2,data=subset(daten,group==2),col="red",pch=2)

# regression lines added:
abline(lm(y~x,data=subset(daten,group==1)),col="blue")
abline(lm(y~x,data=subset(daten,group==2)),col="red")

legend("topleft",c("group 1","group 2",
"regression group 1","regression group 2") ,lty=c(0,0,1,1),
pch=c(1,2,NA,NA), col=rep(c("blue","red"),2),bty="n")

But I believe there are better solutions. You should think about a
different kind of plot, such as a balloon plot.
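
One way to sketch the idea in base graphics is to count the observations at each position and scale the plotting symbol by that count. This is only an illustration under assumptions: the data are simulated in the same shape as daten above, and the offsets of 0.1 and the symbol scaling are arbitrary choices.

```r
set.seed(1)  # simulated data, for illustration only
daten <- data.frame(
  y     = sample(1:3, 24, replace = TRUE),
  x     = rep(1:2, each = 12),
  group = rep(1:2, times = 12)
)

# Count how many observations fall on each (x, y, group) combination
counts <- as.data.frame(table(x = daten$x, y = daten$y, group = daten$group),
                        stringsAsFactors = FALSE)
counts <- counts[counts$Freq > 0, ]
counts$x     <- as.numeric(counts$x)
counts$y     <- as.numeric(counts$y)
counts$group <- as.integer(counts$group)

# Shift the two groups slightly apart so they sit side by side
counts$xpos <- counts$x + ifelse(counts$group == 1, -0.1, 0.1)

# Symbol size grows with the count; the count is also printed beside the symbol
plot(counts$xpos, counts$y, pch = 16,
     cex = 1.5 * sqrt(counts$Freq),
     col = c("blue", "red")[counts$group],
     xlim = c(0.5, 2.5), ylim = c(0.5, 4),
     xaxt = "n", xlab = "x", ylab = "y")
axis(1, at = 1:2)
text(counts$xpos, counts$y, labels = counts$Freq, pos = 4, offset = 1)
legend("topleft", c("group 1", "group 2"), pch = 16,
       col = c("blue", "red"), bty = "n")
```

Unlike jittering, this keeps every symbol on its true integer position, so the categorical nature of the data stays visible.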

Then I doubt whether a linear regression is really good here since we
deal with categorical data...

ND> I'm grateful for every (simple) solution

Sorry if it is not simple. You see, R has the advantage that it is
highly configurable. But you still need to know the message...

hth
Stefan
#
Dear Karin,

If I understand correctly what you want, the scatterplot function in the car
package isn't designed to produce it, but there are many ways to draw
side-by-side scatterplots. Here is one, using basic R graphics:

par(mfrow=c(1,2))
by(Data, Data$group, 
    function(x) {
        plot(Pulls ~ Resistance, data=x, main=paste("group =", x$group[1]))
        abline(lm(Pulls ~ Resistance, data=x))
        }
    )

This assumes that your data are in a data frame named Data, with variables
group, Pulls, and Resistance.

I hope this helps,
 John

------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox
#
nonunah at yahoo.de wrote:
Hi Karin,
I would use cluster.overplot in the plotrix package for the scattergram. 
This will adjust the positions of the points for the two groups so that 
they form little clusters instead of overplotting. You can color the two 
sets of points differently as well as use different symbols. Then by 
adding the two regression lines in the same two colors, you should have 
an easier to interpret plot.

Jim