Trouble plotting with factor

4 messages · Paul Johnson, Marc Schwartz, Martin Maechler

#
With R 1.8.1 running on Fedora Core 1 Linux, I am having some trouble 
recoding and plotting some factor variables. 

First, can I give you some example data?
Here is a column of names for age groups:

agegroups <- c(   "15-19", "20-24", "25-29","30-34", "35-39", 
"40-44","45-49","50-54","55-59","60-64",  "65-69", "70-74", "75-79",  
"80-84", "OVER")

Here is an index for driver's license ownership in each age group:

  fracld            
0.4914204
0.9752746
1.0864465
1.0555984
1.0631969
1.0738725
1.0971969
1.0657212
1.0217373
0.9761226
0.9043233
0.9045744
0.8573243
0.7889182
0.5217992

I want to take several similar columns of numbers and put them into a 
single line plot, one line for each column. 

I can get a graph with the inside part that looks roughly like I want if 
I just do:

plot(fracld,type="l")

The horizontal axis, of course, is just the sequence from 1:15, not the 
age labels. That's no good.

But if I try

plot(as.factor(agegroups), fracld, type="l")

the plot does not have the line I want, but rather only flat "steps" 
showing the values. It does have a nice-looking horizontal axis, 
though, showing the age groups.
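
The flat "steps" are most likely boxplots: when the first argument is a factor, plot() hands the work to its factor method, which draws one box per level, and with a single value per level each box collapses to a horizontal line. A minimal sketch to check this, using a shortened, made-up data set (the names `grp` and `val` are illustrative and not from the post):

```r
# Illustrative data: three levels with one value each (not the original data)
grp <- factor(c("15-19", "20-24", "25-29"))
val <- c(0.49, 0.98, 1.09)

# With a factor on the x axis, plot() draws one box per level
plot(grp, val)

# The box statistics for the same data, computed without plotting:
# with one observation per level, all five summary rows coincide
bp <- boxplot(val ~ grp, plot = FALSE)
bp$stats
```

Each column of `bp$stats` repeats a single number, which is why every "box" shows up as a flat line at the observed value.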

So I think to myself "I'll outsmart them by adding the lines after 
creating the plot", but if I do this

plot(as.factor(agegroups), fracld, type="n")

The step markers still appear. 

So if I want the tick marks and value labels on the horizontal axis, 
there is apparently no way to plot lines?

What to do?
#
On Fri, 2004-01-30 at 15:04, Paul Johnson wrote:
Paul,

I did not see any responses come through yet on this, so I don't know if
you got anything offlist.

I do not know how your data is structured, but one approach, if your
data is in a matrix, with the numbers being the columns and the
agegroups being the rownames, is the following:

# Create the agegroups
agegroups <- c("15-19", "20-24", "25-29","30-34", "35-39",
               "40-44","45-49","50-54","55-59","60-64",
               "65-69", "70-74", "75-79", "80-84", "OVER")

# Create the first column
fracld <- c(0.4914204, 0.9752746, 1.0864465, 1.0555984, 1.0631969,
            1.0738725, 1.0971969, 1.0657212, 1.0217373, 0.9761226,
            0.9043233, 0.9045744, 0.8573243, 0.7889182, 0.5217992)

# Now create two additional columns for the example
# 'fracld2' won't make sense "real world" since many vals
# will be > 1.0
fracld1 <- fracld - 0.25
fracld2 <- fracld + 0.25

# Create the matrix
df <- cbind(fracld, fracld1, fracld2)

# Set the rownames
rownames(df) <- agegroups

> df
         fracld   fracld1   fracld2
15-19 0.4914204 0.2414204 0.7414204
20-24 0.9752746 0.7252746 1.2252746
25-29 1.0864465 0.8364465 1.3364465
30-34 1.0555984 0.8055984 1.3055984
35-39 1.0631969 0.8131969 1.3131969
40-44 1.0738725 0.8238725 1.3238725
45-49 1.0971969 0.8471969 1.3471969
50-54 1.0657212 0.8157212 1.3157212
55-59 1.0217373 0.7717373 1.2717373
60-64 0.9761226 0.7261226 1.2261226
65-69 0.9043233 0.6543233 1.1543233
70-74 0.9045744 0.6545744 1.1545744
75-79 0.8573243 0.6073243 1.1073243
80-84 0.7889182 0.5389182 1.0389182
OVER  0.5217992 0.2717992 0.7717992


# Now use matplot() to plot each column
# Do not plot the axes
matplot(df, type = "l", axes = FALSE,
        xlab = "Age Group", ylab = "Proportion DL Ownership")

# Create the X axis, specifying 15 tick marks 
# and using rownames(df) as the labels
axis(1, at = 1:nrow(df), labels = rownames(df))

# Now draw the Y axis with defaults
axis(2)

# Put a box around the whole thing
box()
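
If the columns are kept as separate vectors rather than a matrix, the same axis trick should also work with plot() and lines(), which is close to the "add the lines after creating the plot" idea in the original question. A sketch under that assumption; the name `xpos` is introduced here for illustration and only two of the series are drawn:

```r
agegroups <- c("15-19", "20-24", "25-29", "30-34", "35-39",
               "40-44", "45-49", "50-54", "55-59", "60-64",
               "65-69", "70-74", "75-79", "80-84", "OVER")

fracld <- c(0.4914204, 0.9752746, 1.0864465, 1.0555984, 1.0631969,
            1.0738725, 1.0971969, 1.0657212, 1.0217373, 0.9761226,
            0.9043233, 0.9045744, 0.8573243, 0.7889182, 0.5217992)
fracld1 <- fracld - 0.25   # second, artificial series as in the example above

# Numeric x positions, one per age group (hypothetical helper name)
xpos <- seq_along(agegroups)

# Empty frame: numeric x, no x axis, y range wide enough for every series
plot(xpos, fracld, type = "n", xaxt = "n",
     ylim = range(fracld, fracld1),
     xlab = "Age Group", ylab = "Proportion DL Ownership")

# Add one line per series, then the labelled x axis
lines(xpos, fracld, lty = 1)
lines(xpos, fracld1, lty = 2)
axis(1, at = xpos, labels = agegroups)
```

The key point is that x is numeric, so type = "n" really gives an empty frame; the age-group names enter only as axis labels.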


BTW, this is also using FC1 and R 1.8.1 Patched.

HTH,

Marc Schwartz
#
<<excellent nicely told help text for Paul Johnson>>


I want to comment on the following because it is 
"not quite optimal", and yet keeps being recommended again and again ....

    Marc> # Now use matplot() to plot each column

    Marc> # Do not plot the axes
    Marc> matplot(df, type = "l", axes = FALSE,
    Marc>         xlab = "Age Group", ylab = "Proportion DL Ownership")

    Marc> # Create the X axis, specifying 15 tick marks 
    Marc> # and using rownames(df) as the labels
    Marc> axis(1, at = 1:nrow(df), labels = rownames(df))

    Marc> # Now draw the Y axis with defaults
    Marc> axis(2)

    Marc> # Put a box around the whole thing
    Marc> box()

It is more elegant not to set axes = FALSE (and then have to add
axis(2) and box() afterwards) but to use  xaxt = "n"  {suppressing
only the x-axis}, i.e., instead of the above 4 statements,
only two:

 matplot(df, type = "l", xaxt = "n", # do not plot the 'x' axis
         xlab = "Age Group", ylab = "Proportion DL Ownership")

 # Create the X axis, specifying 15 tick marks 
 # and using rownames(df) as the labels
 axis(1, at = 1:nrow(df), labels = rownames(df))
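
For reference, here is the two-statement version as one self-contained script. The las = 2 setting and the legend() call are additions of mine, not part of the suggestion above: axis() may leave out labels that would overlap, so turning them perpendicular is one way to keep all 15 visible, and the legend assumes the three line types and colours set explicitly in the matplot() call.

```r
agegroups <- c("15-19", "20-24", "25-29", "30-34", "35-39",
               "40-44", "45-49", "50-54", "55-59", "60-64",
               "65-69", "70-74", "75-79", "80-84", "OVER")

fracld <- c(0.4914204, 0.9752746, 1.0864465, 1.0555984, 1.0631969,
            1.0738725, 1.0971969, 1.0657212, 1.0217373, 0.9761226,
            0.9043233, 0.9045744, 0.8573243, 0.7889182, 0.5217992)

# Matrix with one column per series and the age groups as row names
df <- cbind(fracld, fracld1 = fracld - 0.25, fracld2 = fracld + 0.25)
rownames(df) <- agegroups

# Suppress only the x axis; the y axis and the box are drawn as usual.
# xlab is left blank here so it does not collide with the rotated labels.
matplot(df, type = "l", xaxt = "n", lty = 1:3, col = 1:3,
        xlab = "", ylab = "Proportion DL Ownership")

# las = 2 draws the labels perpendicular to the axis (my addition)
axis(1, at = 1:nrow(df), labels = rownames(df), las = 2)

# Legend matching the lty/col values passed to matplot() (my addition)
legend("bottom", legend = colnames(df), lty = 1:3, col = 1:3)
```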


Martin
#
On Sat, 2004-01-31 at 07:34, Martin Maechler wrote:
Thanks Martin.
Right. I had actually started to use that approach (which I have used in
the past for plot(), etc.). I must have had a transient loss of blood
flow to the brain, as for some reason (uncommented in my code) I
decided to use 'axes = FALSE'.  Either that, or I was asleep at the
keyboard and my fingers were on autopilot....  :-)

Thanks for pointing that out Martin.

Best regards,

Marc