Unequal column lengths

4 messages · Tom Mosca, David Winsemius, Jim Lemon +1 more

#
Hello,

I've tried several times to learn R, but have never gotten past a particular gate.  My data are organized by column in Excel, with column headers in the first row.  The columns are of unequal lengths.  I export them as CSV, then import the CSV file into R.  I wish to summarize the data by column.  R inserts NA for missing values, then refuses to operate on columns with NA.  R is importing my data into a data frame, and I realize that is inappropriate for what I want to do.

How can I import my data so that I can work on columns of unequal length?  The first thing I would like to do is generate a table containing mean, median, mode, standard deviation, min, max and count, all per column.

Thank you, Tom

Example data
  Dat1 Dat2 Dat3
1    1    5    4
2    7    7    9
3    3    3    5
4    2   NA    5
5    9   NA   NA
#
Most of the summary statistic functions have an na.rm option that you should set to TRUE.
Looks like you have an R data frame already, so I would try:

colMeans(data, na.rm=TRUE)
And do learn to configure your email client to post to r-help in plain text.
David Winsemius
Alameda, CA, USA
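
A minimal sketch of the na.rm suggestion above, applied to the example data from the original post. The name df and the result names (col_means, col_sds, col_n) are only illustrative; sapply() passes na.rm = TRUE through to any function that accepts it.

```r
# Example data from the original post; NA pads the shorter columns
df <- data.frame(Dat1 = c(1, 7, 3, 2, 9),
                 Dat2 = c(5, 7, 3, NA, NA),
                 Dat3 = c(4, 9, 5, 5, NA))

col_means <- colMeans(df, na.rm = TRUE)    # per-column means, ignoring NA
col_sds   <- sapply(df, sd, na.rm = TRUE)  # same idea for any na.rm-aware function
col_n     <- colSums(!is.na(df))           # count of non-missing values per column

print(col_means)
print(col_sds)
print(col_n)
```

The data frame does not have to be abandoned; the NA padding is simply skipped at calculation time.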
#
Hi Tom,
What you want is a list rather than a data frame. So:

df<-read.table(text="  Dat1 Dat2 Dat3
 1    1    5    4
 2    7    7    9
 3    3    3    5
 4    2   NA  5
 5    9   NA NA",
 header=TRUE)
dflist<-as.list(df)
na.remove<-function(x) return(x[!is.na(x)])
sapply(dflist,na.remove)
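
Once the NA padding is stripped, each list element is a plain vector of its true length, so the usual functions work without na.rm. A short sketch along the same lines, assuming the same example data (the name cols is illustrative):

```r
# Example data from the original post; NA pads the shorter columns
df <- data.frame(Dat1 = c(1, 7, 3, 2, 9),
                 Dat2 = c(5, 7, 3, NA, NA),
                 Dat3 = c(4, 9, 5, 5, NA))

# A data frame is already a list of columns; drop the NA padding from each
cols <- lapply(df, function(x) x[!is.na(x)])

col_lengths <- lengths(cols)         # number of real values per column
col_medians <- sapply(cols, median)  # no na.rm needed any more

print(col_lengths)
print(col_medians)
```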

Jim
On Fri, Apr 15, 2016 at 7:33 AM, Tom Mosca <tom at vims.edu> wrote:
#
Many basic summary stats in R will not work (i.e. they usually return NA) if there are NAs in the data, unless you explicitly authorize them to ignore the NAs.

With your data set df
with(df, mean(Dat2, na.rm = TRUE))
[1] 5

This by the way is functionally the same as 
mean(df$Dat2, na.rm = TRUE) 
It's just easier to type the first one 


In other cases R does not object to the NAs:

summary(df)
     Dat1          Dat2        Dat3     
 Min.   :1.0   Min.   :3   Min.   :4.00  
 1st Qu.:2.0   1st Qu.:4   1st Qu.:4.75  
 Median :3.0   Median :5   Median :5.00  
 Mean   :4.4   Mean   :5   Mean   :5.75  
 3rd Qu.:7.0   3rd Qu.:6   3rd Qu.:6.00  
 Max.   :9.0   Max.   :7   Max.   :9.00  
               NA's   :2   NA's   :1     
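
For the table asked for in the original post (mean, median, mode, standard deviation, min, max and count per column), something like the following might do. This is only a sketch: stat_mode() and col_summary() are made-up helper names, not base R functions. Base R has no built-in statistical mode (the mode() function reports the storage type), so the helper here returns the most frequent value and, in a tie, the one that appears first.

```r
# Example data from the original post; NA pads the shorter columns
df <- data.frame(Dat1 = c(1, 7, 3, 2, 9),
                 Dat2 = c(5, 7, 3, NA, NA),
                 Dat3 = c(4, 9, 5, 5, NA))

# Hypothetical helper: most frequent value; ties go to the first one seen
stat_mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Hypothetical helper: all requested statistics for one column
col_summary <- function(x) {
  x <- x[!is.na(x)]  # drop the NA padding once, up front
  c(mean = mean(x), median = median(x), mode = stat_mode(x),
    sd = sd(x), min = min(x), max = max(x), n = length(x))
}

# One row per column, one named statistic per table column
stats <- t(sapply(df, col_summary))
print(stats)
```

When every value in a column is distinct, as in Dat1, the "mode" is just the first value and is not very meaningful.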


John Kane
Kingston ON Canada