Suggestions ?!?!
ivo welch <ivo.welch at yale.edu> writes:
* the first is for the summary() method for plain data frames. it would seem to me that the number of "NA" observations should be printed as an integer, not necessarily in scientific notation. I have also yet to determine when summary() likes to give means and when it does not. (maybe it was an older version that sometimes did not give means). summary does not seem to have optional parameters to specify what statistics I would like. this could be useful, too.
The form of the output from summary depends on the mode or class of the column. A numeric column is summarized by a 'five-number' summary (minimum, first quartile, median, third quartile, maximum) and the mean. If there are NA's in the column, the number of NA's is reported. It is sometimes reported to several decimal places because all the values in that part of the summary are printed in the same format: if the mean requires four decimal places to get the desired number of significant digits, then the number of NA's will also be given to four decimal places. A column that is a factor or an ordered factor will be summarized by a (possibly truncated) frequency table. Means, medians, etc. are not meaningful for factors.
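To see the dependence on column class for yourself, something like the following should do. The data frame d and its columns x and g are made up for illustration, and the exact print formatting may differ between versions.

```r
# Hypothetical data frame: one numeric column containing an NA,
# and one factor column.
d <- data.frame(x = c(1.5, 2.25, NA, 4, 10),
                g = factor(c("a", "b", "a", "a", "b")))

# Whole data frame: each column is summarized according to its class.
print(summary(d))

# Numeric column: five-number summary, the mean, and the count of NA's.
sx <- summary(d$x)
print(names(sx))

# Factor column: a frequency table of the levels, no mean or median.
sg <- summary(d$g)
print(sg)
```

Looking at names(sx) is a quick way to check which statistics summary has chosen for a given column.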
* another small enhancement: there are four elementary data frame
operations that bedevil novices, so they really should have named
function wrappers:
delrow( dataframe d, index=45);
insrow( dataframe d, (row)vector v);
delcol( dataframe d, "name");
inscol( dataframe d, (col)vector v);
Three of the "secrets of the S masters" are:
 - indexing is particularly flexible and powerful in S
 - the "%in%" function is versatile and often overlooked
 - you can add a column to a data frame by assigning to that name
so three of these operations can be written as
  d[ -45, ]                       # delrow( dataframe d, index=45)
  d[ , !(names(d) %in% "name")]   # delcol( dataframe d, "name")
  d[ , -col]                      # alternative form if you know the column number
  d$newcol = v                    # inscol( dataframe d, (col)vector v)
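If named wrappers are still wanted, the indexing idioms above can be wrapped in a few lines. This is only a sketch: the function names follow the ones proposed in the question, the extra name argument to inscol is my own assumption, and insrow via rbind is not one of the three operations covered above, so treat it as one possible choice rather than the recommended one. Each function returns a modified copy and leaves its argument unchanged.

```r
# Hypothetical wrappers around the indexing idioms; each returns a new
# data frame and does not modify the one passed in.
delrow <- function(d, index) d[-index, , drop = FALSE]
delcol <- function(d, name) d[, !(names(d) %in% name), drop = FALSE]
inscol <- function(d, v, name) {
  d[[name]] <- v            # assigning to a new name adds the column
  d
}
# Assumption: append the new row with rbind(); 'row' is a one-row data
# frame with the same column names as d.
insrow <- function(d, row) rbind(d, row)

# Made-up example data.
d <- data.frame(id = 1:3, name = c("a", "b", "c"), score = c(10, 20, 30))

d2 <- delrow(d, 2)                             # drop the second row
d3 <- delcol(d, "name")                        # drop the column called "name"
d4 <- inscol(d, c(TRUE, FALSE, TRUE), "flag")  # add a column called "flag"
d5 <- insrow(d, data.frame(id = 4L, name = "d", score = 40))
```

The drop = FALSE arguments keep the result a data frame even when only one column remains, which plain d[ , -col] would not guarantee.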
Even a simple alias would do (maybe named row.delete, column.delete). I looked at my R "bible" (venables&ripley), too, but here too it is not as clear as it needs to be. yes, these operations are programmable, but it ain't as obvious as it should be for beginners. these are elementary.
P.S. How many other people think that the next edition of MASS should be renamed "Secrets of The S Masters"? :-)