
how to view/edit large matrix/array in R?

17 messages · Bert Gunter, Michael, David Winsemius +5 more

#
... and do you really think perusing thousands of numbers by eye is
any way to edit/check data?!

Personal viewpoint: I would say that this is a large area of
statistics and data analysis that the discipline fails to address in
any systematic way ... perhaps because there is no way to address it
systematically? Contrary views and corrections -- especially
references! -- would be very welcome.

My advice would be: graphics! -- but I can't provide anything more
useful without the specifics of the problem.

Some packages may provide the interactivity you seek -- check the CRAN
GUI task view, R Commander, etc.

-- Bert
On Mon, Dec 5, 2011 at 5:01 PM, Michael <comtech.usa at gmail.com> wrote:

  
    
#
Have you tried

?View
?edit
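
A minimal sketch of how those two functions might be used, assuming a numeric matrix called m (the data and the name are made up for illustration). Viewing a window of rows and columns is usually more manageable than opening the whole object. The viewer and editor calls only run in an interactive session.

```r
# Hypothetical example data: a 1000 x 50 numeric matrix.
set.seed(1)
m <- matrix(rnorm(1000 * 50), nrow = 1000,
            dimnames = list(NULL, paste0("V", 1:50)))

# Pick a manageable window instead of the whole object.
rows <- 1:20
cols <- c("V1", "V2", "V50")
window <- m[rows, cols]

if (interactive()) {
  View(window)            # read-only, spreadsheet-style viewer
  window <- edit(window)  # data editor; returns the edited copy
  # Write the changes back, assuming the editor did not add or
  # remove rows or columns.
  m[rows, cols] <- window
}
```

Note that edit() does not modify its argument in place; the result has to be assigned, as above.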
On Mon, Dec 5, 2011 at 8:12 PM, Bert Gunter <gunter.berton at gene.com> wrote:

  
    
#
On Dec 5, 2011, at 8:37 PM, jim holtman wrote:

            
Or:

?pairs
help(splom, package=lattice)

My preference is plot(density()), but the 2-D density plots are slow, so I
also use:

help(hexbin, package=hexbin)
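
A rough sketch of what these suggestions could look like on a made-up matrix. The data, the column names and the helper name overview_plots are assumptions for illustration only. The hexbin part is skipped if that package is not installed.

```r
# Hypothetical example data: 5000 rows, 4 columns.
set.seed(42)
m <- matrix(rnorm(5000 * 4), ncol = 4,
            dimnames = list(NULL, c("a", "b", "c", "d")))

# Illustrative helper (not from any package): draws a few overview plots
# and invisibly returns whether the hexbin plot could be drawn.
overview_plots <- function(m) {
  # One-dimensional density estimate of the first column.
  plot(density(m[, 1]), main = colnames(m)[1])

  # Scatterplot matrix of all pairs of columns.
  pairs(m, pch = ".")

  # Hexagonal binning of the first two columns; fast for many points.
  has_hexbin <- requireNamespace("hexbin", quietly = TRUE)
  if (has_hexbin) {
    library(hexbin)  # attach so the hexbin plot method is found
    plot(hexbin(m[, 1], m[, 2], xbins = 40))
  }
  invisible(has_hexbin)
}

if (interactive()) overview_plots(m)
```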
#
'edit' does allow you to change it.  If all else fails, export to
Excel, split the screen and then synchronize the two displays.
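
For the export route, a sketch along these lines might do, assuming a numeric matrix m and a CSV file as the exchange format (the file name is made up). Excel can open the CSV, and the edited file can be read back afterwards.

```r
# Hypothetical example data: a 200 x 6 numeric matrix.
set.seed(1)
m <- matrix(round(rnorm(200 * 6), 3), nrow = 200,
            dimnames = list(NULL, paste0("V", 1:6)))

# Write the matrix to a CSV file that a spreadsheet program can open.
csv_path <- file.path(tempdir(), "m_export.csv")
write.csv(m, csv_path, row.names = FALSE)

# ... inspect or edit the file in the spreadsheet, save it as CSV ...

# Read the (possibly edited) file back into a numeric matrix.
m_back <- as.matrix(read.csv(csv_path))
```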
On Mon, Dec 5, 2011 at 9:52 PM, Michael <comtech.usa at gmail.com> wrote:

  
    
I think what almost everyone is getting at is that spotting numeric outliers by reading through raw numbers is an exceedingly difficult task, and one we humans are not well evolved for. Instead, they are all suggesting that you use graphical techniques to spot outliers and then fix them individually. This practice has a long and reputable history in statistics and has been shown to be far more efficient than simply scanning pages of numbers for a single misplaced decimal. In conjunction, I'd also recommend the identify() function, which serves just this purpose.
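
As a sketch of how identify() might fit in, assuming a numeric vector x with one simulated misplaced decimal (the data and the helper name plot_and_pick are made up): plot the values against their row index, then click on the points that look wrong to get their indices back.

```r
# Hypothetical data: 500 readings, one with a misplaced decimal point.
set.seed(7)
x <- rnorm(500, mean = 10, sd = 0.5)
x[123] <- x[123] * 100  # simulated data-entry error

# Illustrative helper: plot the values and, in an interactive session,
# return the indices of the points the user clicks on.
plot_and_pick <- function(x) {
  plot(seq_along(x), x, xlab = "row index", ylab = "value")
  if (interactive()) {
    # Click near suspicious points; how to finish depends on the
    # graphics device (typically right-click or Esc).
    identify(seq_along(x), x, labels = seq_along(x))
  } else {
    integer(0)
  }
}

if (interactive()) {
  suspects <- plot_and_pick(x)
  print(x[suspects])  # the values to check and correct
}
```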

If you have so much data that the csv export is unbearably slow, it seems unlikely you can check it all by hand. 

Another, more general, methodology if you are worried about data corruption is to use robust statistics when applicable. 
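
A small illustration of that idea, with made-up numbers: one corrupted value drags the mean and standard deviation far away, while the median and MAD barely move. The cutoff of 3.5 for flagging is an arbitrary choice for this sketch, not a recommendation.

```r
# Hypothetical clean readings and a copy with one misplaced decimal.
x <- c(9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3)
x_bad <- x
x_bad[3] <- 1000  # 10.0 entered as 1000

# Classical summaries are dominated by the single bad value ...
c(mean = mean(x), mean_bad = mean(x_bad))
c(sd = sd(x), sd_bad = sd(x_bad))

# ... while the robust counterparts hardly change.
c(median = median(x), median_bad = median(x_bad))
c(mad = mad(x), mad_bad = mad(x_bad))

# Robust z-scores: distance from the median in units of the MAD.
robust_z <- abs(x_bad - median(x_bad)) / mad(x_bad)
flagged <- which(robust_z > 3.5)  # 3.5 is an arbitrary cutoff here
```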

Michael
On Dec 5, 2011, at 10:27 PM, Michael <comtech.usa at gmail.com> wrote:

            
#
On Tue, Dec 6, 2011 at 4:42 AM, Michael <comtech.usa at gmail.com> wrote:
You may want to use RExcel then.

Liviu

  
    
  
#
On Monday, 5 December 2011 at 19:01 -0600, Michael wrote:
RKWard has a good data editor, and you can open several objects at the
same time in tabs. But it probably will not work if your data set is
really huge (here it works very well with a few thousand rows, though).

If you need to see a selection of variables in parallel, ordering the
variables so that they're next to each other is probably a good
solution.
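
For example, a sketch of moving the variables of interest to the front of a data frame, so they end up side by side in whatever viewer is used. The data frame and its column names are made up for illustration.

```r
# Hypothetical data frame with the two variables of interest far apart.
df <- data.frame(
  id     = 1:5,
  height = c(170, 165, 180, 175, 160),
  age    = c(31, 45, 28, 52, 39),
  city   = c("A", "B", "C", "D", "E"),
  weight = c(70, 60, 85, 78, 55)
)

# Put the chosen variables first and keep the rest in their old order.
front <- c("height", "weight")
df_view <- df[, c(front, setdiff(names(df), front))]

if (interactive()) View(df_view)
```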


Cheers
#
Michael <comtech.usa <at> gmail.com> writes:
You might look into the RGtk2Extras package and its dfedit function. You could
also embed that in some custom GUI to show variables, as you want. The 
package requires RGtk2, and hence the Gtk libraries to be installed. 
The data frame editor there can gracefully handle large data sets.