Good day,
Would it be useful to provide the same operations which can be done to a data.frame for a DataFrame in a future release of S4Vectors? For example,
library(S4Vectors)

dataTable <- data.frame(aFeature = 1:5, anotherFeature = 5:1)
colMeans(dataTable)
#       aFeature anotherFeature
#              3              3
dataTableS4 <- DataFrame(aFeature = 1:5, anotherFeature = 5:1)
colMeans(dataTableS4)
# Error in colMeans(dataTableS4) :
#   'x' must be an array of at least two dimensions
--------------------------------------
Dario Strbenac
University of Sydney
Camperdown NSW 2050
Australia
[Bioc-devel] Numeric Operation on DataFrame
3 messages · Dario Strbenac, Michael Lawrence, Hervé Pagès
Please be more specific about the desired operations, or, better, submit a pull request with them. colMeans() in particular was intentionally omitted because it depends on having homogeneous data, which is better suited for a matrix, not a data frame.

On Mon, Jan 15, 2018 at 10:00 PM, Dario Strbenac <dstr7320 at uni.sydney.edu.au> wrote:
> Good day,
> Would it be useful to provide the same operations which can be done to a
> data.frame for a DataFrame in a future release of S4Vectors? For example,
> dataTable <- data.frame(aFeature = 1:5, anotherFeature = 5:1)
> colMeans(dataTable)
> # aFeature anotherFeature
> # 3 3
> dataTableS4 <- DataFrame(aFeature = 1:5, anotherFeature = 5:1)
> colMeans(dataTableS4)
> Error in colMeans(dataTableS4) :
> 'x' must be an array of at least two dimensions
> --------------------------------------
> Dario Strbenac
> University of Sydney
> Camperdown NSW 2050
> Australia
_______________________________________________ Bioc-devel at r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/bioc-devel
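The point about homogeneous data can be illustrated with an ordinary data.frame. The following is a minimal base R sketch, not code from the thread: the `label` column and the names `mixed`, `isNumeric` and `numericMeans` are made up for illustration. It shows colMeans() failing once a non-numeric column is present, and one possible way to average only the numeric columns.

```r
# A data.frame mixing numeric and character columns (hypothetical example data).
mixed <- data.frame(aFeature = 1:5,
                    anotherFeature = 5:1,
                    label = letters[1:5])

# colMeans() coerces a data.frame to a matrix first; with a character column
# the matrix is no longer numeric, so the call signals an error.
result <- tryCatch(colMeans(mixed), error = function(e) conditionMessage(e))
print(result)

# One way around this: keep only the numeric columns, then average each one.
isNumeric <- vapply(mixed, is.numeric, logical(1))
numericMeans <- vapply(mixed[isNumeric], mean, numeric(1))
print(numericMeans)
```

The column-by-column approach treats the table as a list of vectors rather than as a matrix, so it does not require all columns to share a type.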
Hi,

I think I remember it was once suggested on this list that DataFrame objects with numeric columns could support math/summarization operations, like data.frame objects do (can't find the thread to provide the link, sorry).

I'll mention that wrapping a DataFrame object (or any matrix-like or array-like object) in a DelayedArray object is one way to enable this:

library(DelayedArray)
M <- DelayedArray(dataTableS4)
colMeans(M)
#       aFeature anotherFeature
#              3              3

This should not copy the DataFrame so should be more memory efficient than doing as.data.frame() on it. In addition it will transparently use the internal DelayedArray machinery i.e. will delay some operations (e.g. subsetting and log() in colMeans(log(M[-1, ])) are delayed) and use block-processing for non-delayed operations (e.g. colMeans).

Note that wrapping a DataFrame with Rle columns in a DelayedArray object also works.

Pete's DelayedMatrixStats package will extend DelayedArray capabilities by giving you access to all the summarization functions defined in the matrixStats package.

That being said, it would be nice if math/summarization operations worked directly on DataFrame objects like they do on ordinary data frames. This could naturally be extended to DataFrame objects with numeric Rle columns.

H.
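The delayed example mentioned above, colMeans(log(M[-1, ])), can be spelled out step by step. This is a sketch only, assuming the S4Vectors and DelayedArray packages from Bioconductor are installed and that DelayedArray() still accepts a DataFrame as described in this message; the names `delayed` and `delayedMeans` are illustrative.

```r
library(S4Vectors)     # DataFrame()
library(DelayedArray)  # DelayedArray() and its colMeans() method

dataTableS4 <- DataFrame(aFeature = 1:5, anotherFeature = 5:1)
M <- DelayedArray(dataTableS4)

# Dropping the first row and taking log() are both delayed:
# the operations are recorded on the object, not computed yet.
delayed <- log(M[-1, ])

# colMeans() is not delayed; calling it realizes the data block by block.
delayedMeans <- colMeans(delayed)
print(delayedMeans)
```

The result should match what mean(log(2:5)) and mean(log(4:1)) give for the two columns, since the first row has been dropped before the logarithm is applied.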
On 01/16/2018 06:29 AM, Michael Lawrence wrote:
> Please be more specific about the desired operations, or, better, submit a pull request with them. colMeans() in particular was intentionally omitted because it depends on having homogeneous data, which is better suited for a matrix, not a data frame.
Hervé Pagès
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024
E-mail: hpages at fredhutch.org
Phone: (206) 667-5791
Fax: (206) 667-1319