[Bioc-devel] requirement for named assays in SummarizedExperiment

7 messages · Valerie Obenchain, Kasper Daniel Hansen, Tim Triche, Jr. +2 more

#
Hi Aaron,

Thanks for catching this.

I favor enforcing names in 'assays'. Combining by position alone is too 
dangerous. I'm thinking of the VCF class where the genotype information is 
stored in 'assays' and the fields are rarely in the same order.
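
To illustrate the concern about field order, here is a minimal base-R sketch. It uses plain lists of matrices as stand-ins for 'assays'; the field names GT and DP and all values are made up for illustration and are not taken from a real VCF object.

```r
# Two hypothetical assay lists holding the same fields in a different order.
a1 <- list(GT = matrix(1:4, nrow = 2), DP = matrix(11:14, nrow = 2))
a2 <- list(DP = matrix(21:24, nrow = 2), GT = matrix(5:8, nrow = 2))

# Positional combination: the first element of a1 (GT) is silently
# bound to the first element of a2 (DP).
by_position <- Map(cbind, a1, a2)

# Name-based combination: reorder a2 to follow a1 before binding.
by_name <- Map(cbind, a1, a2[names(a1)])
```

In this sketch `by_position$GT` ends up containing columns from `a2$DP`, with no error or warning, while `by_name$GT` contains only GT columns.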

Looks like we also need a more informative error message when names 
don't match.

 > assays(se1)
List of length 1
names(1): counts1

 > assays(se2)
List of length 1
names(1): counts2

 > cbind(se1, se2)
Error in sQuote(accessorName) :
   argument "accessorName" is missing, with no default


Valerie
On 03/05/2015 11:09 PM, Aaron Lun wrote:
5 days later
#
Hi,

After talking with others the vote was against enforcing names on 
assays() and for positional matching if all names are NULL. A mixture of 
names and NULL throws an error.

example(SummarizedExperiment)

## all named
 > se2 = se1
 > assays(cbind(se1, se2))
List of length 1
names(1): counts

## mixture of names and NULL -> error
 > names(assays(se1)) = NULL
 > assays(cbind(se1, se2))
Error in assays(cbind(se1, se2)) :
   error in evaluating the argument 'x' in selecting a method for 
function 'assays': Error in .bind.arrays(args, cbind, "assays") :
   elements in 'assays' must have the same names

## all NULL -> positional matching
 > names(assays(se2)) = NULL
 > assays(cbind(se1, se2))
List of length 1

If we find common use cases where positional matching is needed with a 
mixture of names and NULL we can always relax this constraint.

Changes are in 1.19.46.
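
The rule above can be sketched in base R on plain lists of matrices. This is only an illustration under assumptions: `bind_assay_lists` is a hypothetical helper, not the package's internal `.bind.arrays()`, and reordering the second list by name is an assumption that this thread does not confirm.

```r
# Hypothetical helper illustrating the three cases; not the package code.
bind_assay_lists <- function(x, y) {
  nx <- names(x)
  ny <- names(y)
  if (length(x) != length(y))
    stop("assay lists must have the same length")
  if (is.null(nx) && is.null(ny)) {
    # all names NULL -> positional matching
    return(Map(cbind, x, y))
  }
  if (is.null(nx) || is.null(ny))
    stop("cannot combine named and unnamed assays")
  if (!setequal(nx, ny))
    stop("assay names differ: ", paste(nx, collapse = ", "),
         " vs ", paste(ny, collapse = ", "))
  # all named -> match by name (assumed), following the order of x
  Map(cbind, x, y[nx])
}

m <- matrix(1:4, nrow = 2)
named   <- bind_assay_lists(list(counts = m), list(counts = m))
unnamed <- bind_assay_lists(list(m), list(m))
mixed   <- try(bind_assay_lists(list(m), list(counts = m)), silent = TRUE)
```

Here `named` keeps the name `counts`, `unnamed` has no names, and `mixed` holds the error raised for the mixture case.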

Valerie
On 03/06/2015 08:20 AM, Valerie Obenchain wrote:

  
    
#
Allowing positional matching strikes me as far too fragile.
Depending on the actual implementation, it may not even be clear that the
assays have an order.

On Wed, Mar 11, 2015 at 2:45 PM, Valerie Obenchain <vobencha at fredhutch.org>
wrote:

  
  
#
What he said

This doesn't make any sense from an API perspective.  When would a user ever expect to see unnamed assay matrices?

--t
#
On 03/12/2015 08:12 AM, Tim Triche, Jr. wrote:
When there's a single assay?

  
    
#
Yes, a single-assay SummarizedExperiment would be the most common case 
for unnamed assays. But I think at the very least there should be a 
warning on unnamed assays.
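
A rough sketch of what such a warning could look like, again in base R on a plain list of matrices. `warn_if_unnamed` is a hypothetical function and the message text is invented; nothing here is part of the SummarizedExperiment API.

```r
# Hypothetical check; `assay_list` stands in for a plain list of matrices.
warn_if_unnamed <- function(assay_list) {
  nms <- names(assay_list)
  # Treat a missing names attribute, NA names, and empty names as unnamed.
  unnamed <- is.null(nms) || any(is.na(nms) | nms == "")
  if (unnamed)
    warning("some assays are unnamed; they can only be combined by position")
  invisible(!unnamed)
}

# Capture the warning for an unnamed single-assay list.
res <- tryCatch(warn_if_unnamed(list(matrix(0, 1, 1))),
                warning = function(w) "warned")
```

A warning rather than an error keeps the single-assay case working while still nudging users toward naming their assays.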
On 3/12/15 9:24 AM, Martin Morgan wrote:
#
Is that a good thing? (Is it such an important special case that slapping a name on what exactly the assay contains would break other functionality?)

If memory serves, the shallow reference class that points at the assays doesn't need to copy objects in order for them to have a name, but maybe my memory is not serving me well in this case.  Is that the issue?

Thanks,

--t