Steve Lianoglou
Denali Therapeutics
On Tue, Jan 22, 2019 at 2:54 PM Aaron Lun <aaron.tin.long.lun at gmail.com> wrote:
>
> For 10X experiments, the Bioc-devel version of DropletUtils will read in
> the additional features as extra rows in the count matrix. This reflects
> how they are stored in the 10X output format. The row metadata will
> record the nature of the feature.
>
> In some cases it may be desirable to keep all the features together. For
> starters, it seems like many of the biases are likely to be shared
> (w.r.t. library preparation and capture efficiency), so one could
> imagine using the same scaling factors for normalization of both
> antibody-based features and endogenous mRNAs. In addition, all of the
> scater visualization methods rely on SCE inputs, so if you want to
> overlay them with protein marker intensities, they'll need to be in the
> same matrix.
>
> If you really need to only use mRNAs or antibody-based features, (i) you
> can explicitly subset the SCE based on the rowData, or (ii) pass a
> subsetting vector to the various scran/scater/whatever functions to tell
> them to only use the specified features. Admittedly, if you're going to
> be doing this a lot, it would be more convenient to form a MAE
> containing two SCEs so that you only have to pass the SCE you want into
> those functions.
>
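For anyone following along, here is a minimal sketch of what the subsetting and MAE options above might look like. It is not DropletUtils code: the toy counts, the feature names, and the `Type` column name with its two labels are assumptions made up for illustration, so check the rowData of your own object for the actual column.

```r
library(SingleCellExperiment)
library(MultiAssayExperiment)

# Toy data (hypothetical): 3 genes and 2 antibody tags in the same 4 cells.
set.seed(1)
counts <- matrix(
    rpois(20, lambda = 5), nrow = 5,
    dimnames = list(
        c("GeneA", "GeneB", "GeneC", "ADT1", "ADT2"),
        paste0("cell", 1:4)
    )
)
sce <- SingleCellExperiment(assays = list(counts = counts))

# Assumption: the row metadata records the nature of each feature in a
# column called "Type". The column name and labels here are illustrative.
rowData(sce)$Type <- c(rep("Gene Expression", 3), rep("Antibody Capture", 2))

# Option (i): explicitly subset the SCE based on the rowData.
# The same logical vector is the kind of thing one could hand to a
# function that accepts a row-subsetting argument (option ii).
is_rna <- rowData(sce)$Type == "Gene Expression"
sce_rna <- sce[is_rna, ]
sce_adt <- sce[!is_rna, ]

# The more convenient form: one MAE holding two SCEs over the same cells.
# With no colData supplied, the cells are matched up by column name.
mae <- MultiAssayExperiment(
    experiments = ExperimentList(list(rna = sce_rna, adt = sce_adt))
)

# Pull out just the SCE you want to pass downstream.
adt_only <- experiments(mae)[["adt"]]
```

The point of the MAE wrapper is only convenience: each element is still an ordinary SCE, so nothing downstream needs to know that the other feature space exists.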
> To that end I would be willing to entertain a PR to DropletUtils to
> create a MAE from an SCE. I'm more reluctant to add an isSpike()-like
> function. The rationale behind isSpike() was that spike-ins are constant
> across cells (theoretically) and thus a function could use this
> information to improve its calculations. It's less clear what
> mathematically useful information can be gained from protein markers -
> biological info, yes, but nothing that you would use to change your
> algorithm.
>
> -A
>
> Steve Lianoglou wrote:
> > Comrades,
> >
> > Sorry if I'm out of the loop and have missed anything obvious.
> >
> > I was curious what the plans are in the single-cell Bioconductor-verse
> > to support single-cell experiments that produce counts from different
> > feature spaces, such as those produced by CITE-seq / REAP-seq, for
> > instance.
> >
> > In these types of experiments, I'm pretty sure we want the counts
> > generated from those "features" (oligo-conjugated antibodies, for
> > instance) to be kept in a separate space from the mRNA counts. I think
> > we would most naturally want to put these in something like an
> > `assay()` matrix with a different (row-wise) dimension than the gene
> > count matrix, but that can't work since all matrices in the assay()
> > list need to have the same dimensions.
> >
> > Another option might be to just add them as rows to the assay
> > matrices, but keep some type of feature space meta-information akin to
> > what `isSpike()` currently does;
> >
> > Or add a new slot to SingleCellExperiment to hold counts from
> > different feature spaces, perhaps?
> >
> > Or rely on something like a MultiAssayExperiment?
> >
> > Or?
> >
> > Curious to learn which way you folks are leaning ...
> >
> > Thanks!
> > -steve
> >
> > ps - sorry if this email came through twice, it was somehow magically
> > sent from an email address I don't have access to anymore.
> >
>