
[Bioc-devel] New ExperimentHub resource and some related questions

2 messages · Lu, Dongyi (Lambda), Martin Morgan

I don't mean to name the package "SingleCell"; I was referring to the biocViews term. Also, the BUS format is quite different from the 10x molecule info file: CellRanger aligns reads to the genome with STAR, whereas the BUS file is generated by pseudoalignment to a transcriptome index and gives the set of transcripts a read is compatible with rather than the gene a read aligns to.

The ExperimentHub vignette about creating a new ExperimentHub package says we should contact a Bioconductor team member to upload the data. Does that mean I should directly email one of the core team members?

Lambda

On 12/21/18, 3:02 AM, "Bioc-devel on behalf of bioc-devel-request at r-project.org" <bioc-devel-bounces at r-project.org on behalf of bioc-devel-request at r-project.org> wrote:

    Message: 1
    Date: Thu, 20 Dec 2018 12:00:20 +0000
    From: Aaron Lun <infinite.monkeys.with.keyboards at gmail.com>
    To: bioc-devel <bioc-devel at r-project.org>
    Subject: Re: [Bioc-devel]  New ExperimentHub resource and some related
    	questions
    Message-ID: <9BF95433-AF04-431B-B71D-62425195DEBE at gmail.com>
    Content-Type: text/plain; charset="utf-8"
    
    I presume your package is not actually called "SingleCell" (in point 1). This would be pretty confusing when compared to the simpleSingleCell package, the SingleCellExperiment package, and the SingleCell biocViews term itself. It would probably make more sense to call it BUStoolsR or some other appropriate pun (e.g., RBUS, which is funniest when it gets to version 3.8.0.).
    
    Also, at first glance, the BUS format seems pretty similar to 10X's molecule information file, for which the DropletUtils package has a series of reader functions. You may find some of the code there useful for your package. I might also add a readBUS() function to DropletUtils if this turns out to be a popular format for droplet data, though TBH the sparse matrix is a much more common starting point.
    
    -A
> On 20 Dec 2018, at 01:42, Lu, Dongyi (Lambda) <dlu2 at caltech.edu> wrote:
> 
    > Hi everyone,
    > 
    > I'm writing a package (biocViews SingleCell) that converts files of the BUS format (standing for Barcode, UMI, Set, see https://www.biorxiv.org/content/early/2018/11/21/472571) into a sparse matrix in R that can be used in Seurat and SingleCellExperiment. In order to write the examples and the vignette, I'm also putting the data itself into a package for ExperimentHub. The data used here are some mixed human and mouse cells from 10x. Here are my questions:
    > 
    > 
    >  1.  In the documentation for `ExperimentHubData::makeExperimentHubMetadata`, the fields `RDataClass` and `DispatchClass` are required. However, this accompanying dataset package is meant to download text files (generated by command line tools outside R) to disk rather than into the R session, and it's the job of the SingleCell package to convert the text files into a sparse matrix. There is a website documenting how the command line tools were used to generate the text files. So is this dataset still appropriate for ExperimentHub?
    >  2.  If it is appropriate, then what shall I put in `RDataClass` and `DispatchClass`?
    > 
    > Thanks,
    > Lambda
    
    ------------------------------
    
    Message: 2
    Date: Thu, 20 Dec 2018 12:05:57 +0000
    From: "Shepherd, Lori" <Lori.Shepherd at RoswellPark.org>
    To: "Lu, Dongyi (Lambda)" <dlu2 at caltech.edu>,
    	"bioc-devel at r-project.org" <bioc-devel at r-project.org>
    Subject: Re: [Bioc-devel]  New ExperimentHub resource and some related
    	questions
    Message-ID:
    	<MW2PR12MB23645E21836B066C9E38F9DDF9BF0 at MW2PR12MB2364.namprd12.prod.outlook.com>
    	
    Content-Type: text/plain; charset="utf-8"
    
    There is a DispatchClass, FilePath, that will download the file and give you the path to the file in the cache location rather than loading it into the R session. You can then use the file path in whatever read/load/etc. method you see fit.
    
    For RDataClass, I would say either character or matrix, given that there will be instructions on how to load the data somewhere in your package.
    
    
    
    Lori Shepherd
    Bioconductor Core Team
    Roswell Park Cancer Institute
    Department of Biostatistics & Bioinformatics
    Elm & Carlton Streets
    Buffalo, New York 14263
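
As a rough illustration of the DispatchClass/RDataClass suggestion in the message above, here is a minimal base-R sketch of the two relevant metadata columns. All titles, descriptions, and paths are hypothetical placeholders, and the other columns that `makeExperimentHubMetadata` requires (for example SourceUrl and Maintainer) are left out for brevity, so this is not a complete or validated metadata file.

```r
# Hypothetical metadata for two text files produced outside R.
# Every value below is a placeholder, not the real resource description.
meta <- data.frame(
  Title = c("Example BUS output (text)", "Example transcript list (text)"),
  Description = c(
    "Placeholder description of a BUS file exported as text",
    "Placeholder description of the matching transcript list"
  ),
  RDataPath = c("examplePkg/output.sorted.txt", "examplePkg/transcripts.txt"),
  # FilePath: the file is downloaded and its cached path is returned,
  # instead of the file being loaded into the R session.
  DispatchClass = "FilePath",
  # character: what the user receives is a path string.
  RDataClass = "character",
  stringsAsFactors = FALSE
)

# The metadata is normally kept as a CSV under inst/extdata of the data
# package; a temporary file stands in for that location here.
csv <- file.path(tempdir(), "metadata.csv")
write.csv(meta, csv, row.names = FALSE)
```

With FilePath, retrieving the resource from the hub should hand back the path to the cached file, which the converting package can then read with its own functions.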
1 day later
This email is enough to start the conversation, but the person who will help is on holiday until approximately January 3, so a response will be delayed.

Martin

On 12/21/18, 7:26 PM, "Bioc-devel on behalf of Lu, Dongyi (Lambda)" <bioc-devel-bounces at r-project.org on behalf of dlu2 at caltech.edu> wrote:

    I don't mean to name the package "SingleCell"; I was referring to the biocViews term. Also, the BUS format is quite different from the 10x molecule info file: CellRanger aligns reads to the genome with STAR, whereas the BUS file is generated by pseudoalignment to a transcriptome index and gives the set of transcripts a read is compatible with rather than the gene a read aligns to.
    
    The ExperimentHub vignette about creating a new ExperimentHub package says we should contact a Bioconductor team member to upload the data. Does that mean I should directly email one of the core team members?
    
    Lambda