you are making it far too difficult
Thank you!
On Thu, Dec 27, 2018 at 4:43 PM Sarah Goslee <sarah.goslee at gmail.com> wrote:
\t is the symbol for a tab. /t is two characters just as it seems. It's highly unlikely your file is delimited with /t, which would look like 1/t2/t3.
The help for read.table mentions this tangentially as part of read.delim(), and you can find out more under ?regex - see the section about escaping non-metacharacters with a backslash.
Sarah
On Thu, Dec 27, 2018 at 4:09 PM Spencer Brackett <spbrackett20 at saintjosephhs.com> wrote:
What is the significance of using / or \ ?
On Thu, Dec 27, 2018 at 4:02 PM Sarah Goslee <sarah.goslee at gmail.com> wrote:
On Thu, Dec 27, 2018 at 2:03 PM Spencer Brackett <spbrackett20 at saintjosephhs.com> wrote:
Thank you for the help! I tried using the read.table command in RStudio with the following call, and managed to open the file.
GBM_protein_expression <- read.table(file.choose(), header=TRUE, sep="/t")
Note that sep="/t" is NOT the same thing as the sep="\t" you were advised to use.
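To make the difference concrete, here is a minimal sketch that can be run at the R console. The variable names tab and slash are only illustrative.

```r
tab   <- "\t"   # escape sequence: one single tab character
slash <- "/t"   # two ordinary characters: a forward slash and the letter t

nchar(tab)       # length of the tab string in characters
nchar(slash)     # length of the slash-t string in characters
utf8ToInt(tab)   # code point of the character stored in tab
```

The first string has length 1 and the second has length 2, which is why only "\t" can match the tabs in a tab-delimited file.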
However, my data did not unpack as yours did. I again only received a table of true and false distinctions per column, and my environment tab says that there are 0 observations of 0 variables.
I believe I should be getting data similar to what you got, as it would appear that yours actually contains relevant gene/protein expression info.
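When a read comes back empty like this, one possible way to check which delimiter the file really uses is sketched below. The toy file, its temporary path and its three columns are assumptions made for illustration, with values borrowed from the output shown elsewhere in this thread. In practice tsv_path would point at the downloaded file.

```r
# Toy tab-separated file standing in for the real download (assumption).
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c(
  "icgc_donor_id\tgene_name\tnormalized_expression_level",
  "DO12370\tYWHAE\t-1.1636330",
  "DO12370\tEIF4EBP1\t-1.7969721"
), tsv_path)

# Look at the raw header line before parsing anything.
first_line <- readLines(tsv_path, n = 1)
has_tab     <- grepl("\t", first_line, fixed = TRUE)  # real tab characters present?
has_slash_t <- grepl("/t", first_line, fixed = TRUE)  # literal slash followed by t present?

# Count the fields on every line when splitting on tabs.
fields <- count.fields(tsv_path, sep = "\t")
table(fields)
```

If has_tab is TRUE and every line has the same number of fields, sep = "\t" is very likely the right choice.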
On Thu, Dec 27, 2018 at 6:21 AM Federico Calboli < federico.calboli at kuleuven.be> wrote:
Once you have your TSV files just use something like
x = read.table('protein_expression.tsv', header = TRUE, sep = '\t')
Do not copy and paste the code from this email, because it is formatted and would not work in R.
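For a version that can be typed or pasted as plain text, here is a small self-contained sketch of the same read.table call. It writes a toy tab-separated file first. The temporary path and the reduced set of three columns are assumptions for illustration, and the values are taken from the output in this email.

```r
# Build a tiny tab-separated file to read back (toy stand-in for the real data).
tsv_path <- tempfile(fileext = ".tsv")
writeLines(c(
  "icgc_donor_id\tgene_name\tnormalized_expression_level",
  "DO12370\tYWHAE\t-1.1636330",
  "DO12370\tEIF4EBP1\t-1.7969721"
), tsv_path)

# Same call as above, with the arguments spelled out in full.
x <- read.table(tsv_path, header = TRUE, sep = "\t", stringsAsFactors = FALSE)

dim(x)   # rows and columns actually read
str(x)   # column names and types
```

Checking dim(x) and str(x) right after reading shows at once whether the columns were split as expected.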
Best
F
PS the data looks like this to me
head(x)
icgc_donor_id project_code icgc_specimen_id icgc_sample_id submitted_sample_id analysis_id antibody_id gene_name
1 DO12370 GBM-US SP26475 SA131594 TCGA-19-5960-01A-13-1900-20 97765 14-3-3_epsilon-M-C YWHAE
2 DO12370 GBM-US SP26475 SA131594 TCGA-19-5960-01A-13-1900-20 97765 4E-BP1-R-V EIF4EBP1
3 DO12370 GBM-US SP26475 SA131594 TCGA-19-5960-01A-13-1900-20 97765 4E-BP1_pS65-R-V EIF4EBP1
4 DO12370 GBM-US SP26475 SA131594 TCGA-19-5960-01A-13-1900-20 97765 4E-BP1_pT37-R-V EIF4EBP1
5 DO12370 GBM-US SP26475 SA131594 TCGA-19-5960-01A-13-1900-20 97765 4E-BP1_pT70-R-C EIF4EBP1
6 DO12370 GBM-US SP26475 SA131594 TCGA-19-5960-01A-13-1900-20 97765 53BP1-R-C TP53BP1
gene_stable_id gene_build_version normalized_expression_level verification_status verification_platform
1 NA NA -1.1636330 not tested NA
2 NA NA -1.7969721 not tested NA
3 NA NA -0.7256390 not tested NA
4 NA NA 0.6498421 not tested NA
5 NA NA -1.0262844 not tested NA
6 NA NA 1.5186400 not tested NA
platform
1 M.D. Anderson Reverse Phase Protein Array Core
2 M.D. Anderson Reverse Phase Protein Array Core
3 M.D. Anderson Reverse Phase Protein Array Core
4 M.D. Anderson Reverse Phase Protein Array Core
5 M.D. Anderson Reverse Phase Protein Array Core
6 M.D. Anderson Reverse Phase Protein Array Core
experimental_protocol
1 MDA_RPPA_Core
2 MDA_RPPA_Core
3 MDA_RPPA_Core
4 MDA_RPPA_Core
5 MDA_RPPA_Core
6 MDA_RPPA_Core
raw_data_repository raw_data_accession
1 TCGA TCGA-19-5960-01A-13-1900-20
2 TCGA TCGA-19-5960-01A-13-1900-20
3 TCGA TCGA-19-5960-01A-13-1900-20
4 TCGA TCGA-19-5960-01A-13-1900-20
5 TCGA TCGA-19-5960-01A-13-1900-20
6 TCGA TCGA-19-5960-01A-13-1900-20
-- Sarah Goslee (she/her) http://www.numberwright.com