Caitlin,

Forgive me, but I'm not quite sure what your question is asking. The data is originally from the TCGA, and I have it downloaded in another R script. I opened a new script to run the functions I posted to this forum because I could not enter any other commands into the console: the translated data filled the entire console. Perhaps it overloaded it? Regardless, I was unable to enter any further commands.

-Spencer Brackett

On Sun, Aug 26, 2018 at 8:27 PM Caitlin <bioprogrammer at gmail.com> wrote:

> You're welcome Spencer :)
>
> The 4th line:
>
> path <- "."
>
> refers to the current directory (the dot in other words). Is the data
> stored in the same directory where the code is being run?
>
> On Sun, Aug 26, 2018 at 5:22 PM Spencer Brackett <spbrackett20 at saintjosephhs.com> wrote:
>
>> Thank you! I will make note of that. Unfortunately, lines 1 and 4 of the
>> first portion of this analysis appear to be where the error begins, and
>> several subsequent lines also come up as "errored". Perhaps this is
>> an issue of the capitalization and/or spacing (something within the text)?
>> The proposed method for methylation data extraction is based on the first
>> third of the following TCGA workflow:
>> https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5302158/#!po=0.0715308
>>
>> Best,
>>
>> Spencer Brackett
>>
>> On Sun, Aug 26, 2018 at 8:07 PM Caitlin <bioprogrammer at gmail.com> wrote:
>>
>>> Hi Spencer.
>>>
>>> Should you capitalize the following library import?
>>>
>>> library(summarizedExperiment)
>>>
>>> In other words, I think that line should be:
>>>
>>> library(SummarizedExperiment)
>>>
>>> Hope this helps.
>>>
>>> ~Caitlin
>>>
>>> On Sun, Aug 26, 2018 at 2:09 PM Spencer Brackett <spbrackett20 at saintjosephhs.com> wrote:
>>>
>>>> Good evening,
>>>>
>>>> I am attempting to run the following analysis on TCGA data, however
>>>> something is being reported as an error in my arguments... any ideas as to
>>>> what is incorrect in the following? Thanks!
>>>>
>>>> 1 library(TCGAbiolinks)
>>>> 2
>>>> 3 # Download the DNA methylation data: HumanMethylation450 LGG and GBM.
>>>> 4 path <- "."
>>>> 5
>>>> 6 query.met <- TCGAquery(tumor = c("LGG","GBM"),"HumanMethylation450", level = 3)
>>>> 7 TCGAdownload(query.met, path = path )
>>>> 8 met <- TCGAprepare(query = query.met,dir = path,
>>>> 9                    add.subtype = TRUE, add.clinical = TRUE,
>>>> 10                   summarizedExperiment = TRUE,
>>>> 11                   save = TRUE, filename = "lgg_gbm_met.rda")
>>>> 12
>>>> 13 # Download the expression data: IlluminaHiSeq_RNASeqV2 LGG and GBM.
>>>> 14 query.exp <- TCGAquery(tumor = c("lgg","gbm"), platform = "IlluminaHiSeq_RNASeqV2",level = 3)
>>>> 15
>>>> 16 TCGAdownload(query.exp,path = path, type = "rsem.genes.normalized_results")
>>>> 17
>>>> 18 exp <- TCGAprepare(query = query.exp, dir = path,
>>>> 19                    summarizedExperiment = TRUE,
>>>> 20                    add.subtype = TRUE, add.clinical = TRUE,
>>>> 21                    type = "rsem.genes.normalized_results",
>>>> 22                    save = T,filename = "lgg_gbm_exp.rda")
>>>>
>>>> To download data on DNA methylation and gene expression...
>>>>
>>>> library(summarizedExperiment)
>>>> # get expression matrix
>>>> data <- assay(exp)
>>>>
>>>> # get sample information
>>>> sample.info <- colData(exp)
>>>>
>>>> # get genes information
>>>> genes.info <- rowRanges(exp)
>>>>
>>>> Following stepwise procedure for obtaining GBM and LGG clinical data...
>>>>
>>>> # get clinical patient data for GBM samples
>>>> gbm_clin <- TCGAquery_clinic("gbm","clinical_patient")
>>>>
>>>> # get clinical patient data for LGG samples
>>>> lgg_clin <- TCGAquery_clinic("lgg","clinical_patient")
>>>>
>>>> # Bind the results, as the columns might not be the same,
>>>> # we will plyr rbind.fill , to have all columns from both files
>>>> clinical <- plyr::rbind.fill(gbm_clin ,lgg_clin)
>>>>
>>>> # Other clinical files can be downloaded,
>>>> # Use ?TCGAquery_clinic for more information
>>>> clin_radiation <- TCGAquery_clinic("lgg","clinical_radiation")
>>>>
>>>> # Also, you can get clinical information from different tumor types.
>>>> # For example sample 1 is GBM, sample 2 and 3 are TGCT
>>>> data <- TCGAquery_clinic(clinical_data_type = "clinical_patient",
>>>>                          samples = c("TCGA-06-5416-01A-01D-1481-05",
>>>>                                      "TCGA-2G-AAEW-01A-11D-A42Z-05",
>>>>                                      "TCGA-2G-AAEX-01A-11D-A42Z-05"))
>>>>
>>>> # Searching idat file for DNA methylation
>>>> query <- GDCquery(project = "TCGA-GBM",
>>>>                   data.category = "Raw microarray data",
>>>>                   data.type = "Raw intensities",
>>>>                   experimental.strategy = "Methylation array",
>>>>                   legacy = TRUE,
>>>>                   file.type = ".idat",
>>>>                   platform = "Illumina Human Methylation 450")
>>>>
>>>> **Repeat for LGG**
>>>>
>>>> To access mutational information concerning TMZ methylation...
>>>>
>>>> > mutation <- TCGAquery_maf(tumor = "lgg")
>>>> Getting maf tables
>>>> Source: https://wiki.nci.nih.gov/display/TCGA/TCGA+MAF+Files
>>>> We found these maf files below:
>>>>   MAF.File.Name
>>>> 2 hgsc.bcm.edu_LGG.IlluminaGA_DNASeq.1.somatic.maf
>>>> 3 LGG_FINAL_ANALYSIS.aggregated.capture.tcga.uuid.curated.somatic.maf
>>>>
>>>>   Archive.Name                                                Deploy.Date
>>>> 2 hgsc.bcm.edu_LGG.IlluminaGA_DNASeq_automated.Level_2.1.0.0 10-DEC-13
>>>> 3 broad.mit.edu_LGG.IlluminaGA_DNASeq_curated.Level_2.1.3.0  24-DEC-14
>>>>
>>>> Please, select the line that you want to download: 3
>>>>
>>>> **Repeat this for GBM***
>>>>
>>>> Selecting specified lines to download...
>>>>
>>>> gbm.subtypes <- TCGAquery_subtype(tumor = "gbm")
>>>> lgg.subtypes <- TCGAquery_subtype(tumor = "lgg")
>>>>
>>>> Downloading data via the Bioconductor package RTCGAToolbox...
>>>>
>>>> library(RTCGAToolbox)
>>>>
>>>> # Get the last run dates
>>>> lastRunDate <- getFirehoseRunningDates()[1]
>>>> lastAnalyseDate <- getFirehoseAnalyzeDates(1)
>>>>
>>>> # get DNA methylation data, RNAseq2 and clinical data for LGG
>>>> lgg.data <- getFirehoseData(dataset = "LGG",
>>>>     gistic2_Date = getFirehoseAnalyzeDates(1), runDate = lastRunDate,
>>>>     Methylation = TRUE, RNAseq2_Gene_Norm = TRUE, Clinic = TRUE,
>>>>     Mutation = T,
>>>>     fileSizeLimit = 10000)
>>>>
>>>> # get DNA methylation data, RNAseq2 and clinical data for GBM
>>>> gbm.data <- getFirehoseData(dataset = "GBM",
>>>>     runDate = lastDate, gistic2_Date = getFirehoseAnalyzeDates(1),
>>>>     Methylation = TRUE, Clinic = TRUE, RNAseq2_Gene_Norm = TRUE,
>>>>     fileSizeLimit = 10000)
>>>>
>>>> # To access the data you should use the getData function
>>>> # or simply access with @ (for example gbm.data@Clinical)
>>>> gbm.mut <- getData(gbm.data,"Mutations")
>>>> gbm.clin <- getData(gbm.data,"Clinical")
>>>> gbm.gistic <- getData(gbm.data,"GISTIC")
>>>>
>>>> Genomic Analysis/Final data extraction:
>>>>
>>>> Enable "getData" to access the data
>>>>
>>>> Obtaining GISTIC results...
>>>>
>>>> # Download GISTIC results
>>>> gistic <- getFirehoseData("GBM",gistic2_Date ="20141017" )
>>>>
>>>> # get GISTIC results
>>>> gistic.allbygene <- gistic@GISTIC@AllByGene
>>>> gistic.thresholedbygene <- gistic@GISTIC@ThresholedByGene
>>>>
>>>> Repeat this procedure to obtain LGG GISTIC results.
>>>>
>>>> ***Please ignore the 'non-coded' text as they are procedural
>>>> steps/classifications***
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>[[alternative HTML version deleted]]
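
On the capitalization point raised in the quoted replies: R package names are case-sensitive, so library(summarizedExperiment) and library(SummarizedExperiment) are different requests. A minimal sketch of a check for this kind of slip follows. Note that `suggest_package_case` is a hypothetical helper written for illustration, not part of any package, and the candidate list is hard-coded from names used in this thread rather than read from the installed library.

```r
# Hypothetical helper: given a possibly mis-capitalized package name,
# return the candidates that match it when letter case is ignored.
suggest_package_case <- function(name, known) {
  known[tolower(known) == tolower(name)]
}

# Package names taken from the thread; in a live session `known` could
# instead be rownames(installed.packages()).
known <- c("TCGAbiolinks", "SummarizedExperiment", "RTCGAToolbox")

suggestion <- suggest_package_case("summarizedExperiment", known)
print(suggestion)
```

An empty result would suggest the package is missing from the candidate list altogether rather than merely mis-capitalized.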
No worries Spencer. Is there no downloaded data? Is nothing physically stored on your hard drive? The dot in the path would be interpreted (no pun intended!) as something like the following:

If the TCGA data were stored in a file named "tcga_data.dat" in a directory named "C:\spencer", and you ran the script from that same folder, the 4th line of that script would set the path to "C:\spencer", so the file would be found at "C:\spencer\tcga_data.dat". If your TCGA data is not stored in the same folder from which the script is being run, it won't find any data to work with. Does this help?
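
To see what the dot refers to in a given session, it can help to resolve it explicitly. The sketch below uses only base R. The file name "tcga_data.dat" is the hypothetical example from the message above, not a real file, so `file.exists()` is expected to report whatever is true on the machine where this runs.

```r
# "." is resolved against R's current working directory.
path <- "."

resolved <- normalizePath(path)
print(resolved)   # full path of the working directory
print(getwd())    # the same location, as reported by R directly

# Build the path to a (hypothetical) data file inside that directory
# and check whether it is really there before trying to read it.
data_file <- file.path(path, "tcga_data.dat")
print(data_file)
print(file.exists(data_file))
```

If the resolved directory is not the one holding the data, either move the files there or change the working directory with setwd() before running the script.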
Caitlin,

Perhaps that is the problem. To be more specific, the data was transferred from the TCGA database to CSV files: there are two separate CSV files for this analysis, one for GBM and one for LGG. Both CSV files were then individually loaded into my open R console. Upon arranging them with the summary() function, the data expanded and took up the whole console page, even seemingly overriding the arguments that allowed the data to be loaded into R in the first place. Are you suggesting that I would need a flash drive to use the function you suggested? Or could I do so with the CSV files I mentioned? If so, how?

-Spencer B
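
If the console was flooded because a whole data set was printed, one option is to inspect the object through compact summaries instead. This is only a sketch under that assumption: `big` is a made-up stand-in data frame, not the TCGA data, and the column names are invented for illustration.

```r
# Stand-in for a large table read from a CSV file (hypothetical data).
big <- data.frame(id = seq_len(10000), value = rnorm(10000))

# Compact ways to look at a large object without printing every row.
size      <- dim(big)       # number of rows and columns
first_few <- head(big, 5)   # only the first five rows

print(size)
print(first_few)
str(big)                    # one line per column: type and a few values
```

Assigning the result of read.csv() to a name, rather than calling it bare at the prompt, also keeps the data from being echoed to the console.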
Hello all,

To begin my analysis, I downloaded two TCGA datasets (GBM and LGG), both CSV files, into an R script after loading the cBioLite package. Following this, I entered the following command:

> the_data<-read.csv(file="c:/file_name.csv,header=TRUE,sep=",")

Upon running the line I received this:

+

If I continue to press Enter, the + sign continues to appear on every new line. Does anyone know what this indicates and how I may continue with my analysis? My next step after this would have been the following (the numbers before each command are line markers, not part of the line):

1 library(TCGAbiolinks)
2
3 # Download the DNA methylation data: HumanMethylation450 LGG and GBM.
4 path <- "."

Best wishes,

Spencer Brackett

On Sun, Aug 26, 2018 at 9:13 PM Caitlin <bioprogrammer at gmail.com> wrote:

> You're welcome Spencer :)
>
> I hope I was able to help you. If this problem persists, or a new one
> appears, feel free to post or email. You might also like:
>
> https://www.biostars.org/
>
> It is quite similar to StackOverflow but with a biological sciences focus.
>
> Hope this helps!
>
> ~Caitlin
>
> On Sun, Aug 26, 2018 at 6:02 PM Spencer Brackett <spbrackett20 at saintjosephhs.com> wrote:
>
>> Caitlin,
>>
>> Thanks again! I already have the data stored in those two CSV files
>> on my desktop, but if running them with this function does not work, then I
>> will try it with a flash drive.
>>
>> Best,
>>
>> Spencer Brackett
>>
>> On Sun, Aug 26, 2018 at 8:56 PM Caitlin <bioprogrammer at gmail.com> wrote:
>>
>>> Hmm...could you store each in its own file (a flash drive would be fine)
>>> then use:
>>>
>>> the_data <- read.csv(file="c:/file_name.csv", header=TRUE, sep=",")
>>>
>>> to read each into your script? The data would then exist as a data frame
>>> object that you could then work with.
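
A + prompt generally means R is still waiting for the current expression to be finished, most often because a quote or bracket was never closed. The quotation marks in the command as archived are garbled, so the following is a guess rather than a diagnosis: if the closing quote after the file name was missing, R would treat everything up to the next quote as one string and keep waiting. The sketch below checks two versions of the line with parse(), which only reads the code and never touches the file; `parses_cleanly` is a hypothetical helper written for this illustration.

```r
# Hypothetical helper: TRUE if the text is complete, valid R syntax.
# parse() only checks syntax; it does not run the code or open any file.
parses_cleanly <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) FALSE)
}

# Every opening quote has a matching closing quote.
balanced <- 'the_data <- read.csv(file = "c:/file_name.csv", header = TRUE, sep = ",")'

# The quote after the file name is missing, so the last quote opens a
# string that never ends.
unbalanced <- 'the_data <- read.csv(file = "c:/file_name.csv, header = TRUE, sep = ",")'

print(parses_cleanly(balanced))
print(parses_cleanly(unbalanced))
```

To leave a stuck + prompt, press Esc in RStudio or the Windows R GUI, or Ctrl+C in a terminal, then retype the line with plain straight quotes.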
TCGAquery_maf(tumor = "lgg") >>>>>>>>>> 2 Getting maf tables >>>>>>>>>> 3 Source: https://wiki.nci.nih.gov/display/TCGA/TCGA+MAF+Files >>>>>>>>>> 4 We found these maf files below: >>>>>>>>>> 5 MAF.File.Name >>>>>>>>>> 6 2 hgsc.bcm.edu_LGG.IlluminaGA_DNASeq.1.somatic.maf >>>>>>>>>> 7 >>>>>>>>>> 8 3 >>>>>>>>>> LGG_FINAL_ANALYSIS.aggregated.capture.tcga.uuid.curated.somatic.maf >>>>>>>>>> 9 >>>>>>>>>> 10 Archive.Name Deploy.Date >>>>>>>>>> 11 2 hgsc.bcm.edu_LGG.IlluminaGA_DNASeq_automated.Level_2.1.0.0 >>>>>>>>>> 10-DEC-13 >>>>>>>>>> 12 3 broad.mit.edu_LGG.IlluminaGA_DNASeq_curated.Level_2.1.3.0 >>>>>>>>>> 24-DEC-14 >>>>>>>>>> 13 >>>>>>>>>> 14 Please, select the line that you want to download: 3 >>>>>>>>>> >>>>>>>>>> **Repeat this for GBM*** >>>>>>>>>> >>>>>>>>>> Selecting specified lines to download? >>>>>>>>>> >>>>>>>>>> 1 gbm.subtypes <? TCGAquery_subtype(tumor = "gbm") >>>>>>>>>> 2 lgg.subtypes <? TCGAquery_subtype(tumor = "lgg?) >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Downloading data via the Bioconductor package RTCGAtoolbox? >>>>>>>>>> >>>>>>>>>> library(RTCGAToolbox) >>>>>>>>>> 2 >>>>>>>>>> 3 # Get the last run dates >>>>>>>>>> 4 lastRunDate <? getFirehoseRunningDates()[1] >>>>>>>>>> 5 lastAnalyseDate <? getFirehoseAnalyzeDates(1) >>>>>>>>>> 6 >>>>>>>>>> 7 # get DNA methylation data, RNAseq2 and clinical data for LGG >>>>>>>>>> 8 lgg.data <? getFirehoseData(dataset = "LGG", >>>>>>>>>> 9 gistic2_Date = getFirehoseAnalyzeDates(1), runDate >>>>>>>>>> lastRunDate, >>>>>>>>>> 10 Methylation = TRUE, RNAseq2_Gene_Norm = TRUE, Clinic >>>>>>>>>> TRUE, >>>>>>>>>> 11 Mutation = T, >>>>>>>>>> 12 fileSizeLimit = 10000) >>>>>>>>>> 13 >>>>>>>>>> 14 # get DNA methylation data, RNAseq2 and clinical data for GBM >>>>>>>>>> 15 gbm.data <? 
getFirehoseData(dataset = "GBM", >>>>>>>>>> 16 runDate = lastDate, gistic2_Date >>>>>>>>>> getFirehoseAnalyzeDates(1), >>>>>>>>>> 17 Methylation = TRUE, Clinic = TRUE, RNAseq2_Gene_Norm >>>>>>>>>> TRUE, >>>>>>>>>> 18 fileSizeLimit = 10000) >>>>>>>>>> 19 >>>>>>>>>> 20 # To access the data you should use the getData function >>>>>>>>>> 21 # or simply access with @ (for example gbm.data at Clinical) >>>>>>>>>> 22 gbm.mut <? getData(gbm.data,"Mutations") >>>>>>>>>> 23 gbm.clin <? getData(gbm.data,"Clinical") >>>>>>>>>> 24 gbm.gistic <? getData(gbm.data,"GISTIC") >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Genomic Analysis/Final data extraction: >>>>>>>>>> >>>>>>>>>> Enable ?getData? to access the data >>>>>>>>>> >>>>>>>>>> Obtaining GISTIC results? >>>>>>>>>> >>>>>>>>>> 1 # Download GISTIC results >>>>>>>>>> 2 gistic <? getFirehoseData("GBM",gistic2_Date ="20141017" ) >>>>>>>>>> 3 >>>>>>>>>> 4 # get GISTIC results >>>>>>>>>> 5 gistic.allbygene <? gistic at GISTIC@AllByGene >>>>>>>>>> 6 gistic.thresholedbygene <? gistic at GISTIC@ThresholedByGene >>>>>>>>>> >>>>>>>>>> Repeat this procedure to obtain LGG GISTIC results. >>>>>>>>>> >>>>>>>>>> ***Please ignore the 'non-coded' text as they are procedural >>>>>>>>>> steps/classifications*** >>>>>>>>>> >>>>>>>>>> [[alternative HTML version deleted]] >>>>>>>>>> >>>>>>>>>> ______________________________________________ >>>>>>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see >>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>>>>>>>> PLEASE do read the posting guide >>>>>>>>>> http://www.R-project.org/posting-guide.html >>>>>>>>>> and provide commented, minimal, self-contained, reproducible code. >>>>>>>>>> >>>>>>>>>[[alternative HTML version deleted]]
Your problem is that the command you entered

> the_data<-read.csv(file="c:/file_name.csv,header=TRUE,sep=",")

is missing a double quote after the .csv. The statement should be

> the_data<-read.csv(file="c:/file_name.csv",header=TRUE,sep=",")

The '+' sign is a prompt from R that indicates it has not yet seen the end of a statement, and it is expecting you to continue from the previous line.

The explanation: you are supplying the read.csv() function three arguments, one each for the parameters 'file', 'header' and 'sep'. The parameters 'file' and 'sep' expect strings as arguments; for 'file' that is a path such as "c:/file_name.csv" or "c:/myspecialdata.csv". The value "," passed to 'sep' (for separator) indicates that the separator is a comma. Note that you could also have written

> the_data<-read.csv(file="c:/file_name.csv")

as the default value for the parameter 'header' is TRUE, and the default for the parameter 'sep' is a comma. You can confirm this by looking at the help via

> ?read.csv

HTH,
Eric

On Mon, Aug 27, 2018 at 6:49 AM, Spencer Brackett <spbrackett20 at saintjosephhs.com> wrote:

> Hello all,
>
> To begin my analysis, I downloaded two TCGA datasets (GBM and LGG), both
> csv files, onto an R script after loading the cBioLite package. Following
> this, I inputted the following argument...
>
> the_data<-read.csv(file="c:/file_name.csv,header=TRUE,sep=",")
>
> Upon running the line I received this...
>
> +
>
> If I continue to press enter, the + sign continues to appear on every
> subsequent/new line.
>
> Does anyone know what this is indicative of and how I may continue on with
> my analysis?
>
> My next step after this would have been the following (the numbers before
> each command being line markers; not part of line)..
>
> 1 library(TCGAbiolinks)
> 2
> 3 # Download the DNA methylation data: HumanMethylation450 LGG and GBM.
> 4 path <- "."
>
> Best wishes,
>
> Spencer Brackett
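For anyone who wants to try the read.csv() fix without the original files, here is a minimal sketch. It assumes nothing about the TCGA data: the temporary file, its three lines of content and the variable names (csv_path, explicit, defaults, bad_call, result) are made up for illustration. It shows that the explicit call and the call relying on the defaults read the same data, and that the command with the missing double quote is an incomplete statement.

```r
# Write a tiny CSV to a temporary file so the example is self-contained.
# (Hypothetical stand-in for a real path such as "c:/file_name.csv".)
csv_path <- tempfile(fileext = ".csv")
writeLines(c("sample,subtype", "s1,GBM", "s2,LGG"), csv_path)

# Explicit arguments; every string is closed by a matching double quote.
explicit <- read.csv(file = csv_path, header = TRUE, sep = ",")

# Same call relying on the defaults header = TRUE and sep = ",".
defaults <- read.csv(file = csv_path)

# TRUE: both calls return the same data frame.
print(identical(explicit, defaults))

# The command as originally typed, with no closing quote after .csv.
# At the console R answers with "+" and waits for more input; parse()
# reports the same unfinished string as an error instead.
bad_call <- 'the_data<-read.csv(file="c:/file_name.csv,header=TRUE,sep=",")'
result <- tryCatch(parse(text = bad_call), error = function(e) "incomplete")
print(result)
```

If the console is already stuck at the + prompt, the pending statement can usually be cancelled with Esc in RStudio or the Windows R GUI, or with Ctrl+C in a terminal session, after which the corrected line can be entered.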