I performed a GEO2R analysis on a GEO series dataset and I'm looking for the up-regulated and down-regulated genes. I know that to find them I should check logFC, the fold change (generally on a log2 scale). Note that a fold change of 1 corresponds to 0 on the log2 scale. There is no optimal cutoff, but logFC > 1 is commonly taken to indicate up-regulation and logFC < -1 down-regulation. Moreover, I should consider adj.P.Val, the adjusted p-value (the p-value corrected for multiple comparisons). Again, there is no generally accepted cutoff, but values < 0.05 are usually taken to indicate that the test is statistically significant.

The problem is that in this particular GSE series none of the adj.P.Val values is < 0.05: they are all "1", apart from one "0.636". The logFC values are > 1, but none of the genes meets the condition "p < 0.05 & logFC > 1".

So, can it be said that in this case I need to normalise my data in order to find the DEGs (the up-regulated and down-regulated genes) in a GEO series file?
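The filter described in the question (adjusted p-value below a cutoff and logFC beyond a cutoff in either direction) can be sketched as follows. This is a minimal Python illustration, not GEO2R's own code: the gene names and numbers are made up, the helper names `log2_fold_change` and `classify` are hypothetical, and the cutoffs of 1 and 0.05 are simply the conventional values mentioned above.

```python
import math


def log2_fold_change(mean_treated, mean_control):
    """Log2 of the ratio of two (positive) mean expression values."""
    return math.log2(mean_treated / mean_control)


def classify(log_fc, adj_p, fc_cutoff=1.0, p_cutoff=0.05):
    """Label one gene as 'up', 'down' or 'not significant'.

    A gene is only called up or down when BOTH conditions hold:
    the adjusted p-value is below p_cutoff and |logFC| exceeds fc_cutoff.
    """
    if adj_p >= p_cutoff:
        return "not significant"
    if log_fc > fc_cutoff:
        return "up"
    if log_fc < -fc_cutoff:
        return "down"
    return "not significant"


# Made-up rows for illustration only: (gene, logFC, adjusted p-value)
rows = [
    ("geneA", 2.3, 0.001),
    ("geneB", -1.8, 0.020),
    ("geneC", 1.5, 0.636),  # large logFC, but adjusted p-value too high
    ("geneD", 0.4, 0.010),  # small adjusted p-value, but small change
]

for gene, log_fc, adj_p in rows:
    print(gene, classify(log_fc, adj_p))
```

A fold change of 1 gives `log2_fold_change(1.0, 1.0) == 0.0`, and a doubling gives a logFC of 1, which is why logFC > 1 is read as "more than two-fold up". A row like `geneC` shows the situation in the question: a large logFC does not pass the filter when the adjusted p-value stays above the cutoff.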
One approach to adjusted p-values is to divide the "magic cut-off" value by the number of tests. Typically one declares significance if the p-value is less than 0.05, so now the threshold is 0.05/(number of tests). This can be too conservative, but sometimes there are not many options. If you have a multiple comparison procedure (like the mean difference between five treatments), then there are tests like Tukey's HSD test, and many others, that try for more balance.

"They are all 1" is possible, but it is unlikely that all p-values would equal 1 except for one value of 0.636. Maybe some data was not copied correctly, or maybe the model was not built correctly, or the data was not read in correctly. Please check that the data the computer has in memory matches the data in your file.

Tim

-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Anas Jamshed
Sent: Monday, August 15, 2022 11:34 AM
To: R-help Mailing List <r-help at r-project.org>
Subject: [R] Normalize GEO Data

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
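Comparing each p-value with 0.05/(number of tests), as described in the reply above, is the Bonferroni correction; it is equivalent to multiplying each p-value by the number of tests (capped at 1) and comparing with 0.05. Below is a minimal sketch in Python, next to the less conservative Benjamini-Hochberg adjustment. It is an assumption here that the adj.P.Val column was produced with Benjamini-Hochberg (as far as I know, GEO2R's default); the function names and example p-values are illustrative only. In R, `p.adjust(p, method = "bonferroni")` and `p.adjust(p, method = "BH")` do the same job.

```python
def bonferroni(pvals):
    """Bonferroni-adjusted p-values: p * m, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg (FDR) adjusted p-values, in the input order."""
    m = len(pvals)
    # Indices of the p-values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest so that the
    # adjusted values never increase as the raw p-values get smaller.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values for four tests
raw = [0.01, 0.04, 0.03, 0.005]
print("Bonferroni:", bonferroni(raw))
print("BH:        ", benjamini_hochberg(raw))
```

With thousands of genes and no very small raw p-values, either adjustment can push most values to or near 1, so adjusted p-values of 1 are not by themselves proof that the data were read incorrectly. It is still worth checking the input as suggested above.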
You realise, I presume, that your sample size may be too small to flag *any* genes, up or down, using p-values? ... and that, at a more fundamental level, the use of hypothesis tests and p-values for any of this is controversial? A discussion of the latter is way off topic on this list, but if you care to go down that rabbit hole, online searches should take you there.

Cheers,
Bert
Hi Anas,

How many replicates were there? And about how many genes are up-regulated (logFC > 1) and down-regulated (logFC < -1)? Are there any genes with logFC > 4 or logFC < -4?

Matthew