I performed a GEO2R analysis on a series dataset and I'm looking for the up-regulated and down-regulated genes. I know that to find them I should check logFC (the fold change, generally on a log2 scale). A fold change of 1 corresponds to a log2 value of 0. There is no optimal cutoff, but logFC > 1 indicates up-regulation and logFC < -1 indicates down-regulation of genes. Moreover, I should consider adj.p.val, which is the adjusted p-value (the p-value corrected for multiple comparisons). Again there is no generally accepted cutoff, but values < 0.05 are usually taken to indicate that the test is statistically significant.
But the problem is that in this particular GSE series none of the adj.p.val values is < 0.05: they are all "1", apart from one "0.636". There are logFC values > 1, but no gene meets the condition "p < 0.05 & logFC > 1".
So, can it be said that in this case I need to normalise my data to find the DEGs (up-regulated and down-regulated genes) in a series GEO file?
One approach to adjusting p-values is to divide the "magic cut-off"
value by the number of tests (the Bonferroni correction). Typically one declares
significance if the p-value is less than 0.05, so now the threshold is
0.05/(number of tests). This can be too
conservative, but sometimes there are not many options. If you have a multiple
comparison procedure (like the mean differences among five treatments) then
there are tests like Tukey's HSD test, and many others, that try for more
balance.
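The correction described above can be sketched in a few lines. This is only an illustration in Python using the standard library; the function names and the five raw p-values are made up for the example and do not come from the dataset in question.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold: alpha divided by the number of tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def bonferroni_adjust(p_values):
    """Equivalent view: multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values for five tests (illustrative only).
raw_p = [0.0004, 0.012, 0.03, 0.2, 0.9]

threshold = bonferroni_threshold(0.05, len(raw_p))  # 0.05 / 5
adjusted = bonferroni_adjust(raw_p)

# Comparing raw p-values with the divided threshold gives the same decisions
# as comparing the adjusted p-values with the original 0.05.
significant = [p < threshold for p in raw_p]

for p, p_adj, sig in zip(raw_p, adjusted, significant):
    print(f"raw p = {p:<7} adjusted p = {p_adj:.3f}  significant: {sig}")
```

With thousands of genes the divided threshold becomes very small, which is why this correction is often considered too conservative for expression data.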
"They are all 1" is possible, but it is unlikely that all p-values would
equal 1 except for one value of 0.636. Maybe some data were not copied correctly,
or maybe the model was not built correctly, or the data were not read in correctly.
Please check that the data the computer has in memory match the data in your
file.
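As a sanity check on whether such a pattern is arithmetically possible at all, here is a small Python sketch. It assumes the adjustment was Benjamini & Hochberg (the thread does not say which method was used), and the ten raw p-values are invented so that the output mimics the reported pattern. It says nothing about what actually happened in the GSE series.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values, smallest p first.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over its own rank and all higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented raw p-values: one smallish value, the rest skewed towards 1.
raw_p = [0.0636, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 1.0]
adj = bh_adjust(raw_p)

print([round(x, 3) for x in adj])
```

In this made-up case every adjusted value is capped at 1 except the smallest, which comes out at 0.636. Raw p-values piled up near 1 like this would themselves be a reason to check the input data and the model.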
Tim
-----Original Message-----
From: R-help <r-help-bounces at r-project.org> On Behalf Of Anas Jamshed
Sent: Monday, August 15, 2022 11:34 AM
To: R-help Mailing List <r-help at r-project.org>
Subject: [R] Normalize GEO Data
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.r-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
You realise, I presume, that your sample size may be too small to flag *any* genes, up or down, using p-values? ... and that, at a more fundamental level, the use of hypothesis tests and p-values for any of this is controversial? A discussion of the latter is way off topic on this list, but if you care to go down that rabbit hole, online searches should take you there.
Cheers,
Bert
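To make the sample-size point concrete, here is a rough Python illustration. It uses an exact two-group permutation test, which is NOT the test GEO2R runs (GEO2R relies on limma's moderated t-statistics, which are not bounded in this way), so treat it only as a sketch of how few replicates can limit the p-values that are achievable at all. The gene count of 20000 and the function names are assumptions for the example.

```python
from math import comb


def min_two_sided_permutation_p(n_per_group):
    """Smallest two-sided p-value of an exact permutation test with equal groups.

    Even with perfectly separated groups, the observed labelling and its
    mirror image are equally extreme, so the p-value cannot go below
    2 / C(2n, n).
    """
    if n_per_group < 1:
        raise ValueError("n_per_group must be at least 1")
    return 2 / comb(2 * n_per_group, n_per_group)


def smallest_n_reaching(cutoff, max_n=100):
    """Smallest group size whose minimum p-value falls below the cutoff."""
    for n in range(1, max_n + 1):
        if min_two_sided_permutation_p(n) < cutoff:
            return n
    return None


alpha = 0.05
n_genes = 20000                      # hypothetical number of genes tested
bonferroni_cutoff = alpha / n_genes  # per-gene threshold after correction

for n in (2, 3, 4, 5):
    print(f"{n} replicates per group: minimum p = {min_two_sided_permutation_p(n):.4f}")

print("replicates needed for p < 0.05:", smallest_n_reaching(alpha))
print("replicates needed after Bonferroni:", smallest_n_reaching(bonferroni_cutoff))
```

Under these assumptions, three replicates per group cannot give a p-value below 0.1 for any single gene, before any multiple-testing adjustment is applied.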
Hi Anas,
How many replicates were there, and about how many genes are up-regulated (logFC > 1) and down-regulated (logFC < -1)? Are there any genes with logFC > 4 or logFC < -4?
Matthew
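The counts asked about above could be tallied along these lines. This is a minimal Python sketch: the rows are invented, and the column names `logFC` and `adj.P.Val` are assumed to follow limma's `topTable` naming, so adjust them to whatever the exported table actually contains.

```python
# Invented rows standing in for an exported results table (illustrative only).
rows = [
    {"ID": "gene_a", "logFC": 2.3, "adj.P.Val": 0.636},
    {"ID": "gene_b", "logFC": -1.4, "adj.P.Val": 1.0},
    {"ID": "gene_c", "logFC": 0.2, "adj.P.Val": 1.0},
    {"ID": "gene_d", "logFC": 4.7, "adj.P.Val": 1.0},
    {"ID": "gene_e", "logFC": -4.1, "adj.P.Val": 1.0},
]


def summarise(rows, lfc=1.0, large_lfc=4.0, alpha=0.05):
    """Count genes by fold-change direction, with and without the p-value filter."""
    up = [r for r in rows if r["logFC"] > lfc]
    down = [r for r in rows if r["logFC"] < -lfc]
    large = [r for r in rows if abs(r["logFC"]) > large_lfc]
    return {
        "up": len(up),
        "down": len(down),
        "large": len(large),
        "up_significant": sum(r["adj.P.Val"] < alpha for r in up),
        "down_significant": sum(r["adj.P.Val"] < alpha for r in down),
    }


summary = summarise(rows)
for name, count in summary.items():
    print(f"{name}: {count}")
```

Reporting the fold-change counts separately from the significance filter shows whether large changes exist that simply fail the adjusted p-value cutoff.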