Hi,
I tested on a small data set (several columns and rows), and it worked fine
with the following command:
abc <- rowSums(test[,grep("ABC",names(test),fixed=T)],na.rm=T)
But when I used the real, big data table, I got this error:
Error in rowSums(dat[, grep("ABC", names(dat), fixed = T)], na.rm = T) :
  'x' must be numeric
Using as.numeric() didn't work either:
> as.numeric(dat)
Error: (list) object cannot be coerced to type 'double'
Thanks!
Dawn
On Fri, Jul 10, 2015 at 4:35 PM, Dawn <dawn1313 at gmail.com> wrote:
> Thank you all, and sorry for the messy data. It worked!
>
> Best,
> Dawn
>
> On Fri, Jul 10, 2015 at 4:15 AM, Jim Lemon <drjimlemon at gmail.com> wrote:
>
>> Hi Dawn,
>> Your data are a bit messed up, but try the following:
>>
>> colSums(dat[,grep("ABC",names(dat),fixed=TRUE)],na.rm=TRUE)
>> colSums(dat[,grep("XYZ",names(dat),fixed=TRUE)],na.rm=TRUE)
>>
>> I'm assuming that you want to discard the NA values.
>>
>> Jim
>>
>> On Fri, Jul 10, 2015 at 6:52 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
>> > Hello,
>> >
>> > Please use ?dput to give a data example; like this it's completely
>> > unreadable. If your data.frame is named 'dat' use
>> >
>> > dput(head(dat, 30)) # paste the output of this in your mail
>> >
>> >
>> > And don't post in html, use plain text only, like the posting guide
>> > says.
>> >
>> > Rui Barradas
>> >
>> >
>> > Em 09-07-2015 18:12, Dawn escreveu:
>> >>
>> >> Hi,
>> >>
>> >> I have a big dataframe as follows
>> >>
>> >> 109ABC 109XYZ 18ABC 18XYZ 22XYZ 23ABC 25ABC 25XYZ
>> >> 30ABC 31XYZ 32ABC 32XYZ 34DCM 34XYZ 36ABC 36SUR
>> >> 38DCM 38XYZ 39DCM 39SUR 41DCM 41SUR 42DCM 42SUR
>> >> 46SUR 52DCM 64ABC 64XYZ 65ABC 65XYZ 66ABC 66XYZ
>> >> 67XYZ 68ABC 68SUR 70MES 70SUR 72ABC 72XYZ 76ABC
>> >> 76XYZ 82ABC 85ABC POV
>> >> Cluster_1 17 1 3 10 14 5 2 2 1 1 1 2 2 TT:61
>> >> Cluster_2 1 4 20 6 5 3 6 9 9 6 10 1 3 1 4 TT:88
>> >> Cluster_3 3 3 6 4 17 17 18 13 17 19 22 11 5 21 8 5 18 4 7 9 TT:227
>> >> ........
>> >>
>> >> I want to get two new columns: one with the sum, for each row, of all
>> >> columns whose names include ABC, and the other with the sum, for each
>> >> row, of all columns whose names include XYZ.
>> >>
>> >> Is there some help? Thank you!
>> >> Dawn
>> >>
>> >> [[alternative HTML version deleted]]
>> >>
>> >> ______________________________________________
>> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >> https://stat.ethz.ch/mailman/listinfo/r-help
>> >> PLEASE do read the posting guide
>> >> http://www.R-project.org/posting-guide.html
>> >> and provide commented, minimal, self-contained, reproducible code.
>> >>
>> >
>>
>
>
[[alternative HTML version deleted]]
I suspect your data frame "dat" has non-numeric data in some of the
columns that have ABC in their names. Any column of a data frame can be numeric
or not, but the data frame as a unit cannot be numeric. If your data file has
odd characters in one of the otherwise-numeric columns, the whole column will
be read in as a factor or as character strings. Look at the output of str(dat)
for columns that don't show "num" or "int". If you can find the column, and then
one of the bad rows, you can use a text editor to fix them manually, or show us
examples of the bad data and we can suggest ways to fix it in R.
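
The check described above can be sketched in R roughly as follows. This is only an illustration on a made-up toy data frame: the column names, the values, and the helper names not_numeric and bad_values are assumptions, not taken from the real file.

```r
# Toy stand-in for 'dat'; names and values are invented for illustration.
dat <- data.frame(
  X18ABC = c("3", "10", "18ABC", ""),  # a stray text value and an empty cell
  X18XYZ = c(1, 4, 2, NA),
  stringsAsFactors = TRUE              # mimic read.table()'s old default
)

# Which columns are not numeric?
not_numeric <- names(dat)[!sapply(dat, is.numeric)]
print(not_numeric)

# For each such column, list the distinct values that do not parse as numbers.
bad_values <- lapply(dat[not_numeric], function(col) {
  txt <- as.character(col)
  parsed <- suppressWarnings(as.numeric(txt))
  unique(txt[is.na(parsed) & !is.na(txt)])
})
print(bad_values)
```

Running the same two steps on the real data frame should point at the columns, and the values inside them, that keep rowSums() from working.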
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------
Sent from my phone. Please excuse my brevity.
I used two rows to test the data frame, as follows.
> dat <- read.table("TOV_43_Protein_Clusters_abundance1.tab",header=TRUE,sep = "\t")
> dat1 <- dat[1:2,]
> str(dat1)
'data.frame': 2 obs. of 44 variables:
 $ X      : Factor w/ 1075762 levels "","POV_Cluster_1000001",..: 305266 625028
 $ X109DCM: Factor w/ 46 levels "","1","10","109DCM",..: 1 1
 $ X109SUR: Factor w/ 41 levels "","1","10","109SUR",..: 1 1
 $ X18DCM : Factor w/ 31 levels "","1","10","11",..: 1 1
 $ X18SUR : Factor w/ 25 levels "","1","10","11",..: 1 1
 $ X22SUR : Factor w/ 50 levels "","1","10","11",..: 1 2
 $ X23DCM : Factor w/ 46 levels "","1","10","11",..: 1 1
 $ X25DCM : Factor w/ 42 levels "","1","10","11",..: 1 1
 $ X25SUR : Factor w/ 47 levels "","1","10","11",..: 1 1
 $ X30DCM : Factor w/ 34 levels "","1","10","11",..: 1 1
 $ X31SUR : Factor w/ 43 levels "","1","10","11",..: 1 1
 $ X32DCM : Factor w/ 15 levels "","1","10","11",..: 1 1
 $ X32SUR : Factor w/ 58 levels "","1","10","11",..: 1 1
 $ X34DCM : Factor w/ 53 levels "","1","10","11",..: 1 35
 $ X34SUR : Factor w/ 47 levels "","1","10","11",..: 10 14
 $ X36DCM : Factor w/ 48 levels "","1","10","11",..: 2 43
 $ X36SUR : Factor w/ 45 levels "","1","10","11",..: 23 38
 $ X38DCM : Factor w/ 40 levels "","1","10","11",..: 3 23
 $ X38SUR : Factor w/ 44 levels "","1","10","11",..: 7 41
 $ X39DCM : Factor w/ 38 levels "","1","10","11",..: 34 38
 $ X39SUR : Factor w/ 40 levels "","1","10","11",..: 13 40
 $ X41DCM : Factor w/ 47 levels "","1","10","11",..: 13 40
 $ X41SUR : Factor w/ 40 levels "","1","10","11",..: 1 1
 $ X42DCM : Factor w/ 48 levels "","1","10","11",..: 2 3
 $ X42SUR : Factor w/ 41 levels "","1","10","11",..: 2 1
 $ X46SUR : Factor w/ 31 levels "","1","10","11",..: 2 2
 $ X52DCM : Factor w/ 49 levels "","1","10","11",..: 13 23
 $ X64DCM : Factor w/ 35 levels "","1","10","11",..: 1 2
 $ X64SUR : Factor w/ 36 levels "","1","10","11",..: 1 1
 $ X65DCM : Factor w/ 38 levels "","1","10","11",..: 1 1
 $ X65SUR : Factor w/ 35 levels "","1","10","11",..: 1 1
 $ X66DCM : Factor w/ 27 levels "","1","10","11",..: 1 1
 $ X66SUR : Factor w/ 35 levels "","1","10","11",..: 1 1
 $ X67SUR : Factor w/ 38 levels "","1","10","11",..: 1 1
 $ X68DCM : Factor w/ 33 levels "","1","10","11",..: 1 1
 $ X68SUR : Factor w/ 36 levels "","1","10","11",..: 1 1
 $ X70MES : Factor w/ 23 levels "","1","10","11",..: 1 1
 $ X70SUR : Factor w/ 37 levels "","1","10","11",..: 1 1
 $ X72DCM : Factor w/ 40 levels "","1","10","11",..: 13 27
 $ X72SUR : Factor w/ 38 levels "","1","10","11",..: 1 1
 $ X76DCM : Factor w/ 44 levels "","1","10","11",..: 1 1
 $ X76SUR : Factor w/ 34 levels "","1","10","11",..: 1 1
 $ X82DCM : Factor w/ 29 levels "","1","10","11",..: 1 1
 $ X85DCM : Factor w/ 30 levels "","1","10","11",..: 1 1

Thank you!!
Dawn
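
Since str() reports every column as a factor, one possible way forward is to convert the count columns to numbers before summing. The sketch below uses a made-up toy data frame that only mimics the shape of the str() output; the values, the helper to_number, and the result columns DCM_total and SUR_total are all assumptions for illustration. The level "109DCM" in the X109DCM column suggests a header line may be repeated inside the file, but that is a guess from the str() output, not something confirmed in the thread.

```r
# Toy stand-in for 'dat'; names and values are invented for illustration.
dat <- data.frame(
  X      = c("POV_Cluster_1", "POV_Cluster_2", ""),
  X34DCM = c("", "5", "34DCM"),   # empty cell plus a repeated header value
  X36DCM = c("1", "7", "36DCM"),
  X34SUR = c("2", "3", "34SUR"),
  stringsAsFactors = TRUE         # mimic read.table()'s old default
)

# Go through as.character() first: as.numeric() on a factor returns the
# internal level codes, not the printed values. Anything that does not
# parse as a number becomes NA.
to_number <- function(col) suppressWarnings(as.numeric(as.character(col)))

count_cols <- setdiff(names(dat), "X")
dat[count_cols] <- lapply(dat[count_cols], to_number)

# Pick the columns for each tag before adding any new columns.
dcm_cols <- grep("DCM", names(dat), fixed = TRUE)
sur_cols <- grep("SUR", names(dat), fixed = TRUE)

# Row sums per tag; NA cells are ignored.
dat$DCM_total <- rowSums(dat[, dcm_cols, drop = FALSE], na.rm = TRUE)
dat$SUR_total <- rowSums(dat[, sur_cols, drop = FALSE], na.rm = TRUE)
print(dat)
```

Note that a row made up entirely of non-numeric values ends up with a total of 0 when na.rm = TRUE, so such rows may be worth inspecting or dropping separately.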