thr3ads.net - R help - [R] sum some columns for each row [Jul 2015]

Dawn

2015-Jul-14 23:05 UTC

[R] sum some columns for each row

I used two rows to test the data frame, as follows.
> dat <- read.table("TOV_43_Protein_Clusters_abundance1.tab", header=TRUE, sep = "\t")
> dat1 <- dat[1:2,]
> str(dat1)
'data.frame':    2 obs. of  44 variables:
 $ X      : Factor w/ 1075762 levels "","POV_Cluster_1000001",..: 305266 625028
 $ X109DCM: Factor w/ 46 levels "","1","10","109DCM",..: 1 1
 $ X109SUR: Factor w/ 41 levels "","1","10","109SUR",..: 1 1
 $ X18DCM : Factor w/ 31 levels "","1","10","11",..: 1 1
 $ X18SUR : Factor w/ 25 levels "","1","10","11",..: 1 1
 $ X22SUR : Factor w/ 50 levels "","1","10","11",..: 1 2
 $ X23DCM : Factor w/ 46 levels "","1","10","11",..: 1 1
 $ X25DCM : Factor w/ 42 levels "","1","10","11",..: 1 1
 $ X25SUR : Factor w/ 47 levels "","1","10","11",..: 1 1
 $ X30DCM : Factor w/ 34 levels "","1","10","11",..: 1 1
 $ X31SUR : Factor w/ 43 levels "","1","10","11",..: 1 1
 $ X32DCM : Factor w/ 15 levels "","1","10","11",..: 1 1
 $ X32SUR : Factor w/ 58 levels "","1","10","11",..: 1 1
 $ X34DCM : Factor w/ 53 levels "","1","10","11",..: 1 35
 $ X34SUR : Factor w/ 47 levels "","1","10","11",..: 10 14
 $ X36DCM : Factor w/ 48 levels "","1","10","11",..: 2 43
 $ X36SUR : Factor w/ 45 levels "","1","10","11",..: 23 38
 $ X38DCM : Factor w/ 40 levels "","1","10","11",..: 3 23
 $ X38SUR : Factor w/ 44 levels "","1","10","11",..: 7 41
 $ X39DCM : Factor w/ 38 levels "","1","10","11",..: 34 38
 $ X39SUR : Factor w/ 40 levels "","1","10","11",..: 13 40
 $ X41DCM : Factor w/ 47 levels "","1","10","11",..: 13 40
 $ X41SUR : Factor w/ 40 levels "","1","10","11",..: 1 1
 $ X42DCM : Factor w/ 48 levels "","1","10","11",..: 2 3
 $ X42SUR : Factor w/ 41 levels "","1","10","11",..: 2 1
 $ X46SUR : Factor w/ 31 levels "","1","10","11",..: 2 2
 $ X52DCM : Factor w/ 49 levels "","1","10","11",..: 13 23
 $ X64DCM : Factor w/ 35 levels "","1","10","11",..: 1 2
 $ X64SUR : Factor w/ 36 levels "","1","10","11",..: 1 1
 $ X65DCM : Factor w/ 38 levels "","1","10","11",..: 1 1
 $ X65SUR : Factor w/ 35 levels "","1","10","11",..: 1 1
 $ X66DCM : Factor w/ 27 levels "","1","10","11",..: 1 1
 $ X66SUR : Factor w/ 35 levels "","1","10","11",..: 1 1
 $ X67SUR : Factor w/ 38 levels "","1","10","11",..: 1 1
 $ X68DCM : Factor w/ 33 levels "","1","10","11",..: 1 1
 $ X68SUR : Factor w/ 36 levels "","1","10","11",..: 1 1
 $ X70MES : Factor w/ 23 levels "","1","10","11",..: 1 1
 $ X70SUR : Factor w/ 37 levels "","1","10","11",..: 1 1
 $ X72DCM : Factor w/ 40 levels "","1","10","11",..: 13 27
 $ X72SUR : Factor w/ 38 levels "","1","10","11",..: 1 1
 $ X76DCM : Factor w/ 44 levels "","1","10","11",..: 1 1
 $ X76SUR : Factor w/ 34 levels "","1","10","11",..: 1 1
 $ X82DCM : Factor w/ 29 levels "","1","10","11",..: 1 1
 $ X85DCM : Factor w/ 30 levels "","1","10","11",..: 1 1


Thank you!!
Dawn

On Tue, Jul 14, 2015 at 3:48 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
> I suspect your data frame "dat" has non-numeric data in some of the
> columns that have ABC in their names. Any column of a data frame can be
> numeric or not, but the data frame as a unit cannot be numeric. If your
> data file has odd characters in one of the otherwise-numeric columns, the
> whole column will be read in as a factor or character strings. Look at the
> output of str(dat) for columns that don't show "num". If you can find the
> column, and then one of the bad rows, you can use a text editor to fix them
> manually, or show us examples of the bad data and we can suggest ways to
> fix it in R.
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>                                       Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> On July 14, 2015 2:35:38 PM PDT, Dawn <dawn1313 at gmail.com> wrote:
> >Hi,
> >
> >I used a small set of data (several columns and rows) and it works fine
> >using the following command:
> >abc <- rowSums(test[,grep("ABC",names(test),fixed=T)],na.rm=T)
> >
> >But when I used the real big data table, "Error in rowSums(dat[,
> >grep("ABC", names(dat), fixed = T)], na.rm = T) :
> >  'x' must be numeric"
> >Then it didn't work either using as.numeric():
> >> as.numeric(dat)
> >Error: (list) object cannot be coerced to type 'double'
> >
> >Thanks!
> >Dawn
> >
> >
> >
> >
> >On Fri, Jul 10, 2015 at 4:35 PM, Dawn <dawn1313 at gmail.com> wrote:
> >
> >> Thank you all, and sorry for the messy data. It has worked!
> >>
> >> Best,
> >> Dawn
> >>
> >> On Fri, Jul 10, 2015 at 4:15 AM, Jim Lemon <drjimlemon at gmail.com> wrote:
> >>
> >>> Hi Dawn,
> >>> Your data are a bit messed up, but try the following:
> >>>
> >>> colSums(dat[,grep("ABC",names(dat),fixed=TRUE)],na.rm=TRUE)
> >>> colSums(dat[,grep("XYZ",names(dat),fixed=TRUE)],na.rm=TRUE)
> >>>
> >>> I'm assuming that you want to discard the NA values.
> >>>
> >>> Jim
> >>>
> >>> On Fri, Jul 10, 2015 at 6:52 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
> >>> > Hello,
> >>> >
> >>> > Please use ?dput to give a data example; like this it's completely
> >>> > unreadable. If your data.frame is named 'dat' use
> >>> >
> >>> > dput(head(dat, 30))  # paste the output of this in your mail
> >>> >
> >>> >
> >>> > And don't post in HTML; use plain text only, as the posting guide says.
> >>> >
> >>> > Rui Barradas
> >>> >
> >>> >
> >>> > On 09-07-2015 18:12, Dawn wrote:
> >>> >>
> >>> >> Hi,
> >>> >>
> >>> >> I have a big data frame as follows:
> >>> >>
> >>> >> 109ABC    109XYZ    18ABC    18XYZ    22XYZ    23ABC    25ABC    25XYZ    30ABC    31XYZ    32ABC
> >>> >> 32XYZ    34DCM    34XYZ    36ABC    36SUR    38DCM    38XYZ    39DCM    39SUR    41DCM    41SUR
> >>> >> 42DCM    42SUR    46SUR    52DCM    64ABC    64XYZ    65ABC    65XYZ    66ABC    66XYZ    67XYZ
> >>> >> 68ABC    68SUR    70MES    70SUR    72ABC    72XYZ    76ABC    76XYZ    82ABC    85ABC    POV
> >>> >> Cluster_1    17    1    3    10    14    5    2    2        1    1    1    2                          2                            TT:61
> >>> >> Cluster_2                    1                        4    20    6    5    3    6    9    9    6        10        1    3    1                              4                        TT:88
> >>> >> Cluster_3    3        3                            6    4    17    17    18    13    17    19    22    11    5    21    8    5    18    4    7                                        9    TT:227
> >>> >> ........
> >>> >>
> >>> >> I want to get two new columns: one that sums, for each row, all
> >>> >> columns whose names include ABC, and another that sums, for each
> >>> >> row, all columns whose names include XYZ.
> >>> >>
> >>> >> Can anyone help? Thank you!
> >>> >> Dawn
> >>> >>
> >>> >>         [[alternative HTML version deleted]]
> >>> >>
> >>> >> ______________________________________________
> >>> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> >> PLEASE do read the posting guide
> >>> >> http://www.R-project.org/posting-guide.html
> >>> >> and provide commented, minimal, self-contained, reproducible code.
> >>> >>
	[[alternative HTML version deleted]]
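
For readers following along: the as.numeric(dat) error quoted above arises because a data frame is a list of columns, so any conversion has to happen one column at a time. The following is a minimal sketch with made-up data (the names `toy` and `f` are hypothetical and do not come from the thread). It also shows a common pitfall: calling as.numeric() directly on a factor returns the internal level codes rather than the values that are printed.

```r
# Hypothetical toy data: one column is a factor instead of numbers,
# similar in kind to the columns shown by str(dat1).
toy <- data.frame(X1ABC = factor(c("10", "3", "")), X2ABC = c(1, 2, 3))

# A data frame is a list of columns, so whole-frame coercion fails.
res <- try(as.numeric(toy), silent = TRUE)
failed <- inherits(res, "try-error")

# Pitfall: as.numeric() on a factor gives level codes, not the values.
f <- toy$X1ABC
codes  <- as.numeric(f)                # positions in levels(f): "", "10", "3"
values <- as.numeric(as.character(f))  # the printed values; "" becomes NA

print(failed)
print(codes)
print(values)
```

Going through as.character() first is the usual way to recover the printed values from a factor; whether the blank entries should count as missing is a decision only the data owner can make.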

Jeff Newmiller

2015-Jul-14 23:36 UTC

[R] sum some columns for each row

Well it is pretty obvious that all of your columns have non-numeric data in
them, but you are the only one who can tell which ones should have been numeric,
and you are also the one who can peruse your data file in a text editor.
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.
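
As a complement to inspecting the file in a text editor, R itself can report which entries refuse to convert to numbers. The sketch below is only an illustration under stated assumptions: the helper `bad_values` and the toy input are hypothetical, and the idea that a header line is repeated inside the data is a guess prompted by levels such as "109DCM" appearing in the str() output, not something the thread confirms.

```r
# Hypothetical helper: return the distinct entries of a column that are
# neither empty nor convertible to a number.
bad_values <- function(x) {
  s <- as.character(x)
  num <- suppressWarnings(as.numeric(s))
  unique(s[is.na(num) & !is.na(s) & nzchar(s)])
}

# Toy input imitating the suspected symptom: the header line appears
# a second time among the data rows.
txt <- "X\t109DCM\t18DCM\nc1\t\t1\nX\t109DCM\t18DCM\nc2\t10\t11\n"
toy <- read.table(text = txt, header = TRUE, sep = "\t",
                  stringsAsFactors = FALSE)

# Check every column except the ID column and keep only the offenders.
offenders <- lapply(toy[-1], bad_values)
offenders <- offenders[lengths(offenders) > 0]
print(offenders)
```

Applied to the real data frame, the same call would list, per column, the strings that prevent a numeric read, which is the information needed to decide how to clean the file.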


Bert Gunter

2015-Jul-14 23:44 UTC

[R] sum some columns for each row

It seems that Dawn could really benefit from spending some time with
an online R tutorial or two, as she appears not to have much of a clue
about R's basic data structures.

Cheers,
Bert

Bert Gunter

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
   -- Clifford Stoll



Dawn

2015-Jul-14 23:49 UTC

[R] sum some columns for each row

I attached a file containing the first two rows. Please help me make it a
numeric data frame, so that the following command works:

dcm <- rowSums(dat1[,grep("DCM",names(dat1),fixed=T)],na.rm=T)

Thank you very much!
Dawn
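
A minimal sketch of one way to reach that goal, assuming the non-numeric entries have already been identified and may safely be treated as missing (the thread does not establish this). The toy `dat1` below is made up and only imitates the structure shown by str(); the helpers `to_num` and `sum_matching` are hypothetical names.

```r
# Hypothetical stand-in for dat1: count columns read as factors,
# with blanks where no count was recorded.
dat1 <- data.frame(
  X      = c("POV_Cluster_1", "POV_Cluster_2"),
  X34DCM = factor(c("", "4")),
  X36DCM = factor(c("1", "20")),
  X34SUR = factor(c("17", "3")),
  stringsAsFactors = FALSE
)

# Convert via character so factor labels, not level codes, become numbers.
# suppressWarnings() hides the NA warning for entries that are not numbers.
to_num <- function(x) suppressWarnings(as.numeric(as.character(x)))

# Sum, for each row, all columns whose names contain the literal pattern.
sum_matching <- function(df, pattern) {
  cols <- grep(pattern, names(df), fixed = TRUE)
  num  <- as.data.frame(lapply(df[cols], to_num))
  rowSums(num, na.rm = TRUE)
}

dcm <- sum_matching(dat1, "DCM")
sur <- sum_matching(dat1, "SUR")
print(dcm)
print(sur)
```

Indexing with df[cols] rather than df[, cols] keeps the result a data frame even when only one column matches, so rowSums() still receives a two-dimensional object.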

On Tue, Jul 14, 2015 at 4:36 PM, Jeff Newmiller <jdnewmil at
dcn.davis.ca.us>
wrote:
> Well it is pretty obvious that all of your columns have non-numeric data
> in them, but you are the only one who can tell which ones should have been
> numeric, and you are also the one who can peruse your data file in a text
> editor.
> ---------------------------------------------------------------------------
> Jeff Newmiller                        The     .....       .....  Go Live...
> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#. 
Live
> Go...
>                                       Live:   OO#.. Dead: OO#..  Playing
> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> ---------------------------------------------------------------------------
> Sent from my phone. Please excuse my brevity.
>
> On July 14, 2015 4:05:37 PM PDT, Dawn <dawn1313 at gmail.com> wrote:
> >I used two rows to test the data frame, as follows.
> >
> >> dat <-
read.table("TOV_43_Protein_Clusters_abundance1.tab",
> >header=TRUE,sep = "\t")
> >> dat1 <- dat[1:2,]
> >> str(dat1)
> >'data.frame':    2 obs. of  44 variables:
> >$ X      : Factor w/ 1075762 levels
"","POV_Cluster_1000001",..: 305266
> >625028
> > $ X109DCM: Factor w/ 46 levels
"","1","10","109DCM",..: 1 1
> > $ X109SUR: Factor w/ 41 levels
"","1","10","109SUR",..: 1 1
> > $ X18DCM : Factor w/ 31 levels
"","1","10","11",..: 1 1
> > $ X18SUR : Factor w/ 25 levels
"","1","10","11",..: 1 1
> > $ X22SUR : Factor w/ 50 levels
"","1","10","11",..: 1 2
> > $ X23DCM : Factor w/ 46 levels
"","1","10","11",..: 1 1
> > $ X25DCM : Factor w/ 42 levels
"","1","10","11",..: 1 1
> > $ X25SUR : Factor w/ 47 levels
"","1","10","11",..: 1 1
> > $ X30DCM : Factor w/ 34 levels
"","1","10","11",..: 1 1
> > $ X31SUR : Factor w/ 43 levels
"","1","10","11",..: 1 1
> > $ X32DCM : Factor w/ 15 levels
"","1","10","11",..: 1 1
> > $ X32SUR : Factor w/ 58 levels
"","1","10","11",..: 1 1
> > $ X34DCM : Factor w/ 53 levels
"","1","10","11",..: 1 35
> > $ X34SUR : Factor w/ 47 levels
"","1","10","11",..: 10 14
> > $ X36DCM : Factor w/ 48 levels
"","1","10","11",..: 2 43
> > $ X36SUR : Factor w/ 45 levels
"","1","10","11",..: 23 38
> > $ X38DCM : Factor w/ 40 levels
"","1","10","11",..: 3 23
> > $ X38SUR : Factor w/ 44 levels
"","1","10","11",..: 7 41
> > $ X39DCM : Factor w/ 38 levels
"","1","10","11",..: 34 38
> > $ X39SUR : Factor w/ 40 levels
"","1","10","11",..: 13 40
> > $ X41DCM : Factor w/ 47 levels
"","1","10","11",..: 13 40
> > $ X41SUR : Factor w/ 40 levels
"","1","10","11",..: 1 1
> > $ X42DCM : Factor w/ 48 levels
"","1","10","11",..: 2 3
> > $ X42SUR : Factor w/ 41 levels
"","1","10","11",..: 2 1
> > $ X46SUR : Factor w/ 31 levels
"","1","10","11",..: 2 2
> > $ X52DCM : Factor w/ 49 levels
"","1","10","11",..: 13 23
> > $ X64DCM : Factor w/ 35 levels
"","1","10","11",..: 1 2
> > $ X64SUR : Factor w/ 36 levels
"","1","10","11",..: 1 1
> > $ X65DCM : Factor w/ 38 levels
"","1","10","11",..: 1 1
> > $ X65SUR : Factor w/ 35 levels
"","1","10","11",..: 1 1
> > $ X66DCM : Factor w/ 27 levels
"","1","10","11",..: 1 1
> > $ X66SUR : Factor w/ 35 levels
"","1","10","11",..: 1 1
> > $ X67SUR : Factor w/ 38 levels
"","1","10","11",..: 1 1
> > $ X68DCM : Factor w/ 33 levels
"","1","10","11",..: 1 1
> > $ X68SUR : Factor w/ 36 levels
"","1","10","11",..: 1 1
> > $ X70MES : Factor w/ 23 levels
"","1","10","11",..: 1 1
> > $ X70SUR : Factor w/ 37 levels
"","1","10","11",..: 1 1
> > $ X72DCM : Factor w/ 40 levels
"","1","10","11",..: 13 27
> > $ X72SUR : Factor w/ 38 levels
"","1","10","11",..: 1 1
> > $ X76DCM : Factor w/ 44 levels
"","1","10","11",..: 1 1
> > $ X76SUR : Factor w/ 34 levels
"","1","10","11",..: 1 1
> > $ X82DCM : Factor w/ 29 levels
"","1","10","11",..: 1 1
> > $ X85DCM : Factor w/ 30 levels
"","1","10","11",..: 1 1
> >
> >
> >Thank you!!
> >Dawn
> >
> >On Tue, Jul 14, 2015 at 3:48 PM, Jeff Newmiller
> ><jdnewmil at dcn.davis.ca.us>
> >wrote:
> >
> >> I suspect your data frame "dat" has non-numeric data in some of the
> >> columns that have ABC in their names. Any column of a data frame can be
> >> numeric or not, but the data frame as a unit cannot be numeric. If your
> >> data file has odd characters in some of the otherwise-numeric columns, the
> >> whole column will be read in as a factor or as character strings. Look at
> >> the output of str(dat) for columns that don't show "num". If you can find
> >> the column, and then one of the bad rows, you can use a text editor to fix
> >> them manually, or show us examples of the bad data and we can suggest ways
> >> to fix it in R.
> >>
> >> ---------------------------------------------------------------------------
> >> Jeff Newmiller                        The     .....       .....  Go Live...
> >> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
> >>                                       Live:   OO#.. Dead: OO#..  Playing
> >> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
> >> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
> >> ---------------------------------------------------------------------------
> >> Sent from my phone. Please excuse my brevity.
> >>
> >> On July 14, 2015 2:35:38 PM PDT, Dawn <dawn1313 at gmail.com> wrote:
> >> >Hi,
> >> >
> >> >I used a small set of data (several columns and rows) and it works fine
> >> >using the following command:
> >> >abc <- rowSums(test[,grep("ABC",names(test),fixed=T)],na.rm=T)
> >> >
> >> >But when I used the real big data table, "Error in rowSums(dat[,
> >> >grep("ABC", names(dat), fixed = T)], na.rm = T) :
> >> >  'x' must be numeric"
> >> >Then it didn't work either using as.numeric():
> >> >> as.numeric(dat)
> >> >Error: (list) object cannot be coerced to type 'double'
> >> >
> >> >Thanks!
> >> >Dawn
> >> >
> >> >
> >> >
> >> >
> >> >On Fri, Jul 10, 2015 at 4:35 PM, Dawn <dawn1313 at gmail.com> wrote:
> >> >
> >> >> Thank you all, and sorry for the messy data. It has worked!
> >> >>
> >> >> Best,
> >> >> Dawn
> >> >>
> >> >> On Fri, Jul 10, 2015 at 4:15 AM, Jim Lemon <drjimlemon at gmail.com> wrote:
> >> >>
> >> >>> Hi Dawn,
> >> >>> Your data are a bit messed up, but try the following:
> >> >>>
> >> >>> colSums(dat[,grep("ABC",names(dat),fixed=TRUE)],na.rm=TRUE)
> >> >>> colSums(dat[,grep("XYZ",names(dat),fixed=TRUE)],na.rm=TRUE)
> >> >>>
> >> >>> I'm assuming that you want to discard the NA values.
> >> >>>
> >> >>> Jim
> >> >>>
> >> >>> On Fri, Jul 10, 2015 at 6:52 AM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
> >> >>> > Hello,
> >> >>> >
> >> >>> > Please use ?dput to give a data example; as posted, it's completely
> >> >>> > unreadable. If your data.frame is named 'dat', use
> >> >>> >
> >> >>> > dput(head(dat, 30))  # paste the output of this in your mail
> >> >>> >
> >> >>> >
> >> >>> > And don't post in HTML; use plain text only, as the posting guide says.
> >> >>> >
> >> >>> > Rui Barradas
> >> >>> >
> >> >>> >
> >> >>> > On 09-07-2015 18:12, Dawn wrote:
> >> >>> >>
> >> >>> >> Hi,
> >> >>> >>
> >> >>> >> I have a big dataframe as follows
> >> >>> >>
> >> >>> >>      109ABC    109XYZ    18ABC    18XYZ   
22XYZ    23ABC
> >> >25ABC
> >> >>> >> 25XYZ
> >> >>> >>     30ABC    31XYZ    32ABC    32XYZ   
34DCM    34XYZ
> >36ABC
> >> >>> 36SUR
> >> >>> >> 38DCM    38XYZ    39DCM    39SUR    41DCM   
41SUR    42DCM
> >> >42SUR
> >> >>> >> 46SUR    52DCM    64ABC    64XYZ    65ABC   
65XYZ    66ABC
> >> >66XYZ
> >> >>> >> 67XYZ    68ABC    68SUR    70MES    70SUR   
72ABC    72XYZ
> >> >76ABC
> >> >>> >> 76XYZ    82ABC    85ABC    POV
> >> >>> >> Cluster_1
> >> >17
> >> >>> 1
> >> >>> >> 3    10    14    5    2    2        1    1  
1    2
> >> >>> >>                          2                  
TT:61
> >> >>> >> Cluster_2                    1
> >4
> >> > 20
> >> >>> >> 6    5    3    6    9    9    6        10   
1    3    1
> >> >>> >>                              4
> >TT:88
> >> >>> >> Cluster_3    3        3                     
6        4
> >> >   17
> >> >>> >> 17    18    13    17    19    22    11    5 
21    8    5
> >18
> >> >   4
> >> >>> >> 7                                        9
> >> >>> >> TT:227
> >> >>> >> ........
> >> >>> >>
> >> >>> >> I want to get two new columns: one is the sum, for each row, of all
> >> >>> >> columns whose names include ABC, and the other is the sum, for each
> >> >>> >> row, of all columns whose names include XYZ.
> >> >>> >>
> >> >>> >> Is there some help? Thank you!
> >> >>> >> Dawn
> >> >>> >>
> >> >>> >>         [[alternative HTML version deleted]]
> >> >>> >>
> >> >>> >>
> >> >>> >> ______________________________________________
> >> >>> >> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> >>> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> >>> >> PLEASE do read the posting guide
> >> >>> >> http://www.R-project.org/posting-guide.html
> >> >>> >> and provide commented, minimal, self-contained, reproducible code.
> >> >>> >>
> >> >>> >
> >> >>>
> >> >>
> >> >>
> >>
> >>
>
>
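
For anyone hitting the same "'x' must be numeric" error: the str() output earlier in the thread lists "109DCM" among the levels of column X109DCM, which suggests that stray header text inside the data caused read.table() to read the abundance columns as factors. Below is a minimal sketch of how one might locate the offending values and convert the columns. The toy data frame and the helper names bad_values() and to_numeric() are made up for illustration; they are not from the original posts.

```r
# Toy stand-in for the real table; the values are invented for illustration.
dat <- data.frame(
  X       = c("POV_Cluster_1", "POV_Cluster_2", "POV_Cluster_3"),
  X109DCM = factor(c("", "1", "109DCM")),  # stray header text in the data
  X22SUR  = factor(c("", "10", "3")),      # blank cells only
  stringsAsFactors = FALSE
)

# 1. Which columns are not numeric? (str(dat) shows the same information.)
non_numeric <- names(dat)[!sapply(dat, is.numeric)]

# 2. Which values would fail to convert? Blank cells are treated as missing.
bad_values <- function(x) {
  x <- as.character(x)
  x <- x[!is.na(x) & x != ""]
  unique(x[is.na(suppressWarnings(as.numeric(x)))])
}
bad <- lapply(dat[c("X109DCM", "X22SUR")], bad_values)

# 3. Convert via as.character() first: as.numeric() applied directly to a
#    factor returns the internal level codes, not the printed values.
to_numeric <- function(x) suppressWarnings(as.numeric(as.character(x)))
abundance_cols <- setdiff(names(dat), "X")
dat[abundance_cols] <- lapply(dat[abundance_cols], to_numeric)
```

Note that as.numeric(dat) fails because a data frame is a list of columns; the conversion has to be applied column by column, as with lapply() above. Blank cells and unparseable values become NA, so na.rm=TRUE is still needed when summing.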

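Once the columns are numeric, the two per-row totals asked for in the original question can be computed with rowSums(). colSums(), as suggested earlier in the thread, returns one total per column instead. The following is a rough sketch on invented data that follows the column-naming pattern from the original post; the helper name sum_rows_matching() and the output column names ABC_total and XYZ_total are assumptions for illustration.

```r
# Toy numeric data using the column-naming pattern from the original post.
dat <- data.frame(
  X109ABC = c(NA, 1, 3),
  X109XYZ = c(17, NA, 3),
  X18ABC  = c(1, 4, NA),
  X18XYZ  = c(3, 20, 6)
)

# Select columns by a fixed substring of the name, then sum across each row.
# drop = FALSE keeps a data frame even if only one column matches.
sum_rows_matching <- function(df, pattern) {
  cols <- grep(pattern, names(df), fixed = TRUE)
  rowSums(df[, cols, drop = FALSE], na.rm = TRUE)
}

dat$ABC_total <- sum_rows_matching(dat, "ABC")
dat$XYZ_total <- sum_rows_matching(dat, "XYZ")
```

With na.rm = TRUE, a row whose matching cells are all NA gets a total of 0 rather than NA, which may or may not be what is wanted for abundance data.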