thr3ads.net - R help - [R] how to automatically select certain columns using for loop in dataframe [Apr 2009]

Ferry

2009-Apr-09 22:30 UTC

[R] how to automatically select certain columns using for loop in dataframe

Hi,

I am trying to display / print certain columns in my data frame that share
a certain condition (for example, part of the column name). I am using a for
loop, as follows:

# below is the sample data structure
all.data <- data.frame( NUM_A = 1:5, NAME_A = c("Andy", "Andrew", "Angus", "Alex", "Argo"),
                        NUM_B = 1:5, NAME_B = c(NA, "Barn", "Bolton", "Bravo", NA),
                        NUM_C = 1:5, NAME_C = c("Candy", NA, "Cecil", "Crayon", "Corey"),
                        NUM_D = 1:5, NAME_D = c("David", "Delta", NA, NA, "Dummy") )

col_names <- c("A", "B", "C", "D")

> all.data
  NUM_A NAME_A NUM_B NAME_B NUM_C NAME_C NUM_D NAME_D
1     1   Andy     1   <NA>     1  Candy     1  David
2     2 Andrew     2   Barn     2   <NA>     2  Delta
3     3  Angus     3 Bolton     3  Cecil     3   <NA>
4     4   Alex     4  Bravo     4 Crayon     4   <NA>
5     5   Argo     5   <NA>     5  Corey     5  Dummy

Then for each col_names, I want to display the columns:

for (each_name in col_names) {

        sub.data <- subset( all.data,
                            !is.na( paste("NAME_", each_name, sep = '') ),
                            select = c( paste("NUM_", each_name, sep = '') ,
                                        paste("NAME_", each_name, sep = '') )
                          )
        print(sub.data)
}

the "incorrect" result:

  NUM_A NAME_A
1     1   Andy
2     2 Andrew
3     3  Angus
4     4   Alex
5     5   Argo
  NUM_B NAME_B
1     1   <NA>
2     2   Barn
3     3 Bolton
4     4  Bravo
5     5   <NA>
  NUM_C NAME_C
1     1  Candy
2     2   <NA>
3     3  Cecil
4     4 Crayon
5     5  Corey
  NUM_D NAME_D
1     1  David
2     2  Delta
3     3   <NA>
4     4   <NA>
5     5  Dummy

What I want is for the result to display only the rows whose NAME is not NA.
Here, instead of NA, the value could be NULL, zero, or some other specific
value.

the "correct" result:

  NUM_A NAME_A
1     1   Andy
2     2 Andrew
3     3  Angus
4     4   Alex
5     5   Argo
  NUM_B NAME_B
2     2   Barn
3     3 Bolton
4     4  Bravo
  NUM_C NAME_C
1     1  Candy
3     3  Cecil
4     4 Crayon
5     5  Corey
  NUM_D NAME_D
1     1  David
2     2  Delta
5     5  Dummy

I am guessing that I am not using the correct type in the following statement
(within the subset call in the loop):
!is.na( paste("NAME_", each_name, sep = '') )

But then, I might be using a completely wrong approach.

Any ideas are definitely appreciated.

Thank you,

Ferry


milton ruser

2009-Apr-10 03:30 UTC

[R] how to automatically select certain columns using for loop in dataframe

Hi Ferry,

It is not very elegant, but you can try:

for (each_name in col_names) {

        sub.data <- subset( all.data,
                            !is.na( paste("NAME_", each_name, sep = '') ),
                            select = c( paste("NUM_", each_name, sep = '') ,
                                        paste("NAME_", each_name, sep = '') )
                          )
        # second pass: drop the rows where the NAME column (column 2) is NA
        sub.data.2 <- subset(sub.data, !is.na(sub.data[,2]))
        print(sub.data.2)
}
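
A note on why the first filter in the loop has no effect, offered as a minimal sketch and not as anything the posters wrote: paste() returns an ordinary character string such as "NAME_B", and is.na() of that string is FALSE, so subset() receives a single TRUE that is recycled across every row. Looking the column up by its name with [[ ]] tests the actual values instead, which allows the filtering to happen in one step. The names num_col, name_col, keep and results are introduced here purely for illustration.

```r
# Sample data from the original post
all.data <- data.frame(
  NUM_A = 1:5, NAME_A = c("Andy", "Andrew", "Angus", "Alex", "Argo"),
  NUM_B = 1:5, NAME_B = c(NA, "Barn", "Bolton", "Bravo", NA),
  NUM_C = 1:5, NAME_C = c("Candy", NA, "Cecil", "Crayon", "Corey"),
  NUM_D = 1:5, NAME_D = c("David", "Delta", NA, NA, "Dummy")
)
col_names <- c("A", "B", "C", "D")

# The original condition tests the pasted string, not the column it names
always_true <- !is.na(paste("NAME_", "B", sep = ""))

# results (hypothetical name) keeps each filtered piece for later inspection
results <- list()
for (each_name in col_names) {
  num_col  <- paste("NUM_", each_name, sep = "")
  name_col <- paste("NAME_", each_name, sep = "")

  # Fetch the column by name, then test its values for NA
  keep <- !is.na(all.data[[name_col]])

  sub.data <- all.data[keep, c(num_col, name_col)]
  results[[each_name]] <- sub.data
  print(sub.data)
}
```

Plain bracket indexing is used here instead of subset() because the column names are held in variables; subset() evaluates its condition inside the data frame, which is convenient interactively but awkward when names are built with paste().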



Petr PIKAL

2009-Apr-10 07:10 UTC

[R] Re: how to automatically select certain columns using for loop in dataframe

Hi

I do not like complicated paste loops too much, so I would prefer

for (i in 1:4) print(na.omit(all.data[ , last.char(names(all.data)) %in% col_names[i] ]))

with the last.char function defined beforehand, like this

last.char <- function(x) substring(x, first=nchar(x), last=nchar(x))
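
The last-character matching above works because every suffix in col_names is a single letter. As a hedged sketch of a variant, assuming the suffix always follows an underscore and assuming R 3.3.0 or later for endsWith(), the helper below selects columns by their full suffix and also treats a caller-supplied set of values as missing, which touches on the original question's "NULL, or zero (or other specific values)". The function select_group and its missing_values argument are hypothetical names, not part of base R or of this thread.

```r
# Sample data from the original post
all.data <- data.frame(
  NUM_A = 1:5, NAME_A = c("Andy", "Andrew", "Angus", "Alex", "Argo"),
  NUM_B = 1:5, NAME_B = c(NA, "Barn", "Bolton", "Bravo", NA),
  NUM_C = 1:5, NAME_C = c("Candy", NA, "Cecil", "Crayon", "Corey"),
  NUM_D = 1:5, NAME_D = c("David", "Delta", NA, NA, "Dummy")
)

# Hypothetical helper: keep the columns whose names end in "_<suffix>",
# then drop every row in which any kept column is NA or is listed in
# missing_values
select_group <- function(df, suffix, missing_values = character(0)) {
  cols <- endsWith(names(df), paste0("_", suffix))
  sub  <- df[, cols, drop = FALSE]

  # One logical vector per column, combined with OR across columns
  bad <- Reduce(`|`, lapply(sub, function(x) is.na(x) | x %in% missing_values))
  sub[!bad, , drop = FALSE]
}

# Same loop as in the thread, without building column names by hand
for (s in c("A", "B", "C", "D")) print(select_group(all.data, s))

# Treating a specific value as missing in addition to NA
print(select_group(all.data, "D", missing_values = "Dummy"))
```

Like na.omit(), this drops a row when any of the selected columns is missing, not only the NAME column; with the sample data the two behave the same because the NUM columns contain no NA.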

Regards
Petr

