thr3ads.net - R help - [R] "Copy-pastable" output of 1000 plus variables [Apr 2017]


BR_email

2017-Apr-23 20:46 UTC

[R] "Copy-pastable" output of 1000 plus variables

Jeff:
Thanks, Please see my reply to David.
Bruce

Bruce Ratner, Ph.D.
The Significant Statistician?
(516) 791-3544
Statistical Predictive Analytics -- www.DMSTAT1.com
Machine-Learning Data Mining and Modeling -- www.GenIQ.net
  

Jeff Newmiller wrote:
> Coming from an Excel background, copying and pasting seems attractive, but
> it does not create a reproducible record of what you did, so it becomes quite
> tiring and frustrating after some time has passed and you return to your
> analysis.
>
> Nitpick: you put the setdiff function in the row selection position, an
> error I am sure Hadley did not recommend.
>
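
To illustrate the nitpick: in dta[ i, j ] the first index selects rows and the second selects columns, so a vector of column names built with setdiff belongs after the comma. The following is only a minimal sketch; the data frame dta and the vector drop_cols are made up for illustration and do not come from the original post.

```r
# Hypothetical data frame; the column names are made up for illustration.
dta <- data.frame(key = 1:3, value1 = c(2.5, 3.1, 4.8),
                  value2 = c(10, 20, 30), factor1 = c("a", "b", "a"))

# Hypothetical list of columns to leave out.
drop_cols <- c("value2", "factor1")

# setdiff() returns the names in the first vector that are not in the second.
keep_cols <- setdiff(names(dta), drop_cols)

# Column position (after the comma): all rows, selected columns only.
# drop = FALSE keeps the result a data frame even if one column remains.
subset_dta <- dta[ , keep_cols, drop = FALSE]

# Row position (before the comma): R looks for rows named "key" and "value1".
# No such row names exist here, so the result is rows of NA, not the columns.
wrong <- dta[keep_cols, ]
```

Here names(subset_dta) is "key" and "value1", while wrong contains only NA values, which is the symptom of putting the selection in the row position.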
> Since R is programmable, there are far more ways to select columns than
> just setdiff. Since your description of desired features is vague, you are
> unlikely to get the answer you would really like from your email. Some
> possibilities to think about:
>
> a) use regular expressions and grep or grepl to select by similar character
> patterns. E.g. all columns including the substring "value" or
> "key": grep( "key|value", names( dta ) ). It is possible to specify
> very complex selection patterns, but there are whole books on regular
> expressions, so you can't expect to learn all about them on this R-specific
> mailing list.
>
> b) use a separate csv file with a column listing each column name, and then
> one column for each subset you want to define, using TRUE/FALSE values to
> include or not include the column name identified. E.g.
>
> # typically easier to manage in an external data file, inline for example only
> colsets <- read.csv( text =
> "Colname,set1,set2
> key,TRUE,TRUE
> value1,TRUE,FALSE
> value2,TRUE,FALSE
> factor1,FALSE,TRUE
> ",header=TRUE,as.is=TRUE)
> # assumes the rows of colsets are in the same order as the columns of dta
> dta[ , colsets$set1 ]
>
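
Both suggestions can be tried on a toy data frame. This is a sketch under assumptions: dta is a made-up data frame whose column names match the example table above, and option (b) is shown selecting by name rather than by logical position, a small variation that does not rely on the lookup rows being in the same order as the columns.

```r
# Hypothetical data frame matching the column names in the example table.
dta <- data.frame(key = 1:3, value1 = c(2.5, 3.1, 4.8),
                  value2 = c(10, 20, 30), factor1 = c("a", "b", "a"))

# (a) Pattern matching: grep() returns the positions of the matching names,
# grepl() returns a logical vector; either one can index the columns.
idx <- grep("key|value", names(dta))
by_pattern <- dta[ , idx, drop = FALSE]
by_logical <- dta[ , grepl("key|value", names(dta)), drop = FALSE]

# (b) Lookup table: one row per column name, one TRUE/FALSE column per subset.
colsets <- read.csv(text = "Colname,set1,set2
key,TRUE,TRUE
value1,TRUE,FALSE
value2,TRUE,FALSE
factor1,FALSE,TRUE
", header = TRUE, as.is = TRUE)

# Turn each TRUE/FALSE column into a vector of column names, then select
# by name so the order of rows in colsets does not matter.
set1_cols <- colsets$Colname[colsets$set1]
set2_cols <- colsets$Colname[colsets$set2]
by_set1 <- dta[ , set1_cols, drop = FALSE]
by_set2 <- dta[ , set2_cols, drop = FALSE]
```

With a real file, the text = argument would be replaced by the path to the csv file that holds the lookup table.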
> Also your criteria of "clean listing" and "copy-pasteable" are likely
> mutually exclusive, depending on how you interpret them. You might be able to
> use dput to export a set of column names that can be re-imported accurately,
> but you might not regard it as "clean" if you are thinking "readable".

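The dput idea mentioned at the end of the quoted message can be sketched as follows. The data frame dta is made up for illustration; the point is only to contrast output that can be pasted back into a script with output that is merely easy to read.

```r
# Hypothetical data frame; the column names are made up for illustration.
dta <- data.frame(key = 1:3, value1 = c(2.5, 3.1, 4.8),
                  value2 = c(10, 20, 30), factor1 = c("a", "b", "a"))
cols <- names(dta)

# dput() prints an R expression that recreates the object, here a c(...) call
# with quoted names, which can be pasted into a script as it stands.
dput(cols)

# Capture the same text and parse it back to check that it round-trips.
txt <- capture.output(dput(cols))
restored <- eval(parse(text = txt))

# A plainer listing, one name per line: easier to read, but it would need
# quoting and commas added before R could read it back as a vector.
writeLines(cols)
```

The dput form is the "copy-pasteable" one; the writeLines form is closer to a "clean listing".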
David L Carlson

2017-Apr-23 23:26 UTC


[R] "Copy-pastable" output of 1000 plus variables

This might work for you:

cols <- LETTERS # actually this will be cols <- colnames(df) in your example
# Create a data frame to select columns
choose <- data.frame(cols, select=0, stringsAsFactors=FALSE)
# Run the editor and replace 0 with 1 in the select column 
# for each variable you wish to include
fix(choose)
# Your list of variables will be the vector mycols
mycols <- choose$cols[choose$select==1]
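
Because fix() opens an interactive editor, the selection made there is not recorded anywhere. One possible way to keep a record, sketched here under assumptions, is to save the edited table to a csv file and read it back in later sessions. The data frame dta, the marked columns, and the temporary file are all hypothetical, and the manual editing step is replaced by an assignment so the sketch can run unattended.

```r
# Hypothetical data frame standing in for the real data.
dta <- data.frame(key = 1:3, value1 = c(2.5, 3.1, 4.8),
                  value2 = c(10, 20, 30), factor1 = c("a", "b", "a"))

# Same selection table as above, built from the real column names.
choose <- data.frame(cols = colnames(dta), select = 0,
                     stringsAsFactors = FALSE)

# Stand-in for the manual edit in fix(): mark two columns with 1.
choose$select[choose$cols %in% c("key", "value1")] <- 1

# Save the edited table so the choice can be reused later.
sel_file <- tempfile(fileext = ".csv")  # hypothetical location
write.csv(choose, sel_file, row.names = FALSE)

# In a later session: read the table back and rebuild the column list.
choose2 <- read.csv(sel_file, stringsAsFactors = FALSE)
mycols <- choose2$cols[choose2$select == 1]
selected <- dta[ , mycols, drop = FALSE]
```

The saved file then plays the same role as the external csv lookup table suggested earlier in the thread.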


David L. Carlson
Department of Anthropology
Texas A&M University

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

David Winsemius

2017-Apr-24 03:13 UTC


[R] "Copy-pastable" output of 1000 plus variables

I don't have a lot of interest in trying to replicate operations in SAS. 

If you don't exhibit the willingness to show code in R then ... best of
luck. But do read the Posting Guide to at least understand the local
expectations.

Good luck;
David

Sent from my iPhone

BR_email

2017-Apr-24 10:51 UTC


[R] "Copy-pastable" output of 1000 plus variables

David:
Your code worked beautifully.
This little ditty should be high-profile for those who work with big data,
which are virtually never accompanied by a data dictionary.
This code is the first step in taking the data at large and bringing it down
in size.
Excellent.
Thank you.
Bruce

  

