thr3ads.net - R help - [R] grep [Aug 2024]

Steven Yen

2024-Aug-02 01:10 UTC

[R] grep

Good morning. Below I use a statement like

j<-grep(".r\\b",colnames(mydata),value=TRUE); j

with the \\b option, which I read about a long time ago and have found useful.

Are there more of these options, other than what is in ?grep? Thanks.

dstat is just my own descriptive routine.

 > x
 [1] "age"          "sleep"        "primary"      "middle"
 [5] "high"         "somewhath"    "veryh"        "somewhatm"
 [9] "verym"        "somewhatc"    "veryc"        "somewhatl"
[13] "veryl"        "village"      "married"      "social"
[17] "agricultural" "communist"    "minority"     "religious"
 > colnames(mydata)
 [1] "depression"     "sleep"          "female"         "village"
 [5] "agricultural"   "married"        "communist"      "minority"
 [9] "religious"      "social"         "no"             "primary"
[13] "middle"         "high"           "veryh"          "somewhath"
[17] "notveryh"       "verym"          "somewhatm"      "notverym"
[21] "veryc"          "somewhatc"      "notveryc"       "veryl"
[25] "somewhatl"      "notveryl"       "age"            "village.r"
[29] "married.r"      "social.r"       "agricultural.r" "communist.r"
[33] "minority.r"     "religious.r"    "male.r"         "education.r"
 > j<-grep(".r\\b",colnames(mydata),value=TRUE); j
[1] "village.r"      "married.r"      "social.r"       "agricultural.r"
[5] "communist.r"    "minority.r"     "religious.r"    "male.r"
[9] "education.r"
 > j<-c(x,j); j
 [1] "age"            "sleep"          "primary"        "middle"
 [5] "high"           "somewhath"      "veryh"          "somewhatm"
 [9] "verym"          "somewhatc"      "veryc"          "somewhatl"
[13] "veryl"          "village"        "married"        "social"
[17] "agricultural"   "communist"      "minority"       "religious"
[21] "village.r"      "married.r"      "social.r"       "agricultural.r"
[25] "communist.r"    "minority.r"     "religious.r"    "male.r"
[29] "education.r"
 > data<-mydata[j]
 > cbind(
+   dstat(subset(data,male.r==1))[,1:2],
+   dstat(subset(data,male.r==0))[,1:2]
+ )
Sample statistics (Weighted = FALSE )

Sample statistics (Weighted = FALSE )

                Mean Std.dev  Mean Std.dev
age            6.279   0.841 6.055   0.813
sleep          6.483   1.804 6.087   2.045
primary        0.452   0.498 0.408   0.491
middle         0.287   0.453 0.176   0.381
high           0.171   0.377 0.082   0.275
somewhath      0.522   0.500 0.447   0.497
veryh          0.254   0.435 0.250   0.433
somewhatm      0.419   0.493 0.460   0.498
verym          0.544   0.498 0.411   0.492
somewhatc      0.376   0.484 0.346   0.476
veryc          0.593   0.491 0.615   0.487
somewhatl      0.544   0.498 0.504   0.500
veryl          0.390   0.488 0.389   0.487
village        0.757   0.429 0.752   0.432
married        0.936   0.245 0.906   0.291
social         0.538   0.499 0.528   0.499
agricultural   0.780   0.414 0.826   0.379
communist      0.178   0.383 0.038   0.190
minority       0.071   0.256 0.081   0.273
religious      0.088   0.284 0.102   0.302
village.r      0.243   0.429 0.248   0.432
married.r      0.064   0.245 0.094   0.291
social.r       0.462   0.499 0.472   0.499
agricultural.r 0.220   0.414 0.174   0.379
communist.r    0.822   0.383 0.962   0.190
minority.r     0.929   0.256 0.919   0.273
religious.r    0.912   0.284 0.898   0.302
male.r         1.000   0.000 0.000   0.000
education.r    0.090   0.286 0.334   0.472
 >

Iris Simmons

2024-Aug-02 01:40 UTC

[R] grep

You can find more by reading through ?regex as well as Perl documentation
(which you can find online).
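
As a rough illustration of what ?regex covers beyond \\b, here is a minimal sketch showing anchors, a POSIX character class, and one Perl-only construct (a negative lookahead) enabled with perl = TRUE. The vector nms is made up for this example and is not taken from the data in the thread.

```r
# Hypothetical column names, for illustration only
nms <- c("age", "age2", "male.r", "married.r", "income_2020", "r.squared")

# ^ and $ anchor the match to the start and end of the string
starts_m <- grep("^m", nms, value = TRUE)       # "male.r" "married.r"
ends_dot_r <- grep("\\.r$", nms, value = TRUE)  # "male.r" "married.r"

# POSIX character class: names containing at least one digit
has_digit <- grep("[[:digit:]]", nms, value = TRUE)  # "age2" "income_2020"

# Perl-style negative lookahead, which needs perl = TRUE:
# names starting with "m" that do not contain "rr" later on
m_no_rr <- grep("^m(?!.*rr)", nms, value = TRUE, perl = TRUE)  # "male.r"
```

Without perl = TRUE, grep() uses the default extended regular expressions described in ?regex, where lookaheads are not available.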


Rui Barradas

2024-Aug-02 04:28 UTC

[R] grep

Hello,

The reference for metacharacters is the documentation page ?regex.
If you want to know whether there are more metacharacters similar to \b,
there are \< and \>. Below are examples of using them instead of \b.

Also, the pattern '.r' does not match a period followed by an 'r': an
unescaped period ('.') matches any single character. To match a literal
period you must escape it. The correct regex is '\\.r'.
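
To make the escaping point concrete, here is a small sketch with made-up names (not the column names from the thread). With the data in the original post the unescaped pattern happens to give the right answer, because no other column name ends in a character followed by 'r'. The names "car" and "tour" are assumptions added to show where it would go wrong.

```r
# Made-up names: "car" and "tour" end in "r" but contain no period
nms <- c("village.r", "car", "tour", "veryh")

# Unescaped: "." matches any character, so "ar" and "ur" also match
unescaped <- grep(".r\\b", nms, value = TRUE)
# "village.r" "car" "tour"

# Escaped: only a literal period followed by "r" matches
escaped <- grep("\\.r\\b", nms, value = TRUE)
# "village.r"

# fixed = TRUE treats the pattern as a plain string, so "." is literal
# (no metacharacters such as \\b can be used in this mode)
literal <- grep(".r", nms, value = TRUE, fixed = TRUE)
# "village.r"
```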



x <- c("age", "sleep", "primary", "middle", "high", "somewhath", "veryh",
        "somewhatm", "verym", "somewhatc", "veryc", "somewhatl", "veryl",
        "village", "married", "social", "agricultural", "communist",
        "minority", "religious")
colnms <- c("depression", "sleep", "female", "village", "agricultural",
             "married", "communist", "minority", "religious", "social", "no",
             "primary", "middle", "high", "veryh", "somewhath", "notveryh",
             "verym", "somewhatm", "notverym", "veryc", "somewhatc", "notveryc",
             "veryl", "somewhatl", "notveryl", "age", "village.r", "married.r",
             "social.r", "agricultural.r", "communist.r", "minority.r",
             "religious.r", "male.r", "education.r")

grep("\\.r\\b", colnms, value = TRUE)
#> [1] "village.r"      "married.r"      "social.r"       "agricultural.r"
#> [5] "communist.r"    "minority.r"     "religious.r"    "male.r"
#> [9] "education.r"
# the same as above
# \\> matches the empty string at the end of a word,
# \\b matches the empty string at either edge of a word
grep("\\.r\\>", colnms, value = TRUE)
#> [1] "village.r"      "married.r"      "social.r"       "agricultural.r"
#> [5] "communist.r"    "minority.r"     "religious.r"    "male.r"
#> [9] "education.r"

# 4 col names have an 'm' and end in '.r', therefore 4 matches
grep("m.*\\.r\\>", colnms, value = TRUE)
#> [1] "married.r"   "communist.r" "minority.r"  "male.r"
# only the strings starting with 'm'
grep("\\bm.*\\.r\\b", colnms, value = TRUE)
#> [1] "married.r"  "minority.r" "male.r"
grep("\\<m.*\\.r\\>", colnms, value = TRUE)
#> [1] "married.r"  "minority.r" "male.r"
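
One caveat worth sketching: \\b and \\> mark the end of a word, not the end of the string, so they would also match a '.r' that is followed by another period. If the intent is "names that end in .r", the $ anchor or base R's endsWith() may be a closer fit. The names and the data frame below are hypothetical and only serve to show the difference.

```r
# Hypothetical names: one true ".r" suffix and two near misses
nms <- c("male.r", "male.r.old", "male.raw")

# Word boundary: the "." after ".r" is a non-word character, so this matches too
word_end <- grep("\\.r\\b", nms, value = TRUE)
# "male.r" "male.r.old"

# End-of-string anchor: only a real ".r" suffix matches
string_end <- grep("\\.r$", nms, value = TRUE)
# "male.r"

# endsWith() does the same test without any regular expression
suffix <- endsWith(nms, ".r")
# TRUE FALSE FALSE

# Selecting data frame columns by suffix (df is made up for this example)
df <- data.frame(male.r = c(1, 0), male.r.old = c(0, 1), male.raw = c(5, 6))
picked <- df[endsWith(names(df), ".r")]
```

Here picked keeps only the male.r column. With the column names in this thread all three approaches select the same nine columns.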


Hope this helps,

Rui Barradas


