Hi,
I would like to remove a large number of rows from the data frame df1.
The rows to delete are identified by their values of Mat (e.g. 2, 4, 7, 10),
but in practice I have a large number (300) of Mat values to remove.
dput(df1)
structure(list(Mat = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3,
3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 7,
7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 10, 10,
10, 11, 11, 11, 11, 11, 11, 11), Prenom = c("Ginette", "Ginette",
"Ginette", "Ginette", "Ginette", "Ginette", "Nicole", "Nicole",
"Nicole", "Nicole", "Jean", "Jean", "Jean", "Jean", "Jean", "Ginette",
"Ginette", "Ginette", "Ginette", "Ginette", "Hélène", "Hélène",
"Hélène", "Hélène", "Hélène", "Hélène", "Guy", "Guy", "Guy",
"Guy", "Guy", "Guy", "Claude", "Claude", "Claude", "Claude",
"Claude", "Claude", "Claude", "Régine", "Régine", "Régine", "Régine",
"Régine", "Régine", "Régine", "Germain", "Germain", "Germain",
"Germain", "Germain", "Germain", "Béatrice", "Béatrice", "Béatrice",
"Josette", "Josette", "Josette", "Josette", "Josette", "Josette",
"Josette"), Sexe = c("Féminin", "Féminin", "Féminin", "Féminin",
"Féminin", "Féminin", "Féminin", "Féminin", "Féminin", "Féminin",
"Masculin", "Masculin", "Masculin", "Masculin", "Masculin", "Féminin",
"Féminin", "Féminin", "Féminin", "Féminin", "Féminin", "Féminin",
"Féminin", "Féminin", "Féminin", "Féminin", "Masculin", "Masculin",
"Masculin", "Masculin", "Masculin", "Masculin", "Masculin", "Masculin",
"Masculin", "Masculin", "Masculin", "Masculin", "Masculin", "Féminin",
"Féminin", "Féminin", "Féminin", "Féminin", "Féminin", "Féminin",
"Masculin", "Masculin", "Masculin", "Masculin", "Masculin", "Masculin",
"Féminin", "Féminin", "Féminin", "Féminin", "Féminin", "Féminin",
"Féminin", "Féminin", "Féminin", "Féminin")), .Names = c("Mat",
"Prenom", "Sexe"), row.names = c(NA, 62L), class = "data.frame")
I would like to obtain the data frame df2
dput(df2)
structure(list(Mat = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3,
3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 7,
7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 10, 10,
10, 11, 11, 11, 11, 11, 11, 11), Prenom = c("Ginette", "Ginette",
"Ginette", "Ginette", "Ginette", "Ginette", "Nicole", "Nicole",
"Nicole", "Nicole", "Jean", "Jean", "Jean", "Jean", "Jean", "Ginette",
"Ginette", "Ginette", "Ginette", "Ginette", "Hélène", "Hélène",
"Hélène", "Hélène", "Hélène", "Hélène", "Guy", "Guy", "Guy",
"Guy", "Guy", "Guy", "Claude", "Claude", "Claude", "Claude",
"Claude", "Claude", "Claude", "Régine", "Régine", "Régine", "Régine",
"Régine", "Régine", "Régine", "Germain", "Germain", "Germain",
"Germain", "Germain", "Germain", "Béatrice", "Béatrice", "Béatrice",
"Josette", "Josette", "Josette", "Josette", "Josette", "Josette",
"Josette"), Sexe = c("Féminin", "Féminin", "Féminin", "Féminin",
"Féminin", "Féminin", "Féminin", "Féminin", "Féminin", "Féminin",
"Masculin", "Masculin", "Masculin", "Masculin", "Masculin", "Féminin",
"Féminin", "Féminin", "Féminin", "Féminin", "Féminin", "Féminin",
"Féminin", "Féminin", "Féminin", "Féminin", "Masculin", "Masculin",
"Masculin", "Masculin", "Masculin", "Masculin", "Masculin", "Masculin",
"Masculin", "Masculin", "Masculin", "Masculin", "Masculin", "Féminin",
"Féminin", "Féminin", "Féminin", "Féminin", "Féminin", "Féminin",
"Masculin", "Masculin", "Masculin", "Masculin", "Masculin", "Masculin",
"Féminin", "Féminin", "Féminin", "Féminin", "Féminin", "Féminin",
"Féminin", "Féminin", "Féminin", "Féminin")), .Names = c("Mat",
"Prenom", "Sexe"), row.names = c(NA, 62L), class = "data.frame")
It is possible to obtain it with
df2 <- df1[df1$Mat != 2 & df1$Mat != 4 & df1$Mat != 7 & df1$Mat != 10, ]
But how can I delete these rows when the 300 values of Mat are in the
vector MatDelete?
Any ideas?
--
Michel ARNAUD
Chargé de mission auprès du DRH
DGDRD-Drh - TA 174/04
Av Agropolis 34398 Montpellier cedex 5
tel : 04.67.61.75.38
fax : 04.67.61.57.87
port: 06.47.43.55.31
Hi,
Try:
vec1<- c(2,4,7,10)
df1New<-df1[!df1$Mat %in% vec1,]
dim(df1New)
#[1] 43  3
A.K.
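
A minimal, self-contained sketch of the same idea, using a small made-up data frame (toy and its values are illustrative assumptions, not the df1 from the post). It also suggests why chaining != tests with | cannot work: every value differs from at least one of the listed numbers, so the | condition is TRUE for all rows, whereas & gives the intended filter.

```r
# Toy data: illustrative only, not the poster's df1
toy <- data.frame(Mat = c(1, 1, 2, 2, 3, 4, 4, 5),
                  Prenom = c("A", "A", "B", "B", "C", "D", "D", "E"),
                  stringsAsFactors = FALSE)
MatDelete <- c(2, 4)

# Keep the rows whose Mat is NOT in MatDelete
kept <- toy[!toy$Mat %in% MatDelete, ]

# Chaining with | keeps every row: each Mat differs from 2 or from 4
kept_or <- toy[toy$Mat != 2 | toy$Mat != 4, ]

# Chaining with & gives the same rows as %in%, but has to be written out
# value by value, which is impractical for 300 values
kept_and <- toy[toy$Mat != 2 & toy$Mat != 4, ]

print(kept$Mat)
print(nrow(kept_or))
print(identical(kept, kept_and))
```

With a long MatDelete vector only the %in% line needs to stay; the two chained versions are there for comparison.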
----- Original Message -----
From: Arnaud Michel <michel.arnaud at cirad.fr>
To: R help <r-help at r-project.org>
Cc:
Sent: Tuesday, September 10, 2013 11:03 AM
Subject: [R] to delete lines by means of a vector
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Hello,

It seems you've made a mistake and posted df1 twice:

identical(df1, df2) # TRUE

As for your question, try negating %in% (see ?"%in%"):

MatDelete <- c(2, 4, 7, 10)
df3 <- df1[!df1$Mat %in% MatDelete, ]

Hope this helps,

Rui Barradas

On 10-09-2013 16:03, Arnaud Michel wrote:
> But how to delete these lines when the 300 values of Mat are in the
> vector MatDelete
>
> Any ideas ?
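
One related pitfall, sketched here with made-up data (toy and none_match are hypothetical names, not from the thread): dropping rows with a negative which() index can look equivalent to the logical negation, but it returns an empty data frame when nothing matches, whereas the negated %in% keeps all rows.

```r
# Toy data: illustrative only
toy <- data.frame(Mat = c(1, 2, 3, 3), Sexe = c("F", "M", "F", "F"))
none_match <- c(98, 99)   # no value of toy$Mat appears here

# Logical negation: nothing to delete, so all 4 rows are kept
safe <- toy[!toy$Mat %in% none_match, ]

# Negative which(): which() returns integer(0), and toy[-integer(0), ]
# is the same as toy[integer(0), ], i.e. zero rows
risky <- toy[-which(toy$Mat %in% none_match), ]

print(nrow(safe))
print(nrow(risky))
```

This is one reason the logical form is usually the safer choice when the vector of values to delete comes from elsewhere and may not overlap with the data.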
?match  ## or ?"%in%"

## if vec is the vector of values you want to omit, something like
df2 <- df1[! (df1$Mat %in% vec),]  ## the parentheses may be unnecessary

Cheers,
Bert

On Tue, Sep 10, 2013 at 8:03 AM, Arnaud Michel <michel.arnaud@cirad.fr> wrote:
> But how to delete these lines when the 300 values of Mat are in the vector
> MatDelete
>
> Any ideas ?

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
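
To connect the two help pages mentioned above: the R help for match describes %in% as defined in terms of match(), so the same filter can be written either way. The sketch below uses made-up data (toy and vec are illustrative, not the poster's objects) and also checks the remark about parentheses: special operators such as %in% bind more tightly than !, so !x %in% y is parsed as !(x %in% y).

```r
# Toy data: illustrative only
toy <- data.frame(Mat = c(1, 2, 2, 3, 4, 5))
vec <- c(2, 4)

# match() returns the position of each Mat in vec, or NA when absent
pos <- match(toy$Mat, vec)

keep_match   <- is.na(pos)            # TRUE where Mat is not in vec
keep_in      <- !(toy$Mat %in% vec)   # with parentheses
keep_in_bare <- !toy$Mat %in% vec     # without parentheses

print(pos)
print(identical(keep_match, keep_in))
print(identical(keep_in, keep_in_bare))

# drop = FALSE keeps the result a data frame even with a single column
print(toy[keep_in, , drop = FALSE])
```

Keeping the parentheses costs nothing and may make the intent easier to read, which is presumably why they appear in the suggestion above.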