Hi,

I am new to R and learning the basics. I have read that a data frame can store any data type, as opposed to a matrix, which stores only numeric data. My question is:

1.) How do I replace a particular pattern with a new pattern in a data frame? I tried something like

    x = as.data.frame(gsub(".*LINK.*","NA",file))

but the output is really weird: it converts everything into zeros and ones.

What I actually want is to delete all the columns in the data frame that contain a given value in any row.

    file = read.csv("x.csv",header=T,sep=",")   # read the file
    file[file == "LINK"] = NA                   # replace the matching values with NA
    file[,colSums(is.na(file))==0]              # keep only the columns without NAs

This approach pretty much does the job, but I want to get familiar with how to use regular expressions on a data frame. Any suggestions on which function/package/regex to use when dealing with data frames?

Sorry if this is a basic question.
Thank you
Inline below.

On Sunday, February 21, 2016, kalyan chakravarty <kalyanchakravarty456 at gmail.com> wrote:

> Hi,
> I am new to R and learning the basics. I have read that a data frame can
> store any data type, as opposed to a matrix, which stores only numeric. My
> question is

False. You need to spend more time with a tutorial or two, as your initial "understanding" is already incorrect. The rstudio.com website has some nice suggestions for web tutorials, though you can certainly find many just by searching.

Your questions below are essentially nonsense, because you do not understand the data frame concept.

Cheers,
Bert

> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
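The reply above does not say what is wrong with the original claim, so here is a minimal sketch of the distinction, using made-up object names (m, mixed, df) that do not appear in the thread. A matrix is not limited to numbers; it is limited to a single type for all of its elements, whereas each column of a data frame can have its own type.

```r
# A matrix can hold character data, but every element shares one type.
m <- matrix(c("a", "b", "c", "d"), nrow = 2)
m_type <- typeof(m)

# Mixing numbers and strings in a matrix coerces everything to character.
mixed <- matrix(c(1, "x", 2, "y"), nrow = 2)
mixed_type <- typeof(mixed)

# A data frame is a list of columns; each column keeps its own type.
df <- data.frame(id = 1:3,
                 label = c("a", "b", "c"),
                 flag = c(TRUE, FALSE, TRUE),
                 stringsAsFactors = FALSE)
col_classes <- sapply(df, class)
print(col_classes)
```

Here both m_type and mixed_type are "character", while col_classes reports a different class for each of the three columns.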
Hi kalyan,

It is a bit difficult to work out what you want to do. However, there are some things I can suggest. The gsub function is useful for changing strings, not for assigning new values. If you want to delete a column of a data frame when it contains any NA values, you first want to check for NA values (let's call your data frame x.df to avoid confusion):

    x.df<-data.frame(a=sample(1:10,10),
     b=c(sample(1:10,9),NA),c=sample(1:10,10))
    any(is.na(x.df[,1]))

The last line will return TRUE if at least one element of the first column of x.df is NA. Next you want to know how many columns there are in x.df:

    ncols<-dim(x.df)[2]

Now you can step through the columns _backwards_ (so that deleting a column doesn't shift the positions of the columns you still have to test) to delete any containing NA values:

    for(column in ncols:1)
     if(any(is.na(x.df[,column]))) x.df[[column]]<-NULL

This leaves me with the first and third columns of x.df.

Jim

On Mon, Feb 22, 2016 at 4:43 AM, kalyan chakravarty <kalyanchakravarty456 at gmail.com> wrote:

> Hi,
> I am new to R and learning the basics.
> [...]
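On the regular-expression part of the question, one possible approach is to apply grepl() to each column rather than passing the whole data frame to gsub(). The sketch below is only an illustration: the data frame contents are invented to stand in for x.csv, and the names has_link, cleaned and file_na are hypothetical. Note that a regex such as "LINK" matches any cell containing that text, which is broader than the exact comparison file == "LINK" in the original post.

```r
# Hypothetical data standing in for x.csv (assumption, not from the thread).
file <- data.frame(a = c("x1", "x2", "x3"),
                   b = c("y1", "see LINK here", "y3"),
                   c = c("z1", "z2", "LINK"),
                   stringsAsFactors = FALSE)

# gsub() coerces its input with as.character(), so a data frame collapses
# to one string per column instead of being processed cell by cell.
collapsed <- gsub(".*LINK.*", "NA", file)

# Apply the regex column by column: TRUE where any cell in the column matches.
has_link <- sapply(file, function(col) any(grepl("LINK", col)))

# Keep only the columns with no match; drop = FALSE keeps a data frame
# even when a single column survives.
cleaned <- file[, !has_link, drop = FALSE]

# To replace matching cells with a real NA (not the string "NA"):
file_na <- file
file_na[] <- lapply(file_na, function(col) {
  col[grepl("LINK", col)] <- NA
  col
})
```

With this toy data, collapsed has one element per column, which is one reason the original gsub() call does not return anything resembling the input. The NA-based deletion loop from the reply above could also be written without a loop as x.df[, !sapply(x.df, anyNA), drop = FALSE], assuming R 3.1.0 or later for anyNA().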