samarvir singh
2015-Feb-12 04:57 UTC
[R] How to subset data, by sorting names alphabetically.
Hello,

I am cleaning a large data set with 4 million observations and 7 variables. One of the 7 variables is a name (a string).

I want to subset the data so that rows with the same name are grouped together.

Example:

Name var1 var2 var3 var4 var5 var6
aa   -    -    -    -    -    -
ab
bd
ac
ad
af
ba
bd
aa
av

I want to sort the data into something like this:

aa
aa
(all aa in the same subset)

and all ab in the same subset, so that every row with the same name is in one subset.

Thanks in advance. I am new to the R community and appreciate your help.

- Samarvir
Hi Samarvir,

Assuming that you want to generate a separate data frame for each value of "Name":

    # name of initial data frame is ssdf
    for (nameval in unique(ssdf$Name)) assign(nameval, ssdf[ssdf$Name == nameval, ])

This will produce as many data frames as there are unique values of ssdf$Name, each named after the Name value it contains.

Jim

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
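A small self-contained sketch of the loop approach on a toy data frame may help when trying it out. The name `ssdf` follows the message above; the values and the single `var1` column are invented for illustration. The rows are first ordered alphabetically by `Name` with `order()`, which is an added step not in the original reply, and then one data frame per name is created with `assign()`:

```r
# Toy data; the values are invented for illustration only
ssdf <- data.frame(
  Name = c("aa", "ab", "bd", "ac", "aa", "bd"),
  var1 = 1:6,
  stringsAsFactors = FALSE
)

# Sort rows alphabetically by Name so that equal names sit together
ssdf <- ssdf[order(ssdf$Name), ]

# Create one data frame per unique name in the current environment
for (nameval in unique(ssdf$Name)) {
  assign(nameval, ssdf[ssdf$Name == nameval, ])
}

nrow(aa)  # number of rows whose Name is "aa"
```

Note that this creates one top-level object per unique name, so a data set with many distinct names will fill the workspace with many objects.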
The split function does essentially this, but puts the results into a list rather than using the dangerous and messy assign function. The overall syntax is simpler as well.

On Thu, Feb 12, 2015 at 3:14 AM, Jim Lemon <drjimlemon at gmail.com> wrote:
> # name of initial data frame is ssdf
> for(nameval in unique(ssdf$Name)) assign(nameval,ssdf[ssdf$Name==nameval,])
--
Gregory (Greg) L. Snow Ph.D.
538280 at gmail.com
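As a rough illustration of the `split()` suggestion, here is a minimal sketch on a toy data frame. The name `ssdf` follows the thread; the values, the single `var1` column, and the name `by_name` are invented for illustration:

```r
# Toy data; the values are invented for illustration only
ssdf <- data.frame(
  Name = c("aa", "ab", "bd", "ac", "aa", "bd"),
  var1 = 1:6,
  stringsAsFactors = FALSE
)

# split() returns a named list with one data frame per unique Name;
# the list elements follow the sorted order of the names
by_name <- split(ssdf, ssdf$Name)

names(by_name)    # the group names
by_name[["aa"]]   # all rows whose Name is "aa"

# Work on every subset without creating separate top-level objects
sapply(by_name, nrow)
```

Keeping the subsets in one list means they can be processed with `lapply()` or `sapply()`, and no objects are created whose names depend on the data.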