I'm trying to sort a DATAFRAME by a column "ID" that contains alphanumeric data. Specifically, "ID" contains integers all preceded by the character "g", as in:

g1, g6, g3, g19, g100, g2, g39

I am using the following code:

DATAFRAME=DATAFRAME[order(DATAFRAME1$ID),]

and was hoping it would sort the dataframe by ID in the following manner:

g1, g2, g3, g6, g19, g39, g100

but it doesn't sort at all. Could anyone point out my mistake?

Thank you.

Mark
This was just discussed recently. Try:

  library(gtools)
  ?mixedorder

On 2/24/06, mtb954 mtb954 <mtb954 at gmail.com> wrote:
> but it doesn't sort at all. Could anyone point out my mistake?
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
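For anyone following the gtools pointer, here is a minimal sketch of how mixedorder() might be applied to the data frame from the question. The "value" column is hypothetical filler, and the requireNamespace() guard is only there so the snippet does nothing when gtools is not installed.

```r
# Data frame built from the IDs in the question; 'value' is a hypothetical column.
DATAFRAME <- data.frame(
  ID = c("g1", "g6", "g3", "g19", "g100", "g2", "g39"),
  value = 1:7,
  stringsAsFactors = FALSE
)

if (requireNamespace("gtools", quietly = TRUE)) {
  # mixedorder() returns a permutation that compares embedded numbers numerically
  sorted <- DATAFRAME[gtools::mixedorder(DATAFRAME$ID), ]
  print(sorted$ID)
}
```

gtools also provides mixedsort(), which returns the sorted vector itself rather than the ordering permutation.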
On Fri, 2006-02-24 at 12:54 -0600, mtb954 mtb954 wrote:
> but it doesn't sort at all. Could anyone point out my mistake?

The values are being sorted by character-based ordering, which may be affected by your locale. Thus, on my system, you end up with something like the following:

  > ID <- c("g1", "g6", "g3", "g19", "g100", "g2", "g39")
  > ID[order(ID)]
  [1] "g1"   "g100" "g19"  "g2"   "g3"   "g39"  "g6"

What you can do, based upon the presumption that the prefix of 'g' is present as you describe above, is:

  > ID[order(as.numeric(gsub("g", "", ID)))]
  [1] "g1"   "g2"   "g3"   "g6"   "g19"  "g39"  "g100"

What this does is use gsub() to strip the 'g' and then order by numeric value.

HTH,

Marc Schwartz
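Carrying the same idea over to the whole data frame could look roughly like the sketch below. The "score" column and the num_id helper variable are made up for illustration. Note also that the posted line indexes DATAFRAME using order(DATAFRAME1$ID); if DATAFRAME1 is not a deliberate second object, that mismatch may be worth checking as well.

```r
# Example data frame; the 'score' column is hypothetical.
DATAFRAME <- data.frame(
  ID = c("g1", "g6", "g3", "g19", "g100", "g2", "g39"),
  score = c(0.5, 1.2, 3.4, 0.1, 2.2, 4.0, 1.9),
  stringsAsFactors = FALSE
)

# Strip the leading "g" (anchored with ^ so only the prefix is removed),
# then convert what is left to numbers. as.character() covers the case
# where ID is stored as a factor.
num_id <- as.numeric(sub("^g", "", as.character(DATAFRAME$ID)))

# Reorder the rows of the same data frame by the numeric part of ID.
DATAFRAME <- DATAFRAME[order(num_id), ]
print(DATAFRAME$ID)
```

Using the same object name on both sides of the assignment and inside order() keeps the row index and the data frame in step.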
Does this help?
  > ID <- paste("g", sample(1:100, 100, replace=FALSE), sep="")
  > ID
   [1] "g88"  "g5"   "g79"  "g67"  "g43"  "g21"  "g66"
   [8] "g9"   "g38"  "g86"  "g12"  "g85"  "g74"  "g34"
  [15] "g52"  "g95"  "g6"   "g22"  "g70"  "g87"  "g7"
  [22] "g83"  "g63"  "g42"  "g26"  "g65"  "g16"  "g97"
  [29] "g76"  "g2"   "g90"  "g23"  "g15"  "g82"  "g75"
  [36] "g58"  "g17"  "g20"  "g96"  "g91"  "g31"  "g33"
  [43] "g48"  "g32"  "g93"  "g54"  "g49"  "g36"  "g81"
  [50] "g57"  "g27"  "g14"  "g62"  "g10"  "g80"  "g71"
  [57] "g28"  "g37"  "g89"  "g8"   "g94"  "g68"  "g56"
  [64] "g92"  "g41"  "g11"  "g4"   "g99"  "g55"  "g60"
  [71] "g18"  "g69"  "g19"  "g64"  "g39"  "g1"   "g53"
  [78] "g44"  "g24"  "g100" "g35"  "g3"   "g40"  "g47"
  [85] "g51"  "g46"  "g61"  "g45"  "g50"  "g25"  "g13"
  [92] "g73"  "g77"  "g30"  "g84"  "g78"  "g29"  "g59"
  [99] "g72"  "g98"

  > ID[order(as.numeric(substr(ID, start=2, stop=nchar(ID))))]
   [1] "g1"   "g2"   "g3"   "g4"   "g5"   "g6"   "g7"
   [8] "g8"   "g9"   "g10"  "g11"  "g12"  "g13"  "g14"
  [15] "g15"  "g16"  "g17"  "g18"  "g19"  "g20"  "g21"
  [22] "g22"  "g23"  "g24"  "g25"  "g26"  "g27"  "g28"
  [29] "g29"  "g30"  "g31"  "g32"  "g33"  "g34"  "g35"
  [36] "g36"  "g37"  "g38"  "g39"  "g40"  "g41"  "g42"
  [43] "g43"  "g44"  "g45"  "g46"  "g47"  "g48"  "g49"
  [50] "g50"  "g51"  "g52"  "g53"  "g54"  "g55"  "g56"
  [57] "g57"  "g58"  "g59"  "g60"  "g61"  "g62"  "g63"
  [64] "g64"  "g65"  "g66"  "g67"  "g68"  "g69"  "g70"
  [71] "g71"  "g72"  "g73"  "g74"  "g75"  "g76"  "g77"
  [78] "g78"  "g79"  "g80"  "g81"  "g82"  "g83"  "g84"
  [85] "g85"  "g86"  "g87"  "g88"  "g89"  "g90"  "g91"
  [92] "g92"  "g93"  "g94"  "g95"  "g96"  "g97"  "g98"
  [99] "g99"  "g100"

The idea is to drop the leading "g", convert to numeric, and then
order.
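
The drop-the-prefix, convert, then order steps could also be wrapped in a small helper, sketched here under the assumption that every ID is a run of non-digit characters followed by digits. The name order_by_number and the seed are hypothetical and not part of any package.

```r
# Hypothetical helper: order strings by the number that follows a non-digit prefix.
order_by_number <- function(x) {
  # Remove everything up to the first digit, e.g. "g19" -> "19"
  digits <- sub("^[^0-9]*", "", as.character(x))
  # Values that are not purely numeric after stripping become NA (with a warning)
  order(as.numeric(digits))
}

set.seed(42)  # arbitrary seed, only so the shuffle is reproducible
ID <- paste0("g", sample(1:100))

sorted <- ID[order_by_number(ID)]

# Check that the result is g1, g2, ..., g100
stopifnot(identical(sorted, paste0("g", 1:100)))
```

Because the pattern strips any non-digit prefix, the same helper would also handle IDs such as "id7" or "sample12", but not IDs with digits in the middle of the prefix.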
--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 452-1424 (M, W, F)
fax: (917) 438-0894