thr3ads.net - R help - [R] Using indexing to manipulate data [Mar 2010]

duncandonutz

2010-Mar-18 05:05 UTC

[R] Using indexing to manipulate data

I know one of R's advantages is its ability to index, eliminating the need
for control loops to select relevant data, so I thought this problem would
be easy.  I can't crack it.  I have looked through past postings, but
nothing seems to match this problem.

I have a data set with one column of actors and one column of acts.  I need
a list that will give me a pair of actors in each row, provided they both
participated in the act.

Example:

The Data looks like this:
Jim         A
Bob        A
Bob        C 
Larry      D
Alice      C
Tom       F
Tom       D
Tom       A  
Alice      B
Nancy    B

I would like this:
Jim      Bob
Jim      Tom
Bob     Alice
Larry   Tom
Alice    Nancy

The order doesn't matter (Jim-Bob vs. Bob-Jim), but each pairing should be
counted only once.
Thanks!

-- 
View this message in context:
http://n4.nabble.com/Using-indexing-to-manipulate-data-tp1597547p1597547.html
Sent from the R help mailing list archive at Nabble.com.

Dimitris Rizopoulos

2010-Mar-18 08:22 UTC

One approach is the following:

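# Read the example data; text= avoids opening (and having to close) a connection
Dat <- read.table(text =
"Jim         A
Bob        A
Bob        C
Larry      D
Alice      C
Tom       F
Tom       D
Tom       A
Alice      B
Nancy    B", col.names = c("name", "act"), stringsAsFactors = FALSE)


# For each act, build a two-column matrix of all actor pairs;
# an act with a single actor contributes no pairs
out <- tapply(Dat$name, Dat$act, function (x) {
     if (length(x) < 2) NULL else t(combn(x, 2))
})
# Stack the per-act matrices and drop repeated rows
unique(do.call(rbind, out))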
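
One caveat: unique() on the stacked matrix only drops rows that are identical in both columns, so the same two actors appearing in opposite order under two different acts would be kept twice. A minimal sketch of one way around that, assuming each pair is put in alphabetical order before deduplicating (the helper name pairs_in_act and the column names actor1/actor2 are made up for illustration):

```r
# Example data from the original post (actor, act).
Dat <- data.frame(
  name = c("Jim", "Bob", "Bob", "Larry", "Alice",
           "Tom", "Tom", "Tom", "Alice", "Nancy"),
  act  = c("A", "A", "C", "D", "C", "F", "D", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: all pairs within one act. Sorting first means
# combn() always emits the alphabetically smaller name in column 1,
# so Bob-Jim and Jim-Bob end up as the same row.
pairs_in_act <- function(x) {
  x <- sort(unique(x))
  if (length(x) < 2) return(NULL)   # a lone actor forms no pair
  t(combn(x, 2))
}

per_act <- lapply(split(Dat$name, Dat$act), pairs_in_act)
pairs <- unique(do.call(rbind, unname(per_act)))
colnames(pairs) <- c("actor1", "actor2")
pairs
```

On the example data this gives six pairs rather than the five listed in the question, because Bob and Tom also share act A.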


I hope it helps.

Best,
Dimitris


-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014

Jim Lemon

2010-Mar-18 08:32 UTC

Hi duncandonutz,
Try this:

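# names.dat holds the two-column data from the question (actor, act)
actnames<-read.table("junkfunc/names.dat",stringsAsFactors=FALSE)
actorpairs<-NULL
# loop over the acts, collecting every pair of actors within each act
for(act in unique(actnames$V2)) {
  actors<-actnames$V1[actnames$V2 == act]
  nactors<-length(actors)
  # an act with a single actor yields no pairs
  if(nactors > 1) {
   # each column of indices is one pair of positions in actors
   indices<-combn(nactors,2)
   for(i in seq_len(ncol(indices)))
    actorpairs<-
     rbind(actorpairs,c(actors[indices[1,i]],actors[indices[2,i]]))
  }
}
actorpairs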
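
Since the question asked for an indexing-based approach, here is a loop-free sketch of the same idea, not taken from any of the replies. It assumes the data are typed in directly rather than read from a file, and the names inc, shared and pairs are made up for illustration. It builds an actor-by-act incidence matrix, multiplies it by its own transpose to count shared acts, and indexes the upper triangle of the result:

```r
# Example data from the original post (actor, act).
DF <- data.frame(
  Actor = c("Jim", "Bob", "Bob", "Larry", "Alice",
            "Tom", "Tom", "Tom", "Alice", "Nancy"),
  Act   = c("A", "A", "C", "D", "C", "F", "D", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Incidence matrix: 1 where the actor took part in the act, else 0.
inc <- (unclass(table(DF$Actor, DF$Act)) > 0) + 0
actors <- rownames(inc)

# shared[i, j] is the number of acts that actors i and j have in common.
shared <- tcrossprod(inc)

# Keep only the upper triangle so each unordered pair appears once
# and no actor is paired with themselves.
idx <- which(shared > 0 & upper.tri(shared), arr.ind = TRUE)
pairs <- data.frame(actor1 = actors[idx[, "row"]],
                    actor2 = actors[idx[, "col"]],
                    stringsAsFactors = FALSE)
pairs
```

The incidence matrix has one cell per actor-act combination, so this suits small to moderate data sets; for large ones the per-act combn() approach is likely to use less memory.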

Jim

Gabor Grothendieck

2010-Mar-18 08:41 UTC

Here are two solutions.  The first uses merge and the second uses
sqldf.  They both do a self join picking off the unique pairs.  The
sqldf solution also sorts the result:

# input
DF <- data.frame(
  Actor = c("Jim", "Bob", "Bob", "Larry", "Alice",
            "Tom", "Tom", "Tom", "Alice", "Nancy"),
  Act = c("A", "A", "C", "D", "C", "F", "D", "A", "B", "B"),
  stringsAsFactors = FALSE)

# self join on Act (column 2); Actor.x < Actor.y keeps each pair in one order only
subset(unique(merge(DF, DF, by = 2)), Actor.x < Actor.y)

library(sqldf) # see http://sqldf.googlecode.com
# same self join in SQL, returning both actors of each pair and their shared act
sqldf("select A.Actor as Actor1, B.Actor as Actor2, A.Act
	from DF A join DF B
	where A.Act = B.Act and A.Actor < B.Actor
	order by A.Act, A.Actor")
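
Both versions keep the act in the result, so two actors who share more than one act come back once per shared act. If each pair should be counted only once regardless of how many acts they share, one option is to drop the act column before calling unique(). A minimal base R sketch, assuming a made-up extra act "G" shared by Jim and Bob purely to show the effect (DF2, joined and pairs are illustrative names):

```r
# Example data from the original post (actor, act).
DF <- data.frame(
  Actor = c("Jim", "Bob", "Bob", "Larry", "Alice",
            "Tom", "Tom", "Tom", "Alice", "Nancy"),
  Act   = c("A", "A", "C", "D", "C", "F", "D", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Hypothetical extra rows: Jim and Bob also share a second act, "G".
DF2 <- rbind(DF, data.frame(Actor = c("Jim", "Bob"), Act = "G",
                            stringsAsFactors = FALSE))

# Self join on Act; merge() names the result columns Act, Actor.x, Actor.y.
joined <- merge(DF2, DF2, by = "Act")

# Keep one ordering of each pair, drop Act, then deduplicate on names only.
joined <- joined[joined$Actor.x < joined$Actor.y, c("Actor.x", "Actor.y")]
pairs <- unique(joined)
pairs
```

With the act column removed, Bob-Jim appears once even though the two share acts A and G.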


