Matt
2009-Feb-28 14:05 UTC
[R] arules: rules are built on item ordering in the dataframe, rather than
Hi,

I'm trying out the package arules and I'm having a bit of trouble getting my data to work properly. I have a set of transactions with the purchased products, but each product could appear in a different column in the data frame. This causes the rules to be built based on the ordering, which is not significant. Here is an example:

## Code:
my.df <- data.frame(
  transaction = as.factor(1:4),
  item1 = c("a", "b", "c", "d"),
  item2 = c("e", "a", "f", "b"),
  item3 = c("h", "i", "b", "a"))

# Create transactions
library(arules)
my.trans <- as(my.df[, 2:4], "transactions")

# Create Rules
rules <- apriori(my.trans, parameter = list(support = .01, confidence = 0.6))
inspect(rules)
## End code

I'd like the confidence to be high for a -> b or b -> a (they appear together in each transaction) regardless of *where* they appear. This example gives the expected results:

## Working example:
my.df2 <- data.frame(
  transaction = as.factor(1:4),
  a = rep("a", 4),
  b = rep("b", 4),
  c = c(NA, "c", NA, NA),
  d = c(NA, NA, "d", "d"))
my.trans2 <- as(my.df2[, 2:5], "transactions")
rules2 <- apriori(my.trans2, parameter = list(support = .01, confidence = 0.6))
inspect(rules2)
## End code

I can't figure out how to coerce my data frame into this format (or if this is the best way to accomplish my objective). I appreciate your help.

Thanks,
Matt
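One possible way to get position-independent items, offered as a sketch rather than the definitive answer: coercing a data frame to "transactions" treats every column as its own categorical variable, so "a" in item1 and "a" in item2 become two different items. Coercing a list of item vectors (one vector per transaction) avoids that. The names item.cols and item.list below are made up for illustration, and the arules step is guarded so the data-reshaping part still runs where the package is not installed.

```r
# Same data as in the post; keep the items as plain strings
my.df <- data.frame(
  transaction = as.factor(1:4),
  item1 = c("a", "b", "c", "d"),
  item2 = c("e", "a", "f", "b"),
  item3 = c("h", "i", "b", "a"),
  stringsAsFactors = FALSE)

# Hypothetical helper name: the columns that hold purchased items
item.cols <- c("item1", "item2", "item3")

# Collapse the item columns into one character vector per transaction,
# dropping NAs and duplicates so the column position no longer matters
item.list <- lapply(seq_len(nrow(my.df)), function(i) {
  items <- as.character(unlist(my.df[i, item.cols], use.names = FALSE))
  unique(items[!is.na(items)])
})
names(item.list) <- as.character(my.df$transaction)

# A list of item vectors can be coerced to "transactions" directly
if (requireNamespace("arules", quietly = TRUE)) {
  library(arules)
  my.trans <- as(item.list, "transactions")
  rules <- apriori(my.trans, parameter = list(support = .01, confidence = 0.6))
  inspect(rules)
}
```

With this layout each item is identified by its value alone, so rules such as {a} => {b} are counted across all columns. Note that in this particular data set a and b occur together in only two of the four transactions, so the confidence will reflect that.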