I have the following data set, representing the estimated number of some event (est) when the actual number was 3, 4, ..., 15. The numbers in the cells are the observed *frequencies* of each combination of (actual, estimated), so each column (a3 -- a15) gives a single discrete frequency distribution. Thus, when the actual number was 6, the estimated values were 5, 6, 7 with frequencies 7, 120, 20. The NAs could be taken as 0s.

> jevons <- read.csv("C:/Documents/milestone/papers/Jevons/jevons.csv")
> jevons
   est a3 a4  a5  a6  a7 a8 a9 a10 a11 a12 a13 a14 a15
1    3 23 NA  NA  NA  NA NA NA  NA  NA  NA  NA  NA  NA
2    4 NA 65  NA  NA  NA NA NA  NA  NA  NA  NA  NA  NA
3    5 NA NA 102   7  NA NA NA  NA  NA  NA  NA  NA  NA
4    6 NA NA   4 120  18 NA NA  NA  NA  NA  NA  NA  NA
5    7 NA NA   1  20 113 30  2  NA  NA  NA  NA  NA  NA
6    8 NA NA  NA  NA  25 76 24   6   1  NA  NA  NA  NA
7    9 NA NA  NA  NA  NA 28 76  37  11   1  NA  NA  NA
8   10 NA NA  NA  NA  NA  1 18  46  19   4  NA  NA  NA
9   11 NA NA  NA  NA  NA NA  2  16  26  17   7   2  NA
10  12 NA NA  NA  NA  NA NA NA   2  12  19  11   3   2
11  13 NA NA  NA  NA  NA NA NA  NA  NA   3   6   3   1
12  14 NA NA  NA  NA  NA NA NA  NA  NA   1   1   4   6
13  15 NA NA  NA  NA  NA NA NA  NA  NA  NA   1   2   2

I'd like to make plots of (x=actual, y=estimated) and (x=actual, y=estimated-actual) showing these frequency distributions, but I'm not sure how to make a plot that shows these distributions clearly, since the values are frequencies.

To start off, I converted the table to a data frame, excluding the NAs:

jevons.df <- matrix(0, 0, 3)
colnames(jevons.df) <- c("actual", "estimated", "frequency")
for(i in 1:nrow(jevons)) {
  estimated <- i+2
  for (j in 2:14) {
    actual <- j+1
    freq <- jevons[i,j]
    if (! is.na(freq)) jevons.df <- rbind(jevons.df, c(actual, estimated, freq))
  }
}

giving

> jevons.df
     actual estimated frequency
[1,]      3         3        23
[2,]      4         4        65
[3,]      5         5       102
[4,]      6         5         7
[5,]      5         6         4
[6,]      6         6       120
[7,]      7         6        18
[8,]      5         7         1
 ...

-Michael

--
Michael Friendly     Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele Street    http://www.math.yorku.ca/SCS/friendly.html
Toronto, ONT  M3J 1P3 CANADA

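The double loop above can also be written without explicit iteration. The following is only a sketch under stated assumptions: the table is re-entered inline with read.table() (the CSV path in the post is not available here), and the extra column name `error` is a hypothetical label for estimated - actual, which the second requested plot would need.

```r
# Re-enter the table from the post; NA marks combinations never observed.
jevons <- read.table(header = TRUE, text = "
est a3 a4  a5  a6  a7 a8 a9 a10 a11 a12 a13 a14 a15
  3 23 NA  NA  NA  NA NA NA  NA  NA  NA  NA  NA  NA
  4 NA 65  NA  NA  NA NA NA  NA  NA  NA  NA  NA  NA
  5 NA NA 102   7  NA NA NA  NA  NA  NA  NA  NA  NA
  6 NA NA   4 120  18 NA NA  NA  NA  NA  NA  NA  NA
  7 NA NA   1  20 113 30  2  NA  NA  NA  NA  NA  NA
  8 NA NA  NA  NA  25 76 24   6   1  NA  NA  NA  NA
  9 NA NA  NA  NA  NA 28 76  37  11   1  NA  NA  NA
 10 NA NA  NA  NA  NA  1 18  46  19   4  NA  NA  NA
 11 NA NA  NA  NA  NA NA  2  16  26  17   7   2  NA
 12 NA NA  NA  NA  NA NA NA   2  12  19  11   3   2
 13 NA NA  NA  NA  NA NA NA  NA  NA   3   6   3   1
 14 NA NA  NA  NA  NA NA NA  NA  NA   1   1   4   6
 15 NA NA  NA  NA  NA NA NA  NA  NA  NA   1   2   2
")

freq <- as.matrix(jevons[-1])                 # 13 x 13 matrix, columns a3..a15
idx  <- which(!is.na(freq), arr.ind = TRUE)   # (row, col) of the observed cells

jevons.df <- data.frame(
  actual    = idx[, "col"] + 2,               # column 1 of freq is a3
  estimated = jevons$est[idx[, "row"]],       # row k corresponds to est[k]
  frequency = freq[idx]                       # matrix indexing by (row, col)
)

# 'error' is an illustrative name, not from the original post.
jevons.df$error <- jevons.df$estimated - jevons.df$actual

head(jevons.df)
```

Note that which() walks the matrix column by column, so the rows come out ordered by actual and then by estimated, which differs from the row order the loop produces. The result is a true data frame rather than a matrix, so its columns could be handed directly to a plotting function that sizes symbols by frequency.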
Try this (possibly after scaling the rows or columns to 1):

library(gplots)
with(as.data.frame.table(as.matrix(jevons[-1])), balloonplot(Var1, Var2, Freq))

On Thu, Oct 8, 2009 at 9:24 AM, Michael Friendly <friendly at yorku.ca> wrote:
> I'd like to make plots of (x=actual, y=estimated) and (x=actual,
> y=estimated-actual), showing these frequency distributions,
> but I'm not sure how to make a plot that shows these distributions clearly,
> since the values are frequencies.

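The reply above mentions scaling the rows or columns to 1 before plotting. One way this might be done in base R is sketched below, using only the first four columns of the posted table (a3 to a6, whose non-NA cells all fall in rows est 3 to 7). Treating NA as 0 follows the original post's remark; the object names `m` and `col.prop` are made up for illustration.

```r
# Columns a3..a6 of the posted table, rows est = 3..7.
# These four columns are complete: every non-NA cell of a3..a6 is included.
m <- matrix(c(23, NA,  NA,  NA,
              NA, 65,  NA,  NA,
              NA, NA, 102,   7,
              NA, NA,   4, 120,
              NA, NA,   1,  20),
            nrow = 5, byrow = TRUE,
            dimnames = list(est = 3:7, actual = c("a3", "a4", "a5", "a6")))

m[is.na(m)] <- 0                        # treat unobserved cells as zero counts

# margin = 2 divides each cell by its column total, so every column
# becomes the distribution of estimates for one actual value.
col.prop <- prop.table(m, margin = 2)

round(col.prop, 3)
colSums(col.prop)                       # each column should sum to 1
```

Scaling by column (margin = 2) makes the distributions for different actual counts comparable even though the number of trials per actual count differs; margin = 1 would scale by row instead.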
On 10/09/2009 12:24 AM, Michael Friendly wrote:
>    est a3 a4  a5  a6  a7 a8 a9 a10 a11 a12 a13 a14 a15
> 1    3 23 NA  NA  NA  NA NA NA  NA  NA  NA  NA  NA  NA
> 2    4 NA 65  NA  NA  NA NA NA  NA  NA  NA  NA  NA  NA
> 3    5 NA NA 102   7  NA NA NA  NA  NA  NA  NA  NA  NA
> 4    6 NA NA   4 120  18 NA NA  NA  NA  NA  NA  NA  NA
> 5    7 NA NA   1  20 113 30  2  NA  NA  NA  NA  NA  NA
> 6    8 NA NA  NA  NA  25 76 24   6   1  NA  NA  NA  NA
> 7    9 NA NA  NA  NA  NA 28 76  37  11   1  NA  NA  NA
> 8   10 NA NA  NA  NA  NA  1 18  46  19   4  NA  NA  NA
> 9   11 NA NA  NA  NA  NA NA  2  16  26  17   7   2  NA
> 10  12 NA NA  NA  NA  NA NA NA   2  12  19  11   3   2
> 11  13 NA NA  NA  NA  NA NA NA  NA  NA   3   6   3   1
> 12  14 NA NA  NA  NA  NA NA NA  NA  NA   1   1   4   6
> 13  15 NA NA  NA  NA  NA NA NA  NA  NA  NA   1   2   2

Hi Michael,
You can get a visual representation of this by feeding your matrix (here called mfdf) to color2D.matplot (plotrix) like this, although it may not be what you want.

library(plotrix)
color2D.matplot(mfdf, extremes=c("lightgray","black"),
  show.values=TRUE, xlab="Actual count", ylab="Estimate",
  main="Actual vs estimate", axes=FALSE)
axis(1, at=0.5:12.5, labels=3:15)
axis(2, at=0.5:12.5, labels=15:3)

Jim