thr3ads.net - similar to: "Why does aggregate fail?"

Displaying 20 results from an estimated 60000 matches similar to: "Why does aggregate fail?"

2010 Feb 02

Writing out csv files

In my code, I calculate the maximum values with 2 factors using maxr=with(arrdf, tapply(rate,list(weekday,quarter), max, na.rm=T)) and I want to write out the file so that Excel can read it. I used write.table(maxr, fname, sep=",", col.names=TRUE, row.names=TRUE, quote=TRUE, na="0") which works, and yields something like

2010 Apr 01

pdf files in loops

I need to make a bunch of PDF files of histograms. I tried gatelist = unique(mdf$ArrivalGate) for( gate in gatelist) { outfile = paste("../", airport, "/", airport, "taxiHistogram", gate, ".pdf", sep="") pdf(file = outfile, width = 10, height=8, par(lwd=1)) title=paste("Taxi time for Arrival Gate", gate, "by

2010 Jan 18

Using the output of strsplit

I successfully combined my data frames, and am now on my next hurdle. I had combined the date and quarter, and used tapply to count the entries for each unique date/quarter pair. ar= tapply(ewrgnd$gw, list(ewrgnd$dq), sum) #for each date/quarter combination sums the gw (which are all 1) dq=row.names(ar) spl=strsplit(dq) But I need to split them back into the separate date and quarter. So I used

2010 Jan 16

Comparing dates in dataframes

I have two data frames. One (arr) has all arrivals to an airport for a year, and the other (gw) has the dates and quarter hour of the day when the weather is good. arr has a Date and quarter hour column. >names(arr) [1] "Date" "weekday" "hour" "month" "minute" [6] "quarter" "ICAO"

2010 Apr 16

bwplot puts the bars in the wrong place

Dear R-Help, With the attached data set, I am still getting incorrect bwplots > xyplot(gdf$tt~gdf$OnHour |gdf$Runway, data=gdf) # Is correct > bwplot(gdf$tt~gdf$OnHour |gdf$Runway, data=gdf, horizontal=FALSE) # Puts the boxes on the wrong x-axis values # look especially at 0 and 3. How do I fix this? What is happening? Thanks, Jim Rome

2006 Apr 24

Can you improve on this code?

# File app/models/timesheet.rb, line 27 27: def totals 28: totals = Hash.new 29: totals["Monday"] = totals["Tuesday"] = totals["Wednesday"] = totals["Thursday"] = totals["Friday"] = totals["Saturday"] = totals["Sunday"] = totals["Totals"]=0 #initialise all to zero 31: 32: for item in

2010 Mar 20

How to select a row from one dataframe that is "close" to a row in another dataframe

I have two data frames of flight data, but they have very different numbers of rows. They come from different sources, so the data are not identical. > names(oooi) [1] "FltOrigDt" "MkdCrrCd" [3] "MkdFltNbr" "DprtTrpnStnCd" [5] "ArrTrpnStnCd" "ActualOutLocalTimestamp"

2011 May 25

Importing fixed-width data

I have a data set where the lines look like: 2011-05-13 00:00:00 EONAAL330 dfa13002516PSCNONA 2011-05-13 00:00:01 EONAAL223 laa13044510AS.NONM Some lines are missing the field before and after the NON: 2011-05-13 00:00:05 EONBHS229 mia13001621NON I read them into R using df = read.fwf(file, widths=c(19,-4,7,3,8,2,1,3,1),

2011 Aug 14

Not sure how to use aggregate, colSums, by

I have a data frame called test, shown below, that I would like to summarize in a particular way: I want to show the column sums (columns y, f) grouped by country (column e1). However, I'm looking for the data to be split according to column e2. In other words, two tables of sums by country: one table for "con" and one table for "std", as shown in column e2. Finally at the

2009 Dec 30

What am I doing wrong in my loops?

Dear kind list people: I have the following code: >hours [1] "0" "1" "2" "4" "5" "6" "7" "8" "9" "10" "11" "12" "13" "14" "15" [16] "16" "17" "18" "19" "20" "21" "22"

2003 Oct 22

High frequency time-series

Having to collect hourly electricity loads and quarter-of-an-hour electricity production data for some years I think that the tidiest way of doing it is to resort to ts but I don't know how to define such a frequency starting from a set date. Leafing through r-help mail archives I've found this *ALMOST* satisfactory message:

2009 Dec 24

How to separate a data set by its factors

I have a large data set of airport data and wish to analyze it by hour and day of the week. hour and day of the week are factors. I can do something such as: histogram(~(Arrival.Val) | DAY*Hour, type="count", breaks=60) which displays the data the way I want it in principle, but the plots are too small to read. I added layout=c(7,6,4) to the argument list, but then I only get the first

2011 Jun 10

Double x grid in ggplot2

I am trying to overlay raw data with a boxplot as follows: pp = qplot(factor(time, levels=0:60, ordered=TRUE), error, data=dfsub, size=I(1), main =" title", ylab="Error (min)", xlab="Time before ON (min)", alpha=I(1/10), ylim=c(-30,40), geom="jitter") + facet_wrap(~ runway, ncol=2) +

2011 Jul 17

How to speed up interpolation

df is a very large data frame with arrival estimates for many flights (df$flightfact) at random times (df$PredTime). The error of the estimate is df$dt. My problem is that I want to know the prediction error at each minute before landing. This code works, but is very slow, and dominates everything. I tried using split(), but that rapidly ate up my 12 GB of memory. So, is there a better R way of

2011 Jun 08

How to suppress factor labels

I am using ggplot2 to make a boxplot that overlays a scatterplot: pp = qplot(time, error, data=times, size=I(1), geom="jitter", main=title, ylab="Error (min)", xlab="Time before ON (min)", alpha=I(1/10), color=times$runway, ylim=c(-30,40)) pp2 = pp + with(times, facet_wrap(~ runway, ncol=2)) print(pp2 + geom_boxplot(alpha=.5,

2006 Sep 25

Beginner question: select cases

Hello all, I hope I chose the right list, as my question is a beginner question. I have a data set with 3 columns "London", "Rome" and "Vienna" - the location is indicated by a 1, like this: London Rome Vienna q1 0 0 1 4 0 1 0 2 1 0 0 3 .... .... .... I just want to calculate the means of a variable q1. I tried the following script: # calculate the mean

2011 Feb 24

Rome TW on Ubuntu 10.10 (maverick)

Hello, I was asked to provide info about my attempt to run Rome TW on my Linux system. Not sure what info exactly is sought, so please ask for additional information. The system I tried to use is a Dell Latitude D830 with 3 GB RAM and the latest BIOS version, A15. I've installed Ubuntu 10.10 32bit. I did install Compiz Fusion from the standard Ubuntu repos. Additionally, I installed Wine 1.2

2008 Sep 22

zoo: hourly values (local time) not unique

Hi! I've got a time series as a zoo object which contains hourly values. My problem is that these values occur in every "real" hour with regard to daylight saving time. That is, on the last Sunday in March I'll have 23 values, whereas the last Sunday in October contains 25 values instead of 24. Thus, if I try to aggregate the data using, for example, tapply (e.g. to get a monthly mean),

2012 Oct 25

trying ti use a function in aggregate

Hi - I am using R v 2.13.0. I am trying to use the aggregate function to calculate the percent at length for each Trip_id and CommonName. Here is a small subset of the data. Trip_id Vessel CommonName Length Count 1 230 Sunlight Shad,American 19 1 2 230 Sunlight Shad,American 20 1 3 230 Sunlight Shad,American 21

2009 Dec 26

Why do histogram bars vary their width?

histogram(~(Arrival4) | as.factor(Hour), type="count", breaks=16,ylab="Arrival Count", xlab="Arrival Rate/4",main="Friday EWR A22R D22L Configiration", layout=c(6,4), par.strip.text=list(cex=0.7)) Why do I get plots with different bar widths? See attached. Thanks, Jim Rome
