Displaying 20 results from an estimated 2000 matches similar to: "Discretizing data rows into regular intervals"
2009 May 05
3
Oracle-JRuby error
I am trying to migrate from RoR/MySQL to JRoR/Oracle. I am using Active
Record JDBC to talk to the database. The Migration process to create and
populate the database tables has been painful. My latest issue is the
method new_date is undefined in the JDBC adapter. I have the following
gems installed:
*** LOCAL GEMS ***
actionmailer (2.2.2)
actionpack (2.2.2)
activerecord (2.2.2)
2012 Jul 19
2
finding the values to minimize sum of functions
Hi fellow R users,
I am desperately hoping there is an easy way to do this in R.
Say I have three functions:
f(x) = x^2
f(y) = 2y^2
f(z) = 3z^2
constrained such that x+y+z=c (let c=1 for simplicity).
I want to find the values of x,y,z that will minimize f(x) + f(y) + f(z).
I know I can use the optim function when there is only one function, but
don't know how to set it up when there are
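
A minimal sketch of one way to set this up, not taken from the thread: substitute the constraint z = c - x - y into the objective so that optim() only searches over x and y. The names obj, fit and sol are illustrative assumptions, and c is fixed at 1 as in the question.

```r
# Objective with the constraint x + y + z = 1 substituted in (z = 1 - x - y).
obj <- function(p) {
  x <- p[1]
  y <- p[2]
  z <- 1 - x - y
  x^2 + 2 * y^2 + 3 * z^2
}

# Unconstrained search over (x, y); the starting values are arbitrary.
fit <- optim(c(0.3, 0.3), obj, method = "BFGS")

# Recover z from the constraint.
sol <- c(x = fit$par[1], y = fit$par[2], z = 1 - sum(fit$par))
print(sol)
```

Solving the Lagrange conditions by hand (2x = 4y = 6z) gives x = 6/11, y = 3/11, z = 2/11, which the numerical result should approximate.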
2011 Mar 05
3
subsetting data by specified observation number
Hi members,
I'd like to thank you guys ahead of time for the help. I'm kind of stuck.
I have a data frame with ID and position numbers:
1> head(failed.3)
         id position
1  10000997        2
4  1000RW_M        2
15 1006RW_G        2
24 1012RW_M        3
28 10160917        2
30 1016RW_M       13
I'd like to use this to subset out a large dataset and keep only the
observation
2012 Jul 24
9
Regular Expression
Hi--
I have three columns in an input file:
MONTH QUARTER YEAR
2012-07 2012-3 2012
2001-07 2001-3 2001
2002-01 2002-1 2002
I want to make output like so:
MONTH QUARTER YEAR
07 3 2012
07 3 2001
01 1 2002
I was having some trouble getting the regular expression to work. I think
it should
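
A minimal sketch, assuming the columns are read as character strings and that every MONTH and QUARTER value starts with a four-digit year followed by a hyphen. The names dat and strip_year are hypothetical and not from the thread.

```r
# Example data copied from the question.
dat <- data.frame(
  MONTH = c("2012-07", "2001-07", "2002-01"),
  QUARTER = c("2012-3", "2001-3", "2002-1"),
  YEAR = c(2012, 2001, 2002),
  stringsAsFactors = FALSE
)

# Drop a leading "YYYY-" prefix; strings without that prefix are left unchanged.
strip_year <- function(x) sub("^[0-9]{4}-", "", x)

dat$MONTH <- strip_year(dat$MONTH)
dat$QUARTER <- strip_year(dat$QUARTER)
print(dat)
```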
2002 May 07
2
Discretization of numeric attributes
Dear R-helpers:
I am interested in discretization methods for numerical attributes, as they
are reported in the 'machine learning' community. For example, the work of
Fayyad & Irani (IJCAI-93), Kononenko, entropy-based approaches, MDL
principle, the C4.5 approach, etc. I am especially interested in those
methods that take a factor as goal target into account for discretizing
2008 Jan 23
3
How to do more advanced cross tabulation in R?
Hi,
I am trying to reproduce some functionalities of Excel pivot table in R,
sadly, I couldn't figure out how to do it. I am wondering if this is even
possible in R. Does anyone know?
Here is an example:
year=rep(2003,16)
quarter=rep(1:4,each=4)
sales=1:16
company=rep(c("a","b","c","d"),4)
df=data.frame(year,quarter,sales,company) #this is the
2006 Feb 17
3
(Newbie) Functions on vectors
Folks,
I want to make the following function more efficient, by vectorizing it:
getCriterionDecisionDate <- function (quarter , year)
{
if (length(quarter) != length(year)) stop ("Quarter and year vectors of unequal length!");
ret <- character(0);
for (i in 1:length(quarter)) {
currQuarter <- quarter[i];
currYear <- year[i];
if ((currQuarter < 1) |
2010 Jan 16
3
Comparing dates in dataframes
I have two data frames. One (arr) has all arrivals to an airport for a
year, and the other (gw) has the dates and quarter hour of the day when
the weather is good. arr has a Date and quarter hour column.
>names(arr)
[1] "Date" "weekday" "hour" "month" "minute"
[6] "quarter" "ICAO"
2010 Feb 14
4
Newbie woes with *apply
Dataframe cust has Date-type column open.date. I wish to set up another
column, with (first day of) the quarter of open.date.
To be comprehensive (of course, improvement suggestions are welcome),
month = function(date)
{
return(as.numeric(format(date,"%m")))
}
first.day.of.month = function(date)
{
return(date + 1 - as.numeric(format(date,"%d")))
}
first.day.of.quarter =
2010 Jan 18
3
Using the output of strsplit
I successfully combined my data frames, and am now on my next hurdle.
I had combined the data and quarter, and used tapply to count the
entries for each unique date/quarter pair.
ar= tapply(ewrgnd$gw, list(ewrgnd$dq), sum) #for each date/quarter combination sums the gw (which are all 1)
dq=row.names(ar)
spl=strsplit(dq)
But I need to split them back into the separate date and quarter. So I
used
2009 Aug 13
3
Finding minimum of time subset
Dear List,
I have a data frame of data taken every few seconds. I would like to subset the data to retain only the data taken on the quarter hour, and as close to the quarter hour as possible. So far I have figured out how to subset the data to the quarter hour, but not how to keep only the minimum time for each quarter hour.
For example:
2009 Feb 19
2
table with 3 variables
I have the initial matrix:
> data.frame(Subject=rep(100:101, each=4), Quarter=rep(paste("Q",1:4,
sep=""),2), Boolean = rep(c("Y","N"),4))
  Subject Quarter Boolean
1     100      Q1       Y
2     100      Q2       N
3     100      Q3       Y
4     100      Q4       N
5     101      Q1       Y
6     101      Q2       N
7     101      Q3       Y
8     101
2004 Jul 26
1
group definition for a bootstrap
Hi,
This is probably really simple, but I am clearly not R-minded, I have read
the help files, and reread them, and I still can't work out what to do...
I have a data frame (d) with 3 columns (age (0-5), quarter (1-4) and x).
I want to estimate the precision of my mean x by age and quarter, so I want
to carry out a bootstrap for each group.
I am trying to do this within a loop, so I don't
2011 Jul 15
2
Convert continuous variable into discrete variable
Dear all,
I have a continuous variable that can take on values between 0 and 100, for
example: x<-runif(100,0,100)
I also have a second variable that defines a series of thresholds, for
example: y<-c(3, 4.5, 6, 8)
I would like to convert my continuous variable into a discrete one using the
threshold variables:
If x is between 0 and 3 the discrete variable should be 1
If x is between 3
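
A minimal sketch using base R's findInterval(), under the assumption (the snippet is cut off here) that the remaining intervals are numbered consecutively, so that values above the last threshold get code 5. The name x_discrete is illustrative.

```r
set.seed(1)                      # only to make the example reproducible
x <- runif(100, 0, 100)
y <- c(3, 4.5, 6, 8)

# findInterval() counts how many thresholds are <= x, so adding 1
# maps [0, 3) to 1, [3, 4.5) to 2, and so on up to [8, Inf) as 5.
x_discrete <- findInterval(x, y) + 1L
table(x_discrete)
```

With this choice a value exactly equal to a threshold falls into the higher interval; cut() with right = TRUE would be one way to get the opposite convention.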
2012 Mar 12
2
mapply & assign to generate functions
Hi,
I have a problem that I'm finding a bit tricky. I'm trying to use
mapply and assign to generate curried functions. For example, if I
have the function divide
divide <- function(x, y) {
x / y
}
And I want the end result to be functionally equivalent to:
half <- function(x) divide(x, 2)
third <- function(x) divide(x, 3)
quarter <- function(x) divide(x, 4)
But I want
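
A minimal sketch of one possible approach, not the thread's answer: build each function with a closure factory and collect the results in a named list instead of calling assign(). The names make_divider and dividers are hypothetical.

```r
divide <- function(x, y) {
  x / y
}

# Factory that fixes the divisor; force() evaluates d immediately so each
# closure keeps its own value despite lazy evaluation.
make_divider <- function(d) {
  force(d)
  function(x) divide(x, d)
}

# Map() keeps the names of its first argument, giving a named list of functions.
dividers <- Map(make_divider, c(half = 2, third = 3, quarter = 4))

dividers$half(10)
dividers$quarter(8)
```

If top-level functions are really needed, list2env(dividers, envir = globalenv()) would copy them out of the list, though keeping them in the list is usually easier to manage.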
2004 Jun 25
1
trouble using boot package
Hello,
I am trying to carry out a bootstrap analysis (using the boot package) on a
table and cannot work out how to get the results I need!
I have a table ("d2") with 4 columns: "ID_code", "Age", "Quarter" and
"StomWt". Age (0-5) and Quarter (1-4) are my strata
Therefore I wish to estimate the confidence intervals for the mean StomWt
for each Age
2009 Feb 19
2
table with 3 varialbes
I have the initial matrix:
> data.frame(Subject=rep(100:101, each=4), Quarter=rep(paste("Q",1:4,
sep=""),2), Boolean = rep(c("Y","N"),4))
  Subject Quarter Boolean
1     100      Q1       Y
2     100      Q2       N
3     100      Q3       Y
4     100      Q4       N
5     101      Q1       Y
6     101      Q2       N
7     101      Q3       Y
8     101
2010 Feb 02
2
Writing out csv files
In my code, I calculate the maximum values with 2 factors using
maxr=with(arrdf, tapply(rate,list(weekday,quarter), max, na.rm=T))
and I want to write out the file so that Excel can read it.
I used
write.table(maxr, fname, sep=",", col.names=TRUE, row.names=TRUE,
quote=TRUE, na="0")
which works, and yields something like
2006 Jul 25
3
Overplotting: plot() invocation looks ugly ... suggestions?
Hi WizaRds,
I'd like to overplot UK fuel consumption per quarter over the course of five years.
Sounds simple enough?
Unless I'm missing something, the following seems very involved for what I'm trying to do. Any suggestions on simplifications?
The way I did it is awkward mainly because of the first call to plot ... but isn't this necessary, especially to set limits for the
2008 Jun 05
7
Improving data processing efficiency
Hi everyone!
I have a question about data processing efficiency.
My data are as follows: I have a data set on quarterly institutional
ownership of equities; some of them have had recent IPOs, some have not
(I have a binary flag set). The total dataset size is 700k+ rows.
My goal is this: For every quarter since issue for each IPO, I need to
find a "matched" firm in the same