Displaying 20 results from an estimated 10000 matches similar to: "Splitting and saving separate dataframes"
2005 Aug 24
2
Remove NAs from Barplot
Dear List:
I'm creating a series of barplots using Sweave that must assume a
standard format. This is student achievement data and the x-axis must
include all grades 3 to 8. In some cases, the data for a grade (or more
than one grade) are missing in the vector math.bar, but are never
missing for the vector apmxpmeet. The following sample code illustrates
the issue.
Using the code below to
2004 Aug 06
1
Comparing rows in a dataframe
Hello
I have a longitudinal dataframe organized in the long format and would like to make comparisons between successive rows if certain conditions apply. Specifically, I have four variables of interest: grade, score, year, and schid, associated with each school with 3 measurements per school per grade, therefore the rows are temporally ordered and each school occupies multiple rows. For example,
2004 Nov 17
1
"Impossible to run" error message when using Sweave
Dear List:
I have a large dataset of multiple schools. My goal is to produce a
separate tex file for each school that plots some of the student
achievement scores. Essentially, the aim is to develop a custom report
for each school. To accomplish this, I have code for a loop that gets
sourced into R and then Sweaves the multiple files to create the
individual school reports.
Here is the code for
2004 Aug 03
2
attach data from tapply to dataframe
I am working with a longitudinal data set in the long format. This data
set has three observations per grade level per year. Here are the first
10 rows of the data frame:
>tenn.dat[1:10,]
   year  schid type grade gain  se new cohort
6  2001 100005    5     4 33.1 3.5   4      3
7  2002 100005    5     4 33.9 3.9   4      2
8  2003 100005    5     4 32.3 4.2   4      1
10 2001 100005
2004 Nov 28
1
paste command
In a previous post, I mentioned a loop being used to generate graphs. I have some sample code partially put together but have found one offending line of code that I cannot figure out what to do with.
I have one data frame called grade4. If I do something like
hist(grade4$math)
I get the appropriate chart.
Within the loop, however, I am doing this for multiple files and grades, so I use
2004 Aug 01
3
Creating dummy codes
Is there an efficient way to create a series of dummy codes from a single variable? For example, I have a variable, “grade” = {2, …, 12}. I want to create k-1 dummy codes for grade such that grade 2 is the base (i.e., grade 2 = 0).
I am hoping that the new variables can be labeled as grade.3, grade.4 etc. I'll then use
grade <- paste("grade.", 3:12, sep="") in
2004 May 21
2
Help with Plotting Function
Dear List:
I cannot seem to find a way to plot my data correctly. I have a small data frame with 6 total variables (x_1 ... x_6).
I am trying to plot x_1 against x_2 and x_3.
I have tried
plot(x_2, x_1) #obviously works fine
plot(x_3, x_1, add=TRUE) # Does not work. I keep getting error messages.
I would also like to add ablines to this plot.
I have experimented with a number of other
2010 Jun 28
2
Lattice and Beamer
Two things I think are some of the best developments in statistics and production are the lattice package and the beamer class for presentation in Latex. One thing I have not become very good at is properly sizing my visuals to look good in a presentation.
For instance, I have the following code that creates a nice plot (sorry, cannot provide reproducible data).
2004 Nov 28
1
Modifications to an abline
Dear List:
I am working to generate graphs for individual students that will be created through a series of loops in Sweave. Before doing so, I am still trying to design the graph. The code for creating the barplot is below with some sample datapoints just made up for now.
Ultimately, this chart will take data from an lme object using longitudinal student data. So, the dots represent the
2004 Aug 06
1
reshape (was: Comparing rows in a dataframe)
Hi all:
I solved the previously stated problem in something of a brute force way
(but it works). I seem to now be running into one little hiccup using
reshape. Here is a quick snip of the data in long format:
     grade stability year  schid
6  Grade 4         3 2001 100005
7  Grade 4         3 2002 100005
8  Grade 4         2 2003 100005
10 Grade 5         2 2001 100005
11 Grade 5
2005 May 09
0
Sweep statistics
Dear List:
I am wondering if there is a more efficient way to compute the
following. For the example I am using the star data frame in the mlmRev
package. This has 80 schools and includes grades K, 1, 2, and 3. First I
compute the grade level mean in each school using tapply as:
tapply(star$math, list(star$sch,star$gr), mean, na.rm=T)
This results in a table of means by school for each grade.
2005 Dec 01
1
Simulate Correlated data from complex sample
Dear List:
I have created some code to simulate data from a complex sample where
5000 students are nested in 50 schools. My code returns a dataframe with
a variable representing student achievement at a single time point. My
actual code for creating this is below.
What I would like to do is generate a second column of data that is
correlated with the first at .8 and has the same means within
2007 Sep 19
2
By() with method = spearman
I have a data set where I want the correlations between 2 variables
conditional on a student's grade level.
This code works just fine.
by(tmp[,c('mtsc07', 'DCBASmathscoreSPRING')], tmp$Grade, cor,
use='complete', method='pearson')
However, this generates an error
by(tmp[,c('mtsc07', 'DCBASmathscoreSPRING')], tmp$Grade, cor,
use='complete',
2004 Feb 18
2
Area between CDFs
Dear List:
I am trying to find the area between two ECDFs. I am examining the gap in performance between two groups, males and females on a student achievement test in math, which is a continuous metric.
I start by creating a subset of the dataframe
male <- subset(datafile, female == "Male")
female <- subset(datafile, female == "Female")
I then plot the two CDFs via
2005 Aug 29
1
ylim for graphic
Dear list:
I have some data for which I am generating a series of barplots for
percentages. One issue that I am dealing with is that I am trying to get
the legend to print in a fixed location for each chart generated by the
data. Because these charts are being created in a loop, with different
data, my code searches the data to identify the maximum value in the
data and then print the data values
2004 Nov 29
1
Labeling charts within a loop
Hi All:
This may turn out to be very simple, but I can't seem to add the name of
the school to a chart. The loop I created is below that subsets a
dataframe and creates a chart for each school based on certain
variables. As it stands now, the title includes the school's ID number.
Instead, I want to replace this with the school's actual name, which is
stored in a variable called
2009 Oct 21
1
formula and model.frame
Suppose I have the following function
myFun <- function(formula, data){
f <- formula(formula)
dat <- model.frame(f, data)
dat
}
Applying it with this sample data yields a new dataframe:
qqq <- data.frame(grade = c(3, NA, 3,4,5,5,4,3), score = rnorm(8), idVar = c(1:8))
dat <- myFun(score ~ grade, qqq)
However, what I would like is for the resulting dataframe (dat) to include
2018 Mar 13
2
Possible Improvement to sapply
FYI, in R devel (to become 3.5.0), there's isFALSE() which will cut
some corners compared to identical():
> microbenchmark::microbenchmark(identical(FALSE, FALSE), isFALSE(FALSE))
Unit: nanoseconds
                    expr min   lq    mean median     uq   max neval
 identical(FALSE, FALSE) 984 1138 1694.13 1218.0 1337.5 13584   100
          isFALSE(FALSE) 713  761 1133.53  809.5  871.5
2018 Mar 13
0
Possible Improvement to sapply
Quite possibly, and I'll look into that. Aside from the work I was doing, however, I wonder if there is a way such that sapply could avoid the overhead of having to call the identical function to determine the conditional path.
From: William Dunlap [mailto:wdunlap at tibco.com]
Sent: Tuesday, March 13, 2018 12:14 PM
To: Doran, Harold <HDoran at air.org>
Cc: Martin Morgan <martin.morgan
2018 Mar 13
1
Possible Improvement to sapply
You're right, it sure does. My suggestion causes it to fail when simplify = 'array'
From: William Dunlap [mailto:wdunlap at tibco.com]
Sent: Tuesday, March 13, 2018 12:11 PM
To: Doran, Harold <HDoran at air.org>
Cc: r-help at r-project.org
Subject: Re: [R] Possible Improvement to sapply
Wouldn't that change how simplify='array' is handled?
> str(sapply(1:3,