Displaying 20 results from an estimated 8000 matches similar to: "basic question re lm()"
2003 Aug 07
2
Question about 'NA'
Hi all,
I've got a database with 10 columns (different
variables) for 100 subjects, each column with
different # of NA's. I'd like to know if it is
possible to use a function to exclude the NA's using
only a specific column, let's say:
Data2 <- omit.exclude(Data1$column1) ??, then
Data3 <- omit.exclude(Data1$column2) and so on
I tried the code above but with no results
Thanks
2008 Feb 26
1
Split data.frames depending on values of a column
Hello to all
is there a function which splits a data.frame (column1,column2,column3,....)
into
data1 <-(column1,column3....) #column2 = 1
data2 <-(column1,column3....) #column2 = 2
data3 <-(column1,column3....) #column2 = 3
...
Regards Knut
2006 Aug 15
3
question re: "summary.lm" and NA values
Is there a way to get the following code to include
NA values where the coefficients are "NA"?
((summary(reg))$coefficients)
explanation:
Using a loop, I am running regressions on several
"subsets" of "data1".
"reg <- ( lm(lm(data1[,1] ~., data1[,2:l])) )"
My regression has 10 independent variables, and I
therefore expect 11 coefficients.
After each regression, I wish to save the
2006 Jan 04
3
matrix math
I am using R 2.1.1 in a Windows XP environment.
I have 2 dataframes, temp1 and temp2.
Each dataframe has 20 variables ("columns") and 525 observations ("rows"). All variables are numeric.
I want to create a new dataframe that also has 20 columns and 525 rows. The values in this dataframe should be the sum of the 2 other dataframes.
(i.e. temp1$column
2006 Mar 14
1
using a value in a column to "lookup" data in a certain column of a dataset?
I have a dataset with 20 columns and ~600,000 rows.
Column 1 has a number from 2-19. This number tells
me, for each row, which column has the "applicable"
data. (i.e. the data that I wish to use for each
individual row)
I want to create a vector that contains the data from
the value in column 1.
e.g.
If column 1, row 1, has a value of "6", I want to
obtain the value in column 6, row 1.
If
2006 Aug 15
3
merge 2 data frames based on more than 2 variables
Dear Lister,
I understand merge() can be used to join 2 data frames based on 1 variable.
But how about merge based on more than 2 variables?
Thank you so much!
--
WenSui Liu
(http://spaces.msn.com/statcompute/blog)
Senior Decision Support Analyst
Health Policy and Clinical Effectiveness
Cincinnati Children Hospital Medical Center
2010 Apr 19
1
Grouping rows of data by day
Hi all,
I have a set of data in hourly time steps with each row identified as
time data column1 data column2
1 9999 9999
1.042 9999 9999
1.083 9999 9999
1.125 9999 9999
1.167 9999
2011 Dec 15
2
lm and R-squared (newbie)
Hello,
I've two data.frames (data1 and data4), dec="." and sep=";".
http://r.789695.n4.nabble.com/file/n4199964/data1.txt data1.txt
http://r.789695.n4.nabble.com/file/n4199964/data4.txt data4.txt
When I do
plot(data1$nx,data1$ny, col="red")
points(data4$nx,data4$ny, col="blue")
, results seem very similar (at least to me) but the R-squared of
2006 Nov 28
3
comments in scan
I had a question about scan in R. For better code readability, I
would like to have lines in the block of data to be scanned that are
commented - not just lines that have a comment at the end. For example
#age, weight, height
33,128,65
34,56,155
instead of having to do something like
33,128,65 #age, weight, height
34,56,155
Is this at all possible?
2008 Jun 07
2
Using lm with a matrix?
I'm trying to do a linear regression between the columns of matrices. In
the example below I want to regress column 1 of matrix xdat with column 1 of ydat
and do a separate regression between the column 2s of each matrix. But the
output I get seems to give correct slopes but incorrect intercepts and
another set of slopes with value NA. How do I do this correctly? I'm after
the slope and
2024 Dec 11
1
Cores hang when calling mcapply
Hello Thomas,
Consider that the primary bottleneck may be tied to memory usage and the complexity of pivoting extremely large datasets into wide formats with tens of thousands of unique values per column. Extremely large expansions of columns inherently stress both memory and CPU, and splitting into 110k separate data frames before pivoting and combining them again is likely causing resource
2024 Dec 12
1
Cores hang when calling mcapply
Hi Gregg.
Just wanted to follow up on the solution you proposed.
I had to make some adjustments to get exactly what I wanted, but it works, and takes about 15 minutes on our server configuration:
temp <-
  open_dataset(
    sources = input_files,
    format = 'csv',
    unify_schema = TRUE,
    col_types = schema(
    "ID_Key"
2024 Dec 12
1
Cores hang when calling mcapply
Hi Thomas,
Glad to hear the suggestion helped, and that switching to a `data.table` approach reduced the processing time and memory overhead; 15 minutes for one of the smaller datasets is certainly better! Sounds like the adjustments you devised, especially keeping the multicore approach for `make_clean_names()` and ensuring that `ID_Key` values remain intact, were the missing components you
2004 Nov 23
5
number of pairwise present data in matrix with missings
is there a smart way of determining the number of pairwise present data
in a data matrix with missings (maybe as a by-product of some
statistical function?)
so far, i used several loops like:
for (column1 in 1:99) {
for (column2 in 2:100) {
for (row in 1:500) {
if (!is.na(matrix[row,column1]) & !is.na(matrix[row,column2])) {
pairs[column1,column2] <- pairs[column1,column2]+1
2009 May 14
2
Function to read a string as the variables as opposed to taking the string name as the variable
I am writing a custom function that uses an R-function from the
reshape package: cast. However, my question could be applicable to
any R function.
Normally one writes the arguments directly into a function, e.g.:
result=cast(table1, column1 + column2 + column3 ~ column4,
mean) (1)
I need to be able to write this statement as follows:
result=cast(table1, string_with_columns ~
2024 Dec 11
1
Cores hang when calling mcapply
About to try this implementation.
As a follow-up, this is the exact error:
Lost warning messages
Error: no more error handlers available (recursive errors?); invoking 'abort' restart
Execution halted
Error: cons memory exhausted (limit reached?)
Error: cons memory exhausted (limit reached?)
Error: cons memory exhausted (limit reached?)
Error: cons memory exhausted (limit reached?)
2010 Feb 27
1
New Variable from Several Existing Variables
I am new to R, but have been using SAS for years. In this transition period,
I am finding myself pulling my hair out to do some of the simplest things.
An example of this is that I need to generate a new variable based on the
outcome of several existing variables in a data row. In other words, if the
variables in all three existing columns are "Yes", then the new variable
should
2009 Jul 07
2
How to separate the string?
Hi everyone,
I want to separate the string (column1), for example
column1 column2 column3 column4 column5 column6
bear    b       e       a       r
cat     c       a       t
tiger   t       i       g       e       r
I know how to do this in Excel using the MID function.
Now I want to solve it using R. The list of strings is in
2006 Aug 25
2
R in Nature
Hi all,
We've just had a paper accepted for publication in Nature. We used R for
95% of our analyses (one of my co-authors sneaked in some GenStat when I
wasn't looking). The preprint is available from the Nature web site, in
the open peer-review trial section. I searched Nature for previous
references to "R Development Core Team", and I received no hits. So I
tentatively
2009 Jun 15
2
Help with syntax error
Hi,
I have written boxplot commands of this form before, but I don't quite understand why the function call is reporting a syntax error in this instance. All parameters passed to the function are strings.
Thanks in advance.
Payam
> simplevar <- function(wframe,column1,column2) {
+ tframe <- get(wframe)
+ x1 <- which(names(wframe)==column1)
+ x2 <-