Displaying 20 results from an estimated 2000 matches similar to: "Complicated analysis for huge databases"

2017 Nov 17
0
Complicated analysis for huge databases
Combine columns 1 and 2 into a column with a single ID like "33.55", "44.66" and use split() on these IDs to break up your dataset. Iterate over the list of data frames split() returns. B. > On Nov 17, 2017, at 12:59 PM, Allaisone 1 <allaisone1 at hotmail.com> wrote: > Hi all, I have a large dataset of around 600,000 rows and 600
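The suggestion above can be sketched roughly as follows. This is only an illustration on made-up data: the data frame `MyData`, the code columns `MealA` and `MealB`, and the marker columns `I` and `II` are placeholders, not the poster's actual dataset.

```r
# Toy data: two meal-code columns plus two 0/1/2 marker columns (all hypothetical).
MyData <- data.frame(
  MealA = c(33, 33, 44, 44, 33),
  MealB = c(55, 55, 66, 66, 55),
  I     = c(0L, 1L, 2L, 0L, 1L),
  II    = c(2L, 2L, 0L, 1L, 0L)
)

# Build one ID per row from the two code columns, e.g. "33.55".
MyData$ID <- paste(MyData$MealA, MyData$MealB, sep = ".")

# split() returns a named list with one data frame per ID.
groups <- split(MyData, MyData$ID)

# Iterate over the list; here we only count the rows in each group.
group_sizes <- vapply(groups, nrow, integer(1))
print(group_sizes)
```

Any per-group analysis can take the place of `nrow` in the last step, since each list element is an ordinary data frame.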
2017 Nov 18
2
Complicated analysis for huge databases
Thanks Boris, this was very helpful, but I'm struggling with the last part. 1) I combined the first 2 columns: library(tidyr) SingleMealsCode <- unite(MyData, MealsCombinations, c(MealA, MealB), remove = FALSE) SingleMealsCode <- SingleMealsCode[, -2] 2) I separated this data frame into different data frames based on the "MealsCombinations" column so R will recognize each meal
2017 Nov 18
2
Complicated analysis for huge databases
Although the loop seems to be formulated correctly, I wonder why it gives me these errors: - object 'i' not found - unexpected '}' in "}". The desired output is expected to be very large, as for each data frame in the list of data frames I expect to see a maf value for each of the 600 columns, and this is only for one data frame in the list. I have around 150-200
2017 Nov 18
3
Complicated analysis for huge databases
The loop : AllMAFs <- list() for (i in length(SeparatedGroupsofmealsCombs) { AllMAFs[[i]] <- apply( SeparatedGroupsofmealsCombs[[i]], 2, function(x)maf( tabulate( x+1) )) } gives these errors (I tried this many times and I'm sure I copied it entirely) :- Error in apply(SeparatedGroupsofmealsCombs[[i]], 2, function(x) maf(tabulate(x + : object 'i' not found > }
2017 Nov 18
0
Complicated analysis for huge databases
> On Nov 18, 2017, at 1:52 AM, Allaisone 1 <allaisone1 at hotmail.com> wrote: > > Although the loop seems to be formulated correctly I wonder why > it gives me these errors : > > -object 'i' not found > - unexpected '}' in "}" You probably did not copy the entire code offered. But we cannot know since you did not "show your code",
2017 Nov 18
0
Complicated analysis for huge databases
Something like the following? AllMAFs <- list() for (i in length(SeparatedGroupsofmealsCombs) { AllMAFs[[i]] <- apply(SeparatedGroupsofmealsCombs[[i]], 2, function(x)maf(tabulate(x+1))) } (untested, of course) Also the solution is a bit generic since I don't know what the output of maf() looks like in your case, and I don't understand why you use tabulate because I would have
2017 Nov 18
0
Complicated analysis for huge databases
On 18/11/2017 4:40 PM, Allaisone 1 wrote: > > The loop : > > > AllMAFs <- list() > > for (i in length(SeparatedGroupsofmealsCombs) { > AllMAFs[[i]] <- apply( SeparatedGroupsofmealsCombs[[i]], 2, function(x)maf( tabulate( x+1) )) > } > > > gives these errors (I tried this many times and I'm sure I copied it entirely) :- > > Error in
2017 Nov 19
1
Complicated analysis for huge databases
Thanks, but a new error appeared with the loop: Error in x + 1 : non-numeric argument to binary operator. I think this can be solved by converting the columns (I, II, III, ..., 600) into "numeric" instead of the current "int" type, as shown below in the structure of the "33_55" data frame. $ 33_55:'data.frame': 256 obs. of 600 variables: ..$ MealsCombinations
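One possible explanation, going by the str() output quoted above, is that the ID column MealsCombinations is still part of each data frame. apply() converts a data frame to a matrix first, and a single character column makes the whole matrix character, so x + 1 fails even though the other columns are integer. The sketch below uses made-up data, and it assumes the ID column is character, which the truncated excerpt does not confirm.

```r
# Hypothetical group: a character ID column plus two integer marker columns.
df <- data.frame(
  MealsCombinations = c("33_55", "33_55", "33_55"),
  I  = c(0L, 1L, 2L),
  II = c(2L, 2L, 0L),
  stringsAsFactors = FALSE
)

# as.matrix() on mixed columns yields a character matrix.
is.character(as.matrix(df))   # TRUE

# Dropping the ID column before apply() keeps the data numeric.
marker_cols <- setdiff(names(df), "MealsCombinations")
counts <- apply(df[, marker_cols, drop = FALSE], 2,
                function(x) tabulate(x + 1, nbins = 3))
print(counts)
```

Under that assumption, integer columns need no conversion to "numeric"; excluding the non-numeric column is enough.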
2017 Nov 18
0
Complicated analysis for huge databases
The correct code is: for (i in 1:length(SeparatedGroupsofmealsCombs)) { ... I had mentioned that this is untested, but the error is so obvious ... B. > On Nov 18, 2017, at 4:40 PM, Allaisone 1 <allaisone1 at hotmail.com> wrote: > > > The loop : > > AllMAFs <- list() > > for (i in length(SeparatedGroupsofmealsCombs) { > AllMAFs[[i]] <-
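The difference between the two loop headers is easy to check in isolation. length(x) is a single number, so for (i in length(x)) runs once, for the last element only, while 1:length(x) or seq_along(x) visits every element. The sketch below uses a stand-in list and colSums() in place of maf(), which is not defined in these excerpts.

```r
# Stand-in for the list of per-group data frames (hypothetical values).
groups <- list(a = data.frame(I = 0:2, II = c(2L, 2L, 0L)),
               b = data.frame(I = c(1L, 1L, 0L), II = c(0L, 0L, 1L)))

# Buggy header: iterates over the single value length(groups), i.e. 2.
visited_buggy <- integer(0)
for (i in length(groups)) visited_buggy <- c(visited_buggy, i)

# seq_along() visits 1, 2, ... and is also safe for an empty list.
results <- vector("list", length(groups))
for (i in seq_along(groups)) {
  results[[i]] <- colSums(groups[[i]])
}
names(results) <- names(groups)

print(visited_buggy)
print(results)
```

seq_along() is often preferred over 1:length(x) because 1:0 counts down through 1 and 0 when the list is empty, whereas seq_along() yields no iterations.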
2017 Nov 10
0
Calculating frequencies of multiple values in 200 colomns
Thank you for your effort, Bert. I know what the problem is now: the values (1, 2, 3) were only an example. The values I have are 0, 1, 2. The tabulate() function seems to ignore the 0 values when counting frequencies, and this is exactly my problem, as the frequency of the 0 values must also be counted for the maf to be calculated correctly. ________________________________ From: Bert Gunter
2017 Nov 09
2
Calculating frequencies of multiple values in 200 colomns
Always reply to the list. I am not a free, private consultant! "For example, if I have the values : 1 , 2 , 3 in each column, applying Tabulate () would calculate the frequency of 1 and 2 without 3" Huh?? > x <- sample(1:3,10,TRUE) > x [1] 1 3 1 1 1 3 2 3 2 1 > tabulate(x) [1] 5 2 3 Cheers, Bert Bert Gunter "The trouble with having an open mind is that people
2017 Nov 10
2
Calculating frequencies of multiple values in 200 colomns
|> x <- sample(0:2, 10, replace = TRUE)
|> x
[1] 1 0 2 1 0 2 2 0 2 1
|> tabulate(x)
[1] 3 4
|> table(x)
x
0 1 2
3 3 4
B.
> On Nov 10, 2017, at 4:32 AM, Allaisone 1 <allaisone1 at hotmail.com> wrote:
> Thank you for your effort Bert.., I knew what is the problem now, the values (1,2,3) were only an example. The values I have are
2017 Nov 10
0
Calculating frequencies of multiple values in 200 colomns
Hi, To clarify the default behavior that Boris is referencing below, note the definition of the 'bin' argument to the tabulate() function: bin: a numeric vector ***(of positive integers)***, or a factor. Long vectors are supported. I added the asterisks for emphasis. This is also noted in the examples used for the function in ?tabulate at the bottom of the help page. The second
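Since tabulate() only counts positive integers, two common ways to include the zeros are to shift the values by one or to use table() on a factor with explicit levels. A short sketch, using the ten values shown in Boris's reply in this listing:

```r
x <- c(1, 0, 2, 1, 0, 2, 2, 0, 2, 1)

# tabulate() skips values below 1, so the three zeros are not counted.
tabulate(x)

# Option 1: shift by one so that 0, 1, 2 fall into bins 1, 2, 3.
shifted <- tabulate(x + 1, nbins = 3)

# Option 2: table() on a factor with fixed levels also reports empty levels.
tab <- table(factor(x, levels = 0:2))

print(shifted)
print(tab)
```

Fixing nbins (or the factor levels) keeps the result the same length for every column, even when a value happens to be absent from one of them.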
2017 Nov 09
3
Calculating frequencies of multiple values in 200 colomns
Hi all, I have a dataset of 200 columns and 1000 rows; each column contains 3 repeated values (7, 8, 10). I want to calculate the frequency of each value in each column and then apply the function maf() given those frequencies. I can do the analysis step by step like this: > Values A B C ... 200 1 7 10 7 2
2012 Feb 10
3
problem subsetting data frame with variable instead of constant
Hello, I've encountered a very weird issue with the method subset(), or maybe this is something I don't know about said method that when you're subsetting based on the columns of a data frame you can only use constants (0.1, 2.3, 2.2) instead of variables? Here's a look at my data frame called 'ea.cad.pwr': *>ea.ca.pwr[1:5,] MAF OR POWER 1 0.02 0.01 0.9999 2 0.02
2010 Nov 29
2
FW: how to use by() ?
Thank you for the suggestion, Bill. The result is not quite what I would like. Here's sample code for you or anyone else who may be interested: Al1 = c('A','C','C','C') Al2 = c('G','G','G','T') Freq1 = c(0.0078,0.0567,0.9434,0.9908) MAF = c(0.0078,0.0567,0.0566,0.0092) m1 = data.frame(Al1=Al1,
2005 Jul 15
1
2D contour predictions
Hi All I have been fitting regression models and would now like to produce some contour & image plots from the predictors. Is there an easy way to do this? My current (newbie) experience with R would suggest there is but that it's not always easy to find it! f3 <- lm( fc ~ poly( speed, 2 ) + poly( torque, 2 ) + poly( sonl, 2 ) + poly( p_rail, 2 ) + poly( pil_sep, 2 ) + poly( maf, 2
2003 Sep 17
2
CART analysis
Greetings, Does anyone know of an R code for classification and regression tree analysis (CART)? Thank you Ron Ron Thornton BVSc, PhD, MACVSc (pathology, epidemiology) Programme Co-ordinator, Active Surveillance Animal Biosecurity MAF Biosecurity Authority P O Box 2526 Wellington, New Zealand phone: 64-4-4744156 027 223 7582 fax: 64-4-474-4133 e-mail: ron.thornton at maf.govt.nz
2007 Jan 21
2
efficient code. how to reduce running time?
Hi, I am new to R, and even though I've got my code to run and do what it needs to, it is taking forever and I can't use it like this. I was wondering if you could help me find ways to make the code run faster. Here is my code. The data set is a bunch of 0s and 1s in a data.frame. What I am doing is this: I pick a column and make up a new column Y with values associated with that
2010 Nov 29
3
how to use by() ?
Hello, All! How might one accomplish this using the by() function? m1 is a data frame. # populate column "m1$major_allele" for ( i in 1:length(m1$major_allele)) { if ( m1$Freq1[i] == m1$MAF[i]){ m1$major_allele[i] = m1$Al1[i] } else{ m1$major_allele[i] = m1$Al2[i] } } Jim [[alternative HTML version deleted]]
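The loop above does not necessarily need by(): the same row-wise choice can be written with the vectorised ifelse(). The sketch below reuses the sample values from the "FW: how to use by() ?" entry in this listing and keeps the poster's comparison as written. Setting stringsAsFactors = FALSE is an assumption made here so the allele columns stay character.

```r
# Sample values as posted in the follow-up message.
m1 <- data.frame(
  Al1   = c("A", "C", "C", "C"),
  Al2   = c("G", "G", "G", "T"),
  Freq1 = c(0.0078, 0.0567, 0.9434, 0.9908),
  MAF   = c(0.0078, 0.0567, 0.0566, 0.0092),
  stringsAsFactors = FALSE
)

# Same rule as the loop: take Al1 where Freq1 equals MAF, otherwise Al2.
m1$major_allele <- ifelse(m1$Freq1 == m1$MAF, m1$Al1, m1$Al2)
print(m1)
```

If the frequencies come from a calculation rather than being typed in, an exact == comparison may fail on rounding differences, so a tolerance-based comparison could be safer.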