thr3ads.net - similar to: "Complicated analysis for huge databases"

Displaying 20 results from an estimated 2000 matches similar to: "Complicated analysis for huge databases"

2017 Nov 17

Complicated analysis for huge databases

Combine columns 1 and 2 into a column with a single ID like "33.55", "44.66" and use split() on these IDs to break up your dataset. Iterate over the list of data frames split() returns.

B.

> On Nov 17, 2017, at 12:59 PM, Allaisone 1 <allaisone1 at hotmail.com> wrote:
>
> Hi all ..,
>
> I have a large dataset of around 600,000 rows and 600
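The combine-then-split approach described in this reply can be sketched in base R roughly as follows. The data frame `MyData`, the columns `MealA` and `MealB`, and the marker columns `I` and `II` are made-up stand-ins for the poster's data, not taken from the thread's actual dataset.

```r
# Toy stand-in for the poster's data: two meal-code columns plus marker columns.
MyData <- data.frame(
  MealA = c(33, 33, 44, 44, 33),
  MealB = c(55, 55, 66, 66, 55),
  I     = c(0L, 1L, 2L, 0L, 1L),
  II    = c(2L, 2L, 1L, 0L, 0L)
)

# Combine the two meal columns into a single ID such as "33.55".
MyData$MealID <- paste(MyData$MealA, MyData$MealB, sep = ".")

# split() returns a named list with one data frame per ID.
groups <- split(MyData, MyData$MealID)

# Iterate over the list; here we only report the number of rows per group.
group_sizes <- vapply(groups, nrow, integer(1))
print(group_sizes)
```

Each element of `groups` is an ordinary data frame, so any per-group analysis can replace `nrow` in the last step.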

2017 Nov 18

Complicated analysis for huge databases

Thanks Boris, this was very helpful, but I'm struggling with the last part.

1) I combined the first 2 columns:

library(tidyr)
SingleMealsCode <- unite(MyData, MealsCombinations, c(MealA, MealB), remove = FALSE)
SingleMealsCode <- SingleMealsCode[, -2]

2) I separated this data frame into different data frames based on the "MealsCombination" column so R will recognize each meal

2017 Nov 18

Complicated analysis for huge databases

Although the loop seems to be formulated correctly, I wonder why it gives me these errors:

- object 'i' not found
- unexpected '}' in "}"

The desired output is expected to be very large, as for each data frame in the list of data frames I expect to see a maf value for each of the 600 columns! And this is only for one data frame in the list. I have around 150-200

2017 Nov 18

Complicated analysis for huge databases

The loop:

AllMAFs <- list()
for (i in length(SeparatedGroupsofmealsCombs) {
  AllMAFs[[i]] <- apply( SeparatedGroupsofmealsCombs[[i]], 2, function(x)maf( tabulate( x+1) ))
}

gives these errors (I tried this many times and I'm sure I copied it entirely):

Error in apply(SeparatedGroupsofmealsCombs[[i]], 2, function(x) maf(tabulate(x + : object 'i' not found
> }

2017 Nov 18

Complicated analysis for huge databases

> On Nov 18, 2017, at 1:52 AM, Allaisone 1 <allaisone1 at hotmail.com> wrote:
>
> Although the loop seems to be formulated correctly I wonder why
> it gives me these errors :
>
> -object 'i' not found
> - unexpected '}' in "}"

You probably did not copy the entire code offered. But we cannot know since you did not "show your code",

2017 Nov 18

Complicated analysis for huge databases

Something like the following?

AllMAFs <- list()
for (i in length(SeparatedGroupsofmealsCombs) {
  AllMAFs[[i]] <- apply(SeparatedGroupsofmealsCombs[[i]], 2, function(x)maf(tabulate(x+1)))
}

(untested, of course)

Also the solution is a bit generic since I don't know what the output of maf() looks like in your case, and I don't understand why you use tabulate because I would have

2017 Nov 18

Complicated analysis for huge databases

On 18/11/2017 4:40 PM, Allaisone 1 wrote:
>
> The loop :
>
> AllMAFs <- list()
>
> for (i in length(SeparatedGroupsofmealsCombs) {
>   AllMAFs[[i]] <- apply( SeparatedGroupsofmealsCombs[[i]], 2, function(x)maf( tabulate( x+1) ))
> }
>
> gives these errors (I tried this many times and I'm sure I copied it entirely) :-
>
> Error in

2017 Nov 19

Complicated analysis for huge databases

Thanks but a new error appeared with the loop:

Error in x + 1 : non-numeric argument to binary operator

I think this can be solved by converting columns (I,II,II,..600) into "numeric" instead of the current "int" type as shown below in the structure of "33_55" dataframe.

$ 33_55:'data.frame': 256 obs. of 600 variables:
..$ MealsCombinations
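One plausible cause of this error, assuming the character column MealsCombinations is still present in each data frame: apply() converts a data frame to a matrix first, and a single character column turns the whole matrix into character, so `x + 1` fails. Converting "int" to "numeric" would not help in that case. A small sketch with made-up column names modelled on the thread:

```r
# Toy data frame: one character ID column plus integer marker columns
# (names and values are assumptions, not the poster's data).
df <- data.frame(
  MealsCombinations = c("33_55", "33_55", "33_55"),
  I  = c(0L, 1L, 2L),
  II = c(2L, 2L, 0L),
  stringsAsFactors = FALSE
)

# as.matrix() on mixed columns yields a character matrix; this is what apply() sees.
mixed_is_character <- is.character(as.matrix(df))
print(mixed_is_character)

# Keep only the numeric columns before applying a column-wise function.
numeric_cols <- df[vapply(df, is.numeric, logical(1))]
counts <- apply(numeric_cols, 2, function(x) tabulate(x + 1, nbins = 3))
print(counts)
```

If the ID column is the culprit, dropping it inside the loop, or before split(), should be enough; integer columns work with `x + 1` as they are.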

2017 Nov 18

Complicated analysis for huge databases

The correct code is:

for (i in 1:length(SeparatedGroupsofmealsCombs)) {
...

I had mentioned that this is untested, but the error is so obvious ...

B.

> On Nov 18, 2017, at 4:40 PM, Allaisone 1 <allaisone1 at hotmail.com> wrote:
>
> The loop :
>
> AllMAFs <- list()
>
> for (i in length(SeparatedGroupsofmealsCombs) {
> AllMAFs[[i]] <-
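For reference, a complete version of the corrected loop might look like the sketch below. The thread never shows maf(), so the function here is only a placeholder that assumes genotype counts for 0, 1 and 2 and returns a minor allele frequency; the list contents are toy data. seq_along() is used in place of 1:length(), which behaves the same for a non-empty list and also copes with an empty one, and `nbins = 3` is an addition so that every column yields three counts.

```r
# Placeholder for the poster's maf(); the real function is not shown in the thread.
# Assumption: it takes counts of genotypes 0, 1, 2 and returns the minor allele frequency.
maf <- function(counts) {
  p <- (counts[2] + 2 * counts[3]) / (2 * sum(counts))
  min(p, 1 - p)
}

# Toy list of data frames standing in for SeparatedGroupsofmealsCombs.
SeparatedGroupsofmealsCombs <- list(
  "33_55" = data.frame(I = c(0L, 1L, 2L, 0L), II = c(2L, 2L, 1L, 2L)),
  "44_66" = data.frame(I = c(0L, 0L, 0L, 1L), II = c(1L, 1L, 1L, 1L))
)

AllMAFs <- list()
# seq_along() yields 1, 2, ..., n; a bare length(x) would yield only n.
for (i in seq_along(SeparatedGroupsofmealsCombs)) {
  AllMAFs[[i]] <- apply(
    SeparatedGroupsofmealsCombs[[i]], 2,
    function(x) maf(tabulate(x + 1, nbins = 3))
  )
}
names(AllMAFs) <- names(SeparatedGroupsofmealsCombs)
print(AllMAFs)
```

An lapply() over the list with the same inner function would build the same result without an explicit index.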

2017 Nov 10

Calculating frequencies of multiple values in 200 colomns

Thank you for your effort, Bert. I know what the problem is now: the values (1, 2, 3) were only an example. The values I have are 0, 1, 2. The tabulate() function seems to ignore the frequency of 0 values, and this is my exact problem, as the frequency of 0 values must also be counted for the maf to be calculated correctly.

________________________________
From: Bert Gunter

2017 Nov 09

Calculating frequencies of multiple values in 200 colomns

Always reply to the list. I am not a free, private consultant!

"For example, if I have the values : 1 , 2 , 3 in each column, applying Tabulate () would calculate the frequency of 1 and 2 without 3"

Huh??

> x <- sample(1:3,10,TRUE)
> x
[1] 1 3 1 1 1 3 2 3 2 1
> tabulate(x)
[1] 5 2 3

Cheers,
Bert

Bert Gunter
"The trouble with having an open mind is that people

2017 Nov 10

Calculating frequencies of multiple values in 200 colomns

|> x <- sample(0:2, 10, replace = TRUE)
|> x
[1] 1 0 2 1 0 2 2 0 2 1
|> tabulate(x)
[1] 3 4
|> table(x)
x
0 1 2
3 3 4

B.

> On Nov 10, 2017, at 4:32 AM, Allaisone 1 <allaisone1 at hotmail.com> wrote:
>
> Thank you for your effort Bert..,
>
> I knew what is the problem now, the values (1,2,3) were only an example. The values I have are
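The difference shown in this reply comes from tabulate() counting only positive integers. Two common ways to keep the zeros are sketched below, using the same ten values as in the reply; the variable names are illustrative only.

```r
x <- c(1, 0, 2, 1, 0, 2, 2, 0, 2, 1)  # the values printed in the reply above

# tabulate() ignores zeros, so only the counts for 1 and 2 come back.
plain <- tabulate(x)

# Shift by one so that 0, 1, 2 land in bins 1, 2, 3; nbins fixes the length.
shifted <- tabulate(x + 1, nbins = 3)

# A factor with fixed levels keeps a count for every value, even one that is absent.
tab <- table(factor(x, levels = 0:2))

print(plain)
print(shifted)
print(tab)
```

The shifted tabulate() call returns a bare integer vector, which is convenient when the counts are passed straight to another function; table() keeps the value labels.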

2017 Nov 10

Calculating frequencies of multiple values in 200 colomns

Hi,

To clarify the default behavior that Boris is referencing below, note the definition of the 'bin' argument to the tabulate() function:

bin: a numeric vector ***(of positive integers)***, or a factor. Long vectors are supported.

I added the asterisks for emphasis. This is also noted in the examples used for the function in ?tabulate at the bottom of the help page.

The second

2017 Nov 09

Calculating frequencies of multiple values in 200 colomns

Hi All

I have a dataset of 200 columns and 1000 rows; there are 3 repeated values under each column (7, 8, 10). I want to calculate the frequency of each value under each column and then apply the function maf(), given that the frequency of each value is known. I can do the analysis step by step like this:

> Values
   A  B  C ... 200
1  7 10  7
2
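A column-wise frequency table for a fixed set of values can be sketched as below. The data frame is a three-column toy version of the poster's `Values`; only its first row follows the post, the rest is invented. Passing the counts on to maf() is left as a comment because that function is not shown in the thread.

```r
# Toy version of the poster's data; only row 1 (7, 10, 7) comes from the post.
Values <- data.frame(
  A = c(7, 8, 10, 7),
  B = c(10, 10, 8, 7),
  C = c(7, 7, 7, 7)
)

# Fixing the factor levels guarantees one count per value in every column,
# including a zero when a value never occurs.
freqs <- sapply(Values, function(col) table(factor(col, levels = c(7, 8, 10))))
print(freqs)

# Each column of freqs could then be handed to the poster's function, e.g.
# apply(freqs, 2, maf)   # maf() is not defined here
```

The result is a matrix with one row per value and one column per variable, so it scales to 200 columns without any change.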

2012 Feb 10

problem subsetting data frame with variable instead of constant

Hello,

I've encountered a very weird issue with subset(), or maybe there is something I don't know about it: when you're subsetting based on the columns of a data frame, can you only use constants (0.1, 2.3, 2.2) instead of variables? Here's a look at my data frame called 'ea.cad.pwr':

> ea.ca.pwr[1:5,]
   MAF   OR  POWER
1 0.02 0.01 0.9999
2 0.02

2010 Nov 29

FW: how to use by() ?

Thank you for the suggestion, Bill. The result is not quite what I would like. Here's sample code for you or anyone else who may be interested:

Al1 = c('A','C','C','C')
Al2 = c('G','G','G','T')
Freq1 = c(0.0078,0.0567,0.9434,0.9908)
MAF = c(0.0078,0.0567,0.0566,0.0092)
m1 = data.frame(Al1=Al1,

2005 Jul 15

2D contour predictions

Hi All

I have been fitting regression models and would now like to produce some contour & image plots from the predictors. Is there an easy way to do this? My current (newbie) experience with R would suggest there is but that it's not always easy to find it!

f3 <- lm( fc ~ poly( speed, 2 ) + poly( torque, 2 ) + poly( sonl, 2 ) + poly( p_rail, 2 ) + poly( pil_sep, 2 ) + poly( maf, 2
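One common base R route to such plots is to predict over a regular grid with expand.grid() and pass the reshaped predictions to image() and contour(). The sketch below uses synthetic data and only two predictors; the names speed, torque and fc are borrowed from the post, everything else is assumed. With the full model, the remaining predictors would have to be held at fixed values in the grid.

```r
set.seed(1)
# Synthetic stand-in data; only the names speed, torque and fc come from the post.
d <- data.frame(speed = runif(200, 1000, 4000), torque = runif(200, 50, 300))
d$fc <- 5 + 1e-6 * (d$speed - 2500)^2 + 1e-4 * (d$torque - 150)^2 +
  rnorm(200, sd = 0.1)

fit <- lm(fc ~ poly(speed, 2) + poly(torque, 2), data = d)

# Regular grid over the two predictors.
speed_seq  <- seq(min(d$speed), max(d$speed), length.out = 40)
torque_seq <- seq(min(d$torque), max(d$torque), length.out = 40)
grid <- expand.grid(speed = speed_seq, torque = torque_seq)

# expand.grid() varies its first argument fastest, matching matrix()'s column-major fill,
# so rows of z index speed and columns index torque.
z <- matrix(predict(fit, newdata = grid), nrow = length(speed_seq))

image(speed_seq, torque_seq, z, xlab = "speed", ylab = "torque")
contour(speed_seq, torque_seq, z, add = TRUE)
```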

2003 Sep 17

CART analysis

Greetings,

Does anyone know of an R code for classification and regression tree analysis (CART)?

Thank you
Ron

Ron Thornton BVSc, PhD, MACVSc (pathology, epidemiology)
Programme Co-ordinator, Active Surveillance Animal Biosecurity MAF Biosecurity Authority
P O Box 2526 Wellington, New Zealand
phone: 64-4-4744156 027 223 7582 fax: 64-4-474-4133
e-mail: ron.thornton at maf.govt.nz

2007 Jan 21

efficient code. how to reduce running time?

Hi, I am new to R, and even though I've got my code to run and do what it needs to, it is taking forever and I can't use it like this. I was wondering if you could help me find ways to make the code run faster. Here is my code. The data set is a bunch of 0s and 1s in a data.frame. What I am doing is this: I pick a column and make up a new column Y with values associated with that

2010 Nov 29

how to use by() ?

Hello, All!

How might one accomplish this using the by() function? m1 is a data frame.

# populate column "m1$major_allele"
for ( i in 1:length(m1$major_allele)) {
  if ( m1$Freq1[i] == m1$MAF[i]){
    m1$major_allele[i] = m1$Al1[i]
  } else{
    m1$major_allele[i] = m1$Al2[i]
  }
}

Jim

[[alternative HTML version deleted]]
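The loop in this post is a row-wise choice between two columns, which the vectorized ifelse() can express without by() or an explicit loop. The sketch below reuses the sample vectors from the related "FW: how to use by() ?" post; the data.frame() call there is truncated, so the column layout and `stringsAsFactors = FALSE` are assumptions.

```r
# Sample values from the related post; the data frame layout is assumed.
m1 <- data.frame(
  Al1   = c("A", "C", "C", "C"),
  Al2   = c("G", "G", "G", "T"),
  Freq1 = c(0.0078, 0.0567, 0.9434, 0.9908),
  MAF   = c(0.0078, 0.0567, 0.0566, 0.0092),
  stringsAsFactors = FALSE  # keep alleles as character, not factor codes
)

# Same rule as the loop: take Al1 where Freq1 equals MAF, otherwise Al2.
m1$major_allele <- ifelse(m1$Freq1 == m1$MAF, m1$Al1, m1$Al2)
print(m1)
```

Note that `==` on doubles is an exact comparison; if Freq1 and MAF come from separate calculations, a tolerance-based test may be safer.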
