I have a tricky reference file to extract information from.

The format of the file is

x 1 + 4 * 3 + 5 + 6 + 11 * 0.5

The elements that are not multiplied (1, 5 and 6) and the elements before a
multiplication sign (4 and 11) are references to the rows of a matrix from
which I need to extract the element. The numbers after a multiplication sign
are ordinary numbers.

Example:

> x <- matrix(20:35)
> x
      [,1]
 [1,]   20
 [2,]   21
 [3,]   22
 [4,]   23
 [5,]   24
 [6,]   25
 [7,]   26
 [8,]   27
 [9,]   28
[10,]   29
[11,]   30
[12,]   31
[13,]   32
[14,]   33
[15,]   34
[16,]   35

I would like to read rows 1, 4, 5, 6 and 11 and sum them. However, the
elements in rows 4 and 11 are multiplied by 3 and 0.5, so the result would be

20 + 23 * 3 + 24 + 25 + 30 * 0.5

I have this format in several different files, so I can't do it all by hand.
Can anybody help me with a script that can differentiate these?

Thanks
Hi:
x <- matrix(20:35, ncol = 1)
u <- c(1, 4, 5, 6, 11)    # row indices into x
m <- c(1, 3, 1, 1, 0.5)   # multipliers (1 where no multiplier is given)
# Function to compute the inner product of the multipliers with the
# elements of x selected by the index vector
f <- function(mat, inputs, mults) crossprod(mat[inputs], mults)
f(x, u, mults = m)
     [,1]
[1,]  153
20 + 23 * 3 + 24 + 25 + 30 * 0.5
[1] 153
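The vectors u and m above are entered by hand. As a minimal sketch, assuming
each expression contains only '+' and '*' and that the leading label (the "x"
in the file format) has already been stripped, they could be derived from the
string in base R. parse_terms is a hypothetical helper name, not something
from this thread:

```r
# Hypothetical helper: split an expression such as
# "1 + 4 * 3 + 5 + 6 + 11 * 0.5" into row indices and multipliers.
# Assumes terms are separated by "+" and each term is either "index"
# or "index * multiplier".
parse_terms <- function(expr) {
  terms <- trimws(strsplit(expr, "+", fixed = TRUE)[[1]])
  parts <- strsplit(terms, "*", fixed = TRUE)
  idx  <- vapply(parts, function(p) as.numeric(p[1]), numeric(1))
  mult <- vapply(parts,
                 function(p) if (length(p) > 1) as.numeric(p[2]) else 1,
                 numeric(1))
  list(index = idx, mult = mult)
}

x <- matrix(20:35, ncol = 1)
p <- parse_terms("1 + 4 * 3 + 5 + 6 + 11 * 0.5")

# Same inner product as f() above, reduced to a plain number
total <- drop(crossprod(x[p$index], p$mult))
total
```

Terms without a '*' get a multiplier of 1, which matches the m vector typed
in above.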
The function is flexible enough to let you vary the input matrix (a vector
would also work), the index vector 'inputs' and the set of multipliers.
Here's one way to evaluate several index sets at once (not necessarily the
most efficient):
# sample() is random, so the values will differ from run to run
uv <- matrix(sample(1:15, 25, replace = TRUE), ncol = 5)
uv   # like an X matrix, where each row provides the input values of the vars
     [,1] [,2] [,3] [,4] [,5]
[1,]   12    8   11   10   15
[2,]   15   11   14   14    8
[3,]    4    8    4   10   12
[4,]   10    5    2    1    7
[5,]   11    4    9    1   11
# Apply the function f to each row of uv:
apply(uv, 1, function(y) f(x, y, mults = c(1, 3, 1, 1, 0.5)))
[1] 188.0 203.5 171.5 155.0 162.0
The direct matrix version:
crossprod(t(matrix(x[uv], ncol = 5)), c(1, 3, 1, 1, 0.5))
      [,1]
[1,] 188.0
[2,] 203.5
[3,] 171.5
[4,] 155.0
[5,] 162.0
Notice that the apply() call returns a vector whereas crossprod() returns a
one-column matrix.
x[uv] selects the x values associated with the indices in uv and returns a
vector in column-major order, which matrix(..., ncol = 5) reshapes to the
dimensions of uv. crossprod(A, b) computes t(A) %*% b, so passing the
transposed matrix as A amounts to 'matrix' multiplying the reshaped x[uv]
by the vector c(1, 3, 1, 1, 0.5).
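To make the column-major point concrete, here is a small sketch that enters
the uv matrix printed above by hand (so it does not depend on the random
draw) and uses %*% with drop() in place of crossprod(t(...)). The names M, w
and res are only for illustration:

```r
x <- matrix(20:35, ncol = 1)

# The uv matrix shown above, entered column by column
uv <- matrix(c(12, 15,  4, 10, 11,
                8, 11,  8,  5,  4,
               11, 14,  4,  2,  9,
               10, 14, 10,  1,  1,
               15,  8, 12,  7, 11), ncol = 5)

w <- c(1, 3, 1, 1, 0.5)

# x[uv] is a plain vector of length 25 in column-major order;
# reshaping it restores the 5 x 5 layout of uv
M <- matrix(x[uv], ncol = 5)

# Ordinary matrix product, then drop() to turn the 5 x 1 matrix
# into a vector like the one apply() returns
res <- drop(M %*% w)
res
```

This should reproduce the values 188.0 203.5 171.5 155.0 162.0 shown above.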
HTH,
Dennis
Gabor Grothendieck
2010-Oct-30 12:42 UTC
[R] Differenciate numbers from reference for rows
On Fri, Oct 29, 2010 at 6:54 PM, M.Ribeiro <mresendeufv at yahoo.com.br> wrote:
> The format of the file is
>
> x 1 + 4 * 3 + 5 + 6 + 11 * 0.5
>
> [...]
> Can anybody help me with a script that can differentiate this?

I assume that every number except the second number in the pattern
number * number is to be replaced by the corresponding row of x.

Try this. We define a regular expression which matches the first number
([0-9]+) of each potential pair, optionally (?) followed by spaces ( *), a
star (\\*), more spaces ( *) and digits or dots ([0-9.]+). gsubfn passes the
first and second backreferences (the matches to the parenthesized portions
of the regular expression) to f and inserts the output of f where the
matches had been.

library(gsubfn)
f <- function(a, b) paste(x[as.numeric(a)], b)
s2 <- gsubfn("([0-9]+)( *\\* *[0-9.]+)?", f, s)

If the objective is then to perform the calculation that the result
represents, try this:

sapply(s2, function(x) eval(parse(text = x)))

For example,

> s <- c("1 + 4 * 3 + 5 + 6 + 11 * 0.5", "1 + 4 * 3 + 5 + 6 + 11 * 0.5")
> x <- matrix(20:35)
> f <- function(a, b) paste(x[as.numeric(a)], b)
> s2 <- gsubfn("([0-9]+)( *\\* *[0-9.]+)?", f, s)
> s2
[1] "20 + 23 * 3 + 24 + 25 + 30 * 0.5" "20 + 23 * 3 + 24 + 25 + 30 * 0.5"
> sapply(s2, function(x) eval(parse(text = x)))
20 + 23 * 3 + 24 + 25 + 30 * 0.5 20 + 23 * 3 + 24 + 25 + 30 * 0.5
                             153                              153

For more see the gsubfn home page at http://gsubfn.googlecode.com

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
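
If gsubfn is not available, the same substitution can probably be done with
base R's gregexpr() and regmatches(). This is only a sketch of an
alternative, not the gsubfn approach itself; it assumes the same term pattern
as above, and the names m, terms, idx, rest and result are illustrative:

```r
x <- matrix(20:35)
s <- "1 + 4 * 3 + 5 + 6 + 11 * 0.5"

# Find every term: a row index, optionally followed by "* multiplier"
m <- gregexpr("[0-9]+( *\\* *[0-9.]+)?", s)
terms <- regmatches(s, m)[[1]]

# Leading digits of each term are the row index; the rest is kept as is
idx  <- as.numeric(sub("^([0-9]+).*$", "\\1", terms))
rest <- sub("^[0-9]+", "", terms)

# Put the looked-up x values back in place of the row indices
regmatches(s, m) <- list(paste0(x[idx], rest))
s

# Evaluate the resulting arithmetic expression
result <- eval(parse(text = s))
result
```

As with the gsubfn version, eval(parse()) runs whatever text it is given, so
this is only reasonable for files whose contents you trust.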