Lisa Solomon
2007-Jul-20  18:03 UTC
[R] Free Online: Data Mining Intro for Beginners, Vendor-Neutral
Intro to Data Mining for Absolute Beginners (no charge)

This one-hour webinar is a perfect place to start if you are new to data mining and have little-to-no background in statistics or machine learning.

-Dates: July 24, August 9, September 7
-Registration: http://salford.webex.com
-Future Webinars: Multiple timezones and topics are planned. Let us know if you would like to be notified as we schedule new webinar dates and topics.
-Abstract: In the one hour "Intro to Data Mining" webinar, we will discuss:
**Data basics: what kind of data is required for data mining and predictive analytics; in what format must the data be; what steps are necessary to prepare data appropriately
**What kinds of questions can we answer with data mining
**How data mining models work: the inputs, the outputs, and the nature of the predictive mechanism
**Evaluation criteria: how predictive models can be assessed and their value measured
**Specific background knowledge to prepare you to begin a data mining project.

Please circulate to colleagues who might benefit and do not hesitate to contact me if you have any questions.

Sincerely,
Lisa Solomon
lisas at salford-systems.com
Salford Systems, 4740 Murphy Canyon Rd. Ste 200, San Diego, Calif. 92123
In trying to get a better understanding of vectorization, I wrote the
code below. My objective is to take two sets of time series and
calculate the correlation for each combination of time series.
mat1 <- matrix(sample(1:500, 25), ncol = 5)
mat2 <- matrix(sample(501:1000, 25), ncol = 5)
Scenario 1:
apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
Scenario 2:
apply(mat1, 1, function(x) cor(mat1, mat2))
Using scenario 1 (output below), I can see that correlations are
calculated for just the first row of mat2 against each individual row of
mat1.
Using scenario 2 (output below), I can see that correlations are
calculated for each row of mat2 against each individual row of mat1.
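One thing worth checking in both scenarios: the anonymous function never uses its argument x, so apply() evaluates the same cor() call once per row of mat1, which would account for the five identical columns in the output. Also, cor() on a matrix works column by column, so cor(mat1, mat2[1,]) correlates the columns of mat1 with mat2[1,], not the rows. A minimal sketch of the difference follows; set.seed() and the names posted, by_row, and vectorized are additions for illustration only, not part of the original code.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible
mat1 <- matrix(sample(1:500, 25), ncol = 5)
mat2 <- matrix(sample(501:1000, 25), ncol = 5)

# As posted: x is ignored, so every iteration returns the same vector
# (the columns of mat1 against mat2[1, ]).
posted <- apply(mat1, 1, function(x) cor(mat1, mat2[1, ]))

# Using x: one correlation per row of mat1 against mat2[1, ].
by_row <- apply(mat1, 1, function(x) cor(x, mat2[1, ]))

# The same row-wise result without apply(): transpose mat1 so that
# its rows become the columns that cor() works on.
vectorized <- drop(cor(t(mat1), mat2[1, ]))

print(all(posted == posted[, 1]))      # are all columns of `posted` identical?
print(all.equal(by_row, vectorized))   # do the two row-wise versions agree?
```

If the series are stored in rows, as the apply(mat1, 1, ...) calls suggest, the transposed form is probably the one that matches the stated objective.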
Q1: The output of scenario 2 consists of 25 rows of data. Are the first
five rows mat1 against mat2[1,], the next five rows mat1 against
mat2[2,], ... last five rows mat1 against mat2[5,]?
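On Q1, the ordering can be checked directly. As far as I can tell, cor(mat1, mat2) returns a 5 x 5 matrix whose [i, j] entry is the correlation of column i of mat1 with column j of mat2, and apply() flattens that matrix column by column. So the 25 rows do come in blocks of five, but the blocks are columns of mat1 against column 1 of mat2, then against column 2, and so on, rather than rows. A short sketch, assuming a seed for reproducibility; the names full, flat, and idx are illustrative only.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible
mat1 <- matrix(sample(1:500, 25), ncol = 5)
mat2 <- matrix(sample(501:1000, 25), ncol = 5)

full <- cor(mat1, mat2)   # 5 x 5: full[i, j] = cor(mat1[, i], mat2[, j])
flat <- as.vector(full)   # column-major flattening, length 25

# The scenario 2 call: every column of matC is this same flattened vector.
matC <- apply(mat1, 1, function(x) cor(mat1, mat2))

# Map row k of the 25-row output back to its (i, j) pair.
idx <- arrayInd(1:25, dim(full))
colnames(idx) <- c("mat1_col", "mat2_col")
print(idx[1:7, ])
```

Row k of the 25-row output then pairs column idx[k, "mat1_col"] of mat1 with column idx[k, "mat2_col"] of mat2.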
Q2: I assign the output of scenario 2 to a new matrix
matC <- apply(mat1, 1, function(x) cor(mat1, mat2))
However, I need a way to identify each row in matC as a pairing of
rows from mat1 and mat2. Is there a parameter I can add to apply to do
this?
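On Q2, one possible approach (a sketch, not the only way) is to skip apply() and let cor() carry the labels: cor() names the rows and columns of its result after the column names of its two inputs. The sketch below assumes the series are stored in rows, so both matrices are transposed first. The row labels (mat1_r1, mat2_r1, ...), the seed, and the name pair_df are hypothetical, chosen only for illustration.

```r
set.seed(1)  # assumption: seed added only so the sketch is reproducible
mat1 <- matrix(sample(1:500, 25), ncol = 5)
mat2 <- matrix(sample(501:1000, 25), ncol = 5)

# Hypothetical labels for the series (rows) of each matrix.
rownames(mat1) <- paste0("mat1_r", 1:5)
rownames(mat2) <- paste0("mat2_r", 1:5)

# Transpose so each series is a column; matC[i, j] = cor(mat1[i, ], mat2[j, ]).
matC <- cor(t(mat1), t(mat2))

# Long format: one labelled row per pairing (25 rows in total).
pair_df <- as.data.frame(as.table(matC), responseName = "cor")
names(pair_df)[1:2] <- c("mat1_row", "mat2_row")
print(head(pair_df))
```

A single pairing can then be looked up either as matC["mat1_r2", "mat2_r3"] or by filtering pair_df on its two label columns.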
Scenario 1:
> apply(mat1, 1, function(x) cor(mat1, mat2[1,]))
           [,1]       [,2]       [,3]       [,4]       [,5]
[1,] -0.4626122 -0.4626122 -0.4626122 -0.4626122 -0.4626122
[2,] -0.9031543 -0.9031543 -0.9031543 -0.9031543 -0.9031543
[3,]  0.0735273  0.0735273  0.0735273  0.0735273  0.0735273
[4,]  0.7401259  0.7401259  0.7401259  0.7401259  0.7401259
[5,] -0.4548582 -0.4548582 -0.4548582 -0.4548582 -0.4548582
Scenario 2:
> apply(mat1, 1, function(x) cor(mat1, mat2))
             [,1]        [,2]        [,3]        [,4]        [,5]
 [1,]  0.19394126  0.19394126  0.19394126  0.19394126  0.19394126
 [2,]  0.26402400  0.26402400  0.26402400  0.26402400  0.26402400
 [3,]  0.12923842  0.12923842  0.12923842  0.12923842  0.12923842
 [4,] -0.74549676 -0.74549676 -0.74549676 -0.74549676 -0.74549676
 [5,]  0.64074122  0.64074122  0.64074122  0.64074122  0.64074122
 [6,]  0.26931986  0.26931986  0.26931986  0.26931986  0.26931986
 [7,]  0.08527921  0.08527921  0.08527921  0.08527921  0.08527921
 [8,] -0.28034079 -0.28034079 -0.28034079 -0.28034079 -0.28034079
 [9,] -0.15251915 -0.15251915 -0.15251915 -0.15251915 -0.15251915
[10,]  0.19542415  0.19542415  0.19542415  0.19542415  0.19542415
[11,]  0.75107032  0.75107032  0.75107032  0.75107032  0.75107032
[12,]  0.53042767  0.53042767  0.53042767  0.53042767  0.53042767
[13,] -0.51163612 -0.51163612 -0.51163612 -0.51163612 -0.51163612
[14,] -0.44396048 -0.44396048 -0.44396048 -0.44396048 -0.44396048
[15,]  0.57018745  0.57018745  0.57018745  0.57018745  0.57018745
[16,]  0.70480284  0.70480284  0.70480284  0.70480284  0.70480284
[17,] -0.36674283 -0.36674283 -0.36674283 -0.36674283 -0.36674283
[18,] -0.81826607 -0.81826607 -0.81826607 -0.81826607 -0.81826607
[19,]  0.53145184  0.53145184  0.53145184  0.53145184  0.53145184
[20,]  0.24568385  0.24568385  0.24568385  0.24568385  0.24568385
[21,] -0.10610402 -0.10610402 -0.10610402 -0.10610402 -0.10610402
[22,] -0.78650748 -0.78650748 -0.78650748 -0.78650748 -0.78650748
[23,]  0.04269423  0.04269423  0.04269423  0.04269423  0.04269423
[24,]  0.14704698  0.14704698  0.14704698  0.14704698  0.14704698
[25,]  0.28340166  0.28340166  0.28340166  0.28340166  0.28340166