thr3ads.net - R help - [R] the features of the truth [Mar 2011]

Alexy Khrabrov

2011-Mar-02 05:28 UTC

[R] the features of the truth

This is really a statistics problem, so I wonder which R packages can be
employed best to solve and visualize it.

I run a lot of simulations to approach the truth.  The truth is a result of very
complex computations, and is a real number.  The closer it is to 0, the truthier
it is.

Each simulation has a set of features, some of which are not available for all
simulations.  Some of the features are numeric (week), some boolean (utility),
while others are factors.

Each simulation has the final value, the dm column in the data frame.  The names
of the simulations are rownames of the data frame, and feature names are the
column names.  Here's the dataframe:

http://dl.dropbox.com/u/9300701/Data/sf.dm.pos.r

You read it in R with

sf <- read.table("sf.dm.pos.r")
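
For anyone who would rather not download the file, here is a minimal sketch of a toy data frame with the same layout: simulation names as rownames, features as columns, plus the dm column. Only week, utility and dm are names from the real data; the factor column "kind", the row names and all the values are made up for illustration.

```r
# Toy stand-in for sf: rows are simulations, columns are features plus dm.
# "kind", the row names and every value below are hypothetical.
set.seed(1)
toy <- data.frame(
  week    = c(1, 2, 3, NA, 5, 6),                      # numeric, missing for one simulation
  utility = c(TRUE, FALSE, TRUE, TRUE, NA, FALSE),     # boolean, missing for one simulation
  kind    = factor(c("a", "b", "a", "c", "b", "c")),   # factor feature
  dm      = rnorm(6),                                  # final value of each simulation
  row.names = paste0("sim", 1:6)
)

str(toy)                    # column types
colSums(is.na(toy))         # number of simulations lacking each feature
toy[order(abs(toy$dm)), ]   # simulations sorted by closeness of dm to 0
```

The same three calls should work unchanged on sf once it is read in.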

Seeking the truth questions:

-- What kinds of GLM and other models can we run to determine which features
contribute most to the truth, i.e. to making dm closer to 0?

-- What kind of clustering can highlight the features that contribute most?

-- What kind of visualizations can be used to make it clear which features
affect the truth the most, and in which combinations?  What kind of color
visualizations are there to make the truth even clearer?
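
To make the first question concrete, here is a rough sketch of the kind of fit I have in mind, on simulated data rather than the real sf. It assumes that "closer to 0" can be measured by abs(dm) and that a Gaussian GLM is a tolerable first pass; both are assumptions, not claims that this is the right model. The column "kind" and the data-generating rule are made up.

```r
# Simulated data: utility shrinks |dm|, the other features have no effect.
# This rule is invented purely to have something to fit.
set.seed(42)
n <- 200
toy <- data.frame(
  week    = sample(1:10, n, replace = TRUE),
  utility = sample(c(TRUE, FALSE), n, replace = TRUE),
  kind    = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)
toy$dm <- rnorm(n, sd = ifelse(toy$utility, 0.2, 1))

# Model the distance of dm from 0 as a function of the features.
fit <- glm(abs(dm) ~ week + utility + kind, data = toy, family = gaussian)

# Negative coefficients mark features associated with dm closer to 0.
summary(fit)$coefficients
```

Note that glm() drops rows with missing values by default (na.action = na.omit), which matters here because not every feature is available for every simulation.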

Cheers,
Alexy

