Displaying 20 results from an estimated 10000 matches similar to: "Data cleaning & Data preparation, what do R users want?"
2017 Nov 30
2
Data cleaning & Data preparation, what do R users want?
Hi again,
Typo in the last email. Should read "about 40 standard deviations".
Jim
On Thu, Nov 30, 2017 at 10:54 AM, Jim Lemon <drjimlemon at gmail.com> wrote:
> Hi Robert,
> People want different levels of automation in the software they use.
> What concerns many of us is the desire for the function
>
2017 Nov 29
0
Data cleaning & Data preparation, what do R users want?
I don't think my view is of interest to many, so offlist.
I reject this:
" I would consider data analysis work to be three stages: data preparation,
statistical analysis, and producing the report."
For example, there is no such thing as "outliers" -- data to be removed as
part of cleaning/preparation -- without a statistical model to be an
"outlier" **from**,
2017 Nov 29
0
Data cleaning & Data preparation, what do R users want?
Hi Robert,
People want different levels of automation in the software they use.
What concerns many of us is the desire for the function
"figure-out-what-this-data-is-import-it-and-get-rid-of-bad-values".
Such users typically want something that justifies its use by being
written by someone who seems to know what they're doing and lots of
other people use it. One advantage of many R
2017 Nov 29
0
Data cleaning & Data preparation, what do R users want?
Great question. What do I want? I want my co-workers to stop using Excel
spreadsheets for data entry, storage, and sharing! I want them to
understand the value of data discipline. But alas . . . .
I work in a county health department in the US. Between dplyr, stringr,
grep, grepl, and the base R read.*() functions, I'm doing OK.
I need to learn more about APIs, so I can see if I can make R
2017 Nov 30
0
Data cleaning & Data preparation, what do R users want?
I would agree that getting data into R from various sources is the biggest
pain point. Even if there is an api, the results are not always consistent
and you have to do lots of dimension checking to get it right. Or there
isn't an open api at all and you have to hack it by web scraping or
otherwise- http://enpiar.com/2017/08/11/one-hour-package/
On Thu, Nov 30, 2017 at 1:00 AM, Jim Lemon
2017 Dec 11
0
Data cleaning & Data preparation, what do R users want?
Dominik (and others)
If it is indeed still the biggest pain point, even in 2017, then maybe we
can do something about that, with more efforts at different user interface
design and try-outs with them on specialized datasets.
[ The fact that in some specialties, such as clinical trials, for example,
getting access to public domain datasets (and not having to use a tiny
"toy" dataset,
2017 Nov 21
3
Best way to study internals of R ( mix of C, C++, Fortran, and R itself)?
How difficult is it to get a good feel for the internals of R, if you want
to learn the general code base, but also the CPU intensive stuff (much of
it in C or Fortran?) and the ways in which the general code and the CPU
intensive stuff is connected together?
R has a very large audience, but my understanding is that only a small
group have a good understanding of the internals (and some of those
2009 Oct 06
3
R on Linux, and R on Windows , any difference in maturity+stability?
Will R have more glitches on one operating system as opposed to
another, or is it pretty much the same?
robert
2007 Jun 08
4
Tools For Preparing Data For Analysis
As noted on the R-project web site itself ( www.r-project.org ->
Manuals -> R Data Import/Export ), it can be cumbersome to prepare
messy and dirty data for analysis with the R tool itself. I've also
seen at least one S programming book (one of the yellow Springer ones)
that says, more briefly, the same thing.
The R Data Import/Export page recommends examples using SAS, Perl,
Python, and
2017 Nov 21
0
Best way to study internals of R ( mix of C, C++, Fortran, and R itself)?
1) What is easy for one person may be very hard for another, so your question is really unanswerable. You do need to know C and Fortran to get through the source code. Get started soon reading the R Internals document if it sounds interesting to you... you are bound to learn something even if you don't stick with it. If you have questions about the internals though, you should read the Posting
2009 Jan 09
1
survey statistics, rate/proportions with standard errors
what does R have to compare with, say, proc surveymeans, estimate survey
means/proportions with standard errors, using Taylor methods?
2016 Oct 14
3
Parallel IR [PIR] --- BoF preparation discussion
Dear community,
In preparation for the BoF on Parallel IR at the US developers meeting
we would like to collect feedback from the whole community. The
concerns, ideas, etc. will be summarized in the BoF and should provide a
good starting point for a discussion.
We know that over the years the topic of a parallel extension for LLVM
was discussed on the mailing list [0, 1, 2], workshops [3, 4] or
2011 Feb 11
4
When is *interactive* data visualization useful to use?
Hello all,
Before getting to my question, I would like to apologize for asking this
question here. My question is not directly an R question, however, I still
find the topic relevant to R community of users - especially due to only
*partial* (current) support for interactive data visualization (see here:
http://cran.r-project.org/web/views/Graphics.html where with iplots we are
waiting for
2010 Dec 07
1
Statistical Analysis with R Beginner's Guide Book
Hi Everyone,
I'm writing to announce my new R beginner's guide book and answer questions
related to it.
The primary focus of Statistical Analysis with R is helping new users become
accustomed to R and empowering them to apply R to suit their own needs. It
is a beginner's guide written for a broad audience and should be well
received by businesspeople, IT professionals, researchers,
2009 Aug 19
2
mild and extreme outliers in boxplot
dear all,
could somebody tell me how I can plot mild outliers as a circle(?) and
extreme outliers as an asterisk(*) in a box-whisker plot?
Thanks very much in advance
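
One possible way to approach this, sketched in Python rather than R and only as an illustration: the post does not define "mild" and "extreme", so the sketch assumes the common Tukey-fence convention (mild = beyond 1.5 x IQR from the quartiles, extreme = beyond 3 x IQR). The function name classify_outliers and the MARKERS table are hypothetical; the labels would then be mapped to plotting symbols (circle, asterisk) in whatever plotting tool is used.

```python
import statistics

def classify_outliers(values):
    """Label each value 'inside', 'mild', or 'extreme' using Tukey-style fences.

    Assumption: mild = outside 1.5*IQR, extreme = outside 3*IQR.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    inner_lo, inner_hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outer_lo, outer_hi = q1 - 3.0 * iqr, q3 + 3.0 * iqr

    labels = []
    for v in values:
        if v < outer_lo or v > outer_hi:
            labels.append("extreme")
        elif v < inner_lo or v > inner_hi:
            labels.append("mild")
        else:
            labels.append("inside")
    return labels

# Hypothetical mapping from label to plotting symbol (circle vs. asterisk).
MARKERS = {"mild": "o", "extreme": "*"}

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 17, 40]
labels = classify_outliers(data)
```

With the labels in hand, the box itself can be drawn without its default outlier points, and the mild and extreme values overplotted separately with their own symbols.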
2013 Aug 26
10
[LLVMdev] Adding diversity for security (and testing)
Greetings LLVM Devs!
I am a PhD student in the Secure Systems and Software Lab at UC
Irvine. We have been working on adding randomness into code generation
to create a diverse population of binaries. This diversity prevents
code-reuse attacks such as return-oriented-programming (ROP) by
denying the attacker information about the exact code layout. ROP has
been used in several high-profile recent
2012 Oct 30
1
peer-reviewed (or not) publications on R
Dear Friends,
I'm contributing to a paper on a new R package for a clinical (medicine,
ophthalmology) audience, and part of the mission is to encourage people who
might be occasional users of Excel or SPSS, to become more familiar with R.
I'd really appreciate any pointers to more recent papers that describe R,
its growth (statistics on user base, number of packages, volume of help
2007 Jun 13
3
Awk and Vilno
In clinical trial data preparation and many other data situations, the
statistical programmer needs to merge and re-merge multiple input
files countless times. A syntax for merging files that is clear and
concise is very important for the statistical programmer's
productivity.
Here is how Vilno does it:
inlist dataset1 dataset2 dataset3 ;
joinby variable1 variable2 where ( var3<=var4 ) ;
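
For comparison, here is a rough Python analogue of the merge described above. It is only a sketch under stated assumptions: the snippet does not say what kind of join "joinby" performs, so this assumes an inner join on the listed key variables with the "where" condition applied to the joined rows. The helper join_by and the sample rows are hypothetical and are not part of Vilno.

```python
def join_by(datasets, keys, where=None):
    """Inner-join several datasets (lists of dicts) on the given key columns.

    Assumption: joinby is treated as an inner join; 'where' filters joined rows.
    """
    result = datasets[0]
    for right in datasets[1:]:
        # Index the right-hand dataset by its key tuple for fast lookup.
        index = {}
        for row in right:
            index.setdefault(tuple(row[k] for k in keys), []).append(row)
        merged = []
        for left_row in result:
            for right_row in index.get(tuple(left_row[k] for k in keys), []):
                merged.append({**left_row, **right_row})
        result = merged
    if where is not None:
        result = [row for row in result if where(row)]
    return result

# Made-up sample data using the variable names from the snippet.
dataset1 = [{"variable1": 1, "variable2": "a", "var3": 5},
            {"variable1": 2, "variable2": "b", "var3": 9}]
dataset2 = [{"variable1": 1, "variable2": "a", "var4": 7},
            {"variable1": 2, "variable2": "b", "var4": 3}]
dataset3 = [{"variable1": 1, "variable2": "a", "var5": "x"},
            {"variable1": 2, "variable2": "b", "var5": "y"}]

merged = join_by([dataset1, dataset2, dataset3],
                 keys=["variable1", "variable2"],
                 where=lambda row: row["var3"] <= row["var4"])
```

Only the first key pair survives here, because the second fails the var3<=var4 condition.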
2009 Jan 08
3
Ashlee Vance's article on R in the New York Times
Ashlee Vance's article on R in the New York Times.
This is typical of the New York Times. Because they get to coast on the
prestige and reputation of their brand, they have a history of just this
sort of journalistic sloppiness. Whether it's the author or the editor at
fault doesn't really matter, they do this screw-up all the time.
Look, if you write an article on the first page of
2007 Jun 09
2
How do you do an e-mail post that is within an ongoing thread?
That may sound like a stupid question, but if it confuses me, I'm sure
it confuses others as well. I've tried to find that information on the
R mail-group info pages, can't seem to find it. Is it something
obvious?
To begin a brand new discussion, you do your post as an e-mail sent to
r-help at stat.math.ethz.ch .
As I am doing right now.
How do I do an additional post that gets