thr3ads.net - similar to: "dealing with a messy dataset"

Displaying 20 results from an estimated 300 matches similar to: "dealing with a messy dataset"

2017 Oct 05

dealing with a messy dataset

It looks like fixed width. I just used the last position of each field to get the size and used the 'readr' package; > input <- "And XVIII 000214.5+450520 0.69 17 9 0.00 -8.7 26.8 6.44 6.78 < 6.65 -44 0.5 MESSIER031 0.6 1.54 + PAndAS-03 000356.4+405319 0.10 17 0.00 -3.6 27.8 4.38 2.8 MESSIER031

2017 Oct 05

dealing with a messy dataset

dear Jim, Thanks for your reply and your suggestion. I forgot to provide the header of the dataframe, here it is:
Byte-by-byte Description of file: lvg_table2.dat
Bytes Format Units Label Explanations

2017 Oct 05

dealing with a messy dataset

You should be able to use that header information to create the correct parameters to the read_fwf function to read in the data. Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. On Thu, Oct 5, 2017 at 11:02 AM, jean-philippe <jeanphilippe.fontaine at gssi.infn.it> wrote: > dear Jim, > > Thanks

2017 Oct 05

dealing with a messy dataset

Is this a fixed width format? If so, read.fwf() in base, or read_fwf() in the readr package will solve the problem. You may need to trim trailing spaces though. B. > On Oct 5, 2017, at 10:12 AM, jean-philippe <jeanphilippe.fontaine at gssi.infn.it> wrote: > > dear R-users, > > > I am facing a quite regular and basic problem when it comes to dealing with datasets,

2017 Oct 05

dealing with a messy dataset

dear Jim, Yes, I fixed the problem. Thanks again to all of you for your contributions! This worked:
start <- c(1, 20, 35, 41, 44, 48, 53, 59, 64, 70, 76, 78, 83, 88, 93, 114, 122, 127)
data1 <- read_fwf("lvg_table2.txt", skip = 70, fwf_widths(diff(start)))
Well, now I know how to deal with fixed-width files :) Cheers Jean-Philippe On 05/10/2017 18:42, jim
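The idea in this reply, turning field start positions into widths with diff(), can be sketched in base R. This is only an illustration under stated assumptions, not the poster's actual file: the two sample lines, the `starts` vector, the `line_end` value and the column names are all hypothetical, and base `read.fwf()` stands in for readr's `read_fwf()` so the sketch needs no extra package. Because diff() returns one fewer value than it is given, the position just past the last field is appended here so that the final column is not lost.

```r
# Hypothetical fixed-width sample: name (cols 1-10), id (cols 11-16), value (cols 17-21)
lines <- c(
  "And XVIII 000214 0.69",
  "PAndAS-03 000356 0.10"
)
tmp <- tempfile(fileext = ".txt")
writeLines(lines, tmp)

# Start column of each field, plus the position one past the end of the last field
starts   <- c(1, 11, 17)
line_end <- 22
widths   <- diff(c(starts, line_end))   # field widths derived from the start positions

# Read the file; strip.white trims the padding around character fields
dat <- read.fwf(
  tmp,
  widths      = widths,
  col.names   = c("name", "id", "value"),
  colClasses  = c("character", "character", "numeric"),
  strip.white = TRUE
)
unlink(tmp)
```

Reading `id` as character keeps its leading zeros, which a numeric column would drop.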

2011 May 04

first occurrence of a value?

Hello, A simple question perhaps, but how do I, within each row, find the first occurrence of the number 1 in the df below? I want to use this position to programmatically create the variable 'year'. I've come up with a solution, but I find it downright ugly. Is there a simpler way? I was hoping for a useful built-in function that I don't yet know about. df <-
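One possible answer to this question, sketched in base R. The poster's data frame is cut off in this listing, so the `df` below, its column names and the `years` vector are hypothetical stand-ins; the sketch assumes one 0/1 indicator column per year.

```r
# Hypothetical data: one row per subject, one 0/1 indicator column per year
df <- data.frame(
  y2001 = c(0, 1, 0, 0),
  y2002 = c(1, 1, 0, 0),
  y2003 = c(1, 0, 1, 0)
)

# match() gives the position of the first 1 in each row, or NA when a row has none
first_pos <- apply(df, 1, function(row) match(1, row))

# Map each position to the year that the corresponding column stands for
years   <- c(2001, 2002, 2003)
df$year <- years[first_pos]
```

Rows without any 1 end up with `year` set to NA rather than raising an error.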

2004 Sep 27

Peer Review - Linuxfest Presentation Outline

Hello all, I've been invited to do a presentation on Asterisk for the Ohio Linuxfest in Columbus this weekend (http://www.ohiolinux.org). Rough estimates are that nearly 500 people will be attending. I've been working on an outline for a couple of weeks and I would like to have some peer review of the information presented. I am going to have to cut down the content to make it fit in

2010 Jul 13

permutation-based FDR

Hello everyone, I have a small problem... I have about 9000 variables that I have tested against one particular variable with the Wilcoxon test. I calculated the p-value and wanted to correct it with the permutation-based FDR. I found an R function, comp.fdr(), that performs this correction, but it asks you to supply the variables with the observations and runs the test for you (as far as I understand). I only want

2019 Aug 29

Feature request: non-dropping regmatches/strextract

Thank you! I greatly appreciate your consideration, though of course it is up to you. I think many people switch to stringr/stringi simply because functions in those packages have some consistent design choices, for example, they do not drop empty/missing matches, which facilitates array-based programming. For example, in the cases where one needs to make a new column in a data.frame (data.table,

2015 Oct 07

Buildbot Noise

On 7 October 2015 at 22:44, Eric Christopher <echristo at gmail.com> wrote:
> I think this is a poor analogy. You're also ignoring the solution I gave you in my previous mail for slow bots.
I'm not ignoring it, I'm acting upon it. But it takes time. I don't have infinite resources.
> If you can't give some basic stability guarantees then the bot is only

2013 Nov 08

Date handling in R is hard to understand

Dear All, I usually work with time series data. The data may come in AM/PM format or on a 24-hour basis. R cannot tell the two formats apart automatically, at least in my experience. I have to tell R explicitly which time format the data is in. It seems that Pandas knows how to handle dates without being told the format. The problem arises when I try to shift a time by a certain amount. Say
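A minimal base R sketch of the situation described here: the format string is what distinguishes 12-hour from 24-hour input, and a shift is plain arithmetic once the value is parsed. The timestamps and the 90-minute shift are made-up examples. Setting the time locale to "C" is an assumption added so that `%p` reliably matches the English AM/PM markers.

```r
# Fix the time locale so that "%p" matches the English "AM"/"PM" markers
invisible(Sys.setlocale("LC_TIME", "C"))

# The same instant written in 12-hour and in 24-hour notation (made-up values)
t12 <- as.POSIXct("2013-11-08 02:30 PM", format = "%Y-%m-%d %I:%M %p", tz = "UTC")
t24 <- as.POSIXct("2013-11-08 14:30",    format = "%Y-%m-%d %H:%M",    tz = "UTC")

# POSIXct values count seconds, so a shift is an addition in seconds
shifted <- t12 + 90 * 60   # 90 minutes later
```

Both inputs parse to the same instant, so everything after parsing is independent of the original notation.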

2017 Nov 21

Best way to study internals of R ( mix of C, C++, Fortran, and R itself)?

How difficult is it to get a good feel for the internals of R, if you want to learn the general code base, but also the CPU intensive stuff ( much of it in C or Fortran?) and the ways in which the general code and the CPU intensive stuff is connected together? R has a very large audience, but my understanding is that only a small group have a good understanding of the internals (and some of those

2019 Aug 29

Feature request: non-dropping regmatches/strextract

Thank you, I am aware that there are packages that can accomplish this. I mentioned stringr::str_extract as a function that does not drop empty matches. I think that the behavior of regmatches(..., regexpr(...)) in base R should permit an option to prevent dropping of empty matches, both for the sake of consistency with the rest of the language (missing data does not yield a dropped index in other
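The dropping behavior discussed in this thread, and one common base R workaround, can be sketched as follows. The input vector and pattern are hypothetical, and the NA-filling idiom is just one way to keep the result aligned with the input; it is not the option the poster is requesting.

```r
# Hypothetical input: the middle element has no match
x <- c("id=12", "no digits here", "id=7")
m <- regexpr("[0-9]+", x)          # -1 marks elements without a match

# regmatches() silently drops the non-matching element
dropped <- regmatches(x, m)

# Workaround: start from NA and fill in only the positions that matched
kept <- rep(NA_character_, length(x))
kept[m != -1] <- regmatches(x, m)
```

`kept` has the same length as `x`, so it can be assigned directly as a new data frame column.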

2014 Dec 11

[LLVMdev] Debugging on unavailable hardware

Hi Renato, Thank you very much for the directions, I am going to recommit my fix. What hardware is used in the buildbots? Are these common boards like PandaBoard or something special? How much RAM is installed? Thanks, --Serge 2014-12-11 2:36 GMT+06:00 Renato Golin <renato.golin at linaro.org>: > On 10 December 2014 at 19:06, Serge Pavlov <sepavloff at gmail.com> wrote: > > In

2019 Aug 15

Feature request: non-dropping regmatches/strextract

Using a non-capturing group, "(?:...)" instead of "(...)", simplifies my example a bit:
> x <- c("Groucho <groucho at marx.com>", "<chico at marx.com>", "Harpo")
> strcapture("([[:alpha:]]+)?(?: *<([[:alpha:]. ]+@[[:alpha:]. ]+)>)?", x, proto=data.frame(Name=character(), Address=character(),

2008 Jun 05

Nouveau on GeForce 7950GTX (NV49)

Hi everyone, I just finished installing the current git of the nouveau driver for my NV49 based 7950GTX. I attempted this because I've been continually frustrated by the constantly degrading 2D performance of X under the nvidia and nv drivers. Webpages that do any kind of compositing in Mozilla will burn CPU all day long. I'm happy to say that I'm writing because I didn't have

2015 Jun 01

Mi script R es muy lento

Hi Carlos, well, to be honest my question was a general one: when you have not used data.table, it does not seem very intuitive to move from the programming style you are most used to (loops, matrix notation...) to that other one. I don't have a specific complex calculation yet, but I will have to do one... I just wanted to know whether it can be done, and it seems it can, so it will be a matter of soaking up a bit of

2024 Jul 15

reticulate + virtual environments

Hi, I am using reticulate and a virtual environment (not conda) to run Python scripts from RStudio. However, when I try to use my own (existing) virtual environment, reticulate does not use it. If I run my scripts, the installed modules (e.g., py_install("pandas", "mmstat4.hu.data")) are not found. I believe this happens because reticulate is using r-reticulate instead of

2024 Jul 15

reticulate + virtual environments

Have you tried https://rstudio.github.io/reticulate/ ? Generally speaking, complex nonstandard package specific questions such as yours rarely get a reply here -- there are 20,000+ packages (and counting) after all! As reticulate was created by and integrated with RStudio/Posit, I would think their site and help resources might be a better venue. Of course, if you don't use RStudio, you may

2017 Nov 21

Best way to study internals of R ( mix of C, C++, Fortran, and R itself)?

1) What is easy for one person may be very hard for another, so your question is really unanswerable. You do need to know C and Fortran to get through the source code. Get started soon reading the R Internals document if it sounds interesting to you... you are bound to learn something even if you don't stick with it. If you have questions about the internals though, you should read the Posting
