thr3ads.net - similar to: "How do you scale variables which consist of tokens"

Displaying 20 results from an estimated 1200 matches similar to: "How do you scale variables which consist of tokens"

2012 Apr 19

Compare String Similarity

Dear All, I need to estimate the level of similarity of two strings. For example: string1 <- c("depending","audience","research", "school"); string2 <- c("audience","push","drama","button","depending"); The words in string may occur in different order though. What function would you recommend to use
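One common way to compare two word lists where order does not matter is the Jaccard index (shared words divided by all distinct words). The sketch below applies it to the two vectors from the question. It is not taken from the thread, and the function name `jaccard_similarity` is a hypothetical helper chosen for illustration.

```python
def jaccard_similarity(a, b):
    """Return |A & B| / |A | B| for two token collections; order is ignored."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        # Convention assumed here: two empty collections count as identical.
        return 1.0
    return len(set_a & set_b) / len(union)


# The two word vectors from the question above.
string1 = ["depending", "audience", "research", "school"]
string2 = ["audience", "push", "drama", "button", "depending"]

# Two shared words out of seven distinct words overall.
similarity = jaccard_similarity(string1, string2)
print(similarity)
```

The score lies between 0 (no shared words) and 1 (identical word sets). It ignores repeated words, so a cosine similarity on word counts may be a better fit when repetition matters.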

2012 Mar 27

SVM. How to use categorical attributes?

Hi All, Here is the case. I want to build a classification model (SVM). Some of the variables for this model are categorical attributes which represent words (usually 3-10 words, the query for a Google search). For example:

search_id | query_words      |..| result
----------+------------------+--+-------
1         | how,to,grow,tree |..| 4
2

2012 Mar 26

normalization of multi-value string variable

Hi All, I need to normalize/scale a string variable which represents the interests of customers (e.g., 'cycling, rollerblading, swimming', etc.). Does anybody know how to do this? I then want to use it along with other numeric variables for SVM classification. I would appreciate any advice. -Alex
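A multi-value string like this is often turned into numeric input by giving each distinct token its own 0/1 column (a multi-hot encoding), which can then sit next to the other numeric variables. The sketch below shows the idea in plain Python. It is not the answer given in the thread. The helper name `multi_hot` is hypothetical, and every customer record apart from the quoted 'cycling, rollerblading, swimming' is made-up sample data.

```python
def multi_hot(records, sep=","):
    """Encode delimiter-separated token strings as 0/1 indicator rows."""
    # Split each record into cleaned, lower-cased tokens.
    token_lists = [
        [t.strip().lower() for t in record.split(sep) if t.strip()]
        for record in records
    ]
    # One column per distinct token, in sorted order so the layout is stable.
    vocabulary = sorted({t for tokens in token_lists for t in tokens})
    column = {token: i for i, token in enumerate(vocabulary)}

    rows = []
    for tokens in token_lists:
        row = [0] * len(vocabulary)
        for token in tokens:
            row[column[token]] = 1
        rows.append(row)
    return vocabulary, rows


# Hypothetical sample data; only the first record comes from the post.
customers = ["cycling, rollerblading, swimming", "swimming", "cycling, chess"]
vocabulary, matrix = multi_hot(customers)

print(vocabulary)
for row in matrix:
    print(row)
```

Because every column is already 0 or 1, these features need no further scaling themselves. The remaining numeric variables would usually be rescaled to a comparable range before fitting an SVM.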

2009 Jan 27

samba, ADS and privileges management

Hello list. I once had a samba server acting as a PDC, a mapping between my NT 'Domain admins' and Unix 'admins' groups, and everything worked perfectly. Now I got a new shiny samba server acting as a print server only, member of an AD domain, and I can't have the members of 'Domain admins' group manage printing drivers on the server, whereas the Administrator

2013 Nov 12

Data Analyst and Coordinator

Dear R-Sig-Jobs members, For its Executive Office in Brussels, The International Diabetes Federation (IDF) is looking to hire a Data Analyst & Coordinator with significant R experience. This person will join the Epidemiology and Public Health unit that sits within the Policy & Programmes department. They will be responsible for the management of IDF's high-profile Diabetes Atlas. They

2016 Jan 19

Statistician / Data Analyst in Brussels, Belgium

Dear R-Sig-Jobs members, For its Executive Office in Brussels, the International Diabetes Federation (IDF) is looking to hire a Statistician and Data Analyst to join the Policy & Programmes department. This person will be responsible for the management of the high-profile IDF Diabetes Atlas (www.diabetesatlas.org). They will coordinate the collection, analysis, interpretation and presentation

2000 Sep 29

Is it R or I?

Salutations: I have been trying to translate a S-PLUS/ArcInfo (GIS software) application that I wrote on a SGI (IRIX) platform to public domain R and GrassGIS on a Linux platform. I am almost on the verge of abandoning it as I find R to be rather unstable, slow and frustrating. I enclose a section of my code for R experts to examine hoping that they'll point out that all the above three are

2008 Nov 12

Two problems with Samba in AD realm

Hello list. I recently moved to an AD environment. I'm still keeping a samba server to make my cups-managed printers available to windows users, rather than duplicating configuration with a Windows print service. But I'm facing two problems, probably due to the way we manage AD. First, all my hosts belong to a Unix-managed DNS domain (msr-inria.inria.fr), not to the windows-managed

2020 Apr 29

[Posible SPAM] Re: Stopwords: Topic modelling con LDA

Hello, I have just computed tf-idf and a question has come up. Is there an idf or tf-idf value that would be considered a threshold for deciding whether a word is very common or not? The idf values in my data range from 0 to 3.78 and the tf-idf values from 0 to 0.07. Regards. On Tue, 28 April 2020, 12:53, Carlos Ortega wrote: > Hello, > To begin with, I would remove them to see which other topics appear.

2013 Mar 03

Added code and tests for the tf-idf weighting scheme.

Hello guys. I have sent a pull request for the code and tests of the Tf-Idf weighting scheme. Please do let me know if any changes are required. Meanwhile, I'll begin working on implementing normalizations which require additional statistics and on the DFR schemes. https://github.com/xapian/xapian/pull/6 On Tue, Feb 26, 2013 at 5:30 PM, <xapian-devel-request at lists.xapian.org> wrote: >

2011 Jul 17

How to speed up interpolation

df is a very large data frame with arrival estimates for many flights (DF$flightfact) at random times (df$PredTime). The error of the estimate is df$dt. My problem is that I want to know the prediction error at each minute before landing. This code works, but is very slow, and dominates everything. I tried using split(), but that rapidly ate up my 12 GB of memory. So, is there a better R way of

2017 Feb 08

Re: Trouble moving OVMF guest to new host

On Wed, 2017-02-08 at 20:13 +0300, Aleksei wrote: > I'm running libvirt in user session and libvirt creates VARS part of OVMF in ~/.config/libvirt/qemu/nvram/ > Check your xml, there should be lines like this: > <os> > <type arch='x86_64' machine='pc-q35-2.7'>hvm</type> > <loader readonly='yes'

2007 Jul 10

Article score calculations for Boolean and MultiTerm Queries, and customization options

Hi, I have some questions about the way that documents are scored by the Boolean and MultiTerm Queries, and about possible options for custom scoring articles. I am working on a project experimenting with different methods of automatically generating queries and the scoring mechanisms behind Lucene and Ferret have been perplexing us. From looking at the Lucene explanation at (

2013 Mar 26

Merging of the TfIdf patch

Hello Guys. I have updated the code, tests, documentation, makefile entries and the registry entry of the TfIdf patch as per the feedback. Please do let me know if any additional changes are required before the patch can be merged. -Regards -Aarsh On Sun, Mar 3, 2013 at 2:50 PM, aarsh shah <aarshkshah1992 at gmail.com> wrote: > Hello guys. I have sent a pull request for the code and

2006 Sep 20

Understanding boost ?

Hi, I'm confused about managing field boosting ... I have set the :boost for the :name field in my docs to 10, via :boost => 10 Then I performed a search for 'keith' over all fields with *:(keith*), expecting a doc with Keith in the :name field to come out on top. But another doc with Keith mentioned in other fields (:comments, :address) scored higher. I

2013 Feb 19

Implementing tf-idf weighting scheme in Xapian

Hello guys. I just read up about tf-idf schemes and want to implement it in Xapian (with some frequently used normalizations), as it will also give me a good hang of implementing a weighting scheme before I start working on implementing DFR schemes. I read the following as references and I think I've understood it well and can write the hack :- 1.)

2007 Feb 03

Boost Sorting with Acts_as_ferret?

Hey guys, Simple question here. I have a single index of recipes, from which I'm looking at the following fields: Name, Ingredient Text, Tags, and Description. The key is, I want to show all the results that come from Name, before I show any of the results from Ingredient Text, Tags, or Description. I tried doing this: acts_as_ferret :fields => { :name =>

2009 Jun 18

[LLVMdev] LLVM Dev Mtg

I saw Tanya's note about this year's LLVM developers' meeting. It would be interesting if the time could coincide with IDF 2009 (Sep. 22-24). I'm sure some LLVM folks would be interested in attending IDF as well. Perhaps hold the LLVM conference the day before or after IDF? -Dave

2017 Mar 16

GSoC-2017 Introduction and Project Discussion

Hello, I'm Shivang Bansal, a 3rd year Computer Science Engineering undergraduate at Institute of Engineering & Technology in Lucknow, India. This mail is an expression of my interest for Google Summer of Code program of this year. I want to apologize for getting in so late. Actually I would have contacted earlier, but sudden demise of my Grandfather disabled me in doing so. I am

2013 Oct 20

Errore : requires numeric/complex matrix/vector arguments

Dear R users, I'm a new user of R. I'm trying to do an LM test and there is this type of error: Error in t(mX) %*% mX : requires numeric/complex matrix/vector arguments. To be clear, I write down the code, in which mY (126,1), mX (126,1) and mZ (126,1) are matrices.
LMTEST <- function(mY, mX, mZ)
# mY, mX, mZ must be matrices!
# returns the LM test statistic and the degree of freedom
{
iT =
