similar to: How do you scale variables which consist of tokens

Displaying 20 results from an estimated 1200 matches similar to: "How do you scale variables which consist of tokens"

2012 Apr 19
1
Compare String Similarity
Dear All, I need to estimate the level of similarity of two strings. For example: string1 <- c("depending","audience","research", "school"); string2 <- c("audience","push","drama","button","depending"); The words in string may occur in different order though. What function would you recommend to use
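One order-independent way to compare two word lists like these is the Jaccard index: the number of shared words divided by the number of distinct words overall. The following is a minimal Python sketch under that assumption, not the answer given in the thread; the function name `jaccard_similarity` is hypothetical.

```python
def jaccard_similarity(words_a, words_b):
    """Return |A ∩ B| / |A ∪ B| for two collections of words (order is ignored)."""
    set_a, set_b = set(words_a), set(words_b)
    if not set_a and not set_b:
        # Convention chosen for this sketch: two empty inputs count as identical.
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


# The two example vectors from the post.
string1 = ["depending", "audience", "research", "school"]
string2 = ["audience", "push", "drama", "button", "depending"]

similarity = jaccard_similarity(string1, string2)
print(similarity)  # 2 shared words out of 7 distinct words
```

The score ranges from 0 (no words in common) to 1 (identical word sets), and it ignores both word order and repeated words.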
2012 Mar 27
2
SVM. How to use categorical attributes?
Hi All, Here is the case. I want to build a classification model (SVM). Some of the variables for this model are categorical attributes which represent words (usually 3-10 words - a query for a search in Google). For example: search_id | query_words |..| result -----------+----------------------------------+--+-------- 1 | how,to,grow,tree |..| 4 2
2012 Mar 26
1
normalization of multi-value string variable
Hi All, I need to normalize/scale a string variable which represents the interests of customers (e.g., 'cycling, rollerblading, swimming' etc). Does anybody know how to do this? I then want to use it along with other numeric variables for SVM classification. I would appreciate any advice. -Alex
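A common way to make a multi-value string variable usable next to numeric variables is multi-hot encoding: one 0/1 column per distinct token. The following is a minimal Python sketch, assuming the values are comma-separated; the function name `multi_hot` and the second and third customer rows are hypothetical, and this is not the solution proposed in the thread.

```python
def multi_hot(rows, sep=","):
    """Turn delimited strings into a sorted vocabulary and a 0/1 indicator matrix."""
    # Split each row into trimmed, non-empty tokens.
    tokenized = [[tok.strip() for tok in row.split(sep) if tok.strip()] for row in rows]
    # One column per distinct token, in alphabetical order.
    vocabulary = sorted({tok for tokens in tokenized for tok in tokens})
    column = {tok: i for i, tok in enumerate(vocabulary)}
    matrix = []
    for tokens in tokenized:
        vector = [0] * len(vocabulary)
        for tok in tokens:
            vector[column[tok]] = 1
        matrix.append(vector)
    return vocabulary, matrix


# First row is the example from the post; the other two are made up for illustration.
interests = [
    "cycling, rollerblading, swimming",
    "swimming",
    "cycling, swimming",
]
vocabulary, matrix = multi_hot(interests)
print(vocabulary)
print(matrix)
```

Because every resulting column is already in the range 0 to 1, these indicators can sit beside numeric variables that have been rescaled to a similar range.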
2009 Jan 27
0
samba, ADS and privileges management
Hello list. I once had a samba server acting as a PDC, a mapping between my NT 'Domain admins' and Unix 'admins' groups, and everything worked perfectly. Now I got a new shiny samba server acting as a print server only, member of an AD domain, and I can't have the members of 'Domain admins' group manage printing drivers on the server, whereas the Administrator
2013 Nov 12
0
Data Analyst and Coordinator
Dear R-Sig-Jobs members, For its Executive Office in Brussels, The International Diabetes Federation (IDF) is looking to hire a Data Analyst & Coordinator with significant R experience. This person will join the Epidemiology and Public Health unit that sits within the Policy & Programmes department. They will be responsible for the management of IDF's high-profile Diabetes Atlas. They
2016 Jan 19
0
Statistician / Data Analyst in Brussels, Belgium
Dear R-Sig-Jobs members, For its Executive Office in Brussels, the International Diabetes Federation (IDF) is looking to hire a Statistician and Data Analyst to join the Policy & Programmes department. This person will be responsible for the management of the high-profile IDF Diabetes Atlas (www.diabetesatlas.org). They will coordinate the collection, analysis, interpretation and presentation
2000 Sep 29
0
Is it R or I?
Salutations: I have been trying to translate a S-PLUS/ArcInfo (GIS software) application that I wrote on a SGI (IRIX) platform to public domain R and GrassGIS on a Linux platform. I am almost on the verge of abandoning it as I find R to be rather unstable, slow and frustrating. I enclose a section of my code for R experts to examine hoping that they'll point out that all the above three are
2008 Nov 12
1
Two problems with Samba in AD realm
Hello list. I recently moved to an AD environment. I'm still keeping a samba server to make my cups-managed printers available to windows users, rather than duplicating configuration with a Windows print service. But I'm facing two problems, probably due to the way we manage AD. First, all my hosts belong to a Unix-managed DNS domain (msr-inria.inria.fr), not to the windows-managed
2017 Feb 08
1
Re: Trouble moving OVMF guest to new host
On Wed, 2017-02-08 at 20:13 +0300, Aleksei wrote: > I'm running libvirt in user session and libvirt creates VARS part of OVMF in ~/.config/libvirt/qemu/nvram/ > Check your xml, there should be lines like this: > <os> >     <type arch='x86_64' machine='pc-q35-2.7'>hvm</type> >     <loader readonly='yes'
2020 Apr 29
2
[Possible SPAM] Re: Stopwords: Topic modelling with LDA
Hello, I have just computed tf-idf and a question has come up. Is there an idf or tf-idf value that would be considered a threshold for deciding whether a word is very common or not? The idf values in my data range from 0 to 3.78 and the tf-idf values from 0 to 0.07. Regards. On Tue, 28 April 2020, 12:53, Carlos Ortega wrote: > Hello, > To begin with, I would remove them to see which other topics appear.
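For context on what an idf of 0 means: under the plain definition idf = ln(N / df), a term that occurs in every document gets exactly 0, and rarer terms get larger values. The following is a minimal Python sketch under that assumption; the poster's software may use a different idf variant, and the function name `idf_scores` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def idf_scores(documents):
    """Compute idf = ln(N / df) for every term in a list of tokenized documents."""
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        # Count each term at most once per document.
        doc_freq.update(set(tokens))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


# Toy corpus, made up for illustration.
docs = [
    ["the", "cat", "sat"],
    ["the", "dog", "ran"],
    ["the", "cat", "ran"],
]
idf = idf_scores(docs)
for term in sorted(idf, key=idf.get):
    print(term, round(idf[term], 3))
```

In this sketch "the" appears in all three documents and scores 0, so low-idf terms are the natural stopword candidates; where to draw the cut-off remains a judgment call that depends on the corpus.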
2013 Mar 03
0
Added code and tests for the tf-idf weighting scheme.
Hello guys. I have sent a pull request for the code and tests of the Tf-Idf weighting scheme. Please do let me know if any changes are required. Meanwhile, I'll begin working on implementing normalizations which require additional statistics and on the DFR schemes. https://github.com/xapian/xapian/pull/6 On Tue, Feb 26, 2013 at 5:30 PM, <xapian-devel-request at lists.xapian.org> wrote: >
2011 Jul 17
1
How to speed up interpolation
df is a very large data frame with arrival estimates for many flights (DF$flightfact) at random times (df$PredTime). The error of the estimate is df$dt. My problem is that I want to know the prediction error at each minute before landing. This code works, but is very slow, and dominates everything. I tried using split(), but that rapidly ate up my 12 GB of memory. So, is there a better R way of
2007 Jul 10
0
Article score calculations for Boolean and MultiTerm Queries, and customization options
Hi, I have some questions about the way that documents are scored by the Boolean and MultiTerm Queries, and about possible options for custom scoring articles. I am working on a project experimenting with different methods of automatically generating queries and the scoring mechanisms behind Lucene and Ferret have been perplexing us. From looking at the Lucene explanation at (
2013 Mar 26
1
Merging of the TfIdf patch
Hello Guys. I have updated the code, tests, documentation, makefile entries and the registry entry of the TfIdf patch as per the feedback. Please do let me know if any additional changes are required before the patch can be merged. -Regards -Aarsh On Sun, Mar 3, 2013 at 2:50 PM, aarsh shah <aarshkshah1992 at gmail.com> wrote: > Hello guys. I have sent a pull request for the code and
2006 Sep 20
8
Understanding boost ?
Hi, I'm confused about managing field boosting ... I have set the :boost for the :name field in my docs to 10, via :boost => 10 Then I performed a search for 'keith' over all fields via with *:(keith*), expecting a doc with Keith in the :name field to come out on top. But another doc with Keith mentioned in other fields (:comments, :address) scored higher. I
2013 Feb 19
2
Implementing tf-idf weighting scheme in Xapian
Hello guys. I just read up about tf-idf schemes and want to implement it in Xapian (with some frequently used normalizations) as it will also give me a good hang of implementing a weighting scheme before I start working on implementing DFR schemes. I read the following as references and I think I've understood it well and can write the hack :- 1.)
2007 Feb 03
2
Boost Sorting with Acts_as_ferret?
Hey guys, Simple question here. I have a single index of recipes, from which I'm looking at the following fields: Name, Ingredient Text, Tags, and Description. The key is, I want to show all the results that come from Name, before I show any of the results from Ingredient Text, Tags, or Description. I tried doing this: acts_as_ferret :fields => { :name =>
2009 Jun 18
0
[LLVMdev] LLVM Dev Mtg
I saw Tanya's note about this year's LLVM developers' meeting. It would be interesting if the time could coincide with IDF 2009 (Sep. 22-24). I'm sure some LLVM folks would be interested in attending IDF as well. Perhaps hold the LLVM conference the day before or after IDF? -Dave
2017 Mar 16
2
GSoC-2017 Introduction and Project Discussion
Hello, I'm Shivang Bansal, a 3rd year Computer Science Engineering undergraduate at Institute of Engineering & Technology in Lucknow, India. This mail is an expression of my interest for Google Summer of Code program of this year. I want to apologize for getting in so late. Actually I would have contacted earlier, but sudden demise of my Grandfather disabled me in doing so. I am
2013 Oct 20
3
Errore : requires numeric/complex matrix/vector arguments
Dear R users, I'm a new user of R. I'm trying to do an LM test and there is this type of error: Error in t(mX) %*% mX : requires numeric/complex matrix/vector arguments. To be clear, I write down the code, in which mY (126,1), mX (126,1) and mZ (126,1) are matrices. LMTEST <- function(mY, mX, mZ)#mY, mX, mZ must be matrices!#returns the LM test statistic and the degree of freedom{iT =