Displaying 20 results from an estimated 1200 matches similar to: "How do you scale variables which consist of tokens"
2012 Apr 19
1
Compare String Similarity
Dear All,
I need to estimate the level of similarity of two strings. For example:
string1 <- c("depending","audience","research", "school");
string2 <- c("audience","push","drama","button","depending");
The words in each string may occur in a different order, though. What function would you recommend to use
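One order-independent way to compare the two vectors above is the Jaccard index: the number of shared words divided by the number of distinct words overall. The sketch below is only one possible answer; the helper name `jaccard` is an assumption for illustration and is not taken from the thread.

```r
# Jaccard similarity between two character vectors, ignoring word order.
jaccard <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  # Two empty sets are treated as identical by convention here.
  if (length(a) == 0 && length(b) == 0) return(1)
  length(intersect(a, b)) / length(union(a, b))
}

string1 <- c("depending", "audience", "research", "school")
string2 <- c("audience", "push", "drama", "button", "depending")

# 2 shared words out of 7 distinct words: 2/7, about 0.286
jaccard(string1, string2)
```

Because the vectors are treated as sets, repeated words and word order have no effect on the score.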
2012 Mar 27
2
SVM. How to use categorical attributes?
Hi All,
Here is the case. I want to build a classification model (SVM). Some of the variables for this model are categorical attributes that represent words (usually 3-10 words, the query for a Google search). For example:
search_id | query_words |..| result
-----------+----------------------------------+--+--------
1 | how,to,grow,tree |..| 4
2
2012 Mar 26
1
normalization of multi-value string variable
Hi All,
I need to normalize/scale a string variable which represents the interests of customers (e.g., 'cycling, rollerblading, swimming', etc.).
Does anybody know how to do this? I then want to use it along with other numeric variables for SVM classification.
I would appreciate any advice.
-Alex
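A common way to make such a multi-value string usable next to numeric predictors is to expand it into one 0/1 indicator column per distinct token. The sketch below assumes the values are comma-separated; the example data and the names `interests`, `tokens`, `vocab` and `indicator` are hypothetical.

```r
# Hypothetical example data: one comma-separated string per customer.
interests <- c("cycling, rollerblading, swimming",
               "swimming",
               "cycling, chess")

# Split on commas and trim surrounding whitespace.
tokens <- lapply(strsplit(interests, ",", fixed = TRUE), trimws)

# Vocabulary: every distinct token, sorted for a stable column order.
vocab <- sort(unique(unlist(tokens)))

# One row per customer, one 0/1 column per token.
indicator <- do.call(rbind, lapply(tokens, function(tok) as.numeric(vocab %in% tok)))
colnames(indicator) <- vocab
indicator
```

The indicator columns already lie in [0, 1], so they can be combined with the other numeric variables via `cbind()` once those have been scaled to a comparable range.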
2009 Jan 27
0
samba, ADS and privileges management
Hello list.
I once had a samba server acting as a PDC, a mapping between my NT
'Domain admins' and Unix 'admins' groups, and everything worked perfectly.
Now I got a new shiny samba server acting as a print server only, member
of an AD domain, and I can't have the members of 'Domain admins' group
manage printing drivers on the server, whereas the Administrator
2013 Nov 12
0
Data Analyst and Coordinator
Dear R-Sig-Jobs members,
For its Executive Office in Brussels, The International Diabetes
Federation (IDF) is looking to hire a Data Analyst & Coordinator with
significant R experience. This person will join the Epidemiology and
Public Health unit that sits within the Policy & Programmes
department. They will be responsible for the management of IDF's
high-profile Diabetes Atlas. They
2016 Jan 19
0
Statistician / Data Analyst in Brussels, Belgium
Dear R-Sig-Jobs members,
For its Executive Office in Brussels, the International Diabetes Federation
(IDF) is looking to hire a Statistician and Data Analyst to join the Policy
& Programmes department. This person will be responsible for the management
of the high-profile IDF Diabetes Atlas (www.diabetesatlas.org). They will
coordinate the collection, analysis, interpretation and presentation
2000 Sep 29
0
Is it R or I?
Salutations:
I have been trying to translate an S-PLUS/ArcInfo (GIS software) application
that I wrote on a SGI (IRIX) platform to public domain R and GrassGIS on
a Linux platform. I am almost on the verge of abandoning it as I find R to be
rather unstable, slow and frustrating.
I enclose a section of my code for R experts to examine
hoping that they'll point out that all the above three are
2008 Nov 12
1
Two problems with Samba in AD realm
Hello list.
I recently moved to an AD environment. I'm still keeping a samba server
to make my cups-managed printers available to windows users, rather than
duplicating configuration with a Windows print service. But I'm facing
two problems, probably due to the way we manage AD.
First, all my hosts belong to a Unix-managed DNS domain
(msr-inria.inria.fr), not to the windows-managed
2017 Feb 08
1
Re: Trouble moving OVMF guest to new host
On Wed, 2017-02-08 at 20:13 +0300, Aleksei wrote:
> I'm running libvirt in user session and libvirt creates VARS part of OVMF in ~/.config/libvirt/qemu/nvram/
> Check your xml, there should be lines like this:
> <os>
> <type arch='x86_64' machine='pc-q35-2.7'>hvm</type>
> <loader readonly='yes'
2020 Apr 29
2
[Possible SPAM] Re: Stopwords: Topic modelling with LDA
Hello,
I have just computed tf-idf and a question has come up. Is there an idf or
tf-idf value that would be considered a threshold for deciding whether a
word is very common or not? The idf values in my data range from 0 to 3.78
and the tf-idf values from 0 to 0.07.
Regards
On Tue, 28 April 2020, 12:53, Carlos Ortega wrote:
> Hello,
> To start with, I would remove them to see which other topics appear.
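On the threshold question, it may help to see where the idf range comes from. Assuming the common definition idf = log(N / df), where N is the number of documents and df is the number of documents containing the term, a term that occurs in every document gets idf 0 and a term that occurs in only one gets log(N). Any cutoff is therefore relative to the corpus. The sketch below uses a hypothetical toy corpus and an arbitrary 75% document-share cutoff chosen purely for illustration; it is not a recommended value.

```r
# Hypothetical toy corpus: each document is a vector of tokens.
docs <- list(
  c("the", "cat", "sat"),
  c("the", "dog", "ran"),
  c("the", "cat", "ran"),
  c("the", "bird", "sang")
)

n_docs <- length(docs)

# Document frequency: number of documents containing each term.
doc_freq <- table(unlist(lapply(docs, unique)))

# idf = log(N / df): 0 for a term in every document, log(N) for a term in one.
idf <- log(n_docs / as.numeric(doc_freq))
names(idf) <- names(doc_freq)

# Assumed cutoff for illustration only: flag terms present in at least 75% of documents.
common_terms <- names(idf)[idf <= log(1 / 0.75)]
common_terms
```

Expressing the cutoff as a share of documents rather than a raw idf value keeps it comparable across corpora of different sizes.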
2013 Mar 03
0
Added code and tests for the tf-idf weighting scheme.
Hello guys. I have sent a pull request for the code and tests of the Tf-Idf
weighting scheme.
Please do let me know if any changes are required. Meanwhile, I'll begin
working on implementing normalizations which require additional statistics
and on the DFR schemes.
https://github.com/xapian/xapian/pull/6
On Tue, Feb 26, 2013 at 5:30 PM, <xapian-devel-request at lists.xapian.org> wrote:
>
2011 Jul 17
1
How to speed up interpolation
df is a very large data frame with arrival estimates for many flights
(df$flightfact) at random times (df$PredTime). The error of the estimate
is df$dt.
My problem is that I want to know the prediction error at each minute
before landing. This code works, but is very slow, and dominates
everything. I tried using split(), but that rapidly ate up my 12 GB of
memory. So, is there a better R way of
2007 Jul 10
0
Article score calculations for Boolean and MultiTerm Queries, and customization options
Hi,
I have some questions about the way that documents are scored by the Boolean
and MultiTerm Queries, and about possible options for custom scoring
articles. I am working on a project experimenting with different methods of
automatically generating queries and the scoring mechanisms behind Lucene
and Ferret have been perplexing us.
From looking at the Lucene explanation at (
2013 Mar 26
1
Merging of the TfIdf patch
Hello guys. I have updated the code, tests, documentation, makefile entries
and the registry entry of the TfIdf patch as per the feedback. Please do
let me know if any additional changes are required before the patch can be
merged.
-Regards
-Aarsh
On Sun, Mar 3, 2013 at 2:50 PM, aarsh shah <aarshkshah1992 at gmail.com> wrote:
> Hello guys. I have sent a pull request for the code and
2006 Sep 20
8
Understanding boost ?
Hi,
I'm confused about managing field boosting ...
I have set the :boost for the :name field in my docs to 10, via :boost
=> 10
Then I performed a search for 'keith' over all fields with
*:(keith*), expecting a doc with Keith in the :name field to come out on
top. But another doc with Keith mentioned in other fields (:comments,
:address) scored higher.
I
2013 Feb 19
2
Implementing tf-idf weighting scheme in Xapian
Hello guys. I just read up about tf-idf schemes and want to implement them in
Xapian (with some frequently used normalizations), as it will also give me a
good grasp of implementing a weighting scheme before I start working on
implementing DFR schemes.
I read the following as references and I think I've understood them well and
can write the hack:
1.)
2007 Feb 03
2
Boost Sorting with Acts_as_ferret?
Hey guys,
Simple question here.
I have a single index of recipes, from which I'm looking at the
following fields: Name, Ingredient Text, Tags, and Description.
The key is, I want to show all the results that come from Name,
before I show any of the results from Ingredient Text, Tags, or
Description.
I tried doing this:
acts_as_ferret :fields => {
:name =>
2009 Jun 18
0
[LLVMdev] LLVM Dev Mtg
I saw Tanya's note about this year's LLVM developers' meeting.
It would be interesting if the time could coincide with IDF
2009 (Sep. 22-24). I'm sure some LLVM folks would be interested
in attending IDF as well. Perhaps hold the LLVM conference the
day before or after IDF?
-Dave
2017 Mar 16
2
GSoC-2017 Introduction and Project Discussion
Hello,
I'm Shivang Bansal, a 3rd year Computer Science Engineering undergraduate
at Institute of Engineering & Technology in Lucknow, India. This mail is an
expression of my interest in this year's Google Summer of Code program. I
want to apologize for getting in touch so late. I would have contacted you
earlier, but the sudden demise of my grandfather prevented me from doing so.
I am
2013 Oct 20
3
Errore : requires numeric/complex matrix/vector arguments
Dear R users, I'm a new user of R. I'm trying to do an LM test and I get this error: Error in t(mX) %*% mX : requires numeric/complex matrix/vector arguments.
To be clear, here is the code, in which mY (126,1), mX (126,1) and mZ (126,1) are matrices.
LMTEST <- function(mY, mX, mZ)
# mY, mX, mZ must be matrices!
# returns the LM test statistic and the degree of freedom
{
iT =