similar to: Classifying large text corpora using R

Displaying 20 results from an estimated 500 matches similar to: "Classifying large text corpora using R"

2011 Sep 07
1
Fwd: FSelector and RWeka problem
Hi all, Although I sent the mail to Piotr, the author of FSelector, it should be better to ask here to let others know. Yanwei Begin forwarded message: From: Yanwei Song <yanwei.song@gmail.com> Date: September 7, 2011 4:41:58 PM EDT To: p.romanski@stud.elka.pw.edu.pl Subject: FSelector and RWeka problem Dear Piotr, Thanks for developing the FSelector package for us. I'm a new
2011 Nov 17
3
merging corpora and metadata
Greetings! I lose all my metadata after concatenating corpora. This is an example of what happens: > meta(corpus.1) MetaID cid fid selfirst selend fname 1 0 1 11 2169 2518 WCPD-2001-01-29-Pg217.scrb 2 0 1 14 9189 9702 WCPD-2003-01-13-Pg39.scrb 3 0 1 14 2109 2577 WCPD-2003-01-13-Pg39.scrb .... .... 17 0
2015 Jun 02
2
information.gain from the FSelector library
Hello, I am trying to compute the information gain for a set of variables (time series of differing lengths, e.g. arterial pressure, heart rate, ...) with respect to a binary variable (0: patient does not die; 1: patient dies). To do this I am going to use the information.gain function from the FSelector library. Do you know whether it is possible to compute the information gain for
2015 Jun 02
2
information.gain from the FSelector library
Hi Javier, I have a degree in Physics but I also have some medical training (a PhD in Neuroscience). Regards. On 2 June 2015, 15:35, <javier.ruben.marcuzzi en gmail.com> wrote: > Dear María Luz Morales > > What university degree do you hold? I ask in order to think about how best to > help you, whether from the medical side or the R side > > Javier Rubén
2012 Jun 12
0
Fwd: [Corpora-List] ACM SIGIR 2012 Workshop on Open Source Information Retrieval
This might be an interesting option for some of you! Regards, Parth. ---------- Forwarded message ---------- From: Andrew Trotman <andrew at cs.otago.ac.nz> Date: Tue, Jun 12, 2012 at 5:12 AM Subject: [Corpora-List] ACM SIGIR 2012 Workshop on Open Source Information Retrieval To: corpora at uib.no ACM SIGIR 2012 WORKSHOP ON OPEN SOURCE INFORMATION RETRIEVAL 16 August 2012, Portland,
2009 Aug 13
0
Efficiently Extracting Meta Data from TM Corpora
I'm using text miner (the "tm" package) to process large numbers of blog and message board postings (about 245,000). Does anyone have any advice for how to efficiently extract the meta data from a corpus of this size? TM does a great job of using MPI for many functions (e.g. tmMap) which greatly speed up the processing. However, the "meta" function that I need does not
2009 Sep 15
2
S3 objects in S4 slots
Hello, I am the maintainer of the stringkernels package and have come across a problem with using S3 objects in my S4 classes. Specifically, I have an S4 class with a slot that takes a text corpus as a list of character vectors. tm (version 0.5) saves corpora as lists with a class attribute of c("VCorpus", "Corpus", "list"). I don't actually need the
2018 Oct 04
2
Indexing Chinese?
My second (and hopefully last) question: is there any more news on indexing Chinese characters and words? Searching online mostly returns results from a decade ago or more, with nothing very conclusive. How close is this to possible? For the time being I'm doing some pre-processing on long strings of Chinese, breaking on punctuation in order to avoid errors. But I have some large corpora of
2012 Sep 20
3
(no subject)
From my book on corpus linguistics with R: # (10) Imagine you have two vectors a and b such that a<-c("d", "d", "j", "f", "e", "g", "f", "f", "i", "g") b<-c("a", "g", "d", "f", "g", "a", "f", "a",
2011 Sep 02
1
[PATCH 0/7] hivex + hivexml: Add byte runs for nodes and values
This changeset adds byte run reporters for node and value metadata in the hivexml program. This location reporting required several new ABI functions, which required new ABI return types. One benefit to the byte run functions is additional sanity checks, which have revealed new data or parsing errors when run on M57 patents images. An example error: Image: Charlie, 2009-12-11, available at
2011 Jul 31
1
Entropy based feature selection in R
I need to use entropy based feature selection to reduce term space while doing text classification. Are there any R packages available that would help me do this? I can also make do with chi squared based algorithm, if there are packages for that. Thanks in advance. Andy
2013 Mar 13
1
Feature selection package for text mining
Hi, I am doing a project on authorship attribution, where my term document matrix has around 10450 features. Can you please suggest a package where I can find a feature selection function to reduce the dimensions? Regards, Venkata Satish Basva
2016 Jun 03
2
Custom assembler subset
On Fri, Jun 3, 2016 at 11:53 AM, Ahmed Bougacha <ahmed.bougacha at gmail.com> wrote: > -llvmdev at cs.uiuc.edu, that list isn't in use anymore. > > On Wed, Jun 1, 2016 at 4:48 PM, Kenneth Adam Miller via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > Hello all, > > > > I would like to restrain the compiler that I build on my local box from >
2005 Oct 06
0
problem with classifying
Hello list I have a problem with classifying traffic from two providers, and about 600 users. I have the following situation: P1-\ | linux | --eth0-| box |-eth1 P2-/ | | P1 and P2 are coming from VLANs. I have 4 type traffic which I want to classify. The traffic is divided as follows: P1 - 100mbit from realm 0x70000 10mbit from realm ! 0x70000
2011 Jun 07
3
Classifying boolean values
Hi to all, I'm new to this forum and new to R. I have to build a tree classifier that has boolean values as response. When I build the tree with: echoknn.tree <- tree(class ~ ., data=echoknn.train) where "class" is a column of my dataset (echoknn.train) of boolean values, the result is a tree where leaf nodes are numbers in the range [0,1]; but this isn't the result that I
2006 Feb 05
1
classifying packets and ports
Hi, I've been working for a big corporate company as junior system engineer and getting nicely to understand HTB/iproute2/iptables etc. The ordinary users (about 500 users) can pop / smtp / skype out on the network, but I can't ssh out, cause they blocked the ports. Thought of being clever, I let my home linux listen on port 443 or 110 for ssh connection, but it won't connect, I
2004 Jun 30
1
classifying packets
hello! i have the following problem: i have a pc, which has one (eth0) NIC. eth0 is connected to two other linux machines acting as routers to the internet. i want to classify packets outgoing from my PC. i want to mark packets that are routed through 1. router with mark 1 and the other packets routed through router 2. with mark 2. i had following ideas: -using destination MAC -using route
2008 Jan 11
2
[LLVMdev] Classifying Operands & Def/Use Chains
Is there any way to discover whether a particular operand of a MachineInst participates in addressing? That is, if the MachineInst references memory, can I tell, given an operand, whether that operand is part of the address calculation for the instruction? Also, is there any reasonable way to get the set of machine instructions to which the output(s) of some machine instruction flows? The
2008 Jan 11
0
[LLVMdev] Classifying Operands & Def/Use Chains
On Jan 11, 2008, at 2:00 PM, David Greene wrote: > Is there any way to discover whether a particular operand of a > MachineInst > participates in addressing? That is, if the MachineInst references > memory, > can I tell, given an operand, whether that operand is part of the > address > calculation for the instruction? Nope, not that I know of. > Also, is there any
2008 Jan 11
1
[LLVMdev] Classifying Operands & Def/Use Chains
On Friday 11 January 2008 16:36, Chris Lattner wrote: > On Jan 11, 2008, at 2:00 PM, David Greene wrote: > > Is there any way to discover whether a particular operand of a > > MachineInst > > participates in addressing? That is, if the MachineInst references > > memory, > > can I tell, given an operand, whether that operand is part of the > > address >