Displaying 20 results from an estimated 600 matches similar to: "Newbie - Scrape Data From PDFs?"
2018 Jan 24
0
Newbie - Scrape Data From PDFs?
Hi Scott,
I have never done this myself but I read something recently on the
r-help distribution that was related.
I just did a quick search and found a few hits that might work for you.
1. https://medium.com/@CharlesBordet/how-to-extract-and-clean-data-from-pdf-files-in-r-da11964e252e
2. http://bxhorn.com/2016/extract-data-tables-from-pdf-files-in-r/
3.
2018 Jan 24
1
Newbie - Scrape Data From PDFs?
I think I would use pdftk to extract the form data. All subsequent
manipulation in R.
HTH
Ulrik
Eric Berger <ericjberger at gmail.com> wrote on Wed, 24 Jan 2018, 08:11:
> Hi Scott,
> I have never done this myself but I read something recently on the
> r-help distribution that was related.
> I just did a quick search and found a few hits that might work for you.
>
>
2023 Dec 29
2
Help request: Parsing docx files for key words and appending to a spreadsheet
Hi Andy:
I don't have an answer but I do have what I hope is some friendly advice. Generally the more information you can provide, the more likely you will get help that is useful. In your case you say that you tried several packages and they didn't do what you wanted. Providing that code, as well as why they didn't do what you wanted (be specific), would greatly facilitate things.
Happy
2007 Aug 04
7
Optimization in R
Hi all,
I've been working on improving R's optim() command, which does general purpose
unconstrained optimization. Obviously, this is important for many statistics
computations, such as maximum likelihood, method of moments, etc. I have
focused my efforts on the BFGS method, mainly because it best matches my
current projects.
Here's a quick summary of what I've done:
*
2011 Sep 16
4
Dual Authentication: Local and Active Directory
I was wondering if it was possible to get a Samba server that was
acting as an AD member server to also be able to authenticate local
users, or is stuck just serving AD users?
--
Aaron Clausen
mightymartianca at gmail.com
2007 May 13
1
symbolic differentiation in R
Hi all,
I wrote a symbolic differentiation function in R, which can be downloaded
here:
http://www.econ.upenn.edu/~clausen/computing/Deriv.R
http://www.econ.upenn.edu/~clausen/computing/Simplify.R
It is just a prototype. Of course, R already contains two differentiation
functions: D and deriv. However, these functions have several limitations.
They can probably be fixed, but
2007 May 13
2
relist, an inverse operator to unlist
Hi all,
I wrote a function called relist, which is an inverse to the existing
unlist function:
http://www.econ.upenn.edu/~clausen/computing/relist.R
Some functions need many parameters, which are most easily represented in
complex structures. Unfortunately, many mathematical functions in R,
including optim, nlm, and grad can only operate on functions whose domain is
a vector. R has a
2008 Mar 09
2
[patch] add=TRUE in plot.default()
Hi all,
As long as I've used R, add=TRUE hasn't worked in contexts like this:
f <- function(x) x^2
X <- seq(0, 1, by=1/4)
plot(f, col="blue")
plot(X, f(X), col="red", type="l", add=TRUE)
I attached a fix for version 2.6.2.
Cheers,
Andrew
-------------- next part --------------
diff --git a/src/library/graphics/R/plot.R
2010 Dec 27
3
openssh and keystroke timing attacks (again)
Hi all,
Over the past 10 years, there has been some discussion and several
patches concerning keystroke timing being revealed by the timing of
openssh packet network transmission. The issue is that keystroke
timing is correlated with the plaintext, and openssh users expect
their communications to be kept entirely secret.
Despite some excellent ideas and patches, such as Jason Coit's
2008 Feb 27
7
[Bug 14704] New: randr1.2 broken on apple powerbook 17"
http://bugs.freedesktop.org/show_bug.cgi?id=14704
Summary: randr1.2 broken on apple powerbook 17"
Product: xorg
Version: unspecified
Platform: Other
OS/Version: All
Status: NEW
Severity: normal
Priority: medium
Component: Driver/nouveau
AssignedTo: nouveau at lists.freedesktop.org
2007 Jun 22
6
Organising a Subversion repository.
Hello,
I appreciate that this topic may have come up before but a quick scan of
the archives did not reveal anything. If someone could kindly point me
to an archive entry that would also be great.
I want to use Puppet fileserver in conjunction with a svn repo; this way
I can commit files to svn and have the fileserver propagate them to the
various machines, I believe this is a common way of
2023 Dec 29
2
Help request: Parsing docx files for key words and appending to a spreadsheet
Hello
I am trying to work through a problem, but feel like I've gone down a
rabbit hole. I'd very much appreciate any help.
The task: I have several directories of multiple (some directories, up
to 2,500+) *.docx files (newspaper articles downloaded from Lexis+) that
I want to iterate through to append to a spreadsheet only those articles
that satisfy a condition (i.e., a specific
2011 Apr 20
2
Folder Encryption and Samba
It's been passed to me from on high that one of our file servers needs
to be encrypted. I'm considering either whole-disk encryption or
folder encryption. I like the latter since, well, it's less work.
Are there any particular folder encryption systems out there that the
folks around here can recommend?
--
Aaron Clausen
mightymartianca at gmail.com
2011 Jul 26
2
Incoming External Trust
I'm running a Samba domain (Samba 3.4.7) with OpenLDAP. I also have
a Server 2003 AD domain, and want to set up an external trust so that
AD users can access resources on the Samba domain, but not vice versa
(I believe this is called a one-way incoming external trust). I'm not
finding a lot of information out there that makes sense. Does anybody
have any hints?
--
Aaron Clausen
2002 Dec 02
1
IMQ
Has anybody got imq running on iptables 1.2.7a? The home page for imq only
seems to have a patch for 1.2.6a.
--
Aaron Clausen
_______________________________________________
LARTC mailing list / LARTC@mailman.ds9a.nl
http://mailman.ds9a.nl/mailman/listinfo/lartc HOWTO: http://lartc.org/
2020 Oct 07
1
Adding text to existing PDF's created with R
R 4.0.2
OS X
Colleagues
Does R have the capability of adding text (e.g., page numbers) to an existing PDF (previously created with R) -- other than adding this text, the PDF should be unchanged (except for a new filename).
The intent is as follows:
I have multiple PDFs that I eventually merge into a single PDF, separating each one with a separator page.
The content of the separator pages
2010 Mar 23
1
[patch] add is.set parameter to sample()
Hi all,
sample() has some well-documented undesirable behaviour.
sample(1:6, 1)
sample(2:6, 1)
...
sample(5:6, 1)
do what you expect, but
sample(6:6, 1)
sample(1:6, 1)
do the same thing.
This behaviour is documented:
If 'x' has length 1, is numeric (in the sense of 'is.numeric') and
'x >= 1', sampling _via_ 'sample' takes place from
2006 Dec 07
8
crash on repeated search
I have found another crash in ferret; this one just uses a regular
search. It's similar to an issue reported by Matt Schnitz a while ago,
but unlike his, mine does not go away if I turn off omit_norms. It does
go away if I turn on the garbage collector more often, but I'm not sure
that's a stable workaround under the circumstances.
This one isn't a
2007 Feb 16
8
term vector blues
I have a lot of crashes when I try to use term vectors. Here's an
example, which crashes pretty consistently. This problem seems to be
somewhat sensitive to platform... people on other OS's and ruby versions
have reported no error. I have seen this with ferret 0.10.13 and 0.10.14
on debian stable using ruby 1.8.2, but I have observed the same problem
on various other systems as
2008 Sep 15
2
Perhaps slightly OT - Lots of spurious webdav requests.
Hello All,
I am running a CentOS 4.6 file server for a small office network and I
am getting a lot of strange webdav requests from one of the Windows
workstations - I have not configured Webdav on the Windows host
(hereafter "windows-laptop") in question.
Some details - I have configured a Samba share called (say) "share1"
on the CentOS server and the windows-laptop connects