similar to: Newbie - Scrape Data From PDFs?

Displaying 20 results from an estimated 600 matches similar to: "Newbie - Scrape Data From PDFs?"

2018 Jan 24
0
Newbie - Scrape Data From PDFs?
Hi Scott, I have never done this myself but I read something recently on the r-help distribution that was related. I just did a quick search and found a few hits that might work for you. 1. https://medium.com/@CharlesBordet/how-to-extract-and-clean-data-from-pdf-files-in-r-da11964e252e 2. http://bxhorn.com/2016/extract-data-tables-from-pdf-files-in-r/ 3.
2018 Jan 24
1
Newbie - Scrape Data From PDFs?
I think I would use pdftk to extract the form data and do all subsequent manipulation in R. HTH, Ulrik. Eric Berger <ericjberger at gmail.com> wrote on Wed, 24 Jan 2018, 08:11: > Hi Scott, > I have never done this myself but I read something recently on the > r-help distribution that was related. > I just did a quick search and found a few hits that might work for you.
2023 Dec 29
2
Help request: Parsing docx files for key words and appending to a spreadsheet
Hi Andy: I don't have an answer but I do have what I hope is some friendly advice. Generally the more information you can provide, the more likely you will get help that is useful. In your case you say that you tried several packages and they didn't do what you wanted. Providing that code, as well as why they didn't do what you wanted (be specific) would greatly facilitate things. Happy
2007 Aug 04
7
Optimization in R
Hi all, I've been working on improving R's optim() command, which does general purpose unconstrained optimization. Obviously, this is important for many statistics computations, such as maximum likelihood, method of moments, etc. I have focused my efforts on the BFGS method, mainly because it best matches my current projects. Here's a quick summary of what I've done: *
2011 Sep 16
4
Dual Authentication: Local and Active Directory
I was wondering if it was possible to get a Samba server that was acting as an AD member server to also be able to authenticate local users, or is stuck just serving AD users? -- Aaron Clausen mightymartianca at gmail.com
2007 May 13
1
symbollic differentiation in R
Hi all, I wrote a symbolic differentiation function in R, which can be downloaded here: http://www.econ.upenn.edu/~clausen/computing/Deriv.R http://www.econ.upenn.edu/~clausen/computing/Simplify.R It is just a prototype. Of course, R already contains two differentiation functions: D and deriv. However, these functions have several limitations. They can probably be fixed, but
2007 May 13
2
relist, an inverse operator to unlist
Hi all, I wrote a function called relist, which is an inverse to the existing unlist function: http://www.econ.upenn.edu/~clausen/computing/relist.R Some functions need many parameters, which are most easily represented in complex structures. Unfortunately, many mathematical functions in R, including optim, nlm, and grad can only operate on functions whose domain is a vector. R has a
2008 Mar 09
2
[patch] add=TRUE in plot.default()
Hi all, As long as I've used R, add=TRUE hasn't worked in contexts like this:
    f <- function(x) x^2
    X <- seq(0, 1, by=1/4)
    plot(f, col="blue")
    plot(X, f(X), col="red", type="l", add=TRUE)
I attached a fix for version 2.6.2. Cheers, Andrew -------------- next part -------------- diff --git a/src/library/graphics/R/plot.R
2010 Dec 27
3
openssh and keystroke timing attacks (again)
Hi all, Over the past 10 years, there has been some discussion and several patches concerning keystroke timing being revealed by the timing of openssh packet network transmission. The issue is that keystroke timing is correlated with the plaintext, and openssh users expect their communications to be kept entirely secret. Despite some excellent ideas and patches, such as Jason Coit's
2008 Feb 27
7
[Bug 14704] New: randr1.2 broken on apple powerbook 17"
http://bugs.freedesktop.org/show_bug.cgi?id=14704
Summary: randr1.2 broken on apple powerbook 17"
Product: xorg
Version: unspecified
Platform: Other
OS/Version: All
Status: NEW
Severity: normal
Priority: medium
Component: Driver/nouveau
AssignedTo: nouveau at lists.freedesktop.org
2007 Jun 22
6
Organising a Subversion repository.
Hello, I appreciate that this topic may have come up before but a quick scan of the archives did not reveal anything. If someone could kindly point me to an archive entry that would also be great. I want to use Puppet fileserver in conjunction with a svn repo; this way I can commit files to svn and have the fileserver propagate them to the various machines. I believe this is a common way of
2023 Dec 29
2
Help request: Parsing docx files for key words and appending to a spreadsheet
Hello I am trying to work through a problem, but feel like I've gone down a rabbit hole. I'd very much appreciate any help. The task: I have several directories of multiple (some directories, up to 2,500+) *.docx files (newspaper articles downloaded from Lexis+) that I want to iterate through to append to a spreadsheet only those articles that satisfy a condition (i.e., a specific
2011 Apr 20
2
Folder Encryption and Samba
It's been passed to me from on high that one of our file servers needs to be encrypted. I'm considering either whole-disk encryption or folder encryption. I like the latter since, well, it's less work. Is there any particular folder encryption systems out there that the folks around here can recommend? -- Aaron Clausen mightymartianca at gmail.com
2011 Jul 26
2
Incoming External Trust
I'm running a Samba domain (Samba 3.4.7) with OpenLDAP. I also have a Server 2003 AD domain, and want to set up an external trust so that AD users can access resources on the Samba domain, but not vice versa (I believe this is called a one-way incoming external trust). I'm not finding a lot of information out there that makes sense. Does anybody have any hints? -- Aaron Clausen
2002 Dec 02
1
IMQ
Has anybody got imq running on iptables 1.2.7a? The home page for imq only seems to have a patch for 1.2.6a. -- Aaron Clausen
2020 Oct 07
1
Adding text to existing PDF's created with R
R 4.0.2 OS X Colleagues Does R have the capability of adding text (e.g., page numbers) to an existing PDF (previously created with R) -- other than adding this text, the PDF should be unchanged (except for a new filename). The intent is as follows: I have multiple PDFs that I eventually merge into a single PDF, separating each one with a separator page. The content of the separator pages
2010 Mar 23
1
[patch] add is.set parameter to sample()
Hi all, sample() has some well-documented undesirable behaviour. sample(1:6, 1), sample(2:6, 1), ..., sample(5:6, 1) do what you expect, but sample(6:6, 1) and sample(1:6, 1) do the same thing. This behaviour is documented: If 'x' has length 1, is numeric (in the sense of 'is.numeric') and 'x >= 1', sampling _via_ 'sample' takes place from
2006 Dec 07
8
crash on repeated search
I have found another crash in ferret; this one just uses a regular search. It's similar to an issue reported by Matt Schnitz a while ago, but unlike his, mine does not go away if I turn off omit_norms. It does go away if I turn on the garbage collector more often, but I'm not sure that's a stable workaround under the circumstances. This one isn't a
2007 Feb 16
8
term vector blues
I have a lot of crashes when I try to use term vectors. Here's an example, which crashes pretty consistently. This problem seems to be somewhat sensitive to platform... people on other OS's and ruby versions have reported no error. I have seen this with ferret 0.10.13 and 0.10.14 on debian stable using ruby 1.8.2, but I have observed the same problem on various other systems as
2008 Sep 15
2
Perhaps slightly OT - Lots of spurious webdav requests.
Hello All, I am running a CentOS 4.6 file server for a small office network and I am getting a lot of strange webdav requests from one of the Windows workstations - I have not configured Webdav on the Windows host (hereafter "windows-laptop") in question. Some details - I have configured a Samba share called (say) "share1" on the CentOS server and the windows-laptop connects