Displaying 20 results from an estimated 10000 matches similar to: "How to get the length of an UTF-8 string"
2009 Apr 10
3
Determine the Length of the Longest Word in a String
Hi Everyone,
I'm new to programming in R and have accomplished my goal, but feel that there
is probably a more efficient way of coding this. I'd appreciate any
guidance that a more advanced programmer can provide.
My goal --
I would like to find the length of the longest word in a string containing
many words separated by spaces.
How I did it --
I was able to find the length of the
2005 Dec 02
1
Compile error on FreeBSD 4.10 gcc 2.95.4
FYI, I tried installing ferret on my freebsd virtual server and got this:
retango# gem install ferret --include-dependencies
Attempting local installation of 'ferret'
Local gem file not found: ferret*.gem
Attempting remote installation of 'ferret'
Updating Gem source index for: http://gems.rubyforge.org
Building native extensions. This could take a while...
2010 Nov 24
2
Create new string of same length as entry in dataframe
I suspect that this is simple, but thanks in advance for any advice...
I have a dataframe, t2:
V1 V2
aaa 3
aaaa 4
aaaaaa 6
a 1
aa 2
V2 is the length of the string in V1 using nchar(as.character(t1$V1))
I'd like to create a third column, that contains a string of the length of
V2, but containing an alternate text, e.g.
V1 V2 V3
2008 Oct 28
2
A question about the API mkchar()
Hi guys,
I've got a question about the API mkChar(). I have met some difficulty
in passing a UTF-8 string to mkChar() in R-2.7.0.
I was intending to pass a UTF-8 string str_jan (some Japanese
characters such as ふ, whose UTF-8 code is E381B5) to the R API SEXP
mkChar(const char *name); we only need to create the SEXP using the
string that we passed.
Unfortunately, I found when parsing the
2003 Aug 29
2
length() and nchar()
I would propose to add "
See also:
`nchar' for counting the number of characters in
character vectors.
"
to the helpfile of length(),
because it is rather difficult
to find nchar() if one has only
search terms as "length", "len",
"strlen" in mind.
Sincerely
Wolfram Fischer
2009 Jan 24
1
Help with dudi.pca
Dear R-helpers,
I have two data frames, op and em4:
> str(op)
'data.frame': 37 obs. of 5 variables:
$ m : num 0.202 0.336 0.122 0.139 0.14 ...
$ lln : num 0.798 0.643 0.863 0.835 0.823 ...
$ rrn : num 0.789 0.702 0.894 0.895 0.923 ...
$ asym2: num 0.177 0.304 0.108 0.187 0.274 ...
$ asym3: num 0.0755 0.0975 0.0818 0.0651 0.13 ...
> str(rownames(op))
chr
2007 Sep 05
6
length of a string
Dear all,
I would like to know how I can compute the length of a string in a data frame. Example:
SEQUENCE ID
TGCTCCCATCTCCACGG HR04FS000000645
ACTGAACTCCCATCTCCAAT HR00000595847847
I would like to know how to compute the length of each SEQUENCE.
Best regards,
João Fadista
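A minimal sketch of one way to approach the question above, assuming the data sit in a data frame (the name `seqs` and the new column `LENGTH` are hypothetical, chosen only for illustration). It relies on nchar() being vectorised over character vectors; the as.character() call is a guard in case the column was read in as a factor.

```r
# Hypothetical data frame mirroring the example in the question
seqs <- data.frame(
  SEQUENCE = c("TGCTCCCATCTCCACGG", "ACTGAACTCCCATCTCCAAT"),
  ID       = c("HR04FS000000645", "HR00000595847847"),
  stringsAsFactors = FALSE
)

# nchar() returns one length per element of the character vector
seqs$LENGTH <- nchar(as.character(seqs$SEQUENCE))
print(seqs)
```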
1999 Aug 03
3
RW 0.64.2 substring() string truncation?
Hi,
(First, apology for my earlier incorrectly addressed "subscribe"
post.)
Can somebody tell me what exactly is going on below. Basically, I am
running into some kind of "string truncation" problem when I try
to get a substring starting past the 8192nd character (see sample
session below). There doesn't appear to be any problem creating the
string, and nchar()
2010 Jan 25
1
sequence of equal-length numbers (for filenames)
Dear R-users,
I'd like to create filenames in a mask "file000.dat" numbered from 1 to e.g.
123. The last problem I'm dealing with is creating the sequence of numbers
with equal length, i.e. 001, 002,.... 023, 024,.... 122, 123.
The closest I got is by a repetition:
Sequence <- c(1:123)
for(i in c(1:length(Sequence))) {
print(
paste(rep("0",
2011 Sep 29
2
String manipulation with regexpr, got to be a better way
Help-Rs,
I'm doing some string manipulation in a file where I converted a string date in mm/dd/yyyy format and returned the date yyyy.
I've used regexpr (hat tip to Gabor G for a very nice earlier post on this function) in steps (I've un-nested the code and provided it, with an example of what I did, below). My question is: is there a more efficient way to do this? Specifically is
2007 Oct 15
4
Get the last 3 chars of a string
I want to extract the last 3 letters of a string.
So far, I've done this:
> symbol = "XYZ.VX"
> substr(symbol,nchar(symbol)-2,nchar(symbol))
[1] ".VX"
It works, but the code looks UGLY as hell. Am I missing something? Or
is this the way it's supposed to be?
Thanks,
Sergio
On 10/15/07, pintinho <diego at bpgomes.com> wrote:
>
> Hi everyone,
>
2013 Mar 14
3
Working with string
Hello again,
Let's say I have the following string:
Vec <- c("sada", "asdsa", "sa")
Now I want to make each element of this vector with equal length.
Basically I want the following vector:
c("sada ", "asdsa", "sa   ")
Therefore we can get:
> nchar(c("sada ", "asdsa", "sa   "))
[1] 5 5 5
Is there any
2006 Oct 10
4
Need help for coding an extension to ferret
Hi,
I'm working on a project using Ferret for indexing its data. I'm very
happy with it but I need to code an extension to implement a .to_json
method to TopDocs class, because Ruby's json implementation is really
really slow...
It's my second (the first was the tutorial :/ ) Ruby C extension, so I'm
not really at ease with ruby C
2011 Nov 18
4
length of empty string
Hi all,
Can somebody explain why length("") returns 1 and not 0? How do I test
if a given string is the empty string?
Thanks,
Steffen.
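A short sketch of the distinction the question turns on, offered as one possible way to test for an empty string rather than the thread's own answer: length() counts the elements of a vector, so a vector holding one empty string has length 1, while nchar() and nzchar() look at the characters inside each element. The variable names are illustrative only.

```r
x <- ""

n_elements <- length(x)   # elements in the vector, not characters
n_chars    <- nchar(x)    # characters in the single string
is_empty   <- !nzchar(x)  # nzchar() is TRUE only for non-empty strings

print(c(n_elements = n_elements, n_chars = n_chars))
print(is_empty)
```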
2011 Oct 24
2
Length of string?
This is very basic but I have not been able to find an answer. Basically I
want to find the length of a string.
length("Text")
returns 1 so I know that is not right.
Thank you.
Kevin
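For the question above, nchar() is the usual tool. The sketch below also shows, as an illustration not taken from the thread, how the `type` argument separates character count from byte count for a UTF-8 string; the example character and variable names are assumptions chosen for the demonstration.

```r
# Plain ASCII: characters and bytes coincide
ascii_len <- nchar("Text")

# A single Japanese character, written as a Unicode escape and kept in UTF-8
s <- enc2utf8("\u3075")
n_chars <- nchar(s, type = "chars")  # counts characters
n_bytes <- nchar(s, type = "bytes")  # counts bytes of the UTF-8 encoding

print(c(ascii_len = ascii_len, n_chars = n_chars, n_bytes = n_bytes))
```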
2020 Jun 26
2
Error in substring: invalid multibyte string
Hi all,
I'm getting the following error from substring:
> substr("<I>Jens Oehlschl\xe4gel-Akiyoshi", 1, 100)
Error in substr("<I>Jens Oehlschl\xe4gel-Akiyoshi", 1, 100) :
invalid multibyte string at '<e4>gel-A<6b>iyoshi'
Is that normal / intended? I've tried setting the Encoding/locale to
Latin-1/UTF-8 but that does not help. nchar
2009 Mar 18
2
Profiling question: string formatting extremely slow
Hi all,
I'm using R to find duplicates in a set of 6 files containing Part Number
information. Before applying the intersect method to identify the duplicates
I need to normalize the P/Ns: converting the P/N to uppercase if
alphanumeric and applying 18-character zero padding if numeric.
When I apply the pn_formatting function (see code below) to "Part Number"
column of the
2007 Aug 07
2
Embedded nuls in strings
Hi,
?rawToChar
'rawToChar' converts raw bytes either to a single character string
or a character vector of single bytes. (Note that a single
character string could contain embedded nuls.)
Allowing embedded nuls in a string might be an interesting experiment but it
seems to cause some trouble for most of the string manipulation functions.
A string with an embedded 0:
2005 Oct 25
1
performance of nchar
Hi,
Is the nchar function known to be slow in R? I'm doing some string
formatting that requires multiple calls to nchar, and nchar seems to be
very slow.
Experiment 1, pass nchar inside sprintf, and it takes 0.7 seconds
> system.time(for (i in 1:10000)
+ str = sprintf('0005%020d', nchar(op))
+ )[3]
[1] 0.7
Experiment 2, get the length of op separately using nchar, and then pass
2010 Jul 12
1
Comparison of two very large strings
Hi,
I have a function in R that compares two very large strings for about 1
million records.
The strings are very large URLs like:-
http://query.nytimes.com/gst/sitesearch_selector.html?query=US+Visa+Laws&type=nyt&x=25&y=8.
..
or of larger lengths.
The data-frame looks like:-
id url
1