Hello, I just had a problem with someone cussing on my Rails app. Is there something like RedCloth that I can use to filter out 'dirty words'?

-- Posted via http://www.ruby-forum.com/.
Mohammad wrote:
> Hello, I just had a problem with someone cussing on my Rails app. Is
> there something like RedCloth that I can use to filter out 'dirty words'?

    naughty_words = ['poo', 'darn', 'sugar', 'heffalumps']
    naughty_words.each do |cuss|
      comment.gsub!(/\b#{cuss}\b/i, 'FLUFFY BUNNIES')
    end

Needn't be much more complicated than that...

-- Alex
Mohammad wrote:
> Hello, I just had a problem with someone cussing on my Rails app. Is
> there something like RedCloth that I can use to filter out 'dirty words'?

Is this for a forum? You could probably use a Bayesian classifier to try to flag messages that contain words that aren't already on your blacklist. There are at least two Ruby projects out there:

http://bishop.rubyforge.org/
http://rubyforge.org/projects/classifier

alex
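For anyone who hasn't met the idea, here is a toy sketch of what a naive Bayes classifier does with a message. It is not the API of either project linked above: the TinyBayes class, its train and classify methods, the labels and the training phrases are all made up for illustration, and a real filter would need far more training data.

```ruby
# Toy naive Bayes text classifier (hypothetical, for illustration only).
class TinyBayes
  def initialize
    @word_counts = Hash.new { |h, k| h[k] = Hash.new(0) } # label => word => count
    @doc_counts  = Hash.new(0)                            # label => messages seen
  end

  # Record the words of one message under the given label.
  def train(label, text)
    @doc_counts[label] += 1
    tokenize(text).each { |word| @word_counts[label][word] += 1 }
  end

  # Return the label with the highest log-probability for this text.
  def classify(text)
    total_docs = @doc_counts.values.sum
    vocab_size = @word_counts.values.flat_map(&:keys).uniq.size
    words = tokenize(text)
    @doc_counts.keys.max_by do |label|
      total_words = @word_counts[label].values.sum
      score = Math.log(@doc_counts[label].to_f / total_docs)
      words.each do |word|
        # Add-one smoothing so unseen words don't zero out the score.
        score += Math.log((@word_counts[label][word] + 1.0) / (total_words + vocab_size))
      end
      score
    end
  end

  private

  def tokenize(text)
    text.downcase.scan(/[a-z']+/)
  end
end

bayes = TinyBayes.new
bayes.train(:flag, 'you darn heffalumps')
bayes.train(:flag, 'what a load of poo')
bayes.train(:ok,   'thanks for the helpful reply')
bayes.train(:ok,   'see you at the meeting')

p bayes.classify('darn poo')
p bayes.classify('thanks for the reply')
```

The point is that the classifier scores whole messages from examples you have labelled, so it can flag a message for review rather than rewriting individual words.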
Alex Young wrote:
> naughty_words = ['poo', 'darn', 'sugar', 'heffalumps']
> naughty_words.each do |cuss|
>   comment.gsub!(/\b#{cuss}\b/i, 'FLUFFY BUNNIES')
> end
>
> Needn't be much more complicated than that...

Hmm. This is what I wrote:

    def show
      @pm = Pm.find(params[:id])
      if @pm.to_id != @session[:user].id
        render :text => "Dont try to cheat the system."
      end
      @body2 = filter(@pm.body)
    end

    def filter(text)
      @naughty_words = ['fuck', 'ass', 'bastered']
      @replace_with = ['f***', 'a**', 'b*****']
      @count = 0
      @naughty_words.each do |cuss|
        text.gsub!(/\b#{cuss}\b/i, @replace_with[@count])
        @count += 1
      end
    end

and it's just displaying the ones that are in the @pm.body. Got any idea why? Did the gsub mess up somewhere? (Not good with gsub, sorry.)
I'm not sure, but it looks like you phucked up somewhere jackazz. :-P

Dan
FLUFFY BUNNIES I can't FLUFFY BUNNIES stand FLUFFY BUNNIES censors, the FLUFFY BUNNIES!
Mohammad wrote:
> and it's just displaying the ones that are in the @pm.body. Got any idea
> why? Did the gsub mess up somewhere? (Not good with gsub, sorry.)

You're not returning the text, that's why.

    def filter(text)
      @naughty_words = ['fuck', 'ass', 'bastered']
      @replace_with = ['f***', 'a**', 'b*****']
      @naughty_words.each_with_index do |cuss, count|
        text.gsub!(/\b#{cuss}\b/i, @replace_with[count])
      end
      text
    end
Mohammad wrote:
> and it's just displaying the ones that are in the @pm.body. Got any idea
> why? Did the gsub mess up somewhere? (Not good with gsub, sorry.)

You're using gsub!, which modifies the string in place. However, from the looks of things, you want to return the corrected string for display (I'm assuming that's what the @body2 variable is for). The return value of your filter() method will be the return value of the @naughty_words.each() call, though, *not* the corrected string. I've been caught out by this before: the return value of an each() call is *the original list*, not anything that you did to it.

Also, the way you're picking the replacement is fragile. The @count index only lines up with @replace_with because it's incremented exactly once per word in the loop; looking the replacement up from the word itself is more robust.
You want something like this:

    def filter(text)
      @naughty_words = ['poo', 'darn', 'sugar', 'heffalumps']
      @replace_with = ['p**', 'd***', 's****', 'h*********']
      @naughty_words.each do |cuss|
        text.gsub!(/\b#{cuss}\b/i, @replace_with[@naughty_words.index(cuss)])
      end
      return text
    end

Better would be to use a hash:

    def filter(text)
      @naughty_words = {'poo' => 'p**', 'darn' => 'd***', 'sugar' => 's****', 'heffalumps' => 'h*********'}
      @naughty_words.each_pair do |cuss, replacement|
        text.gsub!(/\b#{cuss}\b/i, replacement)
      end
    end

Note that in the second case I'm not returning text. That's because gsub! acts in place, so in your original show method you don't need to assign to @body2 - you can just use the @pm.body value directly.

Hope that makes things a little clearer...

-- Alex
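The two behaviours described above are easy to check for yourself. A minimal sketch, with a made-up scrub! helper and sample strings (nothing here comes from the original app):

```ruby
words = ['poo', 'darn']

# each returns its receiver, so a method that ends with an each loop
# returns the list, not whatever happened inside the block.
result = words.each { |w| w.upcase }
puts result.equal?(words)        # true

# gsub! changes the string it is called on, so the caller sees the change.
def scrub!(text)
  text.gsub!(/\bdarn\b/i, 'd***')
end

body = 'This darn thing'.dup
scrub!(body)
puts body                        # This d*** thing

# gsub! returns nil when nothing matched, so its return value is not
# a safe thing to display either.
p scrub!('all clean here'.dup)   # nil
```

That last case is another reason to return the string explicitly (or use the non-bang gsub) instead of relying on what the last expression happens to evaluate to.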
First, gsub with an exclamation mark on the end modifies the string in place, so the filter function will modify @pm.body. Also, your filter function doesn't return a string; it returns the @naughty_words array.

Before you read any further, please note that some of my suggestions may be slightly ridiculous.

Perhaps a good idea would be to just have a naughty_words array rather than both naughty_words and replace_with. Like this:

    NAUGHTY_WORDS = %w(roses kittens)

    def filter(text)
      text_to_filter = text.dup
      NAUGHTY_WORDS.each do |word|
        text_to_filter.gsub!(/\b#{word}\b/i, word[0,1] + ("*" * (word.size-1)))
      end
      text_to_filter
    end

    puts filter("Raindrops on roses and whiskers on kittens.")

This outputs "Raindrops on r**** and whiskers on k******." Assuming you want to preserve the first letter of naughty words...

First, it duplicates the string passed to it, so it doesn't operate on the actual body of the message you'd like to send. From the way you call the function, that seems to be what you're expecting to happen. Then, it iterates through the naughty words, nothing new with that. For each naughty word, it calls gsub! on the duplicated string, looking for the word and replacing it with the word's first letter and asterisks for the rest of the word's letters. After iterating through all the naughty words, it returns the cleaned text.

Oh, and I've made the naughty words a constant. Maybe a good idea, maybe not?
I'm not sure. You could also have a naughty_words method that just returns the array of naughty words:

    def naughty_words
      %w(roses kittens)
    end

Then later, if you want to keep the list of naughty words somewhere else, like in your database or in a file, the naughty_words method could take care of reading the words from that other place and just return an array of words.

You might also want to put your filtering in a helper method, and call it from a view, instead of setting the @body2 instance variable.

Also, maybe just replacing naughty words with four asterisks would be best; then they're obfuscated more. You could do this:

    def filter(text)
      text_to_filter = text.dup
      naughty_words.each do |word|
        text_to_filter.gsub!(/\b#{word}\b/i, "****")
      end
      text_to_filter
    end

Or in one line:

    def filter(text)
      text.gsub(/\b(#{naughty_words.join('|')})\b/i, "****")
    end

That one doesn't use the in-place gsub! (it returns a copy of the string) and builds up a single regular expression instead of iterating through each word.

Replacing the naughty words with random characters would be cute. You could do this...

    CLEAN_CHARS = "!@$%*"

    def naughty_words
      %w(roses kittens)
    end

    def clean_word_for(word)
      Array.new(word.size).fill { CLEAN_CHARS.slice(rand(CLEAN_CHARS.size), 1) }.join
    end

    def filter(text)
      text_to_filter = text.dup
      naughty_words.each do |word|
        text_to_filter.gsub!(/\b#{word}\b/i, clean_word_for(word))
      end
      text_to_filter
    end

    puts filter("Raindrops on roses and whiskers on kittens. Roses and roses.")

Which outputs something like "Raindrops on *!@%% and whiskers on %!%$$!@. *!@%% and *!@%%."

And you could shorten the filter method as before:

    def filter(text)
      text.gsub(/\b(#{naughty_words.join('|')})\b/i) { clean_word_for($1) }
    end

This uses the block form of gsub. See the documentation here: http://ruby-doc.org/core/classes/String.html#M001889

-- Michael Daines
http://www.mdaines.com
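One caveat with joining the words with '|' by hand: if a word ever contains a regex metacharacter (a dot, a plus sign and so on), it changes the pattern instead of being matched literally. A minimal sketch of a safer variant, assuming Regexp.union is acceptable here (it escapes each string before joining them). The NAUGHTY list, the 'a.b' entry and the filter_escaped name are made up for the example:

```ruby
# Hypothetical word list; 'a.b' contains a regex metacharacter on purpose.
NAUGHTY = %w(roses kittens a.b)

# Regexp.union escapes each string and joins the results with '|'.
NAUGHTY_PATTERN = /\b(?:#{Regexp.union(NAUGHTY).source})\b/i

def filter_escaped(text)
  text.gsub(NAUGHTY_PATTERN, '****')
end

puts filter_escaped('Roses, kittens and a.b but not axb.')
# prints: ****, **** and **** but not axb.
```

Building the pattern once as a constant also avoids recompiling it on every call.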
Alex Young wrote:
> naughty_words = ['poo', 'darn', 'sugar', 'heffalumps']
> naughty_words.each do |cuss|
>   comment.gsub!(/\b#{cuss}\b/i, 'FLUFFY BUNNIES')
> end
>
> Needn't be much more complicated than that...

Won't that look through comment once for each word? That seems expensive. Maybe an alternation would be more efficient for large blocks of text:

    text = 'This darn thing smells like poo.'

    words = {
      'poo'  => '#@%',
      'darn' => '$@%!',
    }

    # create alternation
    match = words.keys.join('|')

    # look through text *once* making substitutions
    text.gsub!(/\b(#{match})\b/) { |match| words[match] }

Randy.
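One thing to watch if the /i flag is added to this pattern, as in the earlier examples: the block receives the text exactly as it was typed, so 'Darn' would not be found in the hash, the block would return nil, and the word would be replaced with an empty string. A minimal sketch that downcases before the lookup; the WORDS and PATTERN constants and the censor name are made up for the example:

```ruby
WORDS = { 'poo' => '#@%', 'darn' => '$@%!' }

# Escape each key, then build one case-insensitive alternation.
PATTERN = /\b(?:#{WORDS.keys.map { |w| Regexp.escape(w) }.join('|')})\b/i

def censor(text)
  # Downcase the matched text so 'Darn' and 'POO' still hit the hash keys.
  text.gsub(PATTERN) { |found| WORDS[found.downcase] }
end

puts censor('This Darn thing smells like POO.')
# prints: This $@%! thing smells like #@%.
```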
Randy W. Sims wrote:
> Won't that look through comment once for each word? That seems expensive.

Measure it... You might be surprised. What seems like it should be expensive often isn't, and vice versa...

> Maybe an alternation would be more efficient for large blocks of text:
>
> # look through text *once* making substitutions
> text.gsub!(/\b(#{match})\b/) { |match| words[match] }

For large blocks of text, that'll turn out to be very expensive. Two things in regular expressions are slow: keeping back-references, and backtracking. The alternation you've got there implies both.

The relative speed you'll end up with will depend on the number of words to be substituted, their density in the text, their similarity (I think - not entirely certain how much Ruby's regex engine optimises alternations), and the statistics of the rest of the text. For an extremely naive test (attached), I've measured an each() loop being 3 times faster than an alternation on longish (16KB-160KB) text blocks, but that's with a very high match rate.

I'm sure Zed will pick holes in my methods (and they're *huge* :-) but the principle stands...
-- Alex

-------------- next part --------------

    require 'benchmark'

    @text = 'This darn thing smells like poo.'
    @backup = @text.dup
    @subs = {'poo'=>'p**', 'darn'=>'d***'}
    @list = @subs.keys
    @precompile_list = @list.collect{|l| /\b(#{l})\b/}
    @precompile_alt = /\b(#{@list.join('|')})\b/

    def each_func(text)
      @list.each do |l|
        text.gsub!(/\b#{l}\b/, @subs[l])
      end
    end

    def alternate_func(text)
      matcher = @list.join('|')
      text.gsub!(/\b(#{matcher})\b/){|match| @subs[match]}
    end

    def precompile_each(text)
      @precompile_list.each do |m|
        text.gsub!(m){|match| @subs[match]}
      end
    end

    def precompile_alternate(text)
      text.gsub!(@precompile_alt){|match| @subs[match]}
    end

    # NOTE: this only rebinds the local variable; the caller's string is
    # not restored, so later iterations scan already-filtered text.
    def replace(text, backup)
      text = backup.dup
    end

    def do_loop(text, count, comment)
      puts comment
      Benchmark.bm(20) do |b|
        backup = text.dup
        b.report('each:'){ for i in 1..count; each_func(text); replace(text, backup); end }
        b.report('alternation:'){ for i in 1..count; alternate_func(text); replace(text,backup); end }
        b.report('prec_each:'){ for i in 1..count; precompile_each(text); replace(text,backup); end }
        b.report('prec_alt:'){ for i in 1..count; precompile_alternate(text); replace(text,backup); end }
      end
    end

    do_loop(@text, 5000, 'Basic short loop.')
    do_loop(@text*50, 5000, 'Loop over 50.')
    do_loop(@text*500, 5000, 'Loop over 500.')
    # NOTE: the multiplier here is 50000 although the label says 5000.
    do_loop(@text*50000, 5000, 'Loop over 5000.')

    <<-EX
    Example run (PPC 1.2GHz Mac Mini):

    Basic short loop.
                           user     system      total        real
    each:              0.230000   0.030000   0.260000 (  0.411503)
    alternation:       0.180000   0.000000   0.180000 (  0.308466)
    prec_each:         0.050000   0.010000   0.060000 (  0.164306)
    prec_alt:          0.040000   0.000000   0.040000 (  0.058427)
    Loop over 50.
                           user     system      total        real
    each:              0.290000   0.020000   0.310000 (  0.501477)
    alternation:       0.510000   0.000000   0.510000 (  0.745209)
    prec_each:         0.120000   0.010000   0.130000 (  0.290804)
    prec_alt:          0.390000   0.000000   0.390000 (  0.553196)
    Loop over 500.
                           user     system      total        real
    each:              1.000000   0.420000   1.420000 (  2.050063)
    alternation:       3.340000   0.510000   3.850000 (  5.093815)
    prec_each:         0.830000   0.380000   1.210000 (  1.642269)
    prec_alt:          3.080000   0.460000   3.540000 (  4.789877)
    Loop over 5000.
                           user     system      total        real
    each:            104.550000  26.550000 131.100000 (178.008627)
    alternation:     324.600000  28.270000 352.870000 (464.232145)
    prec_each:       103.310000  26.270000 129.580000 (179.688677)
    prec_alt:        322.470000  28.150000 350.620000 (457.476762)
    EX
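One of the holes in the script above is its replace helper: assigning to the text parameter only changes what the local name points at, so the caller's string is never restored and every iteration after the first scans text that has already been filtered. A small sketch of the difference, using made-up helper names (rebind, restore!) and String#replace, which overwrites the receiver's contents in place:

```ruby
def rebind(text, backup)
  text = backup.dup      # only changes what the local name points at
end

def restore!(text, backup)
  text.replace(backup)   # overwrites the caller's string in place
end

original = 'This darn thing smells like poo.'
text = original.dup
text.gsub!(/\bdarn\b/, 'd***')

rebind(text, original)
puts text                # This d*** thing smells like poo.

restore!(text, original)
puts text                # This darn thing smells like poo.
```

With a restore step like this the timings would measure text that still contains matches on every pass, so the numbers above should probably be read with that in mind.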
I'd like to be a bit more helpful on this topic but honestly I couldn't give a FLUFFY BUNNIES.

-- Giles Bowkett
http://www.gilesgoatboy.org