Recently I came across a piece of code that used the replace method. Looking at it, I got the feeling that the author was not very comfortable with regex replacements. Here is the snippet, which I have modified slightly to make my point.
<!DOCTYPE html>
<html>
<head>
<script src="jquery-1.4.2.js"></script>
<script language="JavaScript" type="text/javascript">
$(document).ready(function () {
    var text = [];
    text.push('Suddenly, the wolf appeared beside her. "What are you doing out here, little girl?"\n');
    text.push('the wolf asked in a voice as friendly as he could muster.\n');
    text.push('I\'m on my way to see my Grandma who lives through the forest, near the brook," Little Red Riding Hood replied.');

    var postdata = [];
    var re = new RegExp("\n");
    var re1 = new RegExp("[.]", "g");
    var re2 = new RegExp("[,]", "g");
    var re3 = new RegExp("[?]", "g");
    var re4 = new RegExp("[!]", "g");
    var re5 = new RegExp("[:]", "g");
    var re6 = new RegExp("[;]", "g");
    var re7 = new RegExp("[']", "g");
    var re8 = new RegExp('["]', "g");

    $.each(text, function () {
        var content = this.replace(re, ' ');
        content = content.replace(re1, '').replace(re2, '').replace(re3, '').replace(re4, '')
                         .replace(re5, '').replace(re6, '').replace(re7, '').replace(re8, '');
        postdata.push(content);
    });

    // print it out
    for (var i in postdata) {
        console.log('OLD: ' + postdata[i]);
    }
});
</script>
</head>
<body>
<!-- some stuff -->
</body>
</html>
All the above code does is replace the newline character '\n' with a space and strip periods, commas, question marks, exclamation marks, colons, semicolons, and single and double quotes by replacing each with an empty string.
If you were to run this the output would be:
Console [7]=
OLD: Suddenly the wolf appeared beside her What are you doing out here little girl
Console [8]=
OLD: the wolf asked in a voice as friendly as he could muster
Console [9]=
OLD: Im on my way to see my Grandma who lives through the forest near the brook Little Red Riding Hood replied
Fortunately, JavaScript's regex engine supports character classes, so all of the punctuation can be matched by a single pattern. The above code can be rewritten as follows:
<!DOCTYPE html>
<html>
<head>
<script src="jquery-1.4.2.js"></script>
<script language="JavaScript" type="text/javascript">
$(document).ready(function () {
    var text = [];
    text.push('Suddenly, the wolf appeared beside her. "What are you doing out here, little girl?"\n');
    text.push('the wolf asked in a voice as friendly as he could muster.\n');
    text.push('I\'m on my way to see my Grandma who lives through the forest, near the brook," Little Red Riding Hood replied.');

    var postdata = [];
    var re = /\n/g;            // newline, replaced with a space
    var re1 = /[.,?!:;'"]/g;   // punctuation, replaced with an empty string

    for (var i in text) {
        var content = text[i].replace(re, ' ').replace(re1, '');
        postdata.push(content);
    }

    // print it out
    for (var i in postdata) {
        console.log('NEW: ' + postdata[i]);
    }
});
</script>
</head>
<body>
<!-- some stuff -->
</body>
</html>
The output is:
Console [10]=
NEW: Suddenly the wolf appeared beside her What are you doing out here little girl
Console [11]=
NEW: the wolf asked in a voice as friendly as he could muster
Console [12]=
NEW: Im on my way to see my Grandma who lives through the forest near the brook Little Red Riding Hood replied
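One pitfall worth knowing when building character classes: a quote inside the brackets is an ordinary member of the class, not a delimiter. The sketch below is not part of the original snippet; the sample string and variable names are made up for illustration. It compares a class that wraps `\n` in quotes with a plain `\n` pattern.

```javascript
// Hypothetical sample input, chosen to contain an apostrophe and a newline.
var sample = "I'm here.\nAre you?";

// ['\n'] matches a newline OR an apostrophe, so both become spaces.
var quotedClass = sample.replace(/['\n']/g, ' ');

// \n on its own matches only the newline.
var newlineOnly = sample.replace(/\n/g, ' ');

// Inside a class, these punctuation characters need no backslash.
var punctuation = /[.,?!:;'"]/g;

console.log(quotedClass.replace(punctuation, ''));  // I m here Are you
console.log(newlineOnly.replace(punctuation, ''));  // Im here Are you
```

The first result has a stray space where the apostrophe used to be, because the apostrophe was turned into a space before the punctuation pass could strip it.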
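The rewritten version is much more compact, and it should also be faster, since each string is scanned twice rather than nine times. Also notice that instead of the jQuery $.each iterator I have used JavaScript's for…in, which avoids a callback invocation per element. Be aware that for…in walks every enumerable key of the object, so for arrays a plain indexed for loop is generally the safer choice.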
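To make the iteration choice concrete, here is a small sketch that is not taken from the original code. The `lines` array and its extra `note` property are hypothetical, added only to show which keys each loop style visits.

```javascript
var lines = ['one.', 'two,', 'three!'];

// Hypothetical extra property, added only to show what for...in visits.
lines.note = 'not a line';

// for...in yields every enumerable key as a string, including 'note'.
var keysSeen = [];
for (var key in lines) {
    keysSeen.push(key);
}
console.log(keysSeen.join(','));  // 0,1,2,note

// An indexed loop visits only positions 0 .. length - 1.
var cleaned = [];
for (var i = 0; i < lines.length; i++) {
    cleaned.push(lines[i].replace(/[.,?!:;'"]/g, ''));
}
console.log(cleaned.join(','));   // one,two,three
```

With a plain array literal and no added properties, both loops visit the same elements. The difference only shows up when something else has attached enumerable properties to the array or its prototype.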