Go Back   CodingForums.com > :: Client side development > JavaScript programming

Before you post, read our: Rules & Posting Guidelines

Reply
 
Thread Tools Rate Thread
Enjoy an ad free experience by logging in. Not a member yet? Register.
Old 11-27-2012, 05:38 PM   PM User | #1
johnsmith153
New Coder

 
Join Date: Mar 2012
Posts: 81
Thanks: 7
Thanked 0 Times in 0 Posts
johnsmith153 is infamous around these parts
My regular expression is very slow in JS

This:

/[^\(]+\.(gif|jpg|jpeg|png|svg|bmp)[^;\s]*\)/gi

is very slow in JS.

I have tested this regular expression in the same code:

/a/gi

and it runs ultra quick, so it must be the regex, not the JS code.

Any ideas to speed it up?
johnsmith153 is offline   Reply With Quote
Old 11-27-2012, 05:44 PM   PM User | #2
Philip M
Supreme Master coder!

 
Philip M's Avatar
 
Join Date: Jun 2002
Location: London, England
Posts: 17,037
Thanks: 197
Thanked 2,411 Times in 2,389 Posts
Philip M has a spectacular aura aboutPhilip M has a spectacular aura aboutPhilip M has a spectacular aura about
What is that code supposed to do?

*\ what is the \?

It is your responsibility to die() if necessary….. - PHP Manual
__________________

All the code given in this post has been tested and is intended to address the question asked.
Unless stated otherwise it is not just a demonstration.
Philip M is offline   Reply With Quote
Old 11-27-2012, 06:20 PM   PM User | #3
WolfShade
Regular Coder

 
Join Date: Apr 2012
Location: St. Louis, MO, USA
Posts: 950
Thanks: 7
Thanked 98 Times in 98 Posts
WolfShade is an unknown quantity at this point
It _looks_ like it allows anything except forward slash and left parenthesis, then a period, and one of six image formats, followed by anything that isn't a semi-colon or whitespace.
__________________
^_^

If anyone knows of a website that can offer ColdFusion help that isn't controlled by neurotic, pedantic jerks* (stackoverflow.com), please PM me with a link.
*
The neurotic, pedantic jerks are not the owners; just the people who are in control of the "popularity contest".
WolfShade is offline   Reply With Quote
Old 11-27-2012, 06:50 PM   PM User | #4
johnsmith153
New Coder

 
Join Date: Mar 2012
Posts: 81
Thanks: 7
Thanked 0 Times in 0 Posts
johnsmith153 is infamous around these parts
Hi,

Thanks for replying.

My Regex is very poor and I put this together the best I can using various tutorials etc.

If anything looks stupid then it probably is.

It basically searches through a CSS file and returns an array of image URLs. That's all it needs to do. It doesn't need to to replace anything. It works, but takes ages to complete the processing.
johnsmith153 is offline   Reply With Quote
Old 11-27-2012, 06:56 PM   PM User | #5
Philip M
Supreme Master coder!

 
Philip M's Avatar
 
Join Date: Jun 2002
Location: London, England
Posts: 17,037
Thanks: 197
Thanked 2,411 Times in 2,389 Posts
Philip M has a spectacular aura aboutPhilip M has a spectacular aura aboutPhilip M has a spectacular aura about
Please show all your (relevant) code.
__________________

All the code given in this post has been tested and is intended to address the question asked.
Unless stated otherwise it is not just a demonstration.
Philip M is offline   Reply With Quote
Old 11-27-2012, 06:59 PM   PM User | #6
johnsmith153
New Coder

 
Join Date: Mar 2012
Posts: 81
Thanks: 7
Thanked 0 Times in 0 Posts
johnsmith153 is infamous around these parts
This is the part causing the problem:

var imgUrls = cssFile.match(/[^\(]+\.(gif|jpg|jpeg|png|svg|bmp)[^;\s]*\)/gi);

If I change the above line to something like:

var imgUrls = cssFile.match(/a/gi);

...then there is no problem.
johnsmith153 is offline   Reply With Quote
Old 11-27-2012, 07:18 PM   PM User | #7
Philip M
Supreme Master coder!

 
Philip M's Avatar
 
Join Date: Jun 2002
Location: London, England
Posts: 17,037
Thanks: 197
Thanked 2,411 Times in 2,389 Posts
Philip M has a spectacular aura aboutPhilip M has a spectacular aura aboutPhilip M has a spectacular aura about
What is it exactly that you are trying to match?

Does this help you?

Code:
<script type = "text/javascript">
var cssFile = "abc.gif,something else, xyz.jpg, whatever, pqr.bmp";
var s = cssFile.split(",")
var imgUrls = [];

for (var i =0; i<s.length; i++) {
imgUrls[i] = s[i].match(/[a-z]+\.+(gif|jpg|jpeg|png|svg|bmp)/gi);
}
alert (imgUrls.join(" "));

</script>
__________________

All the code given in this post has been tested and is intended to address the question asked.
Unless stated otherwise it is not just a demonstration.

Last edited by Philip M; 11-27-2012 at 07:27 PM..
Philip M is offline   Reply With Quote
Old 11-27-2012, 08:00 PM   PM User | #8
johnsmith153
New Coder

 
Join Date: Mar 2012
Posts: 81
Thanks: 7
Thanked 0 Times in 0 Posts
johnsmith153 is infamous around these parts
Thanks Phillip,

It solves the processing speed problem, but doesn't quite get all the images.

It should sort through an entire CSS file and return all the images in that file (gif|jpg|jpeg|png|svg|bmp).

The problems are:

(1) it doesn't account for underscores So:
'image_name.png' is picked up as 'name.png'

(2) Numbers are ignored, so image_50.png is completely ignored

(3) It only gets the filename. So, '../images/filename.png' only returns 'filename.png'. Of course if the path is the same for all images in the stylesheet then that's fine, but it might not always be the case.

If doing it this way helps the speed problem then I can ensure that all images use the same path, but if it makes no difference to the speed then I'd like to get the full path.

Thanks again.
johnsmith153 is offline   Reply With Quote
Old 11-27-2012, 09:17 PM   PM User | #9
Philip M
Supreme Master coder!

 
Philip M's Avatar
 
Join Date: Jun 2002
Location: London, England
Posts: 17,037
Thanks: 197
Thanked 2,411 Times in 2,389 Posts
Philip M has a spectacular aura aboutPhilip M has a spectacular aura aboutPhilip M has a spectacular aura about
Code:
<script type = "text/javascript">
var cssFile = "abc.gif,something else, /images/xyz.jpg, whatever, pqr_50.bmp";
var s = cssFile.split(",")
var imgUrls = [];
for (var i =0; i<s.length; i++) {
imgUrls[i] = s[i].match(/\W?\w.+(gif|jpg|jpeg|png|svg|bmp)/gi);
}
alert (imgUrls.join(" "));
</script>
__________________

All the code given in this post has been tested and is intended to address the question asked.
Unless stated otherwise it is not just a demonstration.

Last edited by Philip M; 11-27-2012 at 09:22 PM..
Philip M is offline   Reply With Quote
Old 11-27-2012, 09:32 PM   PM User | #10
johnsmith153
New Coder

 
Join Date: Mar 2012
Posts: 81
Thanks: 7
Thanked 0 Times in 0 Posts
johnsmith153 is infamous around these parts
Thanks again and I really appreciate this, but when tested in a real CSS file it picks up loads of text before the filename as well.

I'm guessing there's a way to limit the start of the string (file) returned.

So as an example it returns:

.class-name{background-image:url('../images/file.png

...rather than just:

../images/file.png
johnsmith153 is offline   Reply With Quote
Old 11-28-2012, 03:39 AM   PM User | #11
rnd me
Senior Coder

 
rnd me's Avatar
 
Join Date: Jun 2007
Location: Urbana
Posts: 3,457
Thanks: 9
Thanked 466 Times in 450 Posts
rnd me is a jewel in the roughrnd me is a jewel in the roughrnd me is a jewel in the rough
the fewer unknowns, the faster it will run.

in css, all your urls should appear sandwiched in either ',",or ().

try only looking between those delimiters.

also, many times it's faster to run three different specific regexps rather than using a range than only hits 2 or 3 different chars in real-life but potentially matches many more.




EDIT:
this performs over 50X faster than a match() based on the RX in post #1:
Code:
css.toLowerCase()
 .split(/(\.gif|jpg|jpeg|png|svg|bmp)/g)
 .join("\n")
 .match(/[\/.\-\w:]+\n(gif|jpg|jpeg|png|svg|bmp)/g)
 .join("`")
 .split("\n")
 .join("")
 .split("`")
 .filter(Boolean)
 .filter(function(a){return  this[a]?0:(this[a]=1)},{})
 .sort()

performance improvements compared to code in post#1:
  1. using toLowerCase() to avoid the regexp-slowing "i" flag (2.3X faster)
  2. using string split()s instead of regexp split()s (10-40X faster)
  3. using end-of-line matching instead of floating matching (2-3X faster)
  4. using opt-in chars instead of exclusion list terms ( 1.5-3X faster)

this also sorts the list and filters out invalid and duplicate items, feel free to remove the last there lines if you don't need that.

again. i think it can be made even faster by looking at ["'()], but 50X is a good head-start.
__________________
my site (updated 5/13)
STATS (2013/5) HTML5:90.2% MOB:14% IE7:0.5% IE8:8.6% IE9:9.8% IE10:10%

Last edited by rnd me; 11-28-2012 at 04:26 AM..
rnd me is online now   Reply With Quote
Old 11-28-2012, 07:30 AM   PM User | #12
Philip M
Supreme Master coder!

 
Philip M's Avatar
 
Join Date: Jun 2002
Location: London, England
Posts: 17,037
Thanks: 197
Thanked 2,411 Times in 2,389 Posts
Philip M has a spectacular aura aboutPhilip M has a spectacular aura aboutPhilip M has a spectacular aura about
Quote:
Originally Posted by johnsmith153 View Post
Thanks again and I really appreciate this, but when tested in a real CSS file it picks up loads of text before the filename as well.

I'm guessing there's a way to limit the start of the string (file) returned.

So as an example it returns:

.class-name{background-image:url('../images/file.png

...rather than just:

../images/file.png
If you specified exactly what it is you wanted then it might be possible to help you better. But if every time you introduce a new spec then it becomes frustrating for both of us.


Code:
<script type = "text/javascript">
var cssFile = "abc.gif, something else, more nonsense, unwanted stuff/mydirectory/images/xyz.jpg, whatever, ../pqr_50.bmp";
var s = cssFile.split(" ");  // split at spaces
var imgUrls = [];
for (var i =0; i<s.length; i++) {
imgUrls[i] = s[i].match(/(\W\w+\W\w+)?(\W*?)\w+.(gif|jpg|jpeg|png|svg|bmp)/gi);
}

alert (imgUrls.join("  "));  // abc.gif /mydirectory/images/xyz.jpg ../pqr_50.bmp

var newArr = []; 
for (var i=0; i<imgUrls.length; i++) {   // if desired remove empty array elements
if (imgUrls[i]) { 
newArr.push(imgUrls[i]); 
} 
}  

alert (newArr.join("  "));  // abc.gif /mydirectory/images/xyz.jpg ../pqr_50.bmp

</script>
__________________

All the code given in this post has been tested and is intended to address the question asked.
Unless stated otherwise it is not just a demonstration.

Last edited by Philip M; 11-28-2012 at 11:38 AM..
Philip M is offline   Reply With Quote
Old 11-28-2012, 01:58 PM   PM User | #13
rnd me
Senior Coder

 
rnd me's Avatar
 
Join Date: Jun 2007
Location: Urbana
Posts: 3,457
Thanks: 9
Thanked 466 Times in 450 Posts
rnd me is a jewel in the roughrnd me is a jewel in the roughrnd me is a jewel in the rough
Quote:
Originally Posted by Philip M View Post
Code:
var newArr = []; 
for (var i=0; i<imgUrls.length; i++) {   // if desired remove empty array elements
if (imgUrls[i]) { 
newArr.push(imgUrls[i]); 
} 
}
is shorter and faster as:
Code:
var newArr = imgUrls.filter(Boolean);
or maybe you knew that but you like typing...

aside, despite it's complexity, your whole posted code in #12 is still 3X slower than the one-liner i posted...
__________________
my site (updated 5/13)
STATS (2013/5) HTML5:90.2% MOB:14% IE7:0.5% IE8:8.6% IE9:9.8% IE10:10%

Last edited by rnd me; 11-28-2012 at 02:02 PM..
rnd me is online now   Reply With Quote
Users who have thanked rnd me for this post:
johnsmith153 (11-28-2012)
Old 11-28-2012, 03:04 PM   PM User | #14
Philip M
Supreme Master coder!

 
Philip M's Avatar
 
Join Date: Jun 2002
Location: London, England
Posts: 17,037
Thanks: 197
Thanked 2,411 Times in 2,389 Posts
Philip M has a spectacular aura aboutPhilip M has a spectacular aura aboutPhilip M has a spectacular aura about
Quote:
Originally Posted by rnd me View Post
is shorter and faster as:
Code:
var newArr = imgUrls.filter(Boolean);
or maybe you knew that but you like typing...

aside, despite it's complexity, your whole posted code in #12 is still 3X slower than the one-liner i posted...
Yes, I am sure you are right. Yours could well be a few milliseconds faster. Who cares? I don't think my code is complex.

Also, mine has the merit that it works. I tested your one-liner with my string and got

stuff/mydirectory/images/xyz.jpg,../pqr_50.bmp

var newArr = imgUrls.filter(Boolean); is a feature of HTML5 so only works in the most modern browsers. Oddly enough some benighted people are still running IE7 or IE8. I feel that there is no point is coding which does not work in all reasonably expected browsers (say IE7+). Especially when the older code does just exactly the same thing. Like typing? Not really, but I am not bone idle either.

However, johnsmith153 (who for some reason is said to be infamous around these parts) seems to be happy, so I shall regard this thread as closed.
__________________

All the code given in this post has been tested and is intended to address the question asked.
Unless stated otherwise it is not just a demonstration.

Last edited by Philip M; 11-28-2012 at 03:53 PM.. Reason: Correction
Philip M is offline   Reply With Quote
Old 11-28-2012, 05:46 PM   PM User | #15
rnd me
Senior Coder

 
rnd me's Avatar
 
Join Date: Jun 2007
Location: Urbana
Posts: 3,457
Thanks: 9
Thanked 466 Times in 450 Posts
rnd me is a jewel in the roughrnd me is a jewel in the roughrnd me is a jewel in the rough
Quote:
Originally Posted by Philip M View Post
mine has the merit that it works. I tested your one-liner with my string and got

stuff/mydirectory/images/xyz.jpg,../pqr_50.bmp


var newArr = imgUrls.filter(Boolean); is a feature of HTML5 so only works in the most modern browsers. Oddly enough some benighted people are still running IE7 or IE8. I feel that there is no point is coding which does not work in all reasonably expected browsers (say IE7+). Especially when the older code does just exactly the same thing. Like typing? Not really, but I am not bone idle either.

um, the regexp is for CSS, not self-made-up testcases of url-ish junk. mine finds all the ones that the orig code does in the heap of css i tested:

http://cdn.atlas.illinois.edu/css/?f...les/screen.css






the old what about 3-versions-behind IE users? i expected it.
this is an ECMA5 thing, not a HTML5 thing. why would HTML spec a core JS method anyway; that doesn't make sense... it's really from prototype.js, but better than that library's version because of the 2nd this argument.
i really think that average coders can shim such basic backwards-compatibility if need be. maybe it is asking too much, i'll consider it. Then again, someone around here always seems to stick up for the 5 year old browser crowd...


either way, no harm, a simple drop in can support all the way back to IE4 if need be:
Code:
Array.prototype.filter = [].filter || function(d, e) {
	for (var f = this.length, b = [], a = 0; a < f; a++) if (a in this) {
		var c = this[a];
		d.call(e, c, a, this) && b.push(c)
	}
	return b
};
even with the compat dropin (which nobody types), the combined code is still shorter than doing it the loopy way.

i really can't live without map/filter, so i don't understand how anyone doing any data processing would want to skip over that incredibly useful set of functionality. that's just me. It's the best thing to happen to JS since new Date().getFullYear()...


if you want to safely live in the luxury of functional coding, include fallback compat drop-ins with your standard library.

here are all 9 methods ( map, filter, every, some, lastIndexOf, indexOf, reduce, reduceRight, forEach ) in under 1000bytes:


Code:
(function (){var $="concat",B="length",C="call",e=Array.prototype,t,n,r={map:function(e,t){var n=this[$](),r=n[B],i=0,s=[];for(;i<r;i++)i in n&&(s[i]=e[C](t,n[i],i,n));return s},filter:function(e,t){var n=this[$](),r=n[B],i=0,s=[],o=0;for(;i<r;i++)i in n&&e[C](t,n[i],i,n)&&(s[o++]=n[i]);return s},every:function(e,t){var n=this[$](),r=n[B],i=0;return r&&n.filter(e,t)[B]==r},some:function(e,t){var n=this[$](),r=n[B],i=1;for(;r--;)if(r in n&&e[C](n,n[r],r,n)&&!--i)return!0;return!1},lastIndexOf:function(e,t){var n=this[$](),r=n[B],i=t||-1;for(;r>i;r--)if(r in n&&n[r]===e)return l;return-1},indexOf:function(e,t){var n=this[$](),r=n[B],i=t||0;for(;i<r;i++)if(i in n&&n[i]===e)return i;return-1},reduce:function(e,t){var n=this[$](),r=n[B],i=0,s=t||n[i++];for(;i<r;i++)s=e[C](null,s,n[i],i,n);return s},reduceRight:function(e,t){var n=this[$](),r=n[B],i=r-1,s=t||n[i--];for(;i>-1;i--)s=e[C](null,s,n[i],i,n);return s},forEach:function(e,t){this[$]().map(e,t)}};for(t in r)n=e[t],e[t]=n||r[t]})();
unlike last time; look ma, no eval()!
__________________
my site (updated 5/13)
STATS (2013/5) HTML5:90.2% MOB:14% IE7:0.5% IE8:8.6% IE9:9.8% IE10:10%

Last edited by rnd me; 11-28-2012 at 05:52 PM..
rnd me is online now   Reply With Quote
Reply

Bookmarks

Jump To Top of Thread


Thread Tools
Rate This Thread
Rate This Thread:

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT +1. The time now is 09:42 PM.


Advertisement
Log in to turn off these ads.