Wednesday, January 24, 2018

regex - Regular expression with?




I am going through learncodethehardway's regular expression courses.



One of the matching criteria is [A-Za-z]+?



The author explains that adding the ? makes this matching criterion "non-greedy".



However, I think the following two are equivalent. Shouldn't they always produce the same output if I use them for parsing strings?




[A-Za-z]+?



[A-Za-z]+



Am I correct?


Answer



Actually, no, they are not the same. If you include the "?", the quantifier will match as FEW characters as possible. Here's an example:






var string = 'abcdefg';

alert(string.match(/[A-Za-z]+/));







var string = 'abcdefg';


alert(string.match(/[A-Za-z]+?/));
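To see both results side by side without alert boxes, here is a minimal sketch that runs the two patterns on the same input. The variable names (input, greedy, lazy) are only illustrative and are not part of the original answer.

```javascript
// Compare the greedy and the non-greedy quantifier on the same input.
const input = 'abcdefg';

// Greedy: + takes as many letters as it can.
const greedy = input.match(/[A-Za-z]+/);

// Non-greedy: +? stops as soon as the minimum (one letter) is satisfied.
const lazy = input.match(/[A-Za-z]+?/);

console.log(greedy[0]); // "abcdefg"
console.log(lazy[0]);   // "a"
```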





Now, a pattern with the "?" can still match multiple characters, but only if it has to, like this:






var string = 'abcdefg>';

alert(string.match(/[A-Za-z]+?>/));
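As a rough sketch of why this happens: the non-greedy quantifier starts with one letter and only grows when the rest of the pattern cannot match yet. The variable names below are made up for illustration.

```javascript
const text = 'abcdefg>';

// Nothing follows the quantifier, so one letter is enough.
const lazyAlone = text.match(/[A-Za-z]+?/)[0];

// The pattern must end with ">", so +? is forced to keep
// expanding one letter at a time until ">" can match.
const lazyWithEnd = text.match(/[A-Za-z]+?>/)[0];

console.log(lazyAlone);   // "a"
console.log(lazyWithEnd); // "abcdefg>"
```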





Now it gets a little confusing. Check out this example that does NOT include the "?" (the dot character matches any character except a line terminator such as a newline):






var string = '<sldkfj>sldkj>';

alert(string.match(/<.+>/));





You can see that it matches everything. However, with the "?", it will match only up to the first ">".






var string = '<sldkfj>sldkj>';

alert(string.match(/<.+?>/));
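The same contrast can be checked on a small input that contains one "<" and two ">" characters. The input below is a hypothetical string chosen for this sketch, and the negated character class at the end is a common alternative, not something the answer above uses.

```javascript
// Hypothetical input: one "<" followed by two ">" characters.
const sample = '<a>b>';

// Greedy: .+ runs to the end of the string, then backtracks to the LAST ">".
const greedyTag = sample.match(/<.+>/)[0];

// Non-greedy: .+? expands only until the FIRST ">" is reached.
const lazyTag = sample.match(/<.+?>/)[0];

// Alternative: [^>]+ can never cross a ">", so it also stops at the first one.
const negatedTag = sample.match(/<[^>]+>/)[0];

console.log(greedyTag);  // "<a>b>"
console.log(lazyTag);    // "<a>"
console.log(negatedTag); // "<a>"
```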





Now it's time for a practical example of why we would need the "?" symbol. Suppose I want to match everything between HTML tags:






var string = '<strong>I\'m strong<\/strong> I\'m not strong <strong>I am again.<\/strong>';

alert(string.match(/<strong>[\s\S]+<\/strong>/));






As you can see, that matched EVERYTHING between the first and last tags. To match ONLY one, use the life-saving "?" symbol again:





var string = '<strong>I\'m strong<\/strong> I\'m not strong <strong>I am again.<\/strong>';

alert(string.match(/<strong>[\s\S]+?<\/strong>/));
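Going one step further, the non-greedy pattern can be combined with the g flag and a capture group to collect the contents of every tag pair. This is only a sketch: it assumes the input contains two <strong> elements, and the names html, greedyAll and contents are invented for the example. String.prototype.matchAll requires an ES2020 environment.

```javascript
// Assumed input: two <strong> elements separated by plain text.
const html = "<strong>I'm strong</strong> I'm not strong <strong>I am again.</strong>";

// Greedy: one match that spans from the first opening tag to the last closing tag.
const greedyAll = html.match(/<strong>[\s\S]+<\/strong>/)[0];

// Non-greedy with the g flag: one match per tag pair.
// The capture group keeps only the text between the tags.
const contents = Array.from(
  html.matchAll(/<strong>([\s\S]+?)<\/strong>/g),
  (m) => m[1]
);

console.log(greedyAll === html); // true
console.log(contents);           // [ "I'm strong", "I am again." ]
```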




