Problem
I want to use a regular expression to match a piece of a string and then get the parenthesized substring:
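The question's original snippet is not reproduced in this copy. A minimal sketch of the kind of attempt described, assuming an illustrative input string and pattern (neither is confirmed by the text here), might look like this:

```javascript
// Assumed input: the goal is to pull "abc" out of the string.
const input = "something format_abc";

// The parentheses around .*? form capturing group 1.
const result = /(?:^|\s)format_(.*?)(?:\s|$)/.exec(input);

console.log(result[0]); // the whole match, including the leading space
console.log(result[1]); // the parenthesized substring
```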
What am I doing incorrectly?
I realized that the regular expression code above was correct: the actual string I was testing against was as follows:
"date format_%A"
Reporting that “%A” is undefined seems like very strange behavior, but it is not directly related to this question, so I’ve opened a new one: Why is a matched substring returning “undefined” in JavaScript?
The issue was that console.log treats its first argument like a printf format string: because the string I was logging (“%A”) had a special meaning, it was trying to find the value of the next parameter.
Asked by nickf
Solution #1
You can access capturing groups like this:
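As a minimal sketch (the sample string and variable names are illustrative assumptions), exec returns a match array in which index 0 holds the whole match and index n holds the nth capturing group, or null when nothing matches:

```javascript
const myString = "something format_abc";
const myRegexp = /(?:^|\s)format_(.*?)(?:\s|$)/;

const match = myRegexp.exec(myString);

// exec() returns null when there is no match, so guard before indexing.
const firstGroup = match === null ? undefined : match[1];
console.log(firstGroup); // abc
```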
You can iterate over multiple matches if there are any:
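One way this iteration might look, sketched with an assumed sample string and a simplified pattern: with the g flag, each exec call resumes from the regexp's lastIndex and returns null once the matches run out.

```javascript
const text = "something format_abc something format_def something format_ghi";
const pattern = /format_(\S+)/g; // the g flag is required for the loop to advance

const groups = [];
let m = pattern.exec(text);
while (m !== null) {
  groups.push(m[1]); // m[0] is the whole match, m[1] the first group
  m = pattern.exec(text);
}
console.log(groups); // ["abc", "def", "ghi"]
```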
As you can see, iterating over multiple matches was not very intuitive, which is why the String.prototype.matchAll method was proposed. The method is part of the ECMAScript 2020 specification. It gives us a clean API and solves several problems. It is available in major browsers and JS engines, including Chrome 73+, Node 12+, and Firefox 67+.
The method returns an iterator, which can be used in the following way:
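A minimal sketch of consuming that iterator with for...of, assuming an illustrative sample string and pattern. Each item has the same shape as an exec result, including the index property:

```javascript
const source = "something format_abc something format_def";
const found = [];

// matchAll() requires a regexp with the g flag; each item is a match array.
for (const match of source.matchAll(/format_(\S+)/g)) {
  found.push(`${match[1]} at index ${match.index}`);
}
console.log(found);
```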
Because it returns an iterator, we can say that it is lazy; this is useful when handling a large number of matches or very large strings. If you need to, the result can easily be converted to an Array using the spread syntax or the Array.from method:
function getFirstGroup(regexp, str) {
  const array = [...str.matchAll(regexp)];
  return array.map(m => m[1]);
}

// or:
function getFirstGroup(regexp, str) {
  return Array.from(str.matchAll(regexp), m => m[1]);
}
If you need to support older environments that lack the method, you can use the official shim package.
The method’s core workings are also straightforward. The following is an equivalent implementation using a generator function:
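function* matchAll(str, regexp) {
  // Work on a copy that has the global flag so exec() advances through str.
  const flags = regexp.global ? regexp.flags : regexp.flags + "g";
  const re = new RegExp(regexp, flags);
  let match;
  // exec() returns null once there are no more matches.
  while ((match = re.exec(str)) !== null) {
    // Step past zero-length matches so the loop cannot stall.
    if (match[0] === "") re.lastIndex++;
    yield match;
  }
}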
To avoid side effects caused by the mutation of the lastIndex property when running through several matches, a copy of the original regexp is produced.
To avoid an infinite loop, we must also guarantee that the regexp has the global flag.
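The lastIndex side effect mentioned above can be seen in a small sketch (the pattern and sample string are illustrative assumptions). A global regexp remembers where its last match ended, while a copy starts from the beginning:

```javascript
const shared = /format_(\w+)/g;
const sampleText = "format_abc format_def";

// After one exec(), the original regexp's lastIndex points past the first match.
shared.exec(sampleText);
const indexAfterFirstExec = shared.lastIndex;

// A copy starts again at index 0 and leaves the original untouched.
const copy = new RegExp(shared, shared.flags);
const copyStart = copy.lastIndex;
const firstFromCopy = copy.exec(sampleText)[1];

console.log(indexAfterFirstExec, copyStart, firstFromCopy);
```

Code that received the original regexp from a caller would otherwise silently change where that caller's next exec or test call begins.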
I’m also happy to see that this StackOverflow question was referenced in the discussions of the proposal.
Answered by Christian C. Salvadó
Solution #2
For each match, you may use the following method to get the nth capturing group:
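The method itself is not shown in this copy. A minimal sketch of such a helper, assuming a hypothetical name getMatches and a regexp that carries the g flag:

```javascript
// Hypothetical helper: collects capturing group `index` from every match.
function getMatches(string, regex, index = 1) {
  if (!regex.global) {
    throw new TypeError("getMatches expects a regexp with the g flag");
  }
  regex.lastIndex = 0; // start from the beginning regardless of earlier use
  const matches = [];
  let match;
  while ((match = regex.exec(string)) !== null) {
    matches.push(match[index]);
    // Step past zero-length matches so the loop cannot stall.
    if (match[0] === "") regex.lastIndex++;
  }
  return matches;
}

const sample = "something format_abc something format_def";
console.log(getMatches(sample, /format_(\S+)/g)); // ["abc", "def"]
```

Defaulting index to 1 keeps the common case short, since the first capturing group is usually the one wanted.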
Answered by Mathias Bynens
Solution #3
The \b is not exactly the same thing. (It works on --format_foo/, but not on format_a_b.) However, I wanted to show an alternative to your expression, which is fine as it is. Of course, the match call is the important part.
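The alternative expression this answer refers to is not shown in this copy. A sketch of what a word-boundary version might look like, assuming the sample string and pattern below (both illustrative):

```javascript
const myString = "something format_abc";

// \b matches between a word character and a non-word character (or a string edge).
const arr = myString.match(/\bformat_(.*?)\b/);
console.log(arr[0] + " " + arr[1]);

// Unlike a whitespace-delimited pattern, this also matches next to punctuation.
const punctuated = "--format_foo/".match(/\bformat_(.*?)\b/);
console.log(punctuated[1]);
```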
Answered by PhiLho
Solution #4
Last but not least, I discovered one piece of code (JS ES6) that worked perfectly for me:
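The code is not shown in this copy. A minimal sketch consistent with the result listed below, assuming an input string that contains these hashtags (the sample text is illustrative):

```javascript
const reg = /#(\S+)/g; // a hashtag: "#" followed by non-whitespace characters
const string = "mi alegría es total!\n#fiestasdefindeaño #PadreHijo #buenosmomentos #france #paris";

// match() with the g flag returns whole matches only (or null), so strip
// the leading "#" by replacing each match with its first capturing group.
const matches = (string.match(reg) || []).map(e => e.replace(reg, "$1"));
console.log(matches);
```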
This will return:
['fiestasdefindeaño', 'PadreHijo', 'buenosmomentos', 'france', 'paris']
Answered by Sebastien H.
Solution #5
With regard to the multi-match parentheses examples above, I came here looking for an answer after not getting what I wanted from:
var matches = mystring.match(/(?:neededToMatchButNotWantedInResult)(matchWanted)/igm);
After looking at the slightly convoluted function calls with while and .push() above, it dawned on me that the problem can be solved very elegantly with mystring.replace() instead. The replacing is not the point, and the result is not even used; the clean, built-in option of passing a function as the second parameter, which is called once for every match, is:
var yourstring = 'something format_abc something format_def something format_ghi';
var matches = [];
yourstring.replace(/format_([^\s]+)/igm, function(m, p1){ matches.push(p1); } );
After this, I’m not sure I’ll ever use .match() for anything else.
Answered by Alexz
Post is based on https://stackoverflow.com/questions/432493/how-do-you-access-the-matched-groups-in-a-javascript-regular-expression