Regex replace text but exclude when text is between specific tag

I have the following string:

Lorem ipsum Test dolor sit amet, consetetur sadipscing elitr, sed diam nonumy <a href="http://Test.com/url">Test</a> eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd sed Test dolores et ea rebum. Stet clita kasd gubergren, no sea <a href="http://url.com">Test xyz</a> takimata sanctus est Lorem ipsum dolor sit amet.

Now I would like to replace the string 'Test' where it appears outside of tags, but not inside a tag or between an opening and closing tag (e.g. replace it with '1234'):

Lorem ipsum 1234 dolor sit amet, consetetur sadipscing elitr, sed diam nonumy <a href="http://Test.com/url">Test</a> eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd sed 1234 dolores et ea rebum. Stet clita kasd gubergren, no sea <a href="http://url.com">Test xyz</a> takimata sanctus est Lorem ipsum dolor sit amet.

I started with this regex: (?!<a[^>]*>)(Test)([^<])(?!</a>)

But two problems remain unsolved:

The text 'Test' also gets replaced inside tags.

If the text between the tags does not exactly match the search text, it is also replaced (e.g. <a href="http://url">Test xyz</a>).

I hope someone has a solution to solve this problem.

asked Sep 19 '12 at 10:44

Weri

regex


4 Answers

(?!<a[^>]*?>)(Test)(?![^<]*?</a>)

Same as zb226's answer, but optimized with a lazy match.

Also, using regexes on raw HTML is not recommended.

answered Sep 19 '12 at 11:48

protist

I also added \b to match a word boundary: (?!<a[^>]*?>)(\bTest\b)(?![^<]*?</a>)

– Weri
Sep 19 '12 at 12:34

That should give the regex optimizer more to work with. It also should not adversely affect your matches, as long as _Test_, _Test, or Test_ are not in your document (and assuming you would not care to match them if they were).

– protist
Sep 19 '12 at 13:10

The lookahead before Test and the lazy match are meaningless. See my answer.

– Adam
Oct 25 '17 at 16:38

Answer

Use

(Test)(?!(.(?!<a))*</a>)

Explanation

Let me remind you of the meaning of some symbols:

1) ?! is a negative lookahead. For example, r(?!d) selects every r that is not directly followed by a d.

2) Therefore a negative lookahead is normally attached to something that is matched in front of it. On its own, (?!d) consumes no characters and matches only the empty positions that are not followed by d.

3) A ? after a quantifier makes it lazy. For example, applied to

123EEE

.+E selects the whole string 123EEE. However, .+?E selects as few "any character" matches (.+) as needed, so it selects only 123E.
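The three points above can be checked quickly. The sketch below assumes Python's built-in re module (the thread itself does not name a language), and the test strings "rd ra r" and "ad" are made up for illustration.

```python
import re

# 1) r(?!d): an "r" counts only when the next character is not "d".
lookahead_hits = [m.start() for m in re.finditer(r"r(?!d)", "rd ra r")]

# 2) A bare (?!d) consumes nothing; it only finds empty positions
#    that are not followed by "d".
bare_hits = [m.start() for m in re.finditer(r"(?!d)", "ad")]

# 3) Greedy versus lazy quantifier on the string from the text.
greedy = re.match(r".+E", "123EEE").group(0)   # "123EEE"
lazy = re.match(r".+?E", "123EEE").group(0)    # "123E"

print(lookahead_hits, bare_hits, greedy, lazy)
```

Every hit in bare_hits has zero length, which is why a lookahead with nothing around it selects no text at all.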

Answer:

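protist's answer suggests (?!<a[^>]*?>)(Test)(?![^<]*?</a>). Let me first explain how to make this shorter.

As mentioned in 2), the lookahead in front of the match adds nothing here: at the position where Test starts, the next character is T, so <a[^>]*?> can never match there and the negative lookahead always succeeds. So the following is equivalent to protist's answer: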

(Test)(?![^<]*?</a>)

Also, since [^<] cannot cross a <, making the quantifier lazy with ? changes nothing, so it is also equivalent to

(Test)(?![^<]*</a>)

This selects all Test that are not followed by an </a> without the symbol < in between. This is why Test which appears before or after any <a ...> .. </a> will be replaced.
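As a rough check of that equivalence, the three variants can be run against a shortened version of the question's string. This is a sketch assuming Python's re module; the shortened text and the variable names are illustrative only.

```python
import re

text = (
    'Lorem ipsum Test dolor <a href="http://Test.com/url">Test</a> eirmod '
    'sed Test dolores <a href="http://url.com">Test xyz</a> takimata'
)

patterns = [
    r'(?!<a[^>]*?>)(Test)(?![^<]*?</a>)',  # protist's original
    r'(Test)(?![^<]*?</a>)',               # leading lookahead dropped
    r'(Test)(?![^<]*</a>)',                # lazy quantifier dropped as well
]

# Apply each variant with the same replacement.
results = [re.sub(p, '1234', text) for p in patterns]
print(results[2])
```

All three produce the same string on this input: only the two occurrences outside the links become 1234.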

However, note that

Lorem Test dolor <a href="http://Test.com/url">Test <strong>dolor</strong></a> eirmod

would be changed to

Lorem 1234 dolor <a href="http://1234.com/url">1234 <strong>dolor</strong></a> eirmod

In order to catch that you could change your regex to

(Test)(?!(.(?!<a))*</a>)

which does the following:

Select every word Test that is not followed by a string ***</a> where each character in *** is not followed by <a.

Note that the dot . is important (see 2)).

Note that a lazy match like (Test)(?!(.(?!<a))*?</a>) is not needed, because nested links are not allowed in HTML4 or HTML5 (something like <a href="#">..<a href="#">...</a>..</a>).
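To see the difference on the nested-tag example, here is a small sketch, again assuming Python's re module. It also assumes the input sits on one line, because . does not match newlines unless re.DOTALL is set.

```python
import re

html = ('Lorem Test dolor <a href="http://Test.com/url">'
        'Test <strong>dolor</strong></a> eirmod')

# Simple version: fooled by the <strong> tag inside the link.
simple = re.sub(r'(Test)(?![^<]*</a>)', '1234', html)

# Version that walks forward character by character and gives up
# as soon as the next thing would be an opening <a.
tempered = re.sub(r'(Test)(?!(.(?!<a))*</a>)', '1234', html)

print(simple)
print(tempered)
```

The first call rewrites the href and the link text as well, while the second leaves the whole link untouched and only replaces the Test before it.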

protist said

Also, using regexes on raw HTML is not recommended.

I agree with that. One problem is that the result is wrong whenever a tag is left unclosed or unopened. For example, because of the stray </a>, (Test)(?!(.(?!<a))*</a>) would change

Lorem Test dolor Test <strong>dolor</strong></a> eirmod Test dolores sea Test takimata

to

Lorem Test dolor Test <strong>dolor</strong></a> eirmod 1234 dolores sea 1234 takimata

edited Nov 14 '18 at 9:12

answered Oct 25 '17 at 16:38

Adam

Great answer, worked perfect for me

– Justin E. Samuels
Oct 16 '18 at 22:38

This should do the trick:

(Test)(?![^<]*</a>)

Try it yourself on regexr.

edited Apr 17 '18 at 23:03

answered Sep 19 '12 at 11:24

zb226

It is meaningless to put a lookahead before the match

– Adam
Apr 17 '18 at 22:43

@Adam That's of course correct, thanks for the heads up :)

– zb226
Apr 17 '18 at 23:04

Resurrecting this ancient question because it had a simple solution that wasn't mentioned.

With all the disclaimers about using regex to parse html, here is a simple way to do it.

Method for Perl / PCRE

<a[^>]*>[^<]*</a(*SKIP)(*F)|Test

General Solution

<a[^>]*>[^<]*</a|(Test)

In this version, the text to be replaced is captured in Group 1 and the replacement is performed by a simple callback or lambda.
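A minimal sketch of that callback approach, assuming Python's re module; the function name replace_outside_links and the shortened sample string are made up for illustration. The (*SKIP)(*F) variant is not shown because the standard re module does not support those verbs.

```python
import re

# Either a whole link is matched (group 1 stays empty),
# or a bare "Test" is captured in group 1.
PATTERN = re.compile(r'<a[^>]*>[^<]*</a|(Test)')


def replace_outside_links(text, replacement='1234'):
    """Replace 'Test' unless it was swallowed by the link alternative."""
    def callback(match):
        # Group 1 is set only when the bare "Test" alternative matched.
        if match.group(1) is not None:
            return replacement
        # Otherwise a link was matched: put it back unchanged.
        return match.group(0)

    return PATTERN.sub(callback, text)


sample = ('Lorem ipsum Test dolor <a href="http://Test.com/url">Test</a> '
          'sed Test dolores <a href="http://url.com">Test xyz</a> takimata')
print(replace_outside_links(sample))
```

Like the pattern itself, this only skips links whose content contains no further tags, since [^<]* stops at the first <.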

Reference

How to match pattern except in situations s1, s2, s3

For code implementation see the code samples in How to match a pattern unless...

edited May 23 '17 at 12:25

Community♦

answered May 15 '14 at 0:06

zx81

The most important part for me was to know $replaced = preg_replace_callback( $regex, function($m) { if(empty($m[1])) return $m[0]; else return "Superman";}, $subject);. So I need to return $m[0] if $m[1] is empty. Really nice to know. Thank you!

– mgutt
Apr 4 '15 at 14:03
