JS RegEx for matching a complete URL [duplicate]
This question already has an answer here:
Extracting for URL from string using regex
3 answers
I'm trying to match a URL in a string of text and I'm using this regex to search for a URL:
/\b(https?:\/\/.*?\.[a-z]{2,4}\b)/g
The problem is, it only ever matches the protocol and domain, and nothing else that follows.
Example:
let regEx = /\b(https?:\/\/.*?\.[a-z]{2,4}\b)/g;
let str = 'some text https://website.com/sH6Sd2x some more text';
console.log(str.match(regEx));
Returns:
https://website.com
How would I alter the regex so it will return the full URL?
https://website.com/sH6Sd2x
javascript regex match
marked as duplicate by Wiktor Stribiżew
Nov 25 '18 at 21:09
This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.
Your regexp ends with \.[a-z]{2,4}\b, so that will only match the top-level domain part of the URL.
– Barmar
Nov 25 '18 at 21:05
@Barmar, yes thanks, I'm aware of that. My question was how to alter the regex to include the rest?
– spice
Nov 25 '18 at 21:07
A usual URL extraction pattern assumes there is no whitespace after the protocol. Try just /\bhttps?:\/\/\S+\b/g
– Wiktor Stribiżew
Nov 25 '18 at 21:07
@WiktorStribiżew yep that's it, thank you very much :)
– spice
Nov 25 '18 at 21:08
Question asked Nov 25 '18 at 21:01 by spice; edited Nov 25 '18 at 21:50
2 Answers
The reason it stops there is that your expression ends with \.[a-z]{2,4} which I guess is intended to match the top-level domain (.com, .net, .uk, etc.). After that it stops matching.
The solution: add \/[^\s]* to the expression. This matches a further slash and zero or more non-whitespace characters.
Note that \S (with capital S) is equivalent to [^\s] (with lowercase s), so use whichever you prefer.
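Both points can be checked with a short sketch. The sample string is the one from the question; the variable names are illustrative only.

```javascript
const str = 'some text https://website.com/sH6Sd2x some more text';

// The question's pattern: the lazy .*? stops as soon as \.[a-z]{2,4}\b
// can match, which is right after ".com".
const original = /\b(https?:\/\/.*?\.[a-z]{2,4}\b)/g;
console.log(str.match(original)); // [ 'https://website.com' ]

// \S and [^\s] accept exactly the same characters,
// so these two patterns return the same matches.
const withClass = /https?:\/\/[^\s]*/g;
const withShorthand = /https?:\/\/\S*/g;
console.log(str.match(withClass));     // [ 'https://website.com/sH6Sd2x' ]
console.log(str.match(withShorthand)); // [ 'https://website.com/sH6Sd2x' ]
```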
Demo:
let regEx = /\b(https?:\/\/.*?\.[a-z]{2,4}\/[^\s]*\b)/g;
let str = 'some text https://website.com/sH6Sd2x some more text';
console.log(str.match(regEx));
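One thing to keep in mind with this pattern: because the slash after the TLD is mandatory, a URL without a path is not matched at all. The sketch below shows that, along with a hypothetical variant (not part of the answer) that makes the path optional.

```javascript
// Pattern from the demo above: requires a "/" right after the TLD.
const withSlash = /\b(https?:\/\/.*?\.[a-z]{2,4}\/[^\s]*\b)/g;

const withPath = 'some text https://website.com/sH6Sd2x some more text';
const noPath = 'some text https://website.com some more text';

console.log(withPath.match(withSlash)); // [ 'https://website.com/sH6Sd2x' ]
console.log(noPath.match(withSlash));   // null

// Hypothetical variant: wrap the slash and path in an optional group.
const optionalPath = /\bhttps?:\/\/.*?\.[a-z]{2,4}(?:\/[^\s]*)?\b/g;

console.log(withPath.match(optionalPath)); // [ 'https://website.com/sH6Sd2x' ]
console.log(noPath.match(optionalPath));   // [ 'https://website.com' ]
```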
You might even shorten it further if you realize that URLs never contain whitespace, so matching the domain explicitly is not needed; worse, it may even cause trouble (e.g. .museum is also a valid TLD, but [a-z]{2,4} excludes it).
Enhanced version (shorter regex and more accurate):
let regEx = /\b(https?:\/\/\S*\b)/g;
let str = 'some text https://website.com/sH6Sd2x some more text';
console.log(str.match(regEx));
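As a rough illustration of how the shorter pattern behaves on other input (the sample strings here are made up), the trailing \b makes the match end on a word character. That trims trailing punctuation, but it also drops a trailing slash.

```javascript
// Same pattern as above, without the capture group (not needed for match()).
const urlPattern = /\bhttps?:\/\/\S*\b/g;

// Several URLs in one string; the comma and the final period are left out.
const text = 'Visit https://example.com/a, then http://example.org/b?x=1.';
console.log(text.match(urlPattern));
// [ 'https://example.com/a', 'http://example.org/b?x=1' ]

// Caveat: a URL ending in "/" loses that slash for the same reason.
const trailing = 'see https://example.com/dir/ now';
console.log(trailing.match(urlPattern));
// [ 'https://example.com/dir' ]
```

If the trailing slash matters, dropping the final \b and trimming punctuation separately may be a better fit.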
Yep this is exactly what I was looking for. Thank you so much @Peter!
– spice
Nov 25 '18 at 21:10
Since the regexp ends with \.[a-z]{2,4}\b, it only matches up to the top-level domain part of the hostname in the URL. You need to match the rest of the URL after that. This version also matches any non-whitespace characters that follow:
let regEx = /\bhttps?:\/\/.*?\.[a-z]{2,4}\b\S*/g;
See Detect URLs in text with JavaScript for more complete solutions to matching URLs.
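One possible way to tighten a regex-based extraction, sketched here as an assumption rather than as what the linked question recommends, is to treat regex matches as candidates and let the built-in URL constructor reject anything that does not parse. The function name extractUrls and the sample strings are hypothetical.

```javascript
// Pattern from the answer above, used only to find candidates.
const candidatePattern = /\bhttps?:\/\/.*?\.[a-z]{2,4}\b\S*/g;

// Hypothetical helper: keep only candidates that the URL constructor accepts.
function extractUrls(text) {
  const candidates = text.match(candidatePattern) || [];
  return candidates.filter((candidate) => {
    try {
      new URL(candidate); // throws a TypeError for strings that do not parse
      return true;
    } catch {
      return false;
    }
  });
}

console.log(extractUrls('some text https://website.com/sH6Sd2x some more text'));
// [ 'https://website.com/sH6Sd2x' ]

// The lazy .*? can run across whitespace, so the regex alone yields the
// candidate 'https://foo bar.com'; the URL constructor filters it out.
console.log(extractUrls('broken https://foo bar.com link'));
// []
```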
First answer: answered Nov 25 '18 at 21:08 by Peter B; edited Nov 25 '18 at 21:12
Second answer: answered Nov 25 '18 at 21:08 by Barmar