Finding a Unicode URL

Off-topic posts of interest to the "Everything" community.
Debugger
Posts: 204
Joined: Thu Jan 26, 2017 11:56 am

Finding a Unicode URL

Post by Debugger » Sat Oct 20, 2018 8:18 am

Please improve this regex so that it detects links containing Unicode characters:

(news|http|ftp|https):\/\/[\w\-_]+(\.[\w]+)+([\w\-\.,@?^=%&:/~\+#]*[\w\-\@?^=%&/~\+#])?
Last edited by Debugger on Sat Oct 20, 2018 9:35 am, edited 1 time in total.

void
Site Admin
Posts: 4193
Joined: Fri Oct 16, 2009 11:31 pm

Re: Finding a Unicode URL

Post by void » Sat Oct 20, 2018 9:11 am

Please try searching for:

(news|http|ftp|https):\/\/.*[^\x00-\x7f]

(news|http|ftp|https) = match news, http, ftp or https
: = match a literal :
\/ = match a literal /
.* = match any character, any number of times
[^\x00-\x7f] = match a non-ASCII character.
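
To check the pattern outside of Everything, here is a minimal sketch using Python's `re` module. The helper name `has_unicode_url` and the sample strings are made up for illustration, and Python's regex engine may differ in small ways from the one Everything uses.

```python
import re

# The pattern from the post above, copied unchanged.
PATTERN = re.compile(r"(news|http|ftp|https):\/\/.*[^\x00-\x7f]")


def has_unicode_url(text: str) -> bool:
    """Return True if a scheme:// prefix is followed, somewhere later
    on the same line, by at least one non-ASCII character."""
    return PATTERN.search(text) is not None


# Hypothetical sample strings, not taken from the thread.
samples = [
    "http://example.com/caf\u00e9",   # non-ASCII character in the path
    "https://example.com/cafe",       # ASCII only
    "ftp://\u043f\u0440\u0438\u043c\u0435\u0440.example/file",  # non-ASCII host
]

for s in samples:
    print(has_unicode_url(s), s)
```

Note that `.*` also matches spaces, so a non-ASCII character anywhere after the `://` on the same line counts as a match, even if it is not part of the URL itself.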

Debugger
Posts: 204
Joined: Thu Jan 26, 2017 11:56 am

Re: Finding a Unicode URL

Post by Debugger » Sat Oct 20, 2018 9:39 am

Thanks, it works, but I want to match both cases: URLs with Unicode characters and URLs without them.
