
[Java] Regular Expression for URL detection in a sentence


Regular expressions are very useful. They help a lot when we need to filter out the strings we expect. In this post, I want to introduce one use case for regex: detecting URL links in a sentence.

To detect URLs that start with "www.", "http" or "https" in a sentence (the pattern also accepts ftp, gopher, telnet and file links), you can use the regular expression below:

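// Imports needed by the method below (place the method inside a class).
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public List<String> extractUrls(String text) {
    List<String> containedUrls = new ArrayList<String>();

    // Simpler variant that only matches URLs with an explicit scheme:
    // String urlRegex = "((https?|ftp|gopher|telnet|file):((//)|(\\\\))+[\\w\\d:#@%/;$()~_?\\+-=\\\\\\.&]*)";
    String urlRegex = "("
            + "("
            + "((https?|ftp|gopher|telnet|file):((//)|(\\\\)))" // URLs starting with http://, https://, or ftp://
            + "|"
            + "(^|[^\\/])(www\\.)" // URLs starting with "www." (without // before it, or it would re-match the ones done above)
            + ")"
            + "+[\\w\\d:#@%/;$()~_?\\+-=\\\\\\.&]*"
            + ")";
    Pattern pattern = Pattern.compile(urlRegex, Pattern.CASE_INSENSITIVE);

    Matcher urlMatcher = pattern.matcher(text);
    while (urlMatcher.find()) {
        // Collect every match; it may still carry extra characters around the link.
        containedUrls.add(urlMatcher.group());
    }

    return containedUrls;
}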

The regular expression above only detects the words that contain a URL, so a match can still carry extra characters around the link, such as: "<<>>>aaa", etc. To remove these unexpected characters, you can post-process the matches in your code.

Please refer to my_demo to see how to remove the unexpected words after filtering with the regular expression.
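As a rough illustration of that cleanup step, here is a minimal sketch. It is not the code from my_demo: the class name UrlCleaner, the methods clean and cleanAll, and the two helper patterns are hypothetical names chosen for this example. It assumes each raw match is a string such as " www.example.com" (the "www." branch of the pattern also captures the one character in front of the link) or "http://example.com/page." (sentence punctuation stuck to the end), and it trims everything before the first scheme or "www." and any trailing punctuation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of any library.
final class UrlCleaner {
    // Where the real URL begins inside a raw match: a known scheme or "www.".
    private static final Pattern URL_START = Pattern.compile(
            "(https?|ftp|gopher|telnet|file):|www\\.", Pattern.CASE_INSENSITIVE);

    // Trailing characters that usually belong to the sentence, not to the link.
    private static final Pattern TRAILING_JUNK = Pattern.compile("[.,;:<>()]+$");

    /** Trims a raw match down to the URL; returns "" if no URL start is found. */
    static String clean(String rawMatch) {
        Matcher start = URL_START.matcher(rawMatch);
        if (!start.find()) {
            return "";
        }
        // Drop everything in front of the scheme or "www.".
        String url = rawMatch.substring(start.start());
        // Drop punctuation left over at the end of the match.
        return TRAILING_JUNK.matcher(url).replaceAll("");
    }

    /** Cleans every raw match and skips the ones that contain no URL at all. */
    static List<String> cleanAll(List<String> rawMatches) {
        List<String> cleaned = new ArrayList<String>();
        for (String raw : rawMatches) {
            String url = clean(raw);
            if (!url.isEmpty()) {
                cleaned.add(url);
            }
        }
        return cleaned;
    }
}
```

You could pass the list returned by extractUrls straight into UrlCleaner.cleanAll. Note that stripping a trailing ")" is a trade-off: it helps with links written inside parentheses, but it would also cut a URL that legitimately ends with a closing parenthesis, so adjust the trailing character set to your own data.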


Hope it helps 🙂

