*IF* all the fields contain *ONLY* URLs of the form
Code:
http(s)://(www.)xxxxx.yyy/anything
Then I think we can do this. But what about a url such as
http://download.microsoft.com? That is, a subdomain?
Are you worried about them? I think it would be tough to create a simple expression to look for those, as well.
Ignoring subdomains, this SEEMS to work:
Code:
SELECT
SUBSTR(
REPLACE(REPLACE(REPLACE(CONCAT(url,'/'),'http://',''),'https://',''),'www.',''),
1,
LOCATE('/',REPLACE(REPLACE(REPLACE(CONCAT(url,'/'),'http://',''),'https://',''),'www.',''))-1
) AS domain,
COUNT(*) AS howmany
FROM urls
GROUP BY domain
ORDER BY howmany DESC
But *ONLY* if your URL field contains urls alone, no other text.