I'm a developer for a digital marketing company. I have been tasked with updating and cataloging our portfolio of around 400 sites. First, I'd like to get an idea of which sites are at which version. I decided Python would be the tool to use and started learning it.
It's still early days, but I'm coming up against an annoying problem. Maybe you guys could help?

    import csv
    import urllib2

    with open('domains.csv','r') as csvfile:
        urls = [row for row in csv.reader(csvfile)]

    L = ['Wordpress 0.7','Wordpress 1.2','Wordpress 1.5','Wordpress 2.0','Wordpress 2.1','Wordpress 2.3','Wordpress 2.5','Wordpress 2.6','Wordpress 2.7','Wordpress 2.8','Wordpress 2.9','Wordpress 2.9','Wordpress 3.0','Wordpress 3.1','Wordpress 3.2','Wordpress 3.3','Wordpress 3.4','Wordpress 3.5.1']

    for url in urls:
        sock = urllib2.urlopen(url)
        htmlSource = sock.read()
        for ver in L:
            if htmlSource.find(ver)== 1:
                print url + " | Wordpress | " + ver

This executes without a problem. However, comparing the result of 'htmlSource.find(ver)' to either 1 or -1 in the 'if' line produces odd results.
It either prints url + " | Wordpress | " + ver for every iteration of the loop, for all URLs, or it prints nothing at all.
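For reference, `str.find` returns the index of the first match, or -1 when there is no match, and the search is case-sensitive. Comparing the result to 1 is therefore only true when the match starts at index 1, and comparing it to -1 is true whenever the string is absent. A minimal sketch of that behaviour, using a made-up one-line page source (the `html_source` value is an assumption for illustration, not taken from any real site):

```python
# Hypothetical page source, invented for this illustration.
html_source = '<meta name="generator" content="WordPress 3.5.1" />'

# find() gives the position of the first match, or -1 if there is none.
hit = html_source.find('WordPress 3.5.1')

# Lowercase "p": the search is case-sensitive, so this does not match.
miss = html_source.find('Wordpress 3.5.1')

# The "in" operator answers the yes/no question directly.
found_with_in = 'WordPress 3.5.1' in html_source
```

Here `hit` is a non-negative index well past 1, and `miss` is -1, so a test of `== 1` fails in both cases while a test of `== -1` succeeds for every string that is not present.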
Basically, I want to check whether the string "Wordpress X.X" is found (X.X being the version number). The source code for all those pages contains a meta tag with the value "WordPress X.X", but I don't seem to be detecting it.
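One possible way to do that check without listing every version is to pull the version number out of the meta tag with a regular expression. This is only a sketch: the function name `wordpress_version` and the exact tag layout (`name="generator"` followed by `content="WordPress X.X"`) are assumptions, and real pages may order or quote the attributes differently.

```python
import re

# Assumed tag shape: <meta name="generator" content="WordPress X.X" />
# IGNORECASE lets both "WordPress" and "Wordpress" match.
GENERATOR_RE = re.compile(
    r'<meta[^>]*name=["\']generator["\'][^>]*content=["\']WordPress\s+([\d.]+)',
    re.IGNORECASE,
)


def wordpress_version(html_source):
    """Return the version string from the generator meta tag, or None."""
    match = GENERATOR_RE.search(html_source)
    return match.group(1) if match else None
```

Called on the source of each page, this returns something like the version string when the tag is present and `None` otherwise, so the loop over a hard-coded version list is no longer needed.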