Ad-hoc Web Testing using Python

"But Chris, I thought you were a PHP guy?" Yes, in my day-to-day work I use a lot of PHP. However, I am branching out because I want to learn new things. One of those new things is Python, both on the web and the server side. I thought I would share a tool I whipped together at work to test some web services built using an application front-end that we have released under the GPL. Of course, you need to be using our non-free sports data feed fetching software, but that's not the point.

So, as of March 1 we are turning off an old server that runs an older, Perl-using-CGI web service and redirecting the customers using it to one built using PHP (CodeIgniter specifically. Again, this decision was made before I started at that job.) As the lead engineer on the project (okay, the only one) I had to make sure that all the old queries would work on the new server. When my boss told me to "manually check them" I scoffed and said "manual checking is for suckers; I'm going to write a script to do it."

The methodology is as follows:

  1. Make sure that calls to the old web service return properly formed XML *or* HTML with a certain string in it when we are doing XSLT transforms
  2. Do the same thing with the new web service
  3. Tell me when something isn't right
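
The check in steps 1 and 2 can be sketched as a small function. This is only a minimal sketch, not the script itself: check_response and compare are names made up for illustration, and it uses a plain substring test instead of Beautiful Soup to look for the marker string.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# The string my script looks for in XSLT-transformed HTML output
MARKER = "XML Team Solutions"


def check_response(body, marker=MARKER):
    """Return "xml", "html", or "bad" for one response body."""
    try:
        minidom.parseString(body)
        return "xml"
    except ExpatError:
        pass
    # Not well-formed XML, so fall back to looking for the marker string
    if marker in body:
        return "html"
    return "bad"


def compare(old_body, new_body):
    """Return a list of problems found across the old and new responses."""
    problems = []
    for name, body in (("old", old_body), ("new", new_body)):
        if check_response(body) == "bad":
            problems.append(name + " service returned neither XML nor expected HTML")
    return problems
```

Keeping the check separate from the fetching means it can be tried on canned strings without touching either server. An empty list from compare means both responses passed.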

I could do more work in this script, such as checking that certain tags in the XML document are populated, but we can skip that for now. The new web service has been tested quite well, and we are getting back the expected results in some other tests I've written using PHPUnit.
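
If I did add that tag check, it might look something like the sketch below. The function name empty_tags and the tag names in the usage note are hypothetical; the real feed's tags would go in their place.

```python
from xml.dom import minidom


def empty_tags(xml_text, tag_names):
    """Return the names in tag_names that are missing or have no text."""
    doc = minidom.parseString(xml_text)
    missing = []
    for name in tag_names:
        elements = doc.getElementsByTagName(name)
        # A tag counts as populated if any occurrence has non-blank text
        # directly inside it; text in nested child tags is not counted
        populated = any(
            child.nodeType == child.TEXT_NODE and child.data.strip()
            for element in elements
            for child in element.childNodes
        )
        if not populated:
            missing.append(name)
    return missing
```

Calling empty_tags(xml_text, ["team-name", "score"]) would hand back whichever of those two made-up tags came through absent or blank.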

With help from online Python resources like this Python cheat-sheet and some judicious online searching, I came up with this script. As far as I can tell, the only outside dependency is Beautiful Soup, which I use to read the HTML output. Maybe I could've used Beautiful Soup for both the XML and HTML, but this still works just fine.


import urllib2
from xml.dom import minidom
from xml.parsers.expat import ExpatError
import fileinput
from BeautifulSoup import BeautifulSoup

host = ''
host2 = ''
users = ["foo", "bar", "baz", "alpha", "omega"]
passwords = ['xxxxx', 'xxxxxx', 'xxxxxx', 'xxxxx', 'xxxxx']

for username, password in zip(users, passwords):
    print "Doing check for " + username

    # One password manager holds this user's credentials for both servers;
    # requests must go through the opener for the credentials to be sent
    password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
    password_mgr.add_password(None, host, username, password)
    password_mgr.add_password(None, host2, username, password)
    handler = urllib2.HTTPBasicAuthHandler(password_mgr)
    opener = urllib2.build_opener(handler)

    for line in fileinput.input(username + "_url.txt"):
        # Each line is one URL; drop the trailing newline and skip blanks
        url = line.strip()
        if not url:
            continue

        try:
            body = opener.open(url).read()
        except urllib2.URLError:
            print "Could not load " + url
            continue

        try:
            # Well-formed XML parses cleanly and needs no further checking
            minidom.parseString(body)
        except ExpatError:
            # Not XML, so it should be transformed HTML containing our marker
            soup = BeautifulSoup(body)
            if not soup.find(text="XML Team Solutions"):
                print "Bad XML+HTML:" + url

The URLs that I wished to test were stored in text files, one file per user, that this script then read in. Using a little grep + awk magic I was able to extract the URLs to test from our web server access log. I'd appreciate any tips from Pythonistas out there on how to make this code conform more to the "Python way", if there is such a thing.
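
For anyone who would rather stay in Python for the log step too, here is a rough sketch of the same idea. It is not the grep + awk pipeline I actually ran: it assumes the access log is in the usual Common Log Format with the request in double quotes, it only picks up GET requests, and urls_from_log and base_url are names invented for the example.

```python
import re

# Matches the quoted request in a Common Log Format line,
# for example "GET /some/path HTTP/1.1", and captures the path
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+"')


def urls_from_log(lines, base_url):
    """Return unique request URLs from access log lines, in first-seen order."""
    seen = set()
    urls = []
    for line in lines:
        match = REQUEST_RE.search(line)
        if not match:
            continue
        url = base_url + match.group(1)
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls
```

Passing an open log file as lines works, since iterating over a file yields one line at a time. The resulting list can be written out one URL per line to make the text files the script reads.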