I occasionally check up on the web developer jobs on craigslist.org, and this is the second time this month I’ve noticed this particular scam. When I realize what these asshats are trying to pull, it enrages me. I figured I’d write about it here to warn others. In case craigslist pulls the ad, here it is:
We need a person who is an expert with PHP and CGI/PERL. Pay will be $80,000 to $125,000 per year plus full medical and more. You will also get a laptop. Our recruiting for this position is a little bit unique. Below you will find a project to complete. Your performace on this project will determine if you become a permanent employee with us.
1. There Are Awards For The First 10 To Complete The Project
1: $1000
2-5: $500
5-7: $100
7-10: $50
10+ no award but still have the opportunity for employment.
YOU MUST COMPLETE THE BELOW PROJECT BEFORE SUNDAY JANUARY 21 2007.
What we need is a form with the following fields:
First Name
Last Name
Email Address
Username
Country (drop down list if possible)
Referred By
Do you agree to the terms and conditions?
Details:
-All Fields EXCEPT the Referred by field should be required.
-THE FORM MUST BE LIMITED TO 1 ENTRY PER PERSON. EITHER VIA IP ADDRESS, COOKIES, OR ANY OTHER WAY YOU CAN THINK OF.
- The form must end at a success page.
-When the form is submitted the information from the form needs to be sent to: promotions@freerollsource.net (use sendmail, smtp, mailto etc.)
-You must host this form on a free host of you choice (Must be a completely new account since we will need the account info to review your work).
Here is a list of some free hosts you can use that support cgi/perl and php (with MySQL Database). Go here for a list of some free hosts: http://www.free-webhosts.com/webhosting-01.php
We actually have a CGI script that does all this. It is acceptable to use it if you install it correctly. Email me to get the CGI script.
Once you have completed the project, email me with the URL to the form, as well as the information for the free host you used. I will request some additional information about your background as well. A resume would be nice thing to have ready.
GOOD LUCK! Feel free to email me (daniel@traudts.com) if you have any trouble.
Now I’ve read a lot of job postings in my day, and this one sounds a tad fishy, even for craigslist. On top of that, the guy’s email address leads to a half-assed family website. Hmm… something isn’t kosher. Thirty seconds after googling his email address, I discovered this forum posting, dating back only a month:
Hello,
I know my way around php and html; but there is one thing I cannot figure out to do.
Right now, I have a standard form that is send to my email address once completed.
What I need to be able to do is limit sign-up to one per IP address. I was hoping someone knew how to do this. I am willing to pay for information that helps me complete the task. Thank you.
Feel free to email me: daniel@traudts.com (NO SPAM!)
I think it’s pretty self-explanatory what this guy is trying to pull with his pie-in-the-sky, bogus job listing: he wants somebody to solve his one-signup-per-IP problem for free. Unfortunately, this is the second time this month that I’ve seen this particular scam, and this is in the smaller market of metro Detroit/Ann Arbor. The prior listing was a bit more convincing and probably lured in more developers than one that talked about a job interview with cash prizes attached. It’s something to be aware of out there…
January 17th, 2007
Today was the first time I’ve used my newfound Python skills in a professional capacity. As part of a larger project, the client wanted me to check each HTML document for image tags missing their “alt” attributes. Knowing how tedious and imprecise it would be to do this manually, I decided that writing a script was the only logical choice. It would also be handy to be able to run it both locally on their Windows-based workstations, before the files were uploaded, and on the Linux-based Apache web server. Since I had already worked with the HTMLParser library, I figured this would be a perfect opportunity for Python.
Here is the entire source. Notice that I don’t have to contend with any regular expression torture, nor do I have to implement my own function to recursively search the directory structure. Python can do both these things automatically.
import htmllib
import formatter
import os
from os.path import join
import sys

# Path of the file currently being parsed; start_img reads this module-level name.
filename = ''

class ImgParser(htmllib.HTMLParser):
    def __init__(self, formatter):
        htmllib.HTMLParser.__init__(self, formatter)
        self.imgs = []

    # Called by the parser for each <img> start tag.
    def start_img(self, attrs):
        tag = dict(attrs)
        if 'alt' not in tag:
            # .get() avoids a KeyError on an <img> that also lacks src
            self.imgs.append((filename, tag.get('src', '(no src)')))

    def get_imgs(self):
        return self.imgs

format = formatter.NullFormatter()
htmlparser = ImgParser(format)
search_root = sys.argv[1]

# Walk the tree and feed every .html file to the parser.
for root, dirs, files in os.walk(search_root):
    for name in files:
        if name.endswith('.html'):
            filename = join(root, name)
            f = open(filename, 'r')
            htmlparser.feed(f.read())
            f.close()
htmlparser.close()

# Report the offending images, grouped by file.
imgs = htmlparser.get_imgs()
current_file = ''
print "\n\nThe following files contain images with missing alt tags:\n"
for filename, src in imgs:
    if current_file != filename:
        current_file = filename
        print filename
    print " " + src
I’m normally pretty careful to comment code that other people are likely to read, but this is so simple that I didn’t bother. The “os.walk” function automatically handles the recursive directory walking, and if a filename ends in “.html”, the file is opened and run through an instance of the “ImgParser” class. ImgParser is defined as a new class that inherits from the HTMLParser class in the “htmllib” module (library). The “start_img” method overrides the handler that is called when the parser encounters the start of an “img” tag. If an img tag is encountered without an alt attribute, then the filename and the img’s src attribute are added to the list. Once all of the files are processed, the script loops through the list and outputs the incomplete img tags and the files in which they are contained.
It’s neither sexy nor full of features, but it only took me (a Python noob) 10 minutes to write. The path that is searched is read from sys.argv[1], the first argument after the script name, so the script has to be executed as python scan_img_tags.py /var/html or python scan_img_tags.py c:/myhtmlstuff. The script did exactly what I needed it to do, so I’m not likely to polish it up. But if I were, I would probably add line number reporting, a little error trapping, and the ability to specify additional file extensions from the command line.
Hopefully someone else can find some use for this snippet.
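For anyone reading this on a current interpreter: htmllib was removed in Python 3, so the script above only runs on Python 2. What follows is a rough sketch, not the script I actually used, of how the same idea might look with Python 3’s html.parser, including the line number reporting, basic error trapping and configurable extensions mentioned above. The names AltChecker, scan_html and scan_tree are made up for this example.

```python
import os
import sys
from html.parser import HTMLParser


class AltChecker(HTMLParser):
    """Collects (line, src) for every <img> tag that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            line, _column = self.getpos()  # position of the tag in the input
            self.missing.append((line, attributes.get("src", "(no src)")))


def scan_html(text):
    """Return a list of (line, src) tuples for one HTML document."""
    checker = AltChecker()
    checker.feed(text)
    checker.close()
    return checker.missing


def scan_tree(search_root, extensions=(".html",)):
    """Walk search_root and return {path: [(line, src), ...]} for offenders."""
    results = {}
    for root, _dirs, files in os.walk(search_root):
        for name in files:
            if not name.lower().endswith(tuple(extensions)):
                continue
            path = os.path.join(root, name)
            try:
                with open(path, encoding="utf-8", errors="replace") as f:
                    missing = scan_html(f.read())
            except OSError as exc:
                # Unreadable file: report it and keep scanning.
                print("skipping %s: %s" % (path, exc), file=sys.stderr)
                continue
            if missing:
                results[path] = missing
    return results


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_img_tags.py /var/html .html .htm
    found = scan_tree(sys.argv[1], tuple(sys.argv[2:]) or (".html",))
    for path, missing in found.items():
        print(path)
        for line, src in missing:
            print("  line %d: %s" % (line, src))
```

Using a fresh parser per file keeps the line numbers relative to each document, which a single shared parser would not do.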
December 27th, 2006
Last week, I wrote about my discontent with PHP and why I chose to give Python a shot at becoming my new tool of choice. Since then, Python and I have spent a lot of time together getting acquainted. I’m finally becoming accustomed to the lack of semicolons and curly braces and have accepted Python’s whitespace neurosis. It hasn’t completely dethroned PHP yet because I am still evaluating which web framework I want to invest the time into learning first (Django, TurboGears, Pylons), after which I have to pick a templating system (Kid, Myghty, Cheetah, Genshi, Clearsilver, etc.). But the more I use Python, the bigger a fan I become. After only one week, I’m fluent enough to be writing useful stuff and I’m having more fun doing it. This is how I got started…
Snakes on my brain
First, I highly recommend picking up two books: Learning Python (ISBN: 0596002815) and Beginning Python (ISBN: 159059519X). Learning Python reads like a dry tome that would be right at home in an introductory CS class. It is very heavy on the theory (you don’t actually get to start doing anything until about page 60) and, being published almost five years ago, is somewhat outdated. However, it explains crucial details about how Python works under the hood. For example, unlike any language I’ve ever used, variables are simply references to objects. This might sound insignificant, but consider the following code and its output in a Python interactive prompt:
>>> languages = ["Pascal", "Python", "Perl", "PHP"]
>>> thepeas = languages
>>> print thepeas
['Pascal', 'Python', 'Perl', 'PHP']
>>> print languages.pop()
PHP
>>> print thepeas
['Pascal', 'Python', 'Perl']
>>> print thepeas.pop()
Perl
>>> print languages
['Pascal', 'Python']
As you can see, languages and thepeas are two references to the same object, so changing one affects the other. This would have been a very frustrating concept to initially encounter if I hadn’t read Learning Python first.
On the other hand, Beginning Python was published only in 2005 and seems to be better aimed at the self-instruction crowd. It moves along quite a bit faster and includes a greater number of examples. Plus, I find the author’s style of writing more approachable. However, pace and practicality come at the cost of glossing over some rather important theory. I’ve been reading both books concurrently and find that this approach allows me to soak up more information and helps keep my attention. If you have to choose one book though, get Beginning Python.
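If what you actually want is an independent list, you have to ask for a copy explicitly. Here is a small sketch of the difference (written for Python 3; the names alias and snapshot are just for illustration):

```python
languages = ["Pascal", "Python", "Perl", "PHP"]

alias = languages           # a second name for the same list object
snapshot = list(languages)  # a shallow copy: a brand new list object

languages.pop()             # removes "PHP" from the shared list

print(alias is languages)     # the alias is the very same object
print(alias)                  # so it sees the pop
print(snapshot is languages)  # the copy is a different object
print(snapshot)               # and still has all four entries
```

Note that list() only copies the outer list; any mutable objects inside it would still be shared between the two.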
Tale of two committees
Finally getting into the mix (writing code, reading the online docs and my books, and browsing tutorials), I am immediately struck by a key difference in philosophy: PHP was created solely with web development in mind, while Python is a Swiss Army Knife. When using PHP, I just fire up my text editor, add a “<?php” and get cranking. Testing normally consists of FTPing or SCPing the .php file to my server and hitting “Reload” in my browser. Form data is easily accessible with $_POST['stuff'] and MySQL extensions are most likely installed too. PHP is preconfigured on almost every Apache web server and everything Just Works™, which is why I (and everyone else) learned PHP in the first place.
Python, however, doesn’t know what you want to do with it out of the box. It’s a very modular tool that feels right at home either crunching scientific data or processing HTML forms. Fitting with Python’s “batteries included” mantra, you can do most of these things without having to download any additional libraries (and if you can’t find a built-in library that does what you want, just head over to the Cheese Shop). However, because it would be hugely wasteful to automatically load every library into memory, Python does require this be done manually with the import statement: “import re” will get you regular expressions, “from reportlab import pdfgen” will allow you to generate PDFs (that one is a third-party package, ReportLab, not part of the standard library), “import this” will give you geek poetry (yes, really), etc. Using Python for web development will most likely entail installing and configuring mod_python on Apache, or using the CGI handler. Database interaction modules also have to be manually imported. Doing all of this isn’t difficult, or even time-consuming, but it does require adapting to the Python way of thinking: “Batteries included… Some assembly required.”
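To make the “some assembly required” point concrete, here is a rough Python 3 sketch of getting at form data, the job that $_POST does for free in PHP. It only decodes a URL-encoded body by hand; in a real application the web framework or gateway would hand you that body, and the function names parse_form and looks_like_email are invented for this example.

```python
import re                          # regular expressions
from urllib.parse import parse_qs  # URL-encoded form decoding


def parse_form(body):
    """Decode a URL-encoded form body into a {field: first value} dict."""
    fields = parse_qs(body, keep_blank_values=True)
    return {name: values[0] for name, values in fields.items()}


def looks_like_email(value):
    """Very loose sanity check, not a full address validator."""
    return re.match(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", value) is not None


form = parse_form("stuff=hello+world&email=me%40example.com")
print(form["stuff"])
print(looks_like_email(form["email"]))
```

Nothing here had to be downloaded, but both modules had to be imported by name before they could be used.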
Flexibility, thy name is Python!
If you have the patience to relearn a few things, Python can be very rewarding. I learn best by building stuff that I need, so I decided that my first project would be an offline web-browser / site cacher. It’s not the simplest thing to start with, but it would help me evaluate Python in a real-world application. The program would connect to a site, download the HTML document and its associated images and stylesheets, convert absolute URLs to relative URLs, and save everything to disk with the same directory structure. Having never used Python before, I needed two days to complete this project from start to finish. “Bah!” I can hear the readers now, “I could have done the same thing in PHP in a quarter of the time and I wouldn’t have had to learn anything new!” While that may be true, four hours after finishing the initial version, I had also converted the crawler into a multithreaded application that improved its performance by a factor of ten. Try that with PHP.
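I’m not reproducing the crawler here, but the multithreading idea can be sketched in a few lines. This is an illustration only, written for Python 3 with concurrent.futures (a module that did not exist when the crawler was written), and the names fetch and fetch_all are hypothetical. Downloads spend most of their time waiting on the network, which is why running them in threads helps.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one URL and return its raw bytes."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_all(urls, fetcher=fetch, max_workers=10):
    """Fetch every URL concurrently; returns {url: bytes or Exception}."""
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything first so the downloads overlap.
        futures = {url: pool.submit(fetcher, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception as exc:
                # One failed download should not stop the rest.
                results[url] = exc
    return results
```

Passing the download function in as fetcher makes it easy to try the threading logic with a stand-in function before pointing it at a real site.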
December 23rd, 2006