Prayer through SMS
21 Oct
I'm rarely up and about at 9am on a Saturday. And rarer still hopping through channels on TV. So I was rather shocked to see the "Prayer Through SMS" service being offered. It seems the televangelist "industry" is growing in South Africa. Which usually really doesn't bother me, since the TV is easy to turn off.
It's quite amusing how the "House of Prayer" segment was all about praying "right". "Pray right", it was said - "Pray prayers that God wants to hear", and that we "need to know how to pray correctly". That you shouldn't pray for your business to do well, but rather to establish the kingdom of God in yourself, and that the rest will follow.
But before and after that, there's a wonderful in-show advert that tries to convince people to SMS in their prayers, saying that God said that "If two or more..." believers agree and pray for something, then God will make it so (paraphrased). Isn't it useful that they offer a service such that if you SMS them your prayer, they will "agree" with it, and then "pray for the next 30 days"? For only R2!
I'm just waiting for the "And if you call now, you will also get..." worth R100 value!
My current employers, Jam Warehouse, are looking for more developers to work on KnowledgeTree, the open source document management system that Brad, Bryn (Mr. No Web Page), and I have built over the past year or so (and with me alone for another year).
There are two positions - a more senior developer who would take over much of what I've been doing previously, and a less senior developer who would be the lackey and make coffee and tea for the rest of the team. Both are preferably based in Cape Town, but for the right person, there may be options there.
The more senior developer needs to have about 4 years of experience in a modern programming language in a commercial environment - having worked on teams, preferably having some leadership experience, and so forth. The less senior developer needs to have about 2 years in a commercial environment. (Note - commercial, not necessarily proprietary.)
Both would benefit from having used PHP most of that time, from having a great understanding and passion for open source, a technical degree, having speaking/presentation experience, experience in other languages (Python, C#, Java) and so forth.
You can read the more formal job spec on Daniel's web log.
Send me an email (it's on the right-hand-side of my web page) if you think you're suitable.
A common reason to love Python is that it fits in your brain. That you don't need to consult documentation every time you've spent time away from it and want to do basic tasks like reading a file, or doing some text manipulation. And that's certainly one of my reasons. But every time I have to deal with unicode strings, my poor little brain needs help, and I can never find documentation that walks me through the basics simply.
I run all comments through the Genshi HTMLSanitizer, which removes all but a white-listed set of tags and classes from the HTML. The Genshi HTML() function takes HTML text and creates a stream of events (much like, say, SAX), and the HTMLSanitizer filter takes the stream of events and outputs a new stream - using a generator filter to avoid doing work that isn't necessary and to use less memory.
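The generator-filter idea can be sketched without Genshi itself. The snippet below is a stand-in, not Genshi's API: the event tuples, the ALLOWED_TAGS set and the sanitize function are all invented for illustration, and dropping a disallowed element together with its contents is an assumption made for the sketch. It only shows how a filter can consume one stream of events lazily and yield another.

```python
# Illustrative only: a toy event stream, not Genshi's real event format.
ALLOWED_TAGS = {"p", "em", "strong", "a"}  # hypothetical white-list


def sanitize(events):
    """Lazily yield events, skipping disallowed elements and their contents."""
    skipping = 0  # nesting depth inside a disallowed element
    for kind, data in events:
        if kind == "START":
            if skipping or data not in ALLOWED_TAGS:
                skipping += 1
                continue
        elif kind == "END":
            if skipping:
                skipping -= 1
                continue
        elif skipping:
            continue  # text inside a disallowed element
        yield kind, data


events = [
    ("START", "p"),
    ("TEXT", "Hello "),
    ("START", "script"),
    ("TEXT", "alert(1)"),
    ("END", "script"),
    ("END", "p"),
]

# Nothing is processed until the generator is consumed.
clean = list(sanitize(iter(events)))
```

Because sanitize is a generator, a later stage that stops reading early never pays for filtering the rest of the stream.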
Great, but the output of the stream's render method is UTF-8 encoded text. In a standard string - not a unicode one like all the unicode strings I got from TurboGears, and which I tested with before adding in the filter.
No problem, until I want to put together a message to send via mail to myself notifying me of the comment. Using standard string formatting, the non-unicode string is massaged to unicode from its standard encoding without any problems, so long as everything is within iso-8859-1/latin1. Which, given how Microsoft and Apple love to use smart quotes and so forth, doesn't take long to break.
Ok, no problem then - I'll just convert it to unicode. I remembered there was this unicode function/class (never can be quite sure, can one?). Which gives me the usual:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: ordinal not in range(128)
ascii, eh? Oh, right, it is encoded in utf-8, so unicode(comment, 'utf-8') it is then. But then, I should have noticed that word "encoded" there. Because comment.decode('utf-8') would have been easier. And would have saved me time later trying to figure out which way is decoding and which encoding...
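A small sketch of the failure and the fix. It is written for Python 3, where bytes plays the role of the "standard string" above and str the role of unicode; the sample comment text is invented.

```python
# Python 3 sketch: bytes ~ the old "standard string", str ~ the old unicode.
comment = "Joy \u00e9".encode("utf-8")  # UTF-8 bytes, as a renderer might return them

try:
    comment.decode("ascii")  # roughly what an implicit coercion attempts
except UnicodeDecodeError as exc:
    error = str(exc)  # complains about byte 0xc3, the first non-ASCII byte

text = comment.decode("utf-8")  # name the encoding the bytes are really in
```

The rule of thumb this illustrates: bytes that are "encoded in" something get decoded, and text gets encoded.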
So, no problem, right?
No, the akismet Python API just passes things on to urllib2 to do some quoting, and that doesn't do anything in particular with unicode strings. I just never noticed, because everything else was coerced to ascii fine. So, now I need to take my unicode strings and convert them to utf-8 encoded standard strings before passing them to akismet.
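The conversion before the HTTP call might look like the sketch below. It uses the Python 3 standard library rather than the akismet module or urllib2, and the field names and values are placeholders chosen for illustration.

```python
from urllib.parse import urlencode

# Hypothetical comment fields, held as text (unicode) inside the application.
fields = {
    "comment_author": "Zo\u00eb",
    "comment_content": "caf\u00e9",
}

# Encode every value to UTF-8 bytes before it is quoted for the request body.
encoded = {key: value.encode("utf-8") for key, value in fields.items()}
body = urlencode(encoded)
```

Each non-ASCII character ends up as percent-escaped UTF-8 bytes, so the quoting step never has to guess an encoding.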
Phew, that all worked. The message is properly assembled as a unicode string to be emailed to me. Until, say, I get to actually sending the email. TurboMail just uses the Python email module, which also doesn't do anything in particular with unicode strings. Internally, it creates a StringIO and writes to it, which coerces things back to ascii, which fails quickly.
Joy. So, convert the put-together message to a utf-8 encoded standard string. Grr.
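For the mail step, a sketch using the standard library email package directly (Python 3, not TurboMail's own API). The message text and the address are invented; the point is only that stating the charset up front lets the email module do the encoding.

```python
from email.header import Header
from email.mime.text import MIMEText

body = "New comment from Zo\u00eb: \u201cnice post\u201d"  # invented sample text

# Declaring the charset lets the email module encode the text itself,
# instead of tripping over an implicit ASCII conversion.
msg = MIMEText(body, "plain", "utf-8")
msg["Subject"] = Header("Comment from Zo\u00eb", "utf-8")
msg["To"] = "me@example.com"  # placeholder address

raw = msg.as_string()  # ASCII-safe wire format, ready to hand to an SMTP client
```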
For a brief second I think about how things work in PHP. No separate unicode strings, no converting encodings - everything is a byte stream, and in my application, a utf-8 encoded byte stream. Which never gets converted between encodings or changed from the browser to the database.
And then I remember the countless problems I faced to get it that far, and how much of the reason it's like that is because there's no decent character set support in PHP, and certain things are just maddening.
So, all I seem to need to remember is that if I have a utf-8 encoded standard string, I just need to run utf8encodedstring.decode('utf-8') to get a unicode string that I can use to do string manipulation properly. And when I need to pass a utf-8 encoded standard string to something, I can just use unicodestring.encode('utf-8'). That I suspect I can handle remembering, but I imagine I'll always have to look up how the stream readers and writers work.
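The whole round trip, as a minimal Python 3 sketch (bytes standing in for the utf-8 encoded standard string, str for the unicode string; the sample name is made up):

```python
name_bytes = b"Zo\xc3\xab"           # UTF-8 bytes arriving from outside

name = name_bytes.decode("utf-8")    # bytes -> text at the input boundary
shout = name.upper()                 # manipulate characters, not bytes
out = shout.encode("utf-8")          # text -> bytes at the output boundary

# The byte string is one longer than the text: the last character takes two bytes.
byte_len, char_len = len(name_bytes), len(name)
```

The length mismatch is why slicing or upper-casing the encoded form goes wrong, and why the manipulation belongs between the decode and the encode.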
Now if only I could convince TinyMCE not to try to show itself on browsers like Safari that it doesn't support properly, and I imagine I may actually be able to accept something approaching most comments on my web log.
This past weekend I've been laid low by a pulled muscle - one of those lateral abdominal muscles that stabilises your entire upper body. Which means: moving is pain. (And which means that I only have three episodes left of Babylon 5 season 2, which arrived on Friday. The rest of the seasons are on the way.)
The comedic angle on this is that it was a table tennis injury. I wish it were something more impressive - mountain climbing is quite a popular option at work, for example. But, though table tennis may lack mad dashes and sheer physical endurance, it makes up for it in terms of reaction time. And quick, strange movements of your various limbs do leave one open to such things.
Before this set-back, however, I had been slowly starting an exercise routine. Or, less impressively, walking the dogs around Newlands Forest with Jeremy. And I've made my diet even more specialised. No longer just pesco-vegetarian, but low-GI pesco-vegetarian now. (The caterers at functions I go to hate me, I'm certain of it.)
The difference is quite impressive, especially since it coincided with me taking things at work a bit closer to sustainable levels after almost a year of full-speed work. What's even weirder is that I'm starting to get a bit addicted, and am looking for more things to do. From wanting to chat to Fooz about getting a bike to looking for a squash partner (of suitably unfit persuasion).
Since working on private projects stole away the bits of my weekend not spent curled up in bed trying not to move (or on the couch, trying not to laugh at Ivanova), I decided that today I'd catch up on my research on the state of the job market for open source developers and systems administrators in South Africa. In short, it too is showing signs of health over the last two months - I haven't seen the sort of openings for the not-quite-junior developers and sysadmins around in the last few years of watching, and at quite decent salaries. Python hasn't experienced as much of a growth, but most of the positions are quite a bit more senior than the average. A lot more positions going in Cape Town than in Johannesburg of late, or maybe it's my selective memory.
One can get quite used to life with setuptools. While developing and deploying gibe, the install_requires setting in setup.py has come to be my friend, ensuring that everything I need is installed in the environment I'm working in. But when investigating anti-spam options after the twenty or so spam messages overnight, I suddenly realised that there is a scary world without eggs.
Two options showed up in the Cheese Shop - spambayes and akismet Python API.
I'd used spambayes before, adding it to my vellum install to reduce spam. So, I just popped it into install_requires and reran python setup.py develop to get it installed. But there was no egg package available for setuptools. It felt a lot like culture shock. Anyway, I didn't give up immediately, and found out I could store the spam information in the RDBMS. But then I decided to see what else there was.
I'd heard of akismet before - I saw it on the KnowledgeTree People blogs that use WordPress, but I went with moderation on those instead. But since akismet keeps itself automatically up to date and I just felt like trying something different, I figured it was worth a shot. Again, no egg file. Thankfully, it is just a single file, though, and that means I can just bundle it, and things will work on a from-scratch Python environment (since I've been using virtual-python extensively lately).
Anyway, akismet has so far prevented 10 or so spam messages in the past few hours. I know, because I've also integrated TurboMail, which now notifies me on all comments, whether they pass the spam test or not. Besides a typo that prevents message delivery (who cares about that aspect of mail?), deployment with setuptools was flawless - just a simple easy_install of my updated gibe package.
Of course, now I need to figure out how I can send patches to the akismet and spambayes people to get them egged-up.
Introducing gibe
07 Oct
On and off over the last two weeks I've been developing gibe, to replace vellum as my web log software. Gibe is written in Python (of course), and uses the TurboGears web-based development mega-framework. Well, with an alternate set of tools - Routes for dispatching, Genshi for templating, and SQLAlchemy for database connectivity and ORM, to facilitate my learning of these tools.
How has it been going?
Well, Genshi really helps to ensure valid HTML everywhere. Vellum's templating system, unfortunately, was one of those build-it-with-strings affairs with occasional embedded Python code. Genshi's XML-based templating is spot on for almost all uses - separating a list with some character is not one of those, although I found a nice solution for that. It does silently swallow certain types of errors, which is quite confusing, and also quite surprising in a Python module. But the HTML sanitiser is really great, and I can see myself writing a few filters for it, and maybe writing some code to make applying filters to particular streams in a larger template easier (to make a comma-separated list relatively trivial).
I've become a total Routes convert, especially as I have been contemplating the plugin architecture I want to add. Currently, I have a couple of routes added to provide backwards-compatibility for Vellum URLs, and these could trivially be done by passing the routes mapper to plugins so that they can add their own paths. Which means that adding new admin pages, new user pages, or entire content management systems wouldn't require any changes to the core code.
SQLAlchemy is taking a while to get used to. I like the declarative ActiveMapper style, but it too silently swallowed some errors that cause relationships between tables/objects to be lost. But, I'm warming to it.
TurboGears, despite all these replacements, continues to function and be useful - the automatic application of templates, the automatic validation of forms, and automatic error handling is a potent combination. That it doesn't tie you into a particular templating engine or modeling system is comforting, but the opinionated defaults are welcome too. And the TurboGears widget system continues to impress me.
Still much for me to do - automatic excerpt generation, theme support, plugin architecture, anti-spam support, and so forth. And tagging, so that I don't have to edit the database to show entries to those subscribed to particular topic feeds. But it's probably the most enjoyable programming I've done in years - simple specification, tools of my own choosing, and no deadlines makes a great change...