The key is that unicode in Python is an object: unicode(str, "utf-8") builds that object from a UTF-8 encoded str, and calling .encode("utf-8") on a unicode object turns it back into a str in the UTF-8 encoding.
To write unicode-aware Python code, I'll need to:
- when getting data, use unicode(str, "the_encoding") to get a unicode object
- use unicode objects inside my program, so all internal string literals should look like u"some_thing"
- when writing output, convert the unicode object to whatever encoding fits, that is, call .encode("the_encoding") on it
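The three steps above can be sketched roughly like this. It is only a minimal example, not a definitive recipe: the function names decode_input and encode_output and the sample string are made up for illustration, and it uses raw.decode(encoding), which in Python 2 does the same job as unicode(raw, encoding) and also happens to run on Python 3.

```python
# -*- coding: utf-8 -*-
# Minimal sketch; decode_input / encode_output are hypothetical helper names.

def decode_input(raw, encoding="utf-8"):
    """Turn bytes coming from outside the program into a unicode object."""
    # In Python 2 this is equivalent to unicode(raw, encoding).
    return raw.decode(encoding)

def encode_output(text, encoding="utf-8"):
    """Turn a unicode object back into bytes for output."""
    return text.encode(encoding)

raw = b"caf\xc3\xa9"             # 5 bytes of UTF-8, e.g. read from a file or socket
text = decode_input(raw)         # unicode object with 4 characters
greeting = u"hello, " + text     # internal strings are unicode literals
out = encode_output(greeting)    # bytes again, ready to be written out
```

The point of doing it this way is that encodings only show up at the edges of the program; everything in the middle works on unicode objects and never has to care which encoding the data arrived in.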
Good references:
- http://groups.inetbot.com/showgrp/cn_pbbs_pcomp_plang_ppython_s574.html
- http://www.amk.ca/python/howto/unicode
- http://www.onlamp.com/pub/a/python/excerpt/pythonckbk_chap1/