compare strings disregarding whitespace without RE

There is an interesting topic on the python.list about comparing two strings disregarding whitespace without re. As the discussion went on, two different requirements emerged:

normalize whitespace: that is, "a\n b c" == "a b \n c" but "ab c" <> "a bc"
totally ignore whitespace: both "a\n b […]
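Both requirements can be sketched with nothing but str.split() and str.join(). This is only a minimal illustration, not the solution from the thread; the function names are made up, and "totally ignore" is assumed to mean dropping every whitespace character.

```python
def normalize_ws(s):
    # split() with no arguments splits on any run of whitespace
    # and discards leading/trailing whitespace
    return " ".join(s.split())

def strip_ws(s):
    # joining with an empty string removes whitespace altogether
    return "".join(s.split())

def equal_normalized(a, b):
    """Equal if the strings differ only in the amount/kind of whitespace."""
    return normalize_ws(a) == normalize_ws(b)

def equal_ignoring_ws(a, b):
    """Equal if the strings match once all whitespace is removed."""
    return strip_ws(a) == strip_ws(b)
```

With these, "ab c" and "a bc" differ under the first requirement but compare equal under the second.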

playsh: MUD like programming environment for pythoner

There is an article on Wired News about a new game, playsh. It is like a MUD, but you can do Python programming in it. It looks like a good way to learn Python programming.
I didn't play much with MUDs back in the 1990s, but this looks interesting. However, there are two requirements for the […]

unicode(str, "utf-8") and str.encode("utf-8")

The key is that unicode in Python is an object: unicode(str, "utf-8") makes that object from a UTF-8 str, and str.encode("utf-8") encodes a string to the UTF-8 encoding.
To write unicode-aware Python code, I'll need to:

when getting data, use unicode(str, "the_encoding") to get a unicode object
use unicode objects inside my program, like all internal strings should be […]
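The decode-on-input, encode-on-output idea might look roughly like this. It is a sketch only: to_unicode and to_bytes are hypothetical helper names, and raw.decode("utf-8") is used because it gives the same result as unicode(raw, "utf-8") in Python 2 while also running on newer Pythons.

```python
def to_unicode(raw, encoding="utf-8"):
    """Decode incoming bytes once, at the edge of the program."""
    return raw.decode(encoding)

def to_bytes(text, encoding="utf-8"):
    """Encode only when the data leaves the program again."""
    return text.encode(encoding)

raw = b"caf\xc3\xa9"       # UTF-8 bytes as read from a file or socket
text = to_unicode(raw)     # unicode object: 4 characters, not 5 bytes
shouted = text.upper()     # string operations now work per character
out = to_bytes(shouted)    # back to UTF-8 bytes for output
```

The point of the split is that len(), slicing and upper() see characters rather than bytes while the data is inside the program.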

reading utf-8 file in Python

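import codecs
# fileName is the path to a UTF-8 encoded file
fp = codecs.open(fileName, "r", "utf-8")
text = fp.read()  # read() returns a unicode object, already decoded
fp.close()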
* http://evanjones.ca/python-utf8.html
* http://www.jorendorff.com/articles/unicode/python.html


bug tracking system selection

Well, I think it's time to set up a bug tracking system. There are 4 candidates: bugzilla, gnats, roundup, and trac.
Gnats was the most promising one before I started: it has been in use for a long time and is stable, it's written in C, it uses a plain file-system-based database, and it is said […]

feedparser.text content type

I need to change this line from
true_encoding = http_encoding or 'us-ascii'
to
true_encoding = http_encoding or xml_encoding or 'us-ascii'
for those buggy sites that don't obey the standard: they set the content type to text/* but don't offer a charset, and set their encoding in the XML file instead.
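Pulled out of feedparser, the fallback order is just a chain of "or"s. The function name pick_encoding below is made up for illustration and is not part of feedparser; it only shows how an empty or missing HTTP charset falls through to the XML declaration.

```python
def pick_encoding(http_encoding, xml_encoding):
    # None and '' are both false, so a missing HTTP charset
    # falls through to the encoding declared in the XML file,
    # and only then to the us-ascii default
    return http_encoding or xml_encoding or 'us-ascii'
```

So a feed served as text/xml with no charset but declaring encoding="gb2312" in its XML would be read as gb2312 instead of us-ascii.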


feedparser.whitespace

According to the XML spec http://www.w3.org/TR/REC-xml/#NT-EncodingDecl, whitespace is allowed around the equals sign of the encoding declaration. Here is a simple patch:
--- /usr/ports/textproc/py-feedparser/work/feedparser/feedparser.py.old Sat Jul  2 16:17:11 2005
+++ /usr/ports/textproc/py-feedparser/work/feedparser/feedparser.py     Sat Jul  2 16:18:25 2005
@@ -2101,7 +2101,7 @@
         else:
             # ASCII-compatible
             pass
-        xml_encoding_match = re.compile('^<\?.*encoding=[\'"](.*?)[\'"].*\?>').match(xml_data)
+        xml_encoding_match = re.compile('^<\?.*encoding\s*=\s*[\'"](.*?)[\'"].*\?>').match(xml_data)
     except:
         xml_encoding_match = None
     if xml_encoding_match:
I've sent this patch to the […]
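To try the pattern outside feedparser, a stand-alone sketch like the following may help. The name sniff_xml_encoding is hypothetical. Note that it uses \s* rather than a bare \s: a bare \s requires exactly one whitespace character, which would stop matching the common encoding="utf-8" form.

```python
import re

# \s* on both sides of '=' accepts zero or more whitespace characters
XML_ENCODING_RE = re.compile(r'''^<\?.*encoding\s*=\s*['"](.*?)['"].*\?>''')

def sniff_xml_encoding(xml_data):
    """Return the encoding named in the XML declaration, or None."""
    match = XML_ENCODING_RE.match(xml_data)
    if match:
        return match.group(1)
    return None
```

Both '<?xml version="1.0" encoding="utf-8"?>' and the same declaration written with spaces around the equals sign yield "utf-8", while a declaration without an encoding yields None.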

feedparser.encoding

It looks like feedparser was written for Python 2.3. As of Python 2.4, CJKCodecs is included in the official release, so the line
import cjkcodecs.aliases
should be changed to
import encodings.aliases
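If the same code has to run on both versions, one option is to try the separate package first and fall back to the standard library. This is only a sketch of that idea, not how feedparser itself handles it; the variable cjk_source is made up to record which import succeeded.

```python
try:
    # Python 2.3 with the separately installed CJKCodecs package
    import cjkcodecs.aliases
    cjk_source = "cjkcodecs"
except ImportError:
    # Python 2.4 and later: the CJK codecs ship with the standard library
    import encodings.aliases
    cjk_source = "encodings"

# Either way, a CJK codec such as big5 should now be usable
encoded = u"\u4e2d".encode("big5")
```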


quixote session

http://darcs.idyll.org/~t/projects/quixote2-tutorial/advanced/sessions.html
http://quixote.ca/qx/StoringSessionsInDatabase
http://ksenia.nl/code/quixote_sql_sessions_06.tgz


Challenge 5

Well this is not an easy one.
The unpickling is as easy as it should be:
import pickle
f = open("banner.p", "rb")
x = pickle.load(f)
But after that I was lost. x is a list of lists with 23 items; each item is made up of one or more tuples, and each tuple has two elements: one " " or "#", and one […]
