Help me with Milter Problems
This directory is a place for community help with problems I am stuck on.
MIME Parsing Problem from Oct 3, 2002
UPDATE: My solution to this problem has been released in
milter-0.5.1 and includes
unit tests for the situation.
I need some suggestions on handling malformed MIME with Python.
MIME and HTML parsing are important for
Python Milters. The Python milter has captured some messages
that give it indigestion, and I have provided everything you need
here to demonstrate the problem with a single module and Python-2.1.3.
The files you'll need to help me out are:
- buggymail: A captured malformed email message
- mime.py: The extended MIME parsing module
- Milter.py: Dummy one-line module so you won't need Python Milter to test this problem
- testmime.py: A script to feed email messages from files listed on the command line to mime.py
On properly formed messages, the mime.py module here works perfectly.
The buggymail file can't be fully parsed by pine either. However,
pine doesn't crash. (Python milter as a whole doesn't crash either
since it traps exceptions and simply punts the message.) Pine also doesn't
try to parse the malformed RFC822 attachment. If you turn off
scan_rfc822 in the testmime.py driver, mime.py doesn't crash either.
UPDATE: The problem is that the rfc822 attachment has a transfer encoding
of quoted-printable, so its body has to be decoded before it can be parsed
as a nested message.
I have added MimeMessage.get_payload_decoded(), which returns
a decoded body when needed (in milter-0.5.1, not in this directory).
However, MimeMessage.dump() probably fails to dump updates to attachments
within an rfc822 attachment.
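The decode-before-parse step can be sketched with the modern standard library. This is only an illustration, not the milter-0.5.1 implementation of get_payload_decoded(): the helper name decode_body and the sample message are hypothetical, and the sketch assumes Python 3 with the stdlib email, quopri, and base64 modules rather than the mime.py in this directory.

```python
import base64
import email
import quopri


def decode_body(raw, encoding):
    """Undo a Content-Transfer-Encoding (hypothetical helper).

    Encodings other than quoted-printable and base64 (7bit, 8bit,
    binary, or anything unknown) are passed through unchanged.
    """
    enc = (encoding or "7bit").strip().lower()
    if enc == "quoted-printable":
        return quopri.decodestring(raw)
    if enc == "base64":
        return base64.b64decode(raw)
    return raw


# Made-up body of a message/rfc822 part that was declared
# quoted-printable: "=3D" is an encoded "=" sign.
encoded_inner = (
    b"From: someone@example.com\n"
    b"Subject: inner message\n"
    b"Content-Type: text/plain\n"
    b"\n"
    b"1 + 1 =3D 2\n"
)

# Decode first, then parse the result as a nested message.
inner = email.message_from_bytes(decode_body(encoded_inner, "quoted-printable"))
print(inner["Subject"])
print(inner.get_payload())
```

Dispatching on the declared encoding keeps the parser itself unchanged: it only ever sees decoded bytes.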
With the mime.py module here, buggymail, and python-2.1.3, I get the
following traceback when scan_rfc822=1:
$ python testmime.py buggymail
Traceback (most recent call last):
  File "testmime.py", line 22, in ?
    rc = mime.check_attachments(msg,_chk_attach)
  File "mime.py", line 383, in check_attachments
    rc = check_attachments(i,check)
  File "mime.py", line 386, in check_attachments
    return check(msg)
  File "testmime.py", line 17, in _chk_attach
    _chk_attach)
  File "mime.py", line 382, in check_attachments
    for i in msg.get_payload():
AttributeError: mimepart instance has no attribute '__getitem__'
My attempts to modify mime.py to muddle through have only resulted
in different exceptions.
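The AttributeError in the traceback is raised by looping over a payload that is a single part rather than a list of parts. One defensive approach is to check the payload type before iterating. The sketch below illustrates that idea with Python 3's stdlib email package. It is not the check_attachments function from mime.py; visit_parts and the sample message are hypothetical.

```python
import email


def visit_parts(msg, check):
    """Call check() on every leaf part and collect the results.

    Only list payloads are iterated, so a part whose payload is a
    string (or any other single object) is treated as a leaf instead
    of raising an error.
    """
    payload = msg.get_payload()
    if isinstance(payload, list):
        results = []
        for part in payload:
            results.extend(visit_parts(part, check))
        return results
    return [check(msg)]


# Made-up two-part message for illustration.
raw = (
    b'Content-Type: multipart/mixed; boundary="xyz"\n'
    b"\n"
    b"--xyz\n"
    b"Content-Type: text/plain\n"
    b"\n"
    b"hello\n"
    b"--xyz\n"
    b"Content-Type: application/octet-stream\n"
    b'Content-Disposition: attachment; filename="a.exe"\n'
    b"\n"
    b"data\n"
    b"--xyz--\n"
)

msg = email.message_from_bytes(raw)
names = visit_parts(msg, lambda part: part.get_filename())
print(names)
```

Treating anything that is not a list as a leaf means a malformed part is still handed to the check callback rather than aborting the scan.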
My questions:
- Is there a small example exhibiting the same stack trace that
can be included in unit testing? UPDATE: Yes, include
an encoded rfc822 attachment.
- How can mime.py be more robust against this kind of problem?
UPDATE: milter-0.5.1 decodes rfc822 attachments as needed.
- Pine doesn't try to further parse the rfc822 attachment. Python
milter should, in case the attachment carries a virus that MSIE would
unpack and execute.
- I am also open to suggestions of other third-party MIME/HTML parsing
modules to use instead.
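For the unit-testing question, a small fixture can be generated rather than captured. The sketch below is one possible shape, assuming Python 3 and the stdlib email, quopri, and unittest modules. The message contents, the filenames helper, and the test class are all made up for illustration, and the bytes stand in for the body of an rfc822 attachment declared quoted-printable. It shows why a scanner has to decode first: quoted-printable rewrites every "=" as "=3D", which corrupts the boundary and filename parameters.

```python
import email
import quopri
import unittest

# Made-up nested message carrying one attachment.
INNER = (
    b"MIME-Version: 1.0\n"
    b'Content-Type: multipart/mixed; boundary="xyz"\n'
    b"\n"
    b"--xyz\n"
    b"Content-Type: application/octet-stream\n"
    b'Content-Disposition: attachment; filename="payload.exe"\n'
    b"\n"
    b"not really a program\n"
    b"--xyz--\n"
)

# What the body looks like inside a quoted-printable rfc822 attachment.
ENCODED = quopri.encodestring(INNER)


def filenames(raw):
    """Return the attachment filenames found by parsing raw bytes."""
    msg = email.message_from_bytes(raw)
    return [p.get_filename() for p in msg.walk() if p.get_filename()]


class EncodedRfc822Test(unittest.TestCase):
    def test_encoding_hides_attachment(self):
        # Parsed without decoding, the boundary parameter is mangled,
        # so the nested attachment is never seen.
        self.assertEqual(filenames(ENCODED), [])

    def test_decoding_reveals_attachment(self):
        decoded = quopri.decodestring(ENCODED)
        self.assertEqual(filenames(decoded), ["payload.exe"])
```

Saved as a module, the two tests can be run with python -m unittest.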