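I am collecting emails from an IMAP server with the code below, but the body content often looks very ugly and is sometimes unreadable. Many of the emails contain Danish and Swedish special characters such as æ, ä, ö, ø and å, but I don't think that is the problem. What is the best way to handle the encoding and clean the bodies up?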
require "net/imap"
require "mail"

imap = Net::IMAP.new(address, port: port, ssl: enable_ssl?)
imap.login(user_name, password)
imap.examine(flag) # opens the mailbox read-only; `flag` must hold a mailbox name
search_query = "#{last_uid}:*"
header_section = "BODY[HEADER.FIELDS (FROM TO DATE SUBJECT)]"
imap.uid_search(search_query).each do |uid|
  # "N:*" always matches at least the newest message, so skip UIDs already seen
  next unless uid.to_i > last_uid.to_i

  header = imap.uid_fetch(uid, header_section)[0].attr[header_section]
  mail = Mail.read_from_string(header) # parse the header block once
  from = mail.from.first
  to = mail.to.first rescue nil
  subject = mail.subject
  date = mail.date
  # BODY[TEXT] is the raw, still transfer-encoded body without the message headers
  body = imap.uid_fetch(uid, "BODY[TEXT]")[0].attr["BODY[TEXT]"].gsub(/\r\n?/, "\n").force_encoding('UTF-8')
end
imap.logout
imap.disconnect
Sample body contents:
1:
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS08YnI+DQpPcmRy
ZWRhdG86IDI4LTAzLTIwMTMgMTQ6NDc6MTg8YnI+DQpPcmRyZW51bW1lcjogMTA5MDM1PGJy
Pg0KVHJhbnNha3Rpb25zSUQ6IDE2NzgyMQ0KPGJyPjxicj4NCkZha3R1cmVyaW5nc2FkcmVz
c2U6PGJyPg0KLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLTxicj48YnI+DQpOaWtsYXMgSnV1bCBOaWVs
c2VuPGJyIC8+QS5QLiBNw7hsbGVyIEtvbGxlZ2lldCAxMDU8YnIgLz41NzAwIFN2ZW5kYm9y
ZzxiciAvPkRlbm1hcms8YnIgLz5UTEY6OiAyMDYzMDczNzxiciAvPjxhIGhyZWY9Im1haWx0
bzpuaWtzQGxpdmUuZGsiPm5pa3NAbGl2ZS5kazwvYT48YnIgLz4NCjxicj48YnI+DQpMZXZl
cmluZ3NhZHJlc3NlOjxicj4NCi0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS08YnI+PGJyPg0KTmlrbGFz
IEp1dWwgTmllbHNlbjxiciAvPkEuUC4gTcO4bGxlciBLb2xsZWdpZXQgMTA1PGJyIC8+NTcw
MCBTdmVuZGJvcmc8YnIgLz5EZW5tYXJrPGJyIC8+VExGOjogMjA2MzA3Mzc8YnIgLz48YSBo
cmVmPSJtYWlsdG86bmlrc0BsaXZlLmRrIj5uaWtzQGxpdmUuZGs8L2E+PGJyIC8+DQo8YnI+
PGJyPg0KT3JkcmVkYXRhOjxicj4NCi0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0t
LS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS0tLS08YnI+DQoNCiAgMSww
MCBzdGsuIFN0YXIgV2FycyBCYXR0bGVmcm9udCBJSSBYYm94ICg0MTAzMikgw6EgREtLIDI2
Myw5OSAtIElhbHQ6IERLSyAzMjksOTkNCjxicj4NCjxicj4NCkJldGFsaW5nOiAyOiBEYW5z
a2Uga3JlZGl0a29ydCBbdHJhbnNha3Rpb25zZ2VieXIgMSwyNSVdIChES0sgNCwxMykNCjxi
cj4NCkZvcnNlbmRlbHNlOiAgKERLSyAwLDAwKQ0KPGJyPjxicj4NClNhbWxldCBwcmlzIDog
REtLIDMzNCwxMg0KPGJyPg0KSGVyYWYgbW9tczogREtLIDY2LDgzDQo=
2 (shortened):
------=_NextPart_000_0482_01CE2B9E.A689A9F0
Content-Type: multipart/related;
boundary="----=_NextPart_001_0483_01CE2B9E.A689A9F0"
------=_NextPart_001_0483_01CE2B9E.A689A9F0
Content-Type: multipart/alternative;
boundary="----=_NextPart_002_0484_01CE2B9E.A689A9F0"
------=_NextPart_002_0484_01CE2B9E.A689A9F0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
=20
=09
=09
=09
=09
=20
=09
=09
=09
=09
=09
=09
=09
=09
=09
Daily Restock Information.
=09
=09
Item
Format
1+=20
5+ =20
Box Price=20
Qty
Barcode
=09
3 (shortened):
--Boundary-=_SHccxHuUYYhTGDGLfcIEBDUToEun
Content-Type: text/plain; charset="ISO-8859-1"
--Boundary-=_SHccxHuUYYhTGDGLfcIEBDUToEun
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="SYSTEMSTOCK.XLSX"
Content-Transfer-Encoding: base64
UEsDBBQABgAIAAAAIQC5OlcVkgEAAIwGAAATAN0BW0NvbnRlbnRfVHlwZXNdLnhtbCCi2QEooAAC
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
AAAAAAAAAAAAAAAAAMRVyWrDMBC9F/oPRtcSK0mhlBInhy7HNpD0AxRrEovYktBMtr/v2FloimtI
HejF+7xl9EYejLZFHq0hoHE2Eb24KyKwqdPGLhLxOX3rPIoISVmtcmchETtAMRre3gymOw8YcbXF
RGRE/klKTDMoFMbOg+U3cxcKRXwbFtKrdKkWIPvd7oNMnSWw1KESQwwHLzBXq5yi1y0/3iuZGSui
5/13JVUilPe5SRWxULm2+gdJx83nJgXt0lXB0DH6AEpjBkBFHvtgmDFMgIiNoZDDwQebDkZDNFaB
and so on...
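Best answer: I tried to solve this and dug around for hours, so I'm adding my answer to the several threads I found on it...
https://stackoverflow.com/a/26604049/2386548
Hope this helps someone...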
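
For readers who want something inline: the samples in the question look like bodies that are still transfer-encoded (sample 1 is plain base64, sample 2 is quoted-printable in iso-8859-1), so calling force_encoding('UTF-8') on them only relabels the bytes. Below is a minimal sketch of the decoding step using only core Ruby. It is not necessarily what the linked answer does. The helper name decode_part and its parameters are made up for illustration, and it assumes you already know the part's Content-Transfer-Encoding and charset, which BODY[TEXT] alone does not give you for a non-multipart message.

```ruby
# Hypothetical helper (not from the original post): undo the MIME transfer
# encoding of one part body, then transcode it from its charset to UTF-8.
def decode_part(raw, transfer_encoding:, charset: "UTF-8")
  bytes =
    case transfer_encoding.to_s.downcase
    when "base64"           then raw.unpack1("m") # lenient base64 decode
    when "quoted-printable" then raw.unpack1("M") # handles =XX and soft line breaks
    else raw.dup                                  # 7bit / 8bit / binary: nothing to undo
    end

  bytes.force_encoding(charset)                   # label the bytes with their real charset
       .encode("UTF-8", invalid: :replace, undef: :replace)
       .scrub                                     # also covers the UTF-8 to UTF-8 case
       .gsub(/\r\n?/, "\n")                       # normalise line endings, as in the question
end

# "TcO4bGxlcg==" is the base64 form of the UTF-8 bytes for "Møller"
puts decode_part("TcO4bGxlcg==", transfer_encoding: "base64")
# prints "Møller"
```

For multipart messages like samples 2 and 3 you would still have to split the parts and read each part's own headers first. The mail gem the question already uses can take care of that if it is handed the complete message (fetched with BODY[] instead of BODY[TEXT]), because it then sees the Content-Type and Content-Transfer-Encoding headers.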