libb.smart_base64
- smart_base64(encoded_words)
Decode base64 encoded words with intelligent charset handling.
Splits the input into encoded words as defined in RFC 2047, Section 2, and handles common encoding problems such as multiline subjects and charset mismatches.
- Parameters:
encoded_words (str) – Base64 encoded string or plain text.
- Returns:
Decoded string (or original if not encoded).
- Return type:
str
Note
Basic Usage:
>>> smart_base64('=?utf-8?B?U1RaOiBGNFExNSBwcmV2aWV3IOKAkyBUaGUgc3RhcnQgb2YgdGh'
... 'lIGNhc2ggcmV0dXJuIHN0b3J5PyBQYXRoIHRvICQyMDAgc3RvY2sgcHJpY2U/?=')
'STZ: F4Q15 preview – The start of the cash return story? Path to $200 stock price?'
Multiline Subjects (common email bug - base64 encoded per line):
>>> smart_base64('=?UTF-8?B?JDEwTU0rIENJVCBHUk9VUCBUUkFERVMgLSBDSVQgNScyMiAxMDLi'
... 'hZ0tMTAz4oWbICBNSw==?=\r\n\t=?UTF-8?B?VA==?=')
"$10MM+ CIT GROUP TRADES - CIT 5'22 102.625-103.125 MK T"
Charset Mismatch (UTF-8 header with Latin-1 content):
>>> smart_base64('=?UTF-8?B?TVMgZW5lcmd5OiByaWcgMTdzIDkxwr4vOTLihZsgMThzIDkzwr4v'
... 'OTTihZsgMjBzIDgywg==?=\r\n\t=?UTF-8?B?vS84Mw==?=')
'MS energy: rig 17s 91.75/92.125 18s 93.75/94.125 20s 82.5/83'
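In the example above, the first encoded word's payload ends with the byte 0xC2 and the second begins with 0xBD; together they form the UTF-8 encoding of ½, so neither word decodes cleanly on its own. The sketch below shows one way to cope with that: join the raw bytes of all encoded words, then decode once. It is not libb's implementation. The names `ENCODED_WORD` and `join_then_decode` are hypothetical, the sketch assumes every word shares the first word's charset, and it drops any unencoded text between words.

```python
import base64
import re

# Hypothetical pattern for RFC 2047 "B" encoded words: =?charset?B?payload?=
ENCODED_WORD = re.compile(r'=\?([^?\s]+)\?[bB]\?([^?\s]*)\?=')


def join_then_decode(header: str) -> str:
    """Join the bytes of all base64 encoded words, then decode once (illustrative)."""
    matches = ENCODED_WORD.findall(header)
    if not matches:
        return header  # plain text passes through unchanged
    charset = matches[0][0]  # assumption: all words use the first word's charset
    raw = b''.join(base64.b64decode(payload) for _, payload in matches)
    return raw.decode(charset, errors='replace')


# Build a folded header whose two-byte character (U+00BD) straddles two words.
data = '82\u00bd/83'.encode('utf-8')  # b'82\xc2\xbd/83'
first = base64.b64encode(data[:3]).decode('ascii')
second = base64.b64encode(data[3:]).decode('ascii')
header = f'=?UTF-8?B?{first}?=\r\n\t=?UTF-8?B?{second}?='

# Decoding the first word by itself fails because its last byte is incomplete.
try:
    base64.b64decode(first).decode('utf-8')
    per_word_ok = True
except UnicodeDecodeError:
    per_word_ok = False

result = join_then_decode(header)
```

Here `per_word_ok` ends up `False` while `result` contains the intact character, which is why joining before decoding is the safer order for headers encoded line by line.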
Unicode Characters:
>>> smart_base64('=?UTF-8?B?VGhpcyBpcyBhIGhvcnNleTog8J+Qjg==?=')
'This is a horsey: \U0001f40e'
>>> smart_base64('=?UTF-8?B?U0xBQiAxIOKFnDogIDEwOSAtIMK9IHYgNzYuMjU=?=')
'SLAB 1.375: 109 - 0.5 v 76.25'
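The payload of the second example decodes to text containing the vulgar fractions ⅜ and ½, while the documented result shows 1.375 and 0.5, so some fraction-to-decimal normalization appears to take place. How libb does this is not stated here. The following is only a guess at the idea, using the standard library's `unicodedata.numeric`; the names `FRACTION` and `fractions_to_decimals` are hypothetical.

```python
import re
import unicodedata

# Optional whole number (with an optional single space) followed by one
# vulgar fraction character in U+00BC..U+00BE or U+2150..U+215E.
FRACTION = re.compile(r'(?:(\d+) ?)?([\u00bc-\u00be\u2150-\u215e])')


def fractions_to_decimals(text: str) -> str:
    """Replace vulgar fractions with decimal notation (illustrative only)."""
    def replace(match: re.Match) -> str:
        whole = int(match.group(1) or 0)
        # unicodedata.numeric returns the fraction's value as a float
        return str(whole + unicodedata.numeric(match.group(2)))
    return FRACTION.sub(replace, text)


converted = fractions_to_decimals('SLAB 1 \u215c: 109 - \u00bd v 76.25')
```

Fractions without a short binary representation, such as ⅓, would come out as long floats under this approach, so a real implementation would likely round or format the value.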
Plain Text Passthrough:
>>> smart_base64('This is plain text')
'This is plain text'