Long story short, I was seeing exceptions such as this: -
Traceback (most recent call last):
  File "decode.py", line 8, in <module>
    decoded_string = base64.b64decode(string).decode('UTF-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb2 in position 0: invalid start byte
and: -
Traceback (most recent call last):
  File "decode.py", line 8, in <module>
    decoded_string = base64.b64decode(string).decode('UTF-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 25: invalid start byte
with a piece of inherited Python code.
I wasn't 100% clear how the decoder was being used, apart from this line: -
decoded_string = base64.b64decode(string).decode('UTF-8')
where string contained, I assumed, a Base64 encoded string.
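To make that line concrete, here is a minimal sketch of what it does when the input really is Base64 encoded text. The sample value is invented for illustration and does not come from the inherited code: -

```python
import base64

# Hypothetical input: the Base64 encoding of the UTF-8 text "hello world".
string = "aGVsbG8gd29ybGQ="

# Step 1: b64decode turns the Base64 text back into raw bytes.
raw_bytes = base64.b64decode(string)

# Step 2: decode interprets those bytes as UTF-8 text.
# This only succeeds if the bytes happen to be valid UTF-8.
decoded_string = raw_bytes.decode('UTF-8')

print(decoded_string)
```

The two steps can fail independently: the first if the input is not valid Base64, the second if the decoded bytes are not valid UTF-8. A UnicodeDecodeError points at the second step.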
The actual string was a public key, generated using ssh-keygen: -
ssh-keygen -t rsa -b 4096 -f ~/.ssh/dummy -N ""
I wrote a test harness to simulate the problem and test potential solutions, and was pasting the public key into the string variable: -
cat ~/.ssh/dummy.pub | awk '{print $2}'
AAAAB3NzaC1yc2EAAAADAQABAAACAQDXmbB7kUK4G0Fqm+5SSDztAMR5mV+0irWGLFuZN7Pbj30Kyi67TZ3J1cEhC3PsDyFW4hkvMRpdOoSlUfL2yVb1IxvbidcPF0ihtHgnMD2pn3W8xwFpbutpPWUgPd679Yq1C/bzFx2lIDWBpy5bSj/TpTWRsdFy7Z1Esja2ST8RfUByAl5zsg6fuyFFySzY8bVgH/Oc+eS82tICS1ZqdXJy6atsJQ2OnP7zTrw4Txz+vwpmQeddWSjL1wUs77ea0FJjU2MMFHm6+uW+cAr2woYlA4Lac6d+Mq9t5Ibt77J8BijkjJ+U79JhNSky0A2rSeThdWuD7uW/Kju43m6fb5ss/ATKbra/M3hUPg0F0YwtiDmPratCkE11uJnFfyYaPpt58LrgvYZzosliQe96AeCWru6IzEkGoGErSfl/PwielDWzDWXuNxY00gQ0Rtx3I76g6gV01gbxKcBusLTFh51GC0PvVEikhk5cI+drbT1uMDjLHi6Tr2MO+uRdu2BpwVQIZgSUke3OpnjQ2rDTIcaKy6e5lfJ7Hpw0kIw0Bi9j9YDMod90TRQXdElWFKeKQ+ZlaH9Ytr2FeDk+9H69kf52rXtn8q9Uy/NtlIdKsYa2pGdv7N1IFumGX+GbYplewTta/05OaJXI3iia1CV09oFryag+5MYQmJRCijSlUBIFjQ==
Can you see where I was going wrong?
Yes, the public key had NOT been Base64 encoded before being handed to the decoder; the key body decodes to binary key data, not UTF-8 text ....
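The body of an OpenSSH public key is Base64 of a binary blob, so b64decode succeeds and it is the UTF-8 step that fails. A minimal sketch, using only the first 36 characters of the key above (the variable names are my own, not from the inherited code): -

```python
import base64
import struct

# First 36 characters of the public key shown above; the full key fails the same way.
key_prefix = "AAAAB3NzaC1yc2EAAAADAQABAAACAQDXmbB7"

# This step works: the key body is valid Base64.
blob = base64.b64decode(key_prefix)

# The blob starts with a 4-byte big-endian length, followed by the key type name.
(name_len,) = struct.unpack(">I", blob[:4])
key_type = blob[4:4 + name_len].decode("ascii")
print(name_len, key_type)

# This step fails: the rest of the blob is binary key material, not UTF-8 text.
failed_at = None
try:
    blob.decode("UTF-8")
except UnicodeDecodeError as err:
    failed_at = err.start
    print(err)
```

The key type comes out as ssh-rsa, and the decode fails on byte 0xb0 at position 25, which matches the second exception above.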
The solution was to encode the public key: -
cat ~/.ssh/dummy.pub | awk '{print $2}' | base64
QUFBQUIzTnphQzF5YzJFQUFBQURBUUFCQUFBQ0FRRFhtYkI3a1VLNEcwRnFtKzVTU0R6dEFNUjVtViswaXJXR0xGdVpON1BiajMwS3lpNjdUWjNKMWNFaEMzUHNEeUZXNGhrdk1ScGRPb1NsVWZMMnlWYjFJeHZiaWRjUEYwaWh0SGduTUQycG4zVzh4d0ZwYnV0cFBXVWdQZDY3OVlxMUMvYnpGeDJsSURXQnB5NWJTai9UcFRXUnNkRnk3WjFFc2phMlNUOFJmVUJ5QWw1enNnNmZ1eUZGeVN6WThiVmdIL09jK2VTODJ0SUNTMVpxZFhKeTZhdHNKUTJPblA3elRydzRUeHordndwbVFlZGRXU2pMMXdVczc3ZWEwRkpqVTJNTUZIbTYrdVcrY0FyMndvWWxBNExhYzZkK01xOXQ1SWJ0NzdKOEJpamtqSitVNzlKaE5Ta3kwQTJyU2VUaGRXdUQ3dVcvS2p1NDNtNmZiNXNzL0FUS2JyYS9NM2hVUGcwRjBZd3RpRG1QcmF0Q2tFMTF1Sm5GZnlZYVBwdDU4THJndllaem9zbGlRZTk2QWVDV3J1Nkl6RWtHb0dFclNmbC9Qd2llbERXekRXWHVOeFkwMGdRMFJ0eDNJNzZnNmdWMDFnYnhLY0J1c0xURmg1MUdDMFB2VkVpa2hrNWNJK2RyYlQxdU1EakxIaTZUcjJNTyt1UmR1MkJwd1ZRSVpnU1VrZTNPcG5qUTJyRFRJY2FLeTZlNWxmSjdIcHcwa0l3MEJpOWo5WURNb2Q5MFRSUVhkRWxXRktlS1ErWmxhSDlZdHIyRmVEays5SDY5a2Y1MnJYdG44cTlVeS9OdGxJZEtzWWEycEdkdjdOMUlGdW1HWCtHYllwbGV3VHRhLzA1T2FKWEkzaWlhMUNWMDlvRnJ5YWcrNU1ZUW1KUkNpalNsVUJJRmpRPT0K
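The same encoding step can be done in Python rather than in the shell. A minimal sketch, assuming the key is already in a variable (public_key is a name I made up, and the value is shortened for illustration): -

```python
import base64

# Stand-in for the second field of ~/.ssh/dummy.pub, shortened for illustration.
public_key = "AAAAB3NzaC1yc2EAAAADAQABAAACAQDXmbB7"

# Encode: text -> UTF-8 bytes -> Base64 bytes -> ASCII text.
encoded = base64.b64encode(public_key.encode("UTF-8")).decode("ascii")

# Decode with the same expression the inherited code uses.
decoded_string = base64.b64decode(encoded).decode("UTF-8")

print(encoded)
print(decoded_string == public_key)
```

One difference from the shell pipeline: awk prints a trailing newline, and base64 encodes it along with the key, so the value decoded from the shell output ends with a newline. Calling strip() on the result should take care of that if it matters.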
at which point my code started working: -
python3 decode.py
which returns the original unencoded public key: -
AAAAB3NzaC1yc2EAAAADAQABAAACAQDXmbB7kUK4G0Fqm+5SSDztAMR5mV+0irWGLFuZN7Pbj30Kyi67TZ3J1cEhC3PsDyFW4hkvMRpdOoSlUfL2yVb1IxvbidcPF0ihtHgnMD2pn3W8xwFpbutpPWUgPd679Yq1C/bzFx2lIDWBpy5bSj/TpTWRsdFy7Z1Esja2ST8RfUByAl5zsg6fuyFFySzY8bVgH/Oc+eS82tICS1ZqdXJy6atsJQ2OnP7zTrw4Txz+vwpmQeddWSjL1wUs77ea0FJjU2MMFHm6+uW+cAr2woYlA4Lac6d+Mq9t5Ibt77J8BijkjJ+U79JhNSky0A2rSeThdWuD7uW/Kju43m6fb5ss/ATKbra/M3hUPg0F0YwtiDmPratCkE11uJnFfyYaPpt58LrgvYZzosliQe96AeCWru6IzEkGoGErSfl/PwielDWzDWXuNxY00gQ0Rtx3I76g6gV01gbxKcBusLTFh51GC0PvVEikhk5cI+drbT1uMDjLHi6Tr2MO+uRdu2BpwVQIZgSUke3OpnjQ2rDTIcaKy6e5lfJ7Hpw0kIw0Bi9j9YDMod90TRQXdElWFKeKQ+ZlaH9Ytr2FeDk+9H69kf52rXtn8q9Uy/NtlIdKsYa2pGdv7N1IFumGX+GbYplewTta/05OaJXI3iia1CV09oFryag+5MYQmJRCijSlUBIFjQ==
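With hindsight, a slightly more defensive decoder would have pointed me at the problem sooner. This is only a sketch of what that might look like, not the inherited code; decode_base64_text is a hypothetical helper name: -

```python
import base64
import binascii


def decode_base64_text(value):
    """Decode Base64 that is expected to wrap UTF-8 text (hypothetical helper)."""
    try:
        raw = base64.b64decode(value)
    except binascii.Error as err:
        # Raised for malformed input, e.g. incorrect padding.
        raise ValueError(f"input is not valid Base64: {err}") from err
    try:
        return raw.decode("UTF-8")
    except UnicodeDecodeError as err:
        # Valid Base64, but the payload is binary rather than text.
        raise ValueError(
            "input is valid Base64 but decodes to binary data, not UTF-8 text; "
            "was it Base64 encoded before being passed in?"
        ) from err


# Base64 of the text "hello" decodes cleanly.
print(decode_base64_text("aGVsbG8="))
```

Passing in a raw public key body would then raise a ValueError that names the likely cause, instead of a bare UnicodeDecodeError.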