Another Python Unicode Fiasco on JSON
It’s a re-post of my comment here: 1
Basically, I think there is a bug in the json.dump() function in Python 2 (only): it can't dump Python data (a dictionary or a list) containing non-ASCII characters, even if you open the file with the encoding='utf-8' parameter (i.e. no matter what you do). However, json.dump() works fine in Python 3, and json.dumps() works in both Python 2 and 3.
The following code breaks in Python 2 (Python 2.7.6, Debian) with the exception TypeError: must be unicode, not str:
    import json
    data = {u'\u0430\u0431\u0432\u0433\u0434': 1}  # {u'абвгд': 1}
    with open('data.txt', 'w') as outfile:
        json.dump(data, outfile)
It, however, works fine in Python 3.
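Since json.dumps() works on both versions, one possible workaround is to serialize to a string first and write that string yourself. The sketch below is my own illustration, not part of the original comment: the helper name dump_json_utf8 is hypothetical, and it assumes you want real UTF-8 characters in the file (ensure_ascii=False) rather than \uXXXX escapes.

```python
import io
import json


def dump_json_utf8(data, path):
    """Write `data` as UTF-8 encoded JSON to `path` (hypothetical helper)."""
    # Serialize to a string first; ensure_ascii=False keeps non-ASCII
    # characters as-is instead of escaping them as \uXXXX.
    text = json.dumps(data, ensure_ascii=False)
    # On Python 2, json.dumps() can return a byte string when the input
    # contains no unicode objects; decode it so the text-mode file accepts it.
    if isinstance(text, bytes):
        text = text.decode('utf-8')
    # io.open() behaves the same on Python 2 and 3 and expects text.
    with io.open(path, 'w', encoding='utf-8') as f:
        f.write(text)
```

Calling dump_json_utf8(data, 'data.txt') with the dictionary from the example above should leave a file that json.load() can read back on either version.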
The following struck-out statements are wrong, so just ignore them.
    import io, json
    with io.open('data.txt', 'w', encoding='utf-8') as f:
        f.write(json.dumps(data))

    import io, json
    with io.open('data.txt', 'w', encoding='utf-8') as f:
        json.dump(data, f)