python – How to store non-ASCII characters in the Google App Engine datastore

I have tried no fewer than 5 different "solutions" and I can't get it to work. Please help.

This is the error:

  'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)
  Traceback (most recent call last):
  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", line 636, in __call__
    handler.post(*groups)
  File "/base/data/home/apps/elmovieplace/1.350096827241428223/script/pftv.py", line 114, in post
    movie.put()
  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 984, in put
    return datastore.Put(self._entity, config=config)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 455, in Put
    return _GetConnection().async_put(config, entities, extra_hook).get_result()
  File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 1219, in async_put
    for pbs in pbsgen:
  File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 1070, in __generate_pb_lists
    pb = value_to_pb(value)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 239, in entity_to_pb
    return entity._ToPb()
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 841, in _ToPb
    properties = datastore_types.ToPropertyPb(name, values)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore_types.py", line 1672, in ToPropertyPb
    pbvalue = pack_prop(name, v, pb.mutable_value())
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore_types.py", line 1485, in PackString
    pbvalue.set_stringvalue(unicode(value).encode('utf-8'))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1: ordinal not in range(128)

This is the part of the code that is giving me trouble:

if imdbValues[5] == 'N/A':
    movie.director = ''
else:
    movie.director = imdbValues[5]

...

movie.put()

In this case, imdbValues[5] is equal to Claudio Fäh.

Best answer: This line of code raises the exception:

pbvalue.set_stringvalue(unicode(value).encode('utf-8'))

When the value is passed to movie.director, it is first converted to unicode with:

unicode(value)

and then encoded with encode('utf-8').

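The unicode() function uses ASCII as its default decoding encoding; this means it is only safe to pass it one of these values:

- a unicode string
- an 8-bit (byte) string that contains only ASCII characters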

Your code is probably passing a byte string in some other encoding, which unicode(value) cannot decode as ASCII.
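
As a minimal sketch of the failure (not the App Engine code itself), the snippet below explicitly decodes the UTF-8 bytes of the director's name as ASCII, which is what unicode(value) does implicitly on Python 2. The names raw and error are made up for illustration, and it is only an assumption that the value arrives as UTF-8 bytes.

```python
# -*- coding: utf-8 -*-
# Assumed input: the UTF-8 bytes for "Claudio Fäh".
# In UTF-8, "ä" is stored as the two bytes 0xC3 0xA4.
raw = b'Claudio F\xc3\xa4h'

# Explicit equivalent of the implicit ASCII decode done by unicode(raw)
# on Python 2. Byte 0xC3 is outside the ASCII range, so this fails.
try:
    raw.decode('ascii')
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(error)
```

The first byte that ASCII cannot handle is 0xc3, the same byte reported in the traceback.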

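Advice:
If you are dealing with byte strings, you must know their encoding, or your program will suffer from this kind of encoding/decoding problem.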

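How to fix it:
Find out which encoding is used by the byte strings you are handling (utf-8?) and convert them to unicode strings.
For example, if imdbValues is a list returned by some fancy IMDb Python library that contains utf-8 encoded byte strings, you should convert them with: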

 movie.director = imdbValues[5].decode('utf-8')
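
If several fields need the same treatment, the conversion could be wrapped in a small helper. This is only a sketch: to_text is a hypothetical name, not part of App Engine or any IMDb library, and the default of utf-8 is an assumption about the source data that you should verify.

```python
# -*- coding: utf-8 -*-
def to_text(value, encoding='utf-8'):
    """Return value as a unicode string, decoding byte strings with `encoding`.

    Hypothetical helper: `encoding` must match what the data source
    really produces, otherwise decode() raises UnicodeDecodeError.
    """
    if isinstance(value, bytes):  # on Python 2, bytes is an alias of str
        return value.decode(encoding)
    return value  # already a unicode string, leave it untouched


# Assumed input: UTF-8 bytes as a library might return them.
director = to_text(b'Claudio F\xc3\xa4h')
```

With such a helper the assignment would become movie.director = to_text(imdbValues[5]), and values that are already unicode pass through unchanged.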