Error when inputting UTF8 CJK characters #88

Description

Original Launchpad bug 264587: https://bugs.launchpad.net/ipython/+bug/264587
Reported by: ngu-kho (ngU khO).

I'm using a UTF-8 locale on my system, and it has worked well for years. But when I input CJK characters into the IPython console in gnome-terminal via scim (an input method platform), some of the characters are corrupted in the console.

An example is the CJK character '选' ('\u9009'), which should be encoded as '\xe9\x80\x89' in UTF-8. However, typing this character into IPython always fails. The character is displayed as several spaces followed by two replacement characters (�, a question mark inside a diamond):

In [1]: s = raw_input()
�� (I was actually inputting '选', however this character could be displayed correctly)

The string that is read is not the one I typed (the first byte of the original string becomes several spaces):

In [3]: s
Out[3]: ' \x80\x89'
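
The output above is consistent with the lead byte of the UTF-8 sequence being lost. A minimal sketch of that reading, written for Python 3 (the session in this report is Python 2), with the corruption step being a hypothetical reconstruction rather than IPython's actual code path:

```python
# Python 3 sketch; the names below are illustrative, not from IPython.
ch = "\u9009"  # the character '选'
encoded = ch.encode("utf-8")

# Assumed corruption: the lead byte 0xe9 is replaced by a space,
# leaving two orphaned UTF-8 continuation bytes behind.
corrupted = b" " + encoded[1:]

# A terminal-like decode with errors="replace" turns each orphaned
# continuation byte into U+FFFD, the replacement character.
shown = corrupted.decode("utf-8", errors="replace")

print(encoded)       # b'\xe9\x80\x89'
print(corrupted)     # b' \x80\x89'
print(ascii(shown))  # ' \ufffd\ufffd'
```

The two replacement characters match what the terminal displays, which suggests the bytes are damaged before they reach the string, not merely rendered badly.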

Moreover, assigning a string that contains such characters may cause IPython to exit:

In [1]: s = ' ��'
WARNING:


You or a %run:ed script called sys.stdin.close() or sys.stdout.close()!
Exiting IPython!

None of this happens in the standard Python console (/usr/bin/python). It should not be a scim problem either, since the same thing happens when I paste the character from the clipboard instead of typing it.

A screenshot of the problem is attached.
