Unicode issues#252
Conversation
|
I'm not able to see a problem with the default encoding in the test suite. Certainly, there isn't any code in IPython or nose that calls sys.setdefaultencoding(). Do you have any nose plugins installed? It's possible that a poorly behaved plugin is doing that. The other changes seem fine. |
|
I shouldn't have any nose plugins installed - I only installed nose to test IPython. I didn't try to track down where it got changed, but when I tried to add the tests to the main test suite, I couldn't get them to fail, and when I tried sticking a |
|
Some other packages (e.g. Pylons) will also provide nose plugins for their own use. But if you just have IPython and nose installed in the virtualenv, then I'm not sure what the issue would be. |
|
Sorting out the test suite revealed a unicode related test failure in inputsplitter, which has also been resolved. |
|
Interestingly, somewhere along the line, the annoying bug where unicode literals turn into UTF-8 has been resolved. |
|
Rebased on trunk - I needed to check it against newer code. |
|
Looks good to me. Recommend merge. |
|
Trying to resolve some test failures on Windows. |
|
N.B. I suggest that this waits until my sqlite history branch is merged. It will need a couple of changes following the new history system. |
|
OK for waiting post sqlite history merger. I just wanted to thank you for tackling these unicode issues! One thing to note: we may need to make history loading more robust against encoding issues. Just in testing this I input some unicode, then went back to running on master, but since the unicode had gone into my history file on disk, now I can't start ipython because it crashes on initialization: amirbar[~]> ip
Error in sys.excepthook:
Traceback (most recent call last):
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/core/crashhandler.py", line 137, in __call__
traceback = TBhandler.text(etype,evalue,etb,context=31)
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/core/ultratb.py", line 397, in text
tb_offset, context)
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/core/ultratb.py", line 958, in structured_traceback
ipinst = ipapi.get()
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/core/ipapi.py", line 29, in get
return InteractiveShell.instance()
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/core/interactiveshell.py", line 315, in instance
inst = cls(*args, **kwargs)
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/core/interactiveshell.py", line 279, in __init__
self.init_readline()
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/core/interactiveshell.py", line 1565, in init_readline
self.reload_history()
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/core/interactiveshell.py", line 1259, in reload_history
self.history_manager.reload_history()
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/core/history.py", line 174, in reload_history
self.populate_readline_history()
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/core/history.py", line 146, in populate_readline_history
self.shell.readline.add_history(line)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1' in position 6: ordinal not in range(128)
Original exception was:
Traceback (most recent call last):
File "/home/fperez/usr/bin/ipython", line 10, in
launch_new_instance()
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/frontend/terminal/ipapp.py", line 660, in launch_new_instance
app.start()
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/core/application.py", line 210, in start
self.initialize()
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/core/application.py", line 202, in initialize
self.construct()
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/frontend/terminal/ipapp.py", line 478, in construct
self.shell = TerminalInteractiveShell.instance(config=self.master_config)
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/core/interactiveshell.py", line 315, in instance
inst = cls(*args, **kwargs)
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/frontend/terminal/interactiveshell.py", line 87, in __init__
user_global_ns=user_global_ns, custom_exceptions=custom_exceptions
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/core/interactiveshell.py", line 279, in __init__
self.init_readline()
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/core/interactiveshell.py", line 1565, in init_readline
self.reload_history()
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/core/interactiveshell.py", line 1259, in reload_history
self.history_manager.reload_history()
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/core/history.py", line 174, in reload_history
self.populate_readline_history()
File "/home/fperez/usr/lib/python2.6/site-packages/IPython/core/history.py", line 146, in populate_readline_history
self.shell.readline.add_history(line)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe1' in position 6: ordinal not in range(128)
Now, in this case I know perfectly what's going on (the product of going back and forth between branches with different unicode behavior) and I also know how to get out of it. But a normal user may get seriously frustrated by a failure of this kind. So in the history branch, we probably want to make sure that in the face of catastrophic errors with history loading, we simply clean things up, start a new history store (db in the new code) and get up and running (informing the user of what happened) rather than blocking out startup. This is just a comment to keep in mind as we rework the history stuff... As far as this branch is concerned, all I can say is thank you for tackling this unicode mess. In my mind, this was one of the major, if not the key, blocker for getting 0.11 out the door. Awesome work, as always. |
|
You're welcome. And yes, some error handling code for loading history into readline is not a bad idea, although hopefully once we've got all our ducks in order, it won't actually be needed. |
|
Merging this will also close #88 |
There was a problem hiding this comment.
Detect encoding outside loop
|
Done with review during sage days 29 sprint, ready for merge after minor fixes discussed in review. Will be merged after rebase on top of the sqlite branch (once that's merged), since this depends on that. Thanks again for the great work! |
…the best way to do it - sys.stdin.encoding is None in the test suite.
…scii, and Unicode failures can be resolved.
…f the fixes in previous commits are undone.
…haracters in the name. + Unit test. Closes ipythongh-200
…d of compiler, bringing it closer to the Python 3 implementation.
This branch fixes two particularly nasty unicode bugs:
UnicodeEncodeError. This was due to code in compilerop attempting to do MD5 hashing without first encoding.readline.add_history, which expects text to be encoded.I've started creating a test suite for unicode related issues. Unfortunately, this has to be run separately from the main test suite, as something in the test suite's initialisation causes it to change the default encoding from ascii to utf-8, in which situation neither of the problems above occur. This is, for now, in tools/unicode_tests.py.