Forum Archive

What is the output when run on Windows?

ccc

I do not have a Windows box. Can someone who has Python on Windows please run the following code and tell me what the output is?

import os, sys, unicodedata

print(sys.platform, sys.version)
print(os.getenv('PYTHONIOENCODING', None))
print(sys.getdefaultencoding())
print(sys.stdin.encoding, sys.stdout.encoding, sys.stderr.encoding)

try:
    print(unicodedata.lookup('WHITE CHESS KING'))
except KeyError as e:
    print('%s: %s' % (e.__class__.__name__, e))

try:
    print( sys.getwindowsversion())
except AttributeError as e:
    print('%s: %s' % (e.__class__.__name__, e))
dgelessus

Python 2.7.8:

Python 2.7.8 (default, Jun 30 2014, 16:03:49) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> exec("""
import os, sys, unicodedata

print(sys.platform, sys.version)
print(os.getenv('PYTHONIOENCODING', None))
print(sys.getdefaultencoding())
print(sys.stdin.encoding, sys.stdout.encoding, sys.stderr.encoding)

try:
    print(unicodedata.lookup('WHITE CHESS KING'))
except KeyError as e:
    print('%s: %s' % (e.__class__.__name__, e))

try:
    print( sys.getwindowsversion())
except AttributeError as e:
    print('%s: %s' % (e.__class__.__name__, e))
""")
('win32', '2.7.8 (default, Jun 30 2014, 16:03:49) [MSC v.1500 32 bit (Intel)]')
None
ascii
('cp1252', 'cp1252', 'cp1252')
♔
sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1')

Python 3.4.1:

Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:38:22) [MSC v.1600 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> exec("""
import os, sys, unicodedata

print(sys.platform, sys.version)
print(os.getenv('PYTHONIOENCODING', None))
print(sys.getdefaultencoding())
print(sys.stdin.encoding, sys.stdout.encoding, sys.stderr.encoding)

try:
    print(unicodedata.lookup('WHITE CHESS KING'))
except KeyError as e:
    print('%s: %s' % (e.__class__.__name__, e))

try:
    print( sys.getwindowsversion())
except AttributeError as e:
    print('%s: %s' % (e.__class__.__name__, e))
""")
win32 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:38:22) [MSC v.1600 32 bit (Intel)]
None
utf-8
cp1252 cp1252 cp1252
♔
sys.getwindowsversion(major=6, minor=1, build=7601, platform=2, service_pack='Service Pack 1')

(Yes, I am too lazy to save the code as a file and run it properly. Long live interactive Python!)

FYI, the encodings of the std streams aren't guaranteed to always be the same. I ran the code in IDLE, which reports codepage 1252 (Windows' near-equivalent of Latin-1) but actually supports Unicode. If I run the code in the normal python interpreter in the cmd shell (which by default uses codepage 850 and has no Unicode support), it fails to print the Unicode character and produces this traceback:

Traceback (most recent call last):
  File "<stdin>", line 18, in <module>
  File "<string>", line 10, in <module>
  File "C:\Python\3.4\lib\encodings\cp850.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2654' in position 0: character maps to <undefined>
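
For anyone without a Windows box, the same failure can be reproduced by asking the codecs directly, since the traceback comes from the cp850 codec rather than from Windows itself. A rough sketch, assuming Python 3; can_encode is a made-up helper name, not part of the standard library:

```python
def can_encode(text, encoding):
    """Return True if every character in text exists in the named codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

king = '\u2654'  # WHITE CHESS KING
for name in ('cp850', 'cp1252', 'utf-8'):
    # Report, per codec, whether the chess king is representable
    print(name, can_encode(king, name))
```

Only the utf-8 line should report True, which matches what the two consoles do.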
ccc

Cool. Thanks much. In the normal python interpreter in the cmd shell, can you please add sys.setdefaultencoding('cp1252') at the top of the file and tell me if that changes the results?

dgelessus
Python 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:38:22) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.setdefaultencoding("cp1252")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'setdefaultencoding'

This happens with both Python 2 and 3. If I run chcp 1252 before starting python, I still get the same UnicodeEncodeError as above, because Windows-1252 doesn't have the U+2654 WHITE CHESS KING character.

There is, however, codepage 65001, which is Windows' name for UTF-8. That seems to work, but only with Python 3. On Python 2, I get a LookupError: unknown encoding: cp65001 every time I hit Enter in the interactive prompt.
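
If changing the codepage is not an option, another way to avoid the exception is to pick an error handler when encoding. A small sketch, assuming Python 3; encode_for_console is a hypothetical helper used only for illustration. As far as I know, the same encoding:errors pair can also be passed to the interpreter through the PYTHONIOENCODING environment variable that the first script prints.

```python
def encode_for_console(text, encoding, errors='replace'):
    """Encode text for a console codec without raising on unmappable characters."""
    return text.encode(encoding, errors)

king = '\u2654'  # WHITE CHESS KING
print(encode_for_console(king, 'cp1252'))                      # b'?'
print(encode_for_console(king, 'cp1252', 'backslashreplace'))  # b'\\u2654'
print(encode_for_console(king, 'utf-8'))                       # b'\xe2\x99\x94'
```

This trades the traceback for a placeholder character, so it hides the problem rather than making the chess pieces show up.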

ccc

Thanks much! Just one more for you to try on Windows...

import sys, unicodedata

print('=' * 23)
print(sys.platform, sys.version)
print(sys.getdefaultencoding())
# sys.setdefaultencoding('cp1252')
# sys.setdefaultencoding('utf-8')
# print(sys.getdefaultencoding())

def chess_char(piece_name):
    color, ptype = piece_name.split()
    try:
        return str(unicodedata.lookup('%s chess %s' % (color, ptype)))
    except UnicodeEncodeError:
        char = 'n' if ptype == 'knight' else ptype[0]
        return char.upper() if color == 'white' else char

colors = 'white', 'black'
ptypes = 'king queen rook bishop knight pawn'.split()
piece_names = ('%s %s' % (color, ptype) for color in colors for ptype in ptypes)
pieces_dict = {piece_name : chess_char(piece_name) for piece_name in piece_names}
print(' '.join(pieces_dict.values()))

My hope is that in IDLE it prints Unicode chess characters, and in cmd it prints ASCII characters without throwing any uncaught exceptions.

Thanks again for your help on this.

dgelessus

The UnicodeEncodeError happens when trying to print the Unicode characters, not when looking them up. Python on Windows is fully capable of handling Unicode strings. It's just that the std streams in cmd use codepage 850 by default and can't display most Unicode characters.
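
Since the lookup never fails, the fallback has to test the output encoding instead. A minimal sketch of that idea, assuming Python 3; the extra encoding parameter is an addition for illustration and is not in the posted code:

```python
import sys
import unicodedata

def chess_char(piece_name, encoding=None):
    """Return the Unicode chess symbol if `encoding` can represent it, else an ASCII letter."""
    # Default to whatever stdout claims to use; fall back to ASCII if unknown
    encoding = encoding or getattr(sys.stdout, 'encoding', None) or 'ascii'
    color, ptype = piece_name.split()
    symbol = unicodedata.lookup('{} CHESS {}'.format(color, ptype).upper())
    try:
        # The encode step is where cp850 gives up, not the lookup above
        symbol.encode(encoding)
        return symbol
    except UnicodeEncodeError:
        char = 'n' if ptype == 'knight' else ptype[0]
        return char.upper() if color == 'white' else char

print(chess_char('white knight', 'cp850'))  # N
```

Called without the second argument, it uses sys.stdout.encoding, so the rest of the posted script could stay as it is.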

In any case, here's the output when run in IDLE:

=======================
win32 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:38:22) [MSC v.1600 32 bit (Intel)]
utf-8
♚ ♝ ♔ ♕ ♛ ♞ ♖ ♗ ♟ ♘ ♙ ♜

And in the command prompt with codepage 850:

=======================
win32 3.4.1 (v3.4.1:c0e311e010fc, May 18 2014, 10:38:22) [MSC v.1600 32 bit (Intel)]
utf-8
Traceback (most recent call last):
  File "<stdin>", line 23, in <module>
  File "<string>", line 22, in <module>
  File "C:\Python\3.4\lib\encodings\cp850.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u2657' in position 0: character maps to <undefined>
ccc
  • In IDLE, this should print msg surrounded by chess characters and then True.
  • In cmd, this should print msg with no chess characters and then False.
# coding: utf-8
import sys

def can_print_unicode(msg='Hello World'):
    try:
        # No str() here: on Python 2 it would encode to ASCII and always fail
        print(u'♜ ♞ ♝ {} ♗ ♘ ♖'.format(msg))
        return True
    except UnicodeEncodeError:
        print(msg)
        return False

if __name__ == '__main__':
    print(can_print_unicode())
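
The cmd case can also be approximated without Windows by writing to a stream that uses the cp850 codec. This is only a sketch, assuming Python 3; can_write_unicode and fake_console are hypothetical names, and a real console may behave differently from an in-memory stream.

```python
import io

def can_write_unicode(stream, msg='Hello World'):
    """Write msg framed by chess pieces; fall back to plain msg if the stream cannot encode them."""
    try:
        stream.write('♜ ♞ ♝ {} ♗ ♘ ♖\n'.format(msg))
        stream.flush()
        return True
    except UnicodeEncodeError:
        stream.write(msg + '\n')
        stream.flush()
        return False

def fake_console(encoding):
    """Stand-in for a console: an in-memory text stream with a fixed encoding."""
    return io.TextIOWrapper(io.BytesIO(), encoding=encoding, newline='\n')

for name in ('utf-8', 'cp850'):
    console = fake_console(name)
    # Show the result and the raw bytes that reached the fake console
    print(name, can_write_unicode(console), console.buffer.getvalue())
```

The cp850 stream should report False and contain only the plain message, which is the behaviour hoped for in cmd.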