test text
https://itch.io/t/635303/test-text
Tue, 17 Dec 2019 11:41:53 GMT / Tue, 02 Mar 2021 02:29:59 GMT

Another person posted a question asking for a text containing the supported characters and was told to see the referenced Unicode blocks topic. Since such a text file can be useful for testing and comparing, I would just like to point out that one can easily generate a text file containing all Unicode code points (or at least all BMP code points) procedurally. The following Python 3 code writes all code points from 32 to 1114111 (0x10FFFF), except the surrogates U+D800 to U+DFFF, which cannot be encoded, into a UTF-8 encoded text file. Just replace r'C:\a.txt' with a location where the user running the script has write permissions.


with open(r'C:\a.txt', 'wb') as f:
  for i in range(32, 0x110000): # 0x110000 (1114112) is the exclusive upper limit of chr(); beyond it, chr() raises "ValueError: chr() arg not in range(0x110000)"
    char = chr(i) # convert the integer Unicode code point to a one-character string
    try:
      f.write(char.encode('utf-8'))
    except UnicodeEncodeError:
      pass # skip surrogates (U+D800 to U+DFFF): "'utf-8' codec can't encode character '\ud800' in position 0: surrogates not allowed"
    if i % 128 == 0:
      f.write(b'\n') # write a line break every 128 characters to allow text editors to parse the file more easily
    if i % 1024 == 0:
      f.write(b'\n') # write an extra line break (a blank line) every 1024 characters to allow humans to visually parse the text more easily
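
For anyone who only needs the BMP code points mentioned above, here is a minimal sketch of the same idea that builds the text in memory and skips the surrogates explicitly instead of catching the encoding error. The function name bmp_text and its parameters are hypothetical names chosen for this example; they are not part of any library.

```python
def bmp_text(start=0x20, stop=0x10000, per_line=128):
    """Return all code points in [start, stop) as one string, skipping surrogates.

    bmp_text, start, stop and per_line are illustrative names (assumptions).
    """
    parts = []
    for cp in range(start, stop):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates cannot be encoded as UTF-8, so leave them out
        parts.append(chr(cp))
        if cp % per_line == 0:
            parts.append('\n')  # line break after every code point divisible by per_line
    return ''.join(parts)


text = bmp_text()            # BMP only: U+0020 to U+FFFF without surrogates
data = text.encode('utf-8')  # bytes ready to be written to a file opened in 'wb' mode
```

Passing stop=0x110000 should cover the same range as the script above. The resulting bytes can be written with f.write(data) to any file the user has write permissions for.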
