Phil Hassey - game dev blog

tinypy 64k – bootstrapped!

So.. hey, it’s done. Basically. tinypy is a 64k implementation of a subset of python. It can bootstrap itself into a single executable that can compile python files to bytecode and run them on a VM. Thanks to everyone who gave feedback thus far on this project. Double thanks to allefant who listened to me blab about it endlessly on irc for the last month 🙂

I found all the stuff people told me about for parsing was a huge help. This article http://javascript.crockford.com/tdop/tdop.html was what I ended up basing it on. It’s almost like magic, but it makes for a really simple, easy-to-follow parser. The VM is based on stuff I read about the lua VM.
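
To give a feel for the top-down operator precedence idea from that article, here is a minimal sketch in plain Python. It is not tinypy's parser: the token format, the `BINDING_POWER` table, and the `Parser` / `parse` names are all made up for illustration, and it only handles numbers, `+`, `-`, `*`, `/`, unary minus, and parentheses.

```python
import re

# Hypothetical binding powers for this sketch: higher binds tighter.
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20}

def tokenize(source):
    """Split source into number and operator tokens, ending with 'end'.

    Characters that match nothing (such as spaces) are silently skipped.
    """
    tokens = re.findall(r"\d+\.?\d*|[-+*/()]", source)
    return tokens + ["end"]

class Parser:
    def __init__(self, source):
        self.tokens = tokenize(source)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def parse_expression(self, right_binding_power=0):
        """Keep consuming operators that bind tighter than the caller's."""
        left = self.parse_prefix(self.advance())
        while BINDING_POWER.get(self.peek(), 0) > right_binding_power:
            op = self.advance()
            # Passing the operator's own power makes it left-associative.
            right = self.parse_expression(BINDING_POWER[op])
            left = (op, left, right)
        return left

    def parse_prefix(self, token):
        """Handle a token that starts an expression."""
        if token == "(":
            inner = self.parse_expression(0)
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return inner
        if token == "-":
            # Unary minus binds tighter than any binary operator here.
            return ("neg", self.parse_expression(30))
        if token in ("end", ")", "+", "*", "/"):
            raise SyntaxError("unexpected token %r" % token)
        return float(token)

def parse(source):
    parser = Parser(source)
    tree = parser.parse_expression()
    if parser.peek() != "end":
        raise SyntaxError("unexpected token %r" % parser.peek())
    return tree
```

Calling `parse("1 + 2 * 3")` builds a nested tuple with the multiplication underneath the addition. The whole precedence scheme lives in one table and one loop, which is most of why this style of parser stays small.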

So what’s next? I need to let it sit around for a week and then I’ll do a “release” I guess. I’ve gotta pick a license or something for it (probably MIT? I’m open for suggestions.) I’m also mulling over possible names. Maybe “tinypy” .. or “wedge” .. or “cupcake” .. or “garter”. Hmmn.

Anyway – I’m sure I’ll be tweaking it a bit over time, but I’m pretty happy to have it to this point now. I probably won’t do much with it until I try making a game with it. Right now it depends on libgc for garbage collection. If someone clever out there can implement a garbage collector for it that works in like 2-4k, that’d be better. My brain is pretty spent.

For the brave: svn://www.imitationpickles.org/tinypy/trunk or tinypy.zip. The following is only tested under linux, but I bet it would work in any bash environment. Maybe.

$ python boot.py

Will run the 3-phase testing + bootstrapping process. It will first use python to generate the .tpc files for the compiler. Second phase uses the VM to generate those same files. Third phase uses the bootstrapped tinypy executable to “re-bootstrap” tinypy to get the final version. The -nopos option strips out debug info from the .tpc files.

$ ./tinypy julia.py

Run the julia demo without dependence on *anything* but the tinypy executable.

$ ./tinypy your_own_code.py

Will do something! Probably print out a pretty traceback about how you tried to use a python feature / module that tinypy doesn’t support 🙂 “batteries not included”

47 Responses to “tinypy 64k – bootstrapped!”

  1. Anonymous Says:

    petite – pytite

  2. Ori Folger Says:

    Pequeño is Spanish for small, and it’s pronounced peh-KEHN-yo, so maybe Pyqueno or Pyquenio.

  3. Paolo Bonzini Says:

    A simple two-space copying collector should be easy to implement. For a 1meg heap, allocate two memory spaces of that size. One is from-space, the other is to-space. Allocate new objects into from-space.

    When it’s full, copy the roots into to-space contiguously, store a marker (e.g. all-zeros or all-ones) in the first word of each original root, and store where you moved it in the second word. Then walk to-space from the beginning to the end, looking at each word. Each pointer you find there still points into from-space. If the pointed-to word is the marker, rewrite the pointer with the word just after it (i.e. the “where you moved them” part above). Otherwise, copy the object to to-space and mark it the same way you did with the roots. When you have scanned all of to-space, discard from-space and swap the two spaces.

    See “Cheney’s algorithm” on Wikipedia.
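
The copying scheme described in this comment can be sketched at the object level in Python. This is only an illustration of the idea, not code from tinypy or from the commenter: the `Obj` class and `collect` function are hypothetical, Python objects stand in for raw memory words, and the `forward` attribute plays the role of both the marker and the "where you moved them" word.

```python
class Obj:
    """A heap object: a payload plus a list of references to other objects."""
    def __init__(self, payload, refs=None):
        self.payload = payload
        self.refs = list(refs or [])
        self.forward = None  # set once the object has been copied

def collect(roots):
    """Copy everything reachable from roots into a fresh to-space list."""
    to_space = []

    def copy(obj):
        # Already moved: follow the forwarding pointer instead of copying again.
        if obj.forward is None:
            obj.forward = Obj(obj.payload, obj.refs)
            to_space.append(obj.forward)
        return obj.forward

    new_roots = [copy(root) for root in roots]

    # Scan to-space front to back. Copies made during the scan are appended
    # to the end, so to-space itself serves as the work queue.
    scan = 0
    while scan < len(to_space):
        obj = to_space[scan]
        obj.refs = [copy(ref) for ref in obj.refs]
        scan += 1

    return new_roots, to_space
```

Anything not reachable from the roots is never copied, so dropping the old space frees it in one go. A real collector would work on flat memory and need a per-type way to find the pointers inside each object.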

  4. Paolo Bonzini Says:

    Looking at the code, the marker could be a TP_COPIED node. It shouldn’t be more than 200 lines of code.

  5. Nymius Says:

    You are following the steps of pyvm. That’s good because pyvm doesn’t seem to have any releases since 2006. I’m wondering what your plans are wrt python 3000?

    I’ve just downloaded tinypy.zip and am now going to try it…

  6. tonetheman Says:

    worm – that is my suggestion. good work!!!!

  7. hylje Says:

    greetings!

    could you provide us a list of (tested-)working build dependencies? a default (with build-essential) ubuntu feisty brings me a bunch of cast errors after running `python boot.py`:
    warning: initialization makes pointer from integer without a cast
    ending up with an AssertionError.

    this looks intriguing at the least, though!

  8. Poromenos Says:

    It doesn’t work for me, you apparently have hex values in python which tinypy can’t handle. When I change them to dec it compiles, but it then tries to open /dev/fb0, which doesn’t exist in my Ubuntu installation.

    Impressive job otherwise. I am definitely going to study it.

  9. Poromenos Says:

    The framebuffer thing was (obviously, in hindsight) because I was running it on a headless server. The hex thing still stands AFAIK, though.

  10. Nelson Castillo Says:

    Installed libgc-dev in Debian. And it worked just fine 🙂 This was a good idea.

  11. Antonio Ognio Says:

    I’ve just tried to compile it on Ubuntu without help. I installed libgc-dev and have enough of a toolchain installed to compile GTK+ stuff and daemons like Apache and others, so please provide a list of dependencies for the build process and more people will happily take a look.

  12. philhassey Says:

    Deps are python (for bootstrapping), libgc, libsdl, gcc. I’ve only tried it on my one linux box. I guess I should put up a bug tracker or something sometime, but for now if you want, please post the error message in the comments.

    hylje – can you post the error here?

    Poromenos – what hex values? Can you post the error here?

    Paolo – thanks for the tips, I’ll read up on that some more.

    Nymius – well, the plans are to speed up the dict implementation, make bug fixes, and maybe add my own GC. Other than that, it’s done. With a 64k limit, I really can’t add anything.

  13. John M Says:

    with my Python (2.5.1c1) the tests complained. This is a hack-patch that fixes it, but I still get a mysterious exception later:

    $ svn diff dump2vm.py
    Index: dump2vm.py
    ===================================================================
    --- dump2vm.py (revision 370)
    +++ dump2vm.py (working copy)
    @@ -59,7 +59,10 @@
     def do_number(t,r=None):
         r = get_tmp(r)
         code("NUMBER",r,0,0)
    -    write(fpack(float(t['value'])))
    +    val = t['value']
    +    if type(val) is str and val.startswith('0x'):
    +        val = int(val[2:], 16)
    +    write(fpack(float(val)))
         return r

     def get_tag():
    @@ -555,13 +558,15 @@
     'string':do_string,'get':do_get, 'call':do_call, 'reg':do_reg,
     }

  14. Michael Foord Says:

    worm, pytite and garter are all good names…

    My preference for license is usually BSD (revised or new). It allows people to use/modify/redistribute/relicense, so long as they keep the original license in place with their distribution. (MIT removes this restriction I think and is slightly ‘freer’.)

  15. philhassey Says:

    John – ah, thanks. I’m using python 2.4 and the float() builtin accepts hex strings. Thanks for the patch, I’ll work something like that in for my next release and test it against python 2.5.
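
The portability problem here comes down to relying on float() to understand a 0x prefix, which is not something every Python version does. A minimal sketch of a helper that handles both cases explicitly, in the spirit of John's patch; the name `parse_number` is hypothetical and is not part of tinypy:

```python
def parse_number(text):
    """Convert a numeric literal to a float, handling a 0x/0X prefix explicitly."""
    if text[:2].lower() == "0x":
        # int() with base 16 accepts the "0x" prefix itself.
        return float(int(text, 16))
    return float(text)
```

With this, `parse_number("0xff")` and `parse_number("255")` go through different branches but both produce a float, regardless of how the host Python's float() treats hex strings.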

  16. philhassey Says:

    John – I’ve patched svn so that it works with python 2.5 now. Please tell me if the hex issue persists for you. Thanks for tracking that down!

  17. rahul Says:

    Nice work. For license I suggest the python license. It will keep things simple and everyone who works with python will know what to expect from the license.

  18. philhassey Says:

    I’ve updated the .zip to include the python 2.5 patch.

  19. Ram Says:

    My suggestion:
    pygmie or pygmy
    interesting project:)
    All the best

  20. Tim Lesher Says:

    Very nice work. At a previous job, we ended up choosing Lua over Python for an embedded device strictly because of size issues. This would have made things interesting.

    Quick note, though: tests.py is failing for me, on an x86_64 machine running Python 2.5.1 on Ubuntu 7.10:

    ./tinypy tests.py

    File "?", line 0, in ?

    File "tests.py", line 378, in ?
    ,"OK")
    File "tests.py", line 169, in t_render
    if exact: assert(res == ex)

    Exception:
    assert failed

  21. manuelg Says:

    Awesome!

  22. philhassey Says:

    Tim – the test sent its output to tmp.txt – can you paste that to me? That will show where the error came from.

  23. Manthrax Says:

    Very cool man. As a bytecode language snob, this kind of project really impresses me. A 64k runtime can make this suitable for a LOAD of cool embedded apps. Hell you could port it to BREW and write cellphone apps in python. I would also like to add my vote for “cupcake” as a name.

  24. Michael Foord Says:

    @rahul

    The Python license is specifically *not* suitable for use with other projects. Largely because Python has a rich and varied history and the license has gained all sorts of cruft as a result.

    Both the BSD and MIT licenses are simple, widely recognised and widely understood.

  25. Nicolás Sanguinetti Says:

    Earthworm should be a nice name.
    Or “Jim” 😉

  26. Tim Lesher Says:

    tmp.txt:

    File "tmp1.tpc", line 5, in ?
    C("OK").print()
    File "tmp1.tpc", line 4, in C_print
    def print(self): print(self.data)

    Exception:
    tp_get: KeyError: data

  27. philhassey Says:

    Tim – this appears to have something to do with the -O3 option I’m passing to gcc in the final bootstrapping phase. Not quite sure what the deal is, but I’ll poke around a bit and see what I can figure out. (Anyone got any tips on debugging stuff like that? It works fine when I don’t use any -O options.)

  28. Rene Dudfield Says:

    -O3 isn’t safe… and unfortunately some silly people use it for python compilation. Which stuffs up python extensions, and causes weird python bugs.

    Best to use -O2 or lower. I think you can debug with -O2 these days too?

    pygame has some distutils hacks to change gcc options if you want to look there.

  29. philhassey Says:

    Okay – svn has been updated to fix the odd-ball issue. I’ve got this _vm_raise() function that uses a longjmp, and the optimizer, not seeing a return after that call, was doing something screwy. Anyway, I adjusted my vm_raise macro to include a return right after the call to _vm_raise(), so now the optimizer knows not to do .. whatever it was doing. I also switched back to -O2 and enabled -Wall, both of which helped.

  30. Jeffrey A. Edlund Says:

    Pretty cool!

    I suggest the Boost License http://boost.org/more/license_info.html

    It’s like MIT and BSD, but I find that it’s easier to apply to new projects.

    I particularly like the short form of the license:

    // Copyright Joe Coder 2004 – 2006.
    // Distributed under the Boost Software License, Version 1.0.
    // (See accompanying file LICENSE_1_0.txt or copy at
    // http://www.boost.org/LICENSE_1_0.txt)

  31. Curtis Monroe Says:

    That SVN update didn’t fix the problem for me. (Linux 64bit too)

    gcc -Wall -O2 tinypy.c `sdl-config --cflags --libs` -lm -lgc -o tinypy
    In file included from tinypy.c:2:
    vm.c: In function ‘vm_signal’:
    vm.c:329: warning: implicit declaration of function ‘strsignal’
    ./tinypy tests.py

    File "?", line 0, in ?

    File "tests.py", line 385, in ?
    ,"OK")
    File "tests.py", line 178, in t_render
    if exact: assert(res == ex)

    Exception:
    assert failed

    [curtis@XXXXX trunk]$ cat tmp.txt
    Exception:
    tp_get: KeyError: data

  32. philhassey Says:

    Curtis – if you edit boot.py:82 and remove the -O2 does it work?

  33. Curtis Monroe Says:

    Removing -O2 succeeds.
    Replacing -O2 with -O1 fails.
    Replacing -O2 with -O0 succeeds.

  34. philhassey Says:

    Hmn. I’m mystified. If you can track down the issue I’d sure appreciate it. For what it’s worth, when the problem was happening to me tinypy would emit different bytecode depending on if it were compiled with -O2 or not. But it isn’t happening here anymore.

  35. Curtis Monroe Says:

    expanding the -O1 option (specifying the optimizations individually) succeeds

    gcc -Wall -fdefer-pop -fguess-branch-probability -fcprop-registers -floop-optimize -fif-conversion -fif-conversion2 -ftree-ccp -ftree-dce -ftree-dominator-opts -ftree-dse -ftree-ter -ftree-lrs -ftree-sra -ftree-copyrename -ftree-fre -ftree-ch -fmerge-constants tinypy.c `sdl-config --cflags --libs` -lm -lgc -o tinypy

    Maybe it’s a compiler bug?
    gcc version 4.0.0 20050519 (Red Hat 4.0.0-8)

  36. philhassey Says:

    Maybe .. I’ve got:
    gcc (GCC) 4.1.1 20060724 (prerelease) (4.1.1-3mdk)
    and I had the same issue for a while before I did some cleanup on the code.

  37. philhassey Says:

    Curtis – can you give it another svn update and try again? I did a bit more tweaking on stuff which looked suspicious. Thanks!

  38. Sp3w Says:

    […] Linkage 2007.02.012008-02-01 09:20:45 by mike in links (no comments) permalink Tinypy 64k […]

  39. Curtis Monroe Says:

    latest svn trunk:

    gcc -Wall -O2 tinypy.c `sdl-config --cflags --libs` -lm -lgc -o tinypy
    In file included from tp.c:6,
    from vm.c:1,
    from tinypy.c:2:
    builtins.c: In function ‘tp_round’:
    builtins.c:180: warning: implicit declaration of function ‘roundf’
    builtins.c:180: warning: incompatible implicit declaration of built-in function ‘roundf’
    In file included from tinypy.c:2:
    vm.c: In function ‘vm_signal’:
    vm.c:326: warning: implicit declaration of function ‘strsignal’
    ./tinypy tests.py

    File "?", line 0, in ?

    File "tests.py", line 385, in ?
    ,"OK")
    File "tests.py", line 178, in t_render
    if exact: assert(res == ex)

    Exception:
    assert failed

    $ cat tmp.txt

    File "tmp1.tpc", line 5, in ?
    C("OK").print()
    File "tmp1.tpc", line 4, in C_print
    def print(self): print(self.data)

    Exception:
    tp_get: KeyError: data

  40. MIki Says:

    Maybe you can use tinyscheme’s garbage collector (mark & sweep IIRC).

    See http://tinyscheme.sourceforge.net/home.html

  41. Rene Dudfield Says:

    Does hello world work?
    print 'hello world'

    Might be a better demo for those of us running headless computers 🙂 ooh… or the ascii SDL driver.

    Unfortunately your code breaks with the SDL ascii art backend (at least on my system).

    export SDL_VIDEODRIVER=aalib

    320 240
    [New Thread -1213109872 (LWP 8951)]

    Program received signal SIGSEGV, Segmentation fault.
    [Switching to Thread -1213109872 (LWP 8951)]
    0x08048ffd in real_set_pixel ()
    (gdb) where
    #0 0x08048ffd in real_set_pixel ()
    #1 0x0804d625 in set_pixel ()
    #2 0x08048f5c in _dcall ()
    #3 0x0804c0ba in _tcall ()
    #4 0x0804f5a7 in _vm_call ()
    #5 0x08051ab5 in tp_step ()
    #6 0x08053158 in vm_run_1 ()
    #7 0x080531e9 in vm_call ()
    #8 0x080538e2 in main ()

  42. Rene Dudfield Says:

    ah, your tinypy seems to be py3k compatible already 😉

    Since print 'hello world' doesn’t work, but print ('hello world') does!

    Sweet.

  43. philhassey Says:

    Rene – I totally designed this with the future in mind. Actually – since py3k strips out some of the “crufty” syntax of python, I figured I might as well go that way – makes things easier for me.

  44. Zooko Says:

    Phil: Way to go! This looks very cool.

    Don’t you want “-Os” instead of “-O2”?

    Here’s a permissive licence that I created by compressing the MIT licence:

    “Permission is hereby granted to any person obtaining a copy of this work to deal in this work without restriction (including the rights to use, modify, distribute, sublicense, and/or sell copies).”

    https://zooko.com/simple_permissive_licence.html

  45. kspu Says:

    About simple permissive license, check , linked from .

  46. reinforce Says:

    Hi, great work!

    Works fine here with FreeBSD 7.1 and Python 2.5.2

    Vote for BSD Licence!

    Question: What disadvantages does this python implementation have?

  47. Project naming suggestions? | keyongtech Says:

    […] on a small python derivative called > tinypy. Sorry, I meant to include a relevant link: http://www.philhassey.com/blog/2008/…-bootstrapped/ […]