Phil Hassey » tinypy

Archive for the 'tinypy' Category

64k tinypy – garbage collection is tough

Tuesday, February 5th, 2008

For giggles I tried to write a garbage collector to replace libgc in tinypy. I tried doing a tri-color incremental collector. I couldn’t get it to work, so I ended up switching it to be more of a tri-color mark and sweep collector.

The result of my mark and sweep collector was a 40% reduction in speed. I’m guessing ol’ libgc was designed with a bit more cleverness than mine 😉 Anyway, for now I’ve moved my “tgc” development into a branch of tinypy, svn://www.imitationpickles.org/tinypy/branches/tgc, if you want to see it in action.
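For anyone who hasn't met the term, here is roughly what tri-color mark and sweep looks like in plain Python. This is a minimal sketch of the general technique, not the tgc code: the `Obj` class, the `collect` function and the `heap` / `roots` lists are all made up for illustration.

```python
# Tri-color mark and sweep, sketched on a toy object graph.
# WHITE = not yet seen, GREY = seen but children not scanned, BLACK = fully scanned.
WHITE, GREY, BLACK = 0, 1, 2


class Obj:
    """Hypothetical heap object; 'refs' holds outgoing references."""

    def __init__(self, name):
        self.name = name
        self.refs = []
        self.color = WHITE


def collect(heap, roots):
    """Mark everything reachable from roots, then return the survivors."""
    # Start of a cycle: everything is presumed garbage.
    for obj in heap:
        obj.color = WHITE

    # Roots are reachable by definition, so they start out grey.
    grey = []
    for root in roots:
        root.color = GREY
        grey.append(root)

    # Mark phase: scan grey objects until none are left.
    while grey:
        obj = grey.pop()
        for child in obj.refs:
            if child.color == WHITE:
                child.color = GREY
                grey.append(child)
        obj.color = BLACK

    # Sweep phase: anything still white was never reached.
    return [obj for obj in heap if obj.color == BLACK]


a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.refs.append(b)
b.refs.append(a)  # a cycle, which reference counting alone would not free
c.refs.append(d)  # c and d are not reachable from any root
heap = [a, b, c, d]

survivors = collect(heap, roots=[a])
print([o.name for o in survivors])  # ['a', 'b']
```

An incremental collector would do the grey-list scanning a little at a time between VM instructions, which is where things get tricky: the program can change references while the collector is only halfway through marking.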

Stuff to read: Memory Management Reference and libgc.

Posted in crazy, development, python, tinypy | 1 Comment »

tinypy 64k – bootstrapped!

Thursday, January 31st, 2008

So.. hey, it’s done. Basically. tinypy is a 64k implementation of a subset of python. It can bootstrap itself into a single executable that can compile python files to bytecode and run them on a VM. Thanks to everyone who gave feedback thus far on this project. Double thanks to allefant who listened to me blab about it endlessly on irc for the last month 🙂

I found all the stuff people told me about for parsing was a huge help. This article http://javascript.crockford.com/tdop/tdop.html was what I ended up basing it on. It’s almost like magic, but it makes for a really simple, easy-to-follow parser. The VM is based on stuff I read about the lua VM.
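To give a feel for the top-down operator precedence approach from that article, here is a minimal sketch for arithmetic expressions. It is not tinypy's parser: the `tokenize`, `Parser` and `parse` names and the binding-power numbers are assumptions picked for the example.

```python
import re

# One token per match: an integer or a single operator / parenthesis.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")

# Higher binding power means the operator grabs its operands more tightly.
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20}


def tokenize(src):
    src = src.rstrip()
    pos, tokens = 0, []
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError("bad character at position %d" % pos)
        tokens.append(m.group(1))
        pos = m.end()
    tokens.append("(end)")
    return tokens


class Parser:
    def __init__(self, src):
        self.tokens = tokenize(src)
        self.i = 0

    def peek(self):
        return self.tokens[self.i]

    def advance(self):
        tok = self.tokens[self.i]
        self.i += 1
        return tok

    def expression(self, rbp=0):
        # Parse a prefix item, then keep absorbing infix operators
        # for as long as they bind tighter than our caller does.
        left = self.prefix(self.advance())
        while BINDING_POWER.get(self.peek(), 0) > rbp:
            op = self.advance()
            right = self.expression(BINDING_POWER[op])
            left = (op, left, right)
        return left

    def prefix(self, tok):
        if tok.isdigit():
            return int(tok)
        if tok == "-":
            return ("neg", self.expression(30))
        if tok == "(":
            inner = self.expression(0)
            if self.advance() != ")":
                raise SyntaxError("expected ')' at token %d" % (self.i - 1))
            return inner
        raise SyntaxError("unexpected %r at token %d" % (tok, self.i - 1))


def parse(src):
    parser = Parser(src)
    tree = parser.expression()
    if parser.peek() != "(end)":
        raise SyntaxError("unexpected %r at token %d" % (parser.peek(), parser.i))
    return tree


print(parse("1 + 2 * 3"))    # ('+', 1, ('*', 2, 3))
print(parse("(1 + 2) * 3"))  # ('*', ('+', 1, 2), 3)
```

The whole parser is one loop plus a table of binding powers, and because it always knows which token it is looking at, it can say where a syntax error happened.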

So what’s next? I need to let it sit around for a week and then I’ll do a “release” I guess. I’ve gotta pick a license or something for it (probably MIT? I’m open for suggestions.) I’m also mulling over possible names. Maybe “tinypy” .. or “wedge” .. or “cupcake” .. or “garter”. Hmmn.

Anyway – I’m sure I’ll be tweaking it a bit over time, but I’m pretty happy to have it to this point now. I probably won’t do much with it until I try making a game with it. Right now it depends on libgc for garbage collection. If someone clever out there can implement a garbage collector for it that works in like 2-4k, that’d be better. My brain is pretty spent.

For the brave: svn://www.imitationpickles.org/tinypy/trunk or tinypy.zip. The following is only tested under linux, but I bet it would work in any bash environment. Maybe.

$ python boot.py

Will run the 3-phase testing + bootstrapping process. It will first use python to generate the .tpc files for the compiler. Second phase uses the VM to generate those same files. Third phase uses the bootstrapped tinypy executable to “re-bootstrap” tinypy to get the final version. The -nopos option strips out debug info from the .tpc files.

$ ./tinypy julia.py

Run the julia demo without dependence on *anything* but the tinypy executable.

$ ./tinypy your_own_code.py

Will do something! Probably print out a pretty traceback about how you tried to use a python feature / module that tinypy doesn’t support 🙂 “batteries not included”

Posted in crazy, development, languages, python, tinypy | 47 Comments »

64k tinypy – parsing woes

Tuesday, January 22nd, 2008

I’ve been out of town for the last few days, so I haven’t done much to tinypy. However, I have realized that my current “dumbparse” module is pretty dumb. Two things are wrong with it:

It’s slow. Noticeably so with 8k modules, etc.
It’s dumb. It doesn’t give “useful” parsing errors.

I’ve read up a bit on LR, LL, top-down, and a whole bunch of kinds of parsers and it all sounds a bit magical to me. I’d be glad for some suggestions or ideas on where I could go from here. My requirements are:

must be pretty compact (I want my parser to be < 12k in size)
must not get exponentially slower on larger files (but it doesn’t have to be blindingly fast, just decent)
must be able to notice where a syntax error is (indicated by token – I’ve got a good tokenizer already)
must parse “basic” python and be written in that as well (see my current files for examples)

Thanks! svn://www.imitationpickles.org/tinypy/trunk – for the brave

Posted in crazy, development, python, tinypy | 8 Comments »

64k tinypy – now with list comprehensions and fancy arguments

Wednesday, January 16th, 2008

Well, I got two of my favorite python features added in – list comprehensions [x*x for x in range(1,5)] and fancy arguments test(1,2,a=3,*c,**d).

Adding list comprehensions was painless, took only a few minutes. The arguments change was rather difficult because I had to add more rules to the parser, change how the bytecode was outputted, and rework all the internal calling stuff in the VM. Now every function takes a single argument of type ‘params’ which can contain all the details of a call.

I’ve also added in better error handling. Errors print out a backtrace of the function calls and lines of code. This is making debugging everything much easier.

I’m also finding that the more features I add, well, the slower things get. I’m also coming up against my 64k limit, so I’ve moved my tests out of the core code into a separate tests.py file to ensure that I don’t “trim down the tests” to give myself more space 🙂 The tests I’ve made have been invaluable in making it possible to do big changes fairly quickly.

Next on the agenda – either bootstrapping, or making a game with it 🙂

For the brave: svn://www.imitationpickles.org/tinypy/trunk ; python tests.py ; ./run_julia_o3 (not very fast)

Posted in crazy, development, languages, python, tinypy | 12 Comments »

64k tinypy – now with VM included

Thursday, January 10th, 2008

I’ve managed to build a simple VM into tinypy – modeled after the lua VM. It’s register based and “stackless” in the “it doesn’t use the C-stack” sense of the word. (Not in the “it does anything fancy like stackless python does” sense.)
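For a rough picture of what "register based" means, here is a toy VM in Python. The opcode names, the four-field instruction layout and the `run` function are invented for this sketch and are not tinypy's bytecode format.

```python
# Hypothetical opcodes for a toy register machine.
LOADK, ADD, LT, JMPF, JMP, RET = range(6)


def run(code, nregs=8):
    """Execute a list of (op, a, b, c) tuples and return the RET value."""
    regs = [0] * nregs
    pc = 0
    # One flat loop: instructions read and write numbered registers
    # instead of pushing and popping an operand stack.
    while True:
        op, a, b, c = code[pc]
        pc += 1
        if op == LOADK:      # regs[a] = constant b
            regs[a] = b
        elif op == ADD:      # regs[a] = regs[b] + regs[c]
            regs[a] = regs[b] + regs[c]
        elif op == LT:       # regs[a] = regs[b] < regs[c]
            regs[a] = regs[b] < regs[c]
        elif op == JMPF:     # jump to b if regs[a] is false
            if not regs[a]:
                pc = b
        elif op == JMP:      # unconditional jump to a
            pc = a
        elif op == RET:      # stop and hand back regs[a]
            return regs[a]
        else:
            raise ValueError("unknown opcode %r" % op)


# total = 0; i = 1; while i < 11: total += i; i += 1
program = [
    (LOADK, 0, 0, 0),   # 0: r0 = 0   (total)
    (LOADK, 1, 1, 0),   # 1: r1 = 1   (i)
    (LOADK, 2, 11, 0),  # 2: r2 = 11  (limit)
    (LOADK, 3, 1, 0),   # 3: r3 = 1   (step)
    (LT, 4, 1, 2),      # 4: r4 = r1 < r2
    (JMPF, 4, 9, 0),    # 5: if not r4, jump to 9
    (ADD, 0, 0, 1),     # 6: r0 = r0 + r1
    (ADD, 1, 1, 3),     # 7: r1 = r1 + r3
    (JMP, 4, 0, 0),     # 8: back to the comparison
    (RET, 0, 0, 0),     # 9: return r0
]

print(run(program))  # 55
```

This sketch has no function calls. A VM that avoids the C stack would presumably handle them by keeping its own list of frames inside the same loop rather than recursing into `run`.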

I’ve just reached the 64k mark, so that means anything I add into the code will require me to clean up other code to save space. I’ve already done a bit of that with good results. The one rule I follow in shortening code is that the code must retain readability, if not improve it.

Garbage collection is rather complicated, so I think I’ve decided to continue to leave that to libgc. I read some literature on the matter, and it sounds like I could write one, but it probably wouldn’t be very good. I’ve put it at the bottom of my TODO.txt list in the section labeled “Probably not going to happen”.

At present tinypy supports basic python code with functions and loops and lists and dicts and classes. At least, in some basic form. I’m looking towards adding in list comprehensions and *args **nargs to tinypy as those are two of my favorite python features. After that I’ve got a handful of functions I want to write and then some packaging work.

For the curious svn://www.imitationpickles.org/tinypy/trunk – ./run_julia_o3 to see it all happen. It depends on python to compile the bytecode (we’re not bootstrapped yet), libgc, and SDL.

Posted in crazy, development, gamedev, languages, python, tinypy | 7 Comments »

64k tinypy – vm or no ..

Tuesday, January 1st, 2008

Well, things continue to come along with tinypy. I’ve been experimenting with a lot of different things with this. Some things I’ve tried:

– Assumed all infix operations were for numbers – was able to get quite a bit of speed at the sacrifice of things being “obvious”. I’ve disabled this for now, since I figure if you really want speed, you’d be using C anyways, not “tinypy”.

– Tried switching from passing around 16 byte structures to pointers to those structures. This didn’t work so well, because given my simple implementation it meant that any time any operation happened, another 16 bytes had to be allocated to store my floats. Not so hot for rendering fractals 😉 This caused a 10x slow-down. I reverted that out, but I copied that stuff to a branch for reference.

– I learned more about how libgc works. Basically, you need to use the GC_MALLOC and GC_REALLOC functions instead of the libc ones. libgc takes care of the rest. Also, if you integrate with libsdl (or some such library), you’ll need to do the memory management for it. There is hope – gc.h includes a function called GC_REGISTER_FINALIZER which makes it possible to trigger finalizers when an object is about to be freed. I’ve added some code to do this to tp.c, but haven’t tested it yet. That’ll get more testing when I start using libsdl for a game.

At present I’m debating between two options — leaving tinypy as-is and just cleaning it up, thus using it as a C code generator. Or changing it into a bytecode generator and writing a simple VM to run the bytecode. Advantages and disadvantages as I see them:

C code generator:

Quite simple and reasonably fast. Optimizations and goodies come from the compiler.
Really swell “FFI” (all functions are accessed directly and available as functions in C. Really, this means, no notable FFI at all.)
No need to do much more than tweaking stuff, etc.
No “eval” unless I use tinycc. Even then, no “safety”.

Bytecode + VM:

I’d be able to implement exceptions and tracebacks and stuff
I’d be able to build a “safe” environment for running scripts “eval”
Things would be a bit slower, but not much
FFI would be more complex (but I could probably work around that by auto-generating some goodies for people)

I’ll have to think about it for a bit .. I might just try making a fun little game with tinypy as-is, and work to keep the FFI really clean. And if I feel like upgrading tinypy, try to do it in such a way as to not break my “clean” FFI.

If you care to take a peek: svn://www.imitationpickles.org/tinypy/trunk – ./run_julia to see the julia demo. Must have libsdl and libgc installed. It’s sort of slow, but if you want to see it “fast” set JIT_DISABLE = False at the top of dumpout.py

Posted in crazy, development, languages, python, tinypy | 1 Comment »

64k “tinypy” keeps plugging along …

Friday, December 28th, 2007

So .. I had a hard time resisting working on my 64k version of “python”. I’ve been able to get quite a few features into it and I’ve gotten my julia demo up to near-C speed as well as tamed the crazy memory problems I was having.
I suppose the question I ponder is “why bother”? Well:

It’s fun 🙂
I’m learning the basics of parsing, code generation.
It might even be somewhat useful!

The first two don’t require much explanation. As for the third (usefulness?):

By keeping the codebase < 64k, it will be readable by mortals
By generating C code, it can build self-contained binaries easily
It has a really simple “FFI” which auto-generates many of the “FFI” wrappers for you
It’s sort of fast now (no promises for anything real)

It, of course, isn’t python, it just looks a bit like it. Notable differences are:

Access to members is like lua / javascript. x[“y”] and x.y mean the same thing.
Most infix operators only work with numbers. “x”+”y” won’t work. (Rationale: makes numerical math fast)
It’s missing (and will never have) a bunch of really nice features. Syntax checking is notably weak. Maybe I should scrap my parser, etc and just use python’s.
No exception handling. Incorrect use of anything will result in a seg fault.
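To make the first difference concrete, here is how the lua / javascript style of member access could be imitated in regular Python. The `Table` class is a hypothetical illustration of the behaviour, not how tinypy implements it.

```python
class Table(dict):
    """A dict whose keys can also be read and written as attributes."""

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails,
        # so dict methods like .keys() still work.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value


x = Table()
x.y = 1       # stored as x["y"]
x["z"] = 2    # readable as x.z
print(x["y"], x.z)  # 1 2
```

In real python the two notations are different things (item access versus attribute access), which is why this needs a special class at all.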

Notable similarities:

Language contains separate list and dict types. I thought about doing like lua / javascript / php and having a single type, but it just didn’t feel right.
It’s indented (duh)
Garbage collection via libgc

Well .. that’s about it. I expect before I’m done I’ll make a game with it, to see how it works in the real world. I’ve got a few more things on my TODO list to get done first. If you are brave, check out svn://www.imitationpickles.org/tinypy/trunk and run ./run_julia (linux) to see the julia demo.

Oh, and for all you “test first” folks, I’ve (more or less) been doing that. It’s made development considerably easier. See the bottom of “pylang.py” “dumbparse.py” and “dumpout.py” for all the testing fun.

Posted in crazy, development, gamedev, languages, python, tinypy | 3 Comments »

64k version of python is 4234x faster* than c-python!

Saturday, December 22nd, 2007

To satisfy some sick curiosity of mine, I decided to write a miniature version of python in less than 64k of code. The code includes a tokenizer (which works rather well), a parser (which sort of works), and a translator (which does the job). All of those bits are written in python. The last bit is the C library I wrote to handle all the grit. That part isn’t so good.

To sum up, my implementation does very little, and what it does, it does very slowly. Most of the time is spent in memory allocation (my best guess). I wish I had a julia screenshot to show you, but I don’t think I have the patience, nor the RAM, to provide you with the pleasure.

If you wish to witness the horrors, feel free to check out svn://www.imitationpickles.org/tinypy/trunk … Please don’t look too closely. I think if I really want to work on this sort of stuff my time would be better spent either:

Learning the python C API or ctypes so I can do stuff without SWIG. (I’ve liked SWIG, but all the magic makes me nervous.)
Contribute to shed-skin, and add some features I’d like to it
Contribute to PyPy-Rpython, and add some features I’d like to it
Wait for py3k, where hopefully someone will write some magic code that uses the swell function annotation features to make fast things happen.

Of my options, learning the C-Python API (or maybe ctypes) is probably the easiest answer – I’d get all the fun of python, and I’d get my C speed. Between working on shed-skin and PyPy — my feeling is that I really wish shed-skin generated C code or that PyPy-Rpython wasn’t so huge and scary. All that said, I’ll probably just wait for py3k and hope 😉

*today is opposite day!