{"id":337,"date":"2005-09-14T16:17:00","date_gmt":"2005-09-14T23:17:00","guid":{"rendered":"https:\/\/www.reenigne.org\/blog\/bootstrapping-a-compiler-from-nothing\/"},"modified":"2008-05-23T11:42:02","modified_gmt":"2008-05-23T18:42:02","slug":"bootstrapping-a-compiler-from-nothing","status":"publish","type":"post","link":"https:\/\/www.reenigne.org\/blog\/bootstrapping-a-compiler-from-nothing\/","title":{"rendered":"Bootstrapping a compiler from nothing"},"content":{"rendered":"<p>Two posts today 'cos I missed yesterday due to being disorganized.<\/p>\n<p>Recently I've been working on bootstrapping a compiler from nothing. Just for fun. I know it's been <a href=\"http:\/\/web.archive.org\/web\/20061108010907\/http:\/\/www.rano.org\/bcompiler.html\">done before<\/a> but I wanted to learn about parsing and optimizing and how compilers are constructed.<\/p>\n<p>The first stage of my compiler is a pretty clever hack, even if I do say so myself. I didn't want to use any external tools to get my compiler started, but that left me with a problem - how do I generate the first executable file? Well, one way to generate an arbitrary file from Windows is to just use an \"echo\" statement in the Windows command prompt and redirect the output to a file. But that only works reliably for ASCII characters (and not even all of those). This poses a problem, because the opcodes for even simple \"MOV\" instructions are all non-ASCII characters. But it turns out that the \"constrained machine code\" for x86 consisting of only ASCII bytes is actually Turing-complete and can be used to do useful things (non-ASCII opcodes such as the one for \"INT\" can be constructed using self-modifying code). So I put together an ASCII program that takes two characters from the command line, combines them into a byte and outputs the resulting byte (which can then be redirected to a file). Calls to this program can be strung together to make (almost) arbitrary binary files, which can be used to compile more complex languages.<\/p>\n<p>In this way (13 iterations later) I have built up a simple but effective 2-pass 16-bit DOS assembler which outputs .com files. I have also written a recursive descent parser for simple infix expressions on unsigned 16-bit integers, and am working on writing a code generator which can output binary code for these expressions.<\/p>\n<p>Eventually I hope to evolve this into a fast and powerful language to rival C++. It will be a language which combines very low-level and very high-level concepts, and will therefore be an ideal language for writing compilers (such as itself) in. I could then use it for writing all sorts of other fun things - maybe I'll tackle an OS when I've finished the language. But for now I'm just having fun learning about things.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Two posts today 'cos I missed yesterday due to being disorganized. Recently I've been working on bootstrapping a compiler from nothing. Just for fun. I know it's been done before but I wanted to learn about parsing and optimizing and how compilers are constructed. The first stage of my compiler is a pretty clever hack, [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[27],"tags":[],"class_list":["post-337","post","type-post","status-publish","format-standard","hentry","category-language"],"_links":{"self":[{"href":"https:\/\/www.reenigne.org\/blog\/wp-json\/wp\/v2\/posts\/337","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.reenigne.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.reenigne.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.reenigne.org\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.reenigne.org\/blog\/wp-json\/wp\/v2\/comments?post=337"}],"version-history":[{"count":0,"href":"https:\/\/www.reenigne.org\/blog\/wp-json\/wp\/v2\/posts\/337\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.reenigne.org\/blog\/wp-json\/wp\/v2\/media?parent=337"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.reenigne.org\/blog\/wp-json\/wp\/v2\/categories?post=337"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.reenigne.org\/blog\/wp-json\/wp\/v2\/tags?post=337"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}