Extending the Framework

This library is a framework for defining the behavior of Lua code. Through its four fundamental types, you can change the entire behavior of the framework. Because default implementations are provided, you can pick and choose which portions of code to use and which to change. This section describes how to extend the framework to suit your needs. Reading the source code is highly recommended for learning how the default implementations work.

The Parser

The parser is the first step in creating Lua code. It is in charge of converting plain-text code into a DOM (Document Object Model) of the code. This work is separated into two parts: (1) the tokenizer, and (2) the parser. The tokenizer starts with some input (usually from a file) and converts it into a sequence of tokens. The parser takes these tokens and converts them into an IParseItem tree.

The Tokenizer

ITokenizer is the interface that defines how the tokenizer behaves. Tokenizer defines the default behavior for plain-text input. The default tokenizer takes a TextElementEnumerator as input, which allows for full Unicode support, and it automatically keeps track of position and line. It can be extended to change how it behaves. It defines the behavior of Peek, Read, and PushBack and, as needed, calls InternalRead, which does the actual reading from the input. InternalRead reads a single token from the input.
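The layering described above (Peek, Read, and PushBack sitting on top of one internal read routine) can be sketched as follows. This is a minimal illustration written in Python, not the library's C# code: the class name, the method names, and the token pattern are all assumptions made for the example.

```python
import re
from collections import deque

# Hypothetical token pattern: names, integers, or any single non-space symbol.
_TOKEN = re.compile(r"\s*([A-Za-z_]\w*|\d+|\S)")


class SketchTokenizer:
    """Peek/Read/PushBack layered over one internal read routine."""

    def __init__(self, text):
        self._text = text
        self._pos = 0
        self._pending = deque()  # tokens peeked or pushed back, next one first

    def _internal_read(self):
        # The only place that touches the raw input; returns None at the end.
        match = _TOKEN.match(self._text, self._pos)
        if match is None:
            return None
        self._pos = match.end()
        return match.group(1)

    def read(self):
        # Serve buffered tokens first, then fall back to the input.
        if self._pending:
            return self._pending.popleft()
        return self._internal_read()

    def peek(self):
        # Read one token ahead and keep it buffered for the next read().
        if not self._pending:
            token = self._internal_read()
            if token is None:
                return None
            self._pending.append(token)
        return self._pending[0]

    def push_back(self, token):
        # Return a token to the front so it is the next one read.
        self._pending.appendleft(token)


tokens = SketchTokenizer("local x = 10")
first = tokens.read()      # consumes the first token
ahead = tokens.peek()      # looks at the second token without consuming it
tokens.push_back(first)    # the first token is next again
rest = []
while (tok := tokens.read()) is not None:
    rest.append(tok)
```

Keeping all raw input access in one routine means a subclass only has to override that routine to change what counts as a token, while the buffering logic stays untouched.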

Several helper functions are defined, and each can be altered to change how it works. ReadElement and PeekElement read and peek a single letter (or two for surrogate pairs) and return null at the end of the stream. ReadElement also increases the position and Line as needed and converts any line endings into '\n'. ReadWhitespace reads any whitespace characters in the input using PeekElement and ReadElement. ReadComment reads a comment; it assumes that the first two letters ('--') have already been read and that the input is on the next letter. ReadString reads a string; the input must be on the first letter of the string, and 'depth' must specify the depth of the string: -1 for ', -2 for ", and a positive number for that number of equal signs (e.g. 3 means '[===['). _ReadNumber_ reads a number; you must pass the first letter of the number, and the input must be on the second one.
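To make the element-level helpers concrete, here is a small Python sketch of a reader that returns nothing at the end of input, folds line endings into '\n', and tracks line and position. The names and the 1-based numbering are assumptions for illustration. Python strings are already sequences of code points, so the surrogate-pair handling mentioned above does not appear here.

```python
class ElementReader:
    """Reads one character at a time, normalizing line endings to '\\n'."""

    def __init__(self, text):
        self._text = text
        self._index = 0
        self.line = 1       # 1-based line number (assumed convention)
        self.position = 1   # 1-based column within the line (assumed convention)

    def peek_element(self):
        # Look at the next element without consuming it; None at end of input.
        if self._index >= len(self._text):
            return None
        ch = self._text[self._index]
        return "\n" if ch == "\r" else ch

    def read_element(self):
        # Consume one element, folding '\r\n' and a lone '\r' into '\n'.
        if self._index >= len(self._text):
            return None
        ch = self._text[self._index]
        self._index += 1
        if ch == "\r":
            if self._index < len(self._text) and self._text[self._index] == "\n":
                self._index += 1
            ch = "\n"
        if ch == "\n":
            self.line += 1
            self.position = 1
        else:
            self.position += 1
        return ch

    def read_whitespace(self):
        # Skip whitespace using only peek_element and read_element.
        while (ch := self.peek_element()) is not None and ch.isspace():
            self.read_element()


reader = ElementReader("a\r\n  b")
first = reader.read_element()     # the letter before the line break
newline = reader.read_element()   # the two-character line ending, as one element
reader.read_whitespace()          # skips the indentation on the second line
where = (reader.line, reader.position)
last = reader.read_element()
end = reader.read_element()       # nothing left to read
```

Because the whitespace helper goes through the same two primitives, the line and position counters stay correct no matter which helper consumed the input.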

The Parser

The parser takes an _ITokenizer_ as input and converts it into an _IParseItem_ tree. The default _PlainParser_ can be extended to alter its behavior. What may be confusing is the different uses of the Token type. The tokens returned from the tokenizer are used to determine what to add to the output, but they are also used to store debug information. Each _IParseItem_ object has a Debug property that contains the token that defines it. For a compound item (e.g. _BlockItem_), this is a single token that covers the entire item. It is built by calling _Token.Append_, which appends the given token onto the end of the current token. If a function accepts a token object, that token represents the enclosing token, and anything that is read should be appended to the end of it. This is usually done by holding a global 'debug' Token that defines the item currently being read and, at the end of the function, calling append on the enclosing token with 'debug' as the argument.
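The append pattern is easier to see in a small example. The Python sketch below uses a hypothetical token shape (text plus starting line and position) and a made-up read function; it is not the library's Token type, and it keeps 'debug' as a local variable for brevity.

```python
from dataclasses import dataclass


@dataclass
class SketchToken:
    """Hypothetical token: the text it covers and where that text starts."""
    value: str = ""
    line: int = 0
    pos: int = 0

    def append(self, other):
        # Extend this token so it also covers `other`.
        if not self.value:
            # Nothing covered yet: adopt the start of the first piece.
            self.value, self.line, self.pos = other.value, other.line, other.pos
        else:
            self.value += " " + other.value


def read_return(tokens, enclosing):
    """Read 'return <name>' from the front of `tokens`."""
    debug = SketchToken()          # describes the item being read right now
    debug.append(tokens.pop(0))    # the 'return' keyword
    debug.append(tokens.pop(0))    # the returned name
    enclosing.append(debug)        # last step: grow the enclosing token
    return {"kind": "return", "debug": debug}


block_debug = SketchToken()        # token for the enclosing block
source = [
    SketchToken("return", 1, 1), SketchToken("x", 1, 8),
    SketchToken("return", 2, 1), SketchToken("y", 2, 8),
]
first = read_return(source, block_debug)
second = read_return(source, block_debug)
```

After both reads, each item's token covers only its own statement, while the block's token covers both statements and still starts where the first one started.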

There are several Read functions that read different items. _ReadBlock_ does most of the work and reads a block of code. This can be either the main section or the contents of another item, such as an 'if' block. It is important to note that you should not append the 'end' token to what is read by a block; the 'end' token is read and added by the enclosing function. _ReadPrefixExp_ reads a prefix-expression from the input. A prefix expression is a simple expression such as a literal or an indexer. It reads the entire indexer(s) as well as any function calls. This may call _ReadBlock_ for an embedded function or _ReadExp_ for function call arguments. _ReadExp_ reads a single expression from the input. If the 'precedence' argument is -1, then it is the initial call and it keeps reading until it gets to something that is not an expression; if it is not -1, then it keeps reading until it reaches the end or an expression with a higher precedence. _ReadFunction_ and _ReadTable_ read a function and a table respectively.
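A precedence argument of this kind is typical of precedence-climbing expression parsers. The Python sketch below shows that general technique with -1 as the initial call. The operator table, and the convention that a larger number binds more tightly, are assumptions for the example and may not match the numbering _ReadExp_ uses internally.

```python
# Hypothetical binary operators: a larger number binds more tightly.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def read_exp(tokens, precedence=-1):
    """Parse `tokens` (a list consumed from the front) into a nested tuple tree."""
    left = tokens.pop(0)  # a number or name; this sketch has no prefix-expressions
    while tokens and tokens[0] in PRECEDENCE:
        op_prec = PRECEDENCE[tokens[0]]
        # Stop when the operator does not bind tighter than the caller's
        # operator, so the caller takes `left` as its right operand.
        if op_prec <= precedence:
            break
        op = tokens.pop(0)
        right = read_exp(tokens, op_prec)
        left = (op, left, right)
    return left


tree = read_exp(["1", "+", "2", "*", "3", "-", "4"])

# The initial call stops at the first token that cannot continue an expression.
remaining = ["a", "+", "b", "then"]
condition = read_exp(remaining)
```

The first call groups the multiplication before the addition and keeps subtraction left-associative; the second leaves the 'then' token in the list for the enclosing read function to handle.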

The Compiler

The compiler is in charge of converting the _IParseItem_ tree into an _IMethod_ object that can be invoked later. The default compiler cannot be extended due to its complexity. If you want to write your own compiler, you need to do it from scratch. The compiler needs to perform two functions: (1) compile _IParseItem_ trees into _IMethod_ objects, and (2) create delegates that will invoke a given _IMethod_ object. If you want to use the default behavior for the second item, you can call the static method _CodeCompiler.CreateDelegate_ in your method. The default compiler uses System.Reflection.Emit to generate IL according to the _IParseItem_ tree. It uses an internal type, _CompilerVisitor_, to visit the parse tree and generate the code. There is also a _ChunkBuilder_ that helps generate the code.

There is a public type called _GetInfoVisitor_ that visits an _IParseItem_ tree and gathers information about it. It is highly suggested that you use this in your compilers. It resolves any 'goto' or 'break' statements. It also searches for captured variables and generates a _FuncDefItem.FunctionInfo_ object for each function, which helps to determine which variables are local, captured, and global. The array inside contains the local variables that are captured by nested functions; all others can be real local variables because they are only used in this function. There is also a field called _CapturesParrent_ that indicates whether the function captures variables from the parent function, and a field called _HasNested_ that indicates whether the function has nested functions.
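The kind of information described above can be computed with one recursive pass over the function tree. The Python sketch below is only an illustration of that idea: the node and result types are hypothetical and much simpler than _IParseItem_ and _FuncDefItem.FunctionInfo_, and it ignores 'goto' and 'break' resolution entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Func:
    """Toy function node: declared locals, names it uses, nested functions."""
    name: str
    declared: set
    uses: set
    nested: list = field(default_factory=list)


@dataclass
class Info:
    captured_locals: set   # locals of this function that nested functions use
    captures_parent: bool  # uses a local declared in an enclosing function
    has_nested: bool


def analyze(root):
    """Return a dict mapping each function name to its Info."""
    out = {}

    def visit(func, outer_locals):
        # Names that nested functions need from this function or beyond.
        needed_by_nested = set()
        for child in func.nested:
            needed_by_nested |= visit(child, outer_locals | func.declared)
        # Names this function cannot resolve itself: captured or global.
        free = (func.uses | needed_by_nested) - func.declared
        out[func.name] = Info(
            captured_locals=func.declared & needed_by_nested,
            captures_parent=bool(free & outer_locals),
            has_nested=bool(func.nested),
        )
        return free

    visit(root, frozenset())
    return out


inner = Func("inner", declared={"c"}, uses={"a", "c", "print"})
main = Func("main", declared={"a", "b"}, uses={"a", "b", "print"}, nested=[inner])
info = analyze(main)
```

In this example only 'a' has to live somewhere a nested function can reach; 'b' and 'c' can be ordinary locals, and 'print' is resolved as a global because no enclosing function declares it.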

The Runtime

The runtime defines how the Lua code will execute. If you write your own compiler, you do not need to use the runtime. However, if you use the default compiler, you need to use _ILuaRuntime_. The runtime defines several functions that are called by the generated code. The default runtime (_LuaRuntime_) can be extended to change parts of its behavior. It is important to read the section on operator overloads to make sure your functions behave the same way. You can use _OverloadInfo_ and _GetBetterOverload_ to determine overload information. Also make sure to read the next section on methods to correctly support tail-calls.

Proper Tail Calls

The Lua specification says that it supports proper tail-calls. This means that if a Lua-defined function returns a call to another function, it will remove its stack information first, so it can support infinite recursion. The C# compiler does not emit tail-calls; however, you can use IL to support them with the OpCodes.Tailcall opcode. This means that you need to dynamically generate some types to support proper tail-calls, which is what the framework does by default. If you want to create a LuaRuntime instance, you need to call the static method Create, and it will create an instance of the dynamic type. The same is true for each of the LuaMethod types. The calls to LuaRuntime.Invoke and LuaMethod.Invoke are generated so that the tail-call opcode is injected. If you want to support tail-calls, you need to do the same.
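To show what a proper tail-call guarantees (constant stack use no matter how deep the chain of calls goes), here is a small Python sketch. It uses a trampoline, which is a different technique from the IL tail-call opcode the framework relies on; it is included only to illustrate the observable behavior, and every name in it is hypothetical.

```python
class TailCall:
    """A request to call `func(*args)` after the current frame has returned."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args


def invoke(func, *args):
    # Run `func`, then keep running whatever tail call it hands back.
    # Each callee returns before the next one starts, so the stack stays flat.
    result = func(*args)
    while isinstance(result, TailCall):
        result = result.func(*result.args)
    return result


def count_down(n, steps=0):
    if n == 0:
        return steps
    # Instead of calling count_down directly (which would grow the stack),
    # describe the call and let invoke() perform it.
    return TailCall(count_down, n - 1, steps + 1)


# Far deeper than Python's default recursion limit would allow directly.
total = invoke(count_down, 100_000)
```

A plain recursive version of count_down would fail at this depth; with the call deferred until the caller's frame is gone, the depth never grows, which is the same effect the tail-call opcode achieves inside the generated Invoke methods.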

This is the call structure of the default runtime when Lua code executes a Lua-defined function: (1) the backing method calls ILuaRuntime.Invoke, passing the arguments and the object to call (defined in the default compiler); (2) the runtime then calls LuaMethod.Invoke (as LuaMethod implements IMethod); (3) that method validates the arguments and calls the abstract method LuaMethod.InvokeInternal; (4) this is handled by LuaDefinedMethod.InvokeInternal, which simply calls the dynamically generated method. Each step above must contain the tail-call opcode; otherwise it will not support infinite recursion. If you want to define your own LuaMethod object, you can use LuaMethod.AddInvokableImpl, which generates dynamic implementations of IMethod that call InvokeInternal. NOTE that deriving from either LuaMethod or LuaRuntime will NOT support tail-calls; the code for their respective Invoke methods MUST be dynamically generated to support tail-calls.

Last edited May 26, 2013 at 4:18 AM by ModMaker, version 1

