LuaState (a Lua Wrapper) 4.1 Alpha Distribution

http://workspacewhiz.com/Other/LuaState/LuaState.html
http://workspacewhiz.com/Other/LuaState/LuaState_LuaWrapper41Alpha.zip

Overview

Warning: The following distribution contains modified Lua code.  It is not an official distribution of Lua as the authors intended it.  Many modifications are superficial, but some, such as the Unicode string type, go quite a bit deeper.  It is my hope these changes will be considered for inclusion into the master Lua distribution.

The intent of this package is not to masquerade as an official Lua interpreter.  Any references to Lua in this document refer specifically to the modified Lua code contained herein, unless otherwise noted.  Only additions have been made to Lua.

The LuaState distribution provides the following functionality:

Unicode

Unicode (or rather, wide character) support is built in as a native Lua type.  While it is entirely possible to represent a Unicode string using a regular 8-bit clean Lua string, there is then no way to determine whether a given string holds Unicode data.  A second problem involves standard string functionality, such as the concatenation operator.  If a Unicode string is represented as a normal 8-bit Lua string, special functions have to be written to operate on it (e.g. to concatenate two 8-bit clean Lua strings that represent Unicode).  Rather than confuse the user, a separate Unicode string type is provided.  The term Unicode is used loosely here, since there are several Unicode encodings; for the purposes of this distribution, Unicode refers to 16-bit wide character support.

Unicode strings can be entered into Lua script using the C approach to wide character strings.

L"Wide characters"

When an L is placed in front of the opening quote, the Lua lexer creates a wide character representation of the string.  If the same string were entered as a regular Lua string, the Unicode equivalent would have to be simulated as follows:

"W\000i\000d\000e\000 \000c\000h\000a\000r\000a\000c\000t\000e\000r\000s\000\000\000"

In the event it is necessary to insert Unicode character codes in the wide character string, an additional property of the L"" approach may be used.  16-bit characters may be entered using hexadecimal notation, in the same way as C:

L"Wide characters: \x3042\x3043"
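The relationship between a wide string and its simulated 8-bit form can be reproduced outside Lua.  The following is a minimal C++ sketch, not LuaState code: the helper name toLittleEndianBytes is hypothetical, and the low-byte-first order is an assumption taken from the escaped string shown earlier.  char16_t is used so that each unit is 16 bits on every platform.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of LuaState): stores each 16-bit unit of a
// wide string as two bytes, low byte first, then appends a 16-bit null
// terminator.  This mirrors the escaped 8-bit string shown earlier.
std::string toLittleEndianBytes(const std::u16string& wide)
{
    std::string bytes;
    bytes.reserve((wide.size() + 1) * 2);
    for (char16_t unit : wide)
    {
        bytes.push_back(static_cast<char>(unit & 0xFF));         // low byte
        bytes.push_back(static_cast<char>((unit >> 8) & 0xFF));  // high byte
    }
    bytes.append(2, '\0');  // 16-bit terminator
    return bytes;
}
```

Passing u"Wide characters" yields the 32 bytes of the simulated string, and u"\x3042\x3043" shows how each hexadecimal escape becomes a single 16-bit unit.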

The standard Lua libraries have been upgraded to accept and operate on Unicode strings.  Below is a brief list of added functionality.

Memory Allocators

This distribution replaces the #define approach to memory allocation within Lua with a callback mechanism, so the memory allocators can be replaced on a per Lua state basis.  Each state can therefore use its own allocation strategy.

For purposes of better memory tracking, the realloc() callback allows a void pointer of user data, an allocation name, and allocation flags to be passed along.  All of these arguments are optional, but they are available if the memory allocation callback needs them.

The only allocation flag available is LUA_ALLOC_TEMP.  A memory manager could react to the LUA_ALLOC_TEMP flag, for instance, by placing the allocation for the code of a Lua file's main function at the top of the heap.  If all other Lua allocations happen at the bottom of the heap, no holes are left in memory when the LUA_ALLOC_TEMP flagged allocation is garbage collected.
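The top and bottom of heap strategy can be sketched as a two-ended arena.  This is an illustrative model only: TwoEndedArena, arenaAlloc, and kAllocTempFlag are hypothetical names, kAllocTempFlag merely stands in for LUA_ALLOC_TEMP (whose real value comes from the LuaState headers), and alignment and freeing are left out to keep the idea visible.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for LUA_ALLOC_TEMP; the real value is defined by the LuaState
// headers and is not assumed here.
const unsigned int kAllocTempFlag = 1u;

// Hypothetical fixed-size arena: ordinary allocations grow upward from the
// bottom, temporary allocations grow downward from the top.
struct TwoEndedArena
{
    unsigned char* base;
    std::size_t bottom;  // offset of the next ordinary allocation
    std::size_t top;     // offset just past the free region
};

// Returns a block of the requested size, or nullptr if the free region
// between bottom and top is too small.  Alignment is ignored in this sketch.
void* arenaAlloc(TwoEndedArena& arena, std::size_t size, unsigned int flags)
{
    if (size > arena.top - arena.bottom)
        return nullptr;
    if (flags & kAllocTempFlag)
    {
        arena.top -= size;  // temporary data is carved from the top
        return arena.base + arena.top;
    }
    void* result = arena.base + arena.bottom;
    arena.bottom += size;   // everything else is carved from the bottom
    return result;
}
```

Because temporary blocks never sit between long-lived ones, releasing them leaves the bottom region contiguous.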

The callbacks look like:

static void* luaHelper_ReallocFunction(void* ptr, int size, void* data, const char* allocName, unsigned int allocFlags)
{
    return realloc(ptr, size);
}
static void luaHelper_FreeFunction(void* ptr, void* data)
{
    free(ptr);
}
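For memory tracking, the user data pointer can carry a statistics block.  The sketch below reuses the parameter lists of the two callbacks above; AllocStats, trackingRealloc, and trackingFree are hypothetical names, and it assumes the data pointer handed to the callbacks is the one registered with the memory functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical statistics block passed through the callbacks' data pointer.
struct AllocStats
{
    unsigned long reallocCalls;
    unsigned long freeCalls;
};

// Same parameter list as luaHelper_ReallocFunction; counts each call.
void* trackingRealloc(void* ptr, int size, void* data,
                      const char* allocName, unsigned int allocFlags)
{
    assert(size >= 0);
    AllocStats* stats = static_cast<AllocStats*>(data);
    if (stats)
        ++stats->reallocCalls;
    (void)allocName;   // available for per-name bookkeeping
    (void)allocFlags;  // available for flag-specific strategies
    return std::realloc(ptr, static_cast<std::size_t>(size));
}

// Same parameter list as luaHelper_FreeFunction; counts each call.
void trackingFree(void* ptr, void* data)
{
    AllocStats* stats = static_cast<AllocStats*>(data);
    if (stats)
        ++stats->freeCalls;
    std::free(ptr);
}
```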

The allocation functions must be assigned before a Lua global state is created, as in the example below.  It is good practice to restore the previous realloc() and free() callbacks afterward.

lua_ReallocFunction oldReallocFunc;
lua_FreeFunction oldFreeFunc;
void* oldData;
lua_getdefaultmemoryfunctions(&oldReallocFunc, &oldFreeFunc, &oldData);
lua_setdefaultmemoryfunctions(luaHelper_ReallocFunction, luaHelper_FreeFunction, NULL);
lua_State* state = lua_open(0);
lua_setdefaultmemoryfunctions(oldReallocFunc, oldFreeFunc, oldData);

Memory Optimizations

A whole host of functionality has been added to facilitate the optimization of memory usage in a tight memory environment.

Multithreading

Multithreading support is built into the LuaState distribution by default.  Use lua_setlockfunctions() to install the lock and unlock callbacks for a state.

Example:

static void LSLock(void* data)
{
    CRITICAL_SECTION* cs = (CRITICAL_SECTION*)data;
    ::EnterCriticalSection(cs);
}

static void LSUnlock(void* data)
{
    CRITICAL_SECTION* cs = (CRITICAL_SECTION*)data;
    ::LeaveCriticalSection(cs);
}

lua_State* m_state = lua_open(stackSize);
CRITICAL_SECTION* cs = new CRITICAL_SECTION;
::InitializeCriticalSection(cs);
lua_setlockfunctions(m_state, LSLock, LSUnlock, cs);
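On platforms without CRITICAL_SECTION, callbacks of the same shape can be backed by the C++ standard library.  The following is a sketch under assumptions: portableLock and portableUnlock are hypothetical names, and std::recursive_mutex is chosen because a Win32 critical section may be re-entered by the thread that owns it.  Whether LuaState actually relies on re-entrant locking is not stated here.

```cpp
#include <cassert>
#include <mutex>

// Lock callback with the same shape as LSLock above, backed by a
// std::recursive_mutex instead of a Win32 CRITICAL_SECTION.
void portableLock(void* data)
{
    static_cast<std::recursive_mutex*>(data)->lock();
}

// Unlock callback with the same shape as LSUnlock above.
void portableUnlock(void* data)
{
    static_cast<std::recursive_mutex*>(data)->unlock();
}
```

The address of the mutex would take the place of the CRITICAL_SECTION pointer passed as the last argument of lua_setlockfunctions(), and the mutex must outlive the Lua state.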

Fatal Error Handler

Calling exit() from an application that is not a command line tool is generally a bad thing, and in some environments exit() can't be called at all.  Rather than have the application blow up in an undesirable fashion, the LuaState distribution allows the exit() call made by Lua on a fatal error to be overridden through lua_setfatalerrorfunction().  The default fatal error callback calls exit(); it can be replaced as desired.

Other Optimizations

New String Formatting Enhancements

format has been extended with the following control types.  The use of these control types makes it easy to write binary file formats to disk.

Additionally, ANSI strings can use the hexadecimal character notation to insert bytes into the string:

str = "Hel\x80o"

Built-in Pointer Type

The LuaState distribution offers a built-in pointer type.  The pointer type is intended for passing a raw pointer into Lua and back out to a C function.  It has some advantages over both the user data type and a pointer stored in a number:

  1. Handing off a pointer to Lua is "free."  For user data, there is a memory cost associated with creating a user data object.  For simple pointer passing, the pointer type is a much better alternative.
  2. Since the mantissa of a double is large enough to hold a 32-bit pointer without data loss, a Lua double could be used to hand off pointers.  The pointer interface is much cleaner than the double one and far more portable.
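The portability concern in item 2 can be checked directly.  A double has a 53-bit significand, so every 32-bit value survives a round trip, but 64-bit values above 2^53 may not.  The helper name survivesDouble below is hypothetical and the sketch works on plain integers rather than real pointers.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if the integer is unchanged after being stored in a double
// and converted back.
bool survivesDouble(std::uint64_t value)
{
    double asDouble = static_cast<double>(value);
    // 2^64 does not fit in a 64-bit integer, so converting it back would be
    // undefined; such a value has already been rounded and cannot match.
    if (asDouble >= 18446744073709551616.0)
        return false;
    return static_cast<std::uint64_t>(asDouble) == value;
}
```

The largest 32-bit value, 0xFFFFFFFF, round-trips exactly, while 2^53 + 1 is rounded and comes back changed, which is why the double approach breaks down for 64-bit pointers.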

A pointer is represented by the Lua type LUA_TPOINTER.  The following functions are available for pointer access and mirror their Lua type counterparts:

Unified Methods

Unified methods are based heavily on Edgar Toernig's Sol implementation of unified methods (note: some text is taken verbatim from the Sol documentation).

Every object in Lua has an attached method table.  For C++ users, the method table is most similar to a v-table.  For Lua's simple types (nil, number, string, ustring, and function), there is one method table for all objects of the given type.  Table and userdata objects have the ability to have method tables on a per object basis.

Unlike Edgar's Sol implementation, the colon operator for Lua's automatic self functions is not replaced with an alternate implementation.  This is done in an effort to keep LuaState functionality identical to the original Lua distribution.  Instead, two new function operators are introduced.  The pointer symbol (->) behaves like the colon operator, but it looks up the function to call in the method table.  The second operator is the double colon operator, which behaves like the regular dot operator (no self is passed in).

The biggest advantage of unified methods is memory savings.  When dealing with many Lua objects (say, tables) of the same type, the functions don't have to be duplicated for each and every one.  Significant amounts of memory may be saved by the use of the shared method table.
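The sharing can be pictured with a small C++ model.  This is only an analogy, not LuaState internals: Method, MethodTable, NumberObject, methodSqrt, and callMethod are all hypothetical names.  Each object holds a reference to one shared method table instead of its own copy of every function.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical model: a method receives the object's value as self.
typedef double (*Method)(double self);
typedef std::map<std::string, Method> MethodTable;

struct NumberObject
{
    double value;
    std::shared_ptr<MethodTable> methods;  // shared between objects, not copied
};

double methodSqrt(double self) { return std::sqrt(self); }

// Models object->name(): find the function in the object's method table,
// then call it with the object's value as the first argument.
double callMethod(const NumberObject& object, const std::string& name)
{
    MethodTable::const_iterator it = object.methods->find(name);
    assert(it != object.methods->end());
    return it->second(object.value);
}
```

With one hundred objects pointing at the same table, there is still only one entry for sqrt; each object pays for a single reference.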

Every data type has methods, even numbers.  It is possible to write code that looks like:

print(4->sqrt())
str = "Hello"
print(str->len())

For the majority of cases, the use of tag methods can more or less be forgotten.

The default method tables have been put into global names for convenience. They are named like the type but with a capital first letter (Nil, Number, String, UString, Function, Table, Userdata).

Method tables may be accessed from within Lua using the methods() function.

table = {}

-- Save the old methods.
tableMethods = methods(table)

newMethods =
{
    doNothing = function(self)
    end
}

-- Set the new methods.
methods(table, newMethods)
table->doNothing()

In C, methods may be retrieved and set using the following functions:

LUA_API void lua_getmethods(lua_State *L, int index);
LUA_API void lua_getdefaultmethods(lua_State *L, int type);
LUA_API void lua_setmethods(lua_State *L, int index);

Serializing

The LuaState distribution can write out a Lua table to a nicely formatted file.  The only downside to LuaState's approach is that the table can't currently be cyclic.

A table can be written from both Lua and C++.  The Lua function prototypes are:

function WriteLuaFile(fileName, objectName, valueToWrite, indentLevel, writeAll, alphabetical, maxIndentLevel)
function WriteLuaGlobalsFile(fileName, writeAll, alphabetical, maxIndentLevel)
function WriteLuaObject(filePtr, objectName, valueToWrite, indentLevel, writeAll, alphabetical, maxIndentLevel)

The C++ functionality is very similar in form.

Standard Unified Method Callbacks

The basic types have the following unified method callbacks applied to them.

Table =
{
    function foreach(self, func)
    function foreachi(self, func)
    function next(self [, index])
    function rawget(self, index)
    function rawset(self, index, value)
    function getn(self)
    function sort(self [, comp])
    function insert(self [, pos] , value) -- tinsert
    function remove(self [, pos]) -- tremove
    function unpack(self)
}

File =
{
    function close(self) -- closefile
    function flush(self) -- flush
    function open(filename, mode) -- openfile
    function read([self,] format1, ...) -- read
    function seek(self [, whence] [, offset]) -- seek
    function write([self,] value1, ...) -- write
    function execute(command) -- execute
    function remove(filename) -- remove
    function rename(name1, name2) -- rename
    function tmpname() -- tmpname
    stdin
    stdout
    stderr
}

String =
{
    function len(self) -- strlen
    function sub(self, i [, j]) -- strsub
    function lower(self) -- strlower
    function upper(self) -- strupper
    function char(i1, i2, ...) -- strchar
    function rep(self, n) -- strrep
    function byte(self [, i]) -- strbyte
    function format(formatstring, e1, e2, ...) -- format
    function find(self, pattern [, init [, plain]]) -- strfind
    function gsub(self, pat, repl [, n]) -- gsub
}

UString =
{
    function len(self) -- ustrlen
    function sub(self, i [, j]) -- ustrsub
    function lower(self) -- ustrlower
    function upper(self) -- ustrupper
    function char(i1, i2, ...) -- ustrchar
    function rep(self, n) -- ustrrep
    function byte(self [, i]) -- ustrbyte
    function format(formatstring, e1, e2, ...) -- uformat
    function find(self, pattern [, init [, plain]]) -- ustrfind
    function gsub(self, pat, repl [, n]) -- ugsub
}

Number =
{
    function abs()
    function sin()
    function cos()
    function tan()
    function asin()
    function acos()
    function atan()
    function atan2()
    function ceil()
    function floor()
    function mod()
    function frexp()
    function ldexp()
    function sqrt()
    function min()
    function max()
    function log()
    function log10()
    function exp()
    function deg()
    function rad()
    function random()
    function randomseed()
}

Bonus Functions

FileFind =
{
    -- Returns a handle representing the first file matching fileName.
    function First(fileName)

    -- Retrieves the next file matching fileName.
    function Next(self)

    -- Closes the file search.
    function Close(self)

    -- Gets the file name of the currently matched file.
    function GetFileName(self)

    -- Determines if the currently matched file is a directory.
    function IsDirectory(self)
}

-- Added to File:
File =
{
    -- Returns the size of fileName.
    function GetFileSize(fileName)

    -- Returns as two numbers the last write time for fileName.
    function GetWriteTime(fileName)

    -- Sets the write time for fileName.
    function SetWriteTime(fileName, timeLo, timeHi)

    -- Same as the C function _access.
    function access(fileName, type)
}

-- Returns a new table with a hash table of the given size.
function NewTableSize(size)

-- Copies a non-cyclic table recursively.
function CopyTable(tableToCopy)

-- Looks up a table entry by string name: Table1.Table2.3.Value2
function FullLookup(table, lookupStr)

-- Processes all the files matching wildcard in the directory [path] and calls func(path, name) on each one.
function DirProcessFiles(path, wildcard, func)

-- Recursively processes all the files in the directory [path], optionally matching [ext] and calls func(path, name) on each one.
function DirProcessFilesRecursive(path, func, ext)
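
The dotted lookup performed by FullLookup can be modeled in C++ as a walk over nested nodes.  This is a sketch of the idea rather than the LuaState implementation: Node and fullLookup are hypothetical names, and numeric path parts such as 3 are treated as plain string keys for simplicity.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical nested table model: a node has a string value and child
// nodes keyed by name.
struct Node
{
    std::string value;
    std::map<std::string, std::unique_ptr<Node>> children;

    // Returns the child stored under key, creating it if necessary.
    Node& child(const std::string& key)
    {
        std::unique_ptr<Node>& slot = children[key];
        if (!slot)
            slot.reset(new Node());
        return *slot;
    }
};

// Splits path on '.' and descends one level per part.  Returns nullptr as
// soon as a part is missing.
const Node* fullLookup(const Node& root, const std::string& path)
{
    const Node* current = &root;
    std::istringstream parts(path);
    std::string key;
    while (std::getline(parts, key, '.'))
    {
        auto it = current->children.find(key);
        if (it == current->children.end())
            return nullptr;
        current = it->second.get();
    }
    return current;
}
```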