UnrealScript Grammar
This is an EBNF specification of the UnrealScript grammar.
It can be useful if you are going to write a parser for the UnrealScript language.
Note: this is not the official specification; it was written by visitors of the UnrealWiki.
Edit guidelines:
- All non-terminals should be written entirely in uppercase. Keep everything aligned. If you leave a rule open, use '...' to make that clear.
- Always use as many brackets as needed; don't optimize them away, because that can cause confusion.
- Terminals that are words can be used directly in the production rules; otherwise you must use a terminal rule.
Non-Terminals
PROGRAM             = CLASSDECL ( DECLARATIONS )* ( REPLICATIONBLOCK )? BODY ( DEFAULTPROPERTIESBLOCK )?
CLASSDECL           = class IDENTIFIER ( extends PACKAGEIDENTIFIER )? ( CLASSPARAMS )* SEMICOLON
CLASSPARAMS         = CONSTCLASSPARAMS |
                      within PACKAGEIDENTIFIER |
                      dependson LBRACK PACKAGEIDENTIFIER RBRACK |
                      config ( LBRACK PACKAGEIDENTIFIER RBRACK )? |
                      hidecategories LBRACK IDENTIFIERLIST RBRACK |
                      showcategories LBRACK IDENTIFIERLIST RBRACK
IDENTIFIER          = ( ALPHA | UNDERSCORE ) ( ALPHA | UNDERSCORE | DIGIT )*
// packagename.classname or classname.structname
PACKAGEIDENTIFIER   = ( IDENTIFIER DOT )? IDENTIFIER
QUALIFIEDIDENTIFIER = ( ( class SQUOTE PACKAGEIDENTIFIER SQUOTE DOT default DOT IDENTIFIER ) | ( ( IDENTIFIER DOT )* IDENTIFIER ) )
IDENTIFIERLIST      = IDENTIFIER ( COMMA IDENTIFIER )*
STRINGVAL           = DQUOTE PRINTABLE DQUOTE
INTVAL              = ( ( DIGIT )+ | ( '0x' ( HEXDIGIT )+ ) )
FLOATVAL            = ( DIGIT )+ DOT ( DIGIT )*
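The lexical rules above translate fairly directly into regular expressions. The following is a minimal Python sketch, not part of the original specification: the regex translations and the helper name `matches` are assumptions made for illustration, and `re.IGNORECASE` stands in for UnrealScript's case insensitivity.

```python
import re

# Assumed regex translations of the lexical rules in this section.
IDENTIFIER = r"[a-z_][a-z_0-9]*"
PACKAGEIDENTIFIER = rf"(?:{IDENTIFIER}\.)?{IDENTIFIER}"
INTVAL = r"(?:0x[0-9a-f]+|[0-9]+)"
FLOATVAL = r"[0-9]+\.[0-9]*"


def matches(rule: str, text: str) -> bool:
    """Return True if the whole of `text` is derivable from `rule`.

    `matches` is a hypothetical helper; IGNORECASE models the fact that
    the grammar only lists lowercase ALPHA and HEXDIGIT characters.
    """
    return re.fullmatch(rule, text, re.IGNORECASE) is not None
```

For example, `matches(PACKAGEIDENTIFIER, "Engine.Actor")` holds, while `matches(PACKAGEIDENTIFIER, "A.B.C")` does not, because the rule allows at most one package prefix.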
Declaration parts
DECLARATIONS  = ( CONSTDECL | VARDECL | ENUMDECL | STRUCTDECL ) SEMICOLON
CONSTDECL     = const IDENTIFIER EQUALS CONSTVALUE
CONSTVALUE    = ( STRINGVAL | INTVAL | FLOATVAL | BOOLVAL )
VARDECL       = var ( CONFIGGROUP )? ( VARPARAMS )* VARTYPE VARIDENTIFIER ( COMMA VARIDENTIFIER )*
CONFIGGROUP   = LBRACK IDENTIFIER RBRACK
VARTYPE       = PACKAGEIDENTIFIER | ENUMDECL | STRUCTDECL | ARRAYDECL | CLASSTYPE | BASICTYPE
VARIDENTIFIER = IDENTIFIER ( LSBRACK INTVAL RSBRACK )?
ARRAYDECL     = array LABRACK ( PACKAGEIDENTIFIER | CLASSTYPE | BASICTYPE ) RABRACK
CLASSTYPE     = class LABRACK PACKAGEIDENTIFIER RABRACK
ENUMDECL      = enum IDENTIFIER LCBRACK ENUMOPTIONS RCBRACK
ENUMOPTIONS   = IDENTIFIER ( COMMA IDENTIFIER )*
STRUCTDECL    = struct ( STRUCTPARAMS )* IDENTIFIER ( extends PACKAGEIDENTIFIER )? LCBRACK STRUCTBODY RCBRACK
STRUCTPARAMS  = ( native | export )
STRUCTBODY    = ( VARDECL SEMICOLON )+
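As an illustration of how one of these rules maps onto a hand-written parser, here is a small Python sketch of a recursive-descent recognizer for ENUMDECL and ENUMOPTIONS only. The function names `tokenize` and `parse_enum` and the token regex are hypothetical; the sketch ignores comments and every other rule in the grammar.

```python
import re

# Assumed token shapes: identifiers/keywords plus the punctuation ENUMDECL needs.
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z_0-9]*|[{},;])")


def tokenize(source: str) -> list[str]:
    """Split source text into identifier and punctuation tokens."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        m = TOKEN.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_enum(tokens: list[str]) -> tuple[str, list[str]]:
    """ENUMDECL = enum IDENTIFIER LCBRACK ENUMOPTIONS RCBRACK"""
    pos = 0

    def expect(predicate, what: str) -> str:
        nonlocal pos
        if pos >= len(tokens) or not predicate(tokens[pos]):
            raise SyntaxError(f"expected {what} at token {pos}")
        pos += 1
        return tokens[pos - 1]

    def is_ident(tok: str) -> bool:
        return tok[0].isalpha() or tok[0] == "_"

    # Keywords are compared in lowercase because the language is case-insensitive.
    expect(lambda t: t.lower() == "enum", "'enum'")
    name = expect(is_ident, "IDENTIFIER")
    expect(lambda t: t == "{", "LCBRACK")
    # ENUMOPTIONS = IDENTIFIER ( COMMA IDENTIFIER )*
    options = [expect(is_ident, "IDENTIFIER")]
    while pos < len(tokens) and tokens[pos] == ",":
        pos += 1
        options.append(expect(is_ident, "IDENTIFIER"))
    expect(lambda t: t == "}", "RCBRACK")
    return name, options
```

Calling `parse_enum(tokenize("enum EColor { Red, Green, Blue }"))` returns the enum name together with its list of options; input that does not fit the rule raises `SyntaxError`.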
Replication parts
REPLICATIONBLOCK = replication LCBRACK ( REPLICATIONBODY )* RCBRACK
REPLICATIONBODY  = ( reliable | unreliable ) if LBRACK EXPR RBRACK IDENTIFIER ( COMMA IDENTIFIER )* SEMICOLON
Body parts
BODY = ( STATEDECL | FUNCTIONDECL )*
State parts
STATEDECL   = ( STATEPARAMS )* state IDENTIFIER ( extends IDENTIFIER )? STATEBODY
STATEBODY   = LCBRACK ( STATEIGNORE )? ( FUNCTIONDECL )* STATELABELS RCBRACK
STATEIGNORE = ignores IDENTIFIER ( COMMA IDENTIFIER )* SEMICOLON
STATELABELS = ( IDENTIFIER COLON ( CODELINE )* )*
Function parts
// operators require a set amount of arguments
FUNCTIONDECL   = ( NORMALFUNC | OPERATORFUNC )
NORMALFUNC     = ( FUNCTIONPARAMS )* FUNCTIONTYPE ( LOCALTYPE )? IDENTIFIER LBRACK ( FUNCTIONARGS ( COMMA FUNCTIONARGS )* )? RBRACK FUNCTIONBODY
FUNCTIONPARAMS = CONSTFUNCPARAMS | native ( LBRACK INTVAL RBRACK )?
OPERATORFUNC   = ( FUNCTIONPARAMS )* OPERATORTYPE FUNCTIONBODY
OPERATORTYPE   = ( BINARYOPERATOR | UNARYOPERATOR )
// requires two arguments
BINARYOPERATOR = operator LBRACK INTVAL RBRACK PACKAGEIDENTIFIER OPIDENTIFIER LBRACK FUNCTIONARGS COMMA FUNCTIONARGS RBRACK
// requires one argument
UNARYOPERATOR  = ( preoperator | postoperator ) PACKAGEIDENTIFIER OPIDENTIFIER LBRACK FUNCTIONARGS RBRACK
OPIDENTIFIER   = IDENTIFIER | OPERATORNAMES
FUNCTIONARGS   = ( optional | out | coerce )? PACKAGEIDENTIFIER IDENTIFIER
FUNCTIONBODY   = ( SEMICOLON | ( ( LOCALDECL )* ( CODELINE )* ) ( SEMICOLON )? )
LOCALDECL      = local LOCALTYPE IDENTIFIER ( COMMA IDENTIFIER )*
LOCALTYPE      = PACKAGEIDENTIFIER | ARRAYDECL | CLASSTYPE | BASICTYPE
Code parts
CODELINE    = ( STATEMENT | ASSIGNMENT | IFTHENELSE | WHILELOOP | DOLOOP | SWITCHCASE | RETURNFUNC | FOREACHLOOP | FORLOOP )
CODEBLOCK   = ( CODELINE | ( LCBRACK ( CODELINE )* RCBRACK ) )
STATEMENT   = FUNCCALL SEMICOLON
ASSIGNMENT  = IDENTIFIER EQUALS EXPR SEMICOLON
IFTHENELSE  = if LBRACK EXPR RBRACK CODEBLOCK ( else CODEBLOCK )?
WHILELOOP   = while LBRACK EXPR RBRACK CODEBLOCK
DOLOOP      = do CODEBLOCK until LBRACK EXPR RBRACK
SWITCHCASE  = switch LBRACK EXPR RBRACK LCBRACK ( CASERULE )+ ( DEFAULTRULE )? RCBRACK
CASERULE    = case INTVAL COLON CODEBLOCK
DEFAULTRULE = default COLON CODEBLOCK
RETURNFUNC  = return ( EXPR )? SEMICOLON
FOREACHLOOP = foreach FUNCCALL CODEBLOCK
FORLOOP     = for LBRACK ASSIGNMENT SEMICOLON EXPR SEMICOLON EXPR RBRACK CODEBLOCK
EXPR        = OPERAND ( OPIDENTIFIER OPERAND )*
OPERAND     = ( CONSTVALUE | QUALIFIEDIDENTIFIER | FUNCCALL )
FUNCCALL    = ( ( class SQUOTE PACKAGEIDENTIFIER SQUOTE DOT static DOT ) | ( ( IDENTIFIER DOT )+ ) )? IDENTIFIER LBRACK ( EXPR ( COMMA EXPR )* )? RBRACK
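Note that the EXPR rule only describes a flat sequence of operands separated by operators; it does not encode operator precedence. A parser would have to group the operands in a second step. The Python sketch below shows one way to do that with precedence climbing. It assumes that the INTVAL in a BINARYOPERATOR declaration is a precedence value where a lower number binds tighter, and the `PRECEDENCE` table and the `build_tree` function are hypothetical illustrations, not something taken from the engine.

```python
# Hypothetical precedence table (assumption: lower number binds tighter).
PRECEDENCE = {"*": 16, "/": 16, "+": 20, "-": 20, "==": 24, "&&": 30, "||": 32}


def build_tree(flat: list[str]):
    """Group a flat EXPR sequence into nested (op, left, right) tuples.

    `flat` has the shape [operand, op, operand, op, operand, ...], which is
    exactly what EXPR = OPERAND ( OPIDENTIFIER OPERAND )* produces.
    """
    pos = 0

    def climb(max_prec: int):
        nonlocal pos
        left = flat[pos]
        pos += 1
        # Consume operators that bind at least as tightly as max_prec allows.
        while pos < len(flat) and PRECEDENCE[flat[pos]] <= max_prec:
            op = flat[pos]
            pos += 1
            # The right-hand side may only contain strictly tighter operators,
            # which makes equal-precedence operators left-associative.
            right = climb(PRECEDENCE[op] - 1)
            left = (op, left, right)
        return left

    return climb(max(PRECEDENCE.values()))
```

With this table, `build_tree(["a", "+", "b", "*", "c"])` groups the multiplication first, and `build_tree(["a", "-", "b", "-", "c"])` groups from the left.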
Defaultproperties
DEFAULTPROPERTIESBLOCK = defaultproperties LCBRACK ( DEFPROP )* RCBRACK
DEFPROP                = DEFPROPIDENTIFIER EQUALS PRINTABLE
DEFPROPIDENTIFIER      = IDENTIFIER ( ( LBRACK INTVAL RBRACK ) | ( LSBRACK INTVAL RSBRACK ) )?
Terminals
PRINTABLE        = all printable characters
ALPHA            = 'a' .. 'z'
DIGIT            = '0' .. '9'
HEXDIGIT         = DIGIT | 'a' .. 'f'
SEMICOLON        = ';'
COLON            = ':'
UNDERSCORE       = '_'
LBRACK           = '('
RBRACK           = ')'
LABRACK          = '<'
RABRACK          = '>'
LCBRACK          = '{'
RCBRACK          = '}'
LSBRACK          = '['
RSBRACK          = ']'
DOT              = '.'
COMMA            = ','
SQUOTE           = '''
DQUOTE           = '"'
EQUALS           = '='
CONSTCLASSPARAMS = abstract | native | nativereplication | safereplace |
                   perobjectconfig | transient | noexport | exportstructs |
                   // available from warfare and up:
                   collapsecategories | dontcollapsecategories | placeable | notplaceable |
                   editinlinenew | noteditinlinenew
BOOLVAL          = true | false
VARPARAMS        = config | const | editconst | export | globalconfig | input |
                   localized | native | private | protected | transient | travel |
                   // available from warfare and up:
                   editinline | deprecated | edfindable | editinlineuse
STATEPARAMS      = auto | simulated
CONSTFUNCPARAMS  = final | iterator | latent | simulated | singular | static | exec | protected | private
BASICTYPE        = byte | int | float | string | bool | name | class
FUNCTIONTYPE     = function | event | delegate
OPERATORNAMES    = '~' | '!' | '@' | '#' | '$' | '%' | '^' | '&' | '*' | '-' | '=' | '+' | '|' | '\' | ':' | '<' | '>' | '/' | '?' | '`' |
                   '<<' | '>>' | '!=' | '<=' | '>=' | '++' | '--' | '+=' | '-=' | '*=' | '/=' | '&&' | '||' | '^^' | '==' | '**' | '~=' | '@=' | '>>>'
Notes
Case
UnrealScript is case-insensitive, so all terminals may be written in any case. Because of this, the uppercase variants of ALPHA and HEXDIGIT are omitted.
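In a lexer, one simple way to honor this is to lowercase a word before comparing it against the keyword terminals, while keeping the original spelling of identifiers. The following Python sketch assumes a small sample of the word terminals; the `KEYWORDS` set is deliberately incomplete and the `classify` function is a hypothetical helper.

```python
# Partial, illustrative sample of the word terminals used in the grammar.
KEYWORDS = {"class", "extends", "var", "const", "enum", "struct",
            "function", "event", "state", "local", "defaultproperties"}


def classify(word: str) -> tuple[str, str]:
    """Return ('KEYWORD', lowercase form) or ('IDENTIFIER', original spelling)."""
    lowered = word.lower()
    if lowered in KEYWORDS:
        return ("KEYWORD", lowered)
    return ("IDENTIFIER", word)
```

With this approach `Class`, `CLASS` and `class` all yield the same keyword token, while an identifier such as `PlayerPawn` keeps its spelling; a parser that compares identifiers would still need to compare them case-insensitively.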
Unreal Engine
This grammar applies to the Unreal Warfare engine. Older versions of the Unreal engine differ in a few places. Here is a list of changes to apply to this grammar for older versions.
- extends can be replaced with expands
- The ARRAYDECL rule does not apply
- In the CLASSPARAMS rule the following do not apply:
  - within PACKAGEIDENTIFIER
  - dependson LBRACK PACKAGEIDENTIFIER RBRACK
  - hidecategories LBRACK IDENTIFIERLIST RBRACK
  - showcategories LBRACK IDENTIFIERLIST RBRACK
- In CONSTCLASSPARAMS nousercreate is allowed
- STRUCTPARAMS does not apply
Discussion
El Muerte TDS: As suggested in UnDox Revisited, so hell, why not.
Tarquin: Nice
Jerome-X: This can be very useful for the parser in the UCEditor plugin. Thanks
El Muerte TDS: The only open things are the class, var and function params; the rest should be done. If anyone could verify the stuff I wrote down, please do, since I might have missed some things.
El Muerte TDS: done, no more open rules
CaptainNuss: Greetings, just added the local keyword for variable declarations. Btw, why aren't the basic built-in variable types listed in this specification?
Mychaeel: "local" is covered by LOCALDECL already. In VARDECL it's a bug.
CaptainNuss: Oops, I'm sorry. Didn't see that.
El Muerte TDS: You're right about the basic types, added them now. Also, the function return type was incorrect (functions can also return arrays, etc.).
The reason why var and local are different is that inline enum and struct definitions are not allowed in local but are allowed in var.
Wormbo: Is there a (free) program that can check a source code file against an EBNF definition?
El Muerte TDS: Not that I know of. But there are programs that create a parser from an EBNF definition (it needs some changing, though): http://catalog.compilertools.net/lexparse.html and one not in that list, ANTLR.