
Racc grammar reference


This is a provisional specification.

global structure

blocks in file

There are two blocks at the top level: the 'class' block and the 'user code' block. The 'user code' block MUST be placed after the 'class' block.

comment

You can insert comments almost anywhere. Two comment styles can be used: Ruby style (#.....) and C style (/*......*/).

class block

A class block looks like this:


    class class_name
      [precedence table]
      [token declarations]
      [expected number of S/R conflicts]
      [options]
      [semantic value conversion]
      [start rule]
    rule
      RULES
    end

class_name is the name of the parser class to be generated.

If class_name includes '::', Racc outputs module clauses. For example, "class M::C" generates the code below:


    module M
      class C
        :
      end
    end
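
As a rough illustration of that mapping, the following Ruby sketch builds the same nesting from a class name. `nest_class` is a hypothetical helper written for this example; it is not part of Racc and does not show how Racc itself generates code.

```ruby
# Hypothetical helper (not part of Racc): builds the module/class nesting
# that corresponds to a class name such as "M::C".
def nest_class(class_name, body = ':')
  *modules, klass = class_name.split('::')
  depth = modules.size
  lines = []
  # One "module" line per leading name component, indented by depth.
  modules.each_with_index { |m, d| lines << "#{'  ' * d}module #{m}" }
  lines << "#{'  ' * depth}class #{klass}"
  lines << "#{'  ' * (depth + 1)}#{body}"
  lines << "#{'  ' * depth}end"
  # Close the modules from the innermost outwards.
  (depth - 1).downto(0) { |d| lines << "#{'  ' * d}end" }
  lines.join("\n")
end

puts nest_class('M::C')
```

A name without '::' yields just the class clause, with no surrounding module.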

rule block

The 'rule' block describes the grammar that the parser understands. For example:


   (token): (token) (token) (token).... (action) ;

   (token): (token) (token) (token).... (action)
          | (token) (token) (token).... (action)
          | (token) (token) (token).... (action)

(action) is the code that is executed when its (token)s are matched. An (action) looks like this:

        { print val[0]
          puts val[1] }

In an (action), you cannot use '%' strings, here documents, or '%r' regexps. Actions can be omitted; when an action is omitted, '' (empty string) is used.

The return value of an action becomes the value of the left-hand side ($$). It is the value of result, or the value returned by a "return" statement.

Here is a sample of a whole 'rule' block:


    rule
      goal: definition rules source { result = val }

      definition: /* none */   { result = [] }
        | definition startdesig  { result[0] = val[1] }
        | definition
                 precrule   # this line continues from the line above
          {
            result[1] = val[1]
          }

      startdesig: START TOKEN

    end

You can use these special local variables in an action:

result ($$): the value of the left-hand side (lhs). The default value is val[0].
val ($1,$2,$3...): an array of the values of the right-hand side (rhs).
_values (...$-2,$-1,$0): the stack of values. DO NOT CHANGE this stack unless you know what you are doing.
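
To make the roles of these variables concrete, here is a small plain-Ruby sketch. It is a hand-written imitation for illustration only: the method names and the `reduce` driver are hypothetical, and the code does not claim to match what Racc actually generates.

```ruby
# Stand-in for:  pair: item ',' item { result = [val[0], val[2]] }
def pair_action(val, _values, result)
  result = [val[0], val[2]]
  result
end

# Stand-in for a rule whose action is omitted: result keeps its default.
def empty_action(_val, _values, result)
  result
end

# Hypothetical driver: seeds result with val[0], as described above,
# and passes the value stack through untouched.
def reduce(action, val, stack = [])
  send(action, val, stack, val[0])
end

p reduce(:pair_action, ['a', ',', 'b'])
p reduce(:empty_action, ['x', 'y'])
```

The first call returns the pair built from the first and third right-hand-side values; the second simply returns val[0].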

Operator precedence

This feature corresponds to the operator precedence declarations of yacc. To write this block:


    prechigh
      nonassoc '++'
      left     '*' '/'
      left     '+' '-'
      right    '='
    preclow

'right' corresponds to '%right' and 'left' to '%left'. This example puts 'prechigh' at the top and 'preclow' at the bottom, but the reverse form 'preclow...prechigh' is also accepted.

The equivalent of '%prec' is also available. The format is like this:
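
The effect of such a table on the resulting parse can be sketched in plain Ruby. Note that this uses precedence climbing, a different technique from the LALR tables Racc builds; it only illustrates how the levels and the 'left'/'right' associativity shape the tree. The `PREC` table and `parse_expr` are names invented for this example, and the 'nonassoc' entry is left out for brevity.

```ruby
# Binary operators from the table above; a higher number binds tighter.
PREC = {
  '*' => [2, :left],  '/' => [2, :left],
  '+' => [1, :left],  '-' => [1, :left],
  '=' => [0, :right]
}.freeze

# Turns a flat token array into a nested [op, lhs, rhs] tree.
# The token array is consumed as parsing proceeds.
def parse_expr(tokens, min_prec = 0)
  lhs = tokens.shift
  while (op = tokens.first) && PREC.key?(op) && PREC[op][0] >= min_prec
    prec, assoc = PREC[op]
    tokens.shift
    # Left-associative operators refuse an equal-precedence operator on
    # their right side; right-associative operators accept one.
    next_min = assoc == :left ? prec + 1 : prec
    rhs = parse_expr(tokens, next_min)
    lhs = [op, lhs, rhs]
  end
  lhs
end

p parse_expr(%w[1 - 2 - 3])
p parse_expr(%w[a = b = c])
p parse_expr(%w[1 + 2 * 3])
```

With a left-associative '-', the first expression groups as (1 - 2) - 3, while the right-associative '=' groups as a = (b = c).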


  prechigh
    nonassoc UMINUS
    left '*' '/'
    left '+' '-'
  preclow

  rule
    exp: exp '*' exp
       | exp '-' exp
       | '-' exp       = UMINUS   # this!!!
           :

expect

Racc supports bison's "expect" directive.

    # Example

    class MyParser
      expect 3
    rule
        :

This directive declares the "expected" number of shift/reduce conflicts. If the expected number equals the actual number, racc does not print a conflict warning. It is mainly intended for release versions of a parser.

Token Declaration

By explicitly declaring which names are tokens, you can avoid many pointless bugs. If a declared token does not appear in the grammar, or a token in the grammar is not declared, Racc prints a warning.
The declaration syntax is:


  token TOKEN_NAME AND_IS_THIS
        ALSO_THIS_IS AGAIN_AND_AGAIN THIS_IS_LAST
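
The kind of cross-check this enables can be sketched in a few lines of Ruby. This is only an illustration: `token_warnings` is a hypothetical function, and neither its logic nor its message wording is taken from Racc's implementation.

```ruby
# Hypothetical stand-in for the declared-versus-used token check.
def token_warnings(declared, used)
  warnings = []
  # Declared with "token" but never referenced by any rule.
  (declared - used).each { |t| warnings << "declared but not used: #{t}" }
  # Referenced by a rule but missing from the "token" declaration.
  (used - declared).each { |t| warnings << "used but not declared: #{t}" }
  warnings
end

puts token_warnings(%w[NUMBER IDENT], %w[NUMBER STRING])
```

A misspelled token name in a rule would show up in the second group, which is the sort of bug the declaration is meant to catch.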

Options

You can write options for the racc command in your racc file.


  options OPTION OPTION ...

The options currently available are:

omit_action_call: omit calls to empty actions.
result_var: use the variable "result".

These flags can also be given with a 'no_' prefix.

Convert Token Symbol

By default, token symbols are:

naked token string in the racc file (TOK, XFILE, this_is_token, ...)
    -> the symbol of it (:TOK, :XFILE, :this_is_token, ...)
quoted string (':', '.', '(', ...)
    -> the same string (':', '.', '(', ...)

You can change this with a "convert" block. For example:

    convert
      PLUS 'PlusClass'      # not use :PLUS but PlusClass
      MIN  'MinusClass'     # not use :MIN but MinusClass
    end

Almost any Ruby value can be used as a token symbol, except 'false' and 'nil'. These cause unexpected parse errors.

If you want to use a String as a token symbol, special care is required. For example:


    convert
      class '"cls"'            # in code, "cls"
      PLUS '"plus\n"'          # in code, "plus\n"
      MIN  "\"minus#{val}\""   # in code, \"minus#{val}\"
    end

This is not BUG, but FEATURE.
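
As a sketch of the default token symbols described in this section, the following Ruby tokenizer emits [symbol, value] pairs: a Symbol for a named token and the string itself for a quoted one. The `tokenize` method and the NUMBER token are invented for this example. The trailing [false, '$end'] pair follows the end-of-input convention of Racc's runtime, which is an assumption not stated in this section and which would explain why 'false' cannot serve as an ordinary token symbol.

```ruby
require 'strscan'

# Hypothetical tokenizer: produces [token_symbol, value] pairs.
def tokenize(src)
  ss = StringScanner.new(src)
  tokens = []
  until ss.eos?
    if ss.skip(/\s+/)
      next                              # ignore whitespace
    elsif (text = ss.scan(/\d+/))
      tokens << [:NUMBER, text.to_i]    # NUMBER in the grammar <-> :NUMBER
    elsif (text = ss.scan(%r{[-+*/()]}))
      tokens << [text, text]            # '+' in the grammar <-> "+"
    else
      raise ArgumentError, "unexpected character: #{ss.getch.inspect}"
    end
  end
  tokens << [false, '$end']             # assumed end-of-input marker
end

p tokenize('1 + 20')
```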

Start Rule

Equivalent to '%start' in yacc. This changes the start rule.


      start real_target

I doubt this statement will ever see much use.

User Code

A 'user code' block is Ruby source code that is copied to the output. The racc command uses three special user code blocks: 'header', 'inner', and 'footer'.
Until version 0.10.2, they were called prepare/inner/driver. The old names still work, but they will be removed in the future.

The format of user code is like this:


---- header
  ruby statement
  ruby statement
  ruby statement

---- inner
  ruby statement
     :

If a line begins with four '-' characters, racc treats it as the beginning of a user code block. The name of a user code block must be a single word.

You can include other files as user code like this:


---- footer = init.rb run.rb

print 'this code is executed too.'

This statement makes racc use "init.rb" and "run.rb" as the 'footer' user code.
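
To picture where the three blocks end up, here is a hand-written Ruby sketch of an output file. It assumes the usual Racc layout, in which 'header' is placed before the class definition, 'inner' inside it, and 'footer' after it; that placement is not stated in the text above. `SketchParser` and `words` are hypothetical names, and a plain class stands in for the generated parser so the sketch runs without the racc runtime.

```ruby
# ---- header: assumed to be copied ahead of the class definition
require 'strscan'

# A real generated parser would be produced by racc; this plain class
# only marks where the 'inner' code would sit.
class SketchParser
  # ---- inner: assumed to be copied into the class body
  def words(str)
    ss = StringScanner.new(str)
    found = []
    until ss.eos?
      word = ss.scan(/\w+/)
      if word
        found << word
      else
        ss.getch                  # skip one non-word character
      end
    end
    found
  end
end

# ---- footer: assumed to be copied after the class definition
p SketchParser.new.words('a b, c')
```

Under that assumption, requires fit naturally in 'header', helper methods such as a tokenizer in 'inner', and code that drives the parser in 'footer'.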