class CodeRay::Tokens

Tokens TODO: Rewrite!

The Tokens class represents a list of tokens returned from a Scanner.

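A token is not a special object, just a two-element Array consisting of the token text (a String) or a token action (a Symbol such as :begin_group), followed by the token kind (a Symbol).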

A token looks like this:

['# It looks like this', :comment]
['3.1415926', :float]
['$^', :error]

Some scanners also yield sub-tokens, represented by special token actions, namely :begin_group and :end_group.

The Ruby scanner, for example, splits "a string" into:

 [:begin_group, :string],
 ['"', :delimiter],
 ['a string', :content],
 ['"', :delimiter],
 [:end_group, :string]
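To illustrate how the group actions bracket their sub-tokens, here is a minimal sketch that walks a list of pairs like the one above and tracks how deeply groups nest. The helper name max_group_depth is hypothetical and not part of CodeRay.

```ruby
# Hypothetical helper, not part of CodeRay: reports how deeply groups nest
# in a list of [text_or_action, kind] pairs.
def max_group_depth(pairs)
  depth = 0
  max = 0
  pairs.each do |text_or_action, _kind|
    case text_or_action
    when :begin_group
      depth += 1
      max = depth if depth > max
    when :end_group
      depth -= 1
    end
  end
  raise ArgumentError, 'unbalanced groups' unless depth.zero?
  max
end

string_tokens = [
  [:begin_group, :string],
  ['"', :delimiter],
  ['a string', :content],
  ['"', :delimiter],
  [:end_group, :string],
]

puts max_group_depth(string_tokens)  # prints 1
```

Every :begin_group must be matched by an :end_group of the same kind, which is why the sketch raises on unbalanced input.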

Tokens is the interface between Scanners and Encoders: The input is split and saved into a Tokens object. The Encoder then builds the output from this object.

Thus, the syntax below becomes clear:

CodeRay.scan('price = 2.59', :ruby).html
# the Tokens object is here -------^

See how small it is? ;)

Tokens gives you the power to handle pre-scanned code very easily: You can convert it to a webpage, a YAML file, or dump it into a gzip'ed string that you put in your DB.

It also allows you to generate tokens directly (without using a scanner), to load them from a file, and still use any Encoder that CodeRay provides.



Attributes

scanner

The Scanner instance that created the tokens.

Public Class Methods

Unzip the dump using CodeRay::GZip.gunzip, then undump the object using Marshal.load.

The result is commonly a Tokens object, but this is not guaranteed.

def Tokens.load dump
  dump = GZip.gunzip dump    # decompress first...
  @dump = Marshal.load dump  # ...then rebuild the object
end

Public Instance Methods

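def begin_group kind; push :begin_group, kind end
def begin_line kind; push :begin_line, kind end

Return the actual number of tokens. Each token occupies two consecutive elements (text or action, then kind), so this is half of size.

def count
  size / 2
end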
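The relation between size and count can be seen in a small sketch. FlatTokens is a hypothetical stand-in written for this example, not the real CodeRay::Tokens class; it only assumes that text and kind are pushed as consecutive elements, as the methods above suggest.

```ruby
# Stand-in for illustration only; the real class is CodeRay::Tokens.
class FlatTokens < Array
  def text_token(text, kind)
    push text, kind
  end

  def begin_group(kind)
    push :begin_group, kind
  end

  def end_group(kind)
    push :end_group, kind
  end

  # Two array elements make up one token.
  def count
    size / 2
  end
end

tokens = FlatTokens.new
tokens.begin_group :string
tokens.text_token '"', :delimiter
tokens.text_token 'a string', :content
tokens.text_token '"', :delimiter
tokens.end_group :string

puts tokens.size   # prints 10
puts tokens.count  # prints 5
```

Note that group actions count as tokens too, so the five-token string example yields a count of 5.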

Dumps the object into a String that can be saved in files or databases.

The dump is created with Marshal.dump; in addition, it is gzipped using CodeRay::GZip.gzip.

The returned String object is extended with Undumping, so it has an undump method. See ::load.

You can configure the level of compression, but the default value 7 should be what you want in most cases as it is a good compromise between speed and compression rate.

See GZip module.

def dump gzip_level = 7
  dump = Marshal.dump self
  dump = GZip.gzip dump, gzip_level
  dump.extend Undumping
end
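The dump and load round trip can be sketched with the standard library alone. The gzip and gunzip helpers below are stand-ins for CodeRay::GZip, assumed here to be thin wrappers around Zlib's deflate and inflate; the plain Array takes the place of a Tokens object.

```ruby
require 'zlib'

# Stand-ins for CodeRay::GZip.gzip / gunzip (an assumption for this sketch).
def gzip(string, level = 7)
  Zlib::Deflate.deflate(string, level)
end

def gunzip(string)
  Zlib::Inflate.inflate(string)
end

tokens = ['price', :ident, ' ', :space, '=', :operator, ' ', :space, '2.59', :float]

dumped   = gzip(Marshal.dump(tokens))    # what dump does, minus Undumping
restored = Marshal.load(gunzip(dumped))  # what Tokens.load does

puts restored == tokens  # prints true
puts dumped.class        # prints String
```

As with any Marshal data, a dump should only be loaded if it comes from a trusted source.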

Encode the tokens using encoder.

encoder can be

  • a symbol like :html or :statistic

  • an Encoder class

  • an Encoder object

options are passed to the encoder.

def encode encoder, options = {}
  encoder = Encoders[encoder].new options if encoder.respond_to? :to_sym
  encoder.encode_tokens self, options
end
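The three accepted forms of encoder can be illustrated with a self-contained sketch. UpcaseEncoder, ENCODERS and resolve_encoder are hypothetical names invented for this example. The sketch also instantiates a class explicitly, which is an assumption; the source shown above only does so for symbols.

```ruby
# Hypothetical encoder and registry, for illustration only.
class UpcaseEncoder
  def initialize(options = {})
    @options = options
  end

  # Joins the upcased text of all tokens, ignoring their kinds.
  def encode_tokens(tokens, options = {})
    tokens.each_slice(2).map { |text, _kind| text.to_s.upcase }.join
  end
end

ENCODERS = { upcase: UpcaseEncoder }

# Turns a symbol, an encoder class, or an encoder object into an object.
def resolve_encoder(encoder, options = {})
  encoder = ENCODERS.fetch(encoder.to_sym) if encoder.respond_to?(:to_sym)
  encoder = encoder.new(options) if encoder.is_a?(Class)
  encoder
end

def encode(tokens, encoder, options = {})
  resolve_encoder(encoder, options).encode_tokens(tokens, options)
end

tokens = ['price', :ident, ' ', :space, '=', :operator]

puts encode(tokens, :upcase)            # prints PRICE =
puts encode(tokens, UpcaseEncoder)      # prints PRICE =
puts encode(tokens, UpcaseEncoder.new)  # prints PRICE =
```

Checking respond_to?(:to_sym) rather than is_a?(Symbol) lets a String such as 'upcase' name an encoder as well.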
def end_group kind; push :end_group, kind end
def end_line kind; push :end_line, kind end

Redirects unknown methods to encoder calls.

For example, if you call tokens.html, the HTML encoder is used to highlight the tokens.

def method_missing meth, options = {}
  encode meth, options
rescue PluginHost::PluginNotFound
  super  # no encoder with this name: fall back to the default NoMethodError
end
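The redirection technique can be sketched without CodeRay. MiniTokens, its ENCODERS registry and EncoderNotFound are hypothetical stand-ins; the point is only to show method_missing trying the method name as an encoder name and falling back to the normal error otherwise.

```ruby
# Hypothetical stand-in, not CodeRay's implementation.
class MiniTokens < Array
  class EncoderNotFound < StandardError; end

  # Hypothetical registry: encoder name => callable taking the tokens.
  ENCODERS = {
    text:  ->(tokens) { tokens.each_slice(2).map { |text, _kind| text }.join },
    kinds: ->(tokens) { tokens.each_slice(2).map { |_text, kind| kind } },
  }

  def encode(name)
    encoder = ENCODERS.fetch(name) { raise EncoderNotFound, name.to_s }
    encoder.call(self)
  end

  # Unknown methods are tried as encoder names first.
  def method_missing(meth, *args)
    encode(meth)
  rescue EncoderNotFound
    super  # no such encoder: raise the usual NoMethodError
  end

  # Keeps respond_to? consistent with method_missing.
  def respond_to_missing?(meth, include_private = false)
    ENCODERS.key?(meth) || super
  end
end

tokens = MiniTokens['x', :ident, '=', :operator, '1', :integer]

puts tokens.text           # prints x=1
puts tokens.kinds.inspect  # prints [:ident, :operator, :integer]
```

A name that the class already defines never reaches method_missing, so an encoder can only be called this way if no regular method shares its name.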

Split the tokens into parts of the given sizes.

The result will be an Array of Tokens objects. Each part holds the amount of text given by the corresponding size. In addition, each part closes all open groups and lines, and the next part opens them again. This is useful to insert tokens between them.

This method is used by Scanner#tokenize when called with an Array of source strings. The Diff encoder uses it for inline highlighting.

def split_into_parts *sizes
  parts = []
  opened = []
  content = nil
  part = Tokens.new
  part_size = 0
  size = sizes.first
  i = 0
  for item in self
    case content
    when nil
      content = item
    when String
      if size && part_size + content.size > size  # token must be cut
        if part_size < size  # some part of the token goes into this part
          content = content.dup  # content may not be safe to change
          part << content.slice!(0, size - part_size) << item
        end
        # close all open groups and lines...
        closing = opened.reverse.flatten.map do |content_or_kind|
          case content_or_kind
          when :begin_group
            :end_group
          when :begin_line
            :end_line
          else
            content_or_kind
          end
        end
        part.concat closing
        begin
          parts << part
          part = Tokens.new
          size = sizes[i += 1]
        end until size.nil? || size > 0
        # ...and open them again.
        part.concat opened.flatten
        part_size = 0
        redo unless content.empty?
      else
        part << content << item
        part_size += content.size
      end
      content = nil
    when Symbol
      case content
      when :begin_group, :begin_line
        opened << [content, item]
      when :end_group, :end_line
        opened.pop
      else
        raise ArgumentError, 'Unknown token action: %p, kind = %p' % [content, item]
      end
      part << content << item
      content = nil
    else
      raise ArgumentError, 'Token input junk: %p, kind = %p' % [content, item]
    end
  end
  parts << part
  parts << Tokens.new while parts.size < sizes.size
  parts
end
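The cutting logic is easier to follow in a reduced form. The sketch below is a simplified, hypothetical helper (split_text_tokens) that handles text tokens only: unlike split_into_parts, it does not close and reopen groups or lines, and it works on a plain flat Array.

```ruby
# Simplified sketch: splits a flat [text, kind, text, kind, ...] array into
# parts holding the given amounts of text. Groups and lines are not handled.
def split_text_tokens(flat, *sizes)
  parts = []
  part = []
  part_size = 0
  i = 0
  size = sizes.first
  flat.each_slice(2) do |text, kind|
    text = text.dup  # the caller's string must not be modified
    # Cut the token as long as it does not fit into the current part.
    while size && part_size + text.size > size
      room = size - part_size
      part << text.slice!(0, room) << kind if room > 0
      parts << part
      part = []
      part_size = 0
      size = sizes[i += 1]
    end
    unless text.empty?
      part << text << kind
      part_size += text.size
    end
  end
  parts << part
  parts << [] while parts.size < sizes.size  # pad with empty parts
  parts
end

parts = split_text_tokens(['foo', :ident, 'bar', :ident], 4, 2)
p parts  # prints [["foo", :ident, "b", :ident], ["ar", :ident]]
```

The token 'bar' straddles the boundary after four characters, so it is cut into 'b' and 'ar', and both pieces keep the kind :ident.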
to_s()

Turn tokens into a string by concatenating them.

def to_s