class CodeRay::Tokens

Tokens TODO: Rewrite!

The Tokens class represents a list of tokens returned by a Scanner.

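A token is not a special object, just a two-element Array consisting of

  • the token text (the original source of the token, as a String) or a token action (begin_group, end_group, begin_line, end_line)

  • the token kind (a Symbol representing the type of the token)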

A token looks like this:

['# It looks like this', :comment]
['3.1415926', :float]
['$^', :error]

Some scanners also yield sub-tokens, represented by special token actions, namely begin_group and end_group.

The Ruby scanner, for example, splits "a string" into:

 [
  [:begin_group, :string],
  ['"', :delimiter],
  ['a string', :content],
  ['"', :delimiter],
  [:end_group, :string]
 ]

Tokens is the interface between Scanners and Encoders: The input is split and saved into a Tokens object. The Encoder then builds the output from this object.

Thus, the syntax below becomes clear:

CodeRay.scan('price = 2.59', :ruby).html
# the Tokens object is here -------^

See how small it is? ;)

Tokens gives you the power to handle pre-scanned code very easily: You can convert it to a webpage, a YAML file, or dump it into a gzip'ed string that you put in your DB.

It also allows you to generate tokens directly (without using a scanner), to load them from a file, and still use any Encoder that CodeRay provides.
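
As a rough illustration of building tokens by hand and handing them to an encoder, here is a minimal sketch. It uses the two-element token view described above. ToySpanEncoder is a hypothetical stand-in written for this example and is not one of CodeRay's encoders; the only thing it shares with them is the encode_tokens entry point.

```ruby
# Hand-built token list, no scanner involved. Each token is a
# two-element Array: text (or action), then kind.
tokens = [
  [:begin_group, :string],
  ['"', :delimiter],
  ['a < b', :content],
  ['"', :delimiter],
  [:end_group, :string],
  [' ', :space],
  ['42', :integer]
]

# Hypothetical toy encoder (not part of CodeRay). It understands text
# tokens and the begin_group / end_group actions only.
class ToySpanEncoder
  ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

  def encode_tokens(tokens, _options = {})
    out = +''
    tokens.each do |text_or_action, kind|
      case text_or_action
      when String
        out << %(<span class="#{kind}">#{escape(text_or_action)}</span>)
      when :begin_group
        out << %(<span class="#{kind}">)
      when :end_group
        out << '</span>'
      else
        raise ArgumentError, "unsupported token action: #{text_or_action.inspect}"
      end
    end
    out
  end

  private

  # Replace the four HTML-sensitive characters with entities.
  def escape(text)
    text.gsub(/[&<>"]/, ESCAPES)
  end
end

html = ToySpanEncoder.new.encode_tokens(tokens)
puts html
```

The encoder never needs to know where the tokens came from, which is the point of using Tokens as the interface.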



Attributes

scanner [RW]

The Scanner instance that created the tokens.

Public Class Methods

load(dump)

Unzip the dump using CodeRay::GZip.gunzip, then undump the object using Marshal.load.

The result is commonly a Tokens object, but this is not guaranteed.

# File lib/coderay/tokens.rb, line 201
def Tokens.load dump
  dump = GZip.gunzip dump
  @dump = Marshal.load dump
end

Public Instance Methods

begin_group(kind)

# File lib/coderay/tokens.rb, line 207
def begin_group kind; push :begin_group, kind end

begin_line(kind)

# File lib/coderay/tokens.rb, line 209
def begin_line kind; push :begin_line, kind end

count()

Return the actual number of tokens.

# File lib/coderay/tokens.rb, line 181
def count
  size / 2
end
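
The division by two follows from how the methods on this page store tokens: begin_group and its siblings push two elements per token, so text (or action) and kind sit next to each other in one flat Array. A minimal sketch of that layout, using a hypothetical FlatTokens class rather than CodeRay itself:

```ruby
# Hypothetical stand-in for CodeRay::Tokens: a plain Array subclass that
# stores every token as two consecutive elements (text or action, then kind).
class FlatTokens < Array
  def begin_group(kind)
    push :begin_group, kind
  end

  def end_group(kind)
    push :end_group, kind
  end

  # Each token occupies two slots, so halve the Array size.
  def count
    size / 2
  end
end

tokens = FlatTokens.new
tokens.begin_group :string
tokens.push '"', :delimiter
tokens.push 'a string', :content
tokens.push '"', :delimiter
tokens.end_group :string

puts tokens.size   # number of Array elements
puts tokens.count  # number of tokens

# Walk the flat list pairwise to recover the two-element token view.
pairs = tokens.each_slice(2).to_a
pairs.each { |pair| p pair }
```
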
dump(gzip_level = 7)

Dumps the object into a String that can be saved in files or databases.

The dump is created with Marshal.dump; in addition, it is gzipped using CodeRay::GZip.gzip.

The returned String is extended with the Undumping module, so it has an undump method. See ::load.

You can configure the level of compression, but the default value 7 should be what you want in most cases as it is a good compromise between speed and compression rate.

See GZip module.

# File lib/coderay/tokens.rb, line 174
def dump gzip_level = 7
  dump = Marshal.dump self
  dump = GZip.gzip dump, gzip_level
  dump.extend Undumping
end
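
The dump and load round trip can be sketched with the standard library alone. In this sketch, Zlib::Deflate and Zlib::Inflate are assumed as stand-ins for CodeRay::GZip.gzip and CodeRay::GZip.gunzip, and DumpableTokens and UndumpingSketch are hypothetical names used only for illustration.

```ruby
require 'zlib'

# Stand-in for the Undumping module: gives a dumped String an #undump method.
module UndumpingSketch
  def undump
    DumpableTokens.load self
  end
end

# Hypothetical miniature of the Tokens class, for illustration only.
class DumpableTokens < Array
  # Marshal first, then compress.
  def dump(gzip_level = 7)
    data = Marshal.dump self
    data = Zlib::Deflate.deflate data, gzip_level
    data.extend UndumpingSketch
  end

  # Reverse order: decompress first, then unmarshal.
  def self.load(dump)
    Marshal.load Zlib::Inflate.inflate(dump)
  end
end

tokens = DumpableTokens.new
tokens.push 'price', :ident, ' ', :space, '=', :operator

dumped = tokens.dump
restored = dumped.undump

puts restored == tokens
puts restored.class
```

Marshal.load can instantiate arbitrary objects, so a dump should only be loaded when it comes from a trusted source.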
encode(encoder, options = {})

Encode the tokens using encoder.

encoder can be

  • a symbol like :html or :statistic

  • an Encoder class

  • an Encoder object

options are passed to the encoder.

# File lib/coderay/tokens.rb, line 66
def encode encoder, options = {}
  encoder = Encoders[encoder].new options if encoder.respond_to? :to_sym
  encoder.encode_tokens self, options
end
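
The dispatch on the encoder argument can be sketched as follows. ENCODER_REGISTRY, UpcaseTextEncoder and EncodableTokens are hypothetical names invented for this example; the registry Hash only plays the role of the Encoders lookup and does not reproduce its behavior.

```ruby
# Hypothetical encoder, for illustration only.
class UpcaseTextEncoder
  def initialize(options = {})
    @options = options
  end

  # Concatenate the text tokens, upcased; token actions are skipped.
  def encode_tokens(tokens, options = {})
    separator = options.fetch(:separator, '')
    texts = tokens.each_slice(2).map { |text, _kind| text }.grep(String)
    texts.map(&:upcase).join(separator)
  end
end

# Hypothetical registry standing in for the Encoders lookup.
ENCODER_REGISTRY = { upcase: UpcaseTextEncoder }.freeze

class EncodableTokens < Array
  def encode(encoder, options = {})
    # Anything that responds to to_sym is looked up and instantiated;
    # everything else is assumed to be a ready-made encoder object.
    encoder = ENCODER_REGISTRY.fetch(encoder.to_sym).new(options) if encoder.respond_to?(:to_sym)
    encoder.encode_tokens self, options
  end
end

tokens = EncodableTokens.new
tokens.push 'price', :ident, ' ', :space, '=', :operator

by_symbol = tokens.encode(:upcase)
by_object = tokens.encode(UpcaseTextEncoder.new, separator: '|')

puts by_symbol
puts by_object
```
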
end_group(kind)

# File lib/coderay/tokens.rb, line 208
def end_group kind; push :end_group, kind end

end_line(kind)

# File lib/coderay/tokens.rb, line 210
def end_line kind; push :end_line, kind end

method_missing(meth, options = {})

Redirects unknown methods to encoder calls.

For example, if you call tokens.html, the HTML encoder is used to highlight the tokens.

# File lib/coderay/tokens.rb, line 80
def method_missing meth, options = {}
  encode meth, options
rescue PluginHost::PluginNotFound
  super
end
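
The redirect pattern can be sketched in isolation. In this sketch the FORMATS Hash is a hypothetical registry, and KeyError plays the role that PluginHost::PluginNotFound plays in the listing above. The respond_to_missing? method is an addition of this sketch, not something shown on this page.

```ruby
# Hypothetical registry of output formats; a missing key raises KeyError.
FORMATS = {
  text: ->(tokens, _options) { tokens.each_slice(2).map(&:first).grep(String).join }
}.freeze

class RedirectingTokens < Array
  def encode(format, options = {})
    FORMATS.fetch(format).call(self, options)
  end

  # Unknown methods are tried as format names first.
  def method_missing(meth, options = {})
    encode meth, options
  rescue KeyError
    super # not a known format: fall back to the usual NoMethodError
  end

  # Keep respond_to? consistent with the redirect.
  def respond_to_missing?(meth, include_private = false)
    FORMATS.key?(meth) || super
  end
end

tokens = RedirectingTokens.new
tokens.push 'price', :ident, ' = ', :operator, '2.59', :float

text = tokens.text
puts text

begin
  tokens.no_such_format
rescue NoMethodError => e
  puts "NoMethodError: #{e.name}"
end
```
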
split_into_parts(*sizes)

Split the tokens into parts of the given sizes.

The result will be an Array of Tokens objects. Each part has the text size given by the corresponding entry in sizes. In addition, each part closes all groups and lines that are still open at its end, and the following part opens them again. This is useful for inserting tokens between the parts.

This method is used by Scanner#tokenize when called with an Array of source strings. The Diff encoder uses it for inline highlighting.

# File lib/coderay/tokens.rb, line 95
def split_into_parts *sizes
  parts = []
  opened = []
  content = nil
  part = Tokens.new
  part_size = 0
  size = sizes.first
  i = 0
  for item in self
    case content
    when nil
      content = item
    when String
      if size && part_size + content.size > size  # token must be cut
        if part_size < size  # some part of the token goes into this part
          content = content.dup  # content may not be safe to change
          part << content.slice!(0, size - part_size) << item
        end
        # close all open groups and lines...
        closing = opened.reverse.flatten.map do |content_or_kind|
          case content_or_kind
          when :begin_group
            :end_group
          when :begin_line
            :end_line
          else
            content_or_kind
          end
        end
        part.concat closing
        begin
          parts << part
          part = Tokens.new
          size = sizes[i += 1]
        end until size.nil? || size > 0
        # ...and open them again.
        part.concat opened.flatten
        part_size = 0
        redo unless content.empty?
      else
        part << content << item
        part_size += content.size
      end
      content = nil
    when Symbol
      case content
      when :begin_group, :begin_line
        opened << [content, item]
      when :end_group, :end_line
        opened.pop
      else
        raise ArgumentError, 'Unknown token action: %p, kind = %p' % [content, item]
      end
      part << content << item
      content = nil
    else
      raise ArgumentError, 'Token input junk: %p, kind = %p' % [content, item]
    end
  end
  parts << part
  parts << Tokens.new while parts.size < sizes.size
  parts
end
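
The close-and-reopen step at a cut point can be looked at on its own. The sketch below assumes that the open groups are walked innermost first when closing, which is what closing "all opened tokens" requires for correct nesting. The sample kinds :insert and :string are illustrative values only.

```ruby
# Groups and lines that are still open at the cut, outermost first,
# each stored as an [action, kind] pair.
opened = [[:begin_line, :insert], [:begin_group, :string]]

# Close them innermost first by swapping each begin action for its end action.
closing = opened.reverse.flatten.map do |action_or_kind|
  case action_or_kind
  when :begin_group then :end_group
  when :begin_line  then :end_line
  else action_or_kind
  end
end

# The next part starts by reopening the same groups, outermost first.
reopening = opened.flatten

p closing
p reopening
```

Appending closing to the finished part and prepending reopening to the next one keeps every part balanced, so each part can be encoded independently.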
to_s()

Turn tokens into a string by concatenating them.

# File lib/coderay/tokens.rb, line 72
def to_s
  encode CodeRay::Encoders::Encoder.new
end