II. A Ruby Tutorial

Benjamin Anderson edited this page Oct 26, 2017 · 1 revision

Everything in Ruby is an object, including Strings, Integers, Booleans, and Floats. This means that they can have methods!

1.hour.to_i # => 3600 (Integer#hour comes from ActiveSupport, not core Ruby)
"foobar".upcase # => "FOOBAR"
Math::PI.to_r # => (884279719003555/281474976710656)

Since everything is an object, we only deal in classes and methods. This doesn't work out to be nearly as horrible as it is in, say, Java, though, because Ruby is full of delicious syntactic sugar which makes working with objects pleasant!

Ruby's object model makes sense when you think of "receivers" receiving "messages" rather than objects with methods (it inherits this from Smalltalk). "foobar".upcase means "given the string "foobar", send it the message upcase". In Ruby parlance, the string "foobar" is the "receiver", and "upcase" is the message sent.

This is important because self is a central concept in Ruby. When a receiver is not specified, self is implicitly the receiver.
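To make the receiver/message idea concrete, here is a minimal sketch. `Greeter` is a hypothetical class invented for illustration; `Object#public_send` is the standard way to send a message by name.

```ruby
# Greeter is a hypothetical class used only for illustration.
class Greeter
  def name
    "Chris"
  end

  def greeting
    # No receiver is given, so `name` is sent to self implicitly.
    "Hello, #{name}!"
  end

  def explicit_greeting
    # The same call with the receiver spelled out.
    "Hello, #{self.name}!"
  end
end

# These two lines send the same message to the same kind of receiver.
upcased_by_call = "foobar".upcase
upcased_by_send = "foobar".public_send(:upcase)

greeter = Greeter.new
implicit = greeter.greeting
explicit = greeter.explicit_greeting
```

Both greeting methods return the same string, because the only difference is whether self is written out.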

Additionally, it is important to note that in Ruby, the following rules are enforced by the language:

  • Constants always start with an uppercase letter. String, Math, and ENV are all examples of constants (a Class, a Module, and a Hash-like Object, respectively). You can actually reassign a constant after assignment, but you shouldn't, and Ruby will print a warning if you do. Classes and Modules are two of the most obvious examples of constants in Ruby.
  • Local variables always start with a lowercase letter or an underscore. Local variables are visible within the method scope they are defined in.
  • Global variables begin with a $. $stderr is a good example (a standard global variable which is the STDERR IO stream). Generally, you should avoid globals - you should not really ever need them.
  • Instance variables begin with an @. An instance variable is visible, as one would expect, within an instance of a class. You can think of instance variables as class instance properties, except they aren't directly visible outside of the class.
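The naming rules above can be sketched in a few lines. All names here (`MAX_RETRIES`, `$request_count`, `Counter`) are hypothetical and exist only to show each prefix in use.

```ruby
# Hypothetical names throughout; only the naming rules come from the text.
MAX_RETRIES = 3          # constant: starts with an uppercase letter
$request_count = 0       # global: starts with $ (shown only for illustration)

class Counter
  def initialize
    @count = 0           # instance variable: starts with @
  end

  def increment
    step = 1             # local variable: visible only inside this method
    @count += step
  end
end

counter = Counter.new
counter.increment
counter.increment

# @count is not reachable as counter.count because no getter was defined.
has_getter = counter.respond_to?(:count)

# Reflection can still read it, which is mostly useful when debugging.
stored = counter.instance_variable_get(:@count)
```

Here `has_getter` is false and `stored` is 2: the state exists inside the object but is not part of its public interface.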

Let's look at the structure of a Ruby class, and some common method forms:

class MyClass
  # `initialize` is the name of the constructor in Ruby.
  # You don't have to specify it unless your class needs
  # to do some setup.
  def initialize(name)
    # Instance variables are prefixed with the @ sigil. This is
    # equivalent to self.name = name in Python, but in Ruby,
    # self.name = name would send the message ["name=", "name"]
    # to the receiver self, rather than creating a property
    # called `name`.
    #
    # Instance variables (or ivars) are used to store state
    # internal to an object. We will then expose those ivars
    # through getters and setters
    @name = name
  end

  # In Ruby, the value of the last statement of any given scope
  # is the return value of that scope. In this case, `@name` being
  # the last statement means that the method #name returns the value
  # of @name when it is invoked. It is conventional to not use
  # return unless necessary.
  #
  # Also note the lack of empty argument parentheses. Ruby does not
  # require parentheses for argument lists. In fact, `def name arg1,
  # arg2` is valid Ruby, but it is conventionally frowned upon. Use
  # whichever construct makes the code most easily read.
  def name
    @name
  end

  # This is a setter, but there's no magic in it - Ruby can use
  # punctuation in method names. As alluded to above,
  # `self.name = "Chris"` just sends the message
  # `["name=", "Chris"]` to `self`. This means that we can
  # define setter behavior without having to use Javaisms like
  # `def setName(String name)`
  def name=(name)
    @name = name
  end

  # Method arguments can be given default values.
  def generate_name(length = 10)
    # Range has no #sample, so convert it to an Array first.
    @name = length.times.map { ("a".."z").to_a.sample }.join
  end

  # As of Ruby 2.0, methods can also take keyword arguments.
  # Here we could call `construct_name(first_name: "Chris")`,
  # permitting us to ignore the ordering of parameters, much
  # like Python.
  #
  # construct_name first_name: "Chris" # => "Chris Default"
  def construct_name(last_name: "Default", first_name: "Name")
    @name = format("%s %s", first_name, last_name)
  end

  # Prior to Ruby 2.0, it was conventional to provide a pseudo-named
  # args interface by passing a hash as the last argument of the
  # method. When a Hash is the last argument, braces are unnecessary.
  #
  # self.construct_title "Punologist", first_name: "Chris", last_name: "Heald"
  #
  # this is equivalent to:
  #
  #  self.construct_title("Punologist", {first_name: "Chris", last_name: "Heald"})
  def construct_title(title, name = {})
    @name = format("%s %s: %s", name[:first_name], name[:last_name], title)
  end

  # Punctuation: as mentioned before, Ruby accepts punctuation as
  # a valid part of method names. The most common suffixes
  # are ? and !. These don't do anything special,
  # but they are conventionally used to indicate that a method is a
  # predicate method (that is, that it answers a question and
  # returns a boolean, generally without side effects) and that
  # a method is mutative, respectively.
  def name?
    !@name.nil?
  end

  def upcase_name!
    @name.upcase!
  end

  # Prefixing the method name with `self` indicates that this is a
  # class method. It is only available with the class itself as the
  # receiver, rather than instances of the class.
  def self.say_hi
    puts "Hi!"
  end
end
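As a usage sketch of the getter, setter, and keyword-argument forms above, the following uses a trimmed-down hypothetical `Person` class (not part of the original page) so that it runs on its own.

```ruby
# Person is a hypothetical, trimmed-down class for illustration.
class Person
  def initialize(name)
    @name = name
  end

  def name
    @name
  end

  def name=(name)
    @name = name
  end

  def construct_name(last_name: "Default", first_name: "Name")
    @name = format("%s %s", first_name, last_name)
  end

  def name?
    !@name.nil?
  end
end

person = Person.new("Chris")
person.name = "Heald"                 # sugar for the line below
person.public_send(:name=, "Chris")   # the same message, sent explicitly
before = person.name

# Keyword arguments can be given in any order, or omitted entirely.
person.construct_name(first_name: "Chris")
after = person.name
has_name = person.name?
```

After this runs, `before` is "Chris" and `after` is "Chris Default", since `last_name` falls back to its default value.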

Strings & Symbols

Strings have few surprises. String literals in Ruby are UTF-8 encoded by default (since Ruby 2.0), unless you reinterpret them with a different encoding. Ruby distinguishes between single- and double-quoted strings only in that single-quoted strings are not processed for interpolation or for escape sequences such as \n. Otherwise, they are equivalent.

For double-quoted strings, you can interpolate values into the string by using #{}. For example:

first_name = "Chris"
last_name = "Heald"
full_name = "#{first_name} #{last_name}"
# => "Chris Heald"

This doesn't work if you use single-quoted strings:

full_name = '#{first_name} #{last_name}'
# => "\#{first_name} \#{last_name}"

You can also format strings as you might expect from Python:

full_name = "%s %s" % [first_name, last_name]
# => "Chris Heald"

full_name = format("%s %s", first_name, last_name)
# => "Chris Heald"

Symbols are self-referential interned strings. Fundamentally, this means that only one instance of a symbol ever exists globally, which makes it a memory and allocation-efficient way to reuse common values (such as hash keys).
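A small sketch of what "only one instance" means in practice. `String.new` is used here so that the result does not depend on whether frozen string literals are enabled.

```ruby
# Every occurrence of the same symbol literal is the same object.
same_symbol = :status.equal?(:status)

# Two separately built strings with equal contents are distinct objects.
first  = String.new("status")
second = String.new("status")
same_string  = first.equal?(second)   # identity comparison
equal_string = first == second        # content comparison
```

`same_symbol` is true, while `same_string` is false even though `equal_string` is true.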

Conventionally, strings and symbols are used interchangeably in many contexts, but they are not equivalent values. You can convert between strings and symbols with the methods Symbol#to_s and String#to_sym:

"foobar".to_sym
# => :foobar 
:foobar.to_s
# => "foobar" 
"foobar" == :foobar
# => false

This can cause some confusion when used with hashes, because a string key and a symbol key with the same name are different keys:

hash = {:foo => "bar", "baz" => "bang"}
hash[:foo]  # => "bar"
hash["foo"] # => nil
hash[:baz]  # => nil
hash["baz"] # => "bang"

We can get around this a few ways. ActiveSupport (a common library which extends many core classes with useful functionality) provides:

  • Hash#stringify_keys
  • Hash#symbolize_keys
  • Hash#with_indifferent_access

These do just what it says on the tin: convert all keys to strings, convert all keys to symbols, or convert the hash to a HashWithIndifferentAccess, which will respond with the same value for a string or symbol variant of the same key.
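If ActiveSupport is not available, the first two conversions can be approximated in plain Ruby. This sketch assumes Ruby 2.5 or later, where `Hash#transform_keys` exists; it is an alternative to the ActiveSupport methods listed above, not the same API, and it has no equivalent of HashWithIndifferentAccess.

```ruby
hash = { :foo => "bar", "baz" => "bang" }

# Hash#transform_keys returns a new hash; the original is left untouched.
string_keys = hash.transform_keys(&:to_s)
symbol_keys = hash.transform_keys(&:to_sym)

from_string = string_keys["foo"]
from_symbol = symbol_keys[:baz]
```

After conversion, every key can be looked up in a single, predictable form.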

Blocks

Blocks can be thought of as anonymous functions which retain the scope in which they were defined (usually). Blocks can be called in a different scope (called a binding in Ruby parlance), but you generally don't need to worry about this unless you intend to get into metaprogramming.

A block is not run when it is defined, but only when the method it is passed to either calls it, or calls yield to return control to the block.

Ruby offers three different kinds of anonymous functions: blocks, procs, and lambdas. They do have some significant differences, but for most purposes, the three are interchangeable. Blocks take the form of:

some_method() do |arg|
  do_something(arg)
end

# Or

some_method() {|arg| do_something(arg) }

Both forms are equivalent, but conventionally, do...end is used for multi-line blocks, while curly braces are used for single-line blocks.
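The differences between procs and lambdas mentioned earlier mostly come down to argument checking and the behavior of `return`. The following is a minimal sketch; the method names `lambda_return` and `proc_return` are hypothetical.

```ruby
strict  = lambda { |a, b| [a, b] }
lenient = proc   { |a, b| [a, b] }

is_lambda = [strict.lambda?, lenient.lambda?]

# Procs pad missing arguments with nil; lambdas raise ArgumentError.
padded = lenient.call(1)
arity_error =
  begin
    strict.call(1)
    nil
  rescue ArgumentError => e
    e.class
  end

# `return` inside a lambda returns from the lambda only...
def lambda_return
  l = lambda { return :from_lambda }
  l.call
  :after_lambda
end

# ...while `return` inside a proc returns from the enclosing method.
def proc_return
  pr = proc { return :from_proc }
  pr.call
  :after_proc
end
```

Calling `lambda_return` yields `:after_lambda`, while `proc_return` yields `:from_proc`, because the proc's `return` exits the method before its last line is reached.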

When you pass a block to a method, it has a few ways to use it:

You can explicitly define a &block parameter as the last parameter to the function. If you do so, then a local var block will be available, which is of type Proc and has methods such as call. You won't typically use this unless you need to explicitly pass the block to another method, but it is often used to self-document that a method expects a block.

def my_method(&block)
  block.call "foobar"
end
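One case where the explicit `&block` parameter is needed is forwarding the block to another method. A small sketch, where `each_twice` is a hypothetical helper:

```ruby
# each_twice is a hypothetical helper that forwards its block to Array#each.
def each_twice(items, &block)
  items.each(&block)
  items.each(&block)
end

seen = []
each_twice([1, 2]) { |item| seen << item }
```

The block runs once per item on each pass, so `seen` ends up as `[1, 2, 1, 2]`.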

You don't need to declare that a method takes a block in order to use one. yield will invoke a passed block with the given arguments. This will raise an exception if no block is passed, though.

def my_method()
  yield "foobar"
end

my_method {|msg| puts msg }
# => "foobar"
my_method
# => LocalJumpError: no block given (yield)

block_given? is used to determine if a block was passed to the method. Note that the explicit &block parameter isn't necessary.

def my_method()
  yield "foobar" if block_given?
end

my_method
# => nil
my_method {|msg| puts msg }
# => "foobar"

The & sigil before a variable is used to pass a Proc or lambda as a block.

my_lambda = ->(x) { puts x }

def foo(&block)
  yield "block"
end

foo(&my_lambda)
# "block"

Lambdas and comprehensions

Ruby has no explicit concept of comprehensions. However, its support for blocks means that it is very easy to achieve functional-style programming:

List comprehensions:

# Python
list = [1, 2, 3]
results = [val * 2 for val in list]
# [2, 4, 6]
# Ruby
list = [1, 2, 3]
results = list.map {|val| val * 2 }
# [2, 4, 6]

Map comprehensions:

# Python
list = {'x': 1, 'y': 2}
results = {k: v * 2 for k, v in list.items()}
# {'y': 4, 'x': 2}
# Ruby
list = {x: 1, y: 2}
results = list.map {|k, v| [k, v * 2] }.to_h
# {:x => 2, :y => 4}
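Comprehensions with a filter clause follow the same pattern. A sketch, assuming the Python equivalent shown in the comment:

```ruby
list = [1, 2, 3, 4, 5, 6]

# Python: [val * 2 for val in list if val % 2 == 0]
# In Ruby, filter with select, then transform with map.
results = list.select(&:even?).map { |val| val * 2 }
```

This produces `[4, 8, 12]`: the filter runs first, and only the surviving values are doubled.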

Ruby offers a shortcut for calling a method on objects during enumeration. The following are equivalent:

[1, 2, 3, 4].map {|num| num.to_s }
# => ["1", "2", "3", "4"]

[1, 2, 3, 4].map(&:to_s)
# => ["1", "2", "3", "4"]

That is, when you just want to send the same message to each object in an Enumerable, you can use &:message_name.
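The shortcut works because `&` calls `to_proc` on its operand, and `Symbol#to_proc` returns a Proc that sends that message to its argument. A brief sketch:

```ruby
# Symbol#to_proc builds a Proc that sends :to_s to whatever it is given.
to_string = :to_s.to_proc
single = to_string.call(42)

long_form  = [1, 2, 3, 4].map { |num| num.to_s }
short_form = [1, 2, 3, 4].map(&:to_s)
via_proc   = [1, 2, 3, 4].map(&to_string)
```

All three `map` calls return the same array of strings.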

This becomes particularly nice when you want to map a list through a series of transformations:

  p ("a".."z").map(&:upcase).shuffle.each_slice(10).to_a
  # => [
  #   ["M", "Z", "W", "V", "A", "Y", "L", "D", "I", "P"],
  #   ["E", "G", "R", "H", "U", "C", "O", "T", "Q", "J"],
  #   ["X", "N", "S", "K", "B", "F"]
  # ]

Ruby also offers inject (also available as reduce), which takes an initial value plus a block. The current reduced value and the next item are passed to the block, and the return value of the block becomes the new cumulative value:

(1..100).inject(0) do |sum, value|
  sum + value
end
# => 5050

Or, since we just want to take the input value and send it the message :+, value, we can use the shortcut:

(1..100).inject(&:+)
# => 5050
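To see how the accumulator moves through the collection, this sketch records each pair the block receives. The `steps` array is only there for illustration.

```ruby
steps = []
total = [1, 2, 3, 4].inject(0) do |sum, value|
  steps << [sum, value]   # record what the block receives on each call
  sum + value
end

# Without an initial value, the first element seeds the accumulator.
product = [1, 2, 3, 4].inject { |acc, value| acc * value }

# reduce is an alias for inject and also accepts a bare symbol.
via_reduce = [1, 2, 3, 4].reduce(:+)
```

Here `steps` is `[[0, 1], [1, 2], [3, 3], [6, 4]]`, `total` is 10, and `product` is 24.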

Class Inheritance and Composition

Ruby doesn't offer multiple inheritance, unlike some other languages (including Python). That is, you can't derive a subclass from multiple base classes - a class only ever has one superclass. However, you can compose classes from Modules using the mixin pattern.

A Module can be thought of as an abstract class - it cannot be instantiated, and while you can define "class methods" on a module that can be called directly, modules are generally used to encapsulate functionality that should be used to compose other classes.

module Gobbler
  def gobble
    puts "Gobble!"
  end
end

module Quacker
  def quack
    puts "Quack!"
  end
end

module Clucker
  def cluck
    puts "Cluck!"
  end
end

class Turkey
  include Gobbler
end

class Turducken
  include Gobbler
  include Quacker
  include Clucker

  def turduckenate
    gobble
    quack
    cluck
  end
end

module Terminator
  def catchphrase
    puts "I'll be back."
  end
end

class Turduckenator < Turducken
  include Terminator
end

Turkey.new.gobble # => "Gobble!"
Turkey.new.quack # => NoMethodError
Turducken.new.quack # => "Quack!"
Turducken.new.turduckenate
# => Gobble!
#    Quack!
#    Cluck!
Turduckenator.new.quack # => "Quack!"
Turduckenator.new.catchphrase # => "I'll be back."
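Two points from the module description can be checked directly: a module cannot be instantiated, and including it inserts it into the class's ancestor chain. This standalone sketch redefines a small `Quacker` and a hypothetical `Duck` class so it runs on its own; `Quacker.sound` is an invented module-level method.

```ruby
module Quacker
  # A "class method" on the module: callable directly as Quacker.sound.
  def self.sound
    "Quack!"
  end

  # An instance method: only available once the module is mixed in.
  def quack
    Quacker.sound
  end
end

class Duck
  include Quacker
end

# Modules have no `new`, so they cannot be instantiated.
instantiable = Quacker.respond_to?(:new)

# Included modules are inserted into the class's ancestor chain.
chain = Duck.ancestors.first(2)

direct = Quacker.sound
mixed_in = Duck.new.quack
```

`instantiable` is false, and `chain` is `[Duck, Quacker]`, which is the order Ruby searches when resolving a method call on a Duck.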

Namespaces

Namespaces in Ruby are used to group constants together. Namespaces are achieved by simply defining a constant within the scope of another constant:

module Birds
  class Turkey
  end
end

# Or alternately, assuming that the constant Birds already exists
class Birds::Duck
end

Birds::Turkey.new
Birds::Duck.new

Ruby will attempt to find constants in the current namespace, and will then walk up the chain to the parent looking for a match:

# The :: prefix explicitly indicates that Duck should be a toplevel constant
class ::Duck
  def quack
    "I am a toplevel duck! Quack!"
  end
end

module Birds
  class Duck
    def quack
      "I am a midlevel duck. Quack!"
    end
  end

  Duck.new.quack
  # => will find Birds::Duck, because we are in the Birds scope

  ::Duck.new.quack
  # => will find the toplevel ::Duck, because the :: prefix forces a toplevel lookup
end

Duck.new.quack
# => will find the toplevel ::Duck, because we are not in the Birds scope

Birds::Duck.new.quack
# => will find Birds::Duck

This is important because constants are global within Ruby. Within a given runtime, once a constant has been loaded, it is visible from everywhere, not just the scope or file which loaded it.

# my_class.rb
class MyClass
end

# my_class_consumer.rb
require_relative "my_class"
MyClass.class # => Class

# consumers_list.rb
require_relative "my_class_consumer"
MyClass.class # => Class

# MyClass is visible even though we didn't require it in this file,
# because it was required by another file.