II. A Ruby Tutorial
Everything in Ruby is an object, including Strings, Integers, Booleans, and Floats. This means that they can have methods!
1.hour.to_i # => 3600 (the `hour` method is added by ActiveSupport, not core Ruby)
"foobar".upcase # => "FOOBAR"
Math::PI.to_r # => (884279719003555/281474976710656)
Since everything is an object, we only deal in classes and methods. This doesn't work out to be nearly as horrible as it is in, say, Java, though, because Ruby is full of delicious syntactic sugar which makes working with objects pleasant!
Ruby's object model makes sense when you think of "receivers" receiving "messages" rather than objects with methods (it inherits this from Smalltalk). `"foobar".upcase` means "given the string `"foobar"`, send it the message `upcase`". In Ruby parlance, the string `"foobar"` is the "receiver", and `upcase` is the message sent.
This matters because `self` is a central concept in Ruby. When no receiver is specified, `self` is implicitly the receiver.
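To make the implicit receiver concrete, here is a minimal sketch. The `Greeter` class and its method names are hypothetical and exist only for this illustration; both methods below send the same `greeting` message to the same receiver.

```ruby
# Hypothetical class used only to illustrate the implicit receiver.
class Greeter
  def greeting
    "Hello"
  end

  def greet_explicit
    # Explicit receiver: send the message `greeting` to self.
    self.greeting
  end

  def greet_implicit
    # No receiver is given, so self is implied.
    greeting
  end
end

greeter = Greeter.new
puts greeter.greet_explicit == greeter.greet_implicit # prints true
```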
Additionally, it is important to note that in Ruby, the following rules are enforced by the language:
- Constants always start with an uppercase letter. `String`, `Math`, and `ENV` are all examples of constants (a `Class`, a `Module`, and a hash-like `Object`, respectively). You can actually reassign a constant after assignment, but you shouldn't, and Ruby will complain loudly (with a warning) if you do. Classes and Modules are two of the most obvious examples of constants in Ruby.
- Local variables always start with a lowercase letter or an underscore. Local variables are visible within the method scope they are defined in.
- Global variables begin with a `$`. `$stderr` is a good example (a standard global variable which holds the STDERR IO stream). Generally, you should avoid globals; you should almost never need them.
- Instance variables begin with an `@`. An instance variable is visible, as one would expect, within an instance of a class. You can think of instance variables as class instance properties, except they aren't directly visible outside of the object.
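The four naming rules above can be seen side by side in a small sketch. Every name here (`MAX_RETRIES`, `$request_count`, `Counter`) is made up for illustration and is not part of any library.

```ruby
# All names below are hypothetical and exist only for illustration.
MAX_RETRIES = 3      # constant: starts with an uppercase letter
$request_count = 0   # global: starts with $ (avoid these in real code)

class Counter
  def initialize
    @count = 0       # instance variable: starts with @
  end

  def increment
    step = 1         # local variable: visible only inside this method
    $request_count += step
    @count += step
  end

  def count
    @count
  end
end

counter = Counter.new
counter.increment
counter.increment
puts counter.count                  # prints 2
puts counter.respond_to?(:step)     # prints false: locals never leak out of a method
```

Note that `@count` is only reachable from outside through the `count` method, which is the point made in the last bullet.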
Let's look at the structure of a Ruby class, and some common method forms:
class MyClass
# `initialize` is the name of the constructor in Ruby.
# You don't have to specify it unless your class needs
# to do some setup.
def initialize(name)
# Instance variables are prefixed with the @ sigil. This is
# equivalent to self.name = name in Python, but in Ruby,
# self.name = name would send the message ["name=", "name"]
# to the receiver self, rather than creating a property
# called `name`.
#
# Instance variables (or ivars) are used to store state
# internal to an object. We will then expose those ivars
# through getters and setters
@name = name
end
# In Ruby, the value of the last statement of any given scope
# is the return value of that scope. In this case, `@name` being
# the last statement means that the method #name returns the value
# of @name when it is invoked. It is conventional to not use
# return unless necessary.
#
# Also note the lack of empty argument parentheses. Ruby does not
# require parentheses for argument lists. In fact, `def name arg1,
# arg2` is valid Ruby, but it is conventionally frowned upon. Use
# whichever construct makes the code most easily read.
def name
@name
end
# This is a setter, but there's no magic in it - Ruby can use
# punctuation in method names. As alluded to above,
# `self.name = "Chris"` just sends the message
# `["name=", "Chris"]` to `self`. This means that we can
# define setter behavior without having to use Javaisms like
# `def setName(String name)`
def name=(name)
@name = name
end
# Method arguments can be given default values.
def generate_name(length = 10)
@name = length.times.map { ("a".."z").to_a.sample }.join
end
# As of Ruby 2.0, methods can also take keyword arguments.
# Here we could call `construct_name(first_name: "Chris")`,
# permitting us to ignore the ordering of parameters, much
# like Python.
#
# construct_name first_name: "Chris" # => "Chris Default"
def construct_name(last_name: "Default", first_name: "Name")
@name = format("%s %s", first_name, last_name)
end
# Prior to Ruby 2.0, it was conventional to provide a pseudo-named
# args interface by passing a hash as the last argument of the
# method. When a Hash is the last argument, braces are unnecessary.
#
# self.construct_title "Punologist", first_name: "Chris", last_name: "Heald"
#
# this is equivalent to:
#
# self.construct_title("Punologist", {first_name: "Chris", last_name: "Heald"})
def construct_title(title, name = {})
@name = format("%s %s: %s", name[:first_name], name[:last_name], title)
end
# Punctuation: as mentioned before, Ruby accepts punctuation as
# a valid part of method names. The most common punctuation
# used are ? and !. These don't do anything special,
# but they are conventionally used to indicate that a method is a
# predicate method (that is, that it answers a question and
# returns a boolean, generally without side effects) and that
# a method is mutative, respectively.
def name?
!@name.nil?
end
def upcase_name!
@name.upcase!
end
# Prefixing the method name with `self` indicates that this is a
# class method. It is only available with the class itself as the
# receiver, rather than instances of the class.
def self.say_hi
puts "Hi!"
end
end
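Because hand-written getters and setters like `name` and `name=` are so common, Ruby can generate them with `attr_accessor`. The sketch below uses a hypothetical `Person` class (not part of the class above) to show the generated methods and the one case where an explicit `self.` is required.

```ruby
# Hypothetical class; `Person` is not part of the tutorial's own code.
class Person
  # Generates `name` and `name=` methods backed by @name,
  # equivalent to writing the getter and setter by hand.
  attr_accessor :name

  def initialize(name)
    @name = name
  end

  def rename(new_name)
    # The explicit `self.` is needed here: a bare `name = new_name`
    # would create a local variable instead of calling the setter.
    self.name = new_name
  end
end

person = Person.new("Chris")
person.name = "Heald"   # sends the message name= with the argument "Heald"
puts person.name        # prints Heald
person.rename("Chris")
puts person.name        # prints Chris
```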
Strings have few surprises. All strings in Ruby are Unicode strings (UTF-8 encoded) by default, unless you reinterpret them with a different encoding. Ruby distinguishes between single- and double-quoted strings only in that single-quoted strings are not processed for interpolation or for escape sequences such as \n. Otherwise, they are equivalent.
For double-quoted strings, you can interpolate values into the string by using `#{}`. For example:
first_name = "Chris"
last_name = "Heald"
full_name = "#{first_name} #{last_name}"
# => "Chris Heald"
This doesn't work if you use single-quoted strings:
full_name = '#{first_name} #{last_name}'
# => "\#{first_name} \#{last_name}"
You can also format strings as you might expect from Python:
full_name = "%s %s" % [first_name, last_name]
# => "Chris Heald"
full_name = format("%s %s", first_name, last_name)
# => "Chris Heald"
Symbols are self-referential interned strings. Fundamentally, this means that only one instance of a symbol ever exists globally, which makes it a memory and allocation-efficient way to reuse common values (such as hash keys).
Conventionally, strings and symbols are used interchangeably in many contexts, but they are not equivalent values. You can convert back and forth between strings and symbols with the methods `Symbol#to_s` and `String#to_sym`:
"foobar".to_sym
# => :foobar
:foobar.to_s
# => "foobar"
"foobar" == :foobar
# => false
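The "only one instance" property of symbols can be checked with `equal?`, which compares object identity rather than contents. This is a small sketch; `String.new` is used so the result does not depend on any frozen-string-literal setting.

```ruby
# Every occurrence of the same symbol is the same object.
puts :foobar.equal?(:foobar)          # prints true
puts :foobar.equal?("foobar".to_sym)  # prints true

# Two separately built strings with equal contents are distinct objects.
first = String.new("foobar")
second = String.new("foobar")
puts first == second       # prints true: same contents
puts first.equal?(second)  # prints false: different objects
```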
This can cause some confusion when used with hashes, because string and symbol keys are distinct. Given a hash:
hash = {:foo => "bar", "baz" => "bang"}
hash[:foo] # => "bar"
hash["foo"] # => nil
hash[:baz] # => nil
hash["baz"] # => "bang"
We can get around this a few ways. ActiveSupport (a common library which extends many core classes with useful functionality) provides:
Hash#stringify_keys
Hash#symbolize_keys
Hash#with_indifferent_access
These do just what they say on the tin: convert all keys to strings, convert all keys to symbols, or convert the hash to a `HashWithIndifferentAccess`, which will respond with the same value for a string or symbol variant of the same key.
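If ActiveSupport is not available, plain Ruby (2.5 or later) can do the first two conversions with `Hash#transform_keys`. The helper names below are hypothetical, chosen only to mirror the ActiveSupport methods; this is a sketch, not a replacement for `HashWithIndifferentAccess`.

```ruby
# Hypothetical helpers built on core Ruby's Hash#transform_keys (Ruby 2.5+).
def stringify_hash_keys(hash)
  hash.transform_keys(&:to_s)    # returns a new hash; the original is unchanged
end

def symbolize_hash_keys(hash)
  hash.transform_keys(&:to_sym)
end

hash = { :foo => "bar", "baz" => "bang" }
puts stringify_hash_keys(hash)["foo"]  # prints bar
puts symbolize_hash_keys(hash)[:baz]   # prints bang
```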
Blocks can be thought of as anonymous functions which retain the scope in which they were defined (usually). Blocks can be evaluated in a different scope (called a `binding` in Ruby parlance), but you generally don't need to worry about this unless you intend to get into metaprogramming.
A block is not run when it is defined, but only when the method it is passed to invokes it, either by calling it explicitly or by using `yield` to hand control to the block.
Ruby offers three different kinds of anonymous functions: blocks, procs, and lambdas. They do have some significant differences, but for most purposes, the three are interchangeable. Blocks take the form of:
some_method() do |arg|
do_something(arg)
end
# Or
some_method() {|arg| do_something(arg) }
Both forms are equivalent, but conventionally, `do...end` is used for multi-line blocks, while curly braces are used for single-line blocks.
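The earlier point that a block retains the scope in which it was defined is easiest to see with a local variable. In this sketch, `three_times` is a hypothetical helper; the block runs inside it but still updates a variable from the scope where the block was written.

```ruby
# Hypothetical helper: runs the block it is given three times.
def three_times
  3.times { yield }
end

total = 0
# `total` belongs to the surrounding scope, yet the block can read
# and modify it even though the block executes inside three_times.
three_times { total += 10 }
puts total # prints 30
```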
When you pass a block to a method, the method has a few ways to use it:
You can explicitly define a `&block` parameter as the last parameter to the method. If you do so, then a local variable `block` will be available, which is of type Proc and has methods such as `call`. You won't typically use this unless you need to explicitly pass the block to another method, but it is often used to self-document that a method expects a block.
def my_method(&block)
block.call "foobar"
end
You don't need to declare that a method takes a block in order to use one. `yield` will invoke a passed block with the given arguments. This will raise an exception if no block is passed, though.
def my_method()
yield "foobar"
end
my_method {|msg| puts msg }
# => "foobar"
my_method
# => LocalJumpError: no block given (yield)
`block_given?` is used to determine if a block was passed to the method. Note that the explicit `&block` parameter isn't necessary.
def my_method()
yield "foobar" if block_given?
end
my_method
# => nil
my_method {|msg| puts msg }
# => "foobar"
The `&` sigil prior to a variable is used to pass a Proc or lambda as a block.
my_lambda = ->(x) { puts x }
def foo(&block)
yield "block"
end
foo(&my_lambda)
# "block"
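Two of the significant differences between procs and lambdas mentioned earlier are argument checking and the meaning of `return`. The methods below are hypothetical and written only to contrast the two behaviors.

```ruby
# Hypothetical methods, written only to contrast procs and lambdas.

def proc_arguments
  # Procs are lenient: missing arguments are filled with nil.
  proc { |a, b| [a, b] }.call(1)
end

def lambda_arguments
  # Lambdas check their argument count the way methods do.
  lambda { |a, b| [a, b] }.call(1)
rescue ArgumentError => error
  error.class
end

def return_from_proc
  callable = proc { return :from_proc }
  callable.call   # `return` exits return_from_proc itself
  :after_proc     # never reached
end

def return_from_lambda
  callable = lambda { return :from_lambda }
  callable.call   # `return` only exits the lambda
  :after_lambda
end

p proc_arguments      # prints [1, nil]
p lambda_arguments    # prints ArgumentError
p return_from_proc    # prints :from_proc
p return_from_lambda  # prints :after_lambda
```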
Ruby has no explicit concept of comprehensions. However, its support for blocks means that it is very easy to achieve functional-style programming:
List comprehensions:
# Python
list = [1, 2, 3]
results = [val * 2 for val in list]
# [2, 4, 6]
# Ruby
list = [1, 2, 3]
results = list.map {|val| val * 2 }
# [2, 4, 6]
Map comprehensions:
# Python
list = {'x': 1, 'y': 2}
results = {k: v * 2 for k, v in list.items()}
# {'x': 2, 'y': 4}
# Ruby
list = {x: 1, y: 2}
results = list.map {|k, v| [k, v * 2] }.to_h
# {:x => 2, :y => 4}
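Python comprehensions often carry an `if` filter as well. A rough Ruby equivalent chains `select` and `map`, or uses `filter_map` (Ruby 2.7+); for hashes, `transform_values` (Ruby 2.4+) avoids the round trip through `to_h`. The helper names in this sketch are hypothetical.

```ruby
# Hypothetical helpers, named here only for illustration.

# Roughly Python's [val * 2 for val in values if val % 2 == 1]
def double_odds(values)
  values.select { |val| val.odd? }.map { |val| val * 2 }
end

# Same result in one pass; filter_map drops nil and false results.
def double_odds_single_pass(values)
  values.filter_map { |val| val * 2 if val.odd? }
end

# Roughly Python's {k: v * 2 for k, v in hash.items()}
def double_values(hash)
  hash.transform_values { |v| v * 2 }
end

p double_odds([1, 2, 3])               # prints [2, 6]
p double_odds_single_pass([1, 2, 3])   # prints [2, 6]
puts double_values({ x: 1, y: 2 })[:y] # prints 4
```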
Ruby offers a shortcut for calling a method on objects during enumeration. The following are equivalent:
[1, 2, 3, 4].map {|num| num.to_s }
# => ["1", "2", "3", "4"]
[1, 2, 3, 4].map(&:to_s)
# => ["1", "2", "3", "4"]
That is, when you just want to send the same message to each object in an `Enumerable`, you can use `&:message_name`.
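The shortcut works because `&` calls `to_proc` on its operand, and `Symbol#to_proc` returns a Proc that sends that symbol as a message to its argument. A minimal sketch, with `shout_all` as a hypothetical helper spelling out the long form:

```ruby
# Hypothetical helper showing the long form of `words.map(&:upcase)`.
def shout_all(words)
  upcaser = :upcase.to_proc # a Proc that sends :upcase to its argument
  words.map(&upcaser)
end

puts :upcase.to_proc.call("foobar")            # prints FOOBAR
p shout_all(%w[a b]) == %w[a b].map(&:upcase)  # prints true
```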
This becomes particularly nice when you want to map a list through a series of transformations:
p ("a".."z").map(&:upcase).shuffle.each_slice(10).to_a
# => [
# ["M", "Z", "W", "V", "A", "Y", "L", "D", "I", "P"],
# ["E", "G", "R", "H", "U", "C", "O", "T", "Q", "J"],
# ["X", "N", "S", "K", "B", "F"]
# ]
Ruby also offers `inject` (also available as `reduce`), which takes an optional initial value plus a block. The current reduced value plus the next item is passed to the block, and the return value of the block becomes the new cumulative value:
(1..100).inject(0) do |sum, value|
sum + value
end
# => 5050
Or, since we just want to take the running value and send it the message `:+` with the next value, we can use the shortcut:
(1..100).inject(&:+)
# => 5050
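The accumulator does not have to be a number. The sketch below uses a hypothetical `word_counts` helper to build a hash, and also shows what happens when the initial value is omitted: the first element seeds the accumulator, so an empty collection yields `nil`.

```ruby
# Hypothetical helper: counts how often each word occurs.
def word_counts(words)
  words.inject(Hash.new(0)) do |counts, word|
    counts[word] += 1
    counts # the block's return value becomes the next accumulator
  end
end

puts word_counts(%w[duck goose duck])["duck"] # prints 2

# With no initial value, the first element seeds the accumulator.
factorial = (1..4).inject { |product, n| product * n }
puts factorial # prints 24

# An empty collection shows why an explicit initial value can matter.
empty = []
p empty.inject(:+)    # prints nil
p empty.inject(0, :+) # prints 0
```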
Ruby doesn't offer multiple inheritance, unlike some other languages (including Python). That is, you can't derive a subclass from multiple base classes; a class only ever has one superclass. However, you can compose classes from Modules using the mixin pattern.
A Module can be thought of as an abstract class - it cannot be instantiated, and while you can define "class methods" on a module that can be called directly, modules are generally used to encapsulate functionality that should be used to compose other classes.
module Gobbler
def gobble
puts "Gobble!"
end
end
module Quacker
def quack
puts "Quack!"
end
end
module Clucker
def cluck
puts "Cluck!"
end
end
class Turkey
include Gobbler
end
class Turducken
include Gobbler
include Quacker
include Clucker
def turduckenate
gobble
quack
cluck
end
end
module Terminator
def catchphrase
puts "I'll be back."
end
end
class Turduckenator < Turducken
include Terminator
end
Turkey.new.gobble # => "Gobble!"
Turkey.new.quack # => NoMethodError
Turducken.new.quack # => "Quack!"
Turducken.new.turduckenate
# => Gobble!
# Quack!
# Cluck!
Turduckenator.new.quack # => "Quack!"
Turduckenator.new.catchphrase # => "I'll be back."
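When several mixins define the same method, Ruby resolves the call by walking the class's ancestor chain, which `ancestors` exposes. The classes and modules in this sketch (`Walker`, `Swimmer`, `Animal`, `Mallard`) are hypothetical and separate from the ones above.

```ruby
# Hypothetical modules and classes, used only to show lookup order.
module Walker
  def move
    "walk"
  end
end

module Swimmer
  def move
    "swim"
  end
end

class Animal
  def move
    "stay put"
  end
end

class Mallard < Animal
  include Walker
  include Swimmer # included last, so it is searched first
end

p Mallard.ancestors.take(4)    # prints [Mallard, Swimmer, Walker, Animal]
puts Mallard.new.move          # prints swim
puts Mallard.new.is_a?(Walker) # prints true
```

Included modules sit between the class and its superclass, so a mixin's method wins over an inherited one, and the most recently included mixin wins over earlier ones.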
Namespaces in Ruby are used to group constants together. Namespaces are achieved by simply defining a constant within the scope of another constant:
module Birds
class Turkey
end
end
# Or alternately, assuming that the constant Birds already exists
class Birds::Duck
end
Birds::Turkey.new
Birds::Duck.new
Ruby will attempt to find constants in the current namespace, and will then walk up the chain to the parent looking for a match:
# The :: prefix explicitly indicates that Duck should be a toplevel constant
class ::Duck
def quack
"I am a toplevel duck! Quack!"
end
end
module Birds
class Duck
def quack
"I am a midlevel duck. Quack!"
end
end
Duck.new.quack
# => will find Birds::Duck, because we are in the Birds scope
::Duck.new.quack
# => will find the toplevel ::Duck, because the :: prefix forces a toplevel lookup
end
Duck.new.quack
# => will find the toplevel ::Duck, because we are not in the Birds scope
Birds::Duck.new.quack
# => will find Birds::Duck
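The scopes Ruby searches can be inspected with `Module.nesting`. The sketch below uses hypothetical names (`Aviary`, `Parrot`, `Finch`) and also shows a consequence of the compact `class Aviary::Finch` form: the enclosing module is not part of the lexical scope, so its constants are not found by the walk up the chain.

```ruby
# Hypothetical namespace, used only to show how constant lookup walks scopes.
module Aviary
  NAME = "aviary"

  class Parrot
    def self.home
      NAME # not defined in Parrot, so Ruby walks up to Aviary
    end

    def self.scopes
      Module.nesting # the lexical scopes searched, innermost first
    end
  end
end

# Compact form: only Aviary::Finch is a lexical scope here, not Aviary.
class Aviary::Finch
  def self.home
    NAME
  rescue NameError
    :not_found
  end

  def self.scopes
    Module.nesting
  end
end

puts Aviary::Parrot.home  # prints aviary
p Aviary::Parrot.scopes   # prints [Aviary::Parrot, Aviary]
p Aviary::Finch.home      # prints :not_found
p Aviary::Finch.scopes    # prints [Aviary::Finch]
```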
This is important because constants are global within Ruby. Within a given runtime, once a constant has been loaded, it is visible from everywhere, not just the scope or file which loaded it.
# my_class.rb
class MyClass
end
# my_class_consumer.rb
require_relative "my_class"
MyClass.class # => Class
# consumers_list.rb
require_relative "my_class_consumer"
MyClass.class # => Class
# MyClass is visible even though we didn't import it in this file,
# because it was imported by another file.
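The same effect can be reproduced in a single runnable script by writing two throwaway files to a temporary directory. The file names and the `GrWidget` class are hypothetical; the script only requires the consumer file, yet the constant defined by the other file becomes visible.

```ruby
require "tmpdir"

# Hypothetical files: gr_widget.rb defines a class, and
# gr_widget_consumer.rb does nothing except require it.
Dir.mktmpdir do |dir|
  File.write(File.join(dir, "gr_widget.rb"), "class GrWidget; end\n")
  File.write(File.join(dir, "gr_widget_consumer.rb"), "require_relative 'gr_widget'\n")

  # We load only the consumer, never gr_widget.rb directly.
  require File.join(dir, "gr_widget_consumer")
end

puts defined?(GrWidget) # prints constant
puts GrWidget.class     # prints Class
```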