Anil Galve: September 2011

Thursday, September 29, 2011

Collect vs Map

Collect vs Map in ruby

I’ve been using collect in all my programs up till now, and I recently discovered that the map function is equivalent.

My question was what the difference between them is and which one is better to use. I searched the net and found that it is a matter of convention rather than a real difference (map and collect are aliases for the same method):
- collect is recommended to be used when the list is not changed:

ruby-1.8.7-p302 :009 > ["one", "two", "three"].collect{|item| item.length == 3}
=> [true, true, false]

- map is recommended to be used when the elements of the list have to be processed:

["one", "two", "three"].map{|item| process(item)}
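
As a quick check that the two names really are interchangeable, the sketch below runs the same block through both methods. The variable names are only illustrative.

```ruby
words = ["one", "two", "three"]

# map and collect are two names for the same method,
# so the same block gives the same result with either.
lengths_with_map     = words.map     { |item| item.length }
lengths_with_collect = words.collect { |item| item.length }

puts lengths_with_map.inspect      # => [3, 3, 5]
puts lengths_with_collect.inspect  # => [3, 3, 5]

# Neither one modifies the original array; use map! / collect! for that.
puts words.inspect                 # => ["one", "two", "three"]
```

So the choice between them is purely about which name reads better in your code.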

Sunday, September 25, 2011

delayed_job method caching issue

Recently I ran into some weird behavior with delayed_job. I had some failed jobs enqueued in the delayed_jobs table. I looked at the last_error field, fixed the bugs, and pushed the code.

But the jobs kept failing: the running delayed_job worker still had the old method definition loaded in memory, so it kept calling the old buggy method even though I had updated it.

So I restarted the delayed_job worker, and it started working as expected.

Wednesday, September 21, 2011

delayed_job vs resque

Resque vs DelayedJob

How does Resque compare to DelayedJob, and why would you choose one over the other?

Resque supports multiple queues
DelayedJob supports finer grained priorities
Resque workers are resilient to memory leaks / bloat
DelayedJob workers are extremely simple and easy to modify
Resque requires Redis
DelayedJob requires ActiveRecord
Resque can only place JSONable Ruby objects on a queue as arguments
DelayedJob can place any Ruby object on its queue as arguments
Resque includes a Sinatra app for monitoring what's going on
DelayedJob can be queried from within your Rails app if you want to add an interface

If you're doing Rails development, you already have a database and ActiveRecord. DelayedJob is super easy to set up and works great. GitHub used it for many months to process almost 200 million jobs.

Choose Resque if:

You need multiple queues
You don't care / dislike numeric priorities
You don't need to persist every Ruby object ever
You have potentially huge queues
You want to see what's going on
You expect a lot of failure / chaos
You can set up Redis
You're not running short on RAM

Choose DelayedJob if:

You like numeric priorities
You're not doing a gigantic amount of jobs each day
Your queue stays small and nimble
There is not a lot of failure / chaos
You want to easily throw anything on the queue
You don't want to set up Redis

In no way is Resque a "better" DelayedJob, so make sure you pick the tool that's best for your app.

Thursday, September 15, 2011

Converting Array to Hash

We can convert an Array to a Hash. Let's take an example.

a = ['a',1,'b',2,'c',3]

If we want to convert this array into a hash, we can do it as follows:

h = Hash[*a]

This results in
{"a"=>1, "b"=>2, "c"=>3}

Also, if the array holds key/value pairs,

a = [["a",1],["b",2],["c",3]]

h = Hash[*a.flatten]

will result in
{"a"=>1, "b"=>2, "c"=>3}
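
For the array-of-pairs case, flatten is not strictly needed, and it can cause trouble when a value is itself an Array. A minimal sketch of some alternatives (the variable names are made up for illustration):

```ruby
flat  = ['a', 1, 'b', 2, 'c', 3]
pairs = [['a', 1], ['b', 2], ['c', 3]]

from_flat  = Hash[*flat]                    # splat the flat list into key/value arguments
from_pairs = Hash[pairs]                    # Hash[] also accepts an array of [key, value] pairs
from_slice = Hash[flat.each_slice(2).to_a]  # group the flat list into pairs first

puts from_flat == from_pairs   # => true
puts from_flat == from_slice   # => true

# flatten (with no argument) flattens recursively, so an Array value would be
# broken apart; passing the pairs directly keeps it intact.
nested = [['a', [1, 2]], ['b', [3, 4]]]
kept   = Hash[nested]
puts kept['a'].inspect         # => [1, 2]
```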

Sunday, September 11, 2011

"find_by_sql" can be devil

Rails’ find_by_sql is the devil. Ninety nine percent of the time find_by_sql is unnecessary and problematic, but it’s sooo seductive. I can’t even begin to count the ways that find_by_sql can cause trouble, but here’s a few:

Plugins like acts_as_paranoid rely on developers *not* using the back door to get around the dynamic conditions to exclude deleted rows.
There are quite a few gotchas, e.g. “SELECT * FROM users JOIN another_table …” won’t work because ActiveRecord will use the last ID field, not the first.
Logic “hidden” in find_by_sql is not reusable (as compared to a fancy association, etc)
It offends my aesthetic sense. We all like to pretend our ORM layer isn’t leaky.. don’t we?

Think you need find_by_sql? Ask yourself the following questions:

Can I just use :include, :select, :joins, :conditions or some combination of the above?
Should this be an association? (perhaps with :conditions and :select on it? Maybe :readonly?)

Monday, September 5, 2011

The Difference Between Ruby Symbols and Strings

Symbols are quite an interesting, and often ill-understood, facet of Ruby. Used extensively throughout Rails and many other Ruby libraries, Symbols are a common sight. However, their rationale and purpose is something of a mystery to many Rubyists. This misunderstanding can probably be attributed to many methods throughout Ruby using Symbols and Strings interchangeably. Hopefully, this tutorial will show the value of Symbols and why they are a very useful attribute of Ruby.

Symbols are Strings, Sort Of

The truth of the matter is that Symbols are Strings, just with an important difference, Symbols are immutable. Mutable objects can be changed after assignment while immutable objects can only be overwritten. Ruby is quite unique in offering mutable Strings, which adds greatly to its expressiveness. However mutable Strings can have their share of issues in terms of creating unexpected results and reduced performance. It is for this reason Ruby also offers programmers the choice of Symbols.

So, how flexible are Symbols in terms of representing data? The answer is just as flexible as Strings. Let's take a look at some valid Strings and their Symbol equivalents.

"hello"
:hello

"hello world"
:"hello world"

bang = "!"

"hello world#{bang}" # => "hello world!"
:"hello world#{bang}" # => :"hello world!"
Many Ruby programmers think that Symbols can only contain alphanumeric characters, however by using quotes you can not only use the same characters as you would a String, but also interpolate Symbols as well. The only thing that a Symbol needs syntactically is a colon : prepended. Symbols are actually so similar to Strings that converting between the two is very simple and consistent.

:"hello world".to_s # => "hello world"
"hello world".intern # => :"hello world"
This being the case, what can’t Symbols do? Well, they can’t change.

puts "hello" << " world"
puts :hello << :" world"

# => hello world
# => *.rb:4: undefined method `<<' for :hello:Symbol (NoMethodError)
In this example, we tried to insert world into the end of both a String and a Symbol. While the mutable String updated itself, the immutable Symbol gave us an error. From this, we can determine that all the receiver changing methods (e.g. upcase!, delete!, gsub!) we have grown to depend on with Strings will not be with us in Symbols. This being the case, why would Ruby even have Symbols as Strings seem to be quite a flexible upgrade? The reason is that while mutability is very useful, it can also cause quite a few problems.

Mutability Gone Wrong

Strings often perform double duty in a program by both storing data and driving operation. For holding data, mutability offers us a slew of expressive options. However, a program's operation (especially in critical applications) should be a bit more rigid. Take the following example:

status = "peace"

buggy_logger = status

print "Status: "
print buggy_logger << "\n" # <- This insertion is the bug.

def launch_nukes?(status)
  unless status == 'peace'
    return true
  else
    return false
  end
end

print "Nukes Launched: #{launch_nukes?(status)}\n"

# => Status: peace
# => Nukes Launched: true
In this example, a script silently watches the world for aggression and danger. Once things go wrong, it launches the nukes. However, by having a mutable String control the program’s operation, a buggy logger disintegrates the entire world. Now, the above script does make some mistakes (I would have cloned the status String instead of just assigning it for example). However, we can see that relying on something that can change during run-time can cause some unexpected results.

Frozen Strings

There are two ways to handle the mutability issue above. The first is to freeze the String and thus make it immutable. Once a String is frozen it cannot be changed.

example = "hello world"
example.upcase!

puts example

example.freeze
example.downcase!

# => "HELLO WORLD"
# => *.rb:7:in `downcase!': can't modify frozen string (TypeError)
The second way would be to make your program use Symbols. But if frozen Strings work just as Symbols, we are back to wondering the true value of Symbols and why they should be used.
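
The remedies mentioned so far (copying the String, freezing it, or switching to a Symbol) can be sketched side by side. This is only an illustration under the assumptions stated in the comments; the variable names are invented here.

```ruby
status = "peace"

# Remedy 1: give the logger its own copy, so appending to it
# cannot touch the original String.
log_line = status.dup
log_line << "\n"
puts status == "peace"   # => true

# Remedy 2: freeze the String; any in-place modification now raises.
# The exception class depends on the Ruby version (TypeError on 1.8,
# RuntimeError or its subclass FrozenError on later versions).
frozen_status = "peace".freeze
begin
  frozen_status << "\n"
rescue StandardError => e
  puts "refused: #{e.class}"
end

# Remedy 3: use a Symbol, which has no mutating methods at all.
symbol_status = :peace
puts symbol_status.respond_to?(:<<)   # => false
```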

String and Symbol Performance

Because Strings are mutable, the Ruby interpreter never knows what that String may hold in terms of data. As such, every String needs to have its own place in memory. We can see this by creating some Strings and printing their object id.

puts "hello world".object_id
puts "hello world".object_id
puts "hello world".object_id
puts "hello world".object_id
puts "hello world".object_id

# => 3102960
# => 3098410
# => 3093860
# => 3089330
# => 3084800
NOTE: Your object ids will be different than the ones above.

What we see above might not seem like a big issue, but behind the scenes is some heavy waste. To understand why, we first have to understand what is going on under the hood. An abridged explanation is as follows:

First, a new String object is instantiated with the value of hello world.
The Ruby interpreter needs to look at the heap (your computer's memory), find a place to put the new string and keep track of it via its object id.
The String is passed to the puts method, and output to the screen.
The Ruby interpreter sees that the String will not be used again as it is not assigned to a variable, and marks it for destruction.
Back to step one four more times.
Ruby’s garbage collector, or GC, is of the mark and sweep variety. This means that the GC periodically marks every object that is still reachable and then sweeps up all the unmarked objects on the heap. However, above we are creating a new object, storing it and ultimately destroying it, when reusing the same object would be much more efficient. Let's do the same thing but with Symbols instead.

puts :"hello world".object_id
puts :"hello world".object_id
puts :"hello world".object_id
puts :"hello world".object_id
puts :"hello world".object_id

# => 239518
# => 239518
# => 239518
# => 239518
# => 239518
This time, every Symbol shares the same object id, and as such the same space on the heap. To further increase performance, Ruby will not mark Symbols for destruction, allowing you to reuse them again and again. As Symbols stay in memory throughout the program's operation, we can quickly snag them from memory instead of instantiating a new copy every time. In fact, Symbols are not only stored in memory, they are also kept track of via an optimized Symbols dictionary. You can see it by running the next example.

puts Symbol.all_symbols.inspect

# => A long list of Symbols...
The all_symbols class method will return an Array of every Symbol currently used in your program. Every time you create a new Symbol, it is also put here. When you use a Symbol, Ruby will check the dictionary first, and if available it will be fetched and used. If the Symbol cannot be found in the dictionary, only then will the interpreter instantiate a new Symbol and put it in the heap. We can easily see this process by dropping into IRB and typing the following:

>> Symbol.all_symbols.collect{|sym| sym.to_s}.include?("new_symbol")
=> false
>> :new_symbol
=> :new_symbol
>> Symbol.all_symbols.collect{|sym| sym.to_s}.include?("new_symbol")
=> true
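
The shared identity of Symbols can also be checked directly with equal?, which compares object identity rather than contents. A minimal sketch; String.new is used here only to force two distinct String objects:

```ruby
# Two separately built Strings with the same text are equal in value
# but are different objects.
a = String.new("hello world")
b = String.new("hello world")
puts a == b        # => true  (same contents)
puts a.equal?(b)   # => false (different objects)

# A Symbol with a given name exists only once, however it is produced.
s1 = :"hello world"
s2 = "hello world".to_sym
s3 = "hello world".intern
puts s1.equal?(s2) # => true
puts s2.equal?(s3) # => true
puts s1.object_id == s3.object_id # => true
```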
So how much faster are Symbols compared to Strings? Well, let's build a script to benchmark the difference.

require 'benchmark'

str = Benchmark.measure do
  10_000_000.times do
    "test"
  end
end.total

sym = Benchmark.measure do
  10_000_000.times do
    :test
  end
end.total

puts "String: " + str.to_s
puts "Symbol: " + sym.to_s
puts
In this example, we are evaluating a String literal 10 million times (creating a new String each time) and then a Symbol literal 10 million times (reusing the same Symbol each time). With the Benchmark library, we can find out how long each activity takes and compare. After running this script three times, I got the following results (yours will most likely be different).

$ ruby benchmark.rb
String: 2.24
Symbol: 1.32

$ ruby benchmark.rb
String: 2.25
Symbol: 1.32

$ ruby benchmark.rb
String: 2.24
Symbol: 1.33
On average, we are getting about a 40% reduction in run time solely by using Symbols. However, Symbols are not just faster than Strings in how they are stored and used, but also in how they are compared. As Symbols of the same text share the same space on the heap, testing for equality is as quick as comparing the object ids. Testing equality for Strings, however, is a bit more computationally expensive. As every String has its own place in memory, the interpreter has to compare the actual data making up the String. A good example to visualize this type of check would be the following:

a = "test".split(//)
b = "test".split(//)
e = true

number = (a.length > b.length) ? a.length : b.length

number.times do |index|
  if a[index] == b[index]
    puts "#{a[index]} is equal to #{b[index]}"
  else
    puts "#{a[index]} is not equal to #{b[index]}"
    e = false
    break
  end
end

puts "#{a.join} equal to #{b.join}: #{e}"

# => t is equal to t
# => e is equal to e
# => s is equal to s
# => t is equal to t
# => test equal to test: true
So, back to benchmarking. How much faster is comparing Symbols than comparing Strings? Let's find out:

require 'benchmark'

str = Benchmark.measure do
  10_000_000.times do
    "test" == "test"
  end
end.total

sym = Benchmark.measure do
  10_000_000.times do
    :test == :test
  end
end.total

puts "String: " + str.to_s
puts "Symbol: " + sym.to_s
puts
Just like before, I ran this script three times and got the following results.

$ ruby benchmark.rb
String: 4.48
Symbol: 2.39

$ ruby benchmark.rb
String: 4.47
Symbol: 2.39

$ ruby benchmark.rb
String: 4.47
Symbol: 2.38
And just like before, we are close to cutting our processing time by half. Not so bad.

Summary

Overall, we went through quite a bit in this tutorial. First, we learned about mutability and how Strings and Symbols are different in this regard. We then jumped into when to use each and finally looked at the performance characteristics of both. Now that you have a better idea of this interesting aspect of Ruby, you can start using Symbols to keep your applications running faster and more consistently.

Modules & Mixins

Modules are a way of grouping together methods, classes, and constants. Modules give you two major benefits.

Modules provide a namespace and prevent name clashes.

Modules implement the mixin facility.

Modules define a namespace, a sandbox in which your methods and constants can play without having to worry about being stepped on by other methods and constants.

Syntax:

module Identifier
  statement1
  statement2
  ...........
end
Module constants are named just like class constants, with an initial uppercase letter. The method definitions look similar, too: module methods are defined just like class methods.

As with class methods, you call a module method by preceding its name with the module's name and a period, and you reference a constant using the module name and two colons.

Example:

#!/usr/bin/ruby

# Module defined in trig.rb file

module Trig
  PI = 3.141592654
  def Trig.sin(x)
    # ..
  end
  def Trig.cos(x)
    # ..
  end
end
We can define one more module with same function name but different functionality:

#!/usr/bin/ruby

# Module defined in moral.rb file

module Moral
  VERY_BAD = 0
  BAD = 1
  def Moral.sin(badness)
    # ...
  end
end
Like class methods, whenever you define a method in a module, you specify the module name followed by a dot and then the method name.

Ruby require Statement:
The require statement is similar to the include statement of C and C++ and the import statement of Java. If a third program wants to use any defined module, it can simply load the module files using the Ruby require statement:

Syntax:

require filename
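Here it is not required to give the .rb extension along with the file name. Note that on Ruby 1.9.2 and later the current directory is not on the load path, so a file sitting next to your script is loaded with require './trig' or require_relative 'trig'.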

Example:

require 'trig.rb'
require 'moral'

y = Trig.sin(Trig::PI/4)
wrongdoing = Moral.sin(Moral::VERY_BAD)
IMPORTANT: Here both files contain a function with the same name. Without modules this would be ambiguous in the calling program, but because each function lives in its own module we can call the appropriate one using the module name.

Ruby include Statement:
You can embed a module in a class. To embed a module in a class, you use the include statement in the class:

Syntax:

include modulename
If a module is defined in a separate file, then that file must be loaded with a require statement before embedding the module in a class.

Example:

Consider following module written in Week.rb file.

module Week
  FIRST_DAY = "Sunday"
  def Week.weeks_in_month
    puts "You have four weeks in a month"
  end
  def Week.weeks_in_year
    puts "You have 52 weeks in a year"
  end
end
Now you can include this module in a class as follows:

#!/usr/bin/ruby
require "Week"

class Decade
  include Week
  no_of_yrs=10
  def no_of_months
    puts Week::FIRST_DAY
    number=10*12
    puts number
  end
end
d1=Decade.new
puts Week::FIRST_DAY
Week.weeks_in_month
Week.weeks_in_year
d1.no_of_months
This will produce following result:

Sunday
You have four weeks in a month
You have 52 weeks in a year
Sunday
120
Mixins in Ruby:
Before going through this section, I assume you have knowledge of Object Oriented Concepts.

When a class can inherit features from more than one parent class, the class is said to exhibit multiple inheritance.

Ruby does not support multiple inheritance directly, but Ruby Modules have another, wonderful use. At a stroke, they pretty much eliminate the need for multiple inheritance, providing a facility called a mixin.

Mixins give you a wonderfully controlled way of adding functionality to classes. However, their true power comes out when the code in the mixin starts to interact with code in the class that uses it.

Let us examine the following sample code to gain an understanding of mixins:

module A
  def a1
  end
  def a2
  end
end
module B
  def b1
  end
  def b2
  end
end

class Sample
  include A
  include B
  def s1
  end
end

samp=Sample.new
samp.a1
samp.a2
samp.b1
samp.b2
samp.s1
Module A consists of the methods a1 and a2. Module B consists of the methods b1 and b2. The class Sample includes both modules A and B. The class Sample can access all four methods, namely, a1, a2, b1, and b2. Therefore, you can see that the class Sample inherits from both the modules. Thus you can say the class Sample shows multiple inheritance or a mixin.
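
The point made earlier, that a mixin's true power shows when its code interacts with code in the class that uses it, can be illustrated with Ruby's built-in Comparable module. In the sketch below the Version class and its attributes are hypothetical; only Comparable and the <=> contract come from Ruby itself.

```ruby
# Hypothetical class used only for illustration.
class Version
  include Comparable   # mixin from Ruby's core library

  attr_reader :major, :minor

  def initialize(major, minor)
    @major = major
    @minor = minor
  end

  # The one method Comparable expects the including class to provide.
  def <=>(other)
    [major, minor] <=> [other.major, other.minor]
  end
end

v1 = Version.new(1, 8)
v2 = Version.new(1, 9)

puts v1 < v2                                 # => true
puts v1 == Version.new(1, 8)                 # => true
puts v2.between?(v1, Version.new(2, 0))      # => true
puts Version.ancestors.include?(Comparable)  # => true
```

The class supplies a single method, and the mixin builds <, >, ==, between? and friends on top of it.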

block vs lambda vs proc

What is a block in Ruby? A block is a piece of code. However, a block has to be attached to something.

# this won't work
{ |x| puts x }

# this won't work either
temp_block = {|x| puts x }
So how do I carry a piece of code from one place to another? Convert that piece of code into a proc.

temp_proc = Proc.new { puts "inside a block" }
temp_proc.call # inside a block
What is temp_proc here? Let’s find out the class of temp_proc.

temp_proc = Proc.new { puts "inside a block" }
puts temp_proc.class.to_s # Proc
What is a lambda? A lambda is another way to convert a block of code into a proc.

temp_lambda = lambda { puts "inside a block" }
temp_lambda.call # inside a block
Let’s see what the class of temp_lambda is.

temp_lambda = lambda { puts "inside a block" }
puts temp_lambda.class.to_s # Proc
temp_lambda is a Proc. It means both lambda and Proc.new create an instance of “Proc”. Then what’s the difference between the two?

Differences between lambda and Proc. I know of two distinctions.

temp_proc = Proc.new {|a,b| puts "sum is #{a + b}" }
temp_lambda = lambda {|a,b| puts "sum is #{a + b}" }

temp_lambda.call(2,3) # sum is 5
temp_lambda.call(2) # wrong number of arguments error
temp_lambda.call(2,3,4) # wrong number of arguments error

temp_proc.call(2,3) # sum is 5
temp_proc.call(2)
# nil can't be coerced. Since the second parameter was not supplied,
# proc.call supplied nil for the missing parameter
temp_proc.call(2,3,4) # sum is 5
The first difference is that lambda.call wants the exact number of parameters to be supplied; otherwise the lambda raises a “wrong number of arguments” error. A Proc doesn’t demand that: for missing parameters the proc supplies nil, and any extra parameters are ignored.
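
The argument handling can be checked without triggering the nil coercion error by having the block simply return what it received. A small sketch; lambda? is a method on Proc (available from Ruby 1.9) that reports which flavor an object is:

```ruby
temp_proc   = Proc.new { |a, b| [a, b] }
temp_lambda = lambda   { |a, b| [a, b] }

puts temp_proc.lambda?    # => false
puts temp_lambda.lambda?  # => true

# A proc pads missing arguments with nil and drops extras.
puts temp_proc.call(2).inspect        # => [2, nil]
puts temp_proc.call(2, 3, 4).inspect  # => [2, 3]

# A lambda insists on exactly two arguments.
begin
  temp_lambda.call(2)
rescue ArgumentError => e
  puts "lambda raised #{e.class}"     # => lambda raised ArgumentError
end
```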

def learning_proc
  temp_proc = Proc.new { return "creating proc" }
  temp_proc.call # control leaves the method here
  return "inside method learning_proc"
end

puts learning_proc

def learning_lambda
  temp_lambda = lambda { return "creating lambda" }
  temp_lambda.call # control doesn't leave the method here
  return "inside method learning_lambda"
end

puts learning_lambda
The second difference has to do with the return keyword. When a block passed to Proc.new contains return, the proc treats that statement as returning from the enclosing method. Hence the statement return “inside method learning_proc” was never executed. A lambda behaves the way Ruby methods behave in general: return only leaves the lambda, and control goes to the next statement after temp_lambda.call.

Note that temp_proc.call does not by itself make the method return at that instant. It is the use of the return keyword inside the block that causes this behavior. This code will work just fine.

def learning_proc
  temp_proc = Proc.new { "creating proc" }
  temp_proc.call
  return "inside method learning_proc"
end

puts learning_proc
In ruby any method can accept a block and the methods do not need to do anything to accept this block of code. All the method has to do is to call “yield” and the piece of code will be invoked.

def learning
  puts "learning ruby "
  yield
  puts "learning rake"
end

learning {puts "learning rake" }

# output is
#learning ruby
#learning rake
#learning rake
In the above example method “learning” did not have to do anything to accept the block. This feature is built into ruby.

However if the method wants to detect if a block is being passed to it or not, it can use the method block_given?.

def learning
  puts "learning ruby "

  if block_given?
    yield
  else
    puts "no block was passed"
  end
  puts "learning rake"
end

learning

# output is
#learning ruby
#no block was passed
#learning rake
There is another way to pass a block to a method: as an argument. However this argument must be the very last argument. It works like this.

def learning(&block)
  block.call
end

learning { puts "learning" }
In the above case there is an ampersand sign before the name “block”. That ampersand is important: it tells Ruby to convert the block being passed into a proc and bind it to the variable “block”. Now we can invoke “block.call”.
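
The ampersand also works in the other direction: at the call site it turns an existing Proc or lambda back into a block. A minimal sketch, with the nil check added here as an assumption about how one might handle a missing block:

```ruby
def learning(&block)
  # block is nil when no block was given, otherwise a Proc
  return "no block" if block.nil?
  block.call
end

# Passing a literal block.
puts learning { "from a block" }        # => from a block

# Passing an existing Proc or lambda: the & converts it back into a block.
stored = lambda { "from a stored lambda" }
puts learning(&stored)                  # => from a stored lambda

puts learning                           # => no block
```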

We can pass variables to a block. However it is important to understand the scope of the variable. Try this.

def thrice
  x = 100
  yield
  yield
  yield
  puts "value of x inside method is #{x}"
end

x = 5
puts "value of x before: #{x}"
thrice { x += 1 }
puts "value of x after: #{x}"

#output
#value of x before: 5
#value of x inside method is 100
#value of x after: 8
This is interesting. The block works with the x from the scope where the block was written, not the x defined inside the method. What happens if the outer x is not defined?

def thrice
  x = 100
  yield
  yield
  yield
  puts "value of x inside method is #{x}"
end

puts "value of x before: #{x}"
thrice { x += 1 }
puts "value of x after: #{x}"

#output
#undefined local variable or method `x' for main:Object (NameError)
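
The scoping rules from the two examples above can be condensed into one sketch. It reuses the thrice method but returns the inner x instead of printing it; counter and only_in_block are names invented for the illustration.

```ruby
def thrice
  x = 100        # local to the method; invisible to the block
  yield
  yield
  yield
  x
end

counter = 0
inner_x = thrice { counter += 1 }   # the block updates the caller's counter

puts counter   # => 3
puts inner_x   # => 100

# A variable first assigned inside a block is local to that block.
thrice { only_in_block = 1 }
puts defined?(only_in_block).inspect   # => nil
```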
Hopefully this provides some idea about blocks, procs and lambda. I intend to add some more cases later to this article.

Ruby Require VS Load VS Include VS Extend

Here are the differences between Require, Load, Include and Extend methods:

Include

When you Include a module into your class as shown below, it’s as if you took the code defined within the module and inserted it within the class, where you ‘include’ it. It allows the ‘mixin’ behavior. It’s used to DRY up your code to avoid duplication, for instance, if there were multiple classes that would need the same code within the module.

The following assumes that the module Log and class TestClass are defined in the same .rb file. If they were in separate files, then ‘load’ or ‘require’ must be used to let the class know about the module you’ve defined.

module Log
  def class_type
    "This class is of type: #{self.class}"
  end
end
 
class TestClass
  include Log
  # ...
end
 
tc = TestClass.new.class_type

The above sets tc to “This class is of type: TestClass”

Load

The load method is almost like the require method except it doesn’t keep track of whether or not that library has been loaded. So it’s possible to load a library multiple times and also when using the load method you must specify the “.rb” extension of the library file name.

Most of the time, you’ll want to use require instead of load, but load is there if you want a library to be loaded each time load is called. For example, if your module changes frequently, you may want to use load to pick up those changes in the classes that load it.

Here’s an example of how to use load. Place the load method at the very top of your “.rb” file. Also the load method takes a path to the file as an argument:

load 'test_library.rb'

So for example, if the module is defined in a separate .rb file from the one where it’s used, then you can use the load method:

File: log.rb

module Log
  def class_type
    "This class is of type: #{self.class}"
  end
end

File: test.rb

load 'log.rb'
 
class TestClass
  include Log
  # ...
end

Require

The require method allows you to load a library and prevents it from being loaded more than once. The require method will return ‘false’ if you try to load the same library after the first time. The require method only needs to be used if the library you are loading is defined in a separate file, which is usually the case.

So it keeps track of whether that library was already loaded or not. You also don’t need to specify the “.rb” extension of the library file name.

Here’s an example of how to use require. Place the require method at the very top of your “.rb” file:

require 'test_library'
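
The difference between require and load described above can be observed directly. The sketch below writes a throwaway library file to a temporary directory so it is self-contained; the file name counter_lib.rb and the global $times_evaluated are made up for the illustration.

```ruby
require 'tmpdir'

results = {}

Dir.mktmpdir do |dir|
  # Hypothetical library file, written on the fly so the sketch is self-contained.
  # Each time the file is evaluated it bumps a global counter.
  path = File.join(dir, 'counter_lib.rb')
  File.write(path, "$times_evaluated = ($times_evaluated || 0) + 1\n")

  results[:first_require]  = require(path)  # evaluates the file
  results[:second_require] = require(path)  # already loaded, so skipped
  results[:after_require]  = $times_evaluated

  load(path)                                # evaluates the file again
  load(path)                                # and again
  results[:after_load] = $times_evaluated
end

puts results[:first_require]   # => true
puts results[:second_require]  # => false
puts results[:after_require]   # => 1
puts results[:after_load]      # => 3
```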

Extend

When using the extend method instead of include, you are adding the module’s methods as class methods instead of as instance methods.

Here is an example of how to use the extend method:

module Log
  def class_type
    "This class is of type: #{self.class}"
  end
end
 
class TestClass
  extend Log
  # ...
end
 
tc = TestClass.class_type
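
To see include and extend side by side, here is a small sketch using the same Log module. The class names IncludesLog and ExtendsLog are hypothetical. Note what the extended version reports: inside a class-level method, self is the class object, so self.class is Class.

```ruby
module Log
  def class_type
    "This class is of type: #{self.class}"
  end
end

class IncludesLog   # hypothetical class names, for illustration only
  include Log
end

class ExtendsLog
  extend Log
end

# include: the method is available on instances.
puts IncludesLog.new.class_type               # => This class is of type: IncludesLog
puts IncludesLog.respond_to?(:class_type)     # => false

# extend: the method is available on the class itself. Inside the method,
# self is the class object ExtendsLog, so self.class is Class.
puts ExtendsLog.class_type                    # => This class is of type: Class
puts ExtendsLog.new.respond_to?(:class_type)  # => false
```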