The difference is that #each scopes the block variable inside the block, whereas for/in scopes it outside the block.
The following demonstration makes this clear:
for i in [1,2,3]
  # ...
end
puts i # => 3
[1,2,3].each do |j|
  # ...
end
puts j # => NameError: undefined local...
Let's take one more example to demonstrate this:
loop1 = []
loop2 = []
calls = ["one", "two", "three"]
calls.each do |c|
  loop1 << Proc.new { puts c }
end
for c in calls
  loop2 << Proc.new { puts c }
end
loop1[1].call #=> "two"
loop2[1].call #=> "three"
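Why does the Proc built inside for/in print "three"? A minimal sketch, reusing the calls array from above (the names each_procs, for_procs and d are only for illustration): every Proc created inside for/in closes over the same single variable, while #each hands each iteration a fresh one. Here the Procs return the value instead of printing it so the results can be compared.

```ruby
calls = ["one", "two", "three"]

# #each: the block parameter c is a new variable on every iteration,
# so each Proc remembers its own value.
each_procs = []
calls.each { |c| each_procs << Proc.new { c } }

# for/in: d lives in the enclosing scope, so all Procs share one variable,
# which holds the last element once the loop has finished.
for_procs = []
for d in calls
  for_procs << Proc.new { d }
end

each_results = each_procs.map { |pr| pr.call }
for_results  = for_procs.map { |pr| pr.call }

puts each_results.inspect # => ["one", "two", "three"]
puts for_results.inspect  # => ["three", "three", "three"]
```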
Saturday, October 8, 2011
Thursday, September 29, 2011
Collect vs Map
Collect vs Map in Ruby
I’ve been using collect in all my programs up till now, and I recently discovered that the map method is equivalent.
My question was what the difference between them is and which one is better to use. I searched the net and found that it is a matter of convention more than a real difference:
- collect is recommended when the list is not changed:
ruby-1.8.7-p302 :009 > ["one", "two", "three"].collect{|item| item.length == 3}
=> [true, true, false]
- map is recommended to be used when the elements of the list have to be processed:
["one", "two", "three"].map{|item| process(item)}
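To check the equivalence yourself, here is a small sketch (the variable names are made up for illustration). It runs the same block through both methods and compares the results; neither call modifies the original array.

```ruby
words = ["one", "two", "three"]

lengths_via_map     = words.map     { |item| item.length }
lengths_via_collect = words.collect { |item| item.length }

# Both calls return a new array with the block applied to every element.
same_result = (lengths_via_map == lengths_via_collect)

# The receiver itself is left untouched by either method.
unchanged = (words == ["one", "two", "three"])

puts lengths_via_map.inspect # => [3, 3, 5]
puts same_result             # => true
puts unchanged               # => true
```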
Sunday, September 25, 2011
delayed_job method caching issue
Recently I ran into some weird behavior with delayed_job. I had some failed jobs sitting in the delayed_jobs table. I looked at the last_error field, corrected the code accordingly, and pushed the fix.
But the jobs were still failing: the delayed_job worker process keeps the method definitions it loaded at startup in memory, so it was still calling the old buggy method even though I had updated it.
So I restarted the delayed_job worker, and it started working as expected.
Wednesday, September 21, 2011
delayed_job vs resque
Resque vs DelayedJob
How does Resque compare to DelayedJob, and why would you choose one over the other?
Resque supports multiple queues
DelayedJob supports finer grained priorities
Resque workers are resilient to memory leaks / bloat
DelayedJob workers are extremely simple and easy to modify
Resque requires Redis
DelayedJob requires ActiveRecord
Resque can only place JSONable Ruby objects on a queue as arguments
DelayedJob can place any Ruby object on its queue as arguments
Resque includes a Sinatra app for monitoring what's going on
DelayedJob can be queried from within your Rails app if you want to add an interface
If you're doing Rails development, you already have a database and ActiveRecord. DelayedJob is super easy to set up and works great. GitHub used it for many months to process almost 200 million jobs.
Choose Resque if:
You need multiple queues
You don't care / dislike numeric priorities
You don't need to persist every Ruby object ever
You have potentially huge queues
You want to see what's going on
You expect a lot of failure / chaos
You can set up Redis
You're not running short on RAM
Choose DelayedJob if:
You like numeric priorities
You're not doing a gigantic amount of jobs each day
Your queue stays small and nimble
There is not a lot of failure / chaos
You want to easily throw anything on the queue
You don't want to set up Redis
In no way is Resque a "better" DelayedJob, so make sure you pick the tool that's best for your app.
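The "JSONable" restriction mentioned above is easy to underestimate. The sketch below does not call Resque at all; it only imitates what happens to job arguments when they are serialized to JSON and parsed back, using Ruby's standard json library. The args array is a made-up example.

```ruby
require 'json'

# Hypothetical job arguments containing Symbols as a hash key and value.
args = [42, { :state => :pending, :tags => ["a", "b"] }]

# Serialize to JSON and parse back, as a JSON-based queue would have to do.
round_tripped = JSON.parse(JSON.generate(args))

puts round_tripped.inspect
# => [42, {"state"=>"pending", "tags"=>["a", "b"]}]

# Symbols come back as Strings, so the worker sees different objects.
puts round_tripped == args # => false
```

So with a JSON-backed queue, the job code should expect String keys and values on the worker side, while a queue that stores serialized Ruby objects can hand back the original types.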
Thursday, September 15, 2011
Converting Array to Hash
We can convert an Array to a Hash. Let's take an example.
a = ['a',1,'b',2,'c',3]
If we want to convert this array into a hash, we can do it as follows:
h = Hash[*a]
It will result in
{"a"=>1, "b"=>2, "c"=>3}
Also, given an array of pairs,
a = [["a",1],["b",2],["c",3]]
h = Hash[*a.flatten]
will result in
{"a"=>1, "b"=>2, "c"=>3}
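One caveat with the flatten approach: it also flattens the values, so it goes wrong when a value is itself an array. A small sketch of the alternatives, assuming Ruby 1.8.7 or later (where Hash[] also accepts an array of pairs); the variable names are only for illustration.

```ruby
flat  = ['a', 1, 'b', 2, 'c', 3]
pairs = [['a', 1], ['b', 2], ['c', 3]]

from_flat   = Hash[*flat]                   # splat the flat list
from_slices = Hash[flat.each_slice(2).to_a] # group into pairs first
from_pairs  = Hash[pairs]                   # pairs need no flatten at all

puts from_flat.inspect # => {"a"=>1, "b"=>2, "c"=>3}

# Values that are arrays survive only when the pairs are passed as they are.
nested = [['a', [1, 2]], ['b', [3, 4]]]
kept   = Hash[nested]
broken = Hash[*nested.flatten]

puts kept.inspect   # => {"a"=>[1, 2], "b"=>[3, 4]}
puts broken.inspect # => {"a"=>1, 2=>"b", 3=>4}
```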
Sunday, September 11, 2011
"find_by_sql" can be the devil
Rails’ find_by_sql is the devil. Ninety-nine percent of the time find_by_sql is unnecessary and problematic, but it’s sooo seductive. I can’t even begin to count the ways that find_by_sql can cause trouble, but here are a few:
Plugins like acts_as_paranoid rely on developers *not* using the back door to get around the dynamic conditions to exclude deleted rows.
There are quite a few gotchas, e.g. “SELECT * FROM users JOIN another_table …” won’t work because ActiveRecord will use the last ID field, not the first.
Logic “hidden” in find_by_sql is not reusable (as compared to a fancy association, etc)
It offends my aesthetic sense. We all like to pretend our ORM layer isn’t leaky.. don’t we?
Think you need find_by_sql? Ask yourself the following questions:
Can I just use :include, :select, :joins, :conditions or some combination of the above?
Should this be an association? (perhaps with :conditions and :select on it? Maybe :readonly?)
Monday, September 5, 2011
The Difference Between Ruby Symbols and Strings
Symbols are quite an interesting, and often ill-understood, facet of Ruby. Used extensively throughout Rails and many other Ruby libraries, Symbols are a common sight. However, their rationale and purpose are something of a mystery to many Rubyists. This misunderstanding can probably be attributed to many methods throughout Ruby using Symbols and Strings interchangeably. Hopefully, this tutorial will show the value of Symbols and why they are a very useful attribute of Ruby.
Symbols are Strings, Sort Of
The truth of the matter is that Symbols are Strings with one important difference: Symbols are immutable. Mutable objects can be changed after assignment, while immutable objects can only be overwritten. Ruby is quite unique in offering mutable Strings, which adds greatly to its expressiveness. However, mutable Strings can have their share of issues in terms of creating unexpected results and reduced performance. It is for this reason that Ruby also offers programmers the choice of Symbols.
So, how flexible are Symbols in terms of representing data? The answer is just as flexible as Strings. Let's take a look at some valid Strings and their Symbol equivalents.
"hello"
:hello
"hello world"
:"hello world"
bang = "!"
"hello world#{bang}" # => "hello world!"
:"hello world#{bang}" # => :"hello world!"
Many Ruby programmers think that Symbols can only contain alphanumeric characters. However, by using quotes you can not only use the same characters as you would in a String, but also use interpolation inside Symbols. The only thing that a Symbol needs syntactically is a colon : prepended. Symbols are actually so similar to Strings that converting between the two is very simple and consistent.
:"hello world".to_s # => "hello world"
"hello world".intern # => :"hello world"
This being the case, what can’t Symbols do? Well, they can’t change.
puts "hello" << " world"
puts :hello << :" world"
# => hello world
# => *.rb:4: undefined method `<<' for :hello:Symbol (NoMethodError)
In this example, we tried to append world to the end of both a String and a Symbol. While the mutable String updated itself, the immutable Symbol gave us an error. From this, we can determine that all the receiver-changing methods (e.g. upcase!, delete!, gsub!) we have grown to depend on with Strings will not be with us in Symbols. This being the case, why would Ruby even have Symbols, when Strings seem to be quite a flexible upgrade? The reason is that while mutability is very useful, it can also cause quite a few problems.
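If you do need a "longer" Symbol, the way to get one is to build a new Symbol rather than change the existing one. A small sketch (the names greeting and combined are only for illustration):

```ruby
greeting = :hello

# Symbols simply do not have the in-place modifying methods of String.
can_append = greeting.respond_to?(:<<)
can_upcase_in_place = greeting.respond_to?(:upcase!)

# Building a new Symbol goes through a String and leaves the original alone.
combined = "#{greeting} world".to_sym

puts can_append          # => false
puts can_upcase_in_place # => false
puts combined.inspect    # => :"hello world"
puts greeting.inspect    # => :hello
```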
Mutability Gone Wrong
Strings often perform double duty in a program by both storing data and driving operation. For holding data, mutability offers us a slew of expressive options. However, a program's operation (especially in critical applications) should be a bit more rigid. Take the following example:
status = "peace"
buggy_logger = status
print "Status: "
print buggy_logger << "\n" # <- This insertion is the bug.
def launch_nukes?(status)
  unless status == 'peace'
    return true
  else
    return false
  end
end
print "Nukes Launched: #{launch_nukes?(status)}\n"
# => Status: peace
# => Nukes Launched: true
In this example, a script silently watches the world for aggression and danger. Once things go wrong, it launches the nukes. However, by having a mutable String control the program’s operation, a buggy logger disintegrates the entire world. Now, the above script does make some mistakes (I would have cloned the status String instead of just assigning it for example). However, we can see that relying on something that can change during run-time can cause some unexpected results.
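For comparison, here is a sketch of the cloning fix mentioned above. It assumes the same launch_nukes? check, written more compactly, and uses dup to give the logger its own copy; logger_line is a made-up name.

```ruby
def launch_nukes?(status)
  status != 'peace'
end

status = "peace".dup # dup guarantees a mutable String to start from

# The logger works on its own copy, so appending cannot touch status.
logger_line = status.dup
logger_line << "\n"

print "Status: "
print logger_line
print "Nukes Launched: #{launch_nukes?(status)}\n"
# => Status: peace
# => Nukes Launched: false
```

The bug in the original script came from buggy_logger and status being two names for the same String object; after dup they are two separate objects with the same contents.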
Frozen Strings
There are two ways to handle the mutability issue above. The first is to freeze the String, thus making it immutable. Once a String is frozen it cannot be changed.
example = "hello world"
example.upcase!
puts example
example.freeze
example.downcase!
# => "HELLO WORLD"
# => *.rb:7:in `downcase!': can't modify frozen string (TypeError)
The second way would be to make your program use Symbols. But if frozen Strings work just like Symbols, we are back to wondering about the true value of Symbols and why they should be used.
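A frozen String can also be checked and handled at run time. The sketch below is an illustration only: it rescues StandardError because the exact error class depends on the Ruby version (TypeError in the 1.8 output shown above, RuntimeError in 1.9 through 2.4, and FrozenError from 2.5 on).

```ruby
example = "hello world".dup
example.upcase! # allowed, the String is still mutable
example.freeze

error_name = nil
begin
  example.downcase! # raises, because the String is now frozen
rescue StandardError => e
  # The class differs between Ruby versions, so just record its name.
  error_name = e.class.name
end

puts example          # => HELLO WORLD
puts example.frozen?  # => true
puts "Rejected with #{error_name}"
```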
String and Symbol Performance
Because Strings are mutable, the Ruby interpreter never knows what that String may hold in terms of data. As such, every String needs to have its own place in memory. We can see this by creating some Strings and printing their object id.
puts "hello world".object_id
puts "hello world".object_id
puts "hello world".object_id
puts "hello world".object_id
puts "hello world".object_id
# => 3102960
# => 3098410
# => 3093860
# => 3089330
# => 3084800
NOTE: Your object ids will be different than the ones above.
What we see above might not seem like a big issue, but behind the scenes is some heavy waste. To understand why, we first have to understand what is going on under the hood. An abridged explanation is as follows:
First, a new String object is instantiated with the value of hello world.
The Ruby interpreter needs to look at the heap (your computer's memory), find a place to put the new String and keep track of it via its object id.
The String is passed to the puts method, and output to the screen.
The String is not assigned to a variable, so nothing references it any more and it becomes eligible for garbage collection.
Back to step one four more times.
Ruby’s garbage collector, or GC, is of the mark and sweep variety. This means that when a collection runs, the GC marks every object that is still reachable and then sweeps the unmarked objects off the heap. However, above we are creating a new object, storing it and ultimately destroying it, when reusing the same object would be much more efficient. Let's do the same thing but with Symbols instead.
puts :"hello world".object_id
puts :"hello world".object_id
puts :"hello world".object_id
puts :"hello world".object_id
puts :"hello world".object_id
# => 239518
# => 239518
# => 239518
# => 239518
# => 239518
This time, every Symbol shares the same object id, and as such the same space on the heap. To further increase performance, Ruby will not mark Symbols for destruction, allowing you to reuse them again and again. As Symbols stay in memory throughout the program's operation, we can quickly snag them from memory instead of instantiating a new copy every time. In fact, Symbols are not only stored in memory, they are also kept track of via an optimized Symbols dictionary. You can see it by running the next example.
puts Symbol.all_symbols.inspect
# => A long list of Symbols...
The all_symbols class method will return an Array of every Symbol currently used in your program. Every time you create a new Symbol, it is also put here. When you use a Symbol, Ruby will check the dictionary first, and if available it will be fetched and used. If the Symbol cannot be found in the dictionary, only then will the interpreter instantiate a new Symbol and put it in the heap. We can easily see this process by dropping into IRB and typing the following:
>> Symbol.all_symbols.collect{|sym| sym.to_s}.include?("new_symbol")
=> false
>> :new_symbol
=> :new_symbol
>> Symbol.all_symbols.collect{|sym| sym.to_s}.include?("new_symbol")
=> true
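The identity difference between Strings and Symbols described in this section can also be checked directly with equal?, which compares object identity rather than contents. In this sketch dup is used so that the two Strings are guaranteed to be separate objects whatever the interpreter does with literals; the variable names are illustrative.

```ruby
first  = "hello world".dup
second = "hello world".dup

strings_equal       = (first == second)    # same contents
strings_same_object = first.equal?(second) # but two objects on the heap

# Two occurrences of the same Symbol literal refer to one object.
symbols_same_object = :"hello world".equal?(:"hello world")

puts strings_equal       # => true
puts strings_same_object # => false
puts symbols_same_object # => true
```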
So how much faster are Symbols compared to Strings? Well, let's build a script to benchmark the difference.
require 'benchmark'
str = Benchmark.measure do
  10_000_000.times do
    "test"
  end
end.total
sym = Benchmark.measure do
  10_000_000.times do
    :test
  end
end.total
puts "String: " + str.to_s
puts "Symbol: " + sym.to_s
puts
In this example, we evaluate a String literal 10 million times, creating a new String each time, and then a Symbol literal 10 million times, reusing the same Symbol each time. With the Benchmark library, we can find out how long each activity takes and compare. After running this script three times, I got the following results (yours will most likely be different).
$ ruby benchmark.rb
String: 2.24
Symbol: 1.32
$ ruby benchmark.rb
String: 2.25
Symbol: 1.32
$ ruby benchmark.rb
String: 2.24
Symbol: 1.33
On average, we are getting roughly a 40% reduction in run time solely by using Symbols. However, Symbols are not just faster than Strings in how they are stored and used, but also in how they are compared. As Symbols of the same text share the same space on the heap, testing for equality is as quick as comparing the object ids. Testing equality for Strings, however, is a bit more computationally expensive. As every String has its own place in memory, the interpreter has to compare the actual data making up the String. A good example to visualize this type of check would be the following:
a = "test".split(//)
b = "test".split(//)
e = true
number = (a.length > b.length) ? a.length : b.length
number.times do |index|
  if a[index] == b[index]
    puts "#{a[index]} is equal to #{b[index]}"
  else
    puts "#{a[index]} is not equal to #{b[index]}"
    e = false
    break
  end
end
puts "#{a.join} equal to #{b.join}: #{e}"
# => t is equal to t
# => e is equal to e
# => s is equal to s
# => t is equal to t
# => test equal to test: true
So, back to benchmarking. How much faster is comparing Symbols than comparing Strings? Let's find out:
require 'benchmark'
str = Benchmark.measure do
  10_000_000.times do
    "test" == "test"
  end
end.total
sym = Benchmark.measure do
  10_000_000.times do
    :test == :test
  end
end.total
puts "String: " + str.to_s
puts "Symbol: " + sym.to_s
puts
Just like before, I ran this script three times and got the following results.
$ ruby benchmark.rb
String: 4.48
Symbol: 2.39
$ ruby benchmark.rb
String: 4.47
Symbol: 2.39
$ ruby benchmark.rb
String: 4.47
Symbol: 2.38
And just like before, we are close to cutting our processing time in half. Not so bad.
Summary
Overall, we went through quite a bit in this tutorial. First, we learned about mutability and how Strings and Symbols are different in this regard. We then jumped into when to use each and finally looked at the performance characteristics of both. Now that you have a better idea of this interesting aspect of Ruby, you can start using Symbols to keep your applications running faster and more consistently.