Thursday, December 31, 2020

How to Use Ruby Case Statements with === / Higher Order Lambdas / Pattern Matching

Happy New Year!

In this blog post, I will go over a practical example from a real project of how to use the Ruby `case` statement with implicit Class `is_a?` comparisons via `===`, higher order lambdas, and the new Ruby 3 pattern matching.

I just had to refactor some code in my new project YASL (Yet Another Serialization Library), which was originally in this form:

# Relating to [YASL](https://github.com/AndyObtiva/yasl)
def dump_ruby_basic_data_type_data(object)
  case object
  when Time
    object.to_datetime.marshal_dump
  when Date
    object.marshal_dump
  when Complex, Rational, Regexp, Symbol, BigDecimal
    object.to_s
  when Set
    object.to_a.uniq.map {|element| dump_structure(element) unless unserializable?(element)}
  when Range
    [object.begin, object.end, object.exclude_end?]
  when Array
    object.map {|element| dump_structure(element) unless unserializable?(element)}
  when Hash
    object.reject do |key, value|
      [key, value].detect {|element| unserializable?(element)}
    end.map do |pair|
      pair.map {|element| dump_structure(element)}
    end
  end
end

The `when` clauses rely implicitly on the Class `===` method, which tests whether `object.is_a?(SomeClass)` (e.g. `object.is_a?(Time)` for `when Time`).
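As a quick illustration of that implicit comparison, here is a minimal sketch. The `describe` method is a made-up helper for this post, not part of YASL:

```ruby
# Each `when SomeClass` clause is evaluated as `SomeClass === object`,
# which is true when the object is an instance of the class or of a descendant.
def describe(object)
  case object
  when Integer then 'integer' # evaluated as Integer === object
  when Numeric then 'numeric' # reached only if the Integer check fails
  when String  then 'string'
  else 'something else'
  end
end

puts describe(42)      # integer
puts describe(3.14)    # numeric
puts describe('hello') # string

puts(Integer === 42) # true, same as 42.is_a?(Integer)
puts(42 === Integer) # false, since Integer#=== compares values, not types
```

Note that `===` is not symmetric, which is why the class always goes in the `when` clause and the object in the `case` clause.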

Although the code is pretty concise and readable, it has one big issue: `Date`, `DateTime`, and `BigDecimal` aren't loaded by default in Ruby, and in some older versions of Ruby, `Set` isn't loaded either. As such, the code works in most cases, but raises a `NameError` whenever the comparison reaches `Date` (or `DateTime`) before `require 'date'`, `BigDecimal` before `require 'bigdecimal'`, or `Set` before `require 'set'`. Unfortunately, pre-loading `Date`, `DateTime`, `BigDecimal`, and `Set` could raise the memory footprint of the library for no important reason, so it is not a desirable solution.
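The failure mode can be reproduced with a small sketch. `NotLoadedType` is a hypothetical constant that is deliberately never defined, standing in for a class such as `BigDecimal` whose library was not required:

```ruby
# `when` clauses are evaluated in order, so an unloaded constant only
# causes trouble once the clauses before it have failed to match.
def classify(object)
  case object
  when Integer
    'integer'
  when NotLoadedType # hypothetical constant, never defined anywhere
    'not loaded type'
  else
    'other'
  end
end

puts classify(1) # integer (the NotLoadedType clause is never evaluated)

begin
  classify('text') # falls through to the NotLoadedType clause
rescue NameError => e
  puts "NameError: #{e.message}"
end
```

This is why the original code appeared to work most of the time: the error only shows up for objects that do not match an earlier clause.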

To circumvent this problem, I ended up comparing to class name strings instead of loaded classes by relying on higher order lambdas (in Ruby versions prior to Ruby 3) to achieve readable code without reliance on unwieldy if..elsif statements, albeit less pretty than the original code:

# Relating to [YASL](https://github.com/AndyObtiva/yasl)
# Intentionally avoiding `case ClassType` expressions because we are comparing against classes
# that might not be loaded (in older Ruby versions) into memory yet depending on whether
# consumer code loads Set with `require 'set'` and BigDecimal with `require 'bigdecimal'`
def dump_ruby_basic_data_type_data(object)
  class_ancestors_names_include = lambda do |*class_names|
    lambda do |obj|
      class_names.any? do |class_name|
        obj.class.ancestors.map(&:name).include?(class_name)
      end
    end
  end
  case object
  when class_ancestors_names_include.call('Time')
    object.to_datetime.marshal_dump
  when class_ancestors_names_include.call('Date')
    object.marshal_dump
  when class_ancestors_names_include.call('Complex', 'Rational', 'Regexp', 'Symbol', 'BigDecimal')
    object.to_s
  when class_ancestors_names_include.call('Set')
    object.to_a.uniq.map {|element| dump_structure(element) unless unserializable?(element)}
  when class_ancestors_names_include.call('Range')
    [object.begin, object.end, object.exclude_end?]
  when class_ancestors_names_include.call('Array')
    object.map {|element| dump_structure(element) unless unserializable?(element)}
  when class_ancestors_names_include.call('Hash')
    object.reject do |key, value|
      [key, value].detect {|element| unserializable?(element)}
    end.map do |pair|
      pair.map {|element| dump_structure(element)}
    end
  end
end
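To see the mechanism in isolation, here is a minimal sketch of a higher order lambda used as a `when` condition. The names `ANCESTOR_NAMED` and `NUMERIC_CHECK` are made up for illustration and are not part of YASL:

```ruby
# A higher order lambda: it takes class names and returns another lambda
# that tests whether an object's class ancestors include any of those names.
ANCESTOR_NAMED = lambda do |*class_names|
  lambda do |obj|
    ancestor_names = obj.class.ancestors.map(&:name)
    class_names.any? { |class_name| ancestor_names.include?(class_name) }
  end
end

NUMERIC_CHECK = ANCESTOR_NAMED.call('Integer', 'Float')

puts(NUMERIC_CHECK.call(42))  # true
puts(NUMERIC_CHECK === 42)    # true, Proc#=== invokes the lambda with 42
puts(NUMERIC_CHECK === 'abc') # false

# `case` sends === to each `when` value, so the inner lambdas run in order.
result = case 3.5
         when ANCESTOR_NAMED.call('String') then 'text'
         when NUMERIC_CHECK then 'number'
         end
puts result # number
```

No class constant is referenced anywhere in the conditions, only strings, so nothing has to be loaded for the comparison to run.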

This works because `Proc` objects (including those produced from lambdas) implement `===` as simply `call(object)`. The outer lambda is first called with the class name(s) as strings (not actual loaded classes) and returns another `Proc`, which `case` then invokes through `===` with the object being tested in order to check the names of its class ancestors. Unfortunately, it is not very pretty due to the logic-unrelated lower-level `.call` methods. Thankfully, Ruby (1.9 and later) allows dropping `call` while keeping only the `.` for a more concise version:

# Relating to [YASL](https://github.com/AndyObtiva/yasl)
# Intentionally avoiding `case ClassType` expressions because we are comparing against classes
# that might not be loaded (in older Ruby versions) into memory yet depending on whether
# consumer code loads Set with `require 'set'` and BigDecimal with `require 'bigdecimal'`
def dump_ruby_basic_data_type_data(object)
  class_ancestors_names_include = lambda do |*class_names|
    lambda do |obj|
      class_names.any? do |class_name|
        obj.class.ancestors.map(&:name).include?(class_name)
      end
    end
  end
  case object
  when class_ancestors_names_include.('Time')
    object.to_datetime.marshal_dump
  when class_ancestors_names_include.('Date')
    object.marshal_dump
  when class_ancestors_names_include.('Complex', 'Rational', 'Regexp', 'Symbol', 'BigDecimal')
    object.to_s
  when class_ancestors_names_include.('Set')
    object.to_a.uniq.map {|element| dump_structure(element) unless unserializable?(element)}
  when class_ancestors_names_include.('Range')
    [object.begin, object.end, object.exclude_end?]
  when class_ancestors_names_include.('Array')
    object.map {|element| dump_structure(element) unless unserializable?(element)}
  when class_ancestors_names_include.('Hash')
    object.reject do |key, value|
      [key, value].detect {|element| unserializable?(element)}
    end.map do |pair|
      pair.map {|element| dump_structure(element)}
    end
  end
end

This is prettier and more readable, but still a bit awkward. Can we drop the dot (`.`) entirely? Sure. Just switch parentheses to square brackets, since `Proc#[]` is another way of invoking `call`, resulting in more readable code:

# Relating to [YASL](https://github.com/AndyObtiva/yasl)
# Intentionally avoiding `case ClassType` expressions because we are comparing against classes
# that might not be loaded (in older Ruby versions) into memory yet depending on whether
# consumer code loads Set with `require 'set'` and BigDecimal with `require 'bigdecimal'`
def dump_ruby_basic_data_type_data(object)
  class_ancestors_names_include = lambda do |*class_names|
    lambda do |obj|
      class_names.any? do |class_name|
        obj.class.ancestors.map(&:name).include?(class_name)
      end
    end
  end
  case object
  when class_ancestors_names_include['Time']
    object.to_datetime.marshal_dump
  when class_ancestors_names_include['Date']
    object.marshal_dump
  when class_ancestors_names_include['Complex', 'Rational', 'Regexp', 'Symbol', 'BigDecimal']
    object.to_s
  when class_ancestors_names_include['Set']
    object.to_a.uniq.map {|element| dump_structure(element) unless unserializable?(element)}
  when class_ancestors_names_include['Range']
    [object.begin, object.end, object.exclude_end?]
  when class_ancestors_names_include['Array']
    object.map {|element| dump_structure(element) unless unserializable?(element)}
  when class_ancestors_names_include['Hash']
    object.reject do |key, value|
      [key, value].detect {|element| unserializable?(element)}
    end.map do |pair|
      pair.map {|element| dump_structure(element)}
    end
  end
end
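The three invocation styles used so far are interchangeable, as this small sketch shows. `GREATER_THAN` and `size_label` are hypothetical names chosen for the example:

```ruby
# A higher order lambda that builds a "greater than limit" check.
GREATER_THAN = lambda do |limit|
  lambda { |number| number > limit }
end

# Three equivalent ways to invoke the outer lambda:
check_a = GREATER_THAN.call(10)
check_b = GREATER_THAN.(10)
check_c = GREATER_THAN[10]

puts [check_a, check_b, check_c].map { |check| check === 11 }.inspect # [true, true, true]

# The square bracket form reads almost like a built-in matcher in `when`.
def size_label(number)
  case number
  when GREATER_THAN[100] then 'large'
  when GREATER_THAN[10]  then 'medium'
  else 'small'
  end
end

puts size_label(500) # large
puts size_label(50)  # medium
puts size_label(5)   # small
```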

That said, in the newly released Ruby 3, one could just rely on pattern matching via `case in` instead of `case when`, using the new find pattern (`[*, element, *]`) on the array of ancestor class names to avoid higher order lambdas altogether:

# Relating to [YASL](https://github.com/AndyObtiva/yasl)
def dump_ruby_basic_data_type_data(object)
  case object.class.ancestors.map(&:name)
  in [*, 'Time', *]
    object.to_datetime.marshal_dump
  in [*, 'Date', *]
    object.marshal_dump
  in [*, 'Complex', *] | [*, 'Rational', *] | [*, 'Regexp', *] | [*, 'Symbol', *] | [*, 'BigDecimal', *]
    object.to_s
  in [*, 'Set', *]
    object.to_a.uniq.map {|element| dump_structure(element) unless unserializable?(element)}
  in [*, 'Range', *]
    [object.begin, object.end, object.exclude_end?]
  in [*, 'Array', *]
    object.map {|element| dump_structure(element) unless unserializable?(element)}
  in [*, 'Hash', *]
    object.reject do |key, value|
      [key, value].detect {|element| unserializable?(element)}
    end.map do |pair|
      pair.map {|element| dump_structure(element)}
    end
  end
end

That takes away the need to use higher order lambdas, which are a more complicated construct that is better avoided when possible. That said, YASL (Yet Another Serialization Library) still needs to support older Ruby versions, so I am stuck with higher order lambdas for now.
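For readers who want to try the find pattern on its own, here is a small standalone sketch. It assumes Ruby 3.0 or later (Ruby 3.0 prints an "experimental" warning for this pattern), and `ancestor_kind` is a made-up method name:

```ruby
# Matches on the names of an object's class ancestors without referencing
# any class constant, so nothing needs to be loaded for the comparison.
def ancestor_kind(object)
  case object.class.ancestors.map(&:name)
  in [*, 'Integer' | 'Float', *] # alternatives inside a single find pattern
    'number'
  in [*, 'String', *]
    'text'
  in [*, 'Enumerable', *]        # module names appear among ancestors too
    'collection'
  else
    'unknown'
  end
end

puts ancestor_kind(42)     # number
puts ancestor_kind('hi')   # text
puts ancestor_kind([1, 2]) # collection
puts ancestor_kind(nil)    # unknown
```

The `else` branch matters here: unlike `case when`, a `case in` with no matching pattern raises `NoMatchingPatternError` instead of returning `nil`.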

Just as a final note, keep in mind that the case statement could be eliminated completely by relying on Object Oriented Programming Design Patterns, such as Strategy. This is usually done on a case by case basis (no pun intended), and in this case I deemed it over-engineering to use the Strategy Pattern, but it's certainly an option on the table if needed in future refactorings.
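For the curious, a lightweight, lambda-based take on that idea could look roughly like the sketch below. This is not how YASL is implemented. `DUMP_STRATEGIES` and `dump_value` are hypothetical names, and only three of the types are covered:

```ruby
# Hypothetical strategy table: class name => lambda that dumps the object.
# Keys are strings, so no class has to be loaded to build the table.
DUMP_STRATEGIES = {
  'Symbol' => ->(object) { object.to_s },
  'Range'  => ->(object) { [object.begin, object.end, object.exclude_end?] },
  'Array'  => ->(object) { object.map { |element| dump_value(element) } }
}.freeze

# Picks the first strategy whose class name appears among the object's
# ancestors, and returns the object unchanged when no strategy applies.
def dump_value(object)
  ancestor_names = object.class.ancestors.map(&:name)
  _, strategy = DUMP_STRATEGIES.find { |class_name, _| ancestor_names.include?(class_name) }
  strategy ? strategy.call(object) : object
end

puts dump_value(:name).inspect       # "name"
puts dump_value(1..3).inspect        # [1, 3, false]
puts dump_value([:a, 1...5]).inspect # ["a", [1, 5, true]]
```

The table replaces the `case` statement, at the cost of one more level of indirection, which is the trade-off I mentioned above.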

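In summary, case statements provide multiple ways to test objects: implicit `is_a?` comparisons through Class `===`, higher order lambdas through `Proc` `===`, and Ruby 3 pattern matching through `case in`.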

Have a Happy 2021!
