Wednesday, February 22, 2017

Ruby's Elegant Way of Dealing with The Array of Hashes vs Single Hash JSON Problem

I'm sure you've encountered REST Web Service APIs that operate as follows:
HTTP Request: =>
GET /people
<= 1 person JSON Response:
{"first_name":"Bob","last_name":"McAndrew","city":"Chicago"}
HTTP Request: =>
GET /people
<= 3 people JSON Response:
[{"first_name":"Bob","last_name":"McAndrew","city":"Chicago"}, {"first_name":"Lisa","last_name":"Donald","city":"Madison"}, {"first_name":"Montgomery","last_name":"Hicks","city":"Indianapolis"}]
How do you work with the varied JSON responses in Ruby?
One approach for an app that needs to count people in cities:
city_counts = {}
json_response = people_http_request
if json_response.is_a?(Hash)
  city_counts[json_response["city"]] ||= 0
  city_counts[json_response["city"]] += 1
elsif json_response.is_a?(Array)
  json_response.each do |person|
    city_counts[person["city"]] ||= 0
    city_counts[person["city"]] += 1
  end
end
Not only is the code above redundant and overly complicated, but it also breaks common Ruby and object-oriented development standards by relying on explicit type checking instead of duck typing, polymorphism, or design patterns.
A slightly better version relying on duck-typing would be:
city_counts = {}
json_response = people_http_request
if json_response.respond_to?(:each_pair)
  city_counts[json_response["city"]] ||= 0
  city_counts[json_response["city"]] += 1
elsif json_response.respond_to?(:each_index)
  json_response.each do |person|
    city_counts[person["city"]] ||= 0
    city_counts[person["city"]] += 1
  end
end
A slightly clearer version relying on design patterns (Strategy) and parametric polymorphism (functional) would be:
city_counts = {}
city_counting_strategies = {
  # Keys are the classes themselves so the lookup by json_response.class succeeds
  Hash => ->(json_response) {
    city_counts[json_response["city"]] ||= 0
    city_counts[json_response["city"]] += 1
  },
  Array => ->(json_response) {
    json_response.each do |person|
      city_counts[person["city"]] ||= 0
      city_counts[person["city"]] += 1
    end
  }
}
json_response = people_http_request
city_counting_strategies[json_response.class].call(json_response)
A more radical version relying on object-oriented polymorphism and Ruby open-classes would be:
Hash.class_eval do
  def process_json_response(&processor)
    processor.call(self)
  end
end

Array.class_eval do
  def process_json_response(&processor)
    each(&processor)
  end
end

city_counts = {}
json_response = people_http_request
json_response.process_json_response do |person|
  city_counts[person["city"]] ||= 0
  city_counts[person["city"]] += 1
end
This version is quite elegant, clear, and idiomatic Ruby, but aren't we using a nuclear device against a fly that sometimes comes as a swarm of flies? I'm sure we can have a much simpler solution, especially in a language like Ruby.
Well, how about this functional solution?
city_counts = {}
[people_http_request].flatten.each do |person|
  city_counts[person["city"]] ||= 0
  city_counts[person["city"]] += 1
end
Yes, hybrid functional/object-oriented programming to the rescue.
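To see why wrapping and flattening handles both response shapes, here is a small runnable sketch. The count_cities helper is hypothetical (not from the post); it parses a canned JSON body in place of a real people_http_request call:

```ruby
require "json"

# Hypothetical helper: parses a JSON body and counts people per city.
# It stands in for the post's people_http_request plus the counting loop.
def count_cities(json_body)
  city_counts = {}
  # A lone Hash becomes [hash]; an Array of hashes is unwrapped back to itself.
  [JSON.parse(json_body)].flatten.each do |person|
    city_counts[person["city"]] ||= 0
    city_counts[person["city"]] += 1
  end
  city_counts
end

single = '{"first_name":"Bob","last_name":"McAndrew","city":"Chicago"}'
many = '[{"first_name":"Bob","last_name":"McAndrew","city":"Chicago"},' \
       ' {"first_name":"Lisa","last_name":"Donald","city":"Madison"},' \
       ' {"first_name":"Montgomery","last_name":"Hicks","city":"Indianapolis"}]'

single_counts = count_cities(single) # => {"Chicago"=>1}
many_counts = count_cities(many)     # => {"Chicago"=>1, "Madison"=>1, "Indianapolis"=>1}
```

This works because Array#flatten only unwraps nested arrays; hashes inside the array are left untouched.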
One may wonder what to do if the response comes in as nil or includes nil values in an array. This approach extends to handle that too, should ignoring nil be the requirement:
city_counts = {}
[people_http_request].flatten.compact.each do |person|
  city_counts[person["city"]] ||= 0
  city_counts[person["city"]] += 1
end
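As a quick check of the nil handling, the sketch below normalizes two made-up responses (the variable names and sample data are illustrative stand-ins for what people_http_request might return):

```ruby
# Stand-ins for possible return values of people_http_request
nil_response = nil
mixed_response = [{ "city" => "Chicago" }, nil, { "city" => "Madison" }]

# A nil response is wrapped as [nil], then compact drops it, leaving no work to do.
normalized_nil = [nil_response].flatten.compact     # => []

# nil entries inside an array are dropped; the hashes survive in order.
normalized_mixed = [mixed_response].flatten.compact # => [{"city"=>"Chicago"}, {"city"=>"Madison"}]
```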
Can we generalize this elegant solution beyond counting cities? After all, the key problem with the code at the top is that it gets quite expensive to maintain in a real-world production app containing many integrations with REST Web Service APIs.
This functional generalization should work by allowing you to swap in any json_response variable and any process_json_response proc you want (note that a proc is passed with &, not as a symbol):
[json_response].flatten.compact.each(&process_json_response)
Example:
[cities_json_response].flatten.compact.each(&group_by_country)
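A runnable sketch of this generalization, assuming the processing step is held in a lambda and passed with & rather than as a symbol. The count_city name and the sample responses are made up for illustration:

```ruby
city_counts = {}

# Hypothetical processing step, written once and reused for any response shape.
count_city = ->(person) {
  city_counts[person["city"]] ||= 0
  city_counts[person["city"]] += 1
}

array_response = [{ "city" => "Chicago" }, nil, { "city" => "Chicago" }]
hash_response = { "city" => "Madison" }

# The same pipeline handles an array (with a stray nil) and a single hash.
[array_response].flatten.compact.each(&count_city)
[hash_response].flatten.compact.each(&count_city)

city_counts # => {"Chicago"=>2, "Madison"=>1}
```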
How about going one step further and baking this into all objects using our previous approach of object-oriented polymorphism and Ruby open classes? That way, we collapse the difference not just between arrays of hashes and single hashes, but also between arrays of objects and singular objects, by adding a to_collection method to Object. Note the use of flatten(1) below to prevent arrays of arrays from collapsing more than one level.
Object.class_eval do
  def to_collection
    [self].flatten(1).compact
  end
end
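To see why the depth argument matters, the sketch below compares flatten with flatten(1) on an array of arrays. The rows variable is sample data made up for illustration:

```ruby
rows = [[1, 2], [3, 4]]

# Unbounded flatten destroys the inner arrays.
fully_flattened = [rows].flatten    # => [1, 2, 3, 4]

# flatten(1) only undoes the wrapping added by [rows].
one_level = [rows].flatten(1)       # => [[1, 2], [3, 4]]

# Non-array values are simply wrapped, since there is nothing to unwrap.
wrapped_integer = [5].flatten(1)                  # => [5]
wrapped_hash = [{ "city" => "Chicago" }].flatten(1) # => [{"city"=>"Chicago"}]
```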
Example usage (notice how much more readable this is than the explicit version above, since it hides flatten and compact):
city_counts = {}
people_http_request.to_collection.each do |person|
  city_counts[person["city"]] ||= 0
  city_counts[person["city"]] += 1
end
A refactored version including optional compacting would be:
Object.class_eval do
  def to_collection(compact=true)
    collection = [self].flatten(1)
    compact ? collection.compact : collection
  end
end
Example usage of to_collection(compact) to count bad person hashes coming as nil:
bad_people_count = 0
city_counts = {}
people_http_request.to_collection(false).each do |person|
  if person.nil?
    bad_people_count += 1
  else
    city_counts[person["city"]] ||= 0
    city_counts[person["city"]] += 1
  end
end
Grab this solution as a Ruby gem: 
https://github.com/AndyObtiva/to_collection


Know of any other solutions to this problem? Please share in comments.
