3

I have a hash that has an unknown collection and mixture of nested arrays, hashes, arrays of hashes and strings. This is the result of JSON.parse. The structure of the data must be the same as it started with. The end goal is to convert strings to Fixnums that could be Fixnums.

The following works just fine, but I was wondering if it could be shortened. Note how I need the key and the value in the clean method as not all strings that can be Fixnums should be. Any ideas?

def clean_node(node)
    if node.class == String
      clean(node)
    elsif node.class == Array
      node.each_with_index do |obj, i|
        if obj.class == String
          node[i] = clean(node[i], obj)
        else
          clean_node(obj)
        end
      end
    elsif node.class == Hash
      node.each_pair do |key, value|
        if value.class == String
          node[key] = clean(key, value)
        else
          clean_node(value)
        end
      end
    end
end

def clean(key, value)
    FIXNUM_KEYS.include?(key)? value.to_i : value
end
5
  • 3
    Are you generating the JSON yourself? It looks like the JSON contains numbers as strings. I would rather try to fix the code that generates the JSON. Commented Nov 13, 2013 at 7:38
  • In line 7, on the right side of the assignment, node[i] is the same as obj. Why are you writing clean(node[i], obj)? Commented Nov 13, 2013 at 7:46
  • notwithstanding the strange logic noted by sawa, this kind of code is pretty typical of processing heterogenous data structures - I don't think it really simplifies.... Commented Nov 13, 2013 at 7:52
  • I don't generate the JSON otherwise of course, i would fix it there. And trying to fix it once I get it before parsing it to a hash results in a horrible amount of side effects to deal with. Commented Nov 13, 2013 at 21:45
  • @sawa because clean actually needs the key and the value, clean returns a string. I wrote clean so I didn't have to write that same line in clean twice, once for the hash and once for the array cases. Commented Nov 13, 2013 at 21:47

2 Answers 2

3

There is no need to split up the string processing, you can do it all in one recursive routine, if you return the value of node at the end of the routine. This has the side effect of removing some parameters, and you can use in-place .map! for handling arrays.

Using case makes the selection by type slightly easier to read.

Adding the ability to deal with arrays of strings could be done in lots of ways. I've chosen to add a state variable (a common pattern in recursion, for instance to count depth). Then in the case of arrays, the recursion inherits the state from the current level, whilst the hash case determines new state (to convert or not) based on lookup in the list of keys to apply conversions.

def clean_node( node, convert_item = false )

    case node

    when String
      node = node.to_i if convert_item

    when Array
      node.map! do |obj| 
        clean_node( obj, convert_item )
      end

    when Hash
      node.each_pair do |key, value|
        node[key] = clean_node( value, FIXNUM_KEYS.include?(key) )
      end
    end

    node
end
Sign up to request clarification or add additional context in comments.

2 Comments

I like it, but you don't check the FIXNUM_KEYS and do a .to_i with any of the array objects?
@Marc: Your original code did not apparently deal with that (and having the declarative nature of FIXNUM_KEYS means there is some un-written intent in how it should be used). In practice I think it is cleaner to add back in the handling of String values and convert one recursion level deeper.
2

Although I have not looked into the recursion, I must comment on the fact that you are writing if statements that would be easier to read in a case statement:

def clean_node(node)
  case node
    when String then clean(node)
    when Array
      node.each_with_index do |obj, i|
        case obj
          when String
            node[i] = clean(node[i], obj)
          else
            clean_node(obj)
        end
      end
    when Hash....

2 Comments

I also think that the case for String is redundant. All the strings will be altered inside the containers, and clean(node) does not alter the string in place. The code could be cleaned up even further, but it all starts looking like codereview.stackexchange.com - I think moving to case statements is a good core piece of advice for readability.
I never knew of codereview.stackexchange.com, thanks, and yes, that was what I was going after. The casing suggestion is great. Going to wait for @NeilSlater response to my question to his answer before I mark anyone as answer.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.