I need to split a string into a nested array. The codes shown here are a small subset of the actual data.

  • There is no fixed size: a code could be aaa100 or bbbbbbbb100
  • No fixed char to split on

A similar problem is converting a file path to an array, but I don't have anything like / to split on.

Use case:

Product import with a few thousand products. Each product code is made up of something like the following:

xxx product type
yyy size
zzz colour

So you have codes like xxx, xxxyyy, xxxzzz or xxxyyyzzz.

AAA100B // product type AAA with size 100 and colour blue
AAA200B // product type AAA with size 200 and colour blue
AAA100G // product type AAA with size 100 and colour green
BBB100B // product type BBB with size 100 and colour blue

Some do not have a colour, others don't have a size. It is possible that some will not have 'matches' and will just be a lone code.

Edit

The last char (or two) is normally a colour. Colours can be any of the following. (A colour code will never appear anywhere except the end of the string.)

BK, GY
B, G, O, P, R, S

Numbers are assumed to be the size. If the number is in the form 00-00 it's length/width, otherwise it's a single size 00.

Edit

Rough start to the problem, virtually there (expand and merge are part of the framework so just hacked them in there) http://codepad.viper-7.com/Wp19g2

  • There has to be some pattern to the input .. is it always "Letters""Numbers""Letters"? Commented Dec 6, 2012 at 23:14
  • I linked a sample of actual data pastebin.com/BLPVfvBj. The only real pattern is 'product-type' - 'size' - 'colour', but colour and size are not always there. There is also not always a separator between them Commented Dec 6, 2012 at 23:19
  • Well there has to be a way to know when product-type ends and size (or color) begins Commented Dec 6, 2012 at 23:20
  • generally the size would be numeric while the rest is alpha. The colours seem to always be the last char (or two in some cases like BK for black as B is blue). Commented Dec 6, 2012 at 23:21
  • If you're saying that product-type can be followed immediately by color and that both are alpha and there is no pattern to color, then this is kind of impossible Commented Dec 6, 2012 at 23:22

1 Answer

Hmm, I guess the obvious (yet somewhat inefficient) way would be to take your code string and loop through the characters until you reach a number, recording all the letters up to that point as your product type. When you reach a number, record it and every digit after it as the size until you reach a letter again, then begin recording the color. This, of course, assumes that the color and product type do not include numbers!

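// Assumes $code holds one product code, e.g. "AAA100B".
$product_size = ""; $product_type = ""; $product_color = "";
for ($i = 0; $i < strlen($code); $i++) {
     if (!is_numeric($code[$i])) {
          // Letters before any digit go to the type; letters after the size go to the color.
          if ($product_size == "") $product_type .= $code[$i];
          else $product_color .= $code[$i];
     } else $product_size .= $code[$i]; // digits accumulate into the size
}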

It's a good design choice to have a separator though. One problem with my solution is that if there is no product size then it will never record a color, because every letter keeps being appended to the product type.
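
One possible way around the missing-size problem is to lean on the fixed colour list from the question's edit (BK, GY, B, G, O, P, R, S) and anchor a regular expression at the end of the string. The following is only a sketch under stated assumptions: the product type is letters only, codes are matched case-insensitively by upper-casing them, PHP 7.1 or later is available, and the nested layout (type, then size, then colour) is a guess at what the asker wants. The names parse_product_code, group_codes and COLOUR_CODES are hypothetical helpers, not part of any framework.

```php
<?php
// Colour suffixes listed in the question; the two-letter codes are listed
// first so that "BK" is tried before "B".
const COLOUR_CODES = 'BK|GY|B|G|O|P|R|S';

/**
 * Hypothetical helper: split one code into type, size and colour.
 * Assumes the type is letters only and the colour, if present, is a suffix.
 * Returns null when the code does not fit that layout.
 */
function parse_product_code(string $code): ?array
{
    // Lazy type, optional size ("00" or "00-00"), optional colour suffix.
    $pattern = '/^([A-Z]+?)(?:(\d+)(?:-(\d+))?)?(' . COLOUR_CODES . ')?$/';
    if (!preg_match($pattern, strtoupper(trim($code)), $m)) {
        return null;
    }
    $size = null;
    if (($m[2] ?? '') !== '') {
        // "00-00" is read as length/width, a lone number as a single size.
        $size = ($m[3] ?? '') !== ''
            ? ['length' => $m[2], 'width' => $m[3]]
            : $m[2];
    }
    return [
        'type'   => $m[1],
        'size'   => $size,
        'colour' => ($m[4] ?? '') !== '' ? $m[4] : null,
    ];
}

/**
 * Hypothetical helper: build a type => size => colour => code tree.
 * Codes that do not parse are skipped here; a real import would log them.
 */
function group_codes(array $codes): array
{
    $tree = [];
    foreach ($codes as $code) {
        $p = parse_product_code($code);
        if ($p === null) {
            continue;
        }
        // Missing size or colour is stored under an empty-string key.
        $sizeKey = is_array($p['size'])
            ? $p['size']['length'] . '-' . $p['size']['width']
            : ($p['size'] ?? '');
        $tree[$p['type']][$sizeKey][$p['colour'] ?? ''] = $code;
    }
    return $tree;
}
```

Calling group_codes(['AAA100B', 'AAA200B', 'AAA100G', 'BBB100B']) nests AAA100B and AAA100G under AAA and size 100, AAA200B under AAA and size 200, and BBB100B under BBB and size 100. The ambiguity raised in the comments remains: a lone code whose last letters happen to be a colour code (for example a type called ABS) would be read as type AB with colour S, so a list of known product types would still be needed to settle those cases.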

5 Comments

This doesn't work for the codes with no size between the type and color.
Yup. I'm not sure how one can overcome that hurdle without a separator or a good idea of how we can differentiate between the blocks.
That's the problem. Until the OP provides enough information to disambiguate, there's not much point in posting answers.
Unfortunately I do not have a choice in the separator matter. Just using what I have got. I have added a bit more detail regarding the colours which are pretty constant.
Will the last color always be one or two letters?
