
I am trying to write a Python program to extract the colors of a Rubik's cube. I am stuck at the recognition/masking stage (separating the cube from the background).

What I do is (sketched in code below):

  • Canny edge detection
  • dilation
  • contour detection
  • contour approximation
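
Roughly, in OpenCV code (the thresholds and iteration count here are placeholders, not my exact values):

    import cv2

    img = cv2.imread('cube.png')
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    edges = cv2.Canny(gray, 50, 150)                 # placeholder thresholds
    dilated = cv2.dilate(edges, None, iterations=2)

    contours, _ = cv2.findContours(dilated, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    # Approximate each contour to a simpler polygon
    approx = [cv2.approxPolyDP(c, 0.02 * cv2.arcLength(c, True), True)
              for c in contours]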

But I still end up with too many contours, because of background objects and so on, and cannot separate the Rubik's cube from the background.

Any ideas on how I can approach this problem?

[image: my current output]

  • Is the camera fixed? Will the cube always be in the same position relative to the camera (sitting, whichever sides visible)? Show the SOURCE data, not filtered results. Your approach is likely not leading to a solution; you should want different approaches discussed, not ways to fix a bad approach. Commented Sep 27, 2021 at 18:40
  • Always best to post your original color image, so others might suggest different approaches for you. Commented Sep 28, 2021 at 4:38
  • Please provide enough code so others can better understand or reproduce the problem. Commented Oct 5, 2021 at 17:03

2 Answers


The code described below is on GitHub in main.py of simbo1905/CubieMoves. Note that the link points to the first commit that has the working solve detailed below.

My approach is to get the user to hold the cube face close and centred until it is the correct size. This is what apps do when you scan a credit card or a QR code with your phone:

[image: alignment guide shown to the user]

Then I put this mask over the image to pick out the majority of the stickers:

[image: circle mask over the sticker positions]

So I can draw circles on a video image to help the user align the mask properly.
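
A minimal sketch of how such a circle mask might be built (the cell spacing and radius here are assumptions; main.py has its own values):

    import cv2
    import numpy as np

    SIZE = 768          # face images are cropped to 768x768 (see the note below)
    CELL = SIZE // 3    # one cell per sticker in the 3x3 grid
    RADIUS = CELL // 3  # circle sits well inside each sticker (assumed)

    mask_img = np.zeros((SIZE, SIZE), dtype=np.uint8)
    for row in range(3):
        for col in range(3):
            centre = (col * CELL + CELL // 2, row * CELL + CELL // 2)
            cv2.circle(mask_img, centre, RADIUS, 255, thickness=-1)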

If you apply the mask to each face image and concatenate all six into one large "all sides" image, you get:

[image: the six masked faces stacked into one "all sides" image]

Note that the commit above does not have any user-input logic; it starts from input files such as 'U.png', 'L.png', 'R.png', etc., which need to have been cropped to 768x768 and centred on the cube face. This keeps the focus on the image-processing logic.

Here is the logic to mask each of the six images and stack them into one large image to make that polka dot image above:

    # Face order required by the kociemba solver: Up, Right, Front, Down, Left, Back
    # https://pypi.org/project/kociemba/
    face_labels = ['U', 'R', 'F', 'D', 'L', 'B']

    dim = (768, 768)  # every face image is resized to match the 768x768 mask

    masked_images = []
    for face in face_labels:
        img_name = face + '.png'
        img_original = cv2.imread(img_name)
        img = cv2.resize(img_original, dim, interpolation=cv2.INTER_LINEAR)
        # Apply the circle mask onto the input image
        masked = cv2.bitwise_and(img, img, mask=mask_img)
        masked_images.append(masked)

    # Stack the six masked faces into a 3x2 grid: U R F on top, D L B below
    top_row = np.hstack((masked_images[0], masked_images[1], masked_images[2]))
    bottom_row = np.hstack((masked_images[3], masked_images[4], masked_images[5]))
    sticker_stack = np.vstack((top_row, bottom_row))

We then know we need to segment that large "all sides" image into 7 colours (the six sticker colours plus the black mask).

In the image below, I have run k-means with 7 clusters, as per the OpenCV docs (see the main.py code at the link above). It successfully found blue, yellow, and green, yet it confused red and orange as one colour, and it split white into light grey and pink:

[image: first k-means pass with 7 clusters]
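
For reference, a minimal sketch of that quantisation step (the criteria and attempt count are illustrative, not copied from main.py):

    import cv2
    import numpy as np

    def quantize(image, k):
        # Flatten the HxWx3 image into float32 pixels, as cv2.kmeans requires
        pixels = image.reshape(-1, 3).astype(np.float32)
        criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
        _, labels, centers = cv2.kmeans(pixels, k, None, criteria, 10,
                                        cv2.KMEANS_RANDOM_CENTERS)
        # Repaint every pixel with its cluster centre colour
        centers = centers.astype(np.uint8)
        return centers[labels.flatten()].reshape(image.shape), centers

    quantized, palette = quantize(sticker_stack, 7)  # 6 sticker colours + black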

First I "count pixels" in each circle and label that position with the majority colour. We have a high certainty of correctly identifying one set of stickers when you count exactly 9 sticker of one colour. You can then mask out those regions of interest and subtract 1 from the total number of clusters left to find. In the first pass it had correctly picked out 9 yellow, 9 blue and 9 green stickers. For each group them I overwrite them with black pixes and subtract 1 from k. So I then run kmeans again to find 4 clusters (black, white, orange and red) which gave:

[image: second k-means pass with 4 clusters]

In that second run, it successfully segmented each of black, white, orange, and red. Once again I count the pixels of each colour in each sticker spot and label the spot with the majority colour. As I then find that I have 9 stickers of each colour, the labelling is complete.
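
A hedged sketch of that majority-vote labelling; centres_px (a list of the 54 circle centres in the stacked image) and the radius are assumptions, and it presumes the mask cluster quantised to pure black:

    from collections import Counter

    def majority_colour(image, centre, radius):
        x, y = centre
        patch = image[y - radius:y + radius, x - radius:x + radius]
        # Count each BGR tuple in the patch; the most common one wins
        counts = Counter(map(tuple, patch.reshape(-1, 3)))
        counts.pop((0, 0, 0), None)  # ignore the black mask background
        return counts.most_common(1)[0][0]

    final_colours = [majority_colour(quantized, c, 20) for c in centres_px]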

This leads me to the following classification:

[image: final classification of all 54 stickers]

We end up with the array final_colours logged as:

INFO:root:final_colours: [(0, 58, 241), (0, 185, 204), (150, 118, 38), (0, 58, 241), (0, 185, 204), (34, 2, 202), (143, 161, 198), (0, 163, 82), (34, 2, 202), (150, 118, 38), (150, 118, 38), (0, 58, 241), (150, 118, 38), (150, 118, 38), (0, 185, 204), (0, 58, 241), (34, 2, 202), (0, 185, 204), (150, 118, 38), (0, 58, 241), (0, 185, 204), (143, 161, 198), (0, 58, 241), (0, 185, 204), (150, 118, 38), (150, 118, 38), (0, 185, 204), (0, 58, 241), (0, 58, 241), (0, 163, 82), (150, 118, 38), (143, 161, 198), (143, 161, 198), (34, 2, 202), (34, 2, 202), (34, 2, 202), (143, 161, 198), (143, 161, 198), (34, 2, 202), (0, 163, 82), (0, 163, 82), (0, 163, 82), (143, 161, 198), (143, 161, 198), (0, 185, 204), (143, 161, 198), (34, 2, 202), (0, 163, 82), (0, 58, 241), (34, 2, 202), (0, 185, 204), (0, 163, 82), (0, 163, 82), (0, 163, 82)]

We know that the centres do not move and are at the following indexes in that array:

    centres: dict[str, int] = {'U': 4 + (0 * 9), 'R': 4 + (1 * 9), 'F': 4 + (2 * 9),
                               'D': 4 + (3 * 9), 'L': 4 + (4 * 9), 'B': 4 + (5 * 9)}

So we can pick out from that long list the colours at the centre stickers, giving:

INFO:root:colors_to_labels: {(0, 185, 204): 'U', (150, 118, 38): 'R', (0, 58, 241): 'F', (143, 161, 198): 'D', (0, 163, 82): 'L', (34, 2, 202): 'B'}
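
That mapping can be built directly from the centres dict above, for example:

    colors_to_labels = {final_colours[idx]: face for face, idx in centres.items()}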

Then we can loop over the 54 elements of final_colours and look up their labels with:

    encoding = ""
    for colour in final_colours:
        encoding = encoding + colors_to_labels[colour]

That gives our input to the solver library:

    solve = kociemba.solve(encoding)

For my test images in the 'solve0' folder we get:

INFO:root:encoding: FURFUBDLBRRFRRUFBURFUDFURRUFFLRDDBBBDDBLLLDDUDBLFBULLL
INFO:root:solve:D R2 L F' R'

There are three approaches I can think of.

[1]

The first way is to change the way you take the picture of the Rubik's cube. It is hard to use contour detection when your image has a lot of background noise.

With the cube against an all-white background (or another color that you like), contour detection would work much better.

[2]

Or even better, you can remove the background color first (just like a green screen, but make sure it is not a color present on the cube), and you would end up with only the Rubik's cube.
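
A minimal sketch of that idea with OpenCV, assuming a green backdrop (the HSV bounds are illustrative and would need tuning):

    import cv2
    import numpy as np

    img = cv2.imread('cube.png')
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

    # Everything inside this HSV range is treated as backdrop
    backdrop = cv2.inRange(hsv, np.array([40, 60, 60]), np.array([80, 255, 255]))

    # Keep only the non-backdrop pixels, i.e. the cube (and the hand holding it)
    cube_only = cv2.bitwise_and(img, img, mask=cv2.bitwise_not(backdrop))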

[3]

But if you really need to use an image like the one you attached (you holding the Rubik's cube), I would recommend a small object-detection model. You can use a Haar cascade: prepare a dataset, train a detection model, and work on the detected region after that.
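
If you go that route, applying a trained cascade looks roughly like this ('cube_cascade.xml' is a hypothetical model you would have to train yourself):

    import cv2

    cascade = cv2.CascadeClassifier('cube_cascade.xml')  # hypothetical model
    gray = cv2.cvtColor(cv2.imread('scene.png'), cv2.COLOR_BGR2GRAY)

    # Each detection is an (x, y, w, h) bounding box around a candidate cube
    for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.1,
                                                 minNeighbors=5):
        cv2.rectangle(gray, (x, y), (x + w, y + h), 255, 2)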

[Extra]

I have found one good method that could work: https://github.com/kkoomen/qbr. You can ignore the edge detection and other steps and focus on detecting the Rubik's colors in the image instead (you might have to calibrate it against your own cube's colors). Once you know the rough location of each color, you can do some coding to tidy up the result.

