Make unique array with minimal sum

Question

It is a interview question. Given an array, e.g., [3,2,1,2,7], we want to make all elements in this array unique by incrementing duplicate elements and we require the sum of the refined array is minimal. For example the answer for [3,2,1,2,7] is [3,2,1,4,7] and its sum is 17. Any ideas?

Are you looking for any solution, or do you also have a peformance bound to hit here? — Tim Biegeleisen
– Tim Biegeleisen, Commented Oct 8, 2017 at 8:36
Sort the array. Scan from left to right (low to high), if any element is the same as its left neighbour add 1 to it. Unsort the array (you did remember to keep the sort permutation didn't you ?). — High Performance Mark
– High Performance Mark, Commented Oct 8, 2017 at 9:57

High Performance Mark · Accepted Answer · 2017-10-08 12:43:27Z

2

It's not quite as simple as my earlier comment suggested, but it's not terrifically complicated.

First, sort the input array. If it matters to be able to recover the original order of the elements then record the permutation used for the sort.

Second, scan the sorted array from left to right (ie from low to high). If an element is less than or equal to the element to its left, set it to be one greater than that element.

Pseudocode

  sar = sort(input_array)
  for index = 2:size(sar)   ! I count from 1
      if sar(index)<=sar(index-1) sar(index) = sar(index-1)+1
  forend

Is the sum of the result minimal ? I've convinced myself that it is through some head-scratching and trials but I haven't got a formal proof.

answered Oct 8, 2017 at 12:43

High Performance Mark

78.7k7 gold badges109 silver badges168 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

Andrea Calabrese · Accepted Answer · 2017-10-08 10:26:55Z

If you only need to find ONE of the best solution, here's the algorythm with some explainations. The idea of this problem is to find an optimal solution, which can be found only by testing all existing solutions (well, they're infinite, let's stick with the reasonable ones).

I wrote a program in C, because I'm familiar with it, but you can port it to any language you want.

The program does this: it tries to increment one value to the max possible (I'll explain how to find it in the comments under the code sections), than if the solution is not found, decreases this value and goes on with the next one and so on.

It's an exponential algorythm, so it will be very slow on large values of duplicated data (yet, it assures you the best solution is found).

I tested this code with your example, and it worked; not sure if there's any bug left, but the code (in C) is this.

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
typedef int BOOL;   //just to ease meanings of values
#define TRUE 1
#define FALSE 0

Just to ease comprehension, I did some typedefs. Don't worry.

typedef struct duplicate {      //used to fasten the algorythm; it uses some more memory just to assure it's ok
    int value;
    BOOL duplicate;
} duplicate_t;

int maxInArrayExcept(int *array, int arraySize, int index); //find the max value in array except the value at the index given
                                                            //the result is the max value in the array, not counting th index
int *findDuplicateSum(int *array, int arraySize);
BOOL findDuplicateSum_R(duplicate_t *array, int arraySize, int *tempSolution, int *solution, int *totalSum, int currentSum);    //resursive function used to find solution
BOOL check(int *array, int arraySize);  //checks if there's any repeated value in the solution

These are all the functions we'll need. All split up for comprehension purpose. First, we have a struct. This struct is used to avoid checking, for every iteration, if the value on a given index was originally duplicated. We don't want to modify any value not duplicated originally.

Then, we have a couple functions: first, we need to see the worst case scenario: every value after the duplicated ones is already occupied: then we need to increment the duplicated value up to the maximum value reached + 1. Then, there are the main Function we'll discute later about. The check Function only checks if there's any duplicated value in a temporary solution.

int main() {                            //testing purpose
    int i;
    int testArray[] = { 3,2,1,2,7 };    //test array
    int nTestArraySize = 5;             //test array size
    int *solutionArray;                 //needed if you want to use the solution later
    solutionArray = findDuplicateSum(testArray, nTestArraySize);
    for (i = 0; i < nTestArraySize; ++i) {
        printf("%d ", solutionArray[i]);
    }
    return 0;
}

This is the main Function: I used it to test everything.

int * findDuplicateSum(int * array, int arraySize)
{
    int *solution = malloc(sizeof(int) * arraySize);
    int *tempSolution = malloc(sizeof(int) * arraySize);
    duplicate_t *duplicate = calloc(arraySize, sizeof(duplicate_t));
    int i, j, currentSum = 0, totalSum = INT_MAX;
    for (i = 0; i < arraySize; ++i) {
        tempSolution[i] = solution[i] = duplicate[i].value = array[i];
        currentSum += array[i];
        for (j = 0; j < i; ++j) {   //to find ALL the best solutions, we should also put the first found value as true; it's just a line more
                                //yet, it saves the algorythm half of the duplicated numbers (best/this case scenario)
            if (array[j] == duplicate[i].value) {
                duplicate[i].duplicate = TRUE;
            }
        }
    }
    if (findDuplicateSum_R(duplicate, arraySize, tempSolution, solution, &totalSum, currentSum));
    else {
        printf("No solution found\n");
    }
    free(tempSolution);
    free(duplicate);
    return solution;
}

This Function does a lot of things: first, it sets up the solution array, then it initializes both the solution values and the duplicate array, that is the one used to check for duplicated values at startup. Then, we find the current sum and we set the maximum available sum to the maximum integer possible. Then, the recursive Function is called; this one gives us the info about having found the solution (that should be Always), then we return the solution as an array.

int findDuplicateSum_R(duplicate_t * array, int arraySize, int * tempSolution, int * solution, int * totalSum, int currentSum)
{
    int i;
    if (check(tempSolution, arraySize)) {
        if (currentSum < *totalSum) {       //optimal solution checking
            for (i = 0; i < arraySize; ++i) {
                solution[i] = tempSolution[i];
            }
            *totalSum = currentSum;
        }
        return TRUE; //just to ensure a solution is found
    }
    for (i = 0; i < arraySize; ++i) {
        if (array[i].duplicate == TRUE) {
            if (array[i].duplicate <= maxInArrayExcept(solution, arraySize, i)) {   //worst case scenario, you need it to stop the recursion on that value
                tempSolution[i]++;
                return findDuplicateSum_R(array, arraySize, tempSolution, solution, totalSum, currentSum + 1);
                tempSolution[i]--;  //backtracking
            }
        }
    }
    return FALSE; //just in case the solution is not found, but we won't need it
}

This is the recursive Function. It first checks if the solution is ok and if it is the best one found until now. Then, if everything is correct, it updates the actual solution with the temporary values, and updates the optimal condition. Then, we iterate on every repeated value (the if excludes other indexes) and we progress in the recursion until (if unlucky) we reach the worst case scenario: the check condition not satisfied above the maximum value. Then we have to backtrack and continue with the iteration, that will go on with other values.

PS: an optimization is possible here, if we move the optimal condition from the check into the for: if the solution is already not optimal, we can't expect to find a better one just adding things.

The hard code has ended, and there are the supporting functions:

int maxInArrayExcept(int *array, int arraySize, int index) {
    int i, max = 0;
    for (i = 0; i < arraySize; ++i) {
        if (i != index) {
            if (array[i] > max) {
                max = array[i];
            }
        }
    }
    return max;
}

BOOL check(int *array, int arraySize) {
    int i, j;
    for (i = 0; i < arraySize; ++i) {
        for (j = 0; j < i; ++j) {
            if (array[i] == array[j]) return FALSE;
        }
    }
    return TRUE;
}

I hope this was useful. Write if anything is unclear.

jhpratt · Accepted Answer · 2018-04-18 18:51:18Z

0

Well, I got the same question in one of my interviews. Not sure if you still need it. But here's how I did it. And it worked well.

num_list1 = [2,8,3,6,3,5,3,5,9,4]

def UniqueMinSumArray(num_list):

    max=min(num_list)
    for i,V in enumerate(num_list):
        while (num_list.count(num_list[i])>1):
            if (max > num_list[i]+1) :
                num_list[i] = max + 1
            else:
                num_list[i]+=1
            max = num_list[i]
        i+=1
    return num_list

print (sum(UniqueMinSumArray(num_list1)))

You can try with your list of numbers and I am sure it will give you the correct unique minimum sum.

edited Apr 18, 2018 at 18:51

jhpratt

7,15016 gold badges43 silver badges53 bronze badges

answered Apr 18, 2018 at 18:24

Analyst17

1631 gold badge2 silver badges13 bronze badges

Comments

Zhen Wang · Accepted Answer · 2019-06-25 04:36:50Z

I got the same interview question too. But my answer is in JS in case anyone is interested.

For sure it can be improved to get rid of for loop.

function getMinimumUniqueSum(arr) {
    // [1,1,2] => [1,2,3] = 6
    // [1,2,2,3,3] = [1,2,3,4,5] = 15
    if (arr.length > 1) {
        var sortedArr = [...arr].sort((a, b) => a - b);
        var current = sortedArr[0];
        var res = [current];

        for (var i = 1; i + 1 <= arr.length; i++) {
            // check current equals to the rest array starting from index 1.
            if (sortedArr[i] > current) {
                res.push(sortedArr[i]);
                current = sortedArr[i];
            } else if (sortedArr[i] == current) {
                    current = sortedArr[i] + 1;
                    // sortedArr[i]++;
                    res.push(current);
                } else {
                   current++;
                   res.push(current);
            }
        }

        return res.reduce((a,b) => a + b, 0);
    } else {
        return 0;
    }
}

Collectives™ on Stack Overflow

Make unique array with minimal sum

4 Answers 4

Comments

Comments

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

Comments

Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related