Calculating multiples in Haskell (conversion from C)? [closed]

Question

Closed. This question needs to be more focused. It is not currently accepting answers.

Want to improve this question? Guide the asker to update the question so it focuses on a single, specific problem. Narrowing the question will help others answer the question concisely. You may edit the question if you feel you can improve it yourself. If edited, the question will be reviewed and might be reopened.

Closed 7 years ago.

Improve this question

I'm trying to write a Haskell program that calculates multiples. Basically, when given two integers a and b, I want to find how many integers 1 ≤ bi ≤ b are multiple of any integer 2 ≤ ai ≤ a. For example, if a = 3 and b = 30, I want to know how many integers in the range of 1-30 are a multiple of 2 or 3; there are 20 such integers: 2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20, 21, 22, 24, 26, 27, 28, 30.

I have a C program that does this. I'm trying to get this translated into Haskell, but part of the difficulty is getting around the loops that I've used since Haskell doesn't use loops. I appreciate any and all help in translating this!

My C program for reference (sorry if formatting is off):

#define PRIME_RANGE 130
#define PRIME_CNT   32

#define UPPER_LIMIT (1000000000000000ull) //10^15
#define MAX_BASE_MULTIPLES_COUNT  25000000

typedef struct
{
    char primeFactorFlag;
    long long multiple;
}multipleInfo;

unsigned char primeFlag[PRIME_RANGE + 1];
int primes[PRIME_CNT];
int primeCnt = 0;

int maxPrimeStart[PRIME_CNT];
multipleInfo baseMultiples[MAX_BASE_MULTIPLES_COUNT];
multipleInfo mergedMultiples[MAX_BASE_MULTIPLES_COUNT];
int baseMultiplesCount, mergedMultiplesCount;

void findOddMultiples(int a, long long b, long long *count);
void generateBaseMultiples(void);
void mergeLists(multipleInfo listSource[], int countS, multipleInfo    

listDest[], int *countD);
void sieve(void);

int main(void)
{

    int i, j, a, n, startInd, endInd;
    long long b, multiples;

    //Generate primes
    sieve();
    primes[primeCnt] = PRIME_RANGE + 1;

    generateBaseMultiples();
    baseMultiples[baseMultiplesCount].multiple = UPPER_LIMIT + 1;

    //Input and Output
    scanf("%d", &n);
    for(i = 1; i <= n; i++)
    {
        scanf("%d%lld", &a, &b);

        //If b <= a, all are multiple except 1
        if(b <= a)
            printf("%lld\n",b-1);
        else
        {
            //Add all even multiples
            multiples = b / 2;
            //Add all odd multiples
            findOddMultiples(a, b, &multiples);-

            printf("%lld\n", multiples);
        }
    }
    return 0;
}

void findOddMultiples(int a, long long b, long long *count)
{
    int i, k;
    long long currentNum;

    for(k = 1; k < primeCnt && primes[k] <= a; k++)
    {
        for(i = maxPrimeStart[k]; i < maxPrimeStart[k + 1] &&         

baseMultiples[i].multiple <= b; i++)
        {
            currentNum = b/baseMultiples[i].multiple;
            currentNum = (currentNum  + 1) >> 1; // remove even multiples
            if(baseMultiples[i].primeFactorFlag) //odd number of factors
                (*count) += currentNum;
            else
                (*count) -= currentNum;
        }
    }
}

void addTheMultiple(long long value, int primeFactorFlag)
{
    baseMultiples[baseMultiplesCount].multiple = value;
    baseMultiples[baseMultiplesCount].primeFactorFlag = primeFactorFlag;
    baseMultiplesCount++;
}

void generateBaseMultiples(void)
{
    int i, j, t, prevCount;
    long long curValue;

    addTheMultiple(3, 1);
     mergedMultiples[0] = baseMultiples[0];
    mergedMultiplesCount = 1;

    maxPrimeStart[1] = 0;
    prevCount = mergedMultiplesCount;
    for(i = 2; i < primeCnt; i++)
    {
        maxPrimeStart[i] = baseMultiplesCount;
        addTheMultiple(primes[i], 1);
        for(j = 0; j < prevCount; j++)
        {
            curValue = mergedMultiples[j].multiple * primes[i];
            if(curValue > UPPER_LIMIT)
                break;

            addTheMultiple(curValue, 1 - mergedMultiples[j].primeFactorFlag);
        }
        if(i < primeCnt - 1)
            mergeLists(&baseMultiples[prevCount], baseMultiplesCount - prevCount, mergedMultiples, &mergedMultiplesCount);

        prevCount = mergedMultiplesCount;
    }
    maxPrimeStart[primeCnt] = baseMultiplesCount;
}

void mergeLists(multipleInfo listSource[], int countS, multipleInfo listDest[], int *countD)
{
int limit = countS + *countD;
int i1, i2, j, k;

//Copy one list in unused safe memory
for(j = limit - 1, k = *countD - 1; k >= 0; j--, k--)
    listDest[j] = listDest[k];

//Merge the lists
for(i1 = 0, i2 = countS, k = 0; i1 < countS && i2 < limit; k++)
{

    if(listSource[i1].multiple <= listDest[i2].multiple)
        listDest[k] = listSource[i1++];
    else
        listDest[k] = listDest[i2++];
}

while(i1 < countS)
    listDest[k++] = listSource[i1++];


while(i2 < limit)
    listDest[k++] = listDest[i2++];

*countD = k;
}

void sieve(void)
{
   int i, j, root = sqrt(PRIME_RANGE);
    primes[primeCnt++] = 2;
    for(i = 3; i <= PRIME_RANGE; i+= 2)
    {
        if(!primeFlag[i])
        {
            primes[primeCnt++] = i;
            if(root >= i)
            {
                for(j = i * i; j <= PRIME_RANGE; j += i << 1)
                primeFlag[j] = 1;
        }
    }
}
}

As a starter: for loops in C are maps in functional programming, and while loops are recursion. I suspect your C does not need to be "translated" at all, rather you need to look for a functional solution in the first place. — user8174234
– user8174234, Commented Apr 12, 2018 at 2:49
As much as I understand you might need help in learning Haskell, StackOverflow is not a code-writing service. This question would be better if you focused on loops specifically, or gave an attempt. — AJF
– AJF, Commented Apr 12, 2018 at 18:47

chessai · Accepted Answer · 2018-04-12 18:28:08Z

2

First, unless I'm grossly misunderstanding, the number of multiples you have there is wrong. The number of multiples of 2 between 1 and 30 is 15, and the number of multiples of 3 between 1 and 30 is 10, so there should be 25 numbers there.

EDIT: I did misunderstand; you want unique multiples.

To get unique multiples, you can use Data.Set, which has the invariant that the elements of the Set are unique and ordered ascendingly.

If you know you aren't going to exceed x = maxBound :: Int, you can get even better speedups using Data.IntSet. I've also included some test cases and annotated with comments what they run at on my machine.

{-# LANGUAGE BangPatterns #-}

{-# OPTIONS_GHC -O2 #-}

module Main (main) where

import System.CPUTime (getCPUTime)
import Data.IntSet (IntSet)
import qualified Data.IntSet as IntSet

main :: IO ()
main = do
  test 3 30       -- 0.12 ms
  test 131 132    -- 0.14 ms
  test 500 300000 -- 117.63 ms

test :: Int -> Int -> IO ()
test !a !b = do
  start <- getCPUTime
  print (numMultiples a b)
  end   <- getCPUTime
  print $ "Needed " ++ show ((fromIntegral (end - start)) / 10^9) ++ " ms.\n"

numMultiples :: Int -> Int -> Int
numMultiples !a !b = IntSet.size (foldMap go [2..a])
  where
    go :: Int -> IntSet 
    go !x = IntSet.fromAscList [x, x+x .. b]

edited Apr 12, 2018 at 18:28

answered Apr 12, 2018 at 0:24

chessai

1366 bronze badges

Sign up to request clarification or add additional context in comments.

8 Comments

detroitfriedchicken Over a year ago

The amount of multiples between numbers should not duplicate. So, since 6 is a multiple of both 2 and 3, 6 should only be recorded once. I think when I tested your code, it duplicated values that should have only been recorded once.

chessai Over a year ago

i edited the post to reflect this

Daniel Wagner Over a year ago

For [y | y <- [1..b], y `mod` x == 0], consider [x,x+x .. b] instead. You might also find data-ordlist interesting, as it doesn't require converting to Set and could potentially give you the infinite list of multiples.

chessai Over a year ago

I switched to using [x, x+x .. b], it resulted in a relatively large speedup. Also, I think it's worth it to use IntSet here.

Daniel Wagner Over a year ago

Another thought: it is probably cheaper to track the numbers which are not multiples, using set difference at each step of the fold (and starting with the set of odd numbers). This way both the size of the set being folded in decreases and the size of the accumulator decreases; by comparison here the size of the set being folded in decreases as iteration goes on, but the size of the set of all multiples is increasing as iteration continues.

|

Daniel Wagner · Accepted Answer · 2018-04-12 06:26:51Z

2

I'm not really into understanding your C, so I implemented a solution afresh using the algorithm discussed here. The N in the linked algorithm is the product of the primes up to a in your problem description.

So first we'll need a list of primes. There's a standardish trick for getting a list of primes that is at once very idiomatic and relatively efficient:

primes :: [Integer]
primes = 2:filter isPrime [3..]

-- Doesn't work right for n<2, but we never call it there, so who cares?
isPrime :: Integer -> Bool
isPrime n = go primes n where
    go (p:ps) n | p*p>n = True
                | otherwise = n `rem` p /= 0 && go ps n

Next up: we want a way to iterate over the positive square-free divisors of N. This can be achieved by iterating over the subsets of the primes less than a. There's a standard idiomatic way to get a powerset, namely:

-- import Control.Monad
-- powerSet :: [a] -> [[a]]
-- powerSet = filterM (const [False, True])

That would be a fine component to use, but since at the end of the day we only care about the product of each powerset element and the value of the Mobius function of that product, we would end up duplicating a lot of multiplications and counting problems. It's cheaper to compute those two things directly while producing the powerset. So:

-- Given the prime factorization of a square-free number, produce a list of
-- its divisors d together with mu(d).
divisorsWithMu :: Num a => [a] -> [(a, a)]
divisorsWithMu [] = [(1, 1)]
divisorsWithMu (p:ps) = rec ++ [(p*d, -mu) | (d, mu) <- rec] where
    rec = divisorsWithMu ps

With that in hand, we can just iterate and do a little arithmetic.

f :: Integer -> Integer -> Integer
f a b = b - sum
    [ mu * (b `div` d)
    | (d, mu) <- divisorsWithMu (takeWhile (<=a) primes)
    ]

And that's all the code. Crunched 137 lines of C down to 15 lines of Haskell -- not bad! Try it out in ghci:

> f 3 30
20

As an additional optimization, one could consider modifying divisorsWithMu to short-circuit when its divisor is bigger than b, as we know such terms will not contribute to the final sum. This makes a noticeable difference for large a, as without it there are exponentially many elements in the powerset. Here's how that modification looks:

-- Given an upper bound and the prime factorization of a square-free number,
-- produce a list of its divisors d that are no larger than the upper bound
-- together with mu(d).
divisorsWithMuUnder :: (Ord a, Num a) => a -> [a] -> [(a, a)]
divisorsWithMuUnder n [] = [(1, 1)]
divisorsWithMuUnder n (p:ps) = rec ++ [(p*d, -mu) | (d, mu) <- rec, p*d<=n]
    where rec = divisorsWithMuUnder n ps

f' :: Integer -> Integer -> Integer
f' a b = b - sum
    [ mu * (b `div` d)
    | (d, mu) <- divisorsWithMuUnder b (takeWhile (<=a) primes)
    ]

Not much more complicated; the only really interesting difference is that there's now a condition in the list comprehension. Here's an example of f' finishing quickly for inputs that would take infeasibly long with f:

> f' 100 100000
88169

edited Apr 12, 2018 at 6:26

answered Apr 12, 2018 at 5:25

Daniel Wagner

156k10 gold badges231 silver badges392 bronze badges

4 Comments

Daniel Wagner Over a year ago

Some additional commentary: even without compiling, this appears to be quite a bit faster than the C code, less memory-hungry, and less wrong; e.g. f' 131 132 takes 0.2s and 130MB to print 131, whereas the C code compiled with optimizations takes 0.6s and 650MB to print 130. Compiling the Haskell code brings the time under measurement noise and memory usage to under 4MB.

chessai Over a year ago

I like this answer a lot. Do you happen to know what the computational complexity is?

Daniel Wagner Over a year ago

@chessai The main cost is in enumerating divisors. Taking arithmetic to be constant-time for the moment, this can in general take 2^pi(a) time for sufficiently large b, where pi is the prime-counting function. In more familiar terms, a/log a is a pretty decent approximation to pi(a), so we land just shy of doubly-exponential in the size of the first argument (and the second argument makes no difference asymptotically).

chessai Over a year ago

This is super cool. I want to optimise this and save both solutions somewhere.

Will Ness · Accepted Answer · 2018-04-12 19:14:07Z

1

With data-ordlist package mentioned by Daniel Wagner in the comments, it is just

f a b = length $ unionAll [ [p,p+p..b] | p <- takeWhile (<= a) primes]

That is all. Some timings, for non-compiled code run inside GHCi:

~> f 100 (10^5)
88169
(0.05 secs, 48855072 bytes)

~> f 131 (3*10^6)
2659571
(0.55 secs, 1493586480 bytes)

~> f 131 132
131
(0.00 secs, 0 bytes)

~> f 500 300000
274055
(0.11 secs, 192704760 bytes)

Compiling will surely make the memory consumption a non-issue, by converting the length to a counting loop.

edited Apr 12, 2018 at 19:14

answered Apr 12, 2018 at 18:44

Will Ness

71.6k10 gold badges105 silver badges192 bronze badges

Comments

ack · Accepted Answer · 2018-04-12 14:06:53Z

0

You'll have to use recursion in place of loops.

In (most) procedural or object-orientated languages, you should hardly ever (never?) be using recursion. It is horribly inefficient, as a new stack frame must be created each time the recursive function is called.

However, in a functional language, like Haskell, the compiler is often able to optimize the recursion away into a loop, which makes it much faster then its procedural counterparts.

I've converted your sieve function into a set of recursive functions in C. I'll leave it to you to convert it into Haskell:

int main(void) {
    //...
    int root = sqrt(PRIME_RANGE);
    primes[primeCnt++] = 2;

    sieve(3, PRIME_RANGE, root);
    //...
}

void sieve(int i, int end, int root) {
    if(i > end) {
        return;
    }

    if(!primeFlag[i]) {
        primes[primeCnt++] = i;

        if(root >= i) {
            markMultiples(i * i, PRIME_RANGE, i);
        }
    }

    i += 2;
    sieve(i, end, root);
}

void markMultiples(int j, int end, int prime) {
    if(j > end) {
        return;
    }

    primeFlag[j] = 1;
    j += i << 1;

    markMultiples(j, end, prime);
}

The point of recursion is that the same function is called repeatedly, until a condition is met. The results of one recursive call are passed onto the next call, until the condition is met.

Also, why are you bit-fiddling instead of just multiplying or dividing by 2? Any half-decent compiler these days can convert most multiplications and divisions by 2 into a bit-shift.

edited Apr 12, 2018 at 14:06

answered Apr 12, 2018 at 0:06

ack

1,2511 gold badge19 silver badges36 bronze badges

1 Comment

leftaroundabout Over a year ago

“in a functional language, like Haskell, the compiler is able to optimize the recursion away into a loop” – that's a little bit misleading. Yes, often the compiler can do this, but there's no guarantee that any given loop translated somehow into recursion will get transformed back to a loop efficiently. Also, a recursive “loop” often is rather harder to read than a straightout imperative loop. Usually, the best replacement for loops are higher-order functions such as traversals and strict folds.

Collectives™ on Stack Overflow

Calculating multiples in Haskell (conversion from C)? [closed]

4 Answers 4

8 Comments

4 Comments

Comments

1 Comment

Linked

Hot Network Questions

Collectives™ on Stack Overflow

4 Answers 4

8 Comments

4 Comments

Comments

1 Comment

Linked

Related