0

I'm trying to write a Haskell program that calculates multiples. Basically, when given two integers a and b, I want to find how many integers 1 ≤ bi ≤ b are multiple of any integer 2 ≤ ai ≤ a. For example, if a = 3 and b = 30, I want to know how many integers in the range of 1-30 are a multiple of 2 or 3; there are 20 such integers: 2, 3, 4, 6, 8, 9, 10, 12, 14, 15, 16, 18, 20, 21, 22, 24, 26, 27, 28, 30.

I have a C program that does this. I'm trying to get this translated into Haskell, but part of the difficulty is getting around the loops that I've used since Haskell doesn't use loops. I appreciate any and all help in translating this!

My C program for reference (sorry if formatting is off):

#define PRIME_RANGE 130
#define PRIME_CNT   32

#define UPPER_LIMIT (1000000000000000ull) //10^15
#define MAX_BASE_MULTIPLES_COUNT  25000000

typedef struct
{
    char primeFactorFlag;
    long long multiple;
}multipleInfo;

unsigned char primeFlag[PRIME_RANGE + 1];
int primes[PRIME_CNT];
int primeCnt = 0;

int maxPrimeStart[PRIME_CNT];
multipleInfo baseMultiples[MAX_BASE_MULTIPLES_COUNT];
multipleInfo mergedMultiples[MAX_BASE_MULTIPLES_COUNT];
int baseMultiplesCount, mergedMultiplesCount;

void findOddMultiples(int a, long long b, long long *count);
void generateBaseMultiples(void);
void mergeLists(multipleInfo listSource[], int countS, multipleInfo    

listDest[], int *countD);
void sieve(void);

int main(void)
{

    int i, j, a, n, startInd, endInd;
    long long b, multiples;

    //Generate primes
    sieve();
    primes[primeCnt] = PRIME_RANGE + 1;

    generateBaseMultiples();
    baseMultiples[baseMultiplesCount].multiple = UPPER_LIMIT + 1;

    //Input and Output
    scanf("%d", &n);
    for(i = 1; i <= n; i++)
    {
        scanf("%d%lld", &a, &b);

        //If b <= a, all are multiple except 1
        if(b <= a)
            printf("%lld\n",b-1);
        else
        {
            //Add all even multiples
            multiples = b / 2;
            //Add all odd multiples
            findOddMultiples(a, b, &multiples);-

            printf("%lld\n", multiples);
        }
    }
    return 0;
}

void findOddMultiples(int a, long long b, long long *count)
{
    int i, k;
    long long currentNum;

    for(k = 1; k < primeCnt && primes[k] <= a; k++)
    {
        for(i = maxPrimeStart[k]; i < maxPrimeStart[k + 1] &&         

baseMultiples[i].multiple <= b; i++)
        {
            currentNum = b/baseMultiples[i].multiple;
            currentNum = (currentNum  + 1) >> 1; // remove even multiples
            if(baseMultiples[i].primeFactorFlag) //odd number of factors
                (*count) += currentNum;
            else
                (*count) -= currentNum;
        }
    }
}

void addTheMultiple(long long value, int primeFactorFlag)
{
    baseMultiples[baseMultiplesCount].multiple = value;
    baseMultiples[baseMultiplesCount].primeFactorFlag = primeFactorFlag;
    baseMultiplesCount++;
}

void generateBaseMultiples(void)
{
    int i, j, t, prevCount;
    long long curValue;

    addTheMultiple(3, 1);
     mergedMultiples[0] = baseMultiples[0];
    mergedMultiplesCount = 1;

    maxPrimeStart[1] = 0;
    prevCount = mergedMultiplesCount;
    for(i = 2; i < primeCnt; i++)
    {
        maxPrimeStart[i] = baseMultiplesCount;
        addTheMultiple(primes[i], 1);
        for(j = 0; j < prevCount; j++)
        {
            curValue = mergedMultiples[j].multiple * primes[i];
            if(curValue > UPPER_LIMIT)
                break;

            addTheMultiple(curValue, 1 - mergedMultiples[j].primeFactorFlag);
        }
        if(i < primeCnt - 1)
            mergeLists(&baseMultiples[prevCount], baseMultiplesCount - prevCount, mergedMultiples, &mergedMultiplesCount);

        prevCount = mergedMultiplesCount;
    }
    maxPrimeStart[primeCnt] = baseMultiplesCount;
}

void mergeLists(multipleInfo listSource[], int countS, multipleInfo listDest[], int *countD)
{
int limit = countS + *countD;
int i1, i2, j, k;

//Copy one list in unused safe memory
for(j = limit - 1, k = *countD - 1; k >= 0; j--, k--)
    listDest[j] = listDest[k];

//Merge the lists
for(i1 = 0, i2 = countS, k = 0; i1 < countS && i2 < limit; k++)
{

    if(listSource[i1].multiple <= listDest[i2].multiple)
        listDest[k] = listSource[i1++];
    else
        listDest[k] = listDest[i2++];
}

while(i1 < countS)
    listDest[k++] = listSource[i1++];


while(i2 < limit)
    listDest[k++] = listDest[i2++];

*countD = k;
}

void sieve(void)
{
   int i, j, root = sqrt(PRIME_RANGE);
    primes[primeCnt++] = 2;
    for(i = 3; i <= PRIME_RANGE; i+= 2)
    {
        if(!primeFlag[i])
        {
            primes[primeCnt++] = i;
            if(root >= i)
            {
                for(j = i * i; j <= PRIME_RANGE; j += i << 1)
                primeFlag[j] = 1;
        }
    }
}
}
2
  • As a starter: for loops in C are maps in functional programming, and while loops are recursion. I suspect your C does not need to be "translated" at all, rather you need to look for a functional solution in the first place. Commented Apr 12, 2018 at 2:49
  • As much as I understand you might need help in learning Haskell, StackOverflow is not a code-writing service. This question would be better if you focused on loops specifically, or gave an attempt. Commented Apr 12, 2018 at 18:47

4 Answers 4

2

First, unless I'm grossly misunderstanding, the number of multiples you have there is wrong. The number of multiples of 2 between 1 and 30 is 15, and the number of multiples of 3 between 1 and 30 is 10, so there should be 25 numbers there.

EDIT: I did misunderstand; you want unique multiples.

To get unique multiples, you can use Data.Set, which has the invariant that the elements of the Set are unique and ordered ascendingly.

If you know you aren't going to exceed x = maxBound :: Int, you can get even better speedups using Data.IntSet. I've also included some test cases and annotated with comments what they run at on my machine.

{-# LANGUAGE BangPatterns #-}

{-# OPTIONS_GHC -O2 #-}

module Main (main) where

import System.CPUTime (getCPUTime)
import Data.IntSet (IntSet)
import qualified Data.IntSet as IntSet

main :: IO ()
main = do
  test 3 30       -- 0.12 ms
  test 131 132    -- 0.14 ms
  test 500 300000 -- 117.63 ms

test :: Int -> Int -> IO ()
test !a !b = do
  start <- getCPUTime
  print (numMultiples a b)
  end   <- getCPUTime
  print $ "Needed " ++ show ((fromIntegral (end - start)) / 10^9) ++ " ms.\n"

numMultiples :: Int -> Int -> Int
numMultiples !a !b = IntSet.size (foldMap go [2..a])
  where
    go :: Int -> IntSet 
    go !x = IntSet.fromAscList [x, x+x .. b]
Sign up to request clarification or add additional context in comments.

8 Comments

The amount of multiples between numbers should not duplicate. So, since 6 is a multiple of both 2 and 3, 6 should only be recorded once. I think when I tested your code, it duplicated values that should have only been recorded once.
i edited the post to reflect this
For [y | y <- [1..b], y `mod` x == 0], consider [x,x+x .. b] instead. You might also find data-ordlist interesting, as it doesn't require converting to Set and could potentially give you the infinite list of multiples.
I switched to using [x, x+x .. b], it resulted in a relatively large speedup. Also, I think it's worth it to use IntSet here.
Another thought: it is probably cheaper to track the numbers which are not multiples, using set difference at each step of the fold (and starting with the set of odd numbers). This way both the size of the set being folded in decreases and the size of the accumulator decreases; by comparison here the size of the set being folded in decreases as iteration goes on, but the size of the set of all multiples is increasing as iteration continues.
|
2

I'm not really into understanding your C, so I implemented a solution afresh using the algorithm discussed here. The N in the linked algorithm is the product of the primes up to a in your problem description.

So first we'll need a list of primes. There's a standardish trick for getting a list of primes that is at once very idiomatic and relatively efficient:

primes :: [Integer]
primes = 2:filter isPrime [3..]

-- Doesn't work right for n<2, but we never call it there, so who cares?
isPrime :: Integer -> Bool
isPrime n = go primes n where
    go (p:ps) n | p*p>n = True
                | otherwise = n `rem` p /= 0 && go ps n

Next up: we want a way to iterate over the positive square-free divisors of N. This can be achieved by iterating over the subsets of the primes less than a. There's a standard idiomatic way to get a powerset, namely:

-- import Control.Monad
-- powerSet :: [a] -> [[a]]
-- powerSet = filterM (const [False, True])

That would be a fine component to use, but since at the end of the day we only care about the product of each powerset element and the value of the Mobius function of that product, we would end up duplicating a lot of multiplications and counting problems. It's cheaper to compute those two things directly while producing the powerset. So:

-- Given the prime factorization of a square-free number, produce a list of
-- its divisors d together with mu(d).
divisorsWithMu :: Num a => [a] -> [(a, a)]
divisorsWithMu [] = [(1, 1)]
divisorsWithMu (p:ps) = rec ++ [(p*d, -mu) | (d, mu) <- rec] where
    rec = divisorsWithMu ps

With that in hand, we can just iterate and do a little arithmetic.

f :: Integer -> Integer -> Integer
f a b = b - sum
    [ mu * (b `div` d)
    | (d, mu) <- divisorsWithMu (takeWhile (<=a) primes)
    ]

And that's all the code. Crunched 137 lines of C down to 15 lines of Haskell -- not bad! Try it out in ghci:

> f 3 30
20

As an additional optimization, one could consider modifying divisorsWithMu to short-circuit when its divisor is bigger than b, as we know such terms will not contribute to the final sum. This makes a noticeable difference for large a, as without it there are exponentially many elements in the powerset. Here's how that modification looks:

-- Given an upper bound and the prime factorization of a square-free number,
-- produce a list of its divisors d that are no larger than the upper bound
-- together with mu(d).
divisorsWithMuUnder :: (Ord a, Num a) => a -> [a] -> [(a, a)]
divisorsWithMuUnder n [] = [(1, 1)]
divisorsWithMuUnder n (p:ps) = rec ++ [(p*d, -mu) | (d, mu) <- rec, p*d<=n]
    where rec = divisorsWithMuUnder n ps

f' :: Integer -> Integer -> Integer
f' a b = b - sum
    [ mu * (b `div` d)
    | (d, mu) <- divisorsWithMuUnder b (takeWhile (<=a) primes)
    ]

Not much more complicated; the only really interesting difference is that there's now a condition in the list comprehension. Here's an example of f' finishing quickly for inputs that would take infeasibly long with f:

> f' 100 100000
88169

4 Comments

Some additional commentary: even without compiling, this appears to be quite a bit faster than the C code, less memory-hungry, and less wrong; e.g. f' 131 132 takes 0.2s and 130MB to print 131, whereas the C code compiled with optimizations takes 0.6s and 650MB to print 130. Compiling the Haskell code brings the time under measurement noise and memory usage to under 4MB.
I like this answer a lot. Do you happen to know what the computational complexity is?
@chessai The main cost is in enumerating divisors. Taking arithmetic to be constant-time for the moment, this can in general take 2^pi(a) time for sufficiently large b, where pi is the prime-counting function. In more familiar terms, a/log a is a pretty decent approximation to pi(a), so we land just shy of doubly-exponential in the size of the first argument (and the second argument makes no difference asymptotically).
This is super cool. I want to optimise this and save both solutions somewhere.
1

With data-ordlist package mentioned by Daniel Wagner in the comments, it is just

f a b = length $ unionAll [ [p,p+p..b] | p <- takeWhile (<= a) primes]

That is all. Some timings, for non-compiled code run inside GHCi:

~> f 100 (10^5)
88169
(0.05 secs, 48855072 bytes)

~> f 131 (3*10^6)
2659571
(0.55 secs, 1493586480 bytes)

~> f 131 132
131
(0.00 secs, 0 bytes)

~> f 500 300000
274055
(0.11 secs, 192704760 bytes)

Compiling will surely make the memory consumption a non-issue, by converting the length to a counting loop.

Comments

0

You'll have to use recursion in place of loops.

In (most) procedural or object-orientated languages, you should hardly ever (never?) be using recursion. It is horribly inefficient, as a new stack frame must be created each time the recursive function is called.

However, in a functional language, like Haskell, the compiler is often able to optimize the recursion away into a loop, which makes it much faster then its procedural counterparts.

I've converted your sieve function into a set of recursive functions in C. I'll leave it to you to convert it into Haskell:

int main(void) {
    //...
    int root = sqrt(PRIME_RANGE);
    primes[primeCnt++] = 2;

    sieve(3, PRIME_RANGE, root);
    //...
}

void sieve(int i, int end, int root) {
    if(i > end) {
        return;
    }

    if(!primeFlag[i]) {
        primes[primeCnt++] = i;

        if(root >= i) {
            markMultiples(i * i, PRIME_RANGE, i);
        }
    }

    i += 2;
    sieve(i, end, root);
}

void markMultiples(int j, int end, int prime) {
    if(j > end) {
        return;
    }

    primeFlag[j] = 1;
    j += i << 1;

    markMultiples(j, end, prime);
}

The point of recursion is that the same function is called repeatedly, until a condition is met. The results of one recursive call are passed onto the next call, until the condition is met.

Also, why are you bit-fiddling instead of just multiplying or dividing by 2? Any half-decent compiler these days can convert most multiplications and divisions by 2 into a bit-shift.

1 Comment

“in a functional language, like Haskell, the compiler is able to optimize the recursion away into a loop” – that's a little bit misleading. Yes, often the compiler can do this, but there's no guarantee that any given loop translated somehow into recursion will get transformed back to a loop efficiently. Also, a recursive “loop” often is rather harder to read than a straightout imperative loop. Usually, the best replacement for loops are higher-order functions such as traversals and strict folds.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.