I'm wrapping the FM-SBLEX Saldo program in a Java library.

Saldo is written in Haskell and does lookups in a lexicon for lines given on stdin, e.g.

    echo "ord" | ./sblex/bin/saldo dicts/saldo.dict

prints something like the following to stdout:

{"ord":{"s_1":{"word":"ord","head":"sanna mina ord","pos":"abm","param":"invar 1:3-3","inhs":[],"id":"sanna_mina_ord..abm.1","p":"abm_i_till_exempel","attr":"3"},...

If I run it with

    ./sblex/bin/saldo dicts/saldo.dict

it does a lookup for each line I enter at the console until I send EOF.

In my Java library, I start it with ProcessBuilder and set up two threads: one dumps the process's stdout and stderr to my program's stdout, and the other writes a word and a newline to the process's stdin, then flushes outputStream.

On the console, saldo returns results each time I press return, but in my wrapper it returns the results for all my input only once I close outputStream (note that the .close() call is commented out in the Writer code below).

    ProcessBuilder pb = new ProcessBuilder(binPath, dictPath);

    pb.redirectErrorStream(true);
    saldoProcess = pb.start();

    new Thread(new Reader(saldoProcess.getInputStream())).start();
    new Thread(new Writer(saldoProcess.getOutputStream())).start();

    saldoProcess.waitFor();
    System.out.println("saldo exited.");
    Thread.sleep(2000);

Writer's run override:

    public void run() {
        try {
            outputStream.write("ord\n".getBytes());
            outputStream.flush();
            //outputStream.close();
        } catch (IOException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }

The Haskell code that reads the input:

    run' :: Language a => a -> (String -> [Tok]) -> (String -> [[String]]) -> AnaType -> Stats -> IO Stats
    run' l tokenizer f a st =
     do b <- hIsEOF stdin
        if b then return st
         else do
           s <- hGetLine stdin
           analyze l a f (tokenizer s) st >>= run' l tokenizer f a

If binPath="cat" and dictPath="-", my Java program prints the input back after each flush. Any idea why this Haskell program only deals with the input after I close the outputStream?
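The cat experiment can be reproduced with a sketch like the following. It is not the code from the question: the class name EchoRoundTrip and the method roundTrip are hypothetical, and it assumes a Unix-like system with cat on the PATH. It writes one line, flushes, and reads one line back while the child's stdin is still open.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original wrapper.
public class EchoRoundTrip {

    /** Sends one line to the child process and returns the first line it prints back. */
    static String roundTrip(String binPath, String dictPath, String word) throws IOException {
        ProcessBuilder pb = new ProcessBuilder(binPath, dictPath);
        pb.redirectErrorStream(true); // merge stderr into stdout, as in the question
        Process p = pb.start();
        try (BufferedWriter out = new BufferedWriter(
                     new OutputStreamWriter(p.getOutputStream(), StandardCharsets.UTF_8));
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            out.write(word);
            out.write('\n');
            out.flush();          // the child's stdin is still open at this point
            return in.readLine(); // returns only once the child has emitted a full line
        } finally {
            p.destroy();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumes "cat" is available; "-" makes it read from stdin.
        System.out.println(roundTrip("cat", "-", "ord"));
    }
}
```

With cat the line comes back right after the flush, which suggests the Java side is delivering the input. Pointed at a child that block-buffers its output when writing to a pipe, the same readLine call would be expected to block, because nothing reaches the pipe until the child's buffer fills or its stdin is closed.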

NB: as the answer shows, it was not hGetLine that failed to return (as I assumed); rather, the output was buffered, because the Haskell implementation I'm using defaults to block buffering when it's not run from the console.

1 Answer

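Your Haskell program is probably buffering its output. Stdout is typically line-buffered when it is attached to a terminal and block-buffered when it is attached to anything else, so through a pipe nothing appears until the buffer fills or the program exits.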

Try adding

    hSetBuffering stdout LineBuffering

near the start of the program.

More about buffering in Haskell.

(Edited in response to Daniel Wagner's comment.)

4 Comments

If one line of input corresponds to at least one line of output, then LineBuffering is probably more appropriate.
Fixing it in the code works, but do you know a way to make the program think it's being run from the console (instead of changing the third-party code)? I've seen someone suggest running it under script /dev/null, but I wonder if Java supports a more platform-independent way of doing this?
I don't know about a Java way of doing it, but if you're running on Linux / OSX / other Unix, and you have the script program installed, you could get Java to run script -c "./sblex/bin/saldo dicts/saldo.dict" /dev/null instead of ./sblex/bin/saldo dicts/saldo.dict.
hFlush is generally preferred over no buffering.
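
The script workaround suggested in the comments could be wired into ProcessBuilder roughly as follows. This is an untested sketch: the class ScriptWrapper and the method scriptCommand are hypothetical names, it assumes the util-linux version of script (the -c form quoted above; the BSD/macOS version takes its arguments differently), and it adds -q, which that version accepts to suppress the "Script started" and "Script done" messages.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the original wrapper.
public class ScriptWrapper {

    /** Builds the argument list for: script -q -c "<command line>" /dev/null */
    static List<String> scriptCommand(String... command) {
        List<String> args = new ArrayList<>();
        args.add("script");
        args.add("-q");                      // quiet: no start/done banner in the output
        args.add("-c");
        // Naive join: assumes no argument contains spaces or shell metacharacters.
        args.add(String.join(" ", command));
        args.add("/dev/null");               // discard the typescript file
        return args;
    }

    public static void main(String[] args) {
        List<String> cmd = scriptCommand("./sblex/bin/saldo", "dicts/saldo.dict");
        System.out.println(cmd);
        // To launch it, pass the list to ProcessBuilder instead of binPath and dictPath:
        // new ProcessBuilder(cmd).redirectErrorStream(true).start();
    }
}
```

Because script runs the command on a pseudo-terminal, the child sees a terminal on stdout and should fall back to line buffering. The pseudo-terminal may also echo the input lines back into the output, so the reader thread might need to filter them out.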
