I'm running some experiments with multithreading.

When the program reaches the output part (which uses java.io.FileWriter), sometimes it finishes quickly, but sometimes it just gets stuck there.

Is this a FileWriter problem? Here is the simplified code:

import java.io.File;
import java.io.FileWriter;
import java.io.IOException;

public class Test extends Thread {
    private int _id;

    public Test(int id) {
        _id = id;
    }

    @Override
    public void run() {
        long start = System.currentTimeMillis();

        for (int i = 0; i < 10000; i++) {
            try (FileWriter fw = new FileWriter(new File(_id + ".txt"))) {
                fw.write("hello!");
            } catch (IOException e) {
                System.err.println("Something wrong.");
            }
        }

        System.out.println(_id + ": " + (System.currentTimeMillis() - start));
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10; i++) {
            new Test(i).start();
        }
    }
}

And here is my result:

7: 3820
9: 3878
2: 3965
8: 3956
0: 4058
6: 4097
5: 4111
3: 4259
1: 4354
4: 4369
9: 4703
7: 4748
5: 4891
2: 4994
4: 5065
3: 5672
1: 5804
0: 5805
8: 5925
6: 6042
1: 4495
9: 5265
6: 5551
2: 5651
5: 5676
8: 5697
3: 5917
0: 6001
7: 6002
4: 6314

I ran it three times (the output above shows all three runs back to back). Why are the elapsed times different? Is it the FileWriter's problem or the file system's?

2 Answers

You have at least two problems:

  • you make far too many syscalls; on every loop iteration, each thread does an open(), a write() (maybe a flush()), and finally a close(). With 10 threads and 10,000 iterations each, that is at least 300k syscalls!
  • you create 100k FileWriter objects and 100k File objects; the GC needs to handle all of them, and since the GC runs in its own thread and is scheduled like any other thread, it will run more or less often from one run to the next.

The problem is therefore more with your program than anything OS-related... The JIT can't do anything for you here.

Also, since you use Java 7, you should consider using Files.newBufferedWriter() -- only once per thread, of course, not 10000 times!
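
A minimal sketch of that suggestion, assuming it is acceptable for each thread to put all of its records into one file (which may not hold if every transaction really needs its own file). The class name SingleWriterTest, the writeAll helper, and the one-record-per-line layout are illustrative choices, not part of the original code:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class SingleWriterTest extends Thread {
    private final Path target;
    private final int count;

    public SingleWriterTest(Path target, int count) {
        this.target = target;
        this.count = count;
    }

    // Opens the file once, writes `count` lines through a buffer, then closes it once.
    public static void writeAll(Path target, int count) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            for (int i = 0; i < count; i++) {
                writer.write("hello!");
                writer.newLine();
            }
        }
    }

    @Override
    public void run() {
        long start = System.currentTimeMillis();
        try {
            writeAll(target, count);
        } catch (IOException e) {
            System.err.println("Write failed for " + target + ": " + e.getMessage());
        }
        System.out.println(target + ": " + (System.currentTimeMillis() - start));
    }

    public static void main(String[] args) {
        // Same shape as the question: 10 threads, 10000 records each.
        for (int i = 0; i < 10; i++) {
            new SingleWriterTest(Paths.get(i + ".txt"), 10000).start();
        }
    }
}
```

Here each thread pays for one open and one close, and the buffer batches the small writes so far fewer of them reach the OS.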


Further note about the "syscall problem": at least on Unix systems (other OSes probably work the same way), each time you make a syscall, your process has to enter kernel mode for the duration of the call; this is not free. Even if the cost is not that large on modern systems, it is still significantly higher than not having to go user->kernel->user at all.
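
One rough way to see that cost on your own machine is to time the two patterns side by side. This is only a sketch: the class and method names (SyscallCost, reopenEachTime, openOnce) are made up for illustration, it is not a rigorous benchmark (no warm-up, single run), and the numbers will vary with the OS, file system, and disk:

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SyscallCost {
    // Opens, writes, and closes the file on every iteration, like the loop in the question.
    // Each open truncates the file, so it ends up holding a single "hello!".
    public static long reopenEachTime(Path file, int count) throws IOException {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            try (Writer w = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
                w.write("hello!");
            }
        }
        return System.nanoTime() - start;
    }

    // Opens the file once and lets the buffer batch the writes before they reach the OS.
    public static long openOnce(Path file, int count) throws IOException {
        long start = System.nanoTime();
        try (Writer w = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (int i = 0; i < count; i++) {
                w.write("hello!");
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws IOException {
        Path a = Files.createTempFile("reopen", ".txt");
        Path b = Files.createTempFile("once", ".txt");
        int count = 10000;
        System.out.println("reopen each time: " + reopenEachTime(a, count) / 1_000_000 + " ms");
        System.out.println("open once:        " + openOnce(b, count) / 1_000_000 + " ms");
        Files.delete(a);
        Files.delete(b);
    }
}
```

Note that the two methods do not produce the same file contents, so this only compares the cost of the I/O pattern, not two equivalent programs.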


Well, OK, I lied a little; the JIT does kick in, but it will only optimize the user side of things. By default the JIT starts to optimize hot code after roughly 10k invocations or loop iterations (here, the loop in your run()), and it optimizes more as time passes.

7 Comments

So the problem is too many syscalls, right? But I have to create a new file and write some text every time (each file represents one transaction).
@Snowbird If you have a really bad design you should expect it to perform poorly. ;) I suggest getting an SSD which will at least handle much higher IOPS. e.g. 500x more IOPS.
@Snowbird actually, it's not that bad: you are getting around 20K writes per second, which is pretty good considering you are creating new files like this. If you wanted, say, over a million transactions per second, you would need a different approach.
@Snowbird are your records fixed size?
@Snowbird but do they have a maximum size?

why are the elapsed times different? Is it the FileWriter's problem or the file system's?

Opening and closing files is very expensive. Your bottleneck is likely to be in your OS or your HDD, i.e. the code doesn't scale well. Most likely you have fewer than 10 CPUs, so only so many threads can run at once. When you have code that doesn't scale well and an overloaded system, you get wildly varying performance results.

The problem is in your program trying to overload your system. Using more CPUs gives you more processing power and you are using it to hammer your OS from 10 different sides.
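
One way to stop hammering the OS from more sides than you have CPUs is to run the same jobs on a bounded thread pool. This is a sketch under assumptions, not a tuned solution: the class name BoundedWriters and the runAll helper are hypothetical, and capping the pool at the CPU count is just a starting point, since the best size for I/O-bound work has to be measured:

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BoundedWriters {
    // Runs `tasks` write jobs inside `dir` on a pool no larger than the CPU count.
    // Returns how many jobs finished without an IOException.
    public static int runAll(Path dir, int tasks, int writesPerTask) throws InterruptedException {
        int cpus = Runtime.getRuntime().availableProcessors();
        int threads = Math.max(1, Math.min(tasks, cpus)); // assumption: one worker per CPU
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger completed = new AtomicInteger();

        for (int id = 0; id < tasks; id++) {
            File file = dir.resolve(id + ".txt").toFile();
            pool.execute(() -> {
                // Same per-iteration open/write/close pattern as in the question.
                for (int i = 0; i < writesPerTask; i++) {
                    try (FileWriter fw = new FileWriter(file)) {
                        fw.write("hello!");
                    } catch (IOException e) {
                        System.err.println("Write failed: " + e.getMessage());
                        return;
                    }
                }
                completed.incrementAndGet();
            });
        }

        pool.shutdown();                             // accept no new jobs
        pool.awaitTermination(1, TimeUnit.MINUTES);  // wait for queued jobs to finish
        return completed.get();
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("bounded-writers");
        long start = System.currentTimeMillis();
        int done = runAll(dir, 10, 10000);
        System.out.println(done + " jobs in " + (System.currentTimeMillis() - start) + " ms");
    }
}
```

The per-file cost is unchanged here; the pool only limits how many threads compete for the disk and the CPUs at once, which may make the timings less erratic.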
