Java software design - Looping, object creation VS modifying variables. Memory, performance & reliability comparison

Question

Let's say we are trying to build a document scanner class in java that takes 1 input argument, the log path(eg. C:\document\text1.txt). Which of the following implementations would you prefer based on performance/memory/modularity?

ArrayList<String> fileListArray = new ArrayList<String>();
fileListArray.add("C:\\document\\text1.txt");
fileListArray.add("C:\\document\\text2.txt");
.
.
.



//Implementation A
for(int i =0, j = fileListArray.size(); i < j; i++){
    MyDocumentScanner ds = new MyDocumentScanner(fileListArray.get(i));
    ds.scanDocument(); 
    ds.resultOutput();

}

//Implementation B
MyDocumentScanner ds = new MyDocumentScanner();
for(int i=0, j=fileListArray.size(); i < j; i++){
    ds.setDocPath(fileListArray.get(i));
    ds.scanDocument(); 
    ds.resultOutput(); 
}

Personally I would prefer A due to its encapsulation, but it seems like more memory usage due to creation of multiple instances. I'm curious if there is an answer to this, or it is another "that depends on the situation/circumstances" dilemma?

Seelenvirtuose · Accepted Answer · 2015-01-29 07:10:10Z

Although this is obviously opinion-based, I will try an answer to tell my opinion.

You approach A is far better. Your document scanner obviously handles a file. That should be set at construction time and be saved in an instance field. So every method can refer to this field. Moreover, the constructor can do some checks on the file reference (null check, existence, ...).

Your approach B has two very serious disadvantages:

After constructing a document scanner, clients could easily call all of the methods. If no file was set before, you must handle that "illegal state" with maybe an IllegalStateException. Thus, this approach increases code and complexity of that class.
There seems to be a series of method calls that a client should or can perform. It's easy to call the file setting method again in the middle of such a series with a completely other file, breaking the whole scan facility. To avoid this, your setter (for the file) should remember whether a file was already set. And that nearly automatically leads to approach A.

Regarding the creation of objects: Modern JVMs are really very fast at creating objects. Usually, there is no measurable performance overhead for that. The processing time (here: the scan) usually is much higher.

Eran · Accepted Answer · 2015-01-29 07:05:06Z

2

If you don't need multiple instances of DocumentScanner to co-exist, I see no point in creating a new instance in each iteration of the loop. It just creates work to the garbage collector, which has to free each of those instances.

If the length of the array is small, it doesn't make much difference which implementation you choose, but for large arrays, implementation B is more efficient, both in terms of memory (less instances created that the GC hasn't freed yet) and CPU (less work for the GC).

answered Jan 29, 2015 at 7:05

Eran

395k57 gold badges726 silver badges793 bronze badges

Comments

Philipp Wendler · Accepted Answer · 2015-01-29 07:12:07Z

Are you implementing DocumentScanner or using an existing class?

If the latter, and it was designed for being able to parse multiple documents in a row, you can just reuse the object as in variant B.

However, if you are designing DocumentScanner, I would recommend to design it such that it handles a single document and does not even have a setDocPath method. This leads to less mutable state in that class and thus makes its design much easier. Also using an instance of the class becomes less error-prone.

As for performance, there won't be a measurable difference unless instantiating a DocumentScanner is doing a lot of work (like instantiating many other objects, too). Instantiating and freeing objects in Java is pretty cheap if they are used only for a short time due to the generational garbage collector.

Collectives™ on Stack Overflow

Java software design - Looping, object creation VS modifying variables. Memory, performance & reliability comparison

3 Answers 3

Comments

Comments

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

3 Answers 3

Comments

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Related