1

I have a question in regards to parsing a file. Here is what a few of the statements look like that I parse the file with:

while(scan.hasNextLine()){
    line = scan.nextLine();//this gets the whole line
    if(line.contains("SomeDataIKnow"){
        data = new Scanner(line).useDelimiter("=");//scans just the one line
        value = data.next();//sets value to the string right after the =
        this.Data1 = value;
    }
    else if(line.contains("DifferentDataIKnow"){
        data = new Scanner(line).useDelimiter("=");
        value = data.next();
        this.Data2 = value;
 }

Mind you there are about 30 of the lines that all have different Strings in the line contains statement.

I know what the words before each "=" will say but not after the equals. So I need to parse through looking for specific data and then get those values after the equals sign. I know typically you shouldn't use if statements when there are more than like 2 or 3 statements. So I'm curious if anyone has any ideas of another way to do it? Possibly Switch statements I was thinking, but not sure if that would even help at 30 some lines. Also if it helps visualize the file, there is like 400 total lines but I only need about 30 of them. I can't post the file but an example of a full line in the file would be something like:

Device.A Bunch of other info that changes.THE DATA I KNOW = THE VALUE I NEED

So I parse multiple files and the data changes with the exception of the DATA I KNOW part, that never changes and its what I use to find the line that has the value I need. Sorry if this at all sounds confusing.

13
  • You can't use a switch statement when you are using contains (or startsWith, endsWith etc): switches only match on equality. Commented Jul 18, 2016 at 22:55
  • 2
    You could put all your valid prefixes in an array, and iterate them. Commented Jul 18, 2016 at 22:55
  • You don't need to create a Scanner to split a string into pieces: you can use line.split("="), amongst other approaches. Commented Jul 18, 2016 at 22:56
  • I have heard about using an array for it, but would that make it faster? Since id have to iterate through the array on each line of the file. I would think it probably take just as long as my current setup Commented Jul 18, 2016 at 22:58
  • @chris23balln: It wouldn't make it run any faster, but if done right it would make your code a lot more readable. Are you running into performance issues? Commented Jul 18, 2016 at 23:04

3 Answers 3

2

You should not code this way, DRY is the most important rule in coding--and when you look at your switch statement you will find a LOT of redundancy. The reason you are not supposed to use if/elseif chains like this also applies to switch statements--they are a bad code smell simply because they lead to redundant code.

So the first thing to do is look at what's different in each block. Here is your redundant code:

if(line.contains("SomeDataIKnow"){
    data = new Scanner(line).useDelimiter("=");//scans just the one line
    value = data.next();//sets value to the string right after the =
    this.Data1 = value;
}
else if(line.contains("DifferentDataIKnow"){
    data = new Scanner(line).useDelimiter("=");
    value = data.next();
    this.Data2 = value;
}

The differences are [SomeDataIKnow, Data1] vs [DifferentDataIKnow, Data2]

Scalars are not a good way to store data like this--they often make you create code like yours above because there is no obvious easier way, so generally I use a map.

So if I saw this code in my codebase my first refactor would look something like:

String[] strings=new String[]{"SomeDataIKnow","DifferentDataIKnow"};
Map results=new HashMap<String, String>()

for(String sub:strings)
{
    if(line.contains(sub){
        data = new Scanner(line).useDelimiter("=");
        value = data.next();
        results.put(sub, value);
}

This has some issues but works. You don't have to add any code when there are new strings to check for, your code is reduced by 30x and any patch to your if statement/code doesn't need to be repeated in 30 different locations.

The biggest issue is that it's no longer trivial to look up the value. You need the key which is the string "SomeDataIKnow", pain in the butt... You used to have a variable "Data1".

If you want to retain "Data1" you can make your string something like:

String[] strings=new String[]{"SomeDataIKnow:Data1","DifferentDataIKnow:Data2"};

and then do a String[] split=sub.split(":") and do the lookup by split[0] but enter it into the map by split[1]

After this instead of using this.Data1, you'd use results.get("Data1") which is reasonable.

There are LOTS of other ways to supply metadata for a common operation like this.. as long as you keep it DRY, most will work fine. I use the string array construct only because it's easy to initialize and work with... same for split.

As an example of another approach, if you really wanted to use scalars you could set your class up like this:

class MyClass{
    @SearchStr("SomeDataIKnow")
    String Data1;
    @SearchStr("DifferentDataIKnow")
    String Data2;
...

Annotations are pretty easy to process these days--you would simply go through all @SearchStr annotations in the class to build your search terms, loop over them the same way as above and stuff the resulting values into the corresponding variable.

The trick of refactoring patterns like this is simply to identify differences and group them--any way that makes you feel comfortable and doesn't add too much complexity.

By the way, once you have the results in a map look at where they are used. You will often find other redundant code there that can be refactored out--refactorings like this can chain to make huge savings across your codebase.

Sign up to request clarification or add additional context in comments.

Comments

0

One possibility could be to use a Map. As you scan each line, add an entry to the map where the key is the part you know before the '=' and the value is the part after the '='

Then, after the loop, you could write something like:

this.Data1 = map.get("SomeDataIKnow");
this.Data2 = map.get("DifferentDataIKnow");

etc.

Comments

0

You might want to consider using a Regex Pattern. If the files are not too large, you could read the entire file into memory, then use a carefully crafted Regex Pattern to search for (SomeDataIKnow|DifferentDataIKnow|...)= and capture both the matching text before the equals sign and everything between the = and the next line break.

The main effect this would have is to allow your program to scan through the file's contents just once instead of 40 times. If searching through the contents is what's causing things to go slowly, then that should help.

Once you know the exact pattern that got matched before the equals sign, then you have a specific string that you can use in a switch statement or look up in a HashMap<> or any of the other approaches suggested by Bill K.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.