I am expecting my data in the following format:

"domain::foo::127"

Here is my code:

String[] typeChunks = input.split("::");

String type = typeChunks[0];
String edge = typeChunks[1];

double reputation = Double.parseDouble(typeChunks[2].trim());

But I get this error:

            java.lang.NumberFormatException: empty String
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1011)
at java.lang.Double.parseDouble(Double.java:540)
at org.attempt2.BuildGraph$ReduceClass.reduce(BuildGraph.java:94)
at org.attempt2.BuildGraph$ReduceClass.reduce(BuildGraph.java:1)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:650)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1136)
at org.apache.hadoop.mapred.Child.main(Child.java:249)

What's a good way to handle this?

1
  • Might be better off with a regular expression that matches each group. Commented Oct 25, 2013 at 0:09

7 Answers

2

There is no single best way to validate data, except that you should always validate it before using it. Instead of splitting the string and converting each piece individually (possibly running into exceptions along the way), I suggest parsing the whole string with a Scanner, which also gives you type-aware checks such as hasNextDouble().

// Requires: import java.util.Scanner;
Double reputation = null;
String type = null, edge = null;

String dataString = "domain::foo::127";
Scanner scanner = new Scanner(dataString).useDelimiter("::");

if (scanner.hasNext()) {
    type = scanner.next();
} else
  throw new IllegalArgumentException("Type not found!");
if (scanner.hasNext()) {
    edge = scanner.next();
} else
  throw new IllegalArgumentException("Edge not found!");
if (scanner.hasNextDouble()) {
    reputation = scanner.nextDouble();
} else
  throw new IllegalArgumentException("Reputation not found!");

System.out.println(type); // domain
System.out.println(edge); // foo
System.out.println(reputation); // 127.0


An equally good approach is to test the complete data string against a regular expression (if it's not exorbitantly long), at the cost of losing the information about exactly which field failed validation.

// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
Pattern pattern = Pattern.compile("(\\w+)::(\\w+)::(\\d+)");
Matcher matcher = pattern.matcher(dataString);

if (matcher.matches()) {
    type = matcher.group(1);
    edge = matcher.group(2);
    reputation = Double.valueOf(matcher.group(3));
} else
  throw new IllegalArgumentException("Invalid input data");

1

You need to handle the case where you have malformed data. This isn't exactly an exhaustive validation, but it might be a place to start:

String[] format = "domain::foo::127".split("::");

...

boolean validateFormat(String[] format) {
  // Check for anything that you don't want coming through as data
  return format.length == 3;
}
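
One detail worth knowing when checking the length: String.split drops trailing empty strings by default, so "domain::foo::" splits into two elements, not three. The sketch below is one possible way to combine the length check with a per-field emptiness check; the class and method names (RecordSplitter, splitRecord) are made up for illustration and are not part of the original code.

```java
class RecordSplitter {

    // Splits a "type::edge::reputation" record and verifies that it has
    // exactly three non-blank fields. Hypothetical helper for illustration.
    static String[] splitRecord(String input) {
        if (input == null) {
            throw new IllegalArgumentException("Input is null");
        }
        // A negative limit keeps trailing empty strings, so "domain::foo::"
        // yields three fields (the last one empty) instead of two.
        String[] fields = input.split("::", -1);
        if (fields.length != 3) {
            throw new IllegalArgumentException(
                    "Expected 3 fields but got " + fields.length + ": " + input);
        }
        for (int i = 0; i < fields.length; i++) {
            fields[i] = fields[i].trim();
            if (fields[i].isEmpty()) {
                throw new IllegalArgumentException("Field " + i + " is empty: " + input);
            }
        }
        return fields;
    }
}
```

With this in place, Double.parseDouble(fields[2]) is only reached when the third field actually contains something, although it can still fail on non-numeric text.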


1

With a regex, you can verify whether the input string is valid before splitting it:

String pattern = "[a-z]+::{1}[a-z]+::{1}[0-9]+(\\.[0-9][0-9]?)?";

String type, edge;
double reputation;

if(input.matches(pattern)){
    String[] typeChunks = input.split("::");
    type = typeChunks[0];
    edge = typeChunks[1];
    reputation = Double.parseDouble(typeChunks[2].trim());
}
else
    throw new IllegalArgumentException();

This regex will check for

  1. Alphabetic type
  2. Alphabetic edge
  3. Numeric reputation with or without decimal
  4. "::" between all three
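
To see what the pattern accepts and rejects, here is a small sketch that compiles the same expression once and runs it against a few sample inputs. The class name RecordPatternCheck, the isValid helper and the sample strings are assumptions added for illustration.

```java
import java.util.regex.Pattern;

class RecordPatternCheck {

    // Same pattern as above, compiled once so it can be reused per record.
    static final Pattern RECORD =
            Pattern.compile("[a-z]+::{1}[a-z]+::{1}[0-9]+(\\.[0-9][0-9]?)?");

    // Returns true only if the whole input matches the pattern.
    static boolean isValid(String input) {
        return input != null && RECORD.matcher(input).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "domain::foo::127",    // integer reputation
            "domain::foo::127.5",  // one decimal place
            "domain::foo::",       // missing reputation
            "Domain::foo::127",    // upper-case letter in type
            "domain::foo:: 127"    // whitespace before the number
        };
        for (String sample : samples) {
            System.out.println("\"" + sample + "\" valid: " + isValid(sample));
        }
    }
}
```

The first two samples match; the last three do not, because the pattern only allows lower-case letters, requires at least one digit, and has no allowance for whitespace. If the real data can contain upper-case letters, digits in names, or padding, the character classes would need to be widened.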


0

Add if (!input.equals("")) { before String[] typeChunks = input.split("::"); and don't forget the closing }.


0

You can validate the string value before parsing it:

double reputation = (typeChunks[2] != null && 
                    !typeChunks[2].trim().isEmpty()) ? 
                     Double.parseDouble(typeChunks[2].trim()) : 0;

1 Comment

Generally, instead of failing silently by returning a 0 one should consider throwing an Exception. That way you can decide what is most appropriate to handle this problem. (Inform the user, use a default, log, etc.)
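
A minimal sketch of what this comment suggests, assuming a hypothetical helper named parseReputation: it rejects missing or non-numeric values with an exception that carries the offending text, and leaves the decision about defaults or logging to the caller.

```java
class ReputationParser {

    // Parses the reputation field, throwing instead of silently returning 0.
    // The class and method names are illustrative, not from the original code.
    static double parseReputation(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            throw new IllegalArgumentException("Reputation is missing");
        }
        try {
            return Double.parseDouble(raw.trim());
        } catch (NumberFormatException e) {
            // Keep the original exception as the cause for debugging.
            throw new IllegalArgumentException("Reputation is not a number: " + raw, e);
        }
    }
}
```

The caller can then catch IllegalArgumentException once, at the point where it knows whether to skip the record, substitute a default, or abort.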
0

The error occurs because the reputation field is empty. Check for that before parsing:

double reputation = 0;
final String reputStr = typeChunks[2];
if ((reputStr != null) && !("").equals(reputStr.trim()))
{
   reputation = Double.parseDouble(reputStr.trim());
}


0

What about creating a simple helper class to check your string? Something like:

public class StringUtil {

    // Returns true for null, empty, and whitespace-only strings.
    public static boolean isNullOrEmpty(final String string) {
        return string == null || string.trim().isEmpty();
    }

}

This way the null and blank checks live in one place, and you never call trim() on a null reference, which would throw a NullPointerException (calling trim() on an empty string is harmless). You still have to deal with the NumberFormatException that Double.parseDouble throws for non-numeric input.

If you don't want to add try/catch blocks every time, you can create a simple wrapper around Double.parseDouble that catches the exception and handles it your way (say, by returning -1).

double reputation = StringUtil.isNullOrEmpty(typeChunks[2]) ? 0 : YourClass.methodToParseDoubleAndHandleException(typeChunks[2]);
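
The wrapper itself is not shown above, so here is one way it might look. This is only a sketch: SafeDoubleParser and parseOrDefault are hypothetical names, and the fallback value is passed in by the caller rather than hard-coded.

```java
class SafeDoubleParser {

    // Wraps Double.parseDouble and returns the given fallback when the
    // input is null, blank, or not a valid number. Illustrative only.
    static double parseOrDefault(String raw, double fallback) {
        if (raw == null) {
            return fallback;
        }
        try {
            return Double.parseDouble(raw.trim());
        } catch (NumberFormatException e) {
            // Blank strings also end up here, since parseDouble rejects them.
            return fallback;
        }
    }
}
```

A call such as SafeDoubleParser.parseOrDefault(typeChunks[2], -1) would then cover both the empty and the malformed case in one place. Choose a fallback that cannot be mistaken for a real reputation value.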

1 Comment

What I like in this approach is that you always control your result
