I have a pipeline in which I stream data from Python and connect to the stream in a Java application. The data records are matrices of complex numbers. I've now learned that json.dumps() can't handle Python's complex type.
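For reference, here is a minimal snippet that reproduces the limitation. The helper name `try_dump` is only something I made up for this illustration; it is not part of the pipeline.

```python
import json


def try_dump(value):
    """Return the JSON text for `value`, or the error message if json refuses it."""
    try:
        return json.dumps(value)
    except TypeError as err:
        # The json module has no built-in encoding for complex values.
        return "TypeError: " + str(err)


plain = try_dump({"data": [1.0, 2.0]})      # ordinary floats serialize fine
failed = try_dump({"data": [1 + 2j]})       # a complex value is rejected
print(plain)
print(failed)
```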
For the moment I convert the complex values to strings and put them in a dictionary like this:
    for entry in range(len(data_array)):
        # str() turns each row of complex values into text such as "[(1+2j), (3+4j)]"
        data_as_string = [str(i) for i in data_array[entry]["DATA"].tolist()]
        send({'data': data_as_string,
              'coords': data_array[entry]["UVW"].tolist()})
and send it to the pipeline. But this requires extensive and expensive custom deserialization in Java, which significantly increases the running time of the pipeline. Currently I do the deserialization like this:
    JSONObject jsonObject = new JSONObject(string);
    try {
        data = jsonObject.getString("data");
        uvw = jsonObject.getString("coords");  // key name as sent from Python
    } catch (JSONException ex) {
        ex.printStackTrace();
    }
After that I do a lot of data.replace(string1, string2) calls to strip the characters added by the serialization, and then loop through the matrix to convert every number into a Java Complex type.
My Java deserialization code looks like this:
    data = data.replace("(", "");
    data = data.replace(")", "");
    data = data.replace("\"", "");
    data = data.replace("],[", "¦");
    data = data.replace("[", "");
    data = data.replace("]", "");
    uvw = uvw.replace("[", "");
    uvw = uvw.replace("]", "");
    String[] frequencyArrays = data.split("¦");
    Complex[][] tempData = new Complex[48][4];
    for (int i = 0; i < frequencyArrays.length; i++) {
        String[] complexNumbersOfAFrequency = frequencyArrays[i].split(", ");
        for (int j = 0; j < complexNumbersOfAFrequency.length; j++) {
            boolean realPartNegative = false;
            Complex c;
            if (complexNumbersOfAFrequency[j].startsWith("-")) {
                realPartNegative = true;
                // Get rid of the first - sign to be able to split the real & imaginary parts
                complexNumbersOfAFrequency[j] = complexNumbersOfAFrequency[j].replaceFirst("-", "");
            }
            if (complexNumbersOfAFrequency[j].contains("+")) {
                String[] realAndImaginary = complexNumbersOfAFrequency[j].split("\\+");
                try {
                    double real = Double.parseDouble(realAndImaginary[0]);
                    double imag = Double.parseDouble(realAndImaginary[1].replace("j", ""));
                    if (realPartNegative) {
                        c = new Complex(-real, imag);
                    } else {
                        c = new Complex(real, imag);
                    }
                } catch (IndexOutOfBoundsException e) {
                    //System.out.println("Wrongly formatted number, setting it to 0");
                    c = new Complex(0, 0);
                } catch (NumberFormatException e) {
                    System.out.println("Wrongly formatted number, setting it to 0");
                    c = new Complex(0, 0);
                }
            } else {
                String[] realAndImaginary = complexNumbersOfAFrequency[j].split("-");
                try {
                    double real = Double.parseDouble(realAndImaginary[0]);
                    double imag = Double.parseDouble(realAndImaginary[1].replace("j", "").replace("e", ""));
                    if (realPartNegative) {
                        c = new Complex(-real, -imag);
                    } else {
                        c = new Complex(real, -imag);
                    }
                } catch (IndexOutOfBoundsException e) {
                    System.out.println("Not correctly formatted: ");
                    for (int temp = 0; temp < realAndImaginary.length; temp++) {
                        System.out.println(realAndImaginary[temp]);
                    }
                    System.out.println("Setting it to (0,0)");
                    c = new Complex(0, 0);
                } catch (NumberFormatException e) {
                    c = new Complex(0, 0);
                }
            }
            tempData[i][j] = c;
        }
    }
Now my question is whether there is a way to either
1) deserialize the dictionary in Java without expensive string manipulation and without looping through the matrices for each record, or
2) do a better job of serializing the data in Python so that the deserialization in Java becomes simpler.
Any hints are appreciated.
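To make option 2 more concrete, this is roughly the kind of encoding I could imagine: every complex value is sent as a [real, imag] pair of plain numbers, so the JSON contains only numeric arrays. It is only a sketch that I have not tried in the pipeline. The function names `encode_record` and `decode_record` are hypothetical, and `rows` is assumed to be the nested list of complex values that .tolist() returns for the DATA matrix.

```python
import json


def encode_record(rows, coords):
    """Serialize one record; `rows` is a nested list of complex values."""
    return json.dumps({
        # Each complex value becomes a [real, imag] pair of plain floats.
        "data": [[[z.real, z.imag] for z in row] for row in rows],
        "coords": list(coords),
    })


def decode_record(text):
    """Inverse of encode_record, used here only to check the round trip."""
    obj = json.loads(text)
    rows = [[complex(re, im) for re, im in row] for row in obj["data"]]
    return rows, obj["coords"]


rows = [[1 + 2j, -3.5 - 4e-05j], [0j, 5 - 6j]]
text = encode_record(rows, [1.0, 2.0, 3.0])
print(text)
```

With this layout I would expect the Java side to be able to walk the nested arrays with calls such as getJSONArray() and getDouble() instead of replace() and split(), but I have not measured whether that is actually faster.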
Edit: The JSON looks like this:
    {"data": ["[(1 + 2j), (3 + 4j), ...]", "[(5 + 6j), ...]", ...],
     "coords": [1, 2, 3]}
Edit: For the coordinates I can do the deserialization in Java pretty easily:
    uvw = uvw.replace("[", "");
    uvw = uvw.replace("]", "");
    String[] coords = uvw.split(",");
I then convert the strings in coords with Double.parseDouble(). For the data string, however, this is much more complicated, because the string is full of characters that have to be removed to get at the actual numbers and to put them in the right place in the Complex[][] I want to end up with.
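For reference, this small snippet shows the text forms that str() produces for rows of complex values, which is what the Java side currently has to pick apart. The sample values are made up for illustration and are not real records.

```python
# Made-up sample rows; the real rows come from the DATA matrix via .tolist().
rows = [[1 + 2j, 3 - 4j], [2j, 1.5e-05 + 2j]]

# Same conversion as in the sending loop: one string per row.
as_text = [str(row) for row in rows]

for line in as_text:
    print(line)
```

Note that a value with a zero real part is written without parentheses, and that small magnitudes are written in exponent notation, which puts an additional "-" inside the number. Any parser that splits on "+" or "-" has to account for both cases.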