Thanks for your time.
I have a DataFrame in PySpark on Databricks that reads JSON. The data from the source does not always have the same structure: sometimes the 'emailAddress' field is missing, which causes the error "org.apache.spark.sql.AnalysisException: cannot resolve ...".
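To illustrate, here is a minimal reproduction (the values are made up; only the field names match my real data). When a batch contains only records like the second one, the inferred schema has no emailAddress column and the select fails:

    # Two sample payloads: only the first one carries emailAddress
    with_email = '{"responseID": 1, "surveyID": 10, "surveyName": "Demo", "timestamp": "2021-01-01T00:00:00", "customVariables": {"Id_Cliente": "C001"}, "responseSet": "A", "emailAddress": "a@b.com"}'
    without_email = '{"responseID": 2, "surveyID": 10, "surveyName": "Demo", "timestamp": "2021-01-02T00:00:00", "customVariables": {"Id_Cliente": "C002"}, "responseSet": "B"}'

    # Reading only the second payload infers a schema without emailAddress,
    # so selecting that column raises the AnalysisException above
    df_json = spark.read.json(spark.sparkContext.parallelize([without_email]))
    df_json.printSchema()  # no emailAddress field in the schema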
I have tried to solve this by wrapping the select in a try/except, like this:
    try:
        df_json = df_json.select("responseID", "surveyID", "surveyName", "timestamp",
                                 "customVariables.Id_Cliente", "responseSet", "emailAddress")
    except ValueError:
        pass
But it does not work: the same AnalysisException is raised, so apparently the except clause never catches it (I suspect select does not raise ValueError at all).
I am also trying another alternative, but without results:
    # s_fields holds the field names extracted from the JSON schema
    if 'Id_Cliente' in s_fields:
        try:
            df_json = df_json.select("responseID", "surveyID", "surveyName", "timestamp",
                                     "customVariables.Id_Cliente", "responseSet", "emailAddress")
        except ValueError:
            df_json = df_json.select("responseID", "surveyID", "surveyName", "timestamp",
                                     "customVariables.Id_Cliente", "responseSet")
Could someone suggest a way to handle this situation? I need the notebook execution to stop when the field is not found in the structure; otherwise (when it finds the emailAddress field) processing should continue. Something along the lines of the sketch below is the behavior I am after.
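A sketch of what I mean, assuming that checking df_json.columns is a valid way to test for a top-level field (the message text is just an example):

    # Stop the run if the field is missing; an unhandled exception
    # ends the execution of a Databricks notebook
    if "emailAddress" not in df_json.columns:
        raise ValueError("emailAddress not found in the JSON structure")

    # Otherwise continue processing as usual
    df_json = df_json.select("responseID", "surveyID", "surveyName", "timestamp",
                             "customVariables.Id_Cliente", "responseSet", "emailAddress")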
Thank you very much in advance.
Regards.