I have done a transformation using SQLContext in Spark, but I want to write the same query using the Spark DataFrame API only. The query includes a join plus SQL CASE statements. The SQL query is written as below:
refereshLandingData = spark.sql("""
    SELECT a.Sale_ID, a.Product_ID,
           CASE WHEN a.Quantity_Sold IS NULL THEN b.Quantity_Sold
                ELSE a.Quantity_Sold
           END AS Quantity_Sold,
           CASE WHEN a.Vendor_ID IS NULL THEN b.Vendor_ID
                ELSE a.Vendor_ID
           END AS Vendor_ID,
           a.Sale_Date, a.Sale_Amount, a.Sale_Currency
    FROM landingData a
    LEFT OUTER JOIN preHoldData b ON a.Sale_ID = b.Sale_ID
""")
Now I want the equivalent code using the Spark DataFrame API, in both Scala and Python. I have tried some code but it is not working. My attempt is as follows:
joinDf=landingData.join(preHoldData,landingData['Sale_ID']==preHoldData['Sale_ID'],'left_outer')
joinDf.withColumn\
('QuantitySold',pf.when(pf.col(landingData('Quantity_Sold')).isNull(),pf.col(preHoldData('Quantity_Sold')))
.otherwise(pf.when(pf.col(preHoldData('Quantity_Sold')).isNull())),
pf.col(landingData('Quantity_Sold'))).show()
In the above code the join works perfectly, but the CASE condition does not; I am getting --> TypeError: 'DataFrame' object is not callable. I am using Spark 2.3.2 with Python 3.7, and Scala 2.11 for the Spark-Scala case. Can anyone suggest equivalent code or guidance?