
I am working on Query Service in Adobe Experience Platform. It supports only a limited set of Spark SQL functions, which are listed here.

I have the following table

Name   AddressType    CustomerDetails
------------------------------------------------------------------------------------------      
John   home           [{"acctType":"Mortgage loan","acctID":101},{"acctType":"Home Equity loan","acctID":104},{"acctType":"Checking Account","acctID":105}]
John   work           [{"acctType":"Mortgage loan","acctID":101},{"acctType":"Home Equity loan","acctID":104},{"acctType":"Checking Account","acctID":105}]
John   office         [{"acctType":"Mortgage loan","acctID":101},{"acctType":"Home Equity loan","acctID":104},{"acctType":"Checking Account","acctID":105}]

Here Name and AddressType are strings, whereas CustomerDetails has the schema type array<struct<acctType:string, acctID:string>>

I need to add the AddressType value to each element of the CustomerDetails column, so the final output should look like the format below

Name   AddressType    CustomerDetails
--------------------------------------      
John   home           [{"AddressType":"home","acctType":"Mortgage loan","acctID":101}, {"AddressType":"home","acctType":"Home Equity loan","acctID":104}, {"AddressType":"home","acctType":"Checking Account","acctID":105}]
John   work           [{"AddressType":"work","acctType":"Mortgage loan","acctID":101}, {"AddressType":"work","acctType":"Home Equity loan","acctID":104}, {"AddressType":"work","acctType":"Checking Account","acctID":105}]
John   office         [{"AddressType":"office","acctType":"Mortgage loan","acctID":101}, {"AddressType":"office","acctType":"Home Equity loan","acctID":104}, {"AddressType":"office","acctType":"Checking Account","acctID":105}]

I am using the query below to add an extra field to the CustomerDetails column, but I am not able to figure out how to add the value of the AddressType column to col3

SELECT Name, addressType,
(from_json(CustomerDetails, 'ARRAY<STRUCT<addressType: STRING, acctType: STRING, acctID: STRING>>')) AS col3
FROM CustomerTable 

I have also looked at the TRANSFORM function, but I cannot work out the SQL-specific syntax for using it. Any help on this would be appreciated.

2 Answers


After digging around for a bit, I was finally able to use this answer as a reference and come up with a solution. Below is the query that worked for me.

SELECT ct1.Name, ct1.addressType, to_json(ct1.col2) AS CustomerDetails
FROM (
    SELECT ct.Name, ct.addressType,
           TRANSFORM(ct.col3, x -> struct(ct.addressType AS addressType, x.acctType AS acctType, x.acctID AS acctID)) AS col2
    FROM (
        SELECT Name, addressType,
               from_json(CustomerDetails, 'ARRAY<STRUCT<addressType: STRING, acctType: STRING, acctID: STRING>>') AS col3
        FROM CustomerTable
    ) ct
) ct1
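The nested query does three things per row: parse the JSON string into an array of structs with from_json, rebuild each element with TRANSFORM so it carries the row's addressType, and serialize the result back with to_json. As a plain-Python sketch of that per-row logic (standard library only, not AEP or Spark; the function name is illustrative):

```python
import json

def add_address_type(address_type, customer_details):
    """Mimic from_json -> TRANSFORM -> to_json for a single row.

    Parses the JSON array, prepends an addressType field to every
    element, and serializes the enriched array back to a JSON string.
    """
    accounts = json.loads(customer_details)
    enriched = [{"addressType": address_type, **acct} for acct in accounts]
    return json.dumps(enriched)

row = ("John", "home",
       '[{"acctType":"Mortgage loan","acctID":101},'
       '{"acctType":"Home Equity loan","acctID":104}]')
print(add_address_type(row[1], row[2]))
```

Unlike the map-based approach in the other answer, the struct schema keeps acctID numeric here because the Python dict preserves the parsed value.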



Another alternative is to use map rather than struct, since with struct you need to specify all of the columns.

SELECT  
    name,
    addresstype,
    to_json(
        transform(
            from_json(
                customerdetails, 'array<map<string, string>>'
            ), 
            f -> map_concat(
                f, 
                map('AddressType', addresstype)
            )
        )
    ) AS customerdetails
FROM VALUES 
   ("John","home",'[{"acctType":"Mortgage loan","acctID":101},{"acctType":"Home Equity loan","acctID":104},{"acctType":"Checking Account","acctID":105}]'), 
   ("John","work",'[{"acctType":"Mortgage loan","acctID":101},{"acctType":"Home Equity loan","acctID":104},{"acctType":"Checking Account","acctID":105}]'),
   ("John","office",'[{"acctType":"Mortgage loan","acctID":101},{"acctType":"Home Equity loan","acctID":104},{"acctType":"Checking Account","acctID":105}]') 
AS (name, addresstype, customerdetails)
+----+-----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|name|addresstype|customerdetails                                                                                                                                                                                                 |
+----+-----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|John|home       |[{"acctType":"Mortgage loan","acctID":"101","AddressType":"home"},{"acctType":"Home Equity loan","acctID":"104","AddressType":"home"},{"acctType":"Checking Account","acctID":"105","AddressType":"home"}]      |
|John|work       |[{"acctType":"Mortgage loan","acctID":"101","AddressType":"work"},{"acctType":"Home Equity loan","acctID":"104","AddressType":"work"},{"acctType":"Checking Account","acctID":"105","AddressType":"work"}]      |
|John|office     |[{"acctType":"Mortgage loan","acctID":"101","AddressType":"office"},{"acctType":"Home Equity loan","acctID":"104","AddressType":"office"},{"acctType":"Checking Account","acctID":"105","AddressType":"office"}]|
+----+-----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
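Note one side effect of the 'array<map<string, string>>' schema: every value is coerced to a string, which is why acctID appears as "101" in the output above. The map_concat call itself is just a map merge that appends the second map's entries. A plain-Python sketch of both steps (standard library only, not Spark; function names are illustrative):

```python
import json

def parse_as_string_map(customer_details):
    # from_json(..., 'array<map<string, string>>') coerces every
    # value to a string; mimic that coercion here.
    return [{k: str(v) for k, v in element.items()}
            for element in json.loads(customer_details)]

def with_address_type(element, address_type):
    # Equivalent of map_concat(f, map('AddressType', addresstype)):
    # the second map's entries are appended after the first's.
    return {**element, "AddressType": address_type}

details = parse_as_string_map('[{"acctType":"Mortgage loan","acctID":101}]')
print(json.dumps([with_address_type(e, "home") for e in details]))
```

The merged entry lands at the end of each element, matching the column order seen in the table above.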

5 Comments

Your query is throwing an error at line 17:11: extraneous input ','. Maybe AEP does not like the way the data is being provided.
What does AEP mean?
How are you executing the above query?
I am using Adobe Experience Platform's Query Service feature to do so. It has very limited Spark SQL functionality. But I get the gist of what you are trying to accomplish. Using maps would be beneficial if I had a very large number of columns.
Yes, that's correct.
