I have the PySpark code below. It creates a DataFrame from another DataFrame that has been registered as a temporary view, then uses a SQL query to derive a new field. The expression for that field originally comes from PostgreSQL, and I'm wondering what the correct version of the CASE statement and regex would be in PySpark SQL:

case when a.field2::varchar ~ '^[0-9]+$' then a.field2::varchar else '0' end

Do I just use cast(field2 as string)?

Also, what would be the correct PySpark SQL version of the regex test?

Code:

# Standard library
import datetime
import math
import sys
import time
import traceback

# PySpark (duplicate imports merged; StringType comes from pyspark.sql.types)
from pyspark import SparkConf
from pyspark.context import SparkContext
from pyspark.sql import SparkSession, SQLContext, Window
from pyspark.sql.types import *
from pyspark.sql.udf import UserDefinedFunction
from pyspark.sql.functions import (
    coalesce, col, date_add, date_format, date_trunc, datediff, dayofmonth,
    dayofyear, first, format_number, hour, lag, length, lit, min, month,
    substring, to_date, trim, udf, unix_timestamp, upper, weekofyear, when, year,
)



table_df.createOrReplaceTempView("table")


query="""select
case when a.field2::varchar ~ '^[0-9]+$' then a.field2::varchar else '0' end as field1

from table a"""


df=spark.sql(query)

1 Answer

Spark SQL uses rlike for regular-expression matching where PostgreSQL uses ~. You can try:

query = """
    select
        case when a.field2 rlike '^[0-9]+$' 
             then a.field2
             else '0' 
             end as field1
    from table a
"""

df = spark.sql(query)
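If field2 might not already be a string, the question's idea of cast(field2 as string) can be combined with rlike. The following is a minimal sketch of that, not a confirmed fix for the error mentioned in the comments. The view name source_rows, the sample rows, and the helper names build_query, expected_value and run_demo are all hypothetical and exist only for illustration.

```python
import re

# Pattern from the question: one or more ASCII digits and nothing else.
DIGITS_ONLY = r"^[0-9]+$"


def build_query(view_name: str, column: str) -> str:
    """Return Spark SQL that keeps all-digit values of `column` and maps the rest to '0'."""
    # cast(... as string) stands in for PostgreSQL's ::varchar, rlike for ~.
    return f"""
        select
            case when cast(a.{column} as string) rlike '{DIGITS_ONLY}'
                 then cast(a.{column} as string)
                 else '0'
            end as field1
        from {view_name} a
    """


def expected_value(value):
    """Rough pure-Python model of the CASE expression, for sanity checks only."""
    if value is None:
        # In SQL, NULL rlike ... is NULL, so the ELSE branch applies.
        return "0"
    text = str(value)
    return text if re.search(DIGITS_ONLY, text) else "0"


def run_demo():
    """Run the query and a DataFrame-API equivalent on a local Spark session."""
    # Imported here so the helpers above work without a Spark installation.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.master("local[1]").appName("rlike-sketch").getOrCreate()
    # Hypothetical sample data standing in for table_df from the question.
    table_df = spark.createDataFrame([("123",), ("12a",), (None,)], "field2 string")
    table_df.createOrReplaceTempView("source_rows")

    # SQL version.
    spark.sql(build_query("source_rows", "field2")).show()

    # Same logic with the DataFrame API.
    as_text = F.col("field2").cast("string")
    table_df.select(
        F.when(as_text.rlike(DIGITS_ONLY), as_text).otherwise("0").alias("field1")
    ).show()
    spark.stop()
```

Calling run_demo() where PySpark is installed prints both results. Because rlike matches anywhere in the string, the ^ and $ anchors are what restrict the match to values made up entirely of digits.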

2 Comments

Thank you for your suggestion. I tried it and I'm getting the error I posted on your comment as an update. I did check, and the data type of field2 is already string.
@user3476463 Sorry, could you try the updated answer please?
