If you want to use arrays, you can build an array of the search words and transform it with an rlike comparison against the lower-cased text column:
import pyspark.sql.functions as F
word_list = ['dog', 'mouse', 'horse', 'bird']
df2 = df.withColumn(
    'words',
    F.array(*[F.lit(w) for w in word_list])
).withColumn(
    'isList',
    F.expr("array_max(transform(words, x -> lower(text) rlike x))")
).drop('words')
df2.show(20, False)
+------------------------------------+------+
|text |isList|
+------------------------------------+------+
|I like my two dogs |true |
|I don't know if I want to have a cat|false |
|Anna sings like a bird |true |
|Horseland is a good place |true |
+------------------------------------+------+
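To see why this works: transform maps each word to a boolean (does the lower-cased text match it?), and array_max over an array of booleans returns true if any element is true. The same per-row logic can be sketched in plain Python; re.search mirrors rlike's unanchored regex matching, and is_list is a hypothetical helper name, not part of the Spark job:

```python
import re

word_list = ['dog', 'mouse', 'horse', 'bird']

def is_list(text):
    # transform(words, x -> lower(text) rlike x): one boolean per word;
    # array_max over booleans is true if any element is true
    matches = [bool(re.search(w, text.lower())) for w in word_list]
    return max(matches)

print(is_list('I like my two dogs'))  # True
```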
A filter operation on the array is also possible: keep only the matching words, then test whether the filtered array is non-empty:
df2 = df.withColumn(
    'words',
    F.array(*[F.lit(w) for w in word_list])
).withColumn(
    'isList',
    F.expr("size(filter(words, x -> lower(text) rlike x)) > 0")
).drop('words')
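The filter variant keeps only the words that match and then checks that at least one survived. A plain-Python sketch of the equivalent per-row check (has_match and matching_words are hypothetical names used only for illustration):

```python
import re

word_list = ['dog', 'mouse', 'horse', 'bird']

def has_match(text):
    # filter(words, x -> lower(text) rlike x) keeps the matching words;
    # size(...) > 0 asks whether any word survived the filter
    matching_words = [w for w in word_list if re.search(w, text.lower())]
    return len(matching_words) > 0

print(has_match('Horseland is a good place'))  # True
```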
If you fancy using aggregate, that's also possible:
df2 = df.withColumn(
    'words',
    F.array(*[F.lit(w) for w in word_list])
).withColumn(
    'isList',
    F.expr("aggregate(words, false, (acc, x) -> acc or lower(text) rlike x)")
).drop('words')
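aggregate folds the array into a single value: it starts from false and ORs in one regex test per word. functools.reduce expresses the same fold in plain Python (a sketch; any_word is a hypothetical name for illustration only):

```python
import re
from functools import reduce

word_list = ['dog', 'mouse', 'horse', 'bird']

def any_word(text):
    # aggregate(words, false, (acc, x) -> acc or lower(text) rlike x):
    # fold over the words, OR-ing each regex test into the accumulator
    return reduce(lambda acc, w: acc or bool(re.search(w, text.lower())),
                  word_list, False)

print(any_word('Anna sings like a bird'))  # True
```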
Note that all three of these higher-order functions (transform, filter, aggregate) require Spark >= 2.4.