For Spark version >= 2.4, we can utilize higher-order functions to work with arrays, including this problem. Let's say df is the dataframe.
df = spark.createDataFrame([
("Barkley likes people. Barkley likes treats. Barkley likes everything.",[22, 22, 25]),
("A sentence. Another sentence.",[13, 18]),
("One sheep. Two sheep. Three sheep. Four sheep.",[11, 12, 13, 12])],
"col1:string, col2:array<int>")
df.show()
# +--------------------+----------------+
# | col1| col2|
# +--------------------+----------------+
# |Barkley likes peo...| [22, 22, 25]|
# |A sentence. Anoth...| [13, 18]|
# |One sheep. Two sh...|[11, 12, 13, 12]|
# +--------------------+----------------+
To slice the sentences from col1, substring function will used and it needs arguments of start position and length. col2 is the lengths of every sentences in the string. The start positions of every sentences are the cumulative sum of array col2 from 0 to n-1, as hinted in the question. To get that, use higher-order functions transform and aggregate. After that, get every sentences and use map_from_entries to create a map for every sentences and their indexes. This is an example to do so.
import pyspark.sql.functions as F
df = (df
.withColumn("start", F.expr("transform(transform(col2, (v1,i) -> slice(col2, 1, i)), v2 -> aggregate(v2, 0, (a,b) -> a + b))"))
.withColumn("sentences", F.expr("transform(col2, (v, i) -> struct(i+1 as index, substring(col1, start[i], col2[i]) as sentence))"))
.selectExpr("col1", "map_from_entries(sentences) as sentences")
)
df.show(truncate=False)
# +---------------------------------------------------------------------+------------------------------------------------------------------------------------------+
# |col1 |sentences |
# +---------------------------------------------------------------------+------------------------------------------------------------------------------------------+
# |Barkley likes people. Barkley likes treats. Barkley likes everything.|[1 -> Barkley likes people. , 2 -> Barkley likes treats., 3 -> Barkley likes everything]|
# |A sentence. Another sentence. |[1 -> A sentence. A, 2 -> Another sentence.] |
# |One sheep. Two sheep. Three sheep. Four sheep. |[1 -> One sheep. , 2 -> Two sheep. , 3 -> Three sheep. , 4 -> Four sheep.] |
# +---------------------------------------------------------------------+------------------------------------------------------------------------------------------+