Create a Spark DataFrame column with a list (array) as its data type

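Scala:

import org.apache.spark.sql.functions.{array_repeat, lit}
import spark.implicits._  // assumes an active SparkSession named spark

// One row per desired list length
val df = Seq(1, 2, 3).toDF("list_len")
// new_list is an array<string> holding list_len copies of the empty string
val result = df.withColumn("new_list", array_repeat(lit(""), $"list_len"))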

reference: https://spark.apache.org/docs/latest/api/java/org/apache/spark/sql/functions.html#array_repeat-org.apache.spark.sql.Column-org.apache.spark.sql.Column-

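PySpark:

from pyspark.sql.functions import array_repeat, col, lit

# Assumes an existing DataFrame df with an integer list_len column,
# as in the Scala example above
df = df.withColumn("new_list", array_repeat(lit(""), col("list_len")))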

reference: https://spark.apache.org/docs/latest/api/python/pyspark.sql.html#pyspark.sql.functions.array_repeat
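
To make the expected result concrete, here is a minimal plain-Python model of what each row's new_list should contain. It is not Spark code: array_repeat_model is a hypothetical helper written only for this illustration, and it assumes the same list_len values (1, 2, 3) used in the examples above.

```python
# Plain-Python model of the per-row result; this does not call Spark.
# array_repeat_model is a hypothetical helper, not part of any library.
def array_repeat_model(element, count):
    """Return a list holding `element` repeated `count` times."""
    return [element] * count


list_lens = [1, 2, 3]  # same values as the list_len column above

# Pair each list_len with the list that new_list would hold for that row
rows = [(n, array_repeat_model("", n)) for n in list_lens]

for list_len, new_list in rows:
    print(list_len, new_list)
# 1 ['']
# 2 ['', '']
# 3 ['', '', '']
```

In Spark itself, calling show() or printSchema() on the resulting DataFrame is the direct way to confirm that new_list has an array type.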
