2018-11-19 · Program Development

Spark - Using Multiple Columns in a DataFrame UDF

A Scala UDF takes a fixed number of column arguments, so to feed an arbitrary list of columns into one UDF, pack them into a single struct column; the UDF then receives that struct as a Row and can read every field from it.

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.{col, struct, udf}

// The UDF receives the packed struct as a Row, computes a chi-square
// statistic against the baseline, and returns a
// (flag, absolute deviation, relative deviation) tuple.
val calUdf = udf((r: Row) => {
  val chi2 = KpiChiSquareThreshold.calChiSquareValue(baseline, r.toSeq.toList.asInstanceOf[List[Long]])
  if (chi2 > kpiChiSquareThr)
    (1,
      Math.abs(chi2 - kpiChiSquareThr),
      if (Math.abs(chi2 - kpiChiSquareThr) / kpiChiSquareThr > 1) 1.0
      else Math.abs(chi2 - kpiChiSquareThr) / kpiChiSquareThr)
  else (0, 0.0, 0.0)
})

// Pack all feature columns into one struct column, apply the UDF,
// then unpack the tuple fields into separate output columns.
df = df.withColumn("detect", calUdf(struct(featureList.map(f => col(f)): _*)))
  .withColumn($(outputCol), col("detect._1"))
  .withColumn($(absDevCol), col("detect._2"))
  .withColumn($(relDevCol), col("detect._3"))
  .drop("detect")
```
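For reference, below is a minimal, self-contained sketch of the same struct-into-UDF pattern. The column names f1/f2/f3, the featureList value, and the simple sum inside the UDF are hypothetical stand-ins for the chi-square logic above.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.{col, struct, udf}

object MultiColUdfSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("multi-col-udf")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical input: three numeric feature columns.
    val df = Seq((1L, 2L, 3L), (10L, 20L, 30L)).toDF("f1", "f2", "f3")
    val featureList = Seq("f1", "f2", "f3")

    // The UDF sees the packed struct as a single Row, so it works for
    // however many columns featureList happens to contain.
    val sumUdf = udf((r: Row) => r.toSeq.map(_.asInstanceOf[Long]).sum)

    df.withColumn("total", sumUdf(struct(featureList.map(col): _*)))
      .show()

    spark.stop()
  }
}
```

Packing the columns into a struct keeps the column list data-driven (featureList) instead of hard-coded in the UDF signature, and avoids the 22-argument limit of Scala UDFs.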