I was able to do this using RDD. But I don't know if this is an acceptable solution for you.
Once you have the DF available as an RDD, you can apply repartitionAndSortWithinPartitions
to perform custom repartitioning of data.
Here is a sample I used:
class DatePartitioner(partitions: Int) extends Partitioner {
override def getPartition(key: Any): Int = {
val start_time: Long = key.asInstanceOf[Long]
Objects.hash(Array(start_time)) % partitions
}
override def numPartitions: Int = partitions
}
myRDD
.repartitionAndSortWithinPartitions(new DatePartitioner(24))
.map { v => v._2 }
.toDF()
.write.mode(SaveMode.Overwrite)