[python] View RDD contents in Python Spark?

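According to the documentation, you can bring the whole RDD to the driver with rdd.collect() and print it there. The documentation's example, rdd.collect().foreach(println), is Scala syntax; in Python, loop over the returned list instead: for x in rdd.collect(): print(x). Collecting everything may exhaust the driver's memory, so it is usually better to use rdd.take(desired_number), which returns only the first desired_number elements.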

https://spark.apache.org/docs/2.2.0/rdd-programming-guide.html

To print all elements on the driver, one can use the collect() method to first bring the RDD to the driver node thus: rdd.collect().foreach(println). This can cause the driver to run out of memory, though, because collect() fetches the entire RDD to a single machine; if you only need to print a few elements of the RDD, a safer approach is to use the take(): rdd.take(100).foreach(println).
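The quoted passage uses Scala syntax. A minimal PySpark sketch of the same idea might look like the following; the helper names print_rdd and demo are made up for illustration, and demo assumes pyspark is installed and a local SparkContext can be created.

```python
def print_rdd(rdd, n=None):
    """Print RDD elements on the driver and return them as a list.

    With n=None the whole RDD is collected, which can exhaust driver
    memory on large data. With an integer n, only the first n elements
    are fetched via take(n).
    """
    rows = rdd.collect() if n is None else rdd.take(n)
    for row in rows:
        print(row)
    return rows


def demo():
    # Imported here so the helper above can be used without pyspark loaded.
    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()
    try:
        rdd = sc.parallelize(range(1000))
        print_rdd(rdd, n=5)   # safe: fetches only five elements
    finally:
        sc.stop()
```

Calling demo() prints the first five elements. Passing n=None instead collects the entire RDD, so reserve that for RDDs known to be small.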