[scala] Is there a way to take the first 1000 rows of a Spark Dataframe?

I am using the randomSplit function to get a small portion of a dataframe for dev purposes, and I end up just taking the first df that is returned by this function.

val df_subset = data.randomSplit(Array(0.00000001, 0.01), seed = 12345)(0)

If I use df.take(1000) then I end up with an array of rows, not a dataframe, so that won't work for me.

Is there a better, simpler way to take, say, the first 1000 rows of the df and store it as another df?

Tags: scala, apache-spark

The answer is


The method you are looking for is .limit.

Returns a new Dataset by taking the first n rows. The difference between this function and head is that head returns an array while limit returns a new Dataset.

Example usage:

df.limit(1000)
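To make the difference concrete, here is a minimal sketch, assuming a local SparkSession and a hypothetical DataFrame named data built from a numeric range:

import org.apache.spark.sql.{DataFrame, Row, SparkSession}

val spark = SparkSession.builder().appName("limit-example").master("local[*]").getOrCreate()
import spark.implicits._

// Hypothetical example data: a single-column DataFrame with 10,000 rows.
val data: DataFrame = (1 to 10000).toDF("id")

// limit returns a new DataFrame, so the result stays in the DataFrame API.
val df_subset: DataFrame = data.limit(1000)

// take/head instead collect the rows to the driver as an Array[Row].
val rows: Array[Row] = data.take(1000)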

Limit is very simple; for example, to take the first 50 rows:

val df_subset = data.limit(50)
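Since limit returns an ordinary DataFrame, you can work with it like any other. One optional note for dev work (a sketch, not required): caching the subset avoids re-evaluating the limit against the full source on every action.

// Optional: cache the small subset so repeated actions during development
// do not re-run the limit each time (assumes the subset fits in memory).
val df_subset = data.limit(50).cache()
df_subset.count()   // first action materializes the cache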