[apache-spark] How to read multiple text files into a single RDD?

TRY THIS Interface used to write a DataFrame to external storage systems (e.g. file systems, key-value stores, etc). Use DataFrame.write() to access this.

New in version 1.4.

csv(path, mode=None, compression=None, sep=None, quote=None, escape=None, header=None, nullValue=None, escapeQuotes=None, quoteAll=None, dateFormat=None, timestampFormat=None) Saves the content of the DataFrame in CSV format at the specified path.

Parameters: path – the path in any Hadoop supported file system mode – specifies the behavior of the save operation when data already exists.

append: Append contents of this DataFrame to existing data. overwrite: Overwrite existing data. ignore: Silently ignore this operation if data already exists. error (default case): Throw an exception if data already exists. compression – compression codec to use when saving to file. This can be one of the known case-insensitive shorten names (none, bzip2, gzip, lz4, snappy and deflate). sep – sets the single character as a separator for each field and value. If None is set, it uses the default value, ,. quote – sets the single character used for escaping quoted values where the separator can be part of the value. If None is set, it uses the default value, ". If you would like to turn off quotations, you need to set an empty string. escape – sets the single character used for escaping quotes inside an already quoted value. If None is set, it uses the default value, \ escapeQuotes – A flag indicating whether values containing quotes should always be enclosed in quotes. If None is set, it uses the default value true, escaping all values containing a quote character. quoteAll – A flag indicating whether all values should always be enclosed in quotes. If None is set, it uses the default value false, only escaping values containing a quote character. header – writes the names of columns as the first line. If None is set, it uses the default value, false. nullValue – sets the string representation of a null value. If None is set, it uses the default value, empty string. dateFormat – sets the string that indicates a date format. Custom date formats follow the formats at java.text.SimpleDateFormat. This applies to date type. If None is set, it uses the default value value, yyyy-MM-dd. timestampFormat – sets the string that indicates a timestamp format. Custom date formats follow the formats at java.text.SimpleDateFormat. This applies to timestamp type. If None is set, it uses the default value value, yyyy-MM-dd'T'HH:mm:ss.SSSZZ.