[linux] How to split one text file into multiple *.txt files?

I have a text file, file.txt (12 MB), containing:

something1
something2
something3
something4
(...)

Is there any way to split file.txt into 12 *.txt files, let's say file2.txt, file3.txt, file4.txt (...)?

This question is related to linux bash

The answer is


Try something like this:

awk -v c=1 'NR > 1 && NR % 1000000 == 1 {close(c ".txt"); ++c} {print > (c ".txt")}' Datafile.txt

for filename in *.txt; do mv "$filename" "Prefix_$filename"; done;

Using bash:

readarray -t LINES < file.txt
COUNT=${#LINES[@]}
for I in "${!LINES[@]}"; do
    INDEX=$(( (I * 12 - 1) / COUNT + 1 ))
    echo "${LINES[I]}" >> "file${INDEX}.txt"
done

Using awk:

awk '{
    a[NR] = $0
}
END {
    for (i = 1; i in a; ++i) {
        # int() truncates numerically; sub() on the number's string form can round up for large NR
        x = int((i * 12 - 1) / NR) + 1
        print a[i] > ("file" x ".txt")
    }
}' file.txt

Unlike split -l, this makes sure the number of lines per file is as even as possible.
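
To see how evenly the lines land, a small check on a throwaway file may help. This is only a sketch, assuming a POSIX awk and seq are available; the 100-line sample, the part count of 12, and the "part" file prefix are made up for illustration, and int() is used for the truncation step.

```shell
# Sketch: spread the lines of a sample file over 12 parts and show the counts.
# The temp directory, sample.txt and the "part" prefix are illustrative assumptions.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 100 > sample.txt

awk -v parts=12 '
{ line[NR] = $0 }                            # buffer the file so NR is the total in END
END {
    for (i = 1; i <= NR; ++i) {
        n = int((i * parts - 1) / NR) + 1    # target part number, 1..parts
        print line[i] > ("part" n ".txt")
    }
}' sample.txt

wc -l part*.txt
```

With 100 lines and 12 parts, every part ends up with 8 or 9 lines, whereas a fixed split -l 9 would leave a single line in the last file.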


$ split -l 100 input_file output_file

where -l is the number of lines in each file. This will create:

  • output_fileaa
  • output_fileab
  • output_fileac
  • output_filead
  • ....
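
As a quick illustration of that naming scheme, the sketch below splits a generated 250-line file into pieces of 100 lines. The sample file and the chunk_ prefix are made up for the example; the aa, ab, ac suffixes are split's default.

```shell
# Sketch: split a 250-line sample into pieces of at most 100 lines each.
# The temp directory and the "chunk_" prefix are illustrative assumptions.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 250 > input_file

split -l 100 input_file chunk_

# Expect chunk_aa and chunk_ab with 100 lines and chunk_ac with the remaining 50.
wc -l chunk_*
```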

Regardless of what is said above, on my Ubuntu 16 I had to do:

$ split -b 10M -d system.log system_split.log

Please note the space between -b and the value.


On my Linux system (Red Hat Enterprise 6.9), the split command does not have the command-line options for either -n or --additional-suffix.

Instead, I've used this:

split -d -l NUM_LINES really_big_file.txt split_files.txt.

where -d adds a numeric suffix to the output prefix split_files.txt. (note the trailing dot) and -l specifies the number of lines per file.

For example, suppose I have a really big file like this:

$ ls -laF
total 1391952
drwxr-xr-x 2 user.name group         40 Sep 14 15:43 ./
drwxr-xr-x 3 user.name group       4096 Sep 14 15:39 ../
-rw-r--r-- 1 user.name group 1425352817 Sep 14 14:01 really_big_file.txt

This file has 100,000 lines, and I want to split it into files with at most 30,000 lines. This command will run the split and append an integer at the end of the output file pattern split_files.txt..

$ split -d -l 30000 really_big_file.txt split_files.txt.

The resulting files are split correctly with at most 30,000 lines per file.

$ ls -laF
total 2783904
drwxr-xr-x 2 user.name group        156 Sep 14 15:43 ./
drwxr-xr-x 3 user.name group       4096 Sep 14 15:39 ../
-rw-r--r-- 1 user.name group 1425352817 Sep 14 14:01 really_big_file.txt
-rw-r--r-- 1 user.name group  428604626 Sep 14 15:43 split_files.txt.00
-rw-r--r-- 1 user.name group  427152423 Sep 14 15:43 split_files.txt.01
-rw-r--r-- 1 user.name group  427141443 Sep 14 15:43 split_files.txt.02
-rw-r--r-- 1 user.name group  142454325 Sep 14 15:43 split_files.txt.03


$ wc -l *.txt*
    100000 really_big_file.txt
     30000 split_files.txt.00
     30000 split_files.txt.01
     30000 split_files.txt.02
     10000 split_files.txt.03
    200000 total
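
If the pieces must end in .txt on a split without --additional-suffix, a rename loop afterwards can do it. This is a minimal sketch, assuming a split that supports -d as in the example above; the 25-line sample, the 10-line piece size, and the split_files. prefix are placeholders.

```shell
# Sketch: numeric-suffix split followed by a rename that appends ".txt".
# The temp directory, sample size and "split_files." prefix are illustrative assumptions.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 25 > really_big_file.txt

# -d gives numeric suffixes: split_files.00, split_files.01, ...
split -d -l 10 really_big_file.txt split_files.

# Append .txt to every piece; the glob matches only the two-digit suffixes.
for f in split_files.[0-9][0-9]; do
    mv -- "$f" "$f.txt"
done

ls
```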

John's answer won't produce .txt files as the OP wants. Use:

split -b 1M -d file.txt file --additional-suffix=.txt

If each part should have the same number of lines, for example 22, here is my solution:
split --numeric-suffixes=2 --additional-suffix=.txt -l 22 file.txt file
and you obtain file02.txt with the first 22 lines, file03.txt with the next 22 lines, and so on (the numeric suffix is two digits wide by default).
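
To get single-digit names such as file2.txt, the suffix length can be set with -a. A minimal sketch, assuming a GNU split recent enough to accept --numeric-suffixes=FROM and --additional-suffix; the 100-line sample is made up, and a one-digit suffix only has room for file2.txt through file9.txt.

```shell
# Sketch: 22 lines per piece, numbering from 2, one-digit suffix, ".txt" ending.
# The temp directory and the 100-line sample are illustrative assumptions.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 100 > file.txt

# -a 1 limits the suffix to one digit, so at most 8 pieces (2..9) fit.
split -a 1 --numeric-suffixes=2 --additional-suffix=.txt -l 22 file.txt file

# Expect file2.txt .. file5.txt with 22 lines each and file6.txt with 12.
wc -l file[0-9].txt
```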

Thanks to @hamruta-takawale, @dror-s and @stackoverflowuser2010.


I agree with @CS Pei, however this didn't work for me:

split -b=1M -d file.txt file

...as the = after -b threw it off. Instead, I simply deleted it, left no space between -b and the value, and used a lowercase "m":

split -b1m -d file.txt file

And to append ".txt", we use what @schoon said:

split -b1m -d file.txt file --additional-suffix=.txt

I had a 188.5MB txt file and I used this command [but with -b5m for 5.2MB files], and it returned 35 split files, all of which were txt files of 5.2MB except the last, which was 5.0MB. Since I wanted my lines to stay whole, I also tried to split the main file every 1 million lines, but the split command didn't let me pass -100000, let alone -1000000, so that bare-number shorthand did not work for me; the -l option used in the answers above is the way to split by line count.
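
For keeping lines whole, two options seem relevant: -l for a fixed number of lines per piece, and -C for a byte limit per piece that never cuts a line in two. The following is a scaled-down sketch, assuming GNU split; the 20-line sample, the sizes, and the bylines_ and bybytes_ prefixes are made up for illustration.

```shell
# Sketch: two ways to split without breaking lines, on a tiny sample.
# The temp directory, sizes and output prefixes are illustrative assumptions.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 20 > file.txt          # 20 short lines, 51 bytes in total

# Fixed number of lines per piece: 8 + 8 + 4 lines.
split -l 8 -d file.txt bylines_

# At most 20 bytes per piece, filled with whole lines only.
split -C 20 -d file.txt bybytes_

wc -lc bylines_* bybytes_*
```

For the real file, the same commands would use something like -l 1000000 or -C 5m instead of the small values here.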