[linux] How to recursively find and list the latest modified files in a directory with subdirectories and times

  • Operating system: Linux

  • Filesystem type: ext3

  • Preferred solution: Bash (script/one-liner), Ruby, or Python

I have several directories with several subdirectories and files in them. I need to make a list of all these directories that is constructed in a way such that every first-level directory is listed next to the date and time of the latest created/modified file within it.

To clarify, if I touch a file or modify its contents a few subdirectory levels down, that timestamp should be displayed next to the first-level directory name. Say I have a directory structured like this:

./alfa/beta/gamma/example.txt

and I modify the contents of the file example.txt, I need that time displayed next to the first-level directory alfa in human readable form, not epoch. I've tried some things using find, xargs, sort and the like, but I can't get around the problem that the filesystem timestamp of 'alfa' doesn't change when I create/modify files a few levels down.

This question is related to linux recursion time filesystems

The answer is


This could be done with a recursive function in Bash too.

Let F be a function that displays the time of file which must be lexicographically sortable yyyy-mm-dd, etc., (OS-dependent?)

F(){ stat --format %y "$1";}                # Linux
F(){ ls -E "$1"|awk '{print$6" "$7}';}      # SunOS: maybe this could be done easier

R, the recursive function that runs through directories:

R(){ local f;for f in "$1"/*;do [ -d "$f" ]&&R $f||F "$f";done;}

And finally

for f in *;do [ -d "$f" ]&&echo `R "$f"|sort|tail -1`" $f";done

For those, who faced

stat: unrecognized option: format

when executed the line from Heppo's answer (find $1 -type f -exec stat --format '%Y :%y %n' "{}" \; | sort -nr | cut -d: -f2- | head)

Please try the -c key to replace --format and finally the call will be:

find $1 -type f -exec stat -c '%Y :%y %n' "{}" \; | sort -nr | cut -d: -f2- | head

That worked for me inside of some Docker containers, where stat was not able to use --format option.


For plain ls output, use this. There is no argument list, so it can't get too long:

find . | while read FILE;do ls -d -l "$FILE";done

And niceified with cut for just the dates, times, and name:

find . | while read FILE;do ls -d -l "$FILE";done | cut --complement -d ' ' -f 1-5

EDIT: Just noticed that the current top answer sorts by modification date. That's just as easy with the second example here, since the modification date is first on each line - slap a sort onto the end:

find . | while read FILE;do ls -d -l "$FILE";done | cut --complement -d ' ' -f 1-5 | sort

GNU find (see man find) has a -printf parameter for displaying the files in Epoch mtime and relative path name.

redhat> find . -type f -printf '%T@ %P\n' | sort -n | awk '{print $2}'

To find all files whose file status was last changed N minutes ago:

find -cmin -N

For example:

find -cmin -5

The following returns you a string of the timestamp and the name of the file with the most recent timestamp:

find $Directory -type f -printf "%TY-%Tm-%Td-%TH-%TM-%TS %p\n" | sed -r 's/([[:digit:]]{2})\.([[:digit:]]{2,})/\1-\2/' |     sort --field-separator='-' -nrk1 -nrk2 -nrk3 -nrk4 -nrk5 -nrk6 -nrk7 | head -n 1

Resulting in an output of the form: <yy-mm-dd-hh-mm-ss.nanosec> <filename>


I shortened Daniel Böhmer's awesome answer to this one-liner:

stat --printf="%y %n\n" $(ls -tr $(find * -type f))

If there are spaces in filenames, you can use this modification:

OFS="$IFS";IFS=$'\n';stat --printf="%y %n\n" $(ls -tr $(find . -type f));IFS="$OFS";

Try this one:

#!/bin/bash
find $1 -type f -exec stat --format '%Y :%y %n' "{}" \; | sort -nr | cut -d: -f2- | head

Execute it with the path to the directory where it should start scanning recursively (it supports filenames with spaces).

If there are lots of files it may take a while before it returns anything. Performance can be improved if we use xargs instead:

#!/bin/bash
find $1 -type f -print0 | xargs -0 stat --format '%Y :%y %n' | sort -nr | cut -d: -f2- | head

which is a bit faster.


You may give the printf command of find a try

%Ak File's last access time in the format specified by k, which is either @' or a directive for the C strftime' function. The possible values for k are listed below; some of them might not be available on all systems, due to differences in `strftime' between systems.


Ignoring hidden files — with nice & fast time stamp

Handles spaces in filenames well — not that you should use those!

$ find . -type f -not -path '*/\.*' -printf '%TY.%Tm.%Td %THh%TM %Ta %p\n' |sort -nr |head -n 10

2017.01.28 07h00 Sat ./recent
2017.01.21 10h49 Sat ./hgb
2017.01.16 07h44 Mon ./swx
2017.01.10 18h24 Tue ./update-stations
2017.01.09 10h38 Mon ./stations.json

More find galore can be found by following the link.


Try this:

#!/bin/bash
stat --format %y $(ls -t $(find alfa/ -type f) | head -n 1)

It uses find to gather all files from the directory, ls to list them sorted by modification date, head for selecting the first file and finally stat to show the time in a nice format.

At this time it is not safe for files with whitespace or other special characters in their names. Write a commend if it doesn't meet your needs yet.


@anubhava's answer is great, but unfortunately won't work on BSD tools – i.e. it won't work with the find that comes installed by default on macOS, because BSD find doesn't have the -printf operator.

So here's a variation that works with macOS + BSD (tested on my Catalina Mac), which combines BSD find with xargs and stat:

$ find . -type f -print0 \
      | xargs -0 -n1 -I{} stat -f '%Fm %N' "{}" \
      | sort -rn 

While I'm here, here's BSD command sequence I like to use, which puts the timestamp in ISO-8601 format

$ find . -type f -print0 \
    | xargs -0 -n1 -I{} \
       stat  -f '%Sm %N' -t '%Y-%m-%d %H:%M:%S' "{}" \
    | sort -rn

(note that both my answers, unlike @anubhava's, pass the filenames from find to xargs as a single argument rather than a \0 terminated list, which changes what gets piped out at the very end)

And here's the GNU version (i.e. @anubhava's answer, but in iso-8601 format):

$ gfind . -type f -printf "%T+ %p\0" | sort -zk1nr

Related q: find lacks the option -printf, now what?


I have this alias in my .profile that I use quite often:

$ alias | grep xlogs
xlogs='sudo find . \( -name "*.log" -o -name "*.trc" \) -mtime -1 | sudo xargs ls -ltr --color | less -R'

So it does what you are looking for (with exception it doesn't traverse change date/time multiple levels) - looks for latest files (*.log and *.trc files in this case); also it only finds files modified in the last day, and then sorts by time and pipes the output through less:

sudo find . \( -name "*.log" -o -name "*.trc" \) -mtime -1 | sudo xargs ls -ltr --color | less -R

PS.: Notice I don't have root on some of the servers, but always have sudo, so you may not need that part.


Quick Bash function:

# findLatestModifiedFiles(directory, [max=10, [format="%Td %Tb %TY, %TT"]])
function findLatestModifiedFiles() {
    local d="${1:-.}"
    local m="${2:-10}"
    local f="${3:-%Td %Tb %TY, %TT}"

    find "$d" -type f -printf "%T@ :$f %p\n" | sort -nr | cut -d: -f2- | head -n"$m"
}

Find the latest modified file in a directory:

findLatestModifiedFiles "/home/jason/" 1

You can also specify your own date/time format as the third argument.


This command works on Mac OS X:

find "$1" -type f -print0 | xargs -0 gstat --format '%Y :%y %n' | sort -nr | cut -d: -f2- | head

On Linux, as the original poster asked, use stat instead of gstat.

This answer is, of course, user37078's outstanding solution, promoted from comment to full answer. I mixed in CharlesB's insight to use gstat on Mac OS X. I got coreutils from MacPorts rather than Homebrew, by the way.

And here's how I packaged this into a simple command ~/bin/ls-recent.sh for reuse:

#!/bin/bash
# ls-recent: list files in a directory tree, most recently modified first
#
# Usage: ls-recent path [-10 | more]
#
# Where "path" is a path to target directory, "-10" is any argument to pass
# to "head" to limit the number of entries, and "more" is a special argument
# in place of "-10" which calls the pager "more" instead of "head".
if [ "more" = "$2" ]; then
   H=more; N=''
else
   H=head; N=$2
fi

find "$1" -type f -print0 |xargs -0 gstat --format '%Y :%y %n' \
    |sort -nr |cut -d: -f2- |$H $N

Both the Perl and Python solutions in this post helped me solve this problem on Mac OS X:

How to list files sorted by modification date recursively (no stat command available!)

Quoting from the post:

Perl:

find . -type f -print |
perl -l -ne '
    $_{$_} = -M;  # store file age (mtime - now)
    END {
        $,="\n";
        print sort {$_{$b} <=> $_{$a}} keys %_;  # print by decreasing age
    }'

Python:

find . -type f -print |
python -c 'import os, sys; times = {}
for f in sys.stdin.readlines(): f = f[0:-1]; times[f] = os.stat(f).st_mtime
for f in sorted(times.iterkeys(), key=lambda f:times[f]): print f'

This is what I'm using (very efficient):

function find_last () { find "${1:-.}" -type f -printf '%TY-%Tm-%Td %TH:%TM %P\n' 2>/dev/null | sort | tail -n "${2:-10}"; }

PROS:

  • it spawns only 3 processes

USAGE:

find_last [dir [number]]

where:

  • dir - a directory to be searched [current dir]
  • number - number of newest files to display [10]

Output for find_last /etc 4 looks like this:

2019-07-09 12:12 cups/printers.conf
2019-07-09 14:20 salt/minion.d/_schedule.conf
2019-07-09 14:31 network/interfaces
2019-07-09 14:41 environment

This should actually do what the OP specifies:

One-liner in Bash:

$ for first_level in `find . -maxdepth 1 -type d`; do find $first_level -printf "%TY-%Tm-%Td %TH:%TM:%TS $first_level\n" | sort -n | tail -n1 ; done

which gives output such as:

2020-09-12 10:50:43.9881728000 .
2020-08-23 14:47:55.3828912000 ./.cache
2018-10-18 10:48:57.5483235000 ./.config
2019-09-20 16:46:38.0803415000 ./.emacs.d
2020-08-23 14:48:19.6171696000 ./.local
2020-08-23 14:24:17.9773605000 ./.nano

This lists each first-level directory with the human-readable timestamp of the latest file within those folders, even if it is in a subfolder, as requested in

"I need to make a list of all these directories that is constructed in a way such that every first-level directory is listed next to the date and time of the latest created/modified file within it."


I'm showing this for the latest access time, and you can easily modify this to do latest modification time.

There are two ways to do this:


  1. If you want to avoid global sorting which can be expensive if you have tens of millions of files, then you can do (position yourself in the root of the directory where you want your search to start):

     Linux> touch -d @0 /tmp/a;
     Linux> find . -type f -exec tcsh -f -c test `stat --printf="%X" {}` -gt  `stat --printf="%X" /tmp/a`  ; -exec tcsh -f -c touch -a -r {} /tmp/a ; -print
    

    The above method prints filenames with progressively newer access time and the last file it prints is the file with the latest access time. You can obviously get the latest access time using a "tail -1".

  2. You can have find recursively print the name and access time of all files in your subdirectory and then sort based on access time and the tail the biggest entry:

     Linux> \find . -type f -exec stat --printf="%X  %n\n" {} \; | \sort -n | tail -1
    

And there you have it...


Here is one version that works with filenames that may contain spaces, newlines, and glob characters as well:

find . -type f -printf "%T@ %p\0" | sort -zk1nr
  • find ... -printf prints the file modification time (Epoch value) followed by a space and \0 terminated filenames.
  • sort -zk1nr reads NUL terminated data and sorts it reverse numerically

As the question is tagged with Linux, I am assuming GNU Core Utilities are available.

You can pipe the above with:

xargs -0 printf "%s\n"

to print the modification time and filenames sorted by modification time (most recent first) terminated by newlines.


Examples related to linux

grep's at sign caught as whitespace How to prevent Google Colab from disconnecting? "E: Unable to locate package python-pip" on Ubuntu 18.04 How to upgrade Python version to 3.7? Install Qt on Ubuntu Get first line of a shell command's output Cannot connect to the Docker daemon at unix:/var/run/docker.sock. Is the docker daemon running? Run bash command on jenkins pipeline How to uninstall an older PHP version from centOS7 How to update-alternatives to Python 3 without breaking apt?

Examples related to recursion

List all the files and folders in a Directory with PHP recursive function Jquery Ajax beforeSend and success,error & complete Node.js - Maximum call stack size exceeded best way to get folder and file list in Javascript Recursive sub folder search and return files in a list python find all subsets that sum to a particular value jQuery - Uncaught RangeError: Maximum call stack size exceeded Find and Replace string in all files recursive using grep and sed recursion versus iteration Method to get all files within folder and subfolders that will return a list

Examples related to time

Date to milliseconds and back to date in Swift How to manage Angular2 "expression has changed after it was checked" exception when a component property depends on current datetime how to sort pandas dataframe from one column Convert time.Time to string How to get current time in python and break up into year, month, day, hour, minute? Xcode swift am/pm time to 24 hour format How to add/subtract time (hours, minutes, etc.) from a Pandas DataFrame.Index whos objects are of type datetime.time? What does this format means T00:00:00.000Z? How can I parse / create a date time stamp formatted with fractional seconds UTC timezone (ISO 8601, RFC 3339) in Swift? Extract time from moment js object

Examples related to filesystems

Get an image extension from an uploaded file in Laravel Notepad++ cached files location No space left on device How to create a directory using Ansible best way to get folder and file list in Javascript Exploring Docker container's file system Remove directory which is not empty GIT_DISCOVERY_ACROSS_FILESYSTEM not set Trying to create a file in Android: open failed: EROFS (Read-only file system) Node.js check if path is file or directory