
Linux Find Large Files Example

Keeping our disks tidy and free of clutter is something we all want, but we often overlook large files that take up a considerable amount of space that could be put to better use.

In this example, we will see how to look for large files, whether just to be aware of their existence or to delete them or move them somewhere else.

For this example, Linux Mint 18 has been used.

1. Introduction

There is more than one way to find files by size. And, of course, “large file” is a subjective term: what counts as large depends on who is looking, so there is no way of directly saying “look for large files”. Instead, we will look for files above a size that we specify.

For this, we will use the powerful find command.

2. Understanding the find command

The syntax of find, for looking for files by their size, is the following:

find <path> [-type <file-type>] -size +<size><unit>

Let’s explain it briefly:

  • find searches recursively, so when we specify a path, it also looks in all of its subdirectories.
  • Optionally, we can specify the file type, if we only want to find files of a certain type. The file types are the following:
    • d: directory
    • f: regular file
    • l: symbolic link
    • b: block (buffered) special file
    • c: character (unbuffered) special file
    • p: named pipe
    • s: socket
  • Finally, the size, composed of the number itself and the unit we are using. The + prefix means “more than”. The unit can be one of these:
    • b: 512-byte blocks (the default, if no unit is specified)
    • c: bytes
    • k: kilobytes (units of 1024 bytes)
    • M: megabytes (units of 1024 × 1024 bytes)
    • G: gigabytes (units of 1024 × 1024 × 1024 bytes)
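
One detail worth knowing is that find rounds each file size up to a whole number of the chosen unit before comparing. The following is a minimal sketch of that behaviour, assuming GNU find (for -printf) and the coreutils truncate command are available; the scratch directory and file names are made up for the illustration.

```shell
# Hypothetical scratch directory with (sparse) files of known sizes
demo_dir=$(mktemp -d)
truncate -s 10240   "$demo_dir/ten_KiB"          # 10 KiB
truncate -s 1048576 "$demo_dir/exactly_1MiB"     # exactly 1 MiB
truncate -s 1048577 "$demo_dir/just_over_1MiB"   # 1 MiB plus one byte

# +1M: the size, rounded up to whole MiB, must be greater than 1.
# Only the file that is one byte over 1 MiB qualifies (it rounds up to 2).
over_1m=$(find "$demo_dir" -type f -size +1M -printf '%f\n')

# Two -size tests can be combined to express a range:
# more than 9 KiB and fewer than 2 MiB (both after rounding up).
in_range=$(find "$demo_dir" -type f -size +9k -size -2M -printf '%f\n' | sort)

echo "over 1M: $over_1m"
echo "in range: $in_range"

# Clean up the scratch directory
rm -rf "$demo_dir"
```

In this sketch the file of exactly 1 MiB is not reported by +1M, which is why a threshold is often best chosen slightly below the size we actually care about.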

2.1. Examples

Let’s see some examples to clear up any remaining doubt:

Find files in the home directory bigger than 2 GB:

find /home -size +2G

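Find regular files in the whole system bigger than 10 GB:

sudo find / -type f -size +10G

Note that, in this case, we have used root privileges with sudo, to look also inside directories the current user does not have access to. Note as well that -size does not add up the contents of a directory: with -type d, the test would apply to the directory entry itself, which is usually only a few kilobytes.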
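
When the goal is to rank directories by the total size of what they contain, du is usually a better fit than find, because -size only looks at each directory entry on its own. A minimal sketch, assuming the standard du, sort, head and cut utilities; the scratch directory and its big and small subdirectories are hypothetical:

```shell
# Hypothetical scratch tree: one directory with about 5 MB of data, one with about 1 KB
scratch=$(mktemp -d)
mkdir "$scratch/big" "$scratch/small"
head -c 5000000 /dev/urandom > "$scratch/big/data.bin"
head -c 1000    /dev/urandom > "$scratch/small/data.bin"

# du -s prints one total per argument; -k reports it in KiB.
# Sorting numerically in reverse puts the largest directory first.
du -sk "$scratch"/*/ | sort -n -r

# Path of the largest directory (du separates size and path with a tab)
largest_dir=$(du -sk "$scratch"/*/ | sort -n -r | head -n 1 | cut -f2)
echo "largest: $largest_dir"

# Clean up the scratch directory
rm -rf "$scratch"
```

On a real system, the same pipeline could be pointed at the subdirectories of /home or of any other path of interest.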

2.2. Finding by file extension

If we want to be more precise, we can also look for files by extension. For that, we can use the -name option, quoting the pattern so that the shell passes it to find unchanged:

find [...] -name '*.<extension>'

Here are some examples:

Find pdf files in the home directory bigger than 20 MB:

sudo find /home -size +20M -name '*.pdf'

Find mkv files in the whole system bigger than 5 GB:

sudo find / -size +5G -name '*.mkv'
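
Quoting the pattern matters: if it is left unquoted and the current directory happens to contain matching files, the shell expands the wildcard before find ever sees it. The sketch below illustrates the difference in a hypothetical scratch directory (the file names are invented for the demonstration):

```shell
# Hypothetical scratch tree with one PDF at the top and one in a subdirectory
work=$(mktemp -d)
mkdir "$work/sub"
touch "$work/a.pdf" "$work/sub/b.pdf"

# Unquoted: the shell rewrites *.pdf to a.pdf, so find only looks for that name
unquoted_count=$(cd "$work" && find . -name *.pdf | wc -l)

# Quoted: find receives the pattern itself and matches at every depth
quoted_count=$(cd "$work" && find . -name '*.pdf' | wc -l)

echo "unquoted: $unquoted_count, quoted: $quoted_count"

# Clean up the scratch directory
rm -rf "$work"
```

With two or more matching files in the current directory, the unquoted form would instead make find fail with an error, since it would receive several names where it expects one pattern.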

3. Executing actions

Until now, we have seen how to just find large files. Usually, when we look for large files, it is for a certain purpose (typically, to delete them or move them somewhere else).

On its own, finding the files just gives us their paths. This is not very useful if we want to perform some action on them, like those mentioned above.

The find command allows us to execute a command for every file found. The syntax is the following:

find [...] -exec <command> {} \;

Here, <command> is the command to run, the curly braces {} are replaced by the path of each found file, and \; marks the end of the command.
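
Besides \;, find also accepts + as a terminator, which passes many paths to a single invocation of the command instead of running it once per file. A minimal sketch of the difference, using echo as a harmless stand-in command and hypothetical file names:

```shell
# Hypothetical scratch directory with three small files
d=$(mktemp -d)
touch "$d/one.log" "$d/two.log" "$d/three.log"

# With \; echo runs once per file, so it prints three lines
per_file_runs=$(find "$d" -name '*.log' -exec echo {} \; | wc -l)

# With + echo runs once with all three paths, so it prints one line
batched_runs=$(find "$d" -name '*.log' -exec echo {} + | wc -l)

echo "runs with \\;: $per_file_runs, runs with +: $batched_runs"

# Clean up the scratch directory
rm -rf "$d"
```

The batched form tends to be faster on large result sets, but it only suits commands that accept a list of files as their last arguments.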

Let’s see some examples:

Remove all log files bigger than 20 MB:

sudo find / -name '*.log' -size +20M -exec rm {} \;

Move all files bigger than 30 GB to /tmp:

sudo find / -size +30G -exec mv {} /tmp \;

Be careful: if you execute those commands, the files will be removed or moved!
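
A cautious way to work is to run the search on its own first, review the list, and only then repeat the same expression with the destructive command. The following sketch shows that two-step pattern on hypothetical scratch files; the directory and file names are assumptions made for the example, and truncate is assumed to be available to create them:

```shell
# Hypothetical scratch directory with one large and one small log file
logs=$(mktemp -d)
truncate -s 30M "$logs/big.log"
truncate -s 1M  "$logs/small.log"

# Step 1: preview what would be affected, without changing anything
to_delete=$(find "$logs" -type f -name '*.log' -size +20M -print)
echo "would delete: $to_delete"

# Step 2: once the list looks right, repeat the same tests with rm
find "$logs" -type f -name '*.log' -size +20M -exec rm -- {} +

# Only the small file should be left
remaining=$(ls "$logs")
echo "remaining: $remaining"

# Clean up the scratch directory
rm -rf "$logs"
```

For interactive use, find also offers the -ok action, which works like -exec but asks for confirmation before running each command.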

Of course, we can execute any other action, such as showing file information:

sudo find / -size +10G -exec ls -l {} \;

4. Finding the largest files

Another interesting thing we can do with find is to find the largest files in the system.

In the previous examples, we have been looking for files by specifying a size. But we may have no idea which sizes to look for, and just want to find the largest ones.

For that, we can combine find with other commands as follows:

find [...] -exec ls -s {} \; | sort -n -r | head -n <n>

Where <n> is the number of files to show.

So, for example, if we execute:

sudo find / -type f -exec ls -s {} \; | sort -n -r | head -n 10

it will look for the 10 biggest files on the disk. This is a sample output of the command, where the first column is the size in blocks as reported by ls -s:

26460 /tmp/lu24928inllnr.tmp/lu24928inllvi.tmp
5984 /tmp/lu24928inllnr.tmp/lu24928inllos.tmp
5984 /tmp/lu24928inllnr.tmp/lu24928inllom.tmp
5984 /tmp/lu24928inllnr.tmp/lu24928inllo8.tmp
5984 /tmp/lu24928inllnr.tmp/lu24928inllo4.tmp
5864 /tmp/lu24928inllnr.tmp/lu24928inllox.tmp
5864 /tmp/lu24928inllnr.tmp/lu24928inllor.tmp
5864 /tmp/lu24928inllnr.tmp/lu24928inllod.tmp
5864 /tmp/lu24928inllnr.tmp/lu24928inllo7.tmp
5784 /tmp/lu24928inllnr.tmp/lu24928inllp2.tmp
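
As an alternative sketch, assuming GNU find, the -printf action can print each file’s size in bytes directly, which avoids starting ls once per file. This is a swapped-in variant rather than the command used above: ls -s reports allocated blocks, while %s reports the file length in bytes. The scratch files are hypothetical:

```shell
# Hypothetical scratch directory with three files of known sizes
top=$(mktemp -d)
truncate -s 1M "$top/a.bin"
truncate -s 2M "$top/b.bin"
truncate -s 3M "$top/c.bin"

# %s is the size in bytes and %f the file name without its directory;
# sort orders by the numeric first column, largest first
largest_two=$(find "$top" -type f -printf '%s %f\n' | sort -n -r | head -n 2)
echo "$largest_two"

# Clean up the scratch directory
rm -rf "$top"
```

To list full paths instead of bare names, %p can be used in place of %f.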

5. Conclusion

In this example, we have seen how to look for files in Linux by their size, with the aim of finding the largest ones. We have seen that we can specify any file type and size, and that we can also execute a command for each match, which can help us clean up our disks.

Finally, we have shown how to look for the largest files on the disk without needing to specify a size.
