Thursday, October 28, 2021

[SOLVED] Command line to search for all file names in all text files in a directory recursively

Issue

I've just migrated to Octopress. During the migration, the plugin downloaded a lot of unused images from WordPress, and now I'm trying to delete them.

I'd like to list all the images, search for each file name in the _posts folder, and get the files that are not referenced anywhere so I can delete them.

I came up with this command.

find ./ -type f | xargs basename | xargs grep -r {} ../_posts/

which I don't think is correct. It gives me this result:

grep: Screenshot-2015-07-30-13.04.35.png: No such file or directory
grep: Screenshot-2015-07-30-13.05.41-150x116.png: No such file or directory
grep: Screenshot-2015-07-30-13.05.41.png: No such file or directory
grep: Screenshot-2015-07-30-13.06.33-150x150.png: No such file or directory
grep: Screenshot-2015-07-30-13.06.33-231x300.png: No such file or directory
grep: Screenshot-2015-07-30-13.06.33-518x500.png: No such file or directory
grep: Screenshot-2015-07-30-13.06.33-518x592.png: No such file or directory
grep: Screenshot-2015-07-30-13.06.33.png: No such file or directory
grep: Screenshot-2015-07-30-13.07.47-1024x601.png: No such file or directory
grep: Screenshot-2015-07-30-13.07.47-1120x500.png: No such file or directory
grep: Screenshot-2015-07-30-13.07.47-150x150.png: No such file or directory
grep: Screenshot-2015-07-30-13.07.47-300x176.png: No such file or directory
grep: Screenshot-2015-07-30-13.07.47-786x592.png: No such file or directory
grep: Screenshot-2015-07-30-13.07.47.png: No such file or directory

Even the files that are in use get listed.


Solution

That's because you're not limiting how many lines xargs picks up per command:

find ./ -type f | xargs basename | xargs grep -r {} ../_posts/

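xargs passes as many input lines as it can to a single grep, and since no -I option is given, {} is not substituted; the names are appended to the end of the command instead. You'll get

grep -r {} ../_posts/ foo bar baz qux

and since grep only wants one pattern (here the literal {}), all the rest are assumed to be filenames.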
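
A quick way to check what xargs actually runs is to put echo in front of the command. This is only a sketch: foo, bar and baz are placeholder names, and nothing is searched.

```shell
# Without -I, xargs leaves {} untouched and appends every input line
# to the end of one command.
batched=$(printf '%s\n' foo bar baz | xargs echo grep -r {} ../_posts/)

# With -I {}, xargs runs one command per input line and substitutes {}.
per_line=$(printf '%s\n' foo bar baz | xargs -I {} echo grep -r {} ../_posts/)

printf 'batched:\n%s\n' "$batched"
printf 'per line:\n%s\n' "$per_line"
```

The first form prints a single grep command line with all three names at the end; the second prints three separate command lines, one per name.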

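The quick fix is to run basename once per line, and to give the second xargs the -I option so that {} is actually replaced and grep runs once per name:

find ./ -type f | xargs -L 1 basename | xargs -I {} grep -rF {} ../_posts/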

I would rewrite that without xargs at all:

find ./ -type f -printf "%f\n" | grep -rFf - ../_posts/

This way, you get a single find call and a single grep call, which should be a lot faster.
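
The -F flag in that grep matters for file names: without it, each name is read as a regular expression, and the dot in .png matches any character. A small sketch with a made-up line of text and a made-up name, img.png:

```shell
# Hypothetical input: the text does NOT contain "img.png", only "imgXpng".
line='see imgXpng here'

# Default mode: "." is a regex wildcard, so the pattern matches anyway.
if printf '%s\n' "$line" | grep -q 'img.png'; then regex_hit=yes; else regex_hit=no; fi

# -F: the pattern is a fixed string, so there is no match.
if printf '%s\n' "$line" | grep -qF 'img.png'; then fixed_hit=yes; else fixed_hit=no; fi

echo "regex: $regex_hit, fixed: $fixed_hit"
```

This prints "regex: yes, fixed: no", which is why fixed-string matching is the safer choice when the patterns are file names.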


Since grep -f - isn't working for you, use the more complicated bash process substitution (see your bash man page):

grep -rFf <(find ./ -type f -exec basename '{}' \;) ../_posts/

That turns the find output into a file that grep can read with the -f option.
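
The commands above print the lines in the posts that mention an image, so they show which files are used. To get the unused ones that the question asks for, one possible approach is to test each name separately and print it only when grep finds nothing. The following is a minimal sketch, not a definitive script: the temporary directory and the file names kept.png, orphan.png and post.markdown are invented for the demo.

```shell
# Hypothetical demo layout in a temp directory, so no real files are touched.
demo=$(mktemp -d)
mkdir -p "$demo/images" "$demo/_posts"
touch "$demo/images/kept.png" "$demo/images/orphan.png"
echo 'A post that embeds /images/kept.png somewhere.' > "$demo/_posts/post.markdown"

# Collect every image whose base name appears in no post.
unused=$(
  find "$demo/images" -type f | while IFS= read -r path; do
    name=$(basename "$path")
    # -q: no output, only an exit status; -F: fixed string;
    # --: end of options, in case a name starts with "-".
    grep -rqF -- "$name" "$demo/_posts" || printf '%s\n' "$name"
  done
)

printf '%s\n' "$unused"
rm -rf "$demo"
```

This runs one grep per image, so it is slower than the single-grep version, and a name that is a substring of another referenced name will count as used. Review the list before deleting anything.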



Answered By - glenn jackman