Issue
I have a text file which is >300GB in size originally, and gzipped it still has >10GB. (it is a database export which ran for days, and then was aborted, and I want to know the timestamp of the last exported entry so I can resume the export.)
I am interested in the last few lines of this text file, preferably without having to unzip the whole 300GB (even into memory). This file does not grow any more so I don't need to track changes or appended data a.k.a tail -f
.
Is there a way to gunzip only the last part of the file?
tail --bytes=10000000 /mnt/myfile.db.gz | gunzip - |less
does not work (it returns stdin: not in gzip format
). Since gzip can compress not just files, but also streams of data, it should be possible to search for an entry point somewhere in the file where to start uncompressing, without having to read the file header. Right?
Solution
No, not right. Unless the gzip stream was specially generated to allow random access, the only way to decode the last few lines is to decode the whole thing.
Answered By - Mark Adler