Issue
I am trying to identify the installed software on CentOS servers. So far I have come up with two basic solutions.
The first one is time-consuming, while the second does not apply to all my cases. For example, I want to search for packages even when the server is not running and I can only access its file system as a remote volume, a snapshot, or an image.
What I am thinking is to try to parse the same database files that rpm -qa reads its data from.
After running strace -o /tmp/rpm-strace.out rpm -qa I found (without being sure) that /var/lib/rpm/Packages and /var/lib/rpm/Name are possible locations for that database, but I cannot parse either of those two files.
Does anyone know how to parse these files? Is there any alternative to achieve what I want?
Note: The whole idea is feasible under Ubuntu as this 'Unix & Linux' question describes.
Disclaimer: This question may be more suitable for serverfault site.
Solution
You really need to use rpm to parse the rpm database. If you have access to the filesystem, you could simply use chroot to run rpm inside the appropriate root context:
chroot /my/server/filesystem rpm -qa
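If several mounted images need to be checked, the same query can be scripted. The following is a minimal sketch, not a definitive implementation. It assumes an rpm binary is installed on the machine doing the inspection and that this rpm can read the database format used by the mounted system. It uses rpm's --root option rather than chroot, so the host's rpm reads the database under the mounted tree. The function names and the tab-separated query format are illustrative choices, not part of the original answer.

```python
import subprocess

# Illustrative output format: one package per line, name and version split by a tab.
QUERY_FORMAT = "%{NAME}\t%{VERSION}-%{RELEASE}.%{ARCH}\n"


def build_query_command(root):
    """Build an rpm command that lists the packages recorded under `root`."""
    return ["rpm", "--root", root, "-qa", "--queryformat", QUERY_FORMAT]


def parse_packages(output):
    """Turn rpm's tab-separated output into (name, version) tuples."""
    packages = []
    for line in output.splitlines():
        if "\t" not in line:
            continue  # skip blank lines and anything unexpected
        name, version = line.split("\t", 1)
        packages.append((name, version))
    return packages


def list_packages(root):
    """Run rpm against the mounted filesystem and return its package list."""
    result = subprocess.run(
        build_query_command(root),
        check=True,            # raise if rpm exits with an error
        capture_output=True,
        text=True,
    )
    return parse_packages(result.stdout)
```

Calling list_packages("/my/server/filesystem") would then return one (name, version) tuple per installed package, provided the path is a mounted copy of the server's root filesystem.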
Those files are various sorts of BerkeleyDB database files. Assuming that your runtime environment has the same version of BerkeleyDB available, you can use something like Python 2's bsddb module (it was removed from the standard library in Python 3) to read them:
>>> import bsddb
>>> name = bsddb.btopen('/var/lib/rpm/Name')
>>> for pkg in name.keys():
...     print pkg
...
GConf2
GeoIP
GeoIP-GeoLite-data
GeoIP-GeoLite-data-extra
GitPython
GraphicsMagick
[...]
But this is a terrible idea and you shouldn't do it, because who knows if the Name database has exactly what you're looking for? Maybe it includes deleted packages that are somehow marked deleted, so rpm -qa would ignore them. You would probably need to look at the rpm sources to figure out exactly how things are stored.
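If you only want to confirm what kind of file you are looking at before choosing a tool, the file header can be inspected without any BerkeleyDB library. This is a minimal sketch, assuming the usual BerkeleyDB on-disk layout in which a 4-byte magic number sits at byte offset 12 of the first page. The function name is hypothetical, and the check says nothing about how rpm interprets the records inside.

```python
import struct

# Magic numbers from the BerkeleyDB metadata page (stored at byte offset 12).
BDB_MAGIC = {
    0x00053162: "btree",
    0x00061561: "hash",
}


def berkeley_db_type(path):
    """Return 'btree' or 'hash' if `path` looks like a BerkeleyDB file, else None."""
    with open(path, "rb") as handle:
        header = handle.read(16)
    if len(header) < 16:
        return None  # too short to hold the magic number
    raw = header[12:16]
    # The file may have been written on a machine with either byte order.
    for fmt in ("<I", ">I"):
        (magic,) = struct.unpack(fmt, raw)
        if magic in BDB_MAGIC:
            return BDB_MAGIC[magic]
    return None
```

A result of "btree" for a file would match the btopen call used above, while None suggests the file is not a BerkeleyDB file at all and a different approach is needed.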
Answered By - larsks Answer Checked By - Terry (WPSolving Volunteer)