Monday, April 24, 2006

Package Management

Many package managers have been created to simplify updating applications and libraries, and they take many different approaches to keeping a system’s programs up to date and secure. Wikipedia defines a package management system as “a collection of tools to automate the process of installing, upgrading, configuring, and removing software packages from a computer.”

Beyond the basic functions (install, list, upgrade, remove), a package manager can offer useful extras such as checksums to detect corruption, dependency tracking, and component tracking.

A brief list of a few common package managers:

1) OpenPkg http://www.openpkg.org/

2) Debian Potato dpkg (.deb)

3) Red Hat Package Manager

4) Gentoo’s Portage

5) FreeBSD Ports

6) MacOS Installer

7) Fink (used in MacOS) http://fink.sourceforge.net/

8) Slackware’s tgz package system

9) HP-UX’s Software Distributor

10) Tru64’s setld

11) Solaris’ SysV format (called pkgmgr)

12) Autopackage http://autopackage.org/


There are a few problems with package managers:

1) You need to create the package, which is time consuming. There are quite a few “services” that provide many pre-built packages: rpmfind, up2date, yum, yast, apt-get, etc. These can be a real time saver if you don’t want to create your own, but heed problem 2.

2) Dependency Hell. This is when there is either a conflict with existing libraries/packages or you don’t have the dependencies needed for the specific package install. The combination of conflict and lack of specific dependencies can be extremely frustrating.

3) Distribution dependent

But what about applications that have no pre-built package, or that run into conflicts or missing components? Two solutions come to mind: build the package from a source package (e.g. with rpmbuild), which may get you past some of the above issues; or, my favorite, build from source.

The Hard Core Approach

This is the ./configure, make, make install route. It is not as bad as it sounds, and it is surely simpler than creating a package from scratch. Source packages, often called tarballs (.tar.gz files), are usually very easy to install, especially on a modern Linux distribution. My philosophy is simple: if a source package doesn’t ./configure, make, make install easily (you do need to have the prerequisites, of course), then I deem the package total crap and find another solution. If there are no other solutions, I grind away at the problems using my experience, deja.com, forums, and search engines.

But once the ./configure, make, make install is done, I still have NO idea what I installed and where it all went. Sure, ./configure has the --prefix option and its relatives, but what about log files, /etc files, or anything else that was installed? By default most source tarballs install under /usr/local, but not always. To be accurate, the package could install stuff ALL OVER the place. You have no idea what kind of nut case created this package. So how do you track the install and remove packages when needed?

This question brings me to the Pick Of The Day: Paco, the PACkage Organizer. It is a GPL-licensed open source package organizer for software installed from source on Unix/Linux systems. It can be found at http://paco.sourceforge.net/

When installing a package from sources, Paco wraps the installation command (make install, ./install.sh, or whatever) and generates a log containing the list of all installed files. How does it perform this magic? It uses the “LD_PRELOAD” method: the environment variable LD_PRELOAD tells the dynamic linker to load a shared library ahead of all others when the installation command starts. During installation, this library catches the calls that cause filesystem alterations (such as open(), link(), and rename()) and logs the created files. This method is very simple to use, and it does not require a "pre-install" phase because it monitors processes while they run.

Since the preloaded library is active only in the specific installation process, the paco logs are not contaminated with files created by other processes making filesystem alteration calls. You can even use paco to track parallel installations.
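The scoping described above follows from how environment variables work: a variable set for a single command is seen by that command and its children, not by the rest of the system. Here is a minimal shell sketch of that idea. It is not Paco's actual code; run_preloaded is a hypothetical helper name, and the demo passes an empty library list so that it is harmless to run.

```shell
#!/bin/sh
# Hypothetical helper (not part of Paco): run one command with a shared
# library preloaded. env sets LD_PRELOAD only for the command it starts,
# so the calling shell and unrelated processes never see it.
run_preloaded() {
    preload_lib=$1
    shift
    env LD_PRELOAD="$preload_lib" "$@"
}

# Demo with an empty library list: the child process sees the variable,
# while the calling shell's own environment is left as it was.
run_preloaded "" sh -c 'echo "child:  LD_PRELOAD is ${LD_PRELOAD+set}"'
echo "parent: LD_PRELOAD is ${LD_PRELOAD-unset}"
```

A real invocation would look like run_preloaded /path/to/logger.so make install, where /path/to/logger.so is a placeholder for whatever logging library is being preloaded.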

Paco has many usage options for looking at package files, file counts, sorting, missing files, etc. The one thing it lacks, and I don’t know why such a simple feature is not included (I will contact the maintainer), is a checksum or MD5 type check. This could easily be added and used to check the integrity of packages tracked by Paco.
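Until something like that exists, a check of this kind can be approximated outside Paco with the standard md5sum tool. The sketch below assumes a plain file list with one path per line, where lines starting with # are skipped. That format is an assumption for illustration, not a statement about Paco's log format, and the function names make_manifest and check_manifest are hypothetical.

```shell
#!/bin/sh
# Sketch only: these helpers are not part of Paco.

# make_manifest LIST MANIFEST
# Record an MD5 checksum for every regular file named in LIST.
# Assumption: LIST holds one path per line; '#' lines are comments.
make_manifest() {
    grep -v '^#' "$1" | while IFS= read -r path; do
        if [ -f "$path" ]; then
            md5sum "$path"
        fi
    done > "$2"
}

# check_manifest MANIFEST
# Succeeds only if every recorded file still exists and still matches
# its recorded checksum.
check_manifest() {
    md5sum -c "$1" > /dev/null 2>&1
}
```

Run make_manifest right after an installation, keep the manifest next to the log, and run check_manifest later to find out whether any tracked file has been changed or removed.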

An added bonus is gpaco, a GTK+ based GUI for Paco. Unfortunately, it requires GTK+-2.6, which I do not have on Red Hat AS 4.

Drawbacks to Paco:

1) Can’t be used on systems where the binaries are statically linked, such as FreeBSD or OpenBSD, because LD_PRELOAD has no effect on statically linked programs.

2) Graphical installers may not be captured. If there is a CLI installation method, use that. (I still need to check on this, for example with the Oracle graphical Java installer.)

3) Lacks a checksum or MD5 type check.


Installation of Paco

Since I did not have GTK+ 2.6 or later, I disabled gpaco and installed Paco like this:

cd /tmp
wget http://superb-east.dl.sourceforge.net/sourceforge/paco/paco-1.10.7.tar.gz
tar zxf paco-1.10.7.tar.gz
cd paco-1.10.7
./configure --disable-gpaco
make
make install


Using Paco

paco --help gives all the options

I used paco to install splunk-server. Splunk comes with its own installer and I figured I would see if paco could handle it.


To install splunk-server you need to type

./splunk-Server-1.2.4-linux-installer.bin

and then answer a bunch of questions. To log the splunk-server installation with paco, I ran:

paco -lp splunk-server-1.2.4 "./splunk-Server-1.2.4-linux-installer.bin"

Paco worked like a champ!!

To log the installation of the package ‘JackBNimble-1.0.0’, which is installed using the command

'make install'

use

paco -lp JackBNimble-1.0.0 "make install"

paco will create a log file named JackBNimble-1.0.0 in the log directory, containing the list of all installed files.


If you want to log the installation of the package ‘JackBNimble-1.0.0’ and you are in the directory JackBNimble-1.0.0, you can use the current directory name as the package name with the -D option:

paco -lD "make install"


To update a Paco log with a file that was not added during a previous installation, you can use the ‘-+’ option:

paco -lp+ JackBNimble-1.0.0 "install Bquick /usr/bin/Bquick"


To remove all versions of the package JackBNimble, keeping the files in /etc and /root, and without asking for confirmation:

paco -rx -e /etc:/root --batch JackBNimble
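To make the effect of the colon-separated -e list concrete, here is a small shell sketch of the selection logic: given a list of logged paths, print only those that would be removed, keeping anything under the listed directories. It illustrates the idea only. It is not Paco's implementation, and paths_to_remove is a hypothetical name.

```shell
#!/bin/sh
# Sketch only, not Paco's code.
# paths_to_remove SKIP_LIST
# Reads paths on stdin and prints those NOT located under any directory
# in the colon-separated SKIP_LIST (e.g. /etc:/root).
# Assumption: directory names contain no shell glob characters.
paths_to_remove() {
    skip_list=$1
    while IFS= read -r path; do
        keep_on_disk=no
        saved_ifs=$IFS
        IFS=:
        for dir in $skip_list; do
            # Ignore empty entries such as the one in "/etc::/root".
            [ -n "$dir" ] || continue
            case $path in
                "$dir" | "$dir"/*) keep_on_disk=yes ;;
            esac
        done
        IFS=$saved_ifs
        if [ "$keep_on_disk" = no ]; then
            printf '%s\n' "$path"
        fi
    done
}
```

For example, feeding it /etc/jack.conf and /usr/bin/jack with the list /etc:/root prints only /usr/bin/jack. A path such as /etcetera/file is still printed, because the match is on whole directory names rather than on string prefixes.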


If you wish to post-process an installation of jackNjill-1.2.3 that lives in /usr/local/jackNjill-1.2.3 but was not originally logged with paco, you can build the log after the fact by piping the file list to paco:

find /usr/local/jackNjill-1.2.3 | paco -lp jackNjill-1.2.3

Conclusion

There is room for both types of package managers: the pre-built kind (like RPM) and the source-install logging kind (like Paco). I think package managers such as RPM are GREAT for core package handling, kernel updating, and the like. It is nice to know there are MANY people and companies out there updating the thousands of packages my system uses. That gives me stable and secure packages with minimal hassle when a security alert hits and time is short. On top of RPM itself, updater tools such as YaST, yum, and up2date are truly fantastic. Package managers like Paco are great for applications that are distributed only as source code, and for developers who want to try the latest software and still be able to track, update, and remove what they added, but whose complex environments would run into dependency issues with pre-built packages.

