macosxlabs.org


Print Accounting with Pykota/CUPS


Background

The Printing System



CUPS

Printing within Sheridan is currently done using the Berkeley Line Printer Daemon ("LPD"). This printing system was designed in the 1970s for printing text to line printers; vendors have since added varying levels of support for other types of printers.

LPD lacks many of the features required for today's printing. Replacements for LPD have emerged [LPRng, Palladin, PLP]. However, none of the replacements change the fundamental capabilities of these systems.

Over the last few years several attempts at developing a standard printing interface have been made. The Internet Printing Protocol (“IPP”) defines extensions to HTTP to provide support for remote printing services. IPP currently enjoys widespread industry support and is poised to become the standard network printing solution for all operating systems.

CUPS uses IPP/1.1 to provide a complete, modern printing system for UNIX that can be extended to support new printers, devices, and protocols while providing compatibility with existing UNIX applications. CUPS is free software provided under the terms of the GNU General Public License and GNU Library General Public License.

Pykota

Pykota is a highly granular and customizable print accounting plug-in for CUPS. While CUPS features its own simple print quota system, that system is not extensible and lacks many of the features present in Pykota.

Pykota "kidnaps" the job after CUPS rasterizes it in order to perform its various accounting functions. If the user is allowed to print, Pykota then sends the print data back to the appropriate CUPS backend. Pykota is free software distributed under the terms of the GNU General Public License.


Easy Software Products

ESP are the primary developers of CUPS. They offer a commercial application which runs on top of CUPS, ESP PrintPro. This application provides the following:

• 4,400+ printer drivers for CUPS
• IPP clients for Windows
• Sophisticated Management and Interface GUIs
• Commercial support for CUPS
• Easy installer for PrintPro/CUPS
• Automatic Updates

Samba

If ESP provides IPP clients for Windows, then wouldn't it be simpler to print to CUPS directly via IPP?

Supporting Windows native printing calls offers several distinct advantages over IPP. Please see the Samba section for more information.

Windows uses Microsoft Remote Procedure Calls (MS-RPC) as its printing mechanism. This protocol is supported by the Samba File & Print sharing suite. Samba is free software distributed under the terms of the GNU General Public License.

Internet & Free Software Consulting

The primary developer of Pykota. They offer priority email support, bug fixes, and official updates.


PyKota Pros

Integrates seamlessly with Mac OS X

OS X already uses CUPS as its printing subsystem, so integrating OS X clients with Pykota/CUPS is fairly trivial. On the OS X client, just add or modify the following directive in /etc/cups/cupsd.conf:

BrowsePoll cupserver:631

This makes the client poll the CUPS server for available printers to use.
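As a rough illustration of the client-side change, the sketch below adds a BrowsePoll line to the text of a cupsd.conf only if that exact line is not already there. The function name ensure_browse_poll is hypothetical and cupserver is the placeholder host name from the example above. It works on a string rather than on the real file, so it can be tried without touching a live configuration.

```python
def ensure_browse_poll(conf_text, server, port=631):
    """Return conf_text with a 'BrowsePoll server:port' line, added only if missing."""
    directive = f"BrowsePoll {server}:{port}"
    for line in conf_text.splitlines():
        # Only an exact, uncommented match counts as already present.
        if line.strip() == directive:
            return conf_text
    if conf_text and not conf_text.endswith("\n"):
        conf_text += "\n"
    return conf_text + directive + "\n"


# Sample configuration text, made up for the demonstration.
sample = "# cupsd.conf on the OS X client\nBrowsing On\n"
updated = ensure_browse_poll(sample, "cupserver")
print(updated)
```

After a change like this, cupsd on the client would normally have to be restarted before the new directive takes effect.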

Customizable

The addition of Pre & Post Hooks for Pykota, introduced in 1.18, greatly eases administration. Pykota can pass a large number of arguments to your custom hook.

Pre-Hook: At this stage your script gets executed after:

  • making sure that the user, the printer, and the user's quota entry on the printer all exist in the quota storage backend
  • checking whether the user is allowed to print
  • but just before the job is sent to the printer (if allowed).

    Post-Hook: At this stage your script gets executed after:
  • the job is added to the job history on the quota storage backend

    Admins can also specify an external policy hook to run if pykota cannot find the user or printer in the quota storage backend. This custom hook can automatically validate the user and/or printer values and create the necessary users with their respective quota records.
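A hook can be any executable, so a short Python script is enough to try one out. The sketch below only reads environment variables that the step-by-step walk-through on this page names (PYKOTAPHASE, PYKOTAACTION, PYKOTAJOBSIZE, PYKOTAJOBPRICE). The value "DENY" for PYKOTAACTION and the function names are assumptions made for illustration; check them against your Pykota version before relying on them.

```python
import os


def describe_job(env):
    """Build a one-line summary from the variables Pykota exports to a hook."""
    phase = env.get("PYKOTAPHASE", "UNKNOWN")    # "BEFORE" or "AFTER"
    action = env.get("PYKOTAACTION", "UNKNOWN")  # assumed values: "ALLOW", "DENY"
    summary = f"phase={phase} action={action}"
    if phase == "AFTER":
        # Size and price are only exported once accounting has finished.
        size = env.get("PYKOTAJOBSIZE", "?")
        price = env.get("PYKOTAJOBPRICE", "?")
        summary += f" size={size} price={price}"
    return summary


def was_denied(env):
    """True when the exported action says the job was refused (assumed value)."""
    return env.get("PYKOTAACTION", "").upper() == "DENY"


# When run as a hook, the real process environment is used.
print(describe_job(os.environ))
```

A post-hook could call was_denied(os.environ) and, if it returns True, trigger whatever notification mechanism the site prefers.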

    Highly Granular

    Quota on a user to a printer - "Give Smith a quota of 100 pages for printer oa-e108-e2"

    Quota on a user group to a printer group - "Give the staff group a quota of 100 pages for printer group Color_Laser_Room142"

    In other words, you can set a quota for a user or a user group on a printer or a printer group.

    Each user or user group can have their own quota value.

    CUPS is limited to assigning the same quota for all users. CUPS does not have the functionality of grouping users and printers. CUPS has no means of reading out the current balance or "used-up" number of pages.
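The user/group and printer/printer-group combinations described above can be pictured as a lookup keyed on pairs. This is a toy in-memory model, not Pykota's storage layout: the dictionaries, the extra user "jones", and the printers "color-1" and "color-2" are all made up for the example.

```python
# Hypothetical in-memory stand-in for the quota storage backend.
user_quotas = {("smith", "oa-e108-e2"): 100}
group_quotas = {("staff", "Color_Laser_Room142"): 100}
group_members = {"staff": {"smith", "jones"}}
printer_groups = {"Color_Laser_Room142": {"color-1", "color-2"}}


def applicable_quotas(user, printer):
    """List every (scope, page_limit) entry that covers this user on this printer."""
    found = []
    # Direct entry: a user on a single printer.
    if (user, printer) in user_quotas:
        found.append((f"user {user} on {printer}", user_quotas[(user, printer)]))
    # Group entries: the user's group on a printer group containing the printer.
    for (group, pgroup), limit in group_quotas.items():
        in_group = user in group_members.get(group, set())
        in_pgroup = printer in printer_groups.get(pgroup, set())
        if in_group and in_pgroup:
            found.append((f"group {group} on {pgroup}", limit))
    return found


print(applicable_quotas("smith", "oa-e108-e2"))
print(applicable_quotas("smith", "color-1"))
```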

    Built-In Charge-Back System

    Pykota also has a built in "balance" type system with the same granularity as quotas.

    Instead of quotas, each user has a running balance. Each printer or printer group you wish to charge for has an associated price_per_page and price_per_job. Pykota then decrements the user's running balance by the cost of each job. Once the balance is depleted, an administrator needs to top it up before the user can print to that printer [or group] again.
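The arithmetic behind the balance system is small enough to sketch. The function below assumes that a job is refused only once the balance is at or below zero, which is a simplification chosen for the example; the function name and the sample prices are hypothetical.

```python
from decimal import Decimal


def charge(balance, pages, price_per_page, price_per_job):
    """Return (allowed, new_balance) for one job on one printer."""
    if balance <= 0:
        # A depleted balance blocks printing until an administrator tops it up.
        return False, balance
    cost = pages * price_per_page + price_per_job
    return True, balance - cost


allowed, remaining = charge(Decimal("1.00"), 3, Decimal("0.10"), Decimal("0.05"))
print(allowed, remaining)
```

Decimal is used rather than float so that repeated small charges do not accumulate rounding errors.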

    Option to retain Job History

    Great for settling quota or balance disputes. Pykota can retain each job printed with the following information:

  • date
  • username
  • printername
  • filename
  • job size
  • price
  • number of copies
  • job options

    This is stored in the RDBMS or in LDAP, so sophisticated reports can be generated with SQL or LDAP queries.
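To give a feel for the kind of report this makes possible, here is a small sketch that totals pages per user with one SQL query. Everything in it is a stand-in: it uses an in-memory SQLite database instead of PostgreSQL so that it runs anywhere, and the job_history table, its columns, and the sample rows are invented for the example rather than taken from Pykota's real schema.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real PostgreSQL backend.
conn = sqlite3.connect(":memory:")

# Hypothetical table; Pykota's actual schema will differ.
conn.execute(
    "CREATE TABLE job_history ("
    " jobdate TEXT, username TEXT, printername TEXT,"
    " jobsize INTEGER, price REAL)"
)
conn.executemany(
    "INSERT INTO job_history VALUES (?, ?, ?, ?, ?)",
    [
        ("2004-05-03", "smith", "oa-e108-e2", 12, 0.60),
        ("2004-05-03", "jones", "oa-e108-e2", 3, 0.15),
        ("2004-05-04", "smith", "oa-e108-e2", 8, 0.40),
    ],
)


def pages_per_user(connection):
    """Total pages and job count per user, heaviest users first."""
    rows = connection.execute(
        "SELECT username, SUM(jobsize), COUNT(*) FROM job_history"
        " GROUP BY username ORDER BY SUM(jobsize) DESC"
    )
    return rows.fetchall()


for username, pages, jobs in pages_per_user(conn):
    print(f"{username}: {pages} pages in {jobs} jobs")
```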

    CUPS can also store job history information, but it does so in individual files under the /var/spool/cups directory. Report generation is cumbersome at best.

    PyKota Cons

    Complexity

    Pykota is an open source application and depends on other open source applications. It is not a "canned" solution. It will take some effort and expertise to set up and administer.

    Check out the System Requirements.

    MacOSX clients do not support CUPS authentication (yet)

    The Print Center application on the OSX client does not know how to handle an authentication request from the CUPS server. However, authenticated printing from the command line of the OSX client to the CUPS server works fine.

    As an interim measure, we configured the CUPS server to use no authentication. The OSX client still passes the username to the CUPS server for accounting. If the user does not exist in LDAP or the RDBMS, then printing is denied.

    However, this will not work for deployments where users have administrative access to their machines. Users can simply create local accounts with other users' names. Accounting would then be done against the wrong user!

    For true print accounting, authentication has to be performed as each job is processed. Word from Apple is that a UI for CUPS security features is to be added in a future release.

    Potential Load on Storage BackEnd

    To date, I know of no large institutions where pykota is deployed. There will be features lacking in pykota which are needed in large institutions. How will the quota database scale and how will pykota perform if there are tens of thousands of users and hundreds of printers to serve them?

    If you belong to this category then please share your experience with me.

    Also, you would be interested in The Pykota Process.

    No Reporting of Print Job Status to the End User

    From a Windows XP user's perspective, the job was successfully sent to the printer, but nothing actually comes out if their quota has been reached. The client does not get any error message. Users will try to print the job repeatedly and still get no output. They'll complain to the instructor and the IT staff, wasting time and resources.

    They'll find out much later that they've hit their quota when they check their email.

    As a workaround, admins can use the posthook directive to script a check on the job status after Pykota processes it. If the print job was denied, the post-hook sends a notification to the XP user.

    What mechanism is used to alert the XP client?

    This could be easily done with the Windows messenger service. Samba provides an SMB client to send WinPopUp messages. However, our institution has disabled the messenger service on XP clients due to spam abuse...


    How PyKota Stores Its Accounting Information

    PyKota can store its accounting information either in LDAP or in an RDBMS. Only the PostgreSQL RDBMS is supported at this time. However, database-independent calls are planned for a future release.

    Pykota uses 7 object types to perform its accounting. Let's look at how Pykota represents these:



    How pykota works (step by step)

    This details the inner workings of pykota:



    1 :

    - information is extracted from running environment to detect
        the printing system being used, and learn its parameters.

    - retrieve Printer from database, using the printer's name received
        from the printing system.

    - retrieve User from database, using the user's name received from
        the printing system.

    - retrieve Quota entry for this user on this printer.

    2 :

    - if any of these doesn't exist, and policy for this printer is set
        to external :

    - execute external command defined in policy, and go back to 1
          for a single additional pass.

    - continue with 3 below.

    3 :

    - if an external policy command was executed, and if any of these
        still doesn't exist, job will be rejected, nothing will appear
        in history.

    - if any of these doesn't exist and policy was set to DENY, job
        will be rejected, nothing will appear in history.

    - if any of these doesn't exist and policy was set to ALLOW, job
        will be allowed, nothing will appear in history. Continue to 6
        below.

    - if all three entries exist, normal processing will take place
        in 4 below.

    4 :

    - some environment variables are exported : user and user quota
        information, and PYKOTAPHASE with a value of "BEFORE".

    - User print quota is checked :

    - retrieve all user groups the user is a member of from database.

    - for each group found :

    - retrieve quota entry for this group on current printer and
            on all the printers groups the printer is a member of.

    - check that the group quota entry on this printer allows
            printing :

    - group quota entry's page counters, and group's balance,
              are the sum of each of the group members' page counters or
              account balances : with the LDAP backend, each group
              member is then retrieved from the database, as well
              as the member's quota entry on the current printer.
              Only one database query is needed to do this with
              the relational backend, but with the LDAP backend,
              the number of queries needed is much larger,
              depending on the size of the group.

    - if this checking tells that the job should be denied,
            testing ceases immediately.

    - if group quota checking allowed the job to pass through, the
         user quota entry is checked :

    - if user's limiting factor is "balance", the user's account
           balance is checked.

    - otherwise (i.e. user's limiting factor is "quota"), the
           user's page counters on the current printer and on all the printer
           groups the current printer is a member of are checked against the
           limits which may be defined on each of this user's quota
           entries found (current printer + all its parent printers
           recursively). Several database queries are needed to retrieve
           all the parents of the current printer, and this user's quota
           entry on each of them, from the database.
           Data retrieval ceases as soon as the user's quota entry on the
           printer or one of its parents denies printing.

    - some messaging with the user takes place if needed if either :

    - quota is reached on current printer or on its parents

    or

    - account balance is low

    5 :

    - now we know if job is to be accepted or not, exports this
        information in the PYKOTAACTION environment variable.

    - pre-hook is executed if defined.

    - accounting begins, with hardware accounters, the printer's internal
        page counter is read using the appropriate requester.

    6 :

    - if job is allowed, pass job's data to real CUPS backend.

    7 :

    - if normal processing took place (job was not allowed/denied because
        of the printer's policy) :

    - exports the PYKOTAPHASE environment variable again, with a value
          of "AFTER".

    - accounting ends, with hardware accounters, the printer's internal
          page counter is read again, using the appropriate requester.

    - if job was denied, job's size is forced to 0, else job's size
          is the difference between the two page counter's values
          (after - before), with hardware accounters. We force job's size
          to 0 because a manual intervention on the printer at the same
          time could have changed the internal page counter's value.

    - job price is computed :

    SUM((Nbpages * PricePerPage) + PricePerJob)

    for current printer and all the printers groups this printer
          is a member of, recursively.

    - user's account balance is decreased by job's price, user's page
          counters on this printer and all the printers groups this printer
          is a member of are incremented by job's size.

    - job is added to history.

    - exports two additional environment variables, PYKOTAJOBSIZE
          and PYKOTAJOBPRICE, and re-exports user and user quota entry
          information (it has changed).

    - post hook is executed if defined.

    8 :

    - Unless the real CUPS backend failed, CUPS is told all is OK and
        the job was printed successfully (even if the job was in fact
        denied).
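The policy decision in steps 2 and 3 and the size and price rules in step 7 can be condensed into a few small functions. This is a reading of the steps above, not Pykota's actual code: the function names, the return labels ("NORMAL", "REJECT", "ALLOW_UNACCOUNTED"), and the list-of-pairs representation of printer prices are all invented for the sketch.

```python
def resolve_policy(entries_exist, policy, external_fixed_it=False):
    """Steps 2-3: decide what happens before any quota checking.

    entries_exist: True if the printer, user and quota entry were all found.
    policy: "ALLOW", "DENY" or "EXTERNAL" (the printer's policy).
    external_fixed_it: True if the entries exist after the external command ran.
    """
    if entries_exist:
        return "NORMAL"
    if policy == "EXTERNAL":
        # Only one additional pass is made after the external command.
        return "NORMAL" if external_fixed_it else "REJECT"
    if policy == "ALLOW":
        return "ALLOW_UNACCOUNTED"  # printed, but nothing goes into the history
    return "REJECT"                 # policy DENY


def job_size(denied, counter_before, counter_after):
    """Step 7: a denied job is forced to 0 pages, otherwise after - before."""
    return 0 if denied else counter_after - counter_before


def job_price(pages, printers):
    """Step 7: SUM((Nbpages * PricePerPage) + PricePerJob).

    printers holds one (price_per_page, price_per_job) pair for the current
    printer and one for each printer group it belongs to.
    """
    return sum(pages * per_page + per_job for per_page, per_job in printers)


pages = job_size(False, 100, 110)
print(pages, job_price(pages, [(2, 5), (1, 0)]))
```

Integer prices are used here only to keep the example exact; a real deployment would use whatever currency unit the site charges in.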


    How PyKota Handles Page Counting

    CUPS implements its own page counting in software. It uses the pstops filter to count pages as a print job passes through it. The pstops filter is also responsible for inserting device-specific print options: duplexing, stapling, punching, etc.

    However, this method has some caveats:

    • In order for counting to be done correctly, the job has to pass through the pstops filter. Image files (jpeg, tiff, bmp, gif, etc.) do not pass through this. In this case, CUPS assigns a dummy page log of “1”

    • The page is still counted if it passes through the filter, but does not actually come out of the printer. What if the printer was down? What if the printer was jammed? -- If there is a jam while printing the 5th sheet of 1000 then the job is incorrectly counted as 1000!

    Pykota performs a more accurate page counting by hardware. It queries the page counter MIB of the printer, via SNMP, before and after the job is printed to determine the total number of pages printed.
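The before/after technique is easy to demonstrate without a real printer. In the sketch below the counter is read through an injected function, which in practice would be an SNMP query against the printer; here it is replaced by a made-up FakePrinter class that jams after a few sheets, so the example runs on its own. All names and numbers are hypothetical.

```python
def count_pages(read_counter, send_job):
    """Return pages printed, measured as the change in the printer's counter.

    read_counter: callable returning the printer's lifetime page count
                  (an SNMP query in practice; injected so the sketch runs).
    send_job:     callable that sends the job and returns when printing stops.
    """
    before = read_counter()
    send_job()
    after = read_counter()
    return after - before


class FakePrinter:
    """Test double: stops printing after sheets_before_jam sheets of a job."""

    def __init__(self, counter, sheets_before_jam):
        self.counter = counter
        self.sheets_before_jam = sheets_before_jam

    def print_job(self, pages):
        # Only the sheets that actually came out advance the counter.
        self.counter += min(pages, self.sheets_before_jam)


printer = FakePrinter(counter=52310, sheets_before_jam=4)
printed = count_pages(lambda: printer.counter, lambda: printer.print_job(1000))
print(printed)  # 4
```

A software counter would have logged all 1000 pages of this job, while the hardware difference reports only the sheets that left the printer.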


    Installing PyKota

    I've got some messed up installation notes which you can look at.

    See the Download Page to get the official source tarball. You can also download the latest edition for free via CVS.

    Or send me an email and I will be happy to send the sources to you. However, once you've evaluated it, I encourage you to purchase an official version to support its development.

    CUPS

    Design Overview

    The above diagram highlights the following subsystems that comprise CUPS.

    Scheduler

    The CUPS scheduler is a server application that handles HTTP requests. Besides processing IPP Post requests, the scheduler also acts as a full featured web server for documentation, status monitoring, and administration.

    The scheduler also manages a list of available printers on the LAN and dispatches print jobs as needed using the appropriate filters and backends.

    Configuration Files

    The CUPS configuration files are purposely similar to the Apache configuration files and define all the access control properties for the print server. The printer and class definition files are also stored here.

    Printer classes are essentially groupings of printers. If one printer in a group goes down then the jobs are still printed to the other printers. Classes are also used for load balancing where the job is printed to the first available printer.
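The "first available printer" behaviour of a class can be pictured with a few lines of code. This is only a conceptual model of the idea, not how the CUPS scheduler is implemented; the function name, printer names, and status table are invented.

```python
def pick_printer(printer_class, is_available):
    """Return the first available printer in the class, or None if all are down."""
    for name in printer_class:
        if is_available(name):
            return name
    return None


# Hypothetical class of three printers, the first of which is down.
status = {"lab-1": False, "lab-2": True, "lab-3": True}
print(pick_printer(["lab-1", "lab-2", "lab-3"], status.get))
```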

    CUPS API

    The CUPS API contains functions for queuing print jobs, getting printer information, accessing resources via HTTP and IPP, and manipulating PPD files.

    PPDs are PostScript Printer Description files (the printer driver). These files describe the capabilities of each printer.

    Filters

    Filters are responsible for translating the print job into the required output format. For example, translating an image or postscript file to print to a non-postscript printer.

    A lot of work is done on this subsystem. There is a section dedicated to filters on the next page. This was taken from Kurt Pfeifle’s excellent document – Printing with Samba 3.0

    CUPS Imaging

    The CUPS Imaging library provides functions for managing large images, doing colorspace conversion and color management, scaling images for printing, and managing raster page streams.

    Pykota

    This is where we plug in Pykota to perform its various accounting functions before the job is sent to the appropriate backend.

    Backends

    Backends are filters which are responsible for communicating with printers over different interfaces such as ipp, jetdirect, parallel, serial, socket, and usb ports.


    Samba

    Samba’s role is to mediate native Windows printing calls with the CUPS scheduler, through the CUPS API.

    Although ESP provides clients to print to CUPS directly via IPP, the added complexity of configuring Samba allows the printing system to take advantage of some Windows specific features:

    Domain Membership

    Samba can fully participate as a member of the Active Directory domain. This provides the following benefits:

  • MS workstations & users get the benefit of Single Sign On. There is no need to authenticate each time the user wants to print. Authentication and authorization are relayed to the KDC.

  • Printers can be automatically installed on the workstations through the use of netlogon scripts.

    Point & Print

    A Windows feature which lets users print to a network printer without having to manually install a new driver. When a printer is created in CUPS, the files created for the new printer also need to be copied to Samba by using the 'cupsaddsmb' utility.

    Installing CUPS/Pykota on OSX

    Here are my crappy installation notes for installing CUPS/Pykota on OSX:

    ## Install XCode Tools

    ## Install fink - http://fink.sourceforge.net

    ## Add unstable tree to fink - http://fink.sourceforge.net/faq/usage-fink.php?phpLang=en#unstable
    osxprint:/SC/ss/src root# pico /sw/etc/fink.conf
    Trees: local/main stable/main stable/crypto local/bootstrap unstable/main unstable/crypto
    osxprint:/SC/ss/src root# fink selfupdate
    osxprint:/SC/ss/src root# fink update-all
    osxprint:/SC/ss/src root# fink index

    ## Install preferred text editor
    osxprint:/SC/ss/src/pykota root# fink install jove

    ## Install OpenSSL
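    osxprint:/SC/ss/src root# tar -xzvf openssl-0.9.7d.tar.gz
    osxprint:/SC/ss/src root# cd openssl-0.9.7d
    osxprint:/SC/ss/src/openssl-0.9.7d root# ./config
    osxprint:/SC/ss/src/openssl-0.9.7d root# make
    osxprint:/SC/ss/src/openssl-0.9.7d root# make test
    osxprint:/SC/ss/src/openssl-0.9.7d root# make install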

    ## Install postgresql
    osxprint:/SC/ss/src/postgresql-7.4.2 root# fink install readline
    osxprint:/SC/ss/src/postgresql-7.4.2 root# fink install bison

    osxprint:/SC/ss/src root# tar -xzvf postgresql-7.4.2.tar.gz
    osxprint:/SC/ss/src/postgresql-7.4.2 root# export LDFLAGS=-L/sw/lib
    osxprint:/SC/ss/src/postgresql-7.4.2 root# export CPPFLAGS=-I/sw/include
    osxprint:/SC/ss/src/postgresql-7.4.2 root# ./configure --with-python --with-openssl
    osxprint:/SC/ss/src/postgresql-7.4.2 root# make
    ## Will fail with python errors
    ## see http://www.macosxguru.net/article.php?story=20031202101454698 to fix
    osxprint:/SC/ss/src/postgresql-7.4.2 root# make install

    ## Add user postgres
    osxprint:/ root# mkdir -p /SC/ss/pykota/postgres/data
    osxprint:/ root# chown postgres /SC/ss/pykota/postgres
    osxprint:/ root# chown postgres /SC/ss/pykota/postgres/data
    osxprint:/SC/ss/pykota postgres$ initdb -D /SC/ss/pykota/postgres/data



    ## Setup pygresql
    osxprint:/SC/ss/src root# tar -xzvf PyGreSQL-3.4.tgz
    osxprint:/SC/ss/src/PyGreSQL-3.4 root# jove setup.py
    {
    elif sys.platform == "darwin":
             include_dirs=['/usr/local/pgsql/include/server','/usr/local/pgsql/include']
             library_dirs=['/usr/local/pgsql/lib']
             optional_libs=['pq','crypto','ssl','krb5']
             data_files=[]
    }
    osxprint:/SC/ss/src/postgresql-7.4.2/src root# cp -rp include /usr/local/pgsql/include/server
    osxprint:/SC/ss/src/PyGreSQL-3.4 root# ./setup.py install

    ## Install mxDateTime extension for python (required by pykota)
    osxprint:/SC/ss/src root# tar -xvf egenix-mx-base-2.0.5.tar
    osxprint:/SC/ss/src root# cd egenix-mx-base-2.0.5
    osxprint:/SC/ss/src/egenix-mx-base-2.0.5 root# python setup.py install

    ## Install Pykota
    osxprint:/SC/ss/src root# tar -xzvf pykota.tar.gz
    osxprint:/SC/ss/src/pykota root# ./setup.py install

    ## Populate postgreSQL dB with PyKota tables and users
    osxprint:/SC/ss/pykota/postgres/data root# psql -U postgres template1
    template1=# \i /SC/ss/src/pykota/initscripts/postgresql/pykota-postgresql.sql

    ## Create passwords for various postgreSQL users
    pykota=# alter user postgres with password 'rootpasswd';
    ALTER USER
    pykota=# alter user pykotauser with password 'somepasswd';
    ALTER USER
    pykota=# alter user pykotaadmin with password 'somepasswd';
    ALTER USER

    ## Modify postgreSQL to use secure authentication
    osxprint:/SC/ss/pykota/postgres/data# jove pg_hba.conf

    ## Enable tcp connections in postgreSQL
    osxprint:/SC/ss/pykota/postgres/data# jove postgresql.conf

    ## Plug-In Pykota to CUPS
    osxprint:/usr/libexec/cups/backend root# ln -s /System/Library/Frameworks/Python.framework/Versions/2.3/share/pykota/cupspykota cupspykota
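Once the backend is linked in, each printer that should be accounted has to be routed through it. The notes above do not show that step. The sketch below assumes, without confirmation from these notes, that this is done by prefixing the printer's existing device URI with the backend's name (cupspykota:). Treat the prefix, the function names, and the sample URI as assumptions and check the Pykota README for your version.

```python
# Assumed scheme name, taken from the backend's file name in the symlink above.
PREFIX = "cupspykota:"


def wrap_device_uri(device_uri):
    """Prefix a CUPS device URI so the job is routed through the Pykota backend."""
    if device_uri.startswith(PREFIX):
        return device_uri  # already wrapped
    return PREFIX + device_uri


def unwrap_device_uri(device_uri):
    """Recover the real backend URI that the job is handed to after accounting."""
    if device_uri.startswith(PREFIX):
        return device_uri[len(PREFIX):]
    return device_uri


print(wrap_device_uri("socket://oa-e108-e2:9100"))
```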


    Glossary

    Raster

    Also referred to as bitmap images, these are images that are represented by a sequence of pixels (picture elements) or points, which when taken together, describe the display of an image on an output device. There are many different raster image formats in use, among them GIF, JPEG, PCX, and TIFF.

    Raster Image Processor (RIP)

    A device, consisting of hardware and software, that converts vector graphics or text into a raster (bitmapped) image. Raster image processors are used in page printers, phototypesetters, and electrostatic plotters. They compute the brightness and color value of each pixel on the page so that the resulting pattern of pixels re–creates the vector graphics and text originally described. Acronym: RIP.

    PostScript

    A page description language (PDL) developed by Adobe Systems. PostScript is primarily a language for printing documents on laser printers, but it can be adapted to produce images on other types of devices. PostScript is the standard for desktop publishing because it is supported by imagesetters, the very high-resolution printers used by service bureaus to produce camera-ready copy.

    PostScript is an object-oriented language, meaning that it treats images, including fonts, as collections of geometrical objects rather than as bit maps. PostScript fonts are called outline fonts because the outline of each character is defined. They are also called scalable fonts because their size can be changed with PostScript commands. Given a single typeface definition, a PostScript printer can thus produce a multitude of fonts. In contrast, many non-PostScript printers represent fonts with bit maps. To print a bit-mapped typeface with different sizes, these printers require a complete set of bit maps for each size.

    The principal advantage of object-oriented (vector) graphics over bit-mapped graphics is that object-oriented images take advantage of high-resolution output devices whereas bit-mapped images do not. A PostScript drawing looks much better when printed on a 600-dpi printer than on a 300-dpi printer. A bit-mapped image looks the same on both printers.

    Every PostScript printer contains a built-in interpreter that executes PostScript instructions. If your laser printer does not come with PostScript support, you may be able to purchase a cartridge that contains PostScript.

    There are three basic versions of PostScript: Level 1, Level 2 and PostScript 3. Level 2 PostScript, which was released in 1992, has better support for color printing. PostScript 3, released in 1997, supports more fonts, offers better graphics handling, and includes several features to speed up PostScript printing.

    Page Description Language

    Abbreviated as PDL, a language for describing the layout and contents of a printed page. The best-known PDLs are Adobe PostScript and Hewlett-Packard PCL (Printer Control Language), both of which are used to control laser printers.

    Both PostScript and modern versions of PCL are object-oriented, meaning that they describe a page in terms of geometrical objects such as lines, arcs, and circles.

    MIB

    Short for Management Information Base, a database of objects that can be monitored by a network management system. Both SNMP and RMON use standardized MIB formats that allow any SNMP and RMON tools to monitor any device defined by a MIB.
