-------- Examining Remote OS Detection using LPD Querying ---------
-------------------------------------------------------------------
------[ Feb 19, 2001 - by f0bic - http://www.low-level.net ]-------


Abstract

   At present there are many ways of determining ("guessing") a remote
   host's Operating System.   Some of these methods rely on properties
   of the packets a host sends,   whereas others rely on the  behavior
   of certain daemons under erroneous ("unusual")  conditions.    This
   paper describes a way of using the line printer daemon ("lpd") as a
   knowledge base with which we can determine  a  possible   Operating
   System on the remote host.


I. Introduction


   (A) Significance and Definitions of Terms used.

       [1] DETERMINE  -  "GUESSING".  Trying to guess as accurately as
                         possible what Operating  System a remote host 
                         is running.

       [2] ERRONEOUS  -  "INCORRECT".   Refers to the condition of the
                         request sent out to the  remote  host's  line
                         printer daemon.          We consider it to be
                         [erroneous] because we  are  not  using   the
                         request format specified by the RFC.
    
       [3] VALID      -  "CORRECT".       Applies to the syntax of the 
                         request that's sent out to a remote host.   A
                         correct syntax follows the rules of the  RFC. 


   (B) Line Printer Daemon Protocol Specifications.

       This  sub-section  is  based  on  the  specifications  made  in  
       RFC 1179 ("Line Printer Daemon Protocol", L. J. McLaughlin III).
       The Unix Operating System provides line printer spooling with a
       combination of various tools:

            * lpr   (assign to queue)    |
            * lpq   (display the queue)  |
            * lprm  (remove from queue)  |-> LPD (line printer daemon)
            * lpc   (control the queue)  |

       All of these tools ("programs") interact with a  daemon  called
       the line printer daemon ("LPD").      In order to "control" the
       printing functions, the various printer spooling tools  send  a
       valid line printer daemon protocol request  to the line printer
       daemon.  The shape and format of this request will be discussed 
       in further detail in the following sections.           For now, 
       we will just acknowledge that such a "request format" exists to 
       successfully send commands to the LPD.


   (C) General Concepts of Daemon-based Fingerprinting.

       Daemon-based fingerprinting relies for the  most  part  on  the 
       authenticity of a daemon on a certain platform. It is important
       to realize that when fingerprinting daemons,   you are actually
       fingerprinting at the application level,   whereas when you are
       fingerprinting the TCP/IP stack, you are fingerprinting at  the
       kernel level (default window sizes, TTLs, etc.). Therefore it is
       very important that the application (daemon, in this case)  you
       are fingerprinting is the default one installed on the  system,
       because that will help  determine  what  Operating  System  the  
       remote host is running.     A daemon of a different version, or
       with different characteristics,  might give a totally different
       OS fingerprint.

       I'm using identd for the following example  to  show  you  that 
       the authenticity of a daemon on an Operating System is  of  the
       utmost importance when fingerprinting a host:

       ident fingerprint for Red Hat 6.2

       -> "pidentd 3.0.10 2.2.5-22smp (Feb 22 2000 16:14:21)"
        
       ident fingerprint for Red Hat 7.0

       -> "pidentd 3.0.10 for Linux 2.2.5-22smp (Jul 20 2000 15:09:20)"

       The above are default identd fingerprints for Red Hat  versions
       6.2 and 7.0 respectively.   If I decide to swap identd versions 
       and run the default identd for 6.2 on Red Hat 7.0,           an 
       application-level identd fingerprinter would return Red Hat 6.2
       while the Operating System is in fact Red Hat 7.0.      This is
       definitely  a  loophole  one  should  consider  when performing
       an application-level fingerprint.
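
       To make this loophole concrete, here is a minimal sketch of an
       application-level lookup, written in Python.  The table holds
       only the two banners quoted above.  The names IDENT_BANNERS and
       guess_os_from_ident are hypothetical and exist only for this
       illustration; they do not come from any real tool.

```python
# Banner table built from the two identd strings quoted in the text.
# The dictionary and function names are illustrative assumptions.
IDENT_BANNERS = {
    "pidentd 3.0.10 2.2.5-22smp (Feb 22 2000 16:14:21)": "Red Hat 6.2",
    "pidentd 3.0.10 for Linux 2.2.5-22smp (Jul 20 2000 15:09:20)": "Red Hat 7.0",
}


def guess_os_from_ident(banner):
    """Return the OS whose default identd prints this banner, or None."""
    return IDENT_BANNERS.get(banner.strip())


# A Red Hat 7.0 host that runs the identd shipped with 6.2 still
# announces the 6.2 banner, so the lookup names the wrong release.
swapped = "pidentd 3.0.10 2.2.5-22smp (Feb 22 2000 16:14:21)"
print(guess_os_from_ident(swapped))
```

       The lookup can only ever report which daemon answered, not
       which Operating System that daemon happens to be running on.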


II. Line Printer Daemon OS Fingerprinting

    (A) Theoretical Analysis
        
        The theory behind this concept lies within the  boundaries  of 
        RFC 1179 (Line Printer Daemon Protocol). As mentioned earlier,
        there is a certain hierarchical structure within the format of
        requests sent to an LPD.   The "appropriate" message format is 
        described in RFC 1179 as follows:


           [ RFC 1179, "Section 3.1 Message Formats" ]

           "All commands begin with a single octet code, which is a
           binary number which represents the  requested  function.  
           The code is immediately followed by the  ASCII  name  of 
           the printer queue name on which the function  is  to  be 
           performed" [....] "The end of the command  is  indicated 
           with an ASCII line feed character." 
           
           
           [ RFC 1179, "Section 7 Control File Lines" ]

           "Each line of the control  file  consists  of  a  single, 
           printable ASCII character which represents a function  to 
           be performed when the file is printed.  Interpretation of
           these command characters are case-sensitive.  The rest of 
           the line after the command  character  is  the  command's 
           operand." 
           [....]
           "Some  commands  must  be included in every control file.  
           These are 'H'  (responsible host)  and  'P'  (responsible 
           user).   Additionally, there must  be  at least one lower 
           case command to produce any output."


        The excerpts above describe the  correct  message/query format
        in which a request should be structured.      This theoretical
        analysis is not concerned with the first RFC excerpt  (Message
        Formats), since we don't actually want to go  out  and  send  a
        real print request.  And we don't want any printed files coming
        out of some printer at the other end:))

        As you might have guessed, we are going  to  use  Control  File
        commands to "determine" a possible Operating System on a remote
        host.  A normal ("correct") print request would look like this:


        [We're following the syntax described in RFC 1179 here]

          +---+----------+----+
          | H | 10.0.0.2 | LF | - Command code {H} -> "source host"
          +---+----------+----+
          
          +---+----------+----+
          | P |    502   | LF | - Command code {P} -> "user id"
          +---+----------+----+

          +---+----------+----+
          | f | file.txt | LF | - Command code {f} -> "file to print"
          +---+----------+----+

	
        This would allow a file called "file.txt" to be  printed  after 
        both the source (request-originating) host and the user id have
        been verified.    Since we are fingerprinting a remote host and 
        might not have proper "source host" and "user id" to perform  a 
        valid print request, we have to rely on other means of querying
        the remote host for lpd information. 
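
        As a rough illustration, the three lines in the diagram above
        can be built as raw bytes in Python.  This is only a sketch:
        it assumes each diagram row is a command character followed
        directly by its operand and a line feed, and the function name
        build_control_file is hypothetical.  A real lpr would hand
        these lines over as a control file inside a "receive job"
        exchange, which this sketch does not attempt.

```python
def build_control_file(host, user, data_file):
    """Build the H, P and f lines from the diagram as one byte string.

    Each line is a single command character, its operand, and LF.
    """
    lines = [("H", host), ("P", user), ("f", data_file)]
    text = "".join(cmd + operand + "\n" for cmd, operand in lines)
    return text.encode("ascii")


# Values taken from the diagram in the text.
print(build_control_file("10.0.0.2", "502", "file.txt"))
```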

        Instead of sending a valid request with the correctly formatted
        syntax structure,  we will send an erroneous ("incorrect")  one
        and see how the remote LPD responds to it.     In this case  we
        will omit the authentication information  {H}  and   {P},   and
        change the {f} command to a different command to  ensure   that
        we don't get any conflicting responses:


        [We're discarding the syntax described in RFC 1179 here] 

          +---+----------+----+
          | M |   user   | LF | - Command code {M} -> "mail when printed"
          +---+----------+----+                                    


        In this scenario,  we send a malformed request to a remote LPD
        and wait for an acknowledgement.   The format and  content  of
        this acknowledgement (the error notification message)  are  in
        many cases OS-specific.     We can then build  a  database  of
        possible acknowledgements ("replies") from the lpd  and  match
        each of them up with a certain Operating System.
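
        The malformed request itself is tiny.  A minimal Python sketch
        of it follows, assuming (as the diagram suggests) that the 'M'
        command character is followed directly by the operand and a
        line feed.  The name build_probe is hypothetical.

```python
LF = b"\n"


def build_probe(operand="user"):
    """Return the lone 'M' line from the diagram: command, operand, LF.

    No H or P line precedes it, which is what makes the request
    malformed.
    """
    return b"M" + operand.encode("ascii") + LF


probe = build_probe()
print(probe)      # the raw bytes that would be put on the wire
print(probe[0])   # numeric value of the first octet
```

        The first octet is the ASCII code of 'M', which is 77.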


    (B) Practical Analysis

        To demonstrate  that  different  Operating  Systems (or rather,
        different LPDs)  reply in different ways,   I wrote  a   little
        program that shows both the differences  and  the  similarities
        between LPD fingerprints.     The program  sends  a   malformed
        request looking like this:


           +---+----------+----+
           | M |   r00t   | LF |
           +---+----------+----+                    
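
        The source of the program is not reproduced here, so what
        follows is not the original code but a minimal Python sketch
        of the same idea.  The function name probe_lpd and its
        parameters are assumptions.  Port 515 is the LPD port given
        in RFC 1179.  The RFC also asks for a source port in the
        range 721 to 731; this sketch does not bind one (that
        usually requires root), so a daemon that enforces the rule
        may refuse the connection.

```python
import socket

LPD_PORT = 515  # port on which an LPD listens, per RFC 1179


def probe_lpd(host, operand=b"r00t", port=LPD_PORT, timeout=5.0):
    """Send 'M' + operand + LF to host and return the raw reply bytes.

    Returns b"" if the daemon closes the connection without replying
    or stays silent until the timeout expires.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"M" + operand + b"\n")
        try:
            return sock.recv(1024)
        except socket.timeout:
            return b""
```

        Calling probe_lpd("192.0.2.10") (a placeholder address) would
        return whatever the remote daemon chose to answer.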

     
        The following are examples that show the  information  gathered 
        by sending the malformed request depicted above.     Here goes:


           ::(ninja)-([f0bic]--[~])$ ./lpprint XXX.XXX.4.130
           -- Connected to lpd on XXX.XXX.4.130
           Reply: Invalid protocol request (77): MMr00t     
 
           [ This is a SunOS/Solaris 5.7 box ]

     
           ::(ninja)-([f0bic]--[~])$ ./lpprint XXX.XXX.59.200
           -- Connected to lpd on XXX.XXX.59.200
           Reply: Invalid protocol request (77): MMr00t  

           [ This is a SunOS/Solaris 5.6 box ]


        Are we starting to see some similarities here?:)
        Let's try a different Operating System this time:


           ::(ninja)-([f0bic]--[~])$ ./lpprint XXX.XXX.153.2
           -- Connected to lpd on XXX.XXX.153.2
           Reply: 0781-201 ill-formed FROM address.  
        
           [ This is an AIX 4.3 box ]
 
           
           ::(ninja)-([f0bic]--[~])$ ./lpprint XXX.XXX.14.203
           -- Connected to lpd on XXX.XXX.14.203
           Reply: 0781-201 ill-formed FROM address.    

           [ This is an AIX 4.3 box ]


        We get different replies from different Operating Systems,  but
        the same Operating System returns a similar message each time.

        NOTE: Some Operating Systems (Compaq Tru64 Unix, HP-UX, and the
              like) return zero-length replies, which makes it hard  to
              distinguish one from the other.      Most OS's,  however,
              return a  message  that  is  consistent  for  that OS but
              differs from the messages of other OS's.
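
        Matching a reply against known messages can be sketched in a
        few lines of Python.  The table below holds only the replies
        shown in the examples above; the names LPD_REPLIES and
        guess_os are hypothetical, and a usable database would need
        many more entries.  Note that the (77) in the Solaris reply
        equals the ASCII code of 'M', the first octet of the probe,
        and that this reply echoes the operand, so an exact match
        only works when the same operand ("r00t") is sent.

```python
# Replies observed in the examples in the text.
LPD_REPLIES = {
    "Invalid protocol request (77): MMr00t": "SunOS/Solaris 5.6 or 5.7",
    "0781-201 ill-formed FROM address.": "AIX 4.3",
}


def guess_os(reply):
    """Map a raw LPD reply (bytes) to an OS guess (str)."""
    text = reply.decode("ascii", errors="replace").strip()
    if not text:
        # Zero-length reply: several systems behave this way, so the
        # answer cannot be narrowed down any further.
        return "unknown (zero-length reply, e.g. Compaq Tru64 Unix or HP-UX)"
    return LPD_REPLIES.get(text, "unknown")


print(guess_os(b"0781-201 ill-formed FROM address.\n"))
print(guess_os(b""))
```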


III. Proof of Concept Code


     I have also created a  "proof of concept"  tool  that  contains  a              
     database of LPD returned messages and Operating  Systems  matching
     those messages.

     This tool is available at http://www.low-level.net/ and is  called
     "lpdfp".

     Download: http://www.low-level.net/f0bic/releases/lpdfp.tar.gz


IV. References and Acknowledgements

     
    [1] RFC 1179 : Line Printer Daemon Protocol
        Network Printing Working Group
        L. McLaughlin III, 1990
        
        Available at: ftp://ftp.isi.edu/in-notes/rfc1179.txt


    [2] I'd like to thank incubus at Securax  for  letting  me
        fingerprint some of his boxes.       Also,  thanks  to
        everyone else who lent me a hand by allowing   me   to
        fingerprint their machines (you know who you are).


V. Contact Information

   f0bic@low-level.net
   http://www.low-level.net