





ICP Working Group                                             D. Wessels
Internet-Draft                           National Laboratory for Applied
                                                   Network Research/UCSD
                                                        18 December 1996
                                                    Expire in six months


                Internet Cache Protocol (ICP), version 2
                     <draft-wessels-icp-v2-01.txt>


Status of this Memo


   This document is an Internet-Draft. Internet-Drafts are working
   documents of the Internet Engineering Task Force (IETF), its areas,
   and its working groups. Note that other groups may also distribute
   working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as ``work in progress.''

   To learn the current status of any Internet-Draft, please check the
   ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow
   Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
   munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
   ftp.isi.edu (US West Coast).



Abstract


   This draft document describes the Internet Cache Protocol (ICP)
   currently implemented in a few World-Wide Web proxy cache packages.
   ICP was initially developed by Peter Danzig, et. al. at the
   Univerisity of Southern California.  It evolved as an important part
   of hierarchical caching on the Harvest research project.


Introduction


   ICP is a message format used for communicating between WWW caches.
   ICP is primarily used in a cache mesh to locate specific WWW objects
   in neighbor caches.  One cache will send an ICP query to its



Wessels                                                         [Page 1]

Internet-Draft                                          18 December 1996


   neighbors.  The neighbors will send back ICP replies indicating a
   ``HIT'' or a ``MISS.''  ICP can also be used to multiplex
   transmission of multiple object streams over a single TCP connection.

   In current practice, ICP is implemented on top of UDP, but there is
   no requirement that it be limited to UDP.  There are some functions
   which work best over UDP, and others which work best over TCP.

   In addition to its use as an object location protocol, the ICP
   messages are often used for ``cache selection.''  Failure to receive
   a reply from a cache may inidicate a network or system failure.  The
   order in which the replies arrive may be used to select the
   ``closest'' neighbor.





ICP Message Format

    0                   1                   2                   3
   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Opcode    |    Version    |        Message Length       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Request Number                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Options                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                            Padding                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Sender Host Address                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                             |
   |                            Payload                          |
   /                                                             /
   /                                                             /
   |                                                             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   NOTE: All fields must be represented in network byte order.

   Opcode
      One of the opcodes defined below.

   Version
      The ICP protocol version number.  At the time of this writing,
      both versions two and three are in use.



Wessels                                                         [Page 2]

Internet-Draft                                          18 December 1996


   Message Length
      The total length of the ICP message.

        /* We should probably place some upper limit on the size of
         * an ICP message.  8, 12, or 16 kbytes?  If the whole packet
         * wont fit into a UDP receive buffer some IP stacks will
         * deliver truncated packets, some will deliver the packet
         * over multiple read() calls.  A limit on the ICP message
         * size enforces a limit on the maximum URL length.  -DW */

   Request Number
      An opaque identifier.  When responding to a query, this value must
      be copied into the reply message.

   Options
      A bitfield of options.  See ``ICP Options'' below.

   Padding
      Four bytes of padding; a legacy from previous versions.

   Sender Host Address
      The IPv4 address of the host sending the ICP message.  This field
      should probably not be trusted over what is  provided by
      getpeername(), accept(), and recvfrom().  There is some abmiguity
      over the original purpose of this field.  In practice it is not
      used.

   Payload
      The contents of the Payload field vary depending on the Opcode,
      but most often it contains a null-terminated URL string.



ICP Opcodes


   Value    Name
   -----    -----------------
       0    ICP_OP_INVALID
       1    ICP_OP_QUERY
       2    ICP_OP_HIT
       3    ICP_OP_MISS
       4    ICP_OP_ERR
       5    ICP_OP_SEND
       6    ICP_OP_SENDA
       7    ICP_OP_DATABEG
       8    ICP_OP_DATA
       9    ICP_OP_DATAEND



Wessels                                                         [Page 3]

Internet-Draft                                          18 December 1996


      10    ICP_OP_SECHO
      11    ICP_OP_DECHO
      12    ICP_OP_PREFETCH
   13-17    UNUSED
      18    ICP_OP_MISS_POINTER
      19    ICP_OP_ADVERTISE
      20    ICP_OP_UNADVERTISE
      21    ICP_OP_RELOADING
      22    ICP_OP_DENIED
      23    ICP_OP_HIT_OBJ

   ICP_OP_INVALID
      A placeholder to detect zero-filled or malformed messages.
      ICP_OP_INVALID must never be (intentionally) sent in a message.
      ICP_OP_ERR should be used instead.

   ICP_OP_QUERY
      A query message.  NOTE this opcode has a different payload format
      than most of the others.  First is the requestor's IPv4 address,
      followed by a URL.  The Requestor Host Address is not that of the
      peer cache, but the address of the peer's client which originated
      the request.  The Requestor Host Address is often zero filled.

      Payload Format:

       0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                     Requestor Host Address                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                             |
      /                       Null-Terminated URL                   /
      /                                                             /
      |                                                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      In response to an ICP_OP_QUERY, the receipient must return one of:
      ICP_OP_HIT, ICP_OP_MISS, ICP_OP_ERR, ICP_OP_RELOADING,
      ICP_OP_DENIED, ICP_OP_HIT_OBJ, or ICP_OP_MISS_POINTER.

   ICP_OP_SECHO
      Similar to ICP_OP_QUERY, but for use in simulating a query to an
      origin server.  When ICP is used to choose the closest neighbor,
      the origin server can be included in the algorithm by bouncing an
      ICP_OP_SECHO message off it's echo port.  The payload is simply
      the null-terminated URL.

      NOTE: the echo server won't interpret the data (i.e. we could send



Wessels                                                         [Page 4]

Internet-Draft                                          18 December 1996


      it anything).  This opcode is used to tell the difference between
      a legitimate query or response, random garbage, and an echo
      response.

   ICP_OP_DECHO
      Similar to ICP_OP_QUERY, but for use in simulating a query to an
      cache which does not use ICP.  When ICP is used to choose the
      closest neighbor, a non-ICP cache can be included in the algorithm
      by bouncing an ICP_OP_DECHO message off it's echo port.  The
      payload is simply the null-terminated URL.

      NOTE: one problem with this approach is that while a system's echo
      port may be functioning perfectly, the cache software may not be
      running at all.


   One of the following six ICP opcodes are sent in response to an
   ICP_OP_QUERY message.  For each one the payload must be the null-
   terminated URL string.  Both the URL string and the Request Number
   field must be exactly the same as from the ICP_OP_QUERY message.

   ICP_OP_HIT
      Sending an ICP_OP_HIT response indicates that the requested URL
      exists in this cache.

   ICP_OP_MISS
      Sending an ICP_OP_MISS response indicates that the requested URL
      does not exist in this cache.  The querying cache may still choose
      to fetch the URL from the replying cache.

   ICP_OP_ERR
      Sending an ICP_OP_ERR indicates some kind of error in parsing or
      handling the query message.

   ICP_OP_RELOADING
      Sending an ICP_OP_RELOADING response indicates that this cache is
      up, but is in a ``startup'' mode.  A cache in startup mode may
      wish to return ICP_OP_HIT for cache hits, but not ICP_OP_MISS for
      misses.  ICP_OP_RELOADING essentially means ``I am up and running,
      but please don't fetch this URL from me now.''

      Note, ICP_OP_RELOADING has a different meaning than ICP_OP_MISS.
      The ICP_OP_MISS is an invitation to fetch the URL from the
      replying cache (in a parent-child relationship), but
      ICP_OP_RELOADING is a request to NOT fetch the URL from the
      replying cache.

   ICP_OP_DENIED



Wessels                                                         [Page 5]

Internet-Draft                                          18 December 1996


      Sending an ICP_OP_DENIED response indicates that the querying site
      is not allowed to retrieve the named object from this cache.
      Caches and proxies may implement complex access controls.  This
      reply can only be interpreted to mean ``you are not allowed to
      request this particular URL from me at this particular time.''

      Caches which receive a high percentage of ICP_OP_DENIED replies
      are probably misconfigured.  Caches should track the ICP_OP_DENIED
      percentage and disable a peer which exceeds a certain threshold
      (e.g. 95%).

   ICP_OP_HIT_OBJ
      Just like an ICP_OP_HIT response, but the actual object data has
      been included in this reply message.   Many requested objects are
      small enough that it makes sense to include them in the query
      response and avoid the need to make a subsequent HTTP request for
      the object.

      ICP_OP_HIT_OBJ has some negative side effects which make its use
      undesirable.  It transfers object data without HTTP and therefore
      bypasses the standard HTTP processing, including authorization and
      age validation.  For these reasons, use of ICP_OP_HIT_OBJ is NOT
      recommended.

      An ICP_OP_HIT_OBJ reply will be sent only if the ICP_FLAG_HIT_OBJ
      flag is set in the query message Options field.

      Payload Format:

       0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                                                             |
      /                       Null-Terminated URL                   /
      /                                                             /
      |                                                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |         Object Size           |                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                             |
      |                                                             |
      /                          Object Data                        /
      /                                                             /
      |                                                             |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

      The protocol does not impose any limits on the maximum size of the
      Object Data.  Software that the author is familiar with will use
      ICP_OP_HIT_OBJ over ICP_OP_HIT if the entire message is less than



Wessels                                                         [Page 6]

Internet-Draft                                          18 December 1996


      the system's maximum UDP transmission size.

      The receiving application should check to make sure it actually
      receives Object Size bytes of data.  If it does not, then it
      should treat the ICP_OP_HIT_OBJ reply as though it were a normal
      ICP_OP_HIT.

      NOTE: the Object Size field does not necessarily begin on a 32-bit
      boundary as shown in the diagram above.  It begins immediately
      following the NULL byte of the URL string.

   The following six opcodes have historically been used for
   experimenting with multiplexing object transmission over a TCP
   connection.  At the time of this writing, there are no functional
   proxy/cache implementations using these opcodes.  They are only
   mentioned here because they still exist in some source code.  Any
   future implementations of cache-cache multiplexing will likely need
   to redefine these.

   ICP_OP_PREFETCH
      Work-in-progress.  Reserved for folks working with 'Wcol'.

   ICP_OP_MISS_POINTER
      This message type is only valid as a response to ICP_OP_QUERY with
      the ICP_FLAG_POINTER option set.  An ICP_OP_MISS_POINTER response
      indicates that the responder does not have the requested URL, but
      that it knows of at least one third party cache which is likely to
      hold the object.  The ICP_OP_MISS_POINTER payload consists of a
      list of (IPv4) addresses of third party caches.  The URL is NOT
      included in the payload.  The receiving site can determine request
      from the unique reqnum field.

      Payload Format:

       0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                 Intermediate Cache Address                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |                 Intermediate Cache Address                  |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      /                                                             /
      /                                                             /
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+


   ICP_OP_ADVERTISE
      This message is an announcement that the sender holds a specific



Wessels                                                         [Page 7]

Internet-Draft                                          18 December 1996


      URL, which is given in the payload.  There is no reply for this
      message.

      /*
       * Should the 'padding' field be used to hold a TTL?
       * -DW
       */

   ICP_OP_UNADVERTISE
      This message is an announcement that the sender no longer holds a
      specific URL, which is given in the payload.  There is no reply
      for this message.


   ICP_OP_SEND
      Request for object data (non authoritative).  TO BE REMOVED:
      UNIMPLEMENTED, SUPERCEEDED BY HTTP/1.1

   ICP_OP_SENDA
      Request for object data (authoritative).  TO BE REMOVED:
      UNIMPLEMENTED, SUPERCEEDED BY HTTP/1.1

   ICP_OP_DATABEG
      Beginning of data transmission.  TO BE REMOVED: UNIMPLEMENTED,
      SUPERCEEDED BY HTTP/1.1

   ICP_OP_DATA
      Middle of data transmission.  TO BE REMOVED: UNIMPLEMENTED,
      SUPERCEEDED BY HTTP/1.1

   ICP_OP_DATAEND
      End of data transmission.  TO BE REMOVED: UNIMPLEMENTED,
      SUPERCEEDED BY HTTP/1.1





ICP Options


   0x80000000  ICP_FLAG_HIT_OBJ.  This flag is set in an ICP_OP_QUERY
   message indicating that it is okay to respond with an ICP_OP_HIT_OBJ
   message if the object data will fit.

   0x40000000  ICP_FLAG_NETDB_GUNK.  Work-in-progress for Squid.

   0x20000000  ICP_FLAG_POINTER.  This flag is set in an ICP_OP_QUERY



Wessels                                                         [Page 8]

Internet-Draft                                          18 December 1996


   message indicating that it is OK to respond with an
   ICP_OP_MISS_POINTER message if the object is not held locally but is
   known to have been held by a third party within recent history.

   0x10000000  ICP_FLAG_PREADVERTISE.  This flag is set in an
   ICP_OP_QUERY message indicating that the requesting party is pre-
   advertising the fact that it holds this URL.  This is an
   optimisation, combining the ICP_OP_ADVERTISE and ICP_OP_QUERY
   messages together, because the requesting party intends to fetch this
   URL itself if it is unable to discover it in a neighbor cache.  A
   further implication of this flag is that the requestor's operations
   policy allows it to offer this URL to other (some, but not
   necessarily all) third party caches.

   0x08000000  ICP_FLAG_MD5_KEY.  Instead of a URL in the payload, use
   an MD5 hash of the URL.



Missing Features

   Note the lack of an HTTP method in the ICP query.  The current
   version can only be used for GET requests.


Security Considerations

   Security is an issue with ICP over UDP because of its connectionless
   nature.  A proxy/cache must only process ICP messages received from
   its known neighbors.  Messages from unknown addresses must be
   discarded.

   The ICP_OP_HIT_OBJ message is especially senstive to security issues
   since it contains actual object data in the response.  In combination
   with IP spoofing it is most likely possible to pollute the cache with
   invalid objects.


Appending A -- Why ICP is useful.


           ___________________________________________
          (                                           )
          (          T H E   I N T E R N E T          )
          (___________________________________________)
              /                |                \
             /                 |                 \
            /                  |                  \



Wessels                                                         [Page 9]

Internet-Draft                                          18 December 1996


           /                   |                   \
          /                    |                    \
       SLOW                   SLOW                  SLOW
        /                      |                       \
       /                       |                        \
  -----------              -----------              -----------
 |           |            |           |            |           |
 |   ISP 1   | <--FAST--> |   ISP 2   | <--FAST--> |   ISP 3   |
 |           |            |           |            |           |
  -----------              -----------              -----------
    |  |  |                  |  |  |                  |  |  |
    |  |  |                  |  |  |                  |  |  |
    |  |  |                  |  |  |                  |  |  |
     USERS                    USERS                    USERS

Author's  Address:

   Duane Wessels
   National Laboratory for Applied Network Research
   wessels@nlanr.net

   K Claffy
   National Laboratory for Applied Network Research
   kc@nlanr.net



























Wessels                                                        [Page 10]
