DESIGN NOTES

----------
Sanity Checking and Error Propagation
----------
  * An error is an unexpected problem. Generally errors happen at a low level
  (ie within functions that appear as leaves in the call tree), and generally
  result from failures in external functions (kmalloc, etc). These problem
  needs to be caught, and notification of the event needs to be propagated
  upward and dealt with according to the error return policy (below).
  * I've decided that sanity checks should reside at a medium to high level,
  and should be explicit. That is, instead of automatically sanity checking
  at a low level (in the create* struct-creation-functions, for example),
  we do so explicitlly (by calling a sanityCheck* function) and at a higher
  level (within the function that calls the create* function). Although this
  makes it easier to forget to do the sanity check, it allows us to catch
  problems earlier, decide when we want to sanity check and when we know it's
  not necessary, and to gain more information about why the sanity check
  failed (the sanityCheck* function might return a number of error codes,
  while the associated create* function (for example) might only be able to
  return NULL). Further, many checks must happen at a high to medium level,
  and we therefore retain consistency across the range of sanity checks by
  pushing those that could reside at a low level up to a higher level.

  * There are two major types of sanity checking we need to be concerned
  with: Sanity checking of user/application input (which, eventually, will
  all go into the add/peekAndStrip tables) and sanity checking of incoming
  packet data.

--- Errors
  * Error return policy: The concern is that exiting functions (through
  return statements resulting from error conditions) might introduce bugs
  (including memory leaks) into otherwise bug free code. The policy to prevent
  this is simple enough: When returning from a function in error, make sure
  to undo everything done up to that point. This includes free'ing memory,
  undoing modifications to global data structures, and undoing changes to
  arguments passed into the function.

  * I have never liked return value checking of the form: if (functionCall() <
  0){ errorHandling }, or, even worse: if ((returnValue = functionCall()) ==
  NULL) { errorHandling }. I feel that it buries the function call (which is
  the most important thing if you are trying to read the code and understand
  what it does).
  * I have therefore begun using if (returnedErrorStatus(functionCall())) {
  errorHandling }. This isn't a perfect solution, but I feel it makes the
  code more readable. It is still possible to assign the return value to
  a variable (especially when the return value is a pointer) and then use
  these functions to check if the function returned an error.

--- Sanity Checking by Function
  * Check on input: This includes incoming packets and application data
  passed through interface. All of the annotation strip, peek, and add
  functions (addAnnotations, peekAndStripDeannotations, addPrefixAnnotations,
  peekAndStripTypeDeannotations, and stripAllAnnotations) should be
  inspected. If I implement a /proc interface, the data passed through
  proc will need to be sanity checked. If I provide an API, it will likely
  be necessary to provide sanity checking wrapper functions around the
  existing data structure manipulation functions.
  * Ensure our output is sane: addSingleAnnotationToPacketSet
  and writeAnnotationsToPacket need to make sure the data will
  fit in the packet (most/all of this funtionality will go in
  addSingleAnnotationToPacketSet, which should provide a sane
  packetAnnotationSet to writeAnnotationsToPacket).
  * All of the bit/byte manipulation functions are fine (too low level
  to do any meaningful checking), all of the struct creation/destruction
  functions are fine (any data passed to them should be checked on input
  through the interface), and all of the data structure init/cleanup and
  register/unregister functions are fine (the arguments to these functions
  should have already been checked).

--- Packet Data
  * Currently, when a sanity check on the packet data fails, we just pass
  the packet along without modification. The assumption is that the packet
  has been incorrectly identified as being annotated. It may be possible to
  implement more intelligent behavior in this area in the future.

  * There are two classes of sanity checks here: One tries to prevent the
  module from crashing (by accessing out of bounds memory, for example), while
  the other tries to detect non-annotated packets that have been mistakenly
  identified as annotated, or maliciously constructed packets. We can gurantee
  the first on-the-fly. The second is harder. We can improve on the on-the-fly
  checks by doing a pre-scan (although this will hit performance). Still,
  a malicious packet can get bad data to the application (which should be
  doing its own sanity checks).

  * On-The-Fly
    * Ensure that the packet payload is long enough to contain an annotation
    header (this is done in packetIsAnnotated). This will guarantee that we
    don't step out of bounds when accessing the signature bits, index length,
    and annotations text length fields of the header.
    * Check that the sum of the lengths of the index, annotationsText,
    and annotation header do not exceed the length of the received IP payload.
    * Check that the index has length greater than 0 (otherwise, the packet
    shouldn't be annotated).
    * In combination with the length summation check, we can ensure that each
    access into the index is in bounds by checking that the offset into the
    index is less than the length of the index. Similarly, we can ensure that
    each access into the annotations text section is in bounds by checking
    that each offset is less than the length of the annotations text section.
    * Check that the offsets into the annotations text section are never
    decreasing; currentOffset should always be greater than or equal to
    previousOffset.

  * Pre-Scan. We could perform these on the fly, but they would be most
  useful if performed before even touching any of the actual data. Of course,
  we would pay a performance penalty for doing this.
    * In addition to all of the on-the-fly checks:
    * Check that the annotation count in the header equals the number of
    annotations actually seen.
    * The way the code is set up, we will always reach the end of the index,
    unless we see some other type of error. Therefore, we we reach the end
    of the index, check that the current offset is actually equal to the
    specified index length and that the offset of the last annotation equals
    the specified annotations length text.

  * We could help prevent against misidentified packets by including a
  third length field in the header: The non-annotation data length. The
  sum of this, and the lengths of the annotation text, the index, and the
  annotation header should equal the length of the received data section. If
  not, then we have a misidentified packet.
  * We could also use the non-annotation data length to do a check against
  the TCP or UDP checksum.

----------
(De)Annotation Descriptors
----------
  * With possible future improvements to the data structures in mind, I have
  made the decision to:
    * Decouple AD -> (de)annotation mappings from the main annotation
    table. When considering these two structures implemented as trees or
    hashtables, it becomes pretty clear that each is just an index into the
    "database" of (de)annotations in memory. It makes sense, therefore,
    to have one for quick access based on AD, and one based on some other
    property of the (de)annotation (type, prefix, etc).
    * The best way to map ADs to (de)annotations is to associate each AD with
    the memory address of (a pointer to) its (de)annotation. This will give
    us access to all of the information contained in the (de)annotation,
    which should make lookup/removal of that (de)annotation from the main
    table quick/easy.
  * For now, I plan to just implement the AD -> (de)annotation mapping as
  an array (without correlation between the index and the AD) and do linear
  search to find the memory address, then a linear search through the main
  (de)annotation table to find the index to remove.
  * Under this approach, it will be possible to have monatonically increasing
  ADs (modulo the max value). This might be valuable.

  * Deannotations, like annotations, should be associated with annotation
  descriptors; This will allow them to be cancelled.

  * The valid range for annotation descriptor numbers is 0 to (INT_MAX -
  1). Only half the range of an unsigned int is used so that functions that
  return descriptors may also return error (negative) values, in the case
  of a problem.

  * When an ANN_ONCE annotation (or, in the future, DE_ANN_ONCE deannotation)
  is removed from the (de)annotation table because it has been executed,
  I plan to leave the associated annotation descriptor in place. This will
  allow the unregister function to indicate that the descriptor is known,
  but the associated (de)annotation has dissapeared (which the user can
  couple with information about its duration (ie if it was an (DE_)ANN_ONCE
  (de)annotation)).

----------
Annotation Priority
----------
  * Currently, I try to fit as many annotations as possible into each packet,
  skipping over those that don't fit. This will give preference based on the
  order in which we look at annotation tables, then the order we registered
  annotations in each table, then the order we added single annotations to
  the pools. Short annotations will also gain some priority (as they are
  more likely to fit in near-full packets than longer annotations).
  * If necessary, we can add in code to prioritize annotations: the order in
  which the add tables are considered is pretty easy to change. The elements
  in each add table might then be ordered by priority, or could be accessed
  through a pointer array ordered by priority, or through a function that
  does the ordering/accessing.

----------
Dynamic Memory Allocation
----------
  * All structs must be created through constructor functions.
  * All structs with dynamically allocated data must have a corresponding
  destructor function and it must have as many kfree's as the constructor
  has kmalloc's.
  * All constructors must initialize all of the fields of the struct they
  are creating.
  * All static local variables must be initialized to something (0 or NULL;
  A NULL pointer dereference will generate a meaningful oops message,
  whereas a random dereference may do anything).
  * As soon as the memory for a struct is allocated, it should be zero'ed
  (using memset(), for example). This is generally a good policy and will
  aid with resilient destroy* functions.

----------
Annotation Pools
----------
  * The API (kernel or user space) currently does not provide a way to
  create an annotation with an annotation pool that contains more than one
  singleAnnotation.
  * This limitation was imposed for security / simplicity purposes; Variable
  argument count functions are scary from a security perspective and sanity
  checking each argument, without the help of strong typing, is complicated.
  * Ideally, either a secure way to create pools will be implemented (not an
  overwhelming task), or the idea of pools will be removed altogether. Likely,
  neither will happen.

----------
Indexing
----------
  * Because we start at packet->tail (and work backwards) and packet->tail
  points one past the last byte of valid payload data (ie it points to the
  first byte of free space at the end of the payload), access to the packet
  is essentially 1 indexed. As a result, it is legal to access indices less
  than _or equal_ to the length of the region.

----------
Insertion and Removal of Table Entries
----------
  * I had originally planned to combine the annotation pool members of "like"
  add and peekAndStrip table entries such that each table would only have
  one entry for a particular destination/type/etc. However, this plan is
  complicated by the duration field of the add table entries and by the lack
  of callback pools in the peekAndStrip table entries. That is, for example,
  how would we combine an add table entry that has duration ANN_ONCE with
  one that has duration ANN_ALL, even if all the rest of their information
  (destination, etc) is the same? The solution for now is to simply append
  new entries to the end of the add and peekAndStrip tables and to search
  through all entries when adding and stripping and peeking annotations.
  * Instead of only removing an ANN_ONCE annotation when every annotation
  in the associated pool is sent out with a single packet (as is currently
  done), I'm going to remove annotations from the pool as soon as they are
  sent out. When the pool is empty, the entire annotation may be removed. This
  transition has been made in the code.
  * I'm not going to make any attempt to combine prefixAnnotations;
  We must search through the entire table anyways (to ensure we have hit
  all longest-prefix matches), so there is no point in taking on the added
  complexity (especially in the context of annotation descriptors) to combine
  them. Further, any future longest-prefix data structure will have to deal
  with multiple matches and should be able to accomodate multiple entries
  with the same prefix. Similarly, I'm not going to do any combining of
  deannotations (or callback pools, etc). For now I will just search through
  the entire table. Future, more efficient data structures (tree, hashtable,
  etc) should make it easy to find all matching deannotations quickly.

----------
pointerArray's
----------
  * I use struct pointerArray instead of actual arrays of pointers because
    * Arrays of pointers are messy, especially when passing as arguments and
    trying to figure out their size, etc.
    * They are very explicit.
    * We get another level of indirection. This allows us to pass in a
    pointer to a pointerArray struct (into a function, for example), and
    that function can increase or decrease the size of the actual array of
    pointers. If we wanted to do this without pointerArray's, we would need
    to pass in a pointer to an array of pointers (complicated).
    * We can use them for annotation pools.
  * There are some remaining opportunities for code reduction / code
  combination (registerAddPrefix and registerStripType, initializeAddPrefix
  and initializeStripTypes, and initializeAddPrefixDescriptors and
  initializeStripTypeDescriptors could be combined). I have left them separate
  as a sort of abstraction, which might be useful in the case that the major
  (de)annotation tables are reformed to use a different underlying structure.


Nick Neely (nneely@berkeley.edu), 04-16-2006