'bound' is a program which calculates bounds on the true error rate of
a learned hypothesis.  It has several modes of operation and
automatically switches between them depending on the information given
to it.

All the modes work in the same general form: you give the program
information, and it reports the information it is using along with a
true error bound.

The simplest mode is useful for calculating a holdout bound.  Here is
an example:

11:32PM z-12: echo "test_examples 600
test_errors 192
delta 0.025" | bound
Applying varying approximation tail bound
delta 0.025
lower_delta 0.5
test_examples 600
test_errors 192
approximation automatic
true_error = 0.32 0.358976827934 0.31926717516

The final line of the output is the observed error rate followed by an
upper and a lower bound on the true error rate.
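
Incidentally, the upper bound in this example can be nearly reproduced
by inverting the binomial tail: the largest true error rate p at which
observing 192 or fewer errors out of 600 still has probability at
least delta.  (That this is what the exact/automatic mode computes is
my reading of the output, not a documented specification.)  A Python
sketch:

```python
import math

def log_binom_pmf(k, n, p):
    # log P(X = k) for X ~ Binomial(n, p), via lgamma for stability
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))

def binom_cdf(k, n, p):
    # P(X <= k) for X ~ Binomial(n, p)
    return sum(math.exp(log_binom_pmf(i, n, p)) for i in range(k + 1))

def holdout_upper_bound(errors, n, delta):
    # Largest p with P(Bin(n, p) <= errors) >= delta, found by bisection.
    lo, hi = errors / n, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if binom_cdf(errors, n, mid) >= delta:
            lo = mid
        else:
            hi = mid
    return lo

print(holdout_upper_bound(192, 600, 0.025))  # close to 0.358976827934
```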

Here is another example:

11:35PM z-14: cat test_error               
test_examples 800
test_errors 192
delta 0.025
lower_delta 0.025
11:35PM z-15: bound test_error             
Applying varying approximation tail bound
delta 0.025
lower_delta 0.025
test_examples 800
test_errors 192
approximation automatic
true_error = 0.24 0.28252880089 0.20065891277
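
Both bounds in this run can be reproduced quite closely by inverting
the Bernoulli relative entropy KL(q||p): the upper bound is the
largest p above the observed rate with m*KL(0.24||p) <= ln(1/delta),
and the lower bound is the smallest p below it with m*KL(0.24||p) <=
ln(1/lower_delta).  (That the automatic mode used the relative entropy
approximation here is my inference from the numbers, not something the
program states.)  A Python sketch:

```python
import math

def kl(q, p):
    # Bernoulli relative entropy KL(q || p), with the 0 log 0 = 0 convention
    def term(a, b):
        return 0.0 if a == 0.0 else a * math.log(a / b)
    return term(q, p) + term(1.0 - q, 1.0 - p)

def kl_invert(err_rate, n, delta, upper):
    # Solve n * KL(err_rate || p) = ln(1/delta) for p by bisection,
    # searching above err_rate for an upper bound, below it for a lower.
    target = math.log(1.0 / delta) / n
    lo, hi = (err_rate, 1.0) if upper else (0.0, err_rate)
    for _ in range(60):
        mid = (lo + hi) / 2.0
        inside = kl(err_rate, mid) <= target
        if upper == inside:
            lo = mid
        else:
            hi = mid
    return lo if upper else hi

print(kl_invert(0.24, 800, 0.025, upper=True))   # near 0.28252880089
print(kl_invert(0.24, 800, 0.025, upper=False))  # near 0.20065891277
```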

Let's define the meaning of the input/output language.  Valid inputs
are newline-separated lines, each an identifier followed by a value.

test_examples <integer>	
	The number of holdout (or test) examples

test_errors <integer>
	The number of errors encountered in making predictions on the
set of test examples.

delta <probability>
	A number between 0 and 1: a bound on the probability that the
upper bound returned is incorrect.

lower_delta <probability>
	A number between 0 and 1: a bound on the probability that the
lower bound returned is incorrect.

approximation {hoeffding|relative_entropy|exact|automatic}
	This option controls how much computation the program does in
constructing high-confidence true error bounds; more computation
generally yields a tighter bound.
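
For a sense of scale: the hoeffding choice is (presumably; the manual
does not spell out the formulas) the standard closed-form bound, cheap
to compute but loose.  On the first example above it gives roughly
0.375 where the reported bound was about 0.359:

```python
import math

def hoeffding_upper(err_rate, n, delta):
    # Closed-form Hoeffding bound: observed rate plus a sqrt(log/n) slack.
    return err_rate + math.sqrt(math.log(1.0 / delta) / (2.0 * n))

print(hoeffding_upper(0.32, 600, 0.025))  # about 0.3754
```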

The remainder of the options are for training set based bounds only.

train_examples <integer>
	The number of training examples.

train_errors <integer>
	The number of errors on the training examples.

log_prior <float>
	Log (base e) of the "prior" probability of a hypothesis.  This
"prior" must be picked _before_ training or the bound may be invalid;
however, any "prior" picked beforehand yields a valid bound.

log_hypothesis_size <float>
	Log (base e) of the size of the hypothesis space.  This option
and log_prior can _both_ be active.  This allows, for example,
structural risk minimization on hypothesis spaces with a prior.

The following two options are useful for the shell bound.

error_log_count <int> <float>
	<float> should be the log (base e) of the number of hypotheses
with <int> empirical error.

sample_space_log_size <float>
	<float> should be the log (base e) of the size of the
hypothesis space sampled from uniformly when the evaluation is
inexact.

The program supports several modes of operation:
- holdout bound
- training set bound
	- Occam's Razor, SRM, Microchoice, and others which
		use only the training error rate.
	- Shell bound (which uses the distribution of training error
		rates across the hypothesis space)
	- Sampling Shell bound (a weakened form of the shell bound
		which uses training set error on random _samples_ from
		the hypothesis space)
- Combined holdout and training set bound
	- Holdout + Occam's Razor, SRM, Microchoice, etc.
	- Holdout + Shell 
	- Holdout + Sampling Shell

Each of these modes can work with various levels of approximation,
which can be particularly important for the 'combined' and 'shell'
bounds.

Here are some more examples of the program in operation.

Let p(h) be the prior probability of a hypothesis and suppose log_e
p(h) = -50.  If we had 100 training examples, and observed 3 training
errors, we might use the following input:
12:07AM z-26: cat train_error              
train_examples 100
train_errors 3
log_prior -50
delta 0.3
12:07AM z-27: bound train_error            
Applying varying approximation tail bound
delta 0.3
lower_delta 0.5
train_examples 100
train_errors 3
log_hypothesis_size 0
log_prior -50
approximation automatic
true_error = 0.03 0.466502545401 9.31322574615e-10
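
A plausible reconstruction of this bound (again my reading, not a
documented specification) is a union-bound version of the holdout
calculation: invert the same binomial tail, but at confidence
delta * e^(log_prior - log_hypothesis_size), so that summing the
failure probability over hypotheses weighted by the prior spends delta
in total:

```python
import math

def log_binom_pmf(k, n, p):
    # log P(X = k) for X ~ Binomial(n, p), via lgamma for stability
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))

def binom_cdf(k, n, p):
    # P(X <= k); float handles even astronomically small results here
    return sum(math.exp(log_binom_pmf(i, n, p)) for i in range(k + 1))

def occam_upper_bound(errors, n, delta, log_prior, log_hypothesis_size=0.0):
    # Largest p with P(Bin(n, p) <= errors) >= delta * e^(log_prior - size)
    delta_eff = delta * math.exp(log_prior - log_hypothesis_size)
    lo, hi = errors / n, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if binom_cdf(errors, n, mid) >= delta_eff:
            lo = mid
        else:
            hi = mid
    return lo

print(occam_upper_bound(3, 100, 0.3, -50.0))  # near 0.466502545401
```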

Suppose we had the same scenario as before except we wanted to use the
shell bound.  Suppose we also knew that e^5 hypotheses had training
error 25 and e^30 had training error 50.  Then, we might use this
information:
12:07AM z-28: cat shell_error              
train_examples 100
train_errors 3
error_log_count 3 1
error_log_count 25 5
error_log_count 50 30
delta 0.3
12:07AM z-29: bound shell_error            
Applying varying approximation tail bound
Applying shell bound
delta 0.3
lower_delta 0.5
train_examples 100
error_log_count 3 1 
error_log_count 25 5 
error_log_count 50 30 
approximation automatic
true_error = 0.03 0.144696196541 0.0266506755725
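
To see what the shell structure buys, compare against a deliberately
crude alternative (a strawman for illustration, not the shell bound):
union-bound uniformly over all e^1 + e^5 + e^30 counted hypotheses and
invert the binomial tail at delta divided by that total.  This ignores
the fact that most hypotheses have high empirical error, and the
resulting upper bound is far weaker than the shell bound's 0.1447:

```python
import math

def log_binom_pmf(k, n, p):
    # log P(X = k) for X ~ Binomial(n, p), via lgamma for stability
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))

def binom_cdf(k, n, p):
    # P(X <= k) for X ~ Binomial(n, p)
    return sum(math.exp(log_binom_pmf(i, n, p)) for i in range(k + 1))

def log_sum_exp(xs):
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def crude_union_upper(errors, n, delta, error_log_counts):
    # Give every hypothesis an equal delta / total_count share, then
    # invert the binomial tail at that reduced confidence.
    log_total = log_sum_exp([lc for _, lc in error_log_counts])
    delta_eff = delta * math.exp(-log_total)
    lo, hi = errors / n, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if binom_cdf(errors, n, mid) >= delta_eff:
            lo = mid
        else:
            hi = mid
    return lo

print(crude_union_upper(3, 100, 0.3, [(3, 1.0), (25, 5.0), (50, 30.0)]))
```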

Now, suppose that we did not have time to calculate the training error
rate of every hypothesis and instead sampled e^5 + e^10 times
observing e^5 hypotheses with training error 25 and e^10 with training
error 50. 

12:23AM z-36: cat sampling_shell_error     
train_examples 100
train_errors 3
error_log_count 3 1
error_log_count 25 5
error_log_count 50 10
sample_space_log_size 30
delta 0.3
12:23AM z-37: bound sampling_shell_error   
Applying varying approximation tail bound
Applying shell bound
delta 0.3
lower_delta 0.5
train_examples 100
sample_space_log_size 30
error_log_count 3 1 
error_log_count 25 5 
error_log_count 50 10 
approximation automatic
true_error = 0.03 0.405203092843 0.0266506755725

Note that the results are noticeably more pessimistic simply because
of the sampling.

Now, suppose that we had a holdout set but also wanted to benefit from
a training set.  We simply supply the information required for both
bounds and a combined bound will be used.


12:25AM z-42: cat train_n_test_error       
test_examples 10
test_errors 5
delta 0.3
train_examples 20
train_errors 0
log_prior -1
 1:16AM z-43: bound train_n_test_error     
Applying varying approximation tail bound
delta 0.3
lower_delta 0.5
test_examples 10
test_errors 5
train_examples 20
train_errors 0
log_hypothesis_size 0
log_prior -1
approximation automatic
true_error = 0.5 0.104335444048 0.369755505584
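
As a sanity check on the training side of this combined bound: with
zero training errors the binomial inversion has a closed form, since
P(0 errors out of m) = (1-p)^m.  Using the same delta * e^(log_prior)
confidence as in the earlier sketches, the largest consistent p is
1 - (delta * e^(log_prior))^(1/m), which comes out very near the upper
bound the program reports:

```python
import math

# Closed-form inversion for the zero-training-error case:
# largest p with (1 - p)^m >= delta * e^(log_prior).
def zero_error_upper(m, delta, log_prior):
    return 1.0 - (delta * math.exp(log_prior)) ** (1.0 / m)

print(zero_error_upper(20, 0.3, -1.0))  # near 0.104335444048
```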
