# Transductive Conformal Predictor

## Definition

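We assume that Reality outputs successive pairs

$$(x_1, y_1), (x_2, y_2), \ldots$$

called *examples*. The *objects* $x_i$ are elements of a measurable space $\mathbf{X}$ and the *labels* $y_i$ are elements of a measurable space $\mathbf{Y}$.

We call $\mathbf{Z} := \mathbf{X} \times \mathbf{Y}$ the *example space*, $\epsilon \in (0,1)$ the *significance level*, and the complementary value $1 - \epsilon$ the *confidence level*.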

A *nonconformity measure* is a measurable mapping

$$A : \mathbf{Z}^{(*)} \times \mathbf{Z} \to [-\infty, +\infty],$$

where $\mathbf{Z}^{(*)}$ is the set of all bags (multisets) of elements of $\mathbf{Z}$. Intuitively, this function assigns a numerical score (sometimes called the *nonconformity score*) indicating how different a new example is from a set of old ones.
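As an illustration only, here is a minimal sketch of one possible nonconformity measure. It assumes real-valued objects and string labels, and scores an example by the distance from its object to the mean of the same-label objects in the bag. The names `Example` and `nonconformity` are hypothetical, and this particular measure is an assumption made for the example, not one prescribed by the definition.

```python
from typing import List, Tuple

# Assumption: an example is a pair (object, label) with a real-valued object.
Example = Tuple[float, str]


def nonconformity(bag: List[Example], example: Example) -> float:
    """Score how strange `example` looks relative to the examples in `bag`.

    The score is the distance from the object to the mean of the objects
    in the bag that share the example's label. It is +inf when the bag
    contains no example with that label.
    """
    x, y = example
    same_label = [xi for xi, yi in bag if yi == y]
    if not same_label:
        return float("inf")
    return abs(x - sum(same_label) / len(same_label))


bag = [(1.0, "a"), (1.2, "a"), (5.0, "b")]
typical = nonconformity(bag, (1.1, "a"))  # object sits among the "a" objects
strange = nonconformity(bag, (1.1, "b"))  # object is far from the "b" objects
print(typical < strange)
```

The bag is passed as a list for convenience, but the score depends only on its contents and not on their order, as the definition requires.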

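A *transductive conformal predictor (TCM) determined by a nonconformity measure $A$* is a confidence predictor

$$\Gamma : \mathbf{Z}^* \times \mathbf{X} \times (0,1) \to 2^{\mathbf{Y}}$$

($2^{\mathbf{Y}}$ is a set of all subsets of $\mathbf{Y}$) obtained by setting

$$\Gamma^\epsilon(x_1, y_1, \ldots, x_{n-1}, y_{n-1}, x_n)$$

equal to the set of all labels $y \in \mathbf{Y}$ such that

$$\frac{|\{i = 1, \ldots, n : \alpha_i \ge \alpha_n\}|}{n} > \epsilon,$$

where

$$\alpha_i := A\bigl(⟅(x_1, y_1), \ldots, (x_{i-1}, y_{i-1}), (x_{i+1}, y_{i+1}), \ldots, (x_{n-1}, y_{n-1}), (x_n, y)⟆, (x_i, y_i)\bigr), \quad i = 1, \ldots, n-1,$$

$$\alpha_n := A\bigl(⟅(x_1, y_1), \ldots, (x_{n-1}, y_{n-1})⟆, (x_n, y)\bigr),$$

and $⟅\ldots⟆$ designates the bag (multiset) of examples.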
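The construction above can be sketched in code for a finite label set. The following is a minimal, unoptimized illustration, not a reference implementation: it assumes real-valued objects, a finite list of candidate labels, and a toy nonconformity measure (distance to the mean of same-label objects). All names (`mean_distance`, `conformal_prediction_set`, `train`, `labels`) are hypothetical.

```python
from typing import Callable, List, Sequence, Set, Tuple

Example = Tuple[float, str]  # assumption: (real-valued object, label)
Measure = Callable[[List[Example], Example], float]


def mean_distance(bag: List[Example], example: Example) -> float:
    """Toy nonconformity measure: distance to the mean of same-label objects."""
    x, y = example
    same_label = [xi for xi, yi in bag if yi == y]
    if not same_label:
        return float("inf")
    return abs(x - sum(same_label) / len(same_label))


def conformal_prediction_set(
    train: Sequence[Example],
    x_new: float,
    labels: Sequence[str],
    epsilon: float,
    measure: Measure,
) -> Set[str]:
    """Return every candidate label whose p-value exceeds epsilon."""
    n = len(train) + 1
    prediction: Set[str] = set()
    for y in labels:
        # Tentatively complete the new object with the candidate label y.
        extended = list(train) + [(x_new, y)]
        # Score each example against the bag of all the other examples.
        scores = [
            measure(extended[:i] + extended[i + 1:], extended[i])
            for i in range(n)
        ]
        # Fraction of examples at least as nonconforming as the new one.
        p_value = sum(1 for a in scores if a >= scores[-1]) / n
        if p_value > epsilon:
            prediction.add(y)
    return prediction


train = [(1.0, "a"), (1.1, "a"), (0.9, "a"), (1.2, "a"),
         (5.0, "b"), (5.1, "b"), (4.9, "b"), (5.2, "b")]
labels = ["a", "b"]
print(conformal_prediction_set(train, 1.05, labels, 0.2, mean_distance))
print(conformal_prediction_set(train, 1.05, labels, 0.05, mean_distance))
```

With this toy data the candidate label "b" makes the new example the single most nonconforming of the nine, so its p-value is 1/9: it is excluded at significance level 0.2 but retained at 0.05. This illustrates that prediction sets grow as the confidence level increases.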

"Transductive conformal predictor" is often abbreviated to "conformal predictor".

The standard assumption for conformal predictors is the randomness assumption (also called the i.i.d. assumption).

Transductive conformal predictors can be generalized by inductive conformal predictors or Mondrian conformal predictors to a wider class of confidence predictors.

## Desiderata

### Validity

All the statements in this section are given under the randomness assumption.

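The statement of validity is easiest for *smoothed conformal predictors*, for which

$$\Gamma^\epsilon(x_1, y_1, \ldots, x_{n-1}, y_{n-1}, x_n)$$

is set to the set of all labels $y \in \mathbf{Y}$ such that

$$\frac{|\{i = 1, \ldots, n : \alpha_i > \alpha_n\}| + \eta_n \, |\{i = 1, \ldots, n : \alpha_i = \alpha_n\}|}{n} > \epsilon,$$

where the nonconformity scores $\alpha_i$ are defined as before and $\eta_n \in [0,1]$ is a random number (chosen from the uniform distribution on $[0,1]$).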
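To make the difference between the two rules concrete, the sketch below computes both the ordinary and the smoothed ratio from a list of nonconformity scores whose last entry belongs to the new example. The function names `p_value` and `smoothed_p_value` are hypothetical, and the tie-breaking number is passed in explicitly so that the result is reproducible.

```python
from typing import Sequence


def p_value(scores: Sequence[float]) -> float:
    """Ordinary ratio: ties with the last score count in full."""
    alpha_n = scores[-1]
    return sum(1 for a in scores if a >= alpha_n) / len(scores)


def smoothed_p_value(scores: Sequence[float], eta: float) -> float:
    """Smoothed ratio: ties with the last score are weighted by eta in [0, 1]."""
    alpha_n = scores[-1]
    greater = sum(1 for a in scores if a > alpha_n)
    equal = sum(1 for a in scores if a == alpha_n)  # includes the last score
    return (greater + eta * equal) / len(scores)


scores = [0.3, 0.7, 0.5, 0.5]  # last entry is the new example's score
plain = p_value(scores)
smoothed = smoothed_p_value(scores, eta=0.5)
print(plain, smoothed)
```

Since the tie-breaking number never exceeds 1, the smoothed ratio is never larger than the ordinary one, so the smoothed prediction set is always contained in the unsmoothed one. In practice the tie-breaking number would be drawn afresh for each prediction, for example with `random.random()`.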

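**Theorem** *All smoothed conformal predictors are exactly valid*, in the sense that, for any exchangeable probability distribution on $\mathbf{Z}^\infty$ and any significance level $\epsilon$, the random variables $\mathrm{err}_n^\epsilon(\Gamma)$, $n = 1, 2, \ldots$, are independent Bernoulli variables with parameter $\epsilon$, where $\mathrm{err}_n^\epsilon(\Gamma)$ is the random variable

$$\mathrm{err}_n^\epsilon(\Gamma) := \begin{cases} 1 & \text{if } y_n \notin \Gamma^\epsilon(x_1, y_1, \ldots, x_{n-1}, y_{n-1}, x_n); \\ 0 & \text{otherwise.} \end{cases}$$

The idea of the proof is quite simple.

**Corollary** *All smoothed conformal predictors are asymptotically exact*, in the sense that for any exchangeable probability distribution on $\mathbf{Z}^\infty$ and any significance level $\epsilon$,

$$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \mathrm{err}_i^\epsilon(\Gamma) = \epsilon$$

with probability one.

**Corollary** *All conformal predictors are asymptotically conservative*, in the sense that for any exchangeable probability distribution on $\mathbf{Z}^\infty$ and any significance level $\epsilon$,

$$\limsup_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \mathrm{err}_i^\epsilon(\Gamma) \le \epsilon$$

with probability one.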

To put it simply, in the long run the frequency of erroneous predictions does not exceed $\epsilon$ at each confidence level $1 - \epsilon$.
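The long-run behaviour can be checked empirically with a small simulation. The sketch below is an illustration under strong simplifying assumptions: examples are i.i.d. standard normal numbers, the nonconformity score of an example is simply its own value (so the bag is ignored), and an error at a step is recorded when the smoothed ratio for the true example does not exceed the significance level. The function name `error_frequency` and its parameters are hypothetical.

```python
import random


def error_frequency(n_steps: int, epsilon: float, seed: int = 0) -> float:
    """Simulate a smoothed conformal predictor online and return errors / n_steps."""
    rng = random.Random(seed)
    scores = []  # nonconformity scores of the true examples seen so far
    errors = 0
    for _ in range(n_steps):
        # Assumption: the score of each true example is an i.i.d. N(0, 1) draw.
        scores.append(rng.gauss(0.0, 1.0))
        alpha_n = scores[-1]
        greater = sum(1 for a in scores if a > alpha_n)
        equal = sum(1 for a in scores if a == alpha_n)
        eta = rng.random()  # fresh tie-breaking number for this step
        ratio = (greater + eta * equal) / len(scores)
        # The true label is outside the prediction set when the ratio <= epsilon.
        if ratio <= epsilon:
            errors += 1
    return errors / n_steps


freq = error_frequency(n_steps=2000, epsilon=0.1)
print(freq)
```

Under these assumptions the observed frequency should be close to the chosen significance level, with fluctuations that shrink as the number of steps grows. Replacing the smoothed ratio with the unsmoothed one can only decrease the number of errors.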

One can also give a formal notion of validity for conformal predictors, although the usefulness of this notion is limited: it is intuitively clear that conformal predictors are valid or "more than valid", since the number of errors made by a conformal predictor never exceeds the number of errors made by the corresponding smoothed conformal predictor.

### Efficiency

As conformal predictors are automatically valid, the main goal is to improve their *efficiency*: to make the prediction sets that conformal predictors output as small as possible. In classification problems, a natural measure of the efficiency of conformal predictors is the number of *multiple predictions*, that is, the number of prediction sets containing more than one label. In regression problems, the prediction set is often an interval of values; hence, a natural measure of the efficiency of such predictors is the length of the interval.
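Both efficiency measures are easy to compute once the prediction sets are available. The sketch below assumes that classification outputs are Python sets of labels and that regression outputs are `(lower, upper)` pairs. Averaging the interval lengths over several predictions is an added assumption for summarizing a run, and the function names are hypothetical.

```python
from typing import Hashable, Sequence, Set, Tuple


def count_multiple_predictions(prediction_sets: Sequence[Set[Hashable]]) -> int:
    """Number of prediction sets that contain more than one label."""
    return sum(1 for s in prediction_sets if len(s) > 1)


def mean_interval_length(intervals: Sequence[Tuple[float, float]]) -> float:
    """Average length of prediction intervals given as (lower, upper) pairs."""
    return sum(upper - lower for lower, upper in intervals) / len(intervals)


sets = [{"a"}, {"a", "b"}, set(), {"a", "b", "c"}]
intervals = [(0.0, 1.0), (2.0, 2.5)]
n_multiple = count_multiple_predictions(sets)
avg_length = mean_interval_length(intervals)
print(n_multiple, avg_length)
```

Smaller values of either quantity indicate a more efficient predictor at the given confidence level. Note that an empty prediction set is not counted as a multiple prediction.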