The mathlib project uses the Lean theorem prover which has a Coq-like tactic language, and they’ve defined a `wlog`

tactic. This might be a bit too technical or Lean-specific for unfamiliar readers, but hopefully the documentation is generally readable, particularly the example.

E.g. to prove `p n m`

you are allowed to assume an extra hypothesis `n <= m`

if you prove it always holds in at least one direction and you prove that `p n m`

and `m <= n`

imply `p m n`

(i.e. `p`

doesn’t care about the ordering).

Essentially you establish a lemma `f : ∀ (n m : nat) (h : n <= m) -> p n m`

in the main flow of your proof, with an implied parenthetical that if `m`

is *actually* the smaller value with proof `h : m <= n`

, you just invoke `f m n h`

which has the type `p m n`

, and then swap it to `p n m`

. The `wlog`

principle then lets you simply conclude `∀ (n m : nat) -> p n m`

without the extra hypothesis.

So I think WLOG (at least in this sense) has a lot to do with symmetry as mentioned in #1, because it only works for predicates `p`

where you can prove that the above swapping works. (Of course, the mathlib definition generalizes beyond arity 2, so it’s not just symmetry, but any permutation of the variables passed to the predicate.)

Relevant interesting quote from Terry Tao here:

At a more trivial (but still widely used) level, if there is so much invariance with respect to a parameter that the truth value of a given property does not actually depend on the choice of parameter, then the worst, average, and best case results are equivalent, so one can reduce the worst case to the best case (such arguments are generally described as “without loss of generality” reductions).

This is in the context of a qualitative breakdown of proofs, partial or otherwise, of results in his area into worst case, average case and best case (in that order, going from strongest proof to weakest proof)

]]>Re #7, perhaps I should have explained, sorry. It’s just application, but I like to write $X^t$ instead of $X(t)$ for $X : B G \to U$ to denote the twist of the underlying type $X(pt)$ by a $G$-torsor $t : B G$. And $P^e$ should really be $P^{pt}$, i.e., the untwisted version. (I’ll edit that.) Then we can, by convention, leave out the superscript for the untwisted type when a type is wanted.

]]>Re #4, yes it certainly feels like a ’dependent generalized the’ in a group context.

Re #3, looks interesting, Ulrik. What does that notation, $X^t, P^t, P^e$, denote? It’s not used at infinity-action, where we see things like $\mathbf{c} : \mathbf{B}G \vdash V(\mathbf{c}) : Type$.

]]>Another way to think about WLOG is in terms of “normal forms”, which is when the projection from a sigma type $s : \sum_{x:X} Q\, x \to X$ is a *section* with retraction $r : X \to \sum_{x:X} Q\,x$. The first component of $r$ normalizes $x:X$, $nf : X \to X$ and the second component witnesses that $nf\,x$ is normal/$Q$: $is Q : \prod_{x:X} Q(nf\,x)$. The section-retraction property implies that $nf$ is the identity on normal elements: $\prod_{x:X} Q\,x \to nf\,x = x$.

Let $P : X \to Prop$. Then, if we have a map $t : \prod_{x:X} P(nf\,x) \to P(x)$, then to show $\prod_{x:X}P(x)$ it suffices (WLOG) to show $\prod_{x:X}\prod_{q:Q\,x}P(x)$. Because, given $x:X$ we have $is Q\,x : Q(nf\,x)$ and hence we know $P(nf\,x)$, and thus $P(x)$ from our map $t$.

From this point of view, there’s no group involved, just an equivalence relation that $P$ respects and a way to pick canonical representatives. The equivalence relation could of course be “$x$ and $y$ are in the same orbit”.

]]>Except now we may not have $\mathrm{isContr}(X)$, just $||X||$. More generally, one could have side conditions that trim $X$ down to a connected component (for instance an orbit in an action groupoid) and then use WLOG.

]]>This is closely related to your, David’s, thoughts on formalizing “the”. If we may assume without lack of generality that we are dealing with $x \in X$, then $x$ is “the” element/term of $X$.

]]>Here are some simple comments in that direction: We fix a group $G$ with delooping $B G$. We have a domain $X : B G \to U$ and a property $P : \prod_{t:B G} X^t \to Prop$. Since $B G$ is connected, the map

$\prod_{t:B G}\prod_{x:X^t} P^t\,x \to \prod_{x:X^{pt}} P^{pt}\,x$is an equivalence. This type on the left is equivalent to $\prod_{y:X//G} P\,y$ (where I reuse the name $P$ for the uncurried function), and (since $P$ is proposition-valued) to $\prod_{y:X/G} \tilde P\,y$, where $\tilde P : X/G \to Prop$ is the extension of $P$ to the orbit set $X/G := \| X//G \|_0$.

Now, the WLOG situation arises when we have a predicate $Q : X^{pt} \to Prop$ that describes a “fundamental region” of the domain, so that we have an equivalence $\varphi : \sum_{x:X^{pt}} Q\,x \simeq X/G$. Then the original universal quantification (the type on the right above) is equivalent to

$\prod_{x:X^{pt}} \prod_{q : Q\,x} \tilde P(\varphi(x,q)).$So the witness $q$ is the one we get “WLOG.”

For example, in the case that the action of $G$ on $X$ is transitive, the orbit set is contractible, and we can take $Q\, x := (x = x_0)$, i.e., WLOG $x$ is just $x_0$.

]]>Some thoughts in the context of automated proof:

- John Harrison,
*Without Loss of Generality*, (pdf)

He mentions Klein’s Erlanger Program. I wonder if there’s scope for working in some group invariant context.

]]>Is there anything one can say type-theoretically about the use of WLOG in a proof? Some uses of it seem to involve a symmetric situation where a proof for one choice can be converted into a proof for any.

]]>