Great! Thanks again for the nice additions!

]]>Thanks for the suggestions, Richard – they make sense. I’ll look into it and think about how to do it decorously.

]]>It would be very nice to include this somewhere if we do not already have it, perhaps at differentiation, and then link to it from your argument chain rule. It looks like we may have something at differentiation which is similar to what you wrote in #8, but probably you could incorporate some more of #8 there.

With regard to actual formal power series, it follows very easily, as far as I see, from the binomial theorem/induction that we have the right thing (i.e. the usual formal derivative of power series). But again it would be good to put this observation in somewhere, if we do not already have it, and link to it from your argument at chain rule.

]]>Giving a reference is a slight embarrassment of riches problem. (If Urs is reading this, he might know a canonical reference right away.) Let me show the definition gives the correct answer in the mother of all contexts which is that of real-valued $C^\infty$ functions, rather than power series with coefficients in a commutative ring $A$. In other words, replace $A$ with $\mathbb{R}$ and $A[ [x] ]$ with the ring $C = C^\infty(\mathbb{R})$ of $C^\infty$ functions on $\mathbb{R}$. (Doing it with polynomials would probably be equally convincing.)

So again we have the ring of dual numbers $D = \mathbb{R}[y]/(y^2)$. Given a point $x_0 \in \mathbb{R}$, Hadamard’s lemma allows us to expand any function $f(x) \in C$ in the form $f(x_0) + a_1(x - x_0) + g(x)(x - x_0)^2$ for some unique number $a_1$ and function $g(x) \in C$, and $f'(x_0)$ is according to standard definition this value $a_1$. In other words, there is by this lemma a canonical isomorphism

$C/(x - x_0)^2C \cong D$which maps the residue class of $f(x)$ to $f(x_0) + f'(x_0)y \in D$. In fact any homomorphism $\phi: C \to D$, written in the form $\phi(f) = \phi_0(f) + \phi_1(f)y$, is given by (1) a homomorphism $\phi_0: C \to \mathbb{R}$, with any such on $C = C^\infty(\mathbb{R})$ being evaluation at a uniquely determined point $x_0$, and (2) a derivation $\phi_1: C \to \mathbb{R}$, meaning a linear function satisfying $\phi_1(f g) = \phi_1(f)\phi_0(g) + \phi_0(f)\phi_1(g) = \phi_1(f)g(x_0) + f(x_0)\phi_1(g)$. There is a whole tangent line’s worth of possible derivations, but the particular one we want will take the function $f(x) = x$ to its correct derivative $1$ at $x_0$. Hence we would want the $\phi: C \to D$ such that $\phi(x) = x_0 + 1 \cdot y$. Since $C$ is the free $C^\infty$ ring on a generator $x$, this condition uniquely determines $\phi$ (the book by Moerdijk and Reyes would be a good reference for the theory of $C^\infty$ rings).

The set-up I used just packages all these $\phi$ into the single homomorphism $\delta: C \to C \otimes_\mathbb{R} D \cong C[y]/(y^2)$ that sends $x$ to $x + y$, so that when we compute the composite

$C \stackrel{\delta}{\to} C \otimes_\mathbb{R} D \stackrel{ev_{x_0} \otimes 1}{\to} \mathbb{R} \otimes_\mathbb{R} D \cong D$we find that it sends $f(x)$ to $f(x_0) + f'(x_0)y$, as promised.

I haven’t checked the nLab to see to what extent we’ve already attested to these facts, but it’s pretty standard material in SDG and algebraic geometry.

]]>Ah, I see. Would it be possible to provide some reference for this definition of $p'$? I suppose in a way it is tied up with the identification of the representing object for derivations, but it would be good to make that more explicit.

]]>Sorry, I might not have made myself clear. I don’t think “my” definition of $p'$ is novel. The part that took massaging was the proof of the chain rule I came up with; specifically, the end run (American slang) around the diagram where I sneak in this map I called $\pi$. That technique might not be novel either, but I can at least say it’s not something I’d seen before.

I’ve now added this demonstration to chain rule.

]]>Ah, I didn’t realise that this definition of $p'$ was one of the novel aspects, even if inspired by SDG. Then the observation is nicer still!

]]>Thanks! I just meant that with that definition of $p'$, we prove the chain rule with a minimum of calculation. Other properties might require more calculation.

Of course, the definition is just the algebraic analogue of something familiar from SDG: if $p: R \to R$ is a function on the line and $D$ denotes the walking tangent vector, then under the Kock-Lawvere axiom which asserts an isomorphism $R \times R \cong R^D$ we may form a composite

$R \stackrel{\Delta}{\to} R \times R \cong R^D \stackrel{p^D}{\to} R^D \cong R \times R$which is just $\langle p, p' \rangle: R \to R \times R$. (It’s a little easy here to get turned around with algebro-geometric duality, but I believe it works as advertised.) Even with all these hints, though, plus the fact that the chain rule is trying to express the functoriality of the tangent bundle vector $(-)^D$, it *still* took me some time to massage it into the form above.

Nice, and good fun! I think that your argument is perfectly easy to follow, indeed a very nice to proceed! However, regarding avoiding use of calculation, I’m not sure this is really the case, for in the following line…

For $p \in A[ [x] ]$, define $p'$ via the equation $\delta(p) = p(x) + p'(x)y$.

…I think one would need the binomial theorem or similar to identify $p'$ with one of the usual definitions. But perhaps I’m missing something?

]]>Well, I’ll admit it’s a little silly, almost parodic in the manner of *Mathematics Made Difficult*. A *halfway* normal proof might start the same way but then observe that in the ring with square-nilpotent $y$ we have

say by the binomial theorem, or by an inductive argument. This gives the chain rule $(q \circ p)'(x) = q'(p(x)) \cdot p'(x)$ at least for $q(x) = x^n$. Then extend to all power series $q(x) = \sum_{n \geq 0} a_n x^n$ by linearity, continuity, etc. Of course that’s just one approach; there are others (such as ’meta’ approaches that observe that the usual analysis proof implies the result in formal algebra).

Any of those could be easier to follow than what I produced above. So what’s the point? Nothing, really, except that it amused me that one could prove the statement with virtually zero calculation (like the binomial theorem) *anywhere* – just by sticking to pure concepts and using the universal property of $A[ [x] ]$.

This is more for the sake of amusement than anything else: perhaps the softest imaginable proof of the chain rule for formal power series.

Working over a commutative ring $A$, the statement is that if $q, p \in A[ [x] ]$ are power series, with $0$ the constant coefficient of $p$, then $(q \circ p)'(x) = q'(p(x))p'(x)$ under the usual definitions.

Let $D = A[y]/(y^2)$ be the representing object for derivations. Let $\delta: A[ [x] ] \to A[ [x] ] \otimes_A D \cong A[ [x] ][y]/(y^2)$ be the unique topological $A$-algebra map (under the $(x)$-adic topologies) that sends $x$ to $x + y$. (If it helps, think $\delta(q) = q(x + y)$.) For $p \in A[ [x] ]$, define $p'$ via the equation $\delta(p) = p(x) + p'(x)y$.

Let $\pi: A[ [x] ] \otimes_A D \to A[ [x] ] \otimes_A D$ be the unique topological algebra map taking $x$ to $p(x)$ and $y$ to $p'(x)y$. Let $- \circ p: A[ [x] ] \to A[ [x] ]$ denote the unique topological algebra map that takes $x$ to $p$. Then the diagram

$\array{ A[ [x] ] & \stackrel{\delta}{\to} & A[ [x] ] \otimes_A D \\ \mathllap{- \circ p} \downarrow & & \downarrow \mathrlap{\pi} \\ A[ [x] ] & \underset{\delta}{\to} & A[ [x] ] \otimes_A D }$commutes in the category of topological algebras, since the two legs agree when evaluated at the generator $x$. But then, evaluating each leg at a power series $q \in A [ [x]]$, we have

$[\delta(- \circ p)](q) = \delta(q \circ p) = (q \circ p)(x) + (q \circ p)'(x)y$and

$[\pi \delta](q) = \pi(\delta(q)) = \pi(q(x) + q'(x)y) = q(p(x)) + q'(p(x))(p'(x)y)$whence the coefficients of $y$ agree: $(q \circ p)'(x) = q'(p(x))p'(x)$.

]]>