Kepler's Laws

Section 8.2 Kepler's Laws

Subsection 8.2.1 Orbits

Figure 8.2.1. An orbital path for Kepler's laws

Kepler's laws were originally formulated from the observations of planets in the night sky and were written down before Newton. One of the triumphs of Newtonian mechanics is the recovery of Kepler's laws. In this section, I start with Newton's gravity and derive Kepler's laws.

Kepler described three laws of planetary motion.

Satellites in orbit around a large gravity source have elliptical orbits with the large object at one of the foci of the ellipse.
The line from the large object to the satellite sweeps out equal areas in equal times.
The period \(T\) of a satellite and the semi-major axis \(a\) of the associated ellipse satisfy \(T^2 = \alpha a^3\) for some constant \(\alpha\) depending only upon the mass of the large object.

Subsection 8.2.2 Kepler's First Law

The setup for approaching Kepler's laws is shown in Figure 8.2.1.

There is a large stationary object of mass \(M\text{,}\) and a small object of mass \(m\) in orbit around the larger object. I must assume that \(m \ll M\) to allow the larger object to be essentially stationary.
I place the stationary object of mass \(M\) at the origin in \(\RR^3\text{.}\) We might work in the plane since the orbits are planar, but it is more useful to think of the orbit sitting in the \(xy\) plane in \(\RR^3\text{.}\) This will be important, since it turns out to be useful to consider the perpendicular direction to the plane of orbit and use normals and binormals in \(\RR^3\text{.}\)
The curve \(\gamma(t) = (r(t),\theta(t),z(t))\) describes the motion of the orbiting object over time, in cylindrical coordinates (which reduce to polar coordinates in the \(xy\) plane). In the third coordinate \(z(t)\text{,}\) I expect that \(z(t) = 0\) at all times \(t\) (which will produce a curve in the \(xy\) plane in polar coordinates), but I don't assume this.
The curve \(\gamma(t)\) is unknown and I wish to derive it from Newtonian physics. I will prove Kepler's laws using the derived curve.
The force of gravity has magnitude

\begin{equation*} F = \frac{GmM}{|\gamma(t)|^2} \text{.} \end{equation*}
The direction of the force is \(-\gamma(t)\text{,}\) since the force wishes to pull the object back towards the origin. Since I need just the direction without changing the magnitude, I use the unit vector \(u(t) = \gamma(t) / |\gamma(t)|\) for the direction of \(\gamma(t)\text{,}\) so the force points along \(-u(t)\text{.}\)
Newton's second law \(F = ma\) applies, so the acceleration is

\begin{equation*} a(t) = \frac{-GM}{|\gamma(t)|^2} u(t) = \frac{-GM}{|\gamma(t)|^3} \gamma(t)\text{.} \end{equation*}
Acceleration is the second derivative of position, so Newton's second law becomes

\begin{equation*} \frac{d^2}{dt^2} \gamma(t) = \frac{-GM}{|\gamma(t)|^3} \gamma(t) \end{equation*}
I write \(h(t) = \gamma(t) \times \gamma^\prime(t)\text{.}\) \(h(t)\) is in the binormal direction, but is not necessarily a unit vector. This seems like a strange definition, but the vector \(h\) turns out to be very useful in the proof.

Newton's law is a multivariable differential equation; solving it directly is very difficult and well beyond the scope of this course. Our approach is indirect. We start with two seemingly random calculations, which I will label as lemmas for convenient reference. For convenience of notation, I often drop the \(t\) variable, writing \(\gamma\) instead of \(\gamma(t)\) and likewise for other functions of \(t\text{.}\)

Lemma 8.2.2.

\begin{equation*} \frac{d}{dt} h(t) = 0 \end{equation*}

Proof.

\begin{equation*} \frac{d}{dt} h(t) = \frac{d}{dt} (\gamma \times \gamma^\prime) = \gamma^\prime \times \gamma^\prime + \gamma \times \gamma^{\prime \prime} = \gamma^\prime \times \gamma^\prime + \gamma \times a \end{equation*}

The first term \(\gamma^\prime \times \gamma^\prime\) vanishes, since the cross product of any vector with itself is \(0\text{.}\) By Newton's law, the acceleration \(a\) is a scalar multiple of \(\gamma\text{,}\) since the force pulls back towards the origin. Parallel vectors likewise have \(0\) cross product, so \(\gamma \times a = 0\text{.}\) Therefore \(\frac{d}{dt} h = 0\text{.}\)
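As a sanity check on Lemma 8.2.2, the following sketch integrates Newton's equation numerically and compares \(h = \gamma \times \gamma^\prime\) at the start and the end of the run. The units (\(GM = 1\)), the initial conditions, the step size and the classical Runge-Kutta integrator are all my own assumptions for illustration; none of them come from the derivation. The initial velocity has a \(z\) component, so the orbit does not lie in the \(xy\) plane, but \(h\) should still stay fixed.

```python
import math

GM = 1.0  # assumption: units chosen so that G*M = 1

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def accel(pos):
    """Right side of Newton's equation: -GM * gamma / |gamma|^3."""
    r = math.sqrt(sum(x*x for x in pos))
    return tuple(-GM * x / r**3 for x in pos)

def rk4_step(pos, vel, dt):
    """One classical Runge-Kutta step for the system (gamma, gamma')."""
    def shift(p, q, s):
        return tuple(pi + s*qi for pi, qi in zip(p, q))
    k1p, k1v = vel, accel(pos)
    k2p, k2v = shift(vel, k1v, dt/2), accel(shift(pos, k1p, dt/2))
    k3p, k3v = shift(vel, k2v, dt/2), accel(shift(pos, k2p, dt/2))
    k4p, k4v = shift(vel, k3v, dt), accel(shift(pos, k3p, dt))
    new_pos = tuple(p + dt/6*(a + 2*b + 2*c + d)
                    for p, a, b, c, d in zip(pos, k1p, k2p, k3p, k4p))
    new_vel = tuple(v + dt/6*(a + 2*b + 2*c + d)
                    for v, a, b, c, d in zip(vel, k1v, k2v, k3v, k4v))
    return new_pos, new_vel

# Hypothetical initial conditions for a bound, tilted orbit.
pos, vel = (1.0, 0.0, 0.0), (0.0, 1.2, 0.1)
h_start = cross(pos, vel)
for _ in range(5000):
    pos, vel = rk4_step(pos, vel, 0.001)
h_end = cross(pos, vel)

# Largest change in any component of h over the run.
h_drift = max(abs(a - b) for a, b in zip(h_start, h_end))
print(h_start, h_end, h_drift)
```

The drift is not exactly zero because the integrator is only approximate, but it should be far smaller than the components of \(h\) themselves.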

Lemma 8.2.3.

\begin{equation*} h = |\gamma|^2 (u \times u^\prime) \end{equation*}

Proof.

I can write \(h\) this way.

\begin{align*} h(t) \amp = \gamma \times \gamma^\prime = (|\gamma|u) \times \frac{d}{dt} (|\gamma| u)\\ \end{align*}

I use the product rule on the second term.

\begin{align*} \amp = |\gamma| u \times \left( u \frac{d}{dt} |\gamma| + |\gamma| \frac{d}{dt} u \right) = |\gamma| u \times u \frac{d}{dt} |\gamma| + |\gamma| u \times |\gamma| \frac{d}{dt} u\\ \amp = |\gamma| \frac{d}{dt} |\gamma| (u \times u ) + |\gamma|^2 u \times u^\prime\\ h \amp = |\gamma|^2 (u \times u^\prime) \end{align*}
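Lemma 8.2.3 is a statement about any differentiable curve that avoids the origin, not just orbits, so it can be spot-checked on an arbitrary example. The sketch below does this for a sample curve that I made up, \(\gamma(t) = (t, t^2, t^3)\text{,}\) at \(t = 1\text{,}\) estimating \(u^\prime\) with a central finite difference. The curve, the evaluation point and the step \(\varepsilon\) are assumptions chosen only for illustration.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def norm(a):
    return math.sqrt(sum(x*x for x in a))

def gamma(t):
    """Hypothetical sample curve, not an orbit."""
    return (t, t**2, t**3)

def gamma_prime(t):
    """Exact derivative of the sample curve."""
    return (1.0, 2*t, 3*t**2)

def unit(t):
    """u(t) = gamma(t) / |gamma(t)|."""
    g = gamma(t)
    n = norm(g)
    return tuple(x / n for x in g)

t, eps = 1.0, 1e-5
u = unit(t)
# Central difference approximation of u'(t).
u_prime = tuple((p - m) / (2*eps) for p, m in zip(unit(t + eps), unit(t - eps)))

h_direct = cross(gamma(t), gamma_prime(t))                           # gamma x gamma'
h_lemma = tuple(norm(gamma(t))**2 * x for x in cross(u, u_prime))   # |gamma|^2 (u x u')

lemma_error = max(abs(a - b) for a, b in zip(h_direct, h_lemma))
print(h_direct, h_lemma, lemma_error)
```

The two vectors should agree up to the error of the finite difference.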

In addition to these two calculation lemmas, I'm going to list two results from linear algebra as lemmas here as well (though I will not prove them).

Lemma 8.2.4.

(This formula is known as Lagrange's formula though, confusingly, it is not the only result with that name.) Let \(u, v, w \in \RR^3\) be any three vectors.

\begin{equation*} (u \times (v \times w)) = (u \cdot w) v - (u \cdot v) w \end{equation*}

Lemma 8.2.5.

Let \(u, v, w \in \RR^3\) be any three vectors.

\begin{equation*} u \cdot (v \times w) = v \cdot (w \times u) = w \cdot (u \times v) \end{equation*}
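Since I am not proving these two lemmas, a quick numerical check may be reassuring. The sketch below evaluates both sides of Lemma 8.2.4 and all three expressions of Lemma 8.2.5 for three integer vectors that I picked arbitrarily. One example is not a proof, of course.

```python
def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

# Arbitrary sample vectors (an assumption for illustration).
u, v, w = (1, 2, 3), (-2, 0, 5), (4, -1, 1)

# Lemma 8.2.4: u x (v x w) = (u . w) v - (u . v) w
lhs = cross(u, cross(v, w))
rhs = tuple(dot(u, w)*vi - dot(u, v)*wi for vi, wi in zip(v, w))

# Lemma 8.2.5: the scalar triple product is unchanged by cyclic permutation.
triples = (dot(u, cross(v, w)), dot(v, cross(w, u)), dot(w, cross(u, v)))

print(lhs, rhs)   # (-62, 13, 12) (-62, 13, 12)
print(triples)    # (55, 55, 55)
```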

Proposition 8.2.6.

(Kepler's First Law) The differential equation

\begin{equation*} a = \frac{-GM}{|\gamma|^3} \gamma \end{equation*}

can be solved with a cylindrical parametric curve \(\gamma(t) = (r(t),\theta(t),0)\) which satisfies the equation

\begin{equation*} r(t) = \frac{ed}{1 + e \cos \theta(t)}\text{.} \end{equation*}

This is exactly the same as Equation (8.1.1). Such forms describe all conics, so Kepler's First Law states that orbital paths are conics.

Proof.

I start with Newton's Second Law as a differential equation, using the force of gravity on the right side.

\begin{equation*} a = \frac{-GM}{|\gamma|^2} u \end{equation*}

I'll take the cross product with \(h\text{.}\) I use Lemma 8.2.3 on the right side.

\begin{align*} a \times h \amp = \frac{-GM}{|\gamma|^2} u \times h = \frac{-GM}{|\gamma|^2} u \times \left( |\gamma|^2 (u \times u^\prime) \right)\\ \amp = \frac{-GM}{|\gamma|^2} |\gamma|^2 \left( u \times \left( u \times u^\prime \right) \right) = -GM \left( u \times \left( u \times u^\prime \right) \right)\\ \end{align*}

I expand the triple cross product on the right side using Lemma 8.2.4.

\begin{align*} a \times h \amp = -GM \left( (u\cdot u^\prime) u - (u \cdot u) u^\prime \right) \end{align*}

Since \(u\) is always a unit vector, I can apply Lemma 7.1.17, which says that \(u \cdot u^\prime = 0\text{.}\) Also since \(u\) is a unit vector, \(u \cdot u = |u|^2 = 1\text{.}\) This deals with both of the dot products.

\begin{equation} a \times h = -GM ( 0 u - 1 u^\prime) = GM u^\prime\tag{8.2.1} \end{equation}

Then I consider the following derivative.

\begin{equation*} \frac{d}{dt} (\gamma^\prime \times h) = \gamma^{\prime \prime} \times h + \gamma^\prime \times h^\prime \end{equation*}

The first term here is \(a \times h\text{.}\) The second term involves the derivative of \(h\text{.}\) But Lemma 8.2.2 said that the derivative of \(h\) is zero, so this term vanishes.

\begin{equation*} \frac{d}{dt} (\gamma^\prime \times h) = a \times h \end{equation*}

Then I'll replace \(a \times h\) with the expression from Equation (8.2.1).

\begin{equation*} \frac{d}{dt} (\gamma^\prime \times h) = GM u^\prime \end{equation*}

Now I've made progress with our difficult differential equation: I have a time derivative on both sides. I can simply integrate both sides.

\begin{equation*} \gamma^\prime \times h = GM u + c \end{equation*}

In this integration, \(c\) is a vector of constants of integration. (That vector corresponds to initial conditions of velocity and some orbital distance; the rest of the initial position is essentially determined by the choice of coordinates.) I take the dot product with \(\gamma\text{.}\)

\begin{equation*} \gamma \cdot (\gamma^\prime \times h) = GM (\gamma \cdot u) + \gamma \cdot c = GM |\gamma| (u \cdot u) + |\gamma| |c| \cos \theta = GM |\gamma| + |\gamma| |c| \cos \theta \end{equation*}

Here \(\theta\) is the angle between \(\gamma\) and \(c\text{.}\) Since I haven't specified a starting point, I can choose coordinates such that \(c\) is in the positive \(x\) direction, which means that this \(\theta\) is the usual \(\theta\) of polar coordinates and \(|\gamma|\) is the usual \(r\) of polar coordinates.

\begin{align*} \gamma \cdot (\gamma^\prime \times h) \amp = GMr + r |c| \cos \theta\\ \end{align*}

I solve for \(r\text{.}\)

\begin{align*} r \amp = \frac{\gamma \cdot (\gamma^\prime \times h) } { GM + |c| \cos \theta} \end{align*}

The expression \(|c|/GM\) is a constant, so let's give it a label.

\begin{equation} e \defeq \frac{|c|}{GM}\tag{8.2.2} \end{equation}

Then I put this new \(e\) into the equation.

\begin{equation*} r = \frac{\gamma \cdot (\gamma^\prime \times h) } { GM + GM \frac{|c|}{GM} \cos \theta} = \frac{\gamma \cdot (\gamma^\prime \times h) } { 1 + e \cos \theta} \left( \frac{1}{GM} \right) \end{equation*}

The numerator is \(\gamma \cdot (\gamma^\prime \times h)\text{,}\) which I can rearrange to \(h \cdot (\gamma \times \gamma^\prime)\) according to Lemma 8.2.5. But \((\gamma \times \gamma^\prime)\) is the definition of \(h\text{,}\) so this dot product is \(h\cdot h = |h|^2\text{.}\)

\begin{equation} r = \frac{|h|^2 } { 1 + e \cos \theta} \left( \frac{1}{GM} \right)\tag{8.2.3} \end{equation}

\(|h|^2\) is a constant, since \(h\) doesn't change according to Lemma 8.2.2. Let's make another definition.

\begin{equation} d = |h|^2/|c|\tag{8.2.4} \end{equation}

I can then replace the numerator of Equation (8.2.3) using \(d\text{.}\)

\begin{equation*} r = \frac{d|c|} {1 + e \cos \theta} \left( \frac{1}{GM} \right) = \frac{d} {1 + e \cos \theta} \left( \frac{|c|}{GM} \right) \end{equation*}

Replace \(\frac{|c|}{GM}\) by \(e\text{,}\) since that's how I defined \(e\) in Equation (8.2.2).

\begin{equation*} r = \frac{ed } { 1 + e \cos \theta} \end{equation*}

This is the desired form.
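The proof can be tested numerically: integrate Newton's equation, compute \(c\text{,}\) \(e\) and \(ed = |h|^2/GM\) from the initial conditions alone, and then check at every later step that the computed radius matches \(ed/(1 + e\cos\theta)\text{,}\) with \(\theta\) the angle between \(\gamma\) and \(c\text{.}\) The sketch below does this in the plane of the orbit. The units (\(GM = 1\)), the initial conditions, the step size and the Runge-Kutta integrator are assumptions of mine for illustration.

```python
import math

GM = 1.0  # assumption: units chosen so that G*M = 1

def accel(p):
    """Right side of Newton's equation in the orbital plane."""
    r3 = (p[0]**2 + p[1]**2) ** 1.5
    return (-GM * p[0] / r3, -GM * p[1] / r3)

def rk4_step(p, v, dt):
    """One classical Runge-Kutta step for the system (gamma, gamma')."""
    def shift(a, b, s):
        return (a[0] + s*b[0], a[1] + s*b[1])
    k1p, k1v = v, accel(p)
    k2p, k2v = shift(v, k1v, dt/2), accel(shift(p, k1p, dt/2))
    k3p, k3v = shift(v, k2v, dt/2), accel(shift(p, k2p, dt/2))
    k4p, k4v = shift(v, k3v, dt), accel(shift(p, k3p, dt))
    p_new = tuple(p[i] + dt/6*(k1p[i] + 2*k2p[i] + 2*k3p[i] + k4p[i]) for i in range(2))
    v_new = tuple(v[i] + dt/6*(k1v[i] + 2*k2v[i] + 2*k3v[i] + k4v[i]) for i in range(2))
    return p_new, v_new

# Hypothetical initial conditions for a bound orbit.
p, v = (1.0, 0.0), (0.0, 1.2)

hz = p[0]*v[1] - p[1]*v[0]                 # h = (0, 0, hz)
r0 = math.hypot(*p)
# c = gamma' x h - GM u, written out in components for h along z.
c = (v[1]*hz - GM*p[0]/r0, -v[0]*hz - GM*p[1]/r0)
e = math.hypot(*c) / GM                    # Equation (8.2.2)
ed = hz**2 / GM                            # the product e*d

max_residual = 0.0
for _ in range(15000):                     # slightly more than one period here
    p, v = rk4_step(p, v, 0.001)
    r = math.hypot(*p)
    cos_theta = (p[0]*c[0] + p[1]*c[1]) / (r * math.hypot(*c))
    max_residual = max(max_residual, abs(r - ed / (1 + e*cos_theta)))

print(e, ed, max_residual)
```

For these initial conditions \(c = (0.44, 0)\text{,}\) so \(c\) already points in the positive \(x\) direction and \(\theta\) is the usual polar angle.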

In the previous work, the eccentricity was defined as \(e = |c|/GM\text{.}\) Notice that the eccentricity depends on the constants of integration. That makes sense since those constants determine initial velocity and some orbital radius. If \(e\lt 1\text{,}\) then the initial conditions indicate an ellipse. If \(e = 1\text{,}\) the result is a parabola, and if \(e \gt 1\text{,}\) the result is a hyperbola. The difference is precisely the notion of escape velocity: these initial conditions tell us if the satellite has enough initial energy to escape or be trapped in orbit. If trapped in orbit, the orbits are elliptical. If escaping, the path is parabolic or hyperbolic. For the rest of this section, I will assume that the orbits are elliptical, i.e., that the initial velocity is less than escape velocity, to prove the final two laws. (The final two laws specifically refer to periodic orbits, so the assumption is both reasonable and necessary.)
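To make the link between initial conditions and orbit type concrete, here is a small sketch that computes \(e = |c|/GM\) from a starting position and velocity, using \(c = \gamma^\prime \times h - GM u\text{,}\) and names the resulting conic. The function names, the units (\(GM = 1\)) and the tolerance used to recognise \(e = 1\) in floating point are my own assumptions.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def norm(a):
    return math.sqrt(sum(x*x for x in a))

def eccentricity(pos, vel, gm=1.0):
    """e = |c| / GM with c = gamma' x h - GM u (hypothetical helper)."""
    h = cross(pos, vel)
    r = norm(pos)
    u = tuple(x / r for x in pos)
    c = tuple(a - gm*b for a, b in zip(cross(vel, h), u))
    return norm(c) / gm

def classify(e, tol=1e-9):
    """Name the conic; tol absorbs rounding error near e = 1."""
    if abs(e - 1.0) <= tol:
        return "parabola"
    return "ellipse" if e < 1.0 else "hyperbola"

# Start at distance 1 moving perpendicular to the radius; with GM = 1 the
# circular speed is 1 and the escape speed is sqrt(2).
start = (1.0, 0.0, 0.0)
for speed in (1.0, 1.2, math.sqrt(2.0), 1.5):
    e = eccentricity(start, (0.0, speed, 0.0))
    print(speed, e, classify(e))
```

With this starting point, \(c = (v^2 - 1, 0, 0)\) for speed \(v\text{,}\) so the circular speed gives \(e = 0\) (a circle, the special case of an ellipse) and the escape speed gives \(e = 1\text{.}\)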

Subsection 8.2.3 Kepler's Second Law

Proposition 8.2.7.

(Kepler's Second Law). The area swept out by a line between the large mass \(M\) and the satellite grows at a constant rate in time.

Figure 8.2.8. Approximate movement along the orbit over time \(dt\text{.}\)

Proof.

I will write \(A(t)\) for the area swept out in this way. My goal is to prove that \(A^\prime(t)\) is constant. I will approach this by looking at the infinitesimal area \(dA\) swept out over an infinitesimal time interval \(dt\text{.}\) Such an area is a small triangle, shown in gray in Figure 8.2.8. It has side length \(r\) and base \(db\text{,}\) which I can assume is perpendicular to the radius. Therefore, the area \(dA\) of the infinitesimal triangle is \(\frac{1}{2} r db\text{.}\) Then \(db\text{,}\) as an infinitesimal arclength, is \(rd\theta\text{,}\) so \(dA = \frac{1}{2} r^2 d\theta\text{,}\) which I can integrate (with a temporary internal variable for the integration).

\begin{equation*} A(t) = \int_0^t \frac{1}{2} r(w)^2 d\theta = \int_0^t \frac{1}{2} r(w)^2 \frac{d\theta}{dw} dw \end{equation*}

I calculate the derivative \(\frac{dA}{dt}\text{.}\)

\begin{equation*} \frac{dA}{dt} = \frac{d}{dt} \int_0^t \frac{1}{2} r(w)^2 \frac{d\theta}{dw} dw = \frac{1}{2} r(t)^2 \frac{d\theta}{dt} \end{equation*}

Now I return to some of the derivations of the previous section to understand this equation. In Cartesian coordinates, \(\gamma\) has this form.

\begin{equation*} \gamma(t) = (r(t) \cos \theta(t), r(t) \sin \theta(t), 0) \end{equation*}

Then the unit vector \(u = \gamma(t) / |\gamma(t)|\) is \(u(t) = (\cos \theta (t), \sin \theta(t),0)\text{.}\) I calculate its derivative.

\begin{equation} \frac{du}{dt} = (-\sin \theta(t), \cos \theta(t), 0) \frac{d\theta}{dt}\tag{8.2.5} \end{equation}

I calculate \(u \times u^\prime\text{.}\)

\begin{equation} u \times \frac{du}{dt} = (0,0,1) \frac{d\theta}{dt}\tag{8.2.6} \end{equation}

Recall Lemma 8.2.3.

\begin{equation*} h = |\gamma(t)|^2 u \times u^\prime \end{equation*}

I replace the cross product on the right with the same expression from Equation (8.2.6).

\begin{equation*} h = |\gamma(t)|^2 \frac{d\theta}{dt} (0,0,1) \end{equation*}

I take the magnitude of this vector. (I choose a direction of orbit so that \(\frac{d\theta}{dt}\) is positive.)

\begin{equation*} |h| = |\gamma(t)|^2 \frac{d\theta}{dt} \end{equation*}

Recall \(r(t) = |\gamma(t)|\text{.}\)

\begin{equation*} |h| = r(t)^2 \frac{d\theta}{dt} \end{equation*}

Now this looks familiar: the right side is twice the expression for \(\frac{dA}{dt}\) that I calculated at the start of this proof. Therefore, \(|h| = 2 \frac{dA}{dt}\text{.}\)

\begin{equation} \frac{dA}{dt} = \frac{1}{2} |h|\tag{8.2.7} \end{equation}

But I know that \(h\) is constant by Lemma 8.2.2. Therefore, the rate of change of the area \(A\) must be constant.
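The equal-areas statement can also be seen in a simulation. The sketch below steps an orbit forward in time, adds up the thin triangles with corners at the origin and at two consecutive positions, and reports the total for several time windows of equal length. The units (\(GM = 1\)), the initial conditions, the step size and the velocity Verlet integrator are assumptions of mine for illustration.

```python
import math

GM = 1.0  # assumption: units chosen so that G*M = 1

def accel(x, y):
    """Right side of Newton's equation in the orbital plane."""
    r3 = (x*x + y*y) ** 1.5
    return -GM * x / r3, -GM * y / r3

def swept_areas(x, y, vx, vy, dt, steps_per_window, windows):
    """Area swept out in each of several consecutive equal time windows."""
    areas = []
    ax, ay = accel(x, y)
    for _ in range(windows):
        area = 0.0
        for _ in range(steps_per_window):
            # Velocity Verlet: half kick, drift, half kick.
            vx += 0.5 * dt * ax
            vy += 0.5 * dt * ay
            nx, ny = x + dt * vx, y + dt * vy
            # Triangle with corners at the origin, old position, new position.
            area += 0.5 * abs(x * ny - y * nx)
            x, y = nx, ny
            ax, ay = accel(x, y)
            vx += 0.5 * dt * ax
            vy += 0.5 * dt * ay
        areas.append(area)
    return areas

# Hypothetical bound orbit starting at its closest approach.
dt, steps = 0.001, 1000
areas = swept_areas(1.0, 0.0, 0.0, 1.2, dt, steps, 6)
expected = 0.5 * (1.0 * 1.2) * (dt * steps)   # (|h| / 2) * window length
print(areas, expected)
```

Each window should report essentially the same area, \(\frac{|h|}{2}\) times the window length, even though the satellite moves from its closest approach out towards its farthest point over the run. The velocity Verlet scheme happens to preserve \(\gamma \times \gamma^\prime\) up to rounding, so the agreement here is tighter than a general-purpose integrator would give.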

Subsection 8.2.4 Kepler's Third Law

Proposition 8.2.9.

(Kepler's Third Law) The period \(T\) of the revolution and the semi-major axis \(a\) of the ellipse satisfy \(T^2 = \alpha a^3\) for some constant \(\alpha\) which only depends on the mass of the central object.

Proof.

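I start with Equation (8.2.7) from the previous proof, \(\frac{dA}{dt} = \frac{1}{2}|h|\text{.}\) Since \(|h|\) is constant and \(A(0) = 0\text{,}\) integrating in time gives the area swept out by time \(t\text{.}\)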

\begin{equation*} A(t) = \frac{|h|}{2} t \end{equation*}

The period \(T\) is the time needed to complete one whole orbit. The area swept out over time \(T\) should be the whole ellipse. Recall an ellipse with semiaxes \(a\) and \(b\) has area \(\pi a b\text{.}\) I make the two areas equal.

\begin{equation} A(T) = \frac{|h|T}{2} = \pi a b \implies T = \frac{2\pi ab}{|h|} \implies T^2 = \frac{4\pi^2 a^2 b^2}{|h|^2}\tag{8.2.8} \end{equation}

From Equation (8.1.3) and Equation (8.1.4), I have expressions for the semimajor and semiminor axes (\(a\) and \(b\)) in terms of the eccentricity \(e\) and the distance \(d\) from the focus to the directrix.

\begin{gather*} a^2 = \frac{e^2 d^2}{(1-e^2)^2} \\ b^2 = \frac{e^2 d^2}{1-e^2} \end{gather*}

I take the square root of \(a^2\) and then the ratio of the two equations.

\begin{equation} \frac{a}{b^2} = \frac{\frac{ed}{(1-e^2)}}{\frac{e^2d^2}{1-e^2}} = \frac{1}{ed} \implies ed = \frac{b^2}{a}\tag{8.2.9} \end{equation}

Now we can relate \(ed\) to \(G\text{,}\) \(M\) and \(h\text{.}\) Recall the definition of \(e\) from Equation (8.2.2) and \(d\) from Equation (8.2.4). I calculate the product \(ed\text{.}\)

\begin{equation} ed = \frac{|h|^2}{GM} \implies |h|^2 = edGM\tag{8.2.10} \end{equation}

I use this to substitute for \(|h|^2\) in Equation (8.2.8).

\begin{equation*} T^2 = \frac{4\pi^2 a^2 b^2}{GMed} \end{equation*}

Then use Equation (8.2.9) to substitute for \(ed\) and simplify.

\begin{equation*} T^2= \frac{4\pi^2 a^2 b^2}{\frac{GMb^2}{a}} = \frac{4\pi^2}{GM} a^3 \end{equation*}

This is the desired result. The proportionality constant \(\alpha = \frac{4\pi^2}{GM}\) involves only \(M\text{,}\) the mass of the central object, and other constants.
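As a worked example of the final formula, the sketch below evaluates \(T = 2\pi \sqrt{a^3/GM}\) for two orbits around the Sun. The numerical values are not part of the derivation: I am assuming the standard gravitational parameter of the Sun, the astronomical unit in metres, and an approximate semi-major axis of \(1.523679\) astronomical units for Mars. The formula also treats each planet's mass as negligible, as assumed throughout this section.

```python
import math

GM_SUN = 1.32712440018e20   # m^3 / s^2, assumed value of G*M for the Sun
AU = 1.495978707e11         # m, astronomical unit
SECONDS_PER_DAY = 86400.0

def period(a, gm=GM_SUN):
    """Orbital period in seconds from T^2 = (4 pi^2 / GM) a^3."""
    return 2.0 * math.pi * math.sqrt(a**3 / gm)

a_earth = 1.0 * AU
a_mars = 1.523679 * AU      # approximate semi-major axis (assumption)

earth_days = period(a_earth) / SECONDS_PER_DAY
mars_days = period(a_mars) / SECONDS_PER_DAY

# The ratio T^2 / a^3 should be the same constant alpha for both orbits.
alpha_earth = period(a_earth)**2 / a_earth**3
alpha_mars = period(a_mars)**2 / a_mars**3

print(earth_days, mars_days)
print(alpha_earth, alpha_mars, 4.0 * math.pi**2 / GM_SUN)
```

The first period should come out close to one year and the second close to the observed Martian year of roughly 687 days, while both ratios agree with \(\frac{4\pi^2}{GM}\text{.}\)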