Derivations of the Lorentz transformations
In the fundamental branches of modern physics, namely general relativity and its widely applicable subset special relativity, as well as relativistic quantum mechanics and relativistic quantum field theory, the Lorentz transformation is the transformation rule under which all four-vectors and tensors containing physical quantities transform.
The prime examples of such four-vectors are the four-position and four-momentum of a particle, and for fields the electromagnetic tensor and stress–energy tensor. The fact that these objects transform according to the Lorentz transformation is what mathematically defines them as vectors and tensors.
Given the components of the four-vectors or tensors in some frame, the "transformation rule" allows one to determine the altered components of the same four-vectors or tensors in another frame, which could be boosted or accelerated relative to the original frame. A "boost" should not be conflated with a spatial translation; rather, it is characterized by the relative velocity between frames. The transformation rule itself depends on the relative motion of the frames. In the simplest case of two inertial frames, only the relative velocity between them enters the transformation rule. For rotating reference frames or general non-inertial reference frames, more parameters are needed, including the relative velocity (magnitude and direction), the rotation axis and the angle turned through. There are many ways to derive the Lorentz transformations, utilizing a variety of mathematical tools ranging from elementary algebra and hyperbolic functions to linear algebra and group theory.
This article provides a few of the easier ones to follow in the context of special relativity, for the simplest case of a Lorentz boost in standard configuration, i.e. two inertial frames moving relative to each other at constant (uniform) relative velocity less than the speed of light, and using Cartesian coordinates so that the x and x′ axes are collinear.
The usual treatment (e.g., Einstein's original work) is based on the invariance of the speed of light. However, this is not necessarily the starting point: indeed (as is explained, for example, in the second volume of the Course of Theoretical Physics by Landau and Lifshitz), what is really at stake is the locality of interactions: one supposes that the influence that one particle, say, exerts on another cannot be transmitted instantaneously. Hence, there exists a theoretical maximal speed of information transmission which must be invariant, and it turns out that this speed coincides with the speed of light in vacuum. The need for locality in physical theories was already noted by Newton (see Koestler's The Sleepwalkers), who considered the notion of an action at a distance "philosophically absurd" and believed that gravity must be transmitted by an agent (such as an interstellar aether) which obeys certain physical laws.
Michelson and Morley in 1887 designed an experiment, employing an interferometer and a half-silvered mirror, that was accurate enough to detect aether flow. The mirror system reflected the light back into the interferometer. If there were an aether drift, it would produce a phase shift and a change in the interference that would be detected. However, no phase shift was ever found. The negative outcome of the Michelson–Morley experiment left the concept of aether (or its drift) undermined. There was consequent perplexity as to why light evidently behaves like a wave, without any detectable medium through which wave activity might propagate.
In a 1964 paper, Erik Christopher Zeeman showed that the causality preserving property, a condition that is weaker in a mathematical sense than the invariance of the speed of light, is enough to assure that the coordinate transformations are the Lorentz transformations.
From physical principles
The problem is usually restricted to two dimensions by taking the velocity along the x axis, so that the y and z coordinates do not intervene. The following is similar to Einstein's derivation. As in the Galilean transformation, the Lorentz transformation is linear, since the relative velocity of the reference frames is constant as a vector; otherwise, inertial forces would appear. Such frames are called inertial or Galilean reference frames. According to relativity, no Galilean reference frame is privileged. Another condition is that the speed of light must be independent of the reference frame, and in practice of the velocity of the light source.
Spherical wavefronts of light
Consider two inertial frames of reference O and O′, assuming O to be at rest while O′ is moving with a velocity v with respect to O in the positive x-direction. The origins of O and O′ initially coincide with each other. A light signal is emitted from the common origin and travels as a spherical wavefront. Consider a point P on the spherical wavefront at distances r and r′ from the origins of O and O′ respectively. According to the second postulate of the special theory of relativity the speed of light is the same in both frames, so r and r′ will be different only if t and t′ are different:
$$r = ct, \qquad r' = ct'.$$
The equation of the spherical wavefront in frame O will be
$$x^2 + y^2 + z^2 = c^2 t^2.$$
Similarly, the equation of the spherical wavefront in frame O′ will be
$$x'^2 + y'^2 + z'^2 = c^2 t'^2.$$
The origin O′ is moving along the x-axis. Therefore,
$$y' = y, \qquad z' = z.$$
The relation between x and x′ should be linear and such that it reduces to the Galilean transformation for v ≪ c. Therefore, such a relation can be written in the form:
$$x' = \gamma (x - vt),$$
where γ is to be determined. At this point γ is not necessarily a constant independent of the coordinates t, x, t′, x′, but it is required to reduce to 1 for v ≪ c.
The inverse is:
$$x = \gamma (x' + vt').$$
The above two equations give the relation between t and t′ as:
$$x = \gamma\left[\gamma (x - vt) + vt'\right],$$
or
$$t' = \gamma t + \frac{(1 - \gamma^2)\,x}{\gamma v}.$$
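Substituting the expressions for x′, y′, z′ and t′ in terms of x, y, z and t into the spherical wavefront equation of the O′ frame,
$$\gamma^2 (x - vt)^2 + y^2 + z^2 = c^2\left[\gamma t + \frac{(1 - \gamma^2)\,x}{\gamma v}\right]^2,$$
and comparing the coefficients of t² in this equation with those in the spherical wavefront equation of the O frame produces
$$c^2\gamma^2 - v^2\gamma^2 = c^2,$$
or, choosing the positive root to ensure that the x and x′ axes and the time axes point in the same direction,
$$\gamma = \frac{1}{\sqrt{1 - v^2/c^2}}.$$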
The Lorentz transformation is not the only transformation leaving invariant the shape of spherical waves, as there is a wider set of spherical wave transformations in the context of conformal geometry, which preserve the expression $\delta x^2 + \delta y^2 + \delta z^2 - c^2\,\delta t^2$ up to a scale factor λ. However, scale-changing conformal transformations cannot be used to symmetrically describe all laws of nature including mechanics, whereas the Lorentz transformations (the only ones implying λ = 1) represent a symmetry of all laws of nature and reduce to Galilean transformations at v ≪ c.
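The wavefront argument can be checked numerically. The following is a minimal sketch, assuming units in which c = 1 and the standard-configuration boost x′ = γ(x − vt), t′ = γ(t − vx/c²); the function names are illustrative and do not come from the article.

```python
import math

def gamma(v, c=1.0):
    """Lorentz factor, taking the positive root as in the derivation."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def boost(t, x, y, z, v, c=1.0):
    """Boost along x in standard configuration; returns (t', x', y', z')."""
    g = gamma(v, c)
    return g * (t - v * x / c**2), g * (x - v * t), y, z

def wavefront_residual(t, x, y, z, c=1.0):
    """x^2 + y^2 + z^2 - (ct)^2, which vanishes on the light wavefront."""
    return x**2 + y**2 + z**2 - (c * t) ** 2

# An event on the wavefront in frame O: r = 2 at t = 2 (with c = 1).
event = (2.0, 1.2, 1.6, 0.0)
boosted = boost(*event, v=0.6)

print(wavefront_residual(*event))    # approximately zero
print(wavefront_residual(*boosted))  # approximately zero in O' as well
```

Both residuals vanish up to rounding, which is the statement that the same spherical wavefront equation holds in O and O′.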
Galilean and Einstein's relativity
- Galilean reference frames
In classical kinematics, the total displacement x in the R frame is the sum of the relative displacement x′ in frame R′ and of the distance between the two origins x − x′. If v is the relative velocity of R′ relative to R, the transformation is: x = x′ + vt, or x′ = x − vt. This relationship is linear for a constant v, that is when R and R′ are Galilean frames of reference.
In Einstein's relativity, the main difference from Galilean relativity is that space and time coordinates are intertwined, and in different inertial frames t ≠ t′.
Since space is assumed to be homogeneous, the transformation must be linear. The most general linear relationship is obtained with four constant coefficients, A, B, γ, and b:
$$x' = \gamma x + bt, \qquad t' = Ax + Bt.$$
The Lorentz transformation becomes the Galilean transformation when γ = B = 1, b = −v and A = 0.
An object at rest in the R′ frame at position x′ = 0 moves with constant velocity v in the R frame. Hence the transformation must yield x′ = 0 if x = vt. Therefore, b = −γv and the first equation is written as
$$x' = \gamma (x - vt).$$
- Principle of relativity
According to the principle of relativity, there is no privileged Galilean frame of reference: therefore the inverse transformation for the position from frame R′ to frame R should have the same form as the original but with the velocity in the opposite direction, in other words with v replaced by −v:
$$x = \gamma (x' + vt').$$
- The speed of light is constant
Since the speed of light is the same in all frames of reference, for the case of a light signal, the transformation must guarantee that t = x/c and t′ = x′/c.
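Substituting for t and t′ in the preceding equations gives:
$$x' = \gamma\left(1 - \frac{v}{c}\right)x, \qquad x = \gamma\left(1 + \frac{v}{c}\right)x'.$$
Multiplying these two equations together gives
$$xx' = \gamma^2\left(1 - \frac{v^2}{c^2}\right)xx'.$$
At any time after t = t′ = 0, xx′ is not zero, so dividing both sides of the equation by xx′ results in
$$\gamma = \frac{1}{\sqrt{1 - v^2/c^2}},$$
which is called the "Lorentz factor".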
When the transformation equations are required to satisfy the light signal equations in the form x = ct and x′ = ct′, by substituting the x and x'-values, the same technique produces the same expression for the Lorentz factor.
- Transformation of time
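The transformation equation for time can be easily obtained by considering the special case of a light signal, satisfying
$$x = ct, \qquad x' = ct'.$$
Substituting term by term into the earlier obtained equation for the spatial coordinate,
$$x' = \gamma (x - vt),$$
gives
$$ct' = \gamma\left(ct - \frac{v}{c}\,x\right),$$
so that
$$t' = \gamma\left(t - \frac{v}{c^2}\,x\right),$$
which determines the transformation coefficients A and B as
$$A = -\frac{\gamma v}{c^2}, \qquad B = \gamma.$$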
So A and B are the unique coefficients necessary to preserve the constancy of the speed of light in the primed system of coordinates.
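As a rough numerical illustration of this section, the sketch below builds the four coefficients of the general linear form x′ = γx + bt, t′ = Ax + Bt from the values derived above (b = −γv, A = −γv/c², B = γ) and checks that a light signal in R is still a light signal in R′. It assumes units with c = 1; the function names are hypothetical.

```python
import math

def coefficients(v, c=1.0):
    """Return (gamma, b, A, B) for x' = gamma*x + b*t and t' = A*x + B*t."""
    g = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return g, -g * v, -g * v / c**2, g

def transform(t, x, v, c=1.0):
    """Apply the linear transformation; returns (t', x')."""
    g, b, A, B = coefficients(v, c)
    return A * x + B * t, g * x + b * t

# A light signal moving in the +x direction: x = c*t.
t_prime, x_prime = transform(3.0, 3.0, v=0.5)
print(x_prime / t_prime)  # ratio x'/t', expected to equal c

# A light signal moving in the -x direction: x = -c*t.
t_prime, x_prime = transform(3.0, -3.0, v=0.5)
print(x_prime / t_prime)  # expected to equal -c
```

For v much smaller than c the coefficients approach (1, −v, 0, 1), the Galilean values quoted earlier in the section.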
Einstein's popular derivation
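In his popular book Einstein derived the Lorentz transformation by arguing that there must be two non-zero coupling constants λ and μ such that
$$x' - ct' = \lambda (x - ct), \qquad x' + ct' = \mu (x + ct),$$
which correspond to light traveling along the positive and negative x-axis, respectively. For light x = ct if and only if x′ = ct′. Adding and subtracting the two equations and defining
$$\gamma = \frac{\lambda + \mu}{2}, \qquad b = \frac{\lambda - \mu}{2},$$
gives
$$x' = \gamma x - bct, \qquad ct' = \gamma ct - bx.$$
Substituting x′ = 0 corresponding to x = vt and noting that the relative velocity is v = bc/γ, this gives
$$x' = \gamma (x - vt), \qquad t' = \gamma\left(t - \frac{v}{c^2}\,x\right).$$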
The constant γ can be evaluated as was previously shown above.
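A small sketch of how the coupling constants relate to the final result. It assumes the explicit values λ = γ(1 + v/c) and μ = γ(1 − v/c), which are not stated in the text but follow from inserting the finished Lorentz transformation into x′ − ct′ and x′ + ct′; the function names are illustrative and c = 1 is used.

```python
import math

def coupling_constants(v, c=1.0):
    """Assumed values of (lambda, mu) for a boost with velocity v."""
    beta = v / c
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return g * (1.0 + beta), g * (1.0 - beta)

def gamma_and_b(lam, mu):
    """Half-sum and half-difference, as defined in the derivation."""
    return (lam + mu) / 2.0, (lam - mu) / 2.0

lam, mu = coupling_constants(0.6)
g, b = gamma_and_b(lam, mu)
recovered_v = b * 1.0 / g   # v = b*c/gamma with c = 1
print(lam, mu, g, b, recovered_v)
```

The recovered velocity agrees with the input velocity, and the product λμ equals 1, which is one way of seeing that the two constants are not independent.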
The Lorentz transformations can also be derived by simple application of the special relativity postulates and using hyperbolic identities. It is sufficient to derive the result for a boost in one direction, since for an arbitrary direction the decomposition of the position vector into parallel and perpendicular components can be done after, and generalizations therefrom follow, as outlined above.
- Relativity postulates
Start from the equations of the spherical wavefront of a light pulse, centred at the origin:
$$(ct)^2 - (x^2 + y^2 + z^2) = (ct')^2 - (x'^2 + y'^2 + z'^2) = 0,$$
which take the same form in both frames because of the special relativity postulates. Next, consider relative motion along the x-axes of each frame, in standard configuration above, so that y = y′, z = z′, which simplifies to
$$(ct)^2 - x^2 = (ct')^2 - x'^2.$$
Now assume that the transformations take the linear form:
$$x' = Ax + Bct, \qquad ct' = Cx + Dct,$$
where A, B, C, D are to be found. If they were non-linear, they would not take the same form for all observers, since fictitious forces (hence accelerations) would occur in one frame even if the velocity was constant in another, which is inconsistent with inertial frame transformations.
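Substituting into the previous result:
$$(ct)^2 - x^2 = (Cx + Dct)^2 - (Ax + Bct)^2,$$
and comparing coefficients of x², t², xt:
$$A^2 - C^2 = 1, \qquad D^2 - B^2 = 1, \qquad AB = CD.$$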
- Hyperbolic rotation
The formulae resemble the hyperbolic identity
$$\cosh^2\phi - \sinh^2\phi = 1.$$
Introducing the rapidity parameter ϕ as a parametric hyperbolic angle allows the self-consistent identifications
$$A = D = \cosh\phi, \qquad B = C = -\sinh\phi,$$
where the signs are chosen so that x and t increase. The hyperbolic transformations have been solved for:
$$x' = x\cosh\phi - ct\sinh\phi, \qquad ct' = -x\sinh\phi + ct\cosh\phi.$$
If the signs were chosen differently the position and time coordinates would need to be replaced by −x and/or −t so that x and t increase not decrease.
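The hyperbolic form can be explored numerically. The sketch below assumes the identification A = D = cosh ϕ, B = C = −sinh ϕ and checks two properties: the quantity (ct)² − x² is unchanged, and two successive transformations with rapidities ϕ₁ and ϕ₂ act like a single one with rapidity ϕ₁ + ϕ₂. The function names are illustrative.

```python
import math

def hyperbolic_rotation(ct, x, phi):
    """Apply x' = x cosh(phi) - ct sinh(phi), ct' = -x sinh(phi) + ct cosh(phi)."""
    ct_new = -x * math.sinh(phi) + ct * math.cosh(phi)
    x_new = x * math.cosh(phi) - ct * math.sinh(phi)
    return ct_new, x_new

def interval(ct, x):
    """The invariant combination (ct)^2 - x^2."""
    return ct**2 - x**2

ct, x = 5.0, 3.0
ct1, x1 = hyperbolic_rotation(ct, x, 0.7)
print(interval(ct, x), interval(ct1, x1))  # equal up to rounding

# Two steps (0.3 then 0.4) compared with one step of 0.7.
two_step = hyperbolic_rotation(*hyperbolic_rotation(ct, x, 0.3), 0.4)
print(two_step, (ct1, x1))
```

The additivity of ϕ under composition is what makes the rapidity a convenient parameter, in contrast to the velocity, which does not simply add.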
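To find what ϕ actually is, from the standard configuration the origin of the primed frame x′ = 0 is measured in the unprimed frame to be x = vt (or the equivalent and opposite way round; the origin of the unprimed frame is x = 0 and in the primed frame it is at x′ = −vt′):
$$0 = vt\cosh\phi - ct\sinh\phi \quad\Rightarrow\quad \tanh\phi = \frac{v}{c} = \beta,$$
and manipulation of hyperbolic identities leads to
$$\cosh\phi = \gamma = \frac{1}{\sqrt{1 - \beta^2}}, \qquad \sinh\phi = \beta\gamma,$$
so the transformations are also:
$$x' = \gamma (x - vt), \qquad ct' = \gamma (ct - \beta x).$$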
From group postulates
Following is a classical derivation based on group postulates and isotropy of space.
- Coordinate transformations as a group
The coordinate transformations between inertial frames form a group (called the proper Lorentz group) with the group operation being the composition of transformations (performing one transformation after another). Indeed the four group axioms are satisfied:
- Closure: the composition of two transformations is a transformation: given a transformation from the inertial frame K to the inertial frame K′ (denoted [K → K′]) and then one from K′ to the inertial frame K′′ (denoted [K′ → K′′]), there exists a transformation [K → K′][K′ → K′′] directly from K to K′′.
- Associativity: the result of ([K → K′][K′ → K′′])[K′′ → K′′′] and [K → K′]([K′ → K′′][K′′ → K′′′]) is the same, K → K′′′.
- Identity element: there is an identity element, a transformation K → K.
- Inverse element: for any transformation K → K′ there exists an inverse transformation K′ → K.
- Transformation matrices consistent with group axioms
Let us consider two inertial frames, K and K′, the latter moving with velocity v with respect to the former. By rotations and shifts we can choose the x and x′ axes along the relative velocity vector and also that the events (t, x) = (0, 0) and (t′, x′) = (0, 0) coincide. Since the velocity boost is along the x (and x′) axes nothing happens to the perpendicular coordinates and we can just omit them for brevity. Now since the transformation we are looking for connects two inertial frames, it has to transform a linear motion in (t, x) into a linear motion in (t′, x′) coordinates. Therefore it must be a linear transformation. The general form of a linear transformation is
$$\begin{bmatrix} t' \\ x' \end{bmatrix} = \begin{bmatrix} \gamma & \delta \\ \beta & \alpha \end{bmatrix}\begin{bmatrix} t \\ x \end{bmatrix},$$
where α, β, γ, and δ are some yet unknown functions of the relative velocity v.
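Let us now consider the motion of the origin of the frame K′. In the K′ frame it has coordinates (t′, x′ = 0), while in the K frame it has coordinates (t, x = vt). These two points are connected by the transformation
$$\begin{bmatrix} t' \\ 0 \end{bmatrix} = \begin{bmatrix} \gamma & \delta \\ \beta & \alpha \end{bmatrix}\begin{bmatrix} t \\ vt \end{bmatrix},$$
from which we get
$$\beta = -v\alpha.$$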
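Analogously, considering the motion of the origin of the frame K, which has coordinates (t, x = 0) in K and (t′, x′ = −vt′) in K′, we get
$$\begin{bmatrix} t' \\ -vt' \end{bmatrix} = \begin{bmatrix} \gamma & \delta \\ \beta & \alpha \end{bmatrix}\begin{bmatrix} t \\ 0 \end{bmatrix},$$
from which we get
$$\beta = -v\gamma.$$
Combining these two gives α = γ and the transformation matrix has simplified,
$$\begin{bmatrix} t' \\ x' \end{bmatrix} = \begin{bmatrix} \gamma & \delta \\ -v\gamma & \gamma \end{bmatrix}\begin{bmatrix} t \\ x \end{bmatrix}.$$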
Now let us consider the group postulate inverse element. There are two ways we can go from the K′ coordinate system to the K coordinate system. The first is to apply the inverse of the transform matrix to the K′ coordinates:
$$\begin{bmatrix} t \\ x \end{bmatrix} = \frac{1}{\gamma^2 + v\delta\gamma}\begin{bmatrix} \gamma & -\delta \\ v\gamma & \gamma \end{bmatrix}\begin{bmatrix} t' \\ x' \end{bmatrix}.$$
The second is, considering that the K′ coordinate system is moving at a velocity v relative to the K coordinate system, the K coordinate system must be moving at a velocity −v relative to the K′ coordinate system. Replacing v with −v in the transformation matrix gives:
$$\begin{bmatrix} t \\ x \end{bmatrix} = \begin{bmatrix} \gamma(-v) & \delta(-v) \\ v\gamma(-v) & \gamma(-v) \end{bmatrix}\begin{bmatrix} t' \\ x' \end{bmatrix}.$$
Now the function γ cannot depend upon the direction of v because it is apparently the factor which defines the relativistic contraction and time dilation. These two (in an isotropic world of ours) cannot depend upon the direction of v. Thus, γ(−v) = γ(v) and comparing the two matrices, we get
$$\gamma^2 + v\delta\gamma = 1.$$
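According to the closure group postulate a composition of two coordinate transformations is also a coordinate transformation, thus the product of two of our matrices should also be a matrix of the same form. Transforming K to K′ and from K′ to K′′, with v′ denoting the velocity of K′′ relative to K′, gives the following transformation matrix to go from K to K′′:
$$\begin{bmatrix} t'' \\ x'' \end{bmatrix} = \begin{bmatrix} \gamma(v') & \delta(v') \\ -v'\gamma(v') & \gamma(v') \end{bmatrix}\begin{bmatrix} \gamma(v) & \delta(v) \\ -v\gamma(v) & \gamma(v) \end{bmatrix}\begin{bmatrix} t \\ x \end{bmatrix} = \begin{bmatrix} \gamma(v')\gamma(v) - v\delta(v')\gamma(v) & \gamma(v')\delta(v) + \delta(v')\gamma(v) \\ -(v' + v)\gamma(v')\gamma(v) & -v'\gamma(v')\delta(v) + \gamma(v')\gamma(v) \end{bmatrix}\begin{bmatrix} t \\ x \end{bmatrix}.$$
In the original transform matrix, the main diagonal elements are both equal to γ, hence, for the combined transform matrix above to be of the same form as the original transform matrix, the main diagonal elements must also be equal. Equating these elements and rearranging gives:
$$\frac{\delta(v)}{v\gamma(v)} = \frac{\delta(v')}{v'\gamma(v')}.$$
The denominator will be nonzero for nonzero v, because γ(v) is always nonzero, as follows from
$$\gamma^2 + v\delta\gamma = 1.$$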
If v = 0 we have the identity matrix which coincides with putting v = 0 in the matrix we get at the end of this derivation for the other values of v, making the final matrix valid for all nonnegative v.
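For nonzero v, this combination of functions must be a universal constant, one and the same for all inertial frames. Define this constant as δ(v)/(vγ(v)) = κ, where κ has the dimension of 1/v². Solving
$$1 = \gamma^2 + v\delta\gamma = \gamma^2 (1 + \kappa v^2),$$
we finally get
$$\gamma = \frac{1}{\sqrt{1 + \kappa v^2}},$$
and thus the transformation matrix, consistent with the group axioms, is given by
$$\begin{bmatrix} t' \\ x' \end{bmatrix} = \frac{1}{\sqrt{1 + \kappa v^2}}\begin{bmatrix} 1 & \kappa v \\ -v & 1 \end{bmatrix}\begin{bmatrix} t \\ x \end{bmatrix}.$$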
If κ > 0, then there would be transformations (with κv² ≫ 1) which transform time into a spatial coordinate and vice versa. We exclude this on physical grounds, because time can only run in the positive direction. Thus two types of transformation matrices are consistent with group postulates:
- with the universal constant κ = 0, and
- with κ < 0.
- Galilean transformations
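If κ = 0 then we get the Galilean-Newtonian kinematics with the Galilean transformation,
$$\begin{bmatrix} t' \\ x' \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -v & 1 \end{bmatrix}\begin{bmatrix} t \\ x \end{bmatrix},$$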
where time is absolute, t′ = t, and the relative velocity v of two inertial frames is not limited.
- Lorentz transformations
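If κ < 0, we set $c = 1/\sqrt{-\kappa}$, which becomes the invariant speed, the speed of light in vacuum. This yields κ = −1/c², and thus we get special relativity with the Lorentz transformation
$$\begin{bmatrix} t' \\ x' \end{bmatrix} = \frac{1}{\sqrt{1 - v^2/c^2}}\begin{bmatrix} 1 & -v/c^2 \\ -v & 1 \end{bmatrix}\begin{bmatrix} t \\ x \end{bmatrix},$$
where the speed of light is a finite universal constant determining the highest possible relative velocity between inertial frames.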
If v ≪ c the Galilean transformation is a good approximation to the Lorentz transformation.
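The role of the constant κ can be illustrated with a short sketch. It assumes the matrix form γ[[1, κv], [−v, 1]] acting on the column (t, x), with γ = 1/√(1 + κv²), and composes two such transformations; the function names are illustrative, and κ = −1 corresponds to units with c = 1.

```python
import math

def transform_matrix(v, kappa):
    """2x2 matrix acting on (t, x) for relative velocity v and constant kappa."""
    g = 1.0 / math.sqrt(1.0 + kappa * v**2)
    return [[g, g * kappa * v], [-g * v, g]]

def compose(second, first):
    """Matrix product second @ first (apply `first`, then `second`)."""
    return [[sum(second[i][k] * first[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]

def velocity_of(m):
    """Velocity encoded in a matrix: x' = 0 along x = v*t."""
    return -m[1][0] / m[1][1]

# kappa = 0: Galilean case, velocities simply add.
galilean = compose(transform_matrix(4.0, 0.0), transform_matrix(3.0, 0.0))
print(velocity_of(galilean))

# kappa = -1 (c = 1): Lorentz case, 0.5 combined with 0.5 stays below 1.
lorentz = compose(transform_matrix(0.5, -1.0), transform_matrix(0.5, -1.0))
print(velocity_of(lorentz), lorentz[0][0], lorentz[1][1])
```

In both cases the diagonal entries of the product are equal, as closure requires, and the combined velocity follows (v + v′)/(1 − κvv′), which reduces to plain addition for κ = 0.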
Only experiment can answer the question which of the two possibilities, κ = 0 or κ < 0, is realised in our world. The experiments measuring the speed of light, first performed by a Danish physicist Ole Rømer, show that it is finite, and the Michelson–Morley experiment showed that it is an absolute speed, and thus that κ < 0.
Howard Percy Robertson and others showed that the Lorentz transformation can also be derived empirically. In order to achieve this, it is necessary to write down coordinate transformations that include experimentally testable parameters. For instance, let there be given a single "preferred" inertial frame (t, x, y, z) in which the speed of light is constant, isotropic, and independent of the velocity of the source. It is also assumed that Einstein synchronization and synchronization by slow clock transport are equivalent in this frame. Then assume another frame (t′, x′, y′, z′) in relative motion, in which clocks and rods have the same internal constitution as in the preferred frame. The following relations, however, are left undefined:
- a(v): differences in time measurements,
- b(v): differences in measured longitudinal lengths,
- d(v): differences in measured transverse lengths,
- ε(v): depends on the clock synchronization procedure in the moving frame,
then the transformation formulas (assumed to be linear) between those frames are given by:
$$t' = a\,t + \varepsilon\,x', \qquad x' = b\,(x - vt), \qquad y' = d\,y, \qquad z' = d\,z.$$
ε depends on the synchronization convention and is not determined experimentally; it obtains the value −v/c² by using Einstein synchronization in both frames. The ratio between b and d is determined by the Michelson–Morley experiment, the ratio between a and b is determined by the Kennedy–Thorndike experiment, and a alone is determined by the Ives–Stilwell experiment. In this way, they have been determined with great precision to 1/a = b = γ and d = 1, which converts the above transformation into the Lorentz transformation.
- Gyrovector space
- Proper time
- Relativistic metric
- Noether's theorem
- Lorentz group
- Poincaré group
- Zeeman, Erik Christopher (1964), "Causality implies the Lorentz group", Journal of Mathematical Physics 5 (4): 490–493, Bibcode:1964JMP.....5..490Z, doi:10.1063/1.1704140
- University Physics – With Modern Physics (12th Edition), H.D. Young, R.A. Freedman (Original edition), Addison-Wesley (Pearson International), 1st Edition: 1949, 12th Edition: 2008, ISBN (10-) 0-321-50130-6, ISBN (13-) 978-0-321-50130-1
- Einstein, Albert (1916). "Relativity: The Special and General Theory" (PDF). Retrieved 2012-01-23.
- Stauffer, Dietrich; Stanley, Harry Eugene (1995). From Newton to Mandelbrot: A Primer in Theoretical Physics (2nd enlarged ed.). Springer-Verlag. p. 80,81. ISBN 978-3-540-59191-7.
- Born, Max (2012). Einstein's Theory of Relativity (revised ed.). Courier Dover Publications. pp. 236–237. ISBN 0-486-14212-4.
- Gupta, S. K. (2010). Engineering Physics: Vol. 1 (18th ed.). Krishna Prakashan Media. pp. 12–13. ISBN 81-8283-098-2.
- Relativity DeMystified, D. McMahon, Mc Graw Hill (USA), 2006, ISBN 0-07-145545-0
- An Introduction to Mechanics, D. Kleppner, R.J. Kolenkow, Cambridge University Press, 2010, ISBN 978-0-521-19821-9
- Robertson, H. P. (1949). "Postulate versus Observation in the Special Theory of Relativity". Reviews of Modern Physics 21 (3): 378–382. Bibcode:1949RvMP...21..378R. doi:10.1103/RevModPhys.21.378.
- Mansouri R., Sexl R.U. (1977). "A test theory of special relativity. I: Simultaneity and clock synchronization". General. Relat. Gravit. 8 (7): 497–513. Bibcode:1977GReGr...8..497M. doi:10.1007/BF00762634.