The mathematical language of modern physics, and the foundation underneath field theory.
Special relativity is covered conceptually in the Modern Physics reference. What wasn’t developed there is the mathematical structure that modern physics actually uses. Four-vectors, tensors, and index notation aren’t just convenient bookkeeping; they’re what make the statement “the laws of physics are the same in every inertial frame” into something you can write down and work with.
This document builds that apparatus properly. By the end, you should be comfortable writing Maxwell’s equations in the form $\partial_\mu F^{\mu\nu} = J^\nu$, understand why that notation is powerful, and be able to manipulate tensor expressions fluently enough to read a field theory textbook.
Table of Contents
- Why Tensors?
- Four-Vectors and Minkowski Space
- Index Notation and the Einstein Convention
- The Minkowski Metric
- Lorentz Transformations as Matrices
- Tensors: General Definition
- Tensor Operations
- The Levi-Civita Symbol and Duality
- Covariant Formulation of Mechanics
- Maxwell’s Equations in Covariant Form
- The Stress-Energy Tensor
- Worked Examples
- Appendix: Conventions and Identity Reference
1. Why Tensors?
Consider Newton’s second law $\mathbf{F} = m\mathbf{a}$. This is a vector equation: both sides are vectors in 3D space. The reason we can write it this compactly is that vectors transform in a specific way under rotations: if you rotate your coordinate axes, both sides transform identically, so the equation looks the same in every rotated frame. A statement about a single component, such as $F_x = 0$, would only hold in a particular frame; the vector equation holds in all of them.
This is what covariance means: an equation is covariant under a group of transformations if both sides transform the same way, so the equation’s form is preserved.
Special relativity demands covariance under Lorentz transformations (rotations plus boosts). Objects that transform appropriately under Lorentz transformations are called tensors. An equation written with matching tensor indices on both sides automatically holds in every inertial frame; no re-derivation needed.
The Practical Payoff
Maxwell’s equations in a frame-dependent form take four equations plus some definitions. In covariant tensor form:
$$\partial_\mu F^{\mu\nu} = J^\nu, \qquad \partial_{[\mu} F_{\nu\rho]} = 0$$
Two equations. Manifestly the same in every inertial frame. This isn’t just notational compression; it’s a revelation of the geometric structure underlying electromagnetism.
Every modern theory of physics (QED, QCD, general relativity, the Standard Model) is written in tensor language. Getting fluent with it is non-negotiable.
2. Four-Vectors and Minkowski Space
Spacetime as a 4D Manifold
In special relativity, space and time are unified into a four-dimensional space called Minkowski space, often denoted $\mathbb{R}^{1,3}$ or $\mathbb{M}^4$. A point in Minkowski space is an event: a place at a time.
Four-Vectors
A four-vector has four components: one time-like and three space-like. Conventionally the components are labeled by Greek indices $\mu = 0, 1, 2, 3$, with $\mu = 0$ for time. In Cartesian spatial coordinates:
$$A^\mu = (A^0, A^1, A^2, A^3) = (A^0, \mathbf{A})$$
The contravariant components $A^\mu$ carry an upper index. We’ll shortly introduce covariant components $A_\mu$ with a lower index; they carry the same physical information, differently packaged.
The Position Four-Vector
$$x^\mu = (ct, x, y, z)$$
Note the factor of $c$; this gives all components dimensions of length and makes the metric (below) dimensionless. Most particle physics texts use natural units where $c = 1$ and drop the factor. We’ll use $c = 1$ throughout unless reinstating it matters.
So:
$$x^\mu = (t, x, y, z) = (t, \mathbf{x})$$
Transformation Between Frames
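Under a Lorentz boost along the $x$-axis with velocity $v$, the components of any four-vector transform as:
$$A'^0 = \gamma\,(A^0 - \beta A^1), \qquad A'^1 = \gamma\,(A^1 - \beta A^0), \qquad A'^2 = A^2, \qquad A'^3 = A^3$$
where $\beta = v/c$ and $\gamma = 1/\sqrt{1 - \beta^2}$. Anything that transforms this way (for boosts in the $x$-direction, plus the obvious generalizations for other directions and for rotations) is a four-vector.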
The Invariant
For any four-vector, the combination
$$(A^0)^2 - (A^1)^2 - (A^2)^2 - (A^3)^2 = (A^0)^2 - |\mathbf{A}|^2$$
is the same in every inertial frame. You can verify directly by applying the boost formulas. This is the Minkowski analog of $|\mathbf{A}|^2 = A_x^2 + A_y^2 + A_z^2$ in Euclidean space, but with a minus sign between time and space.
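The invariance claim is easy to check numerically. The following is a minimal sketch, assuming $c = 1$ and a boost along $x$; the function names `boost_x` and `invariant` and the sample components are illustrative choices, not part of the original text.

```python
import math

def boost_x(A, beta):
    """Components of four-vector A in a frame moving with velocity beta along x (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    A0, A1, A2, A3 = A
    return (gamma * (A0 - beta * A1), gamma * (A1 - beta * A0), A2, A3)

def invariant(A):
    """Time component squared minus the sum of squared spatial components."""
    A0, A1, A2, A3 = A
    return A0**2 - (A1**2 + A2**2 + A3**2)

A = (5.0, 1.0, 2.0, 3.0)       # arbitrary sample components
A_boosted = boost_x(A, 0.6)    # beta = 0.6 gives gamma = 1.25

# The individual components change, but the combination does not.
print(A_boosted)
print(invariant(A), invariant(A_boosted))
```

The two printed invariants agree up to floating-point rounding, while the time and $x$ components differ between frames.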
For the position four-vector $x^\mu$:
$$s^2 = t^2 - x^2 - y^2 - z^2$$
This is the spacetime interval. It is:
- Positive (timelike): events causally connectable, with proper time $\tau = \sqrt{s^2}$ between them
- Zero (lightlike or null): lightcone, connectable only by light
- Negative (spacelike): no causal connection possible
Geometry of Minkowski Space
Minkowski space has the geometric structure of a 4D space with the indefinite metric I’ll describe in section 4. The minus sign between time and space is what distinguishes it from Euclidean 4D space and is the source of every strange feature of special relativity. Time dilation, length contraction, and the relativity of simultaneity are all consequences of this single sign.
3. Index Notation and the Einstein Convention
Before we go further, we need to be fluent with the notation. Sloppiness here is the single biggest source of confusion in learning field theory.
Upper vs. Lower Indices
In Minkowski space, there are two types of indices:
- Upper (contravariant): $A^\mu$
- Lower (covariant): $A_\mu$
They’re related via the metric (section 4). The distinction matters because transformation laws differ.
The Einstein Summation Convention
Repeated indices (one upper, one lower) are implicitly summed:
$$A^\mu B_\mu \equiv \sum_{\mu=0}^{3} A^\mu B_\mu = A^0 B_0 + A^1 B_1 + A^2 B_2 + A^3 B_3$$
The summation sign is omitted. Any index that appears repeated in a term must appear once as an upper and once as a lower index. An expression like $A^\mu B^\mu$ (both upper) is ill-formed and a warning sign that something’s gone wrong.
Greek indices ($\mu, \nu, \rho, \ldots$) run over 0-3 (spacetime). Latin indices ($i, j, k, \ldots$) run over 1-3 (space only) by convention.
Free vs. Dummy Indices
A free index appears only once in a term; a dummy index is summed over.
$$A^\mu = T^{\mu\nu} B_\nu$$
Here $\nu$ is dummy (summed), $\mu$ is free (appears on both sides). Key rules:
- Free indices must match on both sides of any equation
- Dummy indices can be renamed freely ($A^\mu B_\mu = A^\nu B_\nu$) but can’t collide with existing indices
Partial Derivatives
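The shorthand for partial derivatives:
$$\partial_\mu \equiv \frac{\partial}{\partial x^\mu}$$
Note: the lower index on $\partial_\mu$ comes from the upper index on $x^\mu$. This is because derivatives naturally transform oppositely to coordinates. Similarly:
$$\partial^\mu \equiv \frac{\partial}{\partial x_\mu} = \eta^{\mu\nu}\partial_\nu$$
In components:
$$\partial_\mu = (\partial_t, \nabla), \qquad \partial^\mu = (\partial_t, -\nabla)$$
(With our metric convention $\eta_{\mu\nu} = \mathrm{diag}(+1, -1, -1, -1)$; more in a moment.)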
The d’Alembertian
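The contraction $\partial_\mu \partial^\mu$:
$$\Box \equiv \partial_\mu \partial^\mu = \partial_t^2 - \nabla^2$$
This is the d’Alembertian, the relativistic generalization of the Laplacian. It appears everywhere in field theory; for example, the Klein-Gordon equation is $(\Box + m^2)\phi = 0$.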
4. The Minkowski Metric
Signature Convention
The Minkowski metric is a 4×4 matrix that defines the geometry of spacetime. Two conventions are common:
- Particle physics: $\eta_{\mu\nu} = \mathrm{diag}(+1, -1, -1, -1)$, “mostly minus”
- General relativity: $\eta_{\mu\nu} = \mathrm{diag}(-1, +1, +1, +1)$, “mostly plus”
We’ll use the particle physics convention throughout (consistent with the Lagrangian mechanics doc and standard for field theory). Physical results don’t depend on the choice, but signs of intermediate expressions do.
Raising and Lowering Indices
The metric is used to convert between contravariant and covariant components:
$$A_\mu = \eta_{\mu\nu} A^\nu, \qquad A^\mu = \eta^{\mu\nu} A_\nu$$
In practice, for our diagonal metric, this just flips signs:
$$A_0 = A^0, \qquad A_i = -A^i$$
The time component is unchanged; the spatial components pick up a minus sign.
The Inner Product
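The Minkowski inner product of two four-vectors:
$$A \cdot B \equiv A^\mu B_\mu = \eta_{\mu\nu} A^\mu B^\nu = A^0 B^0 - \mathbf{A}\cdot\mathbf{B}$$
This is the Lorentz-invariant combination. For a four-vector with itself:
$$A^2 \equiv A^\mu A_\mu = (A^0)^2 - |\mathbf{A}|^2$$
If this is positive, the four-vector is timelike; zero, lightlike (null); negative, spacelike.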
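Lowering an index and taking the inner product are both just sums over the metric, which the following sketch writes out explicitly. It assumes the mostly-minus metric used here; the helper names `lower`, `dot`, and `classify` are hypothetical.

```python
# Mostly-minus Minkowski metric eta_{mu nu}
ETA = [[1, 0, 0, 0],
       [0, -1, 0, 0],
       [0, 0, -1, 0],
       [0, 0, 0, -1]]

def lower(A_up):
    """A_mu = eta_{mu nu} A^nu, with the implied sum over nu written out."""
    return [sum(ETA[mu][nu] * A_up[nu] for nu in range(4)) for mu in range(4)]

def dot(A_up, B_up):
    """Minkowski inner product A^mu B_mu of two contravariant vectors."""
    B_down = lower(B_up)
    return sum(A_up[mu] * B_down[mu] for mu in range(4))

def classify(A_up):
    """Label a four-vector by the sign of A^mu A_mu."""
    s = dot(A_up, A_up)
    if s > 0:
        return "timelike"
    if s < 0:
        return "spacelike"
    return "lightlike"

print(lower([2, 1, 0, 0]))       # spatial components change sign
print(classify([2, 1, 0, 0]), classify([1, 1, 0, 0]), classify([1, 2, 0, 0]))
```

Writing the sum as an explicit loop over `nu` mirrors the Einstein convention: the repeated index is the loop variable, the free index labels the output.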
Identity: the Metric as Its Own Inverse
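$$\eta^{\mu\rho}\eta_{\rho\nu} = \delta^\mu{}_\nu$$
where $\delta^\mu{}_\nu$ is the Kronecker delta (1 if indices match, 0 otherwise). Raising the index of the metric with itself gives the identity. This is consistent with the fact that raising and then lowering is the identity: $\eta_{\mu\rho}\eta^{\rho\nu}A_\nu = A_\mu$.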
Why the Minus Signs?
In Euclidean geometry, $|\mathbf{A}|^2$ is positive definite. In Minkowski geometry, $A^\mu A_\mu$ can have any sign. This indefinite signature is the mathematical face of causality: timelike separation can be “bigger” than spacelike, and the lightcone (the null surface) marks where cause and effect meet.
5. Lorentz Transformations as Matrices
A Lorentz transformation is a linear map on Minkowski space that preserves the metric.
Matrix Form
A four-vector transforms as:
$$A'^\mu = \Lambda^\mu{}_\nu A^\nu$$
where $\Lambda^\mu{}_\nu$ is the transformation matrix. Note the index positions: one upper (row index, output), one lower (column index, input).
Defining Property
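Lorentz transformations are exactly those linear maps that preserve the inner product:
$$\eta_{\mu\nu} A'^\mu B'^\nu = \eta_{\rho\sigma} A^\rho B^\sigma$$
Expanding $A'^\mu = \Lambda^\mu{}_\rho A^\rho$ and $B'^\nu = \Lambda^\nu{}_\sigma B^\sigma$ and demanding this equal $\eta_{\rho\sigma} A^\rho B^\sigma$ for all $A$ and $B$ gives:
$$\eta_{\mu\nu}\,\Lambda^\mu{}_\rho\,\Lambda^\nu{}_\sigma = \eta_{\rho\sigma}$$
Or in matrix form: $\Lambda^T \eta \Lambda = \eta$. This is the definition.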
Examples: Rotations
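A rotation of the spatial axes by angle $\theta$ about the $z$-axis:
$$\Lambda^\mu{}_\nu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & \sin\theta & 0 \\ 0 & -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
Time is untouched; the $(x, y)$ block is a rotation. Spatial rotations are special cases of Lorentz transformations.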
Examples: Boosts
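A boost along the $x$-axis with velocity $\beta$:
$$\Lambda^\mu{}_\nu = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
With $\gamma = 1/\sqrt{1 - \beta^2}$. Applied to $x^\mu = (t, x, y, z)$:
$$t' = \gamma\,(t - \beta x), \qquad x' = \gamma\,(x - \beta t), \qquad y' = y, \qquad z' = z$$
Exactly the Lorentz transformation from Modern Physics, now as a matrix multiplication.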
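The defining condition (transpose of the matrix, times the metric, times the matrix, equals the metric) can be tested directly on candidate matrices. The sketch below assumes the mostly-minus metric and the passive sign conventions for an $x$-boost and a $z$-rotation; `is_lorentz` and the other helpers are illustrative names.

```python
import math

ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

def matmul(X, Y):
    """Plain 4x4 matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(X):
    return [list(row) for row in zip(*X)]

def boost_x(beta):
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[g, -g * beta, 0, 0], [-g * beta, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]]

def is_lorentz(L, tol=1e-12):
    """True if L^T eta L equals eta within a numerical tolerance."""
    M = matmul(transpose(L), matmul(ETA, L))
    return all(abs(M[i][j] - ETA[i][j]) < tol for i in range(4) for j in range(4))

print(is_lorentz(boost_x(0.6)), is_lorentz(rotation_z(0.3)))
# Products of Lorentz transformations are again Lorentz transformations.
print(is_lorentz(matmul(boost_x(0.6), rotation_z(0.3))))
# A rescaling of time alone is linear but does not preserve the metric.
print(is_lorentz([[2, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]))
```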
Rapidity
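A useful parametrization: define rapidity $\phi$ by $\beta = \tanh\phi$. Then $\gamma = \cosh\phi$ and $\gamma\beta = \sinh\phi$, and a boost looks like:
$$\begin{pmatrix} t' \\ x' \end{pmatrix} = \begin{pmatrix} \cosh\phi & -\sinh\phi \\ -\sinh\phi & \cosh\phi \end{pmatrix} \begin{pmatrix} t \\ x \end{pmatrix}$$
Notice: this is structurally identical to a rotation, but with hyperbolic functions instead of trig. Boosts are “rotations in the time-space plane by an imaginary angle,” in a sense. Rapidities also add linearly for collinear boosts, unlike velocities.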
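The additivity of rapidity can be seen by multiplying two boost matrices. This is a small sketch restricted to the $(t, x)$ block, assuming $c = 1$; the sample velocities and the names `boost_2x2` and `matmul2` are illustrative.

```python
import math

def boost_2x2(phi):
    """(t, x) block of a boost along x, parametrized by rapidity phi."""
    return [[math.cosh(phi), -math.sinh(phi)],
            [-math.sinh(phi), math.cosh(phi)]]

def matmul2(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

beta1, beta2 = 0.6, 0.7
phi1, phi2 = math.atanh(beta1), math.atanh(beta2)

composed = matmul2(boost_2x2(phi2), boost_2x2(phi1))  # boost by beta1, then by beta2
direct = boost_2x2(phi1 + phi2)                       # one boost with the summed rapidity

# Velocities do not add linearly: 0.6 + 0.7 would exceed 1.
beta_total = math.tanh(phi1 + phi2)
print(beta_total, (beta1 + beta2) / (1 + beta1 * beta2))
```

The matrices `composed` and `direct` agree entry by entry, and `beta_total` reproduces the relativistic velocity-addition formula while staying below 1.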
The Lorentz Group
All Lorentz transformations form a group called $O(1,3)$, which is six-dimensional (3 rotations + 3 boosts). If we restrict to those that preserve time direction and handedness, we get the proper orthochronous Lorentz group $SO^+(1,3)$, still six-dimensional but now connected. Every element can be continuously deformed to the identity.
This group structure matters because the irreducible representations of the Lorentz group classify what kinds of fields can exist: scalars (trivial rep), vectors (defining rep), spinors (double-cover rep), and so on. Every particle in the Standard Model corresponds to a specific representation.
Transformation of Covariant Vectors
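Lower-index four-vectors transform with the inverse transpose:
$$A'_\mu = (\Lambda^{-1})^\nu{}_\mu A_\nu$$
where $(\Lambda^{-1})^\nu{}_\mu = \eta_{\mu\rho}\,\eta^{\nu\sigma}\,\Lambda^\rho{}_\sigma$. This is why raising/lowering indices matters: the two types of components transform differently, and only contractions that pair one upper index with one lower index yield invariants.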
6. Tensors: General Definition
Four-vectors are a special case. Tensors generalize them.
Definition by Transformation
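A tensor of type $(p, q)$ (rank $p + q$) has $p$ upper indices and $q$ lower indices, and transforms as:
$$T'^{\mu_1 \cdots \mu_p}{}_{\nu_1 \cdots \nu_q} = \Lambda^{\mu_1}{}_{\alpha_1} \cdots \Lambda^{\mu_p}{}_{\alpha_p}\,(\Lambda^{-1})^{\beta_1}{}_{\nu_1} \cdots (\Lambda^{-1})^{\beta_q}{}_{\nu_q}\; T^{\alpha_1 \cdots \alpha_p}{}_{\beta_1 \cdots \beta_q}$$
Each upper index transforms with $\Lambda$; each lower index with its inverse transpose. Looks horrific, but in practice you rarely need the full formula; you use the index structure to predict transformation behavior.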
Special Cases
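- Scalar (rank 0): invariant, $\phi' = \phi$. Example: the spacetime interval $s^2$, electric charge.
- Four-vector (rank 1, contravariant): $A'^\mu = \Lambda^\mu{}_\nu A^\nu$. Example: position, momentum.
- Covector (rank 1, covariant): $A'_\mu = (\Lambda^{-1})^\nu{}_\mu A_\nu$. Example: $\partial_\mu$ acting on a scalar.
- Rank-2 tensor (contravariant): $T'^{\mu\nu} = \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma T^{\rho\sigma}$. Example: the electromagnetic field strength $F^{\mu\nu}$.
- Rank-2 mixed: $T'^\mu{}_\nu = \Lambda^\mu{}_\rho (\Lambda^{-1})^\sigma{}_\nu T^\rho{}_\sigma$. Example: the Kronecker delta $\delta^\mu{}_\nu$.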
Tensor Fields
A tensor field assigns a tensor to every point of spacetime. Most physics quantities are tensor fields, not just tensors at a single point. Examples:
- Scalar field: $\phi(x)$
- Vector field: $A^\mu(x)$ (e.g., electromagnetic potential)
- Rank-2 tensor field: $F^{\mu\nu}(x)$ (field strength)
The transformation law at each point is the tensor transformation above.
The Crucial Principle
If an equation is written in tensor form with matching free indices on both sides, it automatically holds in every inertial frame. This is the payoff of the whole formalism. You verify an equation once in any convenient frame; the tensor structure guarantees the rest.
Equations with mismatched indices, or that aren’t tensor-valued on both sides, are frame-dependent; possibly wrong, certainly fragile. Tensor notation provides a built-in error-checker.
7. Tensor Operations
Here are the basic manipulations you need fluently.
Addition
Tensors of the same type add componentwise:
$$C^{\mu\nu} = A^{\mu\nu} + B^{\mu\nu}$$
Only tensors of the same rank and index structure can be added.
Outer (Tensor) Product
Multiplying two tensors gives a higher-rank tensor:
$$T^{\mu\nu} = A^\mu B^\nu$$
Ranks add: rank-1 times rank-1 = rank-2. In general, the outer product of a type-$(p, q)$ and type-$(r, s)$ tensor yields a type-$(p + r, q + s)$ tensor.
Contraction
Summing over a paired upper and lower index reduces rank by 2:
$$V^\mu = T^{\mu\nu}{}_\nu$$
(Einstein convention in force; the repeated index is summed.) For a type-$(p, q)$ tensor, contracting one upper with one lower yields a type-$(p - 1, q - 1)$ tensor.
Inner (Scalar) Product
Contracting two tensors together:
$$A^\mu B_\mu$$
yields a scalar. This is the invariant inner product.
Raising and Lowering
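Any index can be raised or lowered using the metric:
$$T_{\mu\nu} = \eta_{\mu\rho}\,\eta_{\nu\sigma}\,T^{\rho\sigma}, \qquad T^\mu{}_\nu = \eta_{\nu\rho}\,T^{\mu\rho}$$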
All the “versions” of a tensor with different index placements carry the same information.
Symmetrization and Antisymmetrization
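For any rank-2 tensor:
$$T^{(\mu\nu)} \equiv \tfrac{1}{2}\left(T^{\mu\nu} + T^{\nu\mu}\right), \qquad T^{[\mu\nu]} \equiv \tfrac{1}{2}\left(T^{\mu\nu} - T^{\nu\mu}\right)$$
Round brackets denote symmetrization; square brackets denote antisymmetrization. Any rank-2 tensor decomposes uniquely:
$$T^{\mu\nu} = T^{(\mu\nu)} + T^{[\mu\nu]}$$
For higher rank, you can symmetrize over any subset of indices. The notation $T^{(\mu\nu\rho)}$ means symmetrize over all three; you can also do partial, like $T^{(\mu\nu)\rho}$.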
Symmetry Properties
- A symmetric tensor satisfies $S^{\mu\nu} = S^{\nu\mu}$. Has 10 independent components in 4D (for a rank-2 tensor).
- An antisymmetric tensor satisfies $A^{\mu\nu} = -A^{\nu\mu}$. Has 6 independent components in 4D. Diagonal elements must be zero.
Symmetry properties are frame-independent; a tensor’s symmetry is preserved under Lorentz transformation.
Key Identity: Contracting Symmetric with Antisymmetric
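$$S^{\mu\nu} A_{\mu\nu} = 0$$
Proof: rename the dummy indices $\mu \leftrightarrow \nu$ and use symmetry/antisymmetry, $S^{\mu\nu} A_{\mu\nu} = S^{\nu\mu} A_{\nu\mu} = -S^{\mu\nu} A_{\mu\nu}$, so the quantity equals its own negative and must vanish. Used constantly.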
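The decomposition into symmetric and antisymmetric parts, and the vanishing of their contraction, can be checked on any sample matrix. A minimal sketch, assuming the mostly-minus metric for lowering indices; the components of `T` are arbitrary illustrative numbers.

```python
ETA_DIAG = [1, -1, -1, -1]     # diagonal of the mostly-minus metric
R = range(4)

# Arbitrary sample components T^{mu nu} with no particular symmetry
T = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]

# Symmetric and antisymmetric parts, T^{(mu nu)} and T^{[mu nu]}
S_up = [[0.5 * (T[m][n] + T[n][m]) for n in R] for m in R]
A_up = [[0.5 * (T[m][n] - T[n][m]) for n in R] for m in R]

# Lower both indices of the antisymmetric part: A_{mu nu} = eta eta A^{..}
A_down = [[ETA_DIAG[m] * ETA_DIAG[n] * A_up[m][n] for n in R] for m in R]

# Full contraction S^{mu nu} A_{mu nu}
contraction = sum(S_up[m][n] * A_down[m][n] for m in R for n in R)
print(contraction)
```

Lowering indices with a diagonal metric multiplies each component by a product of signs that is itself symmetric in the two indices, so the lowered tensor stays antisymmetric and the pairwise cancellation goes through.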
Derivatives
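$\partial_\mu$ is a covector operator. Acting on a scalar, it produces a covector field:
$$\partial_\mu \phi$$
Acting on a vector, it produces a rank-2 tensor:
$$\partial_\mu A^\nu$$
Contracting:
$$\partial_\mu A^\mu = \partial_t A^0 + \nabla\cdot\mathbf{A}$$
These are tensor operations; they preserve transformation properties, because $\partial_\mu$ itself transforms as a covector under Lorentz transformations.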
8. The Levi-Civita Symbol and Duality
The Levi-Civita Symbol
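In 4D, define $\epsilon^{\mu\nu\rho\sigma}$ as:
$$\epsilon^{\mu\nu\rho\sigma} = \begin{cases} +1 & \text{if } (\mu\nu\rho\sigma) \text{ is an even permutation of } (0123) \\ -1 & \text{if } (\mu\nu\rho\sigma) \text{ is an odd permutation of } (0123) \\ 0 & \text{if any two indices are equal} \end{cases}$$
So $\epsilon^{0123} = +1$, $\epsilon^{1023} = -1$, etc.
Strictly, this is a tensor density; it transforms with an extra factor of $\det\Lambda$. For proper Lorentz transformations $\det\Lambda = +1$, so it transforms as a tensor. (Parity flips the sign.) The lower-index version:
$$\epsilon_{\mu\nu\rho\sigma} = \eta_{\mu\alpha}\,\eta_{\nu\beta}\,\eta_{\rho\gamma}\,\eta_{\sigma\delta}\,\epsilon^{\alpha\beta\gamma\delta}, \qquad \epsilon_{0123} = -1$$
(Three minus signs, one per spatial index, give an overall $-1$ relative to $\epsilon^{0123}$. The two versions differ by a sign.)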
Useful Identities
Total antisymmetry: swapping any two indices flips the sign.
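Contraction identities (signs are for the mostly-minus metric with $\epsilon^{0123} = +1$):
$$\epsilon^{\mu\nu\rho\sigma}\epsilon_{\mu\nu\rho\sigma} = -24$$
$$\epsilon^{\mu\nu\rho\sigma}\epsilon_{\alpha\nu\rho\sigma} = -6\,\delta^\mu{}_\alpha$$
$$\epsilon^{\mu\nu\rho\sigma}\epsilon_{\alpha\beta\rho\sigma} = -2\left(\delta^\mu{}_\alpha\delta^\nu{}_\beta - \delta^\mu{}_\beta\delta^\nu{}_\alpha\right)$$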
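Sign conventions for the Levi-Civita symbol are easy to get wrong, so a brute-force check is worthwhile. The sketch below assumes the upper-index symbol equals $+1$ on the ordering $0123$ and the mostly-minus metric; `eps_up` and `eps_down` are hypothetical helper names.

```python
from itertools import permutations, product

ETA_DIAG = [1, -1, -1, -1]   # diagonal of the mostly-minus metric

def parity(perm):
    """+1 for an even permutation, -1 for an odd one (by counting inversions)."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

EPS_UP = {p: parity(p) for p in permutations(range(4))}

def eps_up(*idx):
    """Upper-index symbol; zero whenever an index repeats."""
    return EPS_UP.get(idx, 0)

def eps_down(*idx):
    """Lower all four indices with the diagonal metric."""
    sign = 1
    for i in idx:
        sign *= ETA_DIAG[i]
    return sign * eps_up(*idx)

# Full contraction over all four index pairs
full = sum(eps_up(*i) * eps_down(*i) for i in product(range(4), repeat=4))

def triple(mu, alpha):
    """Contract the last three index pairs, leaving mu (upper) and alpha (lower) free."""
    return sum(eps_up(mu, *i) * eps_down(alpha, *i)
               for i in product(range(4), repeat=3))

print(eps_down(0, 1, 2, 3), full, triple(0, 0), triple(0, 1))
```

The printed values give the sign of the fully lowered symbol, the full contraction, and one diagonal and one off-diagonal entry of the triple contraction.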
Dual Tensors
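Given an antisymmetric rank-2 tensor $F_{\mu\nu}$, define its dual:
$$\tilde{F}^{\mu\nu} = \tfrac{1}{2}\,\epsilon^{\mu\nu\rho\sigma} F_{\rho\sigma}$$
For electromagnetism, the dual of the field strength swaps electric and magnetic fields (with our conventions, $\mathbf{E} \to \mathbf{B}$ and $\mathbf{B} \to -\mathbf{E}$). This duality plays a role in identifying the magnetic part of Maxwell’s equations as automatic (section 10).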
Relation to 3D
The 4D Levi-Civita generalizes the familiar 3D symbol $\epsilon_{ijk}$ that appears in cross products. Many 3D vector identities (like $\nabla\cdot(\nabla\times\mathbf{A}) = 0$) have direct 4D analogs using $\epsilon^{\mu\nu\rho\sigma}$.
9. Covariant Formulation of Mechanics
Now we use tensor language to restate special-relativistic mechanics properly.
Proper Time
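Along a worldline, the proper time interval is:
$$d\tau^2 = dx^\mu dx_\mu = dt^2 - d\mathbf{x}^2, \qquad d\tau = dt\,\sqrt{1 - v^2} = \frac{dt}{\gamma}$$
(with $c = 1$). Integrating $d\tau$ gives the total proper time along a worldline: the time measured by a clock carried along that worldline.
$\tau$ is a Lorentz invariant. It is the natural “time” parameter for the particle.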
Four-Velocity
$$u^\mu = \frac{dx^\mu}{d\tau}$$
In components:
$$u^\mu = \gamma\,(1, \mathbf{v})$$
Key property: $u^\mu u_\mu = 1$ (always). The four-velocity is a unit timelike vector.
Four-Momentum
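$$p^\mu = m\,u^\mu$$
Components: $p^\mu = (E, \mathbf{p})$, with
- $E = \gamma m$ (with $c = 1$; restore $c$: $E = \gamma m c^2$)
- $\mathbf{p} = \gamma m \mathbf{v}$
Invariant:
$$p^\mu p_\mu = E^2 - |\mathbf{p}|^2 = m^2$$
This is the famous energy-momentum relation $E^2 = |\mathbf{p}|^2 + m^2$ (or $E^2 = |\mathbf{p}|^2 c^2 + m^2 c^4$ with units restored).
For massless particles ($m = 0$): $p^\mu$ is a null vector, $E = |\mathbf{p}|$ (or $E = |\mathbf{p}|c$).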
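As a quick numerical illustration of the mass-shell condition, the sketch below builds a four-momentum from a mass and a 3-velocity (with $c = 1$) and recovers the mass squared from the invariant. The function names and sample values are illustrative.

```python
import math

def four_momentum(m, v):
    """p^mu = m * gamma * (1, v) for a particle of mass m and 3-velocity v (c = 1)."""
    speed_sq = sum(c * c for c in v)
    gamma = 1.0 / math.sqrt(1.0 - speed_sq)
    return (gamma * m, gamma * m * v[0], gamma * m * v[1], gamma * m * v[2])

def mass_squared(p):
    """Invariant p^mu p_mu = E^2 - |p|^2 in the mostly-minus convention."""
    return p[0]**2 - (p[1]**2 + p[2]**2 + p[3]**2)

p_slow = four_momentum(1.0, (0.6, 0.0, 0.0))
p_fast = four_momentum(2.0, (0.3, 0.4, 0.5))

# Energy and momentum depend on the velocity; the invariant depends only on m.
print(p_slow, mass_squared(p_slow))
print(p_fast, mass_squared(p_fast))
```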
Four-Force and Four-Acceleration
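$$f^\mu = \frac{dp^\mu}{d\tau} = m\,a^\mu, \qquad a^\mu = \frac{du^\mu}{d\tau}$$
Relativistic Newton’s second law in tensor form. Note that four-acceleration is orthogonal to four-velocity: $a^\mu u_\mu = 0$ (differentiate $u^\mu u_\mu = 1$ with respect to $\tau$).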
Four-Wavevector
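For a plane wave:
$$k^\mu = (\omega, \mathbf{k}), \qquad k_\mu x^\mu = \omega t - \mathbf{k}\cdot\mathbf{x}$$
Invariant: $k^\mu k_\mu = \omega^2 - |\mathbf{k}|^2$. For light (dispersion $\omega = |\mathbf{k}|$): null four-vector.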
Relativistic Doppler Effect
The invariant combination $k_\mu u^\mu$ gives the frequency seen by an observer with four-velocity $u^\mu$. Working this out reproduces the Doppler formulas from Modern Physics, but now as a one-line invariant calculation.
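The one-line calculation can be spelled out numerically. This sketch assumes $c = 1$, light travelling along $+x$, and an observer moving along the same axis; `observed_frequency` is a hypothetical helper that contracts the four-wavevector with the observer's four-velocity.

```python
import math

def minkowski_dot(a, b):
    """a^mu b_mu with the mostly-minus metric."""
    return a[0] * b[0] - (a[1] * b[1] + a[2] * b[2] + a[3] * b[3])

def observed_frequency(k, beta_x):
    """Frequency measured by an observer moving with velocity beta_x along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta_x**2)
    u = (gamma, gamma * beta_x, 0.0, 0.0)   # observer four-velocity
    return minkowski_dot(k, u)

omega = 1.0
k = (omega, omega, 0.0, 0.0)   # null wavevector: light moving in the +x direction

receding = observed_frequency(k, 0.6)      # observer runs away from the source
approaching = observed_frequency(k, -0.6)  # observer runs toward the source

# Compare with the longitudinal Doppler factor sqrt((1 - beta) / (1 + beta)).
print(receding, math.sqrt((1 - 0.6) / (1 + 0.6)))
print(approaching, math.sqrt((1 + 0.6) / (1 - 0.6)))
```

An observer at rest, with four-velocity $(1, 0, 0, 0)$, simply reads off the time component of the wavevector, which is why the contraction is the natural frame-independent definition of measured frequency.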
10. Maxwell’s Equations in Covariant Form
This is the payoff: electromagnetism revealed as a tensor theory on Minkowski space.
The Four-Potential
Combine the scalar and vector potentials of electromagnetism:
$$A^\mu = (\phi, \mathbf{A})$$
This is a four-vector: it transforms under Lorentz boosts as $A'^\mu = \Lambda^\mu{}_\nu A^\nu$, as a four-vector should.
The Field Strength Tensor
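Define:
$$F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu$$
This is antisymmetric: $F^{\mu\nu} = -F^{\nu\mu}$. It has 6 independent components. Writing them out in Cartesian coordinates, with $\mathbf{E} = -\nabla\phi - \partial_t\mathbf{A}$ and $\mathbf{B} = \nabla\times\mathbf{A}$:
$$F^{\mu\nu} = \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & -B_z & B_y \\ E_y & B_z & 0 & -B_x \\ E_z & -B_y & B_x & 0 \end{pmatrix}$$
The electric and magnetic fields, packaged into a single rank-2 antisymmetric tensor. Six components ↔ six field components ($\mathbf{E}$ and $\mathbf{B}$, three each).
Under boosts, $F^{\mu\nu}$ mixes $\mathbf{E}$ and $\mathbf{B}$; they are frame-dependent manifestations of the same underlying geometric object.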
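The mixing of electric and magnetic fields can be made concrete by transforming the field tensor as a rank-2 object. The sketch below assumes the component layout with the electric field in the time-space entries (first row $0, -E_x, -E_y, -E_z$) and the magnetic field in the space-space entries, $c = 1$, and a boost along $x$; all function names are illustrative.

```python
import math

R = range(4)

def field_tensor(E, B):
    """F^{mu nu} in the mostly-minus convention, from 3-vectors E and B."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, -Bz, By],
            [Ey, Bz, 0.0, -Bx],
            [Ez, -By, Bx, 0.0]]

def boost_x(beta):
    g = 1.0 / math.sqrt(1.0 - beta**2)
    return [[g, -g * beta, 0, 0], [-g * beta, g, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def transform(L, F):
    """F'^{mu nu} = L^mu_rho L^nu_sigma F^{rho sigma}: one factor of L per index."""
    return [[sum(L[m][r] * L[n][s] * F[r][s] for r in R for s in R) for n in R]
            for m in R]

def fields(F):
    """Read E and B back out of the tensor components."""
    E = (-F[0][1], -F[0][2], -F[0][3])
    B = (-F[2][3], F[1][3], -F[1][2])
    return E, B

# A purely electric field along y in the original frame
F = field_tensor((0.0, 1.0, 0.0), (0.0, 0.0, 0.0))
E_new, B_new = fields(transform(boost_x(0.6), F))
print(E_new, B_new)
```

A field that is purely electric in one frame acquires a magnetic component along $z$ in the boosted frame, and the transverse electric component is enhanced by $\gamma$.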
The Four-Current
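$$J^\mu = (\rho, \mathbf{J})$$
The charge density and current density, packaged as a four-vector. Conservation of charge becomes:
$$\partial_\mu J^\mu = 0$$
(The 3D continuity equation $\partial_t \rho + \nabla\cdot\mathbf{J} = 0$ written in covariant form.)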
The Two Inhomogeneous Maxwell Equations
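$$\partial_\mu F^{\mu\nu} = J^\nu$$
This single four-vector equation encodes Gauss’s law (the $\nu = 0$ component) and Ampère-Maxwell (the spatial components). Let’s verify:
For $\nu = 0$: $\partial_\mu F^{\mu 0} = J^0 = \rho$. The terms are:
$$F^{00} = 0$$
(diagonal of antisymmetric), and $F^{i0} = E^i$, so:
$$\partial_i F^{i0} = \nabla\cdot\mathbf{E} = \rho$$
Gauss’s law.
For $\nu = j$ (spatial): $\partial_\mu F^{\mu j} = J^j$:
$$\partial_0 F^{0j} = -\partial_t E^j$$
and $\partial_i F^{ij}$ encodes the curl of $\mathbf{B}$; working it out gives:
$$\nabla\times\mathbf{B} - \partial_t \mathbf{E} = \mathbf{J}$$
Ampère-Maxwell.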
The Two Homogeneous Maxwell Equations
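$$\partial_{[\mu} F_{\nu\rho]} = 0 \quad\Longleftrightarrow\quad \partial_\mu F_{\nu\rho} + \partial_\nu F_{\rho\mu} + \partial_\rho F_{\mu\nu} = 0$$
Square brackets denote antisymmetrization. This is the Bianchi identity. Writing it out in 3D gives $\nabla\cdot\mathbf{B} = 0$ (no magnetic monopoles) and $\nabla\times\mathbf{E} + \partial_t\mathbf{B} = 0$ (Faraday’s law).
These equations are automatically satisfied when $F_{\mu\nu}$ is written as $\partial_\mu A_\nu - \partial_\nu A_\mu$; you can check directly that antisymmetrizing derivatives of this form gives zero. So introducing the four-potential solves half of Maxwell’s equations identically; the other half become the equation of motion.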
Gauge Invariance
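The potential is not unique. The transformation
$$A_\mu \to A_\mu + \partial_\mu \chi$$
for any scalar function $\chi(x)$ leaves $F_{\mu\nu}$ unchanged (since $\partial_\mu\partial_\nu\chi - \partial_\nu\partial_\mu\chi = 0$). This is gauge invariance, a redundancy in the description that turns out to be the key to constructing all modern interactions.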
The Lagrangian Density
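All of electromagnetism follows from the action
$$S = \int d^4x\, \mathcal{L}, \qquad \mathcal{L} = -\tfrac{1}{4} F_{\mu\nu} F^{\mu\nu} - J^\mu A_\mu$$
Applying the Euler-Lagrange equation for the field $A_\nu$ gives $\partial_\mu F^{\mu\nu} = J^\nu$. (We did this in the Lagrangian mechanics doc.)
The single scalar $F_{\mu\nu}F^{\mu\nu} = 2\,(|\mathbf{B}|^2 - |\mathbf{E}|^2)$, a Lorentz-invariant and gauge-invariant combination, is (up to the factor $-\tfrac{1}{4}$) the Lagrangian density of the free electromagnetic field. This is a dramatic consolidation: one scalar contains all of Maxwell’s theory.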
Another Invariant
A second Lorentz invariant exists:
$$F_{\mu\nu}\tilde{F}^{\mu\nu} = -4\,\mathbf{E}\cdot\mathbf{B}$$
This is a pseudoscalar (flips sign under parity). It doesn’t appear in the standard Maxwell Lagrangian, but related terms appear in certain extensions (e.g., axion physics, the theta-term of QCD).
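Both field invariants can be evaluated by brute-force contraction, which is a useful guard against sign slips. The sketch assumes the mostly-minus metric, the upper-index Levi-Civita symbol equal to $+1$ on the ordering $0123$, a dual defined with a factor of one half, and the same component layout for the field tensor as a matrix with first row $0, -E_x, -E_y, -E_z$; the sample field values are arbitrary.

```python
from itertools import permutations

ETA_DIAG = [1, -1, -1, -1]
R = range(4)

def field_tensor(E, B):
    """Upper-index field tensor built from 3-vectors E and B."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, -Ex, -Ey, -Ez], [Ex, 0.0, -Bz, By],
            [Ey, Bz, 0.0, -Bx], [Ez, -By, Bx, 0.0]]

def lower_both(F_up):
    """F_{mu nu} = eta_{mu mu} eta_{nu nu} F^{mu nu} for a diagonal metric."""
    return [[ETA_DIAG[m] * ETA_DIAG[n] * F_up[m][n] for n in R] for m in R]

def parity(p):
    inv = sum(1 for i in R for j in range(i + 1, 4) if p[i] > p[j])
    return -1 if inv % 2 else 1

EPS = {p: parity(p) for p in permutations(R)}   # upper-index Levi-Civita symbol

def dual(F_down):
    """Dual tensor: one half of epsilon^{mu nu rho sigma} F_{rho sigma}."""
    return [[0.5 * sum(EPS.get((m, n, r, s), 0) * F_down[r][s] for r in R for s in R)
             for n in R] for m in R]

def contract(X_down, Y_up):
    """Full contraction X_{mu nu} Y^{mu nu}."""
    return sum(X_down[m][n] * Y_up[m][n] for m in R for n in R)

E, B = (1.0, 2.0, 3.0), (0.5, -1.0, 2.0)
F_up = field_tensor(E, B)
F_down = lower_both(F_up)

scalar = contract(F_down, F_up)                 # expected form: 2 (B^2 - E^2)
pseudoscalar = contract(F_down, dual(F_down))   # expected form: -4 E.B

E2 = sum(e * e for e in E)
B2 = sum(b * b for b in B)
EdotB = sum(e * b for e, b in zip(E, B))
print(scalar, 2 * (B2 - E2))
print(pseudoscalar, -4 * EdotB)
```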
11. The Stress-Energy Tensor
For a field theory, the stress-energy tensor $T^{\mu\nu}$ collects all the conserved currents associated with spacetime translations.
Physical Meaning of Components
- $T^{00}$: energy density
- $T^{0i}$: energy flux density (energy flowing in direction $i$)
- $T^{i0}$: momentum density (component $i$)
- $T^{ij}$: stress tensor (flow of $i$-momentum in direction $j$)
For symmetric $T^{\mu\nu}$ (which holds for systems without intrinsic angular momentum), $T^{0i} = T^{i0}$, meaning energy flux equals momentum density, a relativistic identity.
Conservation
$$\partial_\mu T^{\mu\nu} = 0$$
This is the covariant expression of energy and momentum conservation. Four equations: $\nu = 0$ is energy conservation; $\nu = i$ is momentum conservation in direction $i$.
The Stress-Energy of the Electromagnetic Field
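$$T^{\mu\nu} = F^{\mu\alpha} F_\alpha{}^{\nu} + \tfrac{1}{4}\,\eta^{\mu\nu} F_{\alpha\beta} F^{\alpha\beta}$$
Some components:
- $T^{00} = \tfrac{1}{2}\left(|\mathbf{E}|^2 + |\mathbf{B}|^2\right)$: electromagnetic energy density
- $T^{0i} = (\mathbf{E}\times\mathbf{B})^i$: the Poynting vector (energy flux = momentum density)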
Every term you’ve seen for EM energy, momentum, and stress is packaged into this single rank-2 tensor.
Source of Gravity
In general relativity, the stress-energy tensor is the source of spacetime curvature: Einstein’s equations are
$$G_{\mu\nu} = 8\pi G\, T_{\mu\nu}$$
where $G_{\mu\nu}$ is the Einstein curvature tensor. Matter tells spacetime how to curve via its stress-energy. That’s why $T^{\mu\nu}$ matters so much: it’s the bridge from matter to geometry.
12. Worked Examples
Three calculations that show the machinery in action.
Example 1: Invariant Mass from Two Four-Momenta
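Two particles collide with four-momenta $p_1^\mu$ and $p_2^\mu$. The invariant mass of the system is defined by:
$$M^2 = (p_1 + p_2)^\mu (p_1 + p_2)_\mu$$
In any frame, this is the same number. In the center-of-mass frame, where $\mathbf{p}_1 + \mathbf{p}_2 = 0$ and total energy is $E_{\rm CM}$:
$$M^2 = E_{\rm CM}^2$$
So the invariant mass is the center-of-mass energy. This is how the Higgs was found: reconstructing $M$ for various particle combinations and looking for a peak.
For a lab frame where particle 2 is at rest ($p_2^\mu = (m_2, \mathbf{0})$):
$$(p_1 + p_2)^2 = p_1^2 + p_2^2 + 2\,p_1\cdot p_2 = m_1^2 + m_2^2 + 2 E_1 m_2$$
So the CM energy squared is $m_1^2 + m_2^2 + 2 E_1 m_2$, giving
$$E_{\rm CM} = \sqrt{m_1^2 + m_2^2 + 2 E_1 m_2}$$
For large $E_1$: $E_{\rm CM} \approx \sqrt{2 E_1 m_2}$, growing only as $\sqrt{E_1}$. This is why fixed-target experiments are an inefficient way to reach high center-of-mass energy.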
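To put numbers on the comparison, here is a rough sketch that computes the invariant mass of a two-proton system in a fixed-target and a head-on configuration. The beam energy of 7000 GeV and the rounded proton mass of 0.938 GeV are illustrative inputs, and the helper names are hypothetical.

```python
import math

def add(p, q):
    """Componentwise sum of two four-momenta."""
    return tuple(a + b for a, b in zip(p, q))

def invariant_mass(p):
    """sqrt(p^mu p_mu) in the mostly-minus convention (c = 1)."""
    return math.sqrt(p[0]**2 - (p[1]**2 + p[2]**2 + p[3]**2))

def on_shell(m, pz):
    """Four-momentum of a particle of mass m with momentum pz along z."""
    return (math.sqrt(m * m + pz * pz), 0.0, 0.0, pz)

m = 0.938            # approximate proton mass in GeV (rounded)
E_beam = 7000.0      # illustrative beam energy in GeV
pz = math.sqrt(E_beam**2 - m**2)

# Beam hitting a proton at rest vs. two beams colliding head-on
fixed_target = invariant_mass(add(on_shell(m, pz), on_shell(m, 0.0)))
collider = invariant_mass(add(on_shell(m, pz), on_shell(m, -pz)))

# Closed form for a target at rest: sqrt(m1^2 + m2^2 + 2 E1 m2)
formula = math.sqrt(2 * m * m + 2 * E_beam * m)
print(fixed_target, formula, collider)
```

With these inputs the head-on configuration reaches the full $2E$ in the center-of-mass frame, while the fixed-target configuration reaches only a little over a hundred GeV, reflecting the square-root scaling.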
Example 2: Length Contraction from Four-Vectors
Consider a rod at rest in frame $S'$, with one end at $x' = 0$ and the other at $x' = L_0$ (proper length). In the primed frame, both ends are “at rest”; they exist at all primed times.
Transform to frame $S$ in which $S'$ moves with velocity $v$. To measure the length in $S$, we need positions of both ends at the same $S$-time, say $t = 0$.
Using the Lorentz transformation:
$$x' = \gamma\,(x - v t)$$
For the end at $x' = 0$ at $S$-time $t = 0$: $0 = \gamma x$, so $x = 0$.
For the end at $x' = L_0$ at $S$-time $t = 0$: $L_0 = \gamma x$, so $x = L_0/\gamma$.
Length in $S$: $L = L_0/\gamma$. Length contraction.
The four-vector calculation makes explicit what was happening in the formula: it’s all about simultaneity. What “$t = 0$” means in $S$ doesn’t correspond to the same slice in $S'$.
Example 3: Verifying Maxwell’s Equations Transform Correctly
Take Gauss’s law $\nabla\cdot\mathbf{E} = \rho$ in frame $S$. In covariant form, this is the $\nu = 0$ component of $\partial_\mu F^{\mu\nu} = J^\nu$. In another frame $S'$, the equation becomes
$$\partial'_\mu F'^{\mu\nu} = J'^\nu$$
by tensor transformation. Expanding the $\nu = 0$ component in the new frame:
- $F'^{\mu 0}$ contains both $\mathbf{E}$ fields and (after transforming) some $\mathbf{B}$ fields
- $J'^0$ contains both $\rho$ and (after transforming) some of $\mathbf{J}$
Result: Gauss’s law in the new frame mixes what were originally purely electric and purely magnetic phenomena. What looks like a static charge in one frame is a moving charge (current + charge) in another. The single covariant equation $\partial_\mu F^{\mu\nu} = J^\nu$ encodes both inhomogeneous Maxwell equations for the $\mathbf{E}$ and $\mathbf{B}$ fields in every frame.
No calculation of this mixing is required when using tensor notation; it’s automatic. That’s the power of the formalism.
Appendix: Conventions and Identity Reference
Sign Conventions
We use:
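- Metric: $\eta_{\mu\nu} = \mathrm{diag}(+1, -1, -1, -1)$
- Index ordering: $\mu = 0, 1, 2, 3$ with 0 = time
- Natural units: $c = 1$ (and typically $\hbar = 1$)
Alternative convention (GR texts): $\eta_{\mu\nu} = \mathrm{diag}(-1, +1, +1, +1)$. Intermediate signs differ; physical observables are the same.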
Key Objects
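| Symbol | Meaning |
|---|---|
| $\eta_{\mu\nu}$, $\eta^{\mu\nu}$ | Minkowski metric |
| $\delta^\mu{}_\nu$ | Kronecker delta |
| $\epsilon^{\mu\nu\rho\sigma}$ | Levi-Civita symbol |
| $x^\mu$ | Spacetime position |
| $p^\mu$ | Four-momentum |
| $u^\mu$ | Four-velocity |
| $A^\mu$ | EM four-potential |
| $J^\mu$ | Four-current |
| $F^{\mu\nu}$ | EM field strength |
| $T^{\mu\nu}$ | Stress-energy tensor |
| $\Lambda^\mu{}_\nu$ | Lorentz transformation |
| $\partial_\mu$ | Partial derivative |
| $\Box$ | d’Alembertian |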
Key Invariants
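- $x^\mu x_\mu = s^2$: spacetime interval
- $p^\mu p_\mu = m^2$: mass shell
- $u^\mu u_\mu = 1$: four-velocity normalization (if $c = 1$)
- $F_{\mu\nu}F^{\mu\nu} = 2\,(|\mathbf{B}|^2 - |\mathbf{E}|^2)$: EM scalar
- $F_{\mu\nu}\tilde{F}^{\mu\nu} = -4\,\mathbf{E}\cdot\mathbf{B}$: EM pseudoscalar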
Common Contraction Tricks
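Rename dummy indices freely:
$$A^\mu B_\mu = A^\nu B_\nu$$
Swap order in scalars:
$$A^\mu B_\mu = B_\mu A^\mu = A_\mu B^\mu$$
Contract symmetric with antisymmetric gives zero:
$$S^{\mu\nu} A_{\mu\nu} = 0$$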
Factor out shared index:
(Separating into symmetric and antisymmetric parts clarifies structure.)
Transformation Rules
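For a scalar: $\phi' = \phi$.
For a vector: $A'^\mu = \Lambda^\mu{}_\nu A^\nu$.
For a rank-2 tensor: $T'^{\mu\nu} = \Lambda^\mu{}_\rho \Lambda^\nu{}_\sigma T^{\rho\sigma}$.
For a covector: $A'_\mu = (\Lambda^{-1})^\nu{}_\mu A_\nu$.
Basic Lorentz transformation condition: $\eta_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma = \eta_{\rho\sigma}$, equivalently $\Lambda^T \eta \Lambda = \eta$.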
Useful Identities
$\eta^{\mu\nu}\eta_{\mu\nu} = \delta^\mu{}_\mu = 4$ (in 4D)
(where )
Levi-Civita Reminders
Total antisymmetry: swapping any two indices flips the sign.
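Contractions (with signs that depend on metric convention; these are for mostly-minus):
$$\epsilon^{\mu\nu\rho\sigma}\epsilon_{\mu\nu\rho\sigma} = -24, \qquad \epsilon^{\mu\nu\rho\sigma}\epsilon_{\alpha\nu\rho\sigma} = -6\,\delta^\mu{}_\alpha, \qquad \epsilon^{\mu\nu\rho\sigma}\epsilon_{\alpha\beta\rho\sigma} = -2\left(\delta^\mu{}_\alpha\delta^\nu{}_\beta - \delta^\mu{}_\beta\delta^\nu{}_\alpha\right)$$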
Closing Note
Tensor notation is a language, and like any language, fluency comes from use, not from reading about it. Ten carefully worked problems will do more than a hundred pages of reading. Recommended practice:
- Verify all of Maxwell’s equations in covariant form. Start with $\partial_\mu F^{\mu\nu} = J^\nu$, expand component by component, recover the 3D versions.
- Derive the transformation of $\mathbf{E}$ and $\mathbf{B}$ under a boost by transforming $F^{\mu\nu}$ as a rank-2 tensor.
- Prove that $F_{\mu\nu}F^{\mu\nu}$ is a Lorentz invariant. Compute it in terms of $\mathbf{E}$ and $\mathbf{B}$.
- Construct the stress-energy tensor of a massive scalar field from its Lagrangian $\mathcal{L} = \tfrac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \tfrac{1}{2}m^2\phi^2$. Verify $\partial_\mu T^{\mu\nu} = 0$ on shell.
- Practice index manipulations until the rules are automatic: raising, lowering, contracting, symmetrizing.
Once you can do these without hesitation, you have the foundation for classical field theory proper, which is the next document. There, tensors become the default language, and we’ll develop scalar fields with spontaneous symmetry breaking, gauge theory with the covariant derivative, and the Dirac equation. From there, quantum field theory is genuinely in reach.