# Stigmatic Optical Systems

There would appear to be little disagreement on what constitutes an astigmatic system in the case of a thin lens: the cylinder is not zero. A spherical thin lens is stigmatic or not astigmatic. The issue is less clear in the case of a thick system. For example, is an eye stigmatic merely because its refraction is stigmatic (spherical)? In this article, a system is defined to be stigmatic if and only if, through the system, every point object maps to a point image. Every other system is astigmatic. Thus, a system is astigmatic if and only if there exists a point object for which the image is not a point. This article is restricted to linear optics. The optical character of a system is completely determined by the ray transference of the system. The objective here is to find those conditions on the transference for which the system is stigmatic or astigmatic. The result is that, for a stigmatic system, all the 2 × 2 submatrices are scalar multiples of a common orthogonal matrix. For a system to be stigmatic, it is not sufficient that its power be stigmatic. An eye may be astigmatic despite having a stigmatic refraction.

Optometric Science Research Group, Department of Optometry, Rand Afrikaans University, Johannesburg, South Africa

Supported by the National Research Foundation under grant 2053699.

Received March 4, 2004; accepted August 13, 2004.

Astigmatism is a term bandied about in optometry frequently and in the vision sciences generally. In the case of a thin lens, I believe there is little disagreement over the meaning of the word; a thin lens is astigmatic if it has a nonzero cylinder. But what precisely is an astigmatic device when the device is not thin? When, for example, is an eye, or an eye in combination with an optical instrument, astigmatic? To classify an eye as astigmatic, is it sufficient to know that the refraction is not spherical? Is an optical system stigmatic (i.e., not astigmatic) if its power is spherical? This article is the product of a desire for an answer, both unambiguous and universal, to these questions.

It would seem that the right approach is to follow etymology and define an astigmatic system by means of negation: a system is astigmatic if it is not stigmatic. This thinking leads us to shift fixation rather to systems that are stigmatic. All the other systems are astigmatic. The purpose of this article then is to find those characteristics that make a system stigmatic in general. It will turn out that stigmatic systems fall naturally into two classes. Proper and improper stigmatic systems, as we shall call them, are examined in an accompanying article.^{1}

Astigmatism is commonly associated with dioptric power. However, there exist optical systems that are astigmatic but whose equivalent power, back-vertex power, and front-vertex power are all purely spherical. There are astigmatic systems of which all those powers are plano. An example is described elsewhere.^{2}

There would be, one supposes, general agreement on the meaning of stigmatic and astigmatic *pencils* in linear optics: the pencil is associated, upstream or downstream, with a point in the first case and an interval of Sturm in the second. Alternatively, one would say that the wavefronts are spherical in the first case and toric (including cylindrical but excluding spherical) in the second. Less clear, however, is the significance of these terms in the context of *optical systems* because an optical system is not necessarily associated with light at all; an optical system remains the same when inside a case from which light is completely excluded! Nevertheless, it would seem perverse to define astigmatism and stigmatism without reference to the interaction, actual or potential, of the system with light.

It is well known that the Greek word στιγμα denotes a spot or point. The suffix -*ism* in *stigmatism* creates an abstract noun. α-Privativum signifies negation. Following etymology, we define a system to be *stigmatic* if, through it, every point object maps to an image that is a point. All the other systems then are called *astigmatic*. We note that this definition does not exclude the possibility that a point object maps to a point image through an astigmatic system. The definition contains a universal quantifier (“every”) for stigmatic but an implicit existential quantifier for astigmatic; to be astigmatic, it is sufficient that there exists just one object point that does not map to a point image.

Our interest here is confined to linear optics. We shall represent light in terms of rays, which we shall characterize in terms of their local states. (Presumably, the analysis is also possible in terms of wavefronts instead of rays.) For thin centered and untilted systems, dioptric power is sufficient to describe the interaction of the system with light. For most other systems, it is not. Complete characterization is provided by the *transference* (also called the system matrix,^{3,4} the ray transfer matrix,^{5} and the ray transformation matrix^{6}) of the system. In the most general case, the transference is a 5 × 5 matrix.^{5,7} Six submatrices are identified as the fundamental optical properties of the system.^{8,9} Among the six are four 2 × 2 submatrices, which are of special importance for our purposes. Dioptric power, in effect, is one of them.^{10} Accordingly, we represent systems by means of their transferences. In traversing a system, the state of a ray is changed according to the transference of the system. We review these basics first. We then formally define a stigmatic system and thereafter an astigmatic system via negation. The analysis continues in an accompanying article,^{1} in which the two classes of stigmatic system are examined in detail and numerical examples are provided.

## THE BASICS

Fig. 1 shows an optical *system* S. By and large, we shall be concerned with the system as a whole and not with what is inside. Z is a *longitudinal axis*. Light travels roughly in the positive sense along Z. A ray is incident onto the *entrance plane* T_{0} of the system with *state* γ_{0} and emerges from the *exit* plane T with state γ. The emergent state is a 5 × 1 matrix of the form

in which **y** and α are the *position* vector and *reduced direction* or *inclination* of the ray at emergence. Reduced inclination is inclination *multiplied* by index of refraction. Where the context makes clear what is meant, we shall often omit the qualifier “reduced.” Both **y** and α are relative to Z. **y** is of the form

in which *y*_{1} and *y*_{2} are Cartesian coordinates, usually taken as horizontal and vertical, respectively, of **y**. α has the same form; its coordinates are α_{1} and α_{2}. The incident state is similar: γ_{0} has submatrices **y**_{0} and α_{0} whose coordinates are *y*_{10}, *y*_{20}, α_{10}, and α_{20}, and both **y**_{0} and α_{0} are relative to Z.

In linear optics, the emergent and incident states of a ray are related by^{2–12}

**T** is the *transference* of the system, a 5 × 5 matrix, which we write as^{5,7}

**A**, the *dilation*, **B**, the *disjugacy*, **C**, the *divergence*, and **D**, the *divarication*, are each 2 × 2 matrices. (The terminology is discussed elsewhere.^{9}) We write

and similarly for **B**, **C**, and **D**. **e**, the *translation*, and π, the *deviation*, are each 2 × 1 and similar to **y** (Equation 2). The fifth row of **T** is trivial; it consists of four 0s and a 1, **o**′ being the transpose of the 2 × 1 null vector **o**.

We call **A**, **B**, **C**, **D**, **e**, and π the six *fundamental* properties of the optical system.^{9} **A, B, C**, and **D** are the four 2 × 2 fundamental properties, and **e** and π are the two 2 × 1 fundamental properties. Other properties can be derived from them and are called *derived* properties.^{8}
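The block structure described above can be written out explicitly. What follows is a reconstruction from the verbal definitions in this section (a 5 × 1 ray state, a 5 × 5 transference whose fifth row is four 0s and a 1). It is offered as a reading aid and should not be taken as a reproduction of the typeset equations.

```latex
\gamma = \begin{pmatrix} \mathbf{y} \\ \boldsymbol{\alpha} \\ 1 \end{pmatrix},
\qquad
\mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix},
\qquad
\gamma = \mathbf{T}\,\gamma_0,
\qquad
\mathbf{T} = \begin{pmatrix}
\mathbf{A} & \mathbf{B} & \mathbf{e} \\
\mathbf{C} & \mathbf{D} & \boldsymbol{\pi} \\
\mathbf{o}' & \mathbf{o}' & 1
\end{pmatrix}.
```

Multiplying out the first two block rows gives the emergent position and inclination as linear functions of the incident position and inclination, plus the constant terms **e** and π.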

A derived property of historical importance is the dioptric power **F** of the system; by definition it is simply the negative of the divergence:^{10}

It is convenient here to work in terms of **C**. Of course **C** can be replaced at any stage by −**F**.

Equation 3 can be written as the pair of equations:

The 2 × 2 fundamental properties are related by the *symplectic* equations^{5,6,8–13}

**I**, an identity matrix, is one of four basic matrices of which we shall make use; they are

Not only is the transference of every optical system a matrix of the form of Equation 4, constrained by Equations 9 to 11, but also every such matrix is realizable as an optical system whose transference is the matrix.^{14} For a system that is the realization of a given matrix, we say that the system *exists*.
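The symplectic constraint can be tested numerically. The sketch below is a minimal illustration for the centered 4 × 4 case. It assumes the compact condition **S**′**ES** = **E**, with **E** the standard symplectic unit matrix, which expands to **A**′**C** = **C**′**A**, **B**′**D** = **D**′**B**, and **A**′**D** − **C**′**B** = **I**. Whether Equations 9 to 11 are written in exactly this form is an assumption. The function names and the two sample matrices are hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*X)]


# Standard 4x4 symplectic unit [[O, I], [-I, O]] (assumed convention).
E = [[0, 0, 1, 0],
     [0, 0, 0, 1],
     [-1, 0, 0, 0],
     [0, -1, 0, 0]]


def is_symplectic(S, tol=1e-9):
    """True if the 4x4 matrix S = [[A, B], [C, D]] satisfies S' E S = E."""
    lhs = matmul(matmul(transpose(S), E), S)
    return all(abs(lhs[i][j] - E[i][j]) <= tol
               for i in range(4) for j in range(4))


# Hypothetical thin astigmatic lens: A = D = I, B = O, C symmetric.
thin_lens = [[1, 0, 0, 0],
             [0, 1, 0, 0],
             [-3, 1, 1, 0],
             [1, -5, 0, 1]]

# Same matrix with an asymmetric C: fails the constraint, so no system exists.
not_a_system = [[1, 0, 0, 0],
                [0, 1, 0, 0],
                [-3, 1, 1, 0],
                [2, -5, 0, 1]]

print(is_symplectic(thin_lens), is_symplectic(not_a_system))
```

For a matrix with **A** = **D** = **I** and **B** = **O**, the condition reduces to **C** = **C**′, in agreement with the later statement that the divergence of a thin system is symmetric.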

Let us use **P** to represent any one of the fundamental properties **A**, **B**, **C**, and **D**. We shall sometimes write **P** as a linear combination of the basic matrices in Equation 12:^{15}

where *P*_{I}, *P*_{J}, *P*_{K}, and *P*_{L} are real numbers. *P*_{I}**I** + *P*_{J}**J** + *P*_{K}**K** is the *symmetric* component of **P**. *P*_{L}**L** is the *antisymmetric* component. *P*_{I}**I** is a scalar matrix; we shall refer to it as the *scalar* component. (Elsewhere^{15} it was called the stigmatic component. In the light of the finding in this article, however, it becomes apparent that the name is ill advised and hence best replaced.) *P*_{I}**I** + *P*_{L}**L** and *P*_{J}**J** + *P*_{K}**K** are what we shall call the *stigmatic* and *antistigmatic* components of property **P**, respectively. Campbell^{16,17} uses the same decomposition (Equation 13) in the case of dioptric power in particular.
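The decomposition into scalar, symmetric, antisymmetric, stigmatic, and antistigmatic components is easy to compute. The sketch below assumes the basic matrices are **I** = [[1, 0], [0, 1]], **J** = [[1, 0], [0, −1]], **K** = [[0, 1], [1, 0]], and **L** = [[0, −1], [1, 0]]. The sign chosen for **L** in particular is an assumption, since Equation 12 is not reproduced here, and the function names are illustrative.

```python
# Basic matrices; the sign convention for L is an assumption.
I = [[1, 0], [0, 1]]
J = [[1, 0], [0, -1]]
K = [[0, 1], [1, 0]]
L = [[0, -1], [1, 0]]


def coefficients(P):
    """Return (P_I, P_J, P_K, P_L) with P = P_I*I + P_J*J + P_K*K + P_L*L."""
    (p11, p12), (p21, p22) = P
    return ((p11 + p22) / 2, (p11 - p22) / 2,
            (p12 + p21) / 2, (p21 - p12) / 2)


def combine(cI, cJ, cK, cL):
    """Linear combination cI*I + cJ*J + cK*K + cL*L as a 2x2 list of rows."""
    return [[cI * I[r][c] + cJ * J[r][c] + cK * K[r][c] + cL * L[r][c]
             for c in range(2)] for r in range(2)]


def stigmatic_component(P):
    """P_I*I + P_L*L: the part that is a multiple of a rotation matrix."""
    cI, _, _, cL = coefficients(P)
    return combine(cI, 0, 0, cL)


def antistigmatic_component(P):
    """P_J*J + P_K*K: the part that is a multiple of a reflection matrix."""
    _, cJ, cK, _ = coefficients(P)
    return combine(0, cJ, cK, 0)


P = [[3, 1], [5, -1]]  # arbitrary sample matrix
print(coefficients(P))
print(stigmatic_component(P), antistigmatic_component(P))
```

The two components always sum to the original matrix, so a property is a scalar multiple of a rotation exactly when its antistigmatic component vanishes, and a scalar multiple of a reflection exactly when its stigmatic component vanishes.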

Equation 7 shows that, for systems with disjugacy **B** = **O**, the emergent position **y** is independent of the incident inclination α_{0}. In other words, all the rays from a particular point on the entrance plane arrive at the same point on the exit plane, a fact we shall make use of below. We call such systems *conjugate*. All other systems are *disjugate*.^{8}

A homogeneous gap of reduced width ζ has transference^{5,11,12,15}

A thin system has transference^{5,11,12,15}

in which **C** is symmetric. Of course in Equation 14, ζ ≥ 0. However, optical systems with transferences given by Equation 14 and with ζ < 0 are realizable;^{5} they function as if they were gaps of negative thickness.^{14} It will be useful below to make use of such systems. Thus, we place no bounds on the reduced thickness ζ.

In the transference **T**, the translation **e** and the deviation π account for effects of elements in the system that may be tilted or decentered. When all the elements are centered on axis Z and none are tilted relative to Z, then **e** = **o** and π = **o**. We shall refer to such systems as *centered*; all the others are *decentered*. For centered systems, the fifth row and the fifth column of the transference **T** (Equation 4) are both trivial. They can simply be omitted. The transference then becomes a 4 × 4 matrix of the form

Furthermore, the fifth row (the 1) of the ray state γ (Equation 1) then falls away, and the ray state becomes 4 × 1. In most examples below, we shall treat centered systems. Whether a transference is 4 × 4 or 5 × 5 or whether it could be either will be stated explicitly or will be clear from the context.

It will prove convenient to represent systems symbolically as follows: [ ] for a homogeneous gap, | for a thin system (a refracting interface or a thin lens), and □ for a general system. Compound systems can be represented by means of strings with subscripts added where necessary to identify or distinguish subsystems. We make a distinction between a thick *lens* |[ ]| and a thick *system* that is not necessarily of the form |[ ]|. □_{1}□_{2}□_{3} represents a system that is a combination of three successive systems in the order □_{1}, □_{2}, □_{3} in the positive sense along the longitudinal axis Z.

For a system S_{C} of the form □_{1}□_{2}□_{3}, the transference is the product of the individual transferences in reverse order, i.e.,^{5,11,12}

Fig. 2 shows a system S = □ with homogeneous gaps [ ]_{0} and [ ] appended upstream and downstream. The first gap has reduced width ζ_{0} and the second ζ. Collectively, the three component systems constitute a compound system S_{C} = [ ]_{0}□[ ]. S_{C} has entrance plane O and exit plane I. (Below O and I will become object and image planes, but for the moment they are arbitrary planes.) Applying Equation 17 with **T**_{1} and **T**_{3} given by Equation 14 and **T**_{2} by Equation 4, we obtain the transference of the compound system:
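The product just described can be checked numerically. The sketch below is restricted to the centered 4 × 4 case and assumes the usual gap transference with blocks **I**, ζ**I**, **O**, **I**. The helper names (`gap`, `compound`, `block`, `disjugacy_formula`) are illustrative. It compares the top right block of the compound transference with the expansion **A**ζ_{0} + **B** + ζ**C**ζ_{0} + ζ**D** obtained by multiplying the three factors by hand.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def gap(zeta):
    """4x4 transference of a homogeneous gap of reduced width zeta."""
    return [[1, 0, zeta, 0],
            [0, 1, 0, zeta],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]


def compound(T, zeta0, zeta):
    """Transference of [ ]_0 followed by T followed by [ ]: reverse-order product."""
    return matmul(gap(zeta), matmul(T, gap(zeta0)))


def block(T, row, col):
    """2x2 block of a 4x4 matrix: (0,0)=A, (0,1)=B, (1,0)=C, (1,1)=D."""
    return [[T[2 * row + i][2 * col + j] for j in range(2)] for i in range(2)]


def disjugacy_formula(T, zeta0, zeta):
    """A*zeta0 + B + zeta*C*zeta0 + zeta*D, the hand expansion of the product."""
    A, B, C, D = (block(T, 0, 0), block(T, 0, 1),
                  block(T, 1, 0), block(T, 1, 1))
    return [[A[i][j] * zeta0 + B[i][j] + zeta * C[i][j] * zeta0 + zeta * D[i][j]
             for j in range(2)] for i in range(2)]


# Hypothetical thin spherical lens of power 10 D (C = -10 I), with object and
# image planes each at reduced distance 0.2 m: the compound system is conjugate.
lens = [[1, 0, 0, 0],
        [0, 1, 0, 0],
        [-10, 0, 1, 0],
        [0, -10, 0, 1]]
print(block(compound(lens, 0.2, 0.2), 0, 1))  # entries vanish up to rounding
```

Setting this block to the null matrix is the condition that the planes at ζ_{0} and ζ are conjugate through the system.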

Below we shall make use of two orthogonal matrices, the *rotation* matrix

and the *reflection* matrix

To avoid ambiguity of sign later, we impose the restriction −90° < θ ≤ 90° in the case of **R**_{θ} and −45° < θ ≤ 45° in the case of −R_{θ}, although it will be convenient occasionally to relax these constraints. As we shall see, the restrictions do not imply a loss of generality. **R**_{θ} represents rotation through angle θ, and −R_{θ} represents reflection in a line at angle θ. When we come to use these matrices below, they will represent operations on the image relative to the object. A positive angle θ is an angle measured counterclockwise when one is looking in the positive sense along the longitudinal axis Z. We note, for later reference, that

and that

Any matrix of the form

with unit determinant is a rotation matrix, and any matrix of the form

with determinant −1 is a reflection matrix.^{18} Under the restrictions on θ imposed above, p ≥ 0.
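The role of the angle restrictions can be illustrated numerically. The sketch below assumes the standard forms for a rotation through θ and for a reflection in a line at angle θ (entries cos 2θ and sin 2θ). These are assumptions standing in for Equations 19 and 20, and the function names are illustrative. Negating a rotation gives a rotation whose angle differs by 180°, and negating a reflection gives a reflection whose angle differs by 90°. Angle ranges of 180° and 90°, respectively, are therefore what is needed to fix the sign of a scalar multiple.

```python
import math


def rotation(theta_deg):
    """Rotation through angle theta, in degrees (assumed standard form)."""
    t = math.radians(theta_deg)
    return [[math.cos(t), -math.sin(t)],
            [math.sin(t), math.cos(t)]]


def reflection(theta_deg):
    """Reflection in a line at angle theta, in degrees (assumed standard form)."""
    t = math.radians(2 * theta_deg)
    return [[math.cos(t), math.sin(t)],
            [math.sin(t), -math.cos(t)]]


def det(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def negate(M):
    """Entrywise negative of a matrix."""
    return [[-x for x in row] for row in M]


def close(X, Y, tol=1e-12):
    """True if two 2x2 matrices agree entrywise within tol."""
    return all(abs(X[i][j] - Y[i][j]) <= tol
               for i in range(2) for j in range(2))


def is_orthogonal(M, tol=1e-12):
    """True if M times its transpose is the identity."""
    Mt = [[M[0][0], M[1][0]], [M[0][1], M[1][1]]]
    prod = [[sum(M[i][k] * Mt[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]
    return close(prod, [[1, 0], [0, 1]], tol)


# Sign ambiguity that the angle restrictions remove.
print(close(negate(rotation(30)), rotation(210)))
print(close(negate(reflection(30)), reflection(120)))
```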

## STIGMATIC AND ASTIGMATIC SYSTEMS

Consider a point object located in transverse plane O in Fig. 2. Suppose the point object maps to a point image through system S. We choose transverse plane I such that it contains the point image. Planes O and I then become object and image planes, respectively. It follows (see the paragraph immediately before Equation 14) that compound system S_{C} is conjugate, and hence its disjugacy is **B**_{C} = **O**. Thus, from the top, middle block of Equation 18 we have

We now make the following definitions:

DEFINITION: A system is called *stigmatic* if to every point object there corresponds a point image through the system.

DEFINITION: A system that is not stigmatic is called *astigmatic*.

For an astigmatic system, there is at least one point object that does not map to a point image. There may be point objects that do map to point images.

For completeness, we must allow object and image planes to be “at infinity” or at any finite distance, negative or positive. Thus, we take ζ_{0} and ζ in Equation 27 as points on the extended real number line, i.e., the familiar number line plus ∞. That being so, we need to be on the lookout for potential indeterminate forms involving ∞ in addition to those involving 0. This is especially so because of the importance in optometry of the case of ζ_{0} = ∞. ζ_{0} ≥ 0 implies a real object point; for ζ_{0} = 0 the object point is on the entrance plane of system S, and for ζ_{0} < 0 the object is virtual. Similarly, for ζ ≥ 0 the image is real; for ζ = 0 the image is on the exit plane of S; and for ζ < 0 the image is virtual. However, these names are of no consequence for our purposes; we shall not need to refer to them again. As mentioned above, systems that function as if they were gaps of negative thickness are realizable. Although compound system S_{C} may be infinite in length, system S itself is finite as are all the entries in its transference, including **A**, **B**, **C**, and **D** in Equation 27 in particular.

Let *a*, *b*, *c*, and *d* represent real numbers and **R** either a particular rotation matrix **R**_{θ} with −90° < θ ≤ 90° or a particular reflection matrix −R_{θ} with −45° < θ ≤ 45°. We state and then prove the following theorem:

THEOREM: A system is stigmatic if and only if its transference is of the form

PROOF: The theorem takes the form of a biconditional. Following normal practice, we prove one conditional of the pair and then the other.

For one of the conditionals, we assume that the transference is of the form in Equation 28. We are then required to prove that for every ζ_{0} there exists a ζ that satisfies Equation 27. From the third symplectic equation (Equation 11) we have

which, because of Equation 23, shows that

The latter is the symplectic equation of two-dimensional linear optics.^{11,12} Substitution into Equation 27 leads to

and, hence, because of Equation 24,

Solving we obtain

This shows that, with two potential exceptions, there is a ζ that corresponds to every ζ_{0}. The two potential exceptions occur when ζ_{0} = ∞ and when the denominator on the right side of Equation 33 is zero. In the first case, we set ζ = ζ^{∞}; Equation 33 becomes

Because of Equation 30, *c* and *a* cannot both be zero, and when *c* = 0 ζ^{∞} = ∞. Thus, ζ^{∞} is always defined. For a zero denominator, again because of Equation 30, Equation 33 reduces to ζ = 1/(0 × *c*), which is always ∞. Thus, for every ζ_{0} there is a ζ, which means that for every point object there is a point image. By definition then the system is stigmatic. We have proved one of the two conditionals.
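The first conditional can be turned into a small calculation. When every 2 × 2 block is a multiple of the same orthogonal matrix, the conjugacy condition collapses to the scalar equation (*a*ζ_{0} + *b*) + ζ(*c*ζ_{0} + *d*) = 0. The sketch below solves it for ζ, treating the two special cases discussed above separately. The function name is hypothetical, and `math.inf` is used as a stand-in for the single point ∞ of the extended real line.

```python
import math


def image_distance(a, b, c, d, zeta0):
    """Reduced image distance zeta for a transference with blocks aR, bR, cR, dR.

    Solves (a*zeta0 + b) + zeta*(c*zeta0 + d) = 0 under the constraint
    a*d - b*c = 1. math.inf represents the point at infinity.
    """
    if abs(a * d - b * c - 1) > 1e-9:
        raise ValueError("scalars must satisfy a*d - b*c = 1")
    if math.isinf(zeta0):
        # Object at infinity: zeta = -a/c, or infinity when c = 0.
        return math.inf if c == 0 else -a / c
    denominator = c * zeta0 + d
    if denominator == 0:
        # The numerator cannot also vanish, so the image is at infinity.
        return math.inf
    return -(a * zeta0 + b) / denominator


# Hypothetical thin spherical lens of power 10 D: a = 1, b = 0, c = -10, d = 1.
print(image_distance(1, 0, -10, 1, math.inf))  # object at infinity
print(image_distance(1, 0, -10, 1, 0.2))       # finite object distance
print(image_distance(1, 0, -10, 1, 0.1))       # object at the first focal point
```

For the sample lens, an object at infinity images at the second focal point, and an object at the first focal point images at infinity, which are the two special cases treated in the proof.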

The converse conditional is more problematic. We assume that for every ζ_{0} there is a ζ, or in other words that Equation 27 is satisfied for all ζ_{0}. We have then to prove that Equation 28 is a consequence.

Because, by assumption, Equation 27 holds in general, it holds for ζ_{0} = 0 in particular. For ζ_{0} = 0 we represent the corresponding ζ by ζ^{0}. Substituting into Equation 27 we obtain

The only potential indeterminacy is in the term ζ**C**ζ_{0} in Equation 27: it could be that ζ^{0} = ∞, in which case we have the form ∞ × 0. The problem is overcome by rewriting Equation 27 as

and then setting ζ = ζ^{0} = ∞. One obtains **D** = **O**, a result already covered by Equation 35. Equation 35 then shows that **B** and **D** are scalar multiples of a common matrix, **M**, say. Thus, we can write

and

where b̃ and d̃ are real numbers. b̃ = 0 when **B** = **O** and d̃ = 0 when **D** = **O**. Because of the third symplectic equation (Equation 11), **M** is not null and b̃ and d̃ are not both zero. Hence, because of Equations 36 and 37, Equation 35 reduces to

or

ζ^{0} = ∞ when d̃ = 0. We substitute from Equations 36 and 37 into Equation 27 and rearrange to give

We choose any particular ζ_{0} other than 0 and ∞ and such that ζ is neither 0 nor ∞. Then Equation 39 shows that **A** + ζ**C** is a scalar multiple of **M**. Hence, we can write

where *s* is a scalar dependent on ζ_{0}. For a particular ζ_{0}, say ζ′_{0}, ζ and *s* are ζ′ and *s*′, respectively, and Equation 40 becomes

Similarly,

for another particular ζ_{0}. Subtraction shows that we can write

where c̃ is a real number. Substitution from Equations 36, 37, and 43 into Equation 27 gives

which shows that

for a real number ã.

Substituting from Equations 36, 37, 43, and 45 into the third symplectic equation (Equation 11), we obtain

As for **P** in Equation 13, we express **M** in terms of **I**, **J**, **K**, and **L** (Equation 12). Multiplying we obtain

Comparison with Equation 46 shows that the coefficients of **J** and **K** must be zero, i.e.,

and

There are two distinct ways to satisfy Equations 48 and 49 simultaneously: either

or

It follows from Equations 12 and 13 that **M** has one of two forms, namely,

and

We choose *M*_{I} and *M*_{J} such that *M*_{I} ≥ 0 and *M*_{J} ≥ 0. (We can always do so with the appropriate choice of sign for ã, b̃, c̃, and d̃.) **M**_{R} and **M**_{S} are of the forms in Equations 25 and 26, respectively, except that their determinants are not generally 1 or −1, respectively. Dividing by √|det **M**| converts **M**_{R} to a rotation matrix (Equation 19) and **M**_{S} to a reflection matrix (Equation 20). Thus, Equation 45 can be written

where

and **R** is a rotation or a reflection matrix. Similarly, one obtains **B** = *b***R** (from Equation 36), **C** = *c***R** (from Equation 43), and **D** = *d***R** (from Equation 37), in which the **R**'s are all the same matrix and scalars *b*, *c*, and *d* in turn each replace *a* on both sides of Equation 55. Hence, Equation 28 holds; we have proved the converse conditional.

Having proved the two component conditionals of the biconditional, we have proved the theorem.

The proof above is rather tedious; possibly, a more elegant one is waiting in the wings.

The restrictions imposed on rotations by −90° < θ ≤ 90° and on reflections by −45° < θ ≤ 45° remove ambiguity in the signs of the scalars *a*, *b*, *c*, and *d* and the orthogonal matrix **R**; they make the scalars and the matrix unique.

We shall say a transference has stigmatic form or is stigmatic if it has the form of Equation 28.

It is a corollary of the theorem above that if a system has a transference of stigmatic form, then the transference is necessarily constrained by Equation 30.

## CONCLUDING REMARKS

We have here defined a system to be stigmatic if and only if, through the system, every object point maps to a point image. Astigmatic systems are defined by means of negation: they are systems that are not stigmatic. In other words, for an astigmatic system there is at least one point object that does not map to a point image through it; there may nevertheless be point objects that do map to point images. That a point object maps to a point image is not sufficient to make a system stigmatic.

The mathematics has shown that the condition that a system be stigmatic is that all four 2 × 2 fundamental properties of the system are scalar multiples of a common orthogonal matrix **R** (Equation 28). Because products of orthogonal matrices are orthogonal, a system compounded of stigmatic systems is stigmatic. The two types of orthogonal 2 × 2 matrix define two disjoint classes of stigmatic systems. They are the topic of an accompanying article.^{1}
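The condition lends itself to a direct numerical test. The sketch below is a minimal illustration for the centered 4 × 4 case and assumes its input is already a valid (symplectic) transference. It takes the largest of the four blocks as a reference, checks that the reference is a scalar multiple of a rotation or a reflection, and checks that the other three blocks are proportional to it. All names and sample matrices are hypothetical.

```python
def blocks(T):
    """The four 2x2 blocks A, B, C, D of a 4x4 transference, each flattened."""
    return [[T[2 * r][2 * c], T[2 * r][2 * c + 1],
             T[2 * r + 1][2 * c], T[2 * r + 1][2 * c + 1]]
            for r in range(2) for c in range(2)]


def is_orthogonal_multiple(m, tol):
    """True if the flattened 2x2 m is a scalar times a rotation or reflection."""
    p11, p12, p21, p22 = m
    rotation_like = abs(p11 - p22) <= tol and abs(p12 + p21) <= tol
    reflection_like = abs(p11 + p22) <= tol and abs(p12 - p21) <= tol
    return rotation_like or reflection_like


def proportional(u, v, tol):
    """True if flattened matrices u and v are scalar multiples of one matrix."""
    return all(abs(u[i] * v[j] - u[j] * v[i]) <= tol
               for i in range(4) for j in range(i + 1, 4))


def has_stigmatic_form(T, tol=1e-9):
    """True if A, B, C, D are all multiples of a common orthogonal matrix."""
    bs = blocks(T)
    reference = max(bs, key=lambda m: sum(x * x for x in m))
    return (is_orthogonal_multiple(reference, tol)
            and all(proportional(m, reference, tol) for m in bs))


# Hypothetical thin spherical lens (C = -10 I).
spherical_lens = [[1, 0, 0, 0], [0, 1, 0, 0], [-10, 0, 1, 0], [0, -10, 0, 1]]
# Hypothetical afocal system with zero power but unequal meridional
# magnifications: the power is plano, yet the system is astigmatic.
afocal = [[2, 0, 0, 0], [0, 0.5, 0, 0], [0, 0, 0.5, 0], [0, 0, 0, 2]]

print(has_stigmatic_form(spherical_lens), has_stigmatic_form(afocal))
```

The afocal sample shows that a spherical (here null) dioptric power does not by itself make a system stigmatic: its dilation and divarication are not scalar multiples of any orthogonal matrix.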

The questions posed at the outset have now been answered. In general, an optical device is astigmatic when its four 2 × 2 fundamental properties are not all scalar multiples of the same orthogonal matrix. The same applies to an eye and an eye in combination with an optical instrument. Representing, in effect, a thin system, the refraction of an eye is astigmatic if the cylinder is not zero. However, an eye is more than its refraction and may well be astigmatic even if its refraction is scalar. For a system to be stigmatic, it is not sufficient that its dioptric power be spherical; in fact, if the dioptric power is spherical, the system is astigmatic unless all four 2 × 2 fundamental properties are scalar matrices.

## ACKNOWLEDGMENTS

I thank J. Rubinstein, Department of Mathematics, Indiana University, for useful comments on the manuscript. I also thank R. Blendowske of the Department of Optical Technologies and Image Processing, University of Applied Sciences, Darmstadt, and R. D. van Gool, G. E. MacKenzie, H. Abelman, W. Heath, A. Rubin, A. S. Carlson, and W. D. H. Gillan of the Optometric Science Research Group for continuing discussions.

W. F. Harris

Optometric Science Research Group

Department of Optometry

Rand Afrikaans University

P.O. Box 524

Auckland Park, Johannesburg, 2006

South Africa

e-mail: wfh@na.rau.ac.za

## REFERENCES

**Keywords:**

stigmatic system; astigmatic system; linear optics; ray transference; dioptric power matrix