Astigmatism is a term bandied about in optometry frequently and in the vision sciences generally. In the case of a thin lens, I believe there is little disagreement over the meaning of the word; a thin lens is astigmatic if it has a nonzero cylinder. But what precisely is an astigmatic device when the device is not thin? When, for example, is an eye, or an eye in combination with an optical instrument, astigmatic? To classify an eye as astigmatic, is it sufficient to know that the refraction is not spherical? Is an optical system stigmatic (i.e., not astigmatic) if its power is spherical? This article is the product of a desire for an answer, both unambiguous and universal, to these questions.
It would seem that the right approach is to follow etymology and define an astigmatic system by means of negation: a system is astigmatic if it is not stigmatic. This thinking leads us to shift fixation rather to systems that are stigmatic. All the other systems are astigmatic. The purpose of this article then is to find those characteristics that make a system stigmatic in general. It will turn out that stigmatic systems fall naturally into two classes. Proper and improper stigmatic systems, as we shall call them, are examined in an accompanying article.1
Astigmatism is commonly associated with dioptric power. However, there exist optical systems that are astigmatic but whose equivalent power, back-vertex power, and front-vertex power are all purely spherical. There are astigmatic systems of which all those powers are plano. An example is described elsewhere.2
There would be, one supposes, general agreement on the meaning of stigmatic and astigmatic pencils in linear optics: the pencil is associated, upstream or downstream, with a point in the first case and an interval of Sturm in the second. Alternatively, one would say that the wavefronts are spherical in the first case and toric (including cylindrical but excluding spherical) in the second. Less clear, however, is the significance of these terms in the context of optical systems because an optical system is not necessarily associated with light at all; an optical system remains the same when inside a case from which light is completely excluded! Nevertheless, it would seem perverse to define astigmatism and stigmatism without reference to the interaction, actual or potential, of the system with light.
It is well known that the Greek word στιγμα denotes a spot or point. The suffix -ism in stigmatism creates an abstract noun. α-Privativum signifies negation. Following etymology, we define a system to be stigmatic if, through it, every point object maps to an image that is a point. All the other systems then are called astigmatic. We note that this definition does not exclude the possibility that a point object maps to a point image through an astigmatic system. The definition contains a universal quantifier (“every”) for stigmatic but an implicit existential quantifier for astigmatic; to be astigmatic, it is sufficient that there exists just one object point that does not map to a point image.
Our interest here is confined to linear optics. We shall represent light in terms of rays, which we shall characterize in terms of their local states. (Presumably, the analysis is also possible in terms of wavefronts instead of rays.) For thin centered and untilted systems, dioptric power is sufficient to describe the interaction of the system with light. For most other systems, it is not. Complete characterization is provided by the transference (also called the system matrix,3,4 the ray transfer matrix,5 and the ray transformation matrix6) of the system. In the most general case, the transference is a 5 × 5 matrix.5,7 Six submatrices are identified as the fundamental optical properties of the system.8,9 Among the six are four 2 × 2 submatrices, which are of special importance for our purposes. Dioptric power, in effect, is one of them.10 Accordingly, we represent systems by means of their transferences. In traversing a system, the state of a ray is changed according to the transference of the system. We review these basics first. We then formally define a stigmatic system and thereafter an astigmatic system via negation. The analysis continues in an accompanying article,1 in which the two classes of stigmatic system are examined in detail and numerical examples are provided.
Fig. 1 shows an optical system S. By and large, we shall be concerned with the system as a whole and not with what is inside. Z is a longitudinal axis. Light travels roughly in the positive sense along Z. A ray is incident onto the entrance plane T0 of the system with state γ0 and emerges from the exit plane T with state γ. The emergent state is a 5 × 1 matrix of the form
$$\boldsymbol{\gamma} = \begin{pmatrix} \mathbf{y} \\ \boldsymbol{\alpha} \\ 1 \end{pmatrix} \tag{1}$$
in which y and α are the position vector and reduced direction or inclination of the ray at emergence. Reduced inclination is inclination multiplied by index of refraction. Where the context makes clear what is meant, we shall often omit the qualifier “reduced.” Both y and α are relative to Z. y is of the form
$$\mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix} \tag{2}$$
in which y1 and y2 are Cartesian coordinates, usually taken as horizontal and vertical, respectively, of y. α has the same form; its coordinates are α1 and α2. The incident state is similar: γ0 has submatrices y0 and α0 whose coordinates are y10, y20, α10, and α20, and both y0 and α0 are relative to Z.
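In linear optics, the emergent and incident states of a ray are related by2–12
$$\boldsymbol{\gamma} = \mathbf{T}\boldsymbol{\gamma}_0. \tag{3}$$
T is the transference of the system, a 5 × 5 matrix, which we write as5,7
$$\mathbf{T} = \begin{pmatrix} \mathbf{A} & \mathbf{B} & \mathbf{e} \\ \mathbf{C} & \mathbf{D} & \boldsymbol{\pi} \\ \mathbf{o}' & \mathbf{o}' & 1 \end{pmatrix}. \tag{4}$$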
A, the dilation, B, the disjugacy, C, the divergence, and D, the divarication, are each 2 × 2 matrices. (The terminology is discussed elsewhere.9) We write
and similarly for B, C, and D. e, the translation, and π, the deviation, are each 2 × 1 and similar to y (Equation 2). The fifth row of T is trivial; it consists of four 0s and a 1, o′ being the transpose of the 2 × 1 null vector o.
We call A, B, C, D, e, and π the six fundamental properties of the optical system.9 A, B, C, and D are the four 2 × 2 fundamental properties, and e and π are the two 2 × 1 fundamental properties. Other properties can be derived from them and are called derived properties.8
A derived property of historical importance is the dioptric power F of the system; by definition it is simply the negative of the divergence:10
$$\mathbf{F} = -\mathbf{C}. \tag{6}$$
It is convenient here to work in terms of C. Of course C can be replaced at any stage by −F.
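Equation 3 can be written as the pair of equations:
$$\mathbf{y} = \mathbf{A}\mathbf{y}_0 + \mathbf{B}\boldsymbol{\alpha}_0 + \mathbf{e} \tag{7}$$
$$\boldsymbol{\alpha} = \mathbf{C}\mathbf{y}_0 + \mathbf{D}\boldsymbol{\alpha}_0 + \boldsymbol{\pi}. \tag{8}$$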
The 2 × 2 fundamental properties are related by the symplectic equations5,6,8–13
I, an identity matrix, is one of four basic matrices of which we shall make use; they are
Not only is the transference of every optical system a matrix of the form of Equation 4, constrained by Equations 9 to 11, but also every such matrix is realizable as an optical system whose transference is the matrix.14 For a system that is the realization of a given matrix, we say that the system exists.
Let us use P to represent any one of the fundamental properties A, B, C, and D. We shall sometimes write P as a linear combination of the basic matrices in Equation 12:15
$$\mathbf{P} = P_I\mathbf{I} + P_J\mathbf{J} + P_K\mathbf{K} + P_L\mathbf{L} \tag{13}$$
where PI, PJ, PK, and PL are real numbers. PII + PJJ + PKK is the symmetric component of P. PLL is the antisymmetric component. PII is a scalar matrix; we shall refer to it as the scalar component. (Elsewhere15 it was called the stigmatic component. In the light of the finding in this article, however, it becomes apparent that the name is ill advised and hence best replaced.) PII + PLL and PJJ + PKK are what we shall call the stigmatic and antistigmatic components of property P, respectively. Campbell16,17 uses the same decomposition (Equation 13) in the case of dioptric power in particular.
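The decomposition of Equation 13 can be illustrated numerically. The following is a minimal sketch only: the explicit basis matrices are an assumption (I the identity, J = diag(1, −1), K the exchange matrix, and L = [[0, −1], [1, 0]]), and the function names are hypothetical. With the opposite sign convention for L, only the sign of the coefficient P_L would change.

```python
# Assumed basis (an assumption of this sketch, not stated in the text above):
# I = [[1, 0], [0, 1]],  J = [[1, 0], [0, -1]],
# K = [[0, 1], [1, 0]],  L = [[0, -1], [1, 0]].

def decompose(P):
    """Return (P_I, P_J, P_K, P_L) with P = P_I*I + P_J*J + P_K*K + P_L*L."""
    (p11, p12), (p21, p22) = P
    return ((p11 + p22) / 2, (p11 - p22) / 2, (p12 + p21) / 2, (p21 - p12) / 2)

def recompose(PI, PJ, PK, PL):
    """Rebuild the 2 x 2 matrix from its four coefficients."""
    return [[PI + PJ, PK - PL], [PK + PL, PI - PJ]]

def stigmatic_component(P):
    """P_I*I + P_L*L: scalar part plus antisymmetric part."""
    PI, _, _, PL = decompose(P)
    return recompose(PI, 0.0, 0.0, PL)

def antistigmatic_component(P):
    """P_J*J + P_K*K: the remaining symmetric, traceless part."""
    _, PJ, PK, _ = decompose(P)
    return recompose(0.0, PJ, PK, 0.0)

P = [[3.0, 1.0], [5.0, -1.0]]
coeffs = decompose(P)
```

Under the assumed basis, the stigmatic component always has the pattern [[p, −q], [q, p]] and the antistigmatic component the pattern [[p, q], [q, −p]].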
Equation 7 shows that, for systems with disjugacy B = O, the emergent position y is independent of the incident inclination α0. In other words, all the rays from a particular point on the entrance plane arrive at the same point on the exit plane, a fact we shall make use of below. We call such systems conjugate. All other systems are disjugate.8
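A homogeneous gap of reduced width ζ has transference5,11,12,15
$$\mathbf{T} = \begin{pmatrix} \mathbf{I} & \zeta\mathbf{I} & \mathbf{o} \\ \mathbf{O} & \mathbf{I} & \mathbf{o} \\ \mathbf{o}' & \mathbf{o}' & 1 \end{pmatrix}. \tag{14}$$
A thin system has transference5,11,12,15
$$\mathbf{T} = \begin{pmatrix} \mathbf{I} & \mathbf{O} & \mathbf{o} \\ \mathbf{C} & \mathbf{I} & \boldsymbol{\pi} \\ \mathbf{o}' & \mathbf{o}' & 1 \end{pmatrix} \tag{15}$$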
in which C is symmetric. Of course in Equation 14, ζ ≥ 0. However, optical systems with transferences given by Equation 14 and with ζ < 0 are realizable;5 they function as if they were gaps of negative thickness.14 It will be useful below to make use of such systems. Thus, we place no bounds on the reduced thickness ζ.
In the transference T, the translation e and the deviation π account for effects of elements in the system that may be tilted or decentered. When all the elements are centered on axis Z and none are tilted relative to Z, then e = o and π = o. We shall refer to such systems as centered; all the others are decentered. For centered systems, the fifth row and the fifth column of the transference T (Equation 4) are both trivial. They can simply be omitted. The transference then becomes a 4 × 4 matrix of the form
$$\mathbf{T} = \begin{pmatrix} \mathbf{A} & \mathbf{B} \\ \mathbf{C} & \mathbf{D} \end{pmatrix}. \tag{16}$$
Furthermore, the fifth row (the 1) of the ray state γ (Equation 1) then falls away, and the ray state becomes 4 × 1. In most examples below, we shall treat centered systems. Whether a transference is 4 × 4 or 5 × 5 or whether it could be either will be stated explicitly or will be clear from the context.
It will prove convenient to represent systems symbolically as follows: [ ] for a homogeneous gap, | for a thin system (a refracting interface or a thin lens), and □ for a general system. Compound systems can be represented by means of strings with subscripts added where necessary to identify or distinguish subsystems. We make a distinction between a thick lens |[ ]| and a thick system that is not necessarily of the form |[ ]|. □1□2□3 represents a system that is a combination of three successive systems in the order □1, □2, □3 in the positive sense along the longitudinal axis Z.
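For a system SC of the form □1□2□3, the transference is the product of the individual transferences in reverse order, i.e.,5,11,12
$$\mathbf{T}_C = \mathbf{T}_3\mathbf{T}_2\mathbf{T}_1. \tag{17}$$
Fig. 2 shows a system S = □ with homogeneous gaps [ ]0 and [ ] appended upstream and downstream. The first gap has reduced width ζ0 and the second ζ. Collectively, the three component systems constitute a compound system SC = [ ]0□[ ]. SC has entrance plane O and exit plane I. (Below O and I will become object and image planes, but for the moment they are arbitrary planes.) Applying Equation 17 with T1 and T3 given by Equation 14 and T2 by Equation 4, we obtain the transference of the compound system:
$$\mathbf{T}_C = \begin{pmatrix} \mathbf{A} + \zeta\mathbf{C} & \mathbf{A}\zeta_0 + \mathbf{B} + \zeta\mathbf{C}\zeta_0 + \zeta\mathbf{D} & \mathbf{e} + \zeta\boldsymbol{\pi} \\ \mathbf{C} & \mathbf{C}\zeta_0 + \mathbf{D} & \boldsymbol{\pi} \\ \mathbf{o}' & \mathbf{o}' & 1 \end{pmatrix}. \tag{18}$$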
Below we shall make use of two orthogonal matrices, the rotation matrix
and the reflection matrix
To avoid ambiguity of sign later, we impose the restriction −90° < θ ≤ 90° in the case of Rθ and −45° < θ ≤ 45° in the case of −Rθ, although it will be convenient occasionally to relax these constraints. As we shall see, the restrictions do not imply a loss of generality. Rθ represents rotation through angle θ, and −Rθ represents reflection in a line at angle θ. When we come to use these matrices below, they will represent operations on the image relative to the object. A positive angle θ is an angle measured counterclockwise when one is looking in the positive sense along the longitudinal axis Z. We note, for later reference, that
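Any matrix of the form
$$\begin{pmatrix} p & -q \\ q & p \end{pmatrix} \tag{25}$$
with unit determinant is a rotation matrix, and any matrix of the form
$$\begin{pmatrix} p & q \\ q & -p \end{pmatrix} \tag{26}$$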
with determinant −1 is a reflection matrix.18 Under the restrictions on θ imposed above, p ≥ 0.
STIGMATIC AND ASTIGMATIC SYSTEMS
Consider a point object located in transverse plane O in Fig. 2. Suppose the point object maps to a point image through system S. We choose transverse plane I such that it contains the point image. Planes O and I then become object and image planes, respectively. It follows (see the paragraph immediately before Equation 14) that compound system SC is conjugate, and hence its disjugacy is BC = O. Thus, from the top, middle block of Equation 18 we have
$$\mathbf{A}\zeta_0 + \mathbf{B} + \zeta\mathbf{C}\zeta_0 + \zeta\mathbf{D} = \mathbf{O}. \tag{27}$$
We now make the following definitions:
DEFINITION: A system is called stigmatic if to every point object there corresponds a point image through the system.
DEFINITION: A system that is not stigmatic is called astigmatic.
For an astigmatic system, there is at least one point object that does not map to a point image. There may be point objects that do map to point images.
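The conjugacy condition on the compound system SC = [ ]0□[ ] can be checked numerically. The sketch below is illustrative only and assumes centered systems, so that a transference is stored as a tuple of its four 2 × 2 blocks (A, B, C, D); the helper names `gap`, `compose`, and `compound_disjugacy` are hypothetical.

```python
def mul(X, Y):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def scale(s, X):
    return [[s * x for x in row] for row in X]

I2 = [[1.0, 0.0], [0.0, 1.0]]
O2 = [[0.0, 0.0], [0.0, 0.0]]

def gap(zeta):
    """Blocks (A, B, C, D) of a homogeneous gap of reduced width zeta."""
    return (I2, scale(zeta, I2), O2, I2)

def compose(first, second):
    """Blocks of `first` followed by `second`, i.e. the product T_second T_first."""
    A1, B1, C1, D1 = first
    A2, B2, C2, D2 = second
    return (add(mul(A2, A1), mul(B2, C1)),
            add(mul(A2, B1), mul(B2, D1)),
            add(mul(C2, A1), mul(D2, C1)),
            add(mul(C2, B1), mul(D2, D1)))

def compound_disjugacy(S, zeta0, zeta):
    """Disjugacy B_C of the compound system gap(zeta0), S, gap(zeta)."""
    return compose(compose(gap(zeta0), S), gap(zeta))[1]

def is_null(X, tol=1e-12):
    return all(abs(x) <= tol for row in X for x in row)

# Thin spherical lens, divergence C = -4 I: the point at zeta0 = 0.5 images at zeta = 0.5.
sphere = (I2, O2, [[-4.0, 0.0], [0.0, -4.0]], I2)
# Thin astigmatic lens: the same choice of zeta leaves a nonzero disjugacy.
toric = (I2, O2, [[-4.0, 0.0], [0.0, -2.0]], I2)
B_sphere = compound_disjugacy(sphere, 0.5, 0.5)
B_toric = compound_disjugacy(toric, 0.5, 0.5)
```

For the thin astigmatic lens in this example, no single ζ makes both diagonal entries of the disjugacy vanish for ζ0 = 0.5, so that point object has no point image.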
For completeness, we must allow object and image planes to be “at infinity” or at any finite distance, negative or positive. Thus, we take ζ0 and ζ in Equation 27 as points on the extended real number line, i.e., the familiar number line plus ∞. That being so, we need to be on the lookout for potential indeterminate forms involving ∞ in addition to those involving 0. This is especially so because of the importance in optometry of the case of ζ0 = ∞. ζ0 ≥ 0 implies a real object point; for ζ0 = 0 the object point is on the entrance plane of system S, and for ζ0 < 0 the object is virtual. Similarly, for ζ ≥ 0 the image is real; for ζ = 0 the image is on the exit plane of S; and for ζ < 0 the image is virtual. However, these names are of no consequence for our purposes; we shall not need to refer to them again. As mentioned above, systems that function as if they were gaps of negative thickness are realizable. Although compound system SC may be infinite in length, system S itself is finite as are all the entries in its transference, including A, B, C, and D in Equation 27 in particular.
Let a, b, c, and d represent real numbers and R either a particular rotation matrix Rθ with −90° < θ ≤ 90° or a particular reflection matrix −Rθ with −45° < θ ≤ 45°. We state and then prove the following theorem:
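THEOREM: A system is stigmatic if and only if its transference is of the form
$$\mathbf{T} = \begin{pmatrix} a\mathbf{R} & b\mathbf{R} & \mathbf{e} \\ c\mathbf{R} & d\mathbf{R} & \boldsymbol{\pi} \\ \mathbf{o}' & \mathbf{o}' & 1 \end{pmatrix}. \tag{28}$$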
PROOF: The theorem takes the form of a biconditional. Following normal practice, we prove one conditional of the pair and then the other.
For one of the conditionals, we assume that the transference is of the form in Equation 28. We are then required to prove that for every ζ0 there exists a ζ that satisfies Equation 27. From the third symplectic equation (Equation 11) we have
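which, because of Equation 23, shows that
$$ad - bc = 1. \tag{30}$$
The latter is the symplectic equation of two-dimensional linear optics.11,12 Substitution into Equation 27 leads to
$$(a\zeta_0 + b + \zeta c\zeta_0 + \zeta d)\mathbf{R} = \mathbf{O} \tag{31}$$
and, hence, because of Equation 24,
$$a\zeta_0 + b + \zeta c\zeta_0 + \zeta d = 0. \tag{32}$$
Solving we obtain
$$\zeta = -\frac{a\zeta_0 + b}{c\zeta_0 + d}. \tag{33}$$
This shows that, with two potential exceptions, there is a ζ that corresponds to every ζ0. The two potential exceptions occur when ζ0 = ∞ and when the denominator on the right side of Equation 33 is zero. In the first case, we set ζ = ζ∞; Equation 33 becomes
$$\zeta_\infty = -\frac{a}{c}. \tag{34}$$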
Because of Equation 30, c and a cannot both be zero, and when c = 0 ζ∞ = ∞. Thus, ζ∞ is always defined. For a zero denominator, again because of Equation 30, Equation 33 reduces to ζ = 1/(0 × c), which is always ∞. Thus, for every ζ0 there is a ζ, which means that for every point object there is a point image. By definition then the system is stigmatic. We have proved one of the two conditionals.
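The case analysis above can be summarized in a short numerical sketch. It assumes a transference of stigmatic form with scalars a, b, c, d satisfying ad − bc = 1, and it represents the single point ∞ of the extended number line by `math.inf`; the function names are hypothetical.

```python
import math

def image_distance(a, b, c, d, zeta0):
    """Reduced image distance zeta for reduced object distance zeta0,
    following zeta = -(a*zeta0 + b)/(c*zeta0 + d) and its two special cases."""
    if not math.isclose(a * d - b * c, 1.0, abs_tol=1e-9):
        raise ValueError("expected ad - bc = 1")
    if math.isinf(zeta0):
        # Object at infinity: zeta = -a/c, or infinity when c = 0.
        return math.inf if c == 0 else -a / c
    denominator = c * zeta0 + d
    if denominator == 0:
        # Zero denominator: the image is at infinity.
        return math.inf
    return -(a * zeta0 + b) / denominator

def residual(a, b, c, d, zeta0, zeta):
    """Scalar factor multiplying R in the disjugacy of the compound system."""
    return a * zeta0 + b + zeta * (c * zeta0 + d)

# Example scalars: a gap of reduced width 0.5 followed by a thin lens with c = -4.
a, b, c, d = 1.0, 0.5, -4.0, -1.0
zeta_finite = image_distance(a, b, c, d, 0.5)
zeta_distant = image_distance(a, b, c, d, math.inf)
zeta_collimated = image_distance(a, b, c, d, -0.25)
```

In this example every finite or infinite ζ0 yields a value of ζ on the extended number line, which is the content of the first conditional.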
The converse conditional is more problematic. We assume that for every ζ0 there is a ζ, or in other words that Equation 27 is satisfied for all ζ0. We have then to prove that Equation 28 is a consequence.
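Because, by assumption, Equation 27 holds in general, it holds for ζ0 = 0 in particular. For ζ0 = 0 we represent the corresponding ζ by ζ⁰ (with a superscript zero, to distinguish it from the object distance ζ0). Substituting into Equation 27 we obtain
$$\mathbf{B} + \zeta^0\mathbf{D} = \mathbf{O}. \tag{35}$$
The only potential indeterminacy is in the term ζCζ0 in Equation 27: it could be that ζ⁰ = ∞, in which case we have the form ∞ × 0. The problem is overcome by rewriting Equation 27 as
$$\frac{1}{\zeta}(\mathbf{A}\zeta_0 + \mathbf{B}) + \mathbf{C}\zeta_0 + \mathbf{D} = \mathbf{O}$$
and then setting ζ0 = 0 and ζ = ζ⁰ = ∞. One obtains D = O, a result already covered by Equation 35. Equation 35 then shows that B and D are scalar multiples of a common matrix, M, say. Thus, we can write
$$\mathbf{B} = \tilde{b}\mathbf{M} \tag{36}$$
$$\mathbf{D} = \tilde{d}\mathbf{M} \tag{37}$$
where b̃ and d̃ are real numbers. b̃ = 0 when B = O and d̃ = 0 when D = O. Because of the third symplectic equation (Equation 11), M is not null and b̃ and d̃ are not both zero. Hence, because of Equations 36 and 37, Equation 35 reduces to
$$\tilde{b} + \zeta^0\tilde{d} = 0. \tag{38}$$
ζ⁰ = ∞ when d̃ = 0. We substitute from Equations 36 and 37 into Equation 27 and rearrange to give
$$(\mathbf{A} + \zeta\mathbf{C})\zeta_0 = -(\tilde{b} + \zeta\tilde{d})\mathbf{M}. \tag{39}$$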
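We choose any particular ζ0 other than 0 and ∞ and such that ζ is neither 0 nor ∞. Then Equation 39 shows that A + ζC is a scalar multiple of M. Hence, we can write
$$\mathbf{A} + \zeta\mathbf{C} = s\mathbf{M} \tag{40}$$
where s is a scalar dependent on ζ0. For a particular ζ0, say ζ′0, ζ and s are ζ′ and s′, respectively, and Equation 40 becomes
$$\mathbf{A} + \zeta'\mathbf{C} = s'\mathbf{M}. \tag{41}$$
Similarly,
$$\mathbf{A} + \zeta''\mathbf{C} = s''\mathbf{M} \tag{42}$$
for another particular ζ0. Subtraction shows that we can write
$$\mathbf{C} = \tilde{c}\mathbf{M} \tag{43}$$
where c̃ is a real number. Substitution from Equations 36, 37, and 43 into Equation 27 gives
$$\mathbf{A}\zeta_0 = -(\tilde{b} + \zeta\tilde{c}\zeta_0 + \zeta\tilde{d})\mathbf{M} \tag{44}$$
which shows that
$$\mathbf{A} = \tilde{a}\mathbf{M} \tag{45}$$
for a real number ã.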
Substituting from Equations 36, 37, 43, and 45 into the third symplectic equation (Equation 11), we obtain
As for P in Equation 13, we express M in terms of I, J, K, and L (Equation 12). Multiplying we obtain
Comparison with Equation 46 shows that the coefficients of J and K must be zero, i.e.,
There are two distinct ways to satisfy Equations 48 and 49 simultaneously: either
It follows from Equations 12 and 13 that M has one of two forms, namely,
We choose MI and MJ such that MI ≥ 0 and MJ ≥ 0. (We can always do so with the appropriate choice of sign for ã, b̃, c̃, and d̃.) MR and MS are of the forms in Equations 25 and 26, respectively, except that their determinants are not generally 1 or −1, respectively. Dividing by √|det M| converts MR to a rotation matrix (Equation 19) and MS to a reflection matrix (Equation 20). Thus, Equation 45 can be written
and R is a rotation or a reflection matrix. Similarly, one obtains B = bR (from Equation 36), C = cR (from Equation 43), and D = dR (from Equation 37), in which the R’s are all the same matrix and scalars b, c, and d in turn each replace a on both sides of Equation 55. Hence, Equation 28 holds; we have proved the converse conditional.
Having proved the two component conditionals of the biconditional, we have proved the theorem.
The proof above is rather tedious; possibly, a more elegant one is waiting in the wings.
The restrictions imposed on rotations by −90° < θ ≤ 90° and on reflections by −45° < θ ≤ 45° remove ambiguity in the signs of the scalars a, b, c, and d and the orthogonal matrix R; they make the scalars and the matrix unique.
We shall say a transference has stigmatic form or is stigmatic if it has the form of Equation 28.
It is a corollary of the theorem above that if a system has a transference of stigmatic form, then the transference is necessarily constrained by Equation 30.
We have here defined a system to be stigmatic if and only if, through the system, every point object maps to a point image. Astigmatic systems are defined by means of negation: they are systems that are not stigmatic. In other words, for an astigmatic system there is at least one point object that does not map to a point image; there may nevertheless be point objects that do map to point images. That a point object maps to a point image is not sufficient to make a system stigmatic.
The mathematics has shown that the condition that a system be stigmatic is that all four 2 × 2 fundamental properties of the system are scalar multiples of a common orthogonal matrix R (Equation 28). Because products of orthogonal matrices are orthogonal, a system compounded of stigmatic systems is stigmatic. The two types of orthogonal 2 × 2 matrix define two disjoint classes of stigmatic systems. They are the topic of an accompanying article.1
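The condition of the theorem lends itself to a simple numerical test. The sketch below is one possible check, not a method given in this article: it assumes the four blocks come from a valid (symplectic) transference, tests whether the nonzero blocks are mutually proportional, and tests whether they have the rotation pattern [[p, −q], [q, p]] or the reflection pattern [[p, q], [q, −p]]. The function names and the tolerance are hypothetical choices.

```python
def flat(P):
    return [P[0][0], P[0][1], P[1][0], P[1][1]]

def is_null(P, tol=1e-12):
    return all(abs(x) <= tol for x in flat(P))

def proportional(P, Q, tol=1e-12):
    """True if P and Q, viewed as 4-vectors, are linearly dependent."""
    p, q = flat(P), flat(Q)
    return all(abs(p[i] * q[j] - p[j] * q[i]) <= tol
               for i in range(4) for j in range(i + 1, 4))

def orthogonal_kind(P, tol=1e-12):
    """Classify a nonzero block as a multiple of a rotation or of a reflection."""
    (p11, p12), (p21, p22) = P
    if abs(p11 - p22) <= tol and abs(p12 + p21) <= tol:
        return "rotation"
    if abs(p11 + p22) <= tol and abs(p12 - p21) <= tol:
        return "reflection"
    return None

def stigmatic_class(A, B, C, D, tol=1e-12):
    """Return 'rotation' or 'reflection' if A, B, C, D are scalar multiples of
    one orthogonal matrix of that type, and None otherwise (astigmatic)."""
    blocks = [P for P in (A, B, C, D) if not is_null(P, tol)]
    if not blocks:
        return None  # cannot occur for a valid transference
    kind = orthogonal_kind(blocks[0], tol)
    if kind is not None and all(proportional(blocks[0], P, tol) for P in blocks[1:]):
        return kind
    return None

I2 = [[1.0, 0.0], [0.0, 1.0]]
O2 = [[0.0, 0.0], [0.0, 0.0]]
# Gap of reduced width 0.5 followed by a thin spherical lens (C = -4 I).
thick = stigmatic_class(I2, [[0.5, 0.0], [0.0, 0.5]],
                        [[-4.0, 0.0], [0.0, -4.0]], [[-1.0, 0.0], [0.0, -1.0]])
# Thin astigmatic lens: C is not a scalar multiple of the matrix shared by A and D.
toric = stigmatic_class(I2, O2, [[-4.0, 0.0], [0.0, -2.0]], I2)
# A = D = reflection in the horizontal, B = C = O.
mirror = stigmatic_class([[1.0, 0.0], [0.0, -1.0]], O2, O2, [[1.0, 0.0], [0.0, -1.0]])
```

A zero block is treated as a scalar multiple (with scalar 0) of whatever orthogonal matrix the other blocks share, which matches the role of b = 0 or c = 0 in Equation 28.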
The questions posed at the outset have now been answered. In general, an optical device is astigmatic when its four 2 × 2 fundamental properties are not all scalar multiples of the same orthogonal matrix. The same applies to an eye and an eye in combination with an optical instrument. Representing, in effect, a thin system, the refraction of an eye is astigmatic if the cylinder is not zero. However, an eye is more than its refraction and may well be astigmatic even if its refraction is scalar. For a system to be stigmatic, it is not sufficient that its dioptric power be spherical; in fact, if the dioptric power is spherical, the system is astigmatic unless all four 2 × 2 fundamental properties are scalar matrices.
I thank J. Rubinstein, Department of Mathematics, Indiana University, for useful comments on the manuscript. I also thank R. Blendowske of the Department of Optical Technologies and Image Processing, University of Applied Sciences, Darmstadt, and R. D. van Gool, G. E. MacKenzie, H. Abelman, W. Heath, A. Rubin, A. S. Carlson, and W. D. H. Gillan of the Optometric Science Research Group for continuing discussions.
W. F. Harris
Optometric Science Research Group
Department of Optometry
Rand Afrikaans University
P.O. Box 524
Auckland Park, Johannesburg, 2006
1. Harris WF. Proper and improper stigmatic optical systems. Optom Vis Sci 2004;81:953–9.
2. Harris WF. Optical effects of ocular surgery including anterior segment surgery. J Cataract Refract Surg 2001;27:95–106.
3. Keating MP. A system matrix for astigmatic optical systems: I. Introduction and dioptric power relations. Am J Optom Physiol Opt 1981;58:810–9.
4. Keating MP. A system matrix for astigmatic optical systems: II. Corrected systems including an astigmatic eye. Am J Optom Physiol Opt 1981;58:919–29.
5. Sudarshan ECG, Mukunda N, Simon R. Realization of first order optical systems using thin lenses. Optica Acta 1985;32:855–72.
6. Bastiaans MJ. Second-order moments of the Wigner distribution function in first-order optical systems. Optik 1991;88:163–8.
7. Harris WF. Paraxial ray tracing through noncoaxial astigmatic optical systems, and a 5×5 augmented system matrix. Optom Vis Sci 1994;71:282–5.
8. Harris WF. A unified paraxial approach to astigmatic optics. Optom Vis Sci 1999;76:480–99.
9. Harris WF. Magnification, blur, and ray state at the retina for the general eye with and without a general optical instrument in front of it: 1. Distant objects. Optom Vis Sci 2001;78:888–900.
10. Harris WF. Dioptric power: its nature and its representation in three- and four-dimensional space. Optom Vis Sci 1997;74:349–66.
11. Guillemin V, Sternberg S. Symplectic Techniques in Physics. Cambridge: Cambridge University Press, 1984.
12. Bamberg PG, Sternberg S. A Course of Mathematics for Students of Physics, vol. 1. Cambridge: Cambridge University Press, 1988.
13. Walther A. The Ray and Wave Theory of Lenses. Cambridge: Cambridge University Press, 1995.
14. Harris WF. Realizability of optical systems of given linear optical character. Optom Vis Sci 2004;81:807–9.
15. Harris WF. Analysis of astigmatism in anterior segment surgery. J Cataract Refract Surg 2001;27:107–28.
16. Campbell C. Ray vector fields. J Opt Soc Am A 1994;11:618–22.
17. Campbell C. The refractive group. Optom Vis Sci 1997;74:381–7.
18. Friedberg SH, Insel AJ, Spence LE. Linear Algebra, 2nd ed. Englewood Cliffs, NJ: Prentice-Hall, 1992.
Keywords: stigmatic system; astigmatic system; linear optics; ray transference; dioptric power matrix
© 2004 American Academy of Optometry