Reassessing a Large-Scale Syphilis Epidemic Using an Estimated Infection Date

Schumacher, Christina M. MHS*; Bernstein, Kyle T. PhD; Zenilman, Jonathan M. MD; Rompalo, Anne M. MD, ScM

doi: 10.1097/01.olq.0000175400.40084.3e

Objectives: Timely ascertainment of syphilis cases is critical to initiating disease-control measures. Epidemic curves typically use the report date and may introduce lag-time bias into assessment.

Goal: To reassess a large syphilis epidemic using an imputed infection date.

Study: We compared 2 types of epidemic curves—1 based on report date and 1 on estimated infection date—using the large 1993–2003 Baltimore epidemic as our model.

Results: In general, the shape of the report curves did not accurately reflect the shape of the corresponding infection curves during the growth period (period of largest increase in incidence); during the hyperendemic period (period of highest incidence), peaks in report curves did not follow peaks in the infection curve by the appropriate lag time. There was a tendency for reporting data to underestimate infections during the growth period and overestimate infections during the hyperendemic period. A sensitivity analysis showed similar trends regardless of the length of stage-specific incubation period used.

Conclusions: Lag-time bias may be present when using epidemic curves based on report dates. Health departments should consider using an estimated infection date.

Reassessment of a large-scale syphilis epidemic found that epidemic curves based on estimated infection date may better describe epidemics than traditionally derived report date curves.

From the *Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland; †Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, and Baltimore City Health Department, Baltimore, Maryland; and the ‡Johns Hopkins University School of Medicine, Division of Infectious Diseases, Baltimore, Maryland

This work was supported by NIH grants R01-AI45724 and K24-AI01633.

The authors would like to thank Vivian Go, Laura McGough, and Aaron Goodfellow for their insightful comments on and corrections to the manuscript.

Address for reprints: Anne M. Rompalo, MD, ScM, 1830 East Monument Street, Room 447, Baltimore, MD 21205. E-mail:

Received for publication June 4, 2004, and accepted April 4, 2005.

IN THE LATE 1980s, incidence of early syphilis increased rapidly, especially among heterosexual blacks. The outbreak was associated with increased crack cocaine use, decreased health care access and reduced efficacy of public health clinical and partner notification services.1 In response, the Centers for Disease Control and Prevention (CDC) and local authorities instituted the National Syphilis Elimination Plan, which combined improved health care access, rapid ethnographic assessment, community involvement, enhanced surveillance, enhanced health promotion, and rapid outbreak response.

Precise surveillance is paramount to launching a rapid outbreak response. Epidemic curves are used to determine both the presence of epidemic transmission within a community and its temporal context. In an outbreak investigation, cases often are parceled into 2 groups: “pre-epidemic” and “epidemic”; comparing characteristics of cases from pre-epidemic to those from epidemic periods can elucidate important sociological, behavioral, and biologic risk factors that may inform disease control.

Epidemic curves typically are derived from data that use the date of report. However, the time between the initiation of infection, the diagnosis of infection, and the provider report of the infection varies between individuals. Traditionally generated epidemic curves may introduce a lag time bias into epidemiologic investigations. This analysis investigates the effect of the incubation period and time between diagnosis and report on the accuracy of the epidemic curve during the early stages of a large epidemic: the initial period of increased incidence, the “growth period,” and the period of highest incidence, the “hyperendemic period.”2

Baltimore experienced a large-scale syphilis epidemic in the mid 1990s.3 Using this epidemic as a model, we compared the report-date epidemic curves that were actually used with epidemic curves generated from imputed dates of infection during the growth and hyperendemic periods.

Materials and Methods

Data for this analysis were records of early (primary, secondary, and early latent) syphilis reported to the Baltimore City Health Department (BCHD) between January 1994 and June 2003. The report date associated with a syphilis case corresponds to the day BCHD processed and assigned each case (received from either the laboratory or the provider report of the case) to a disease intervention specialist (DIS). We imputed the infection date by subtracting the median incubation time of the corresponding syphilis stage from the date of diagnosis. For the primary analysis, we determined the median incubation time for each syphilis stage to be one-half of the stage-specific incubation periods used for identification of at-risk sexual partners as outlined in the CDC treatment guidelines: 45 days for primary syphilis, 90 days for secondary, and 183 days for early latent syphilis.4 In accordance with the National Plan to Eliminate Syphilis recommendations for analysis of syphilis data,5 cases were aggregated into yearly quarters. Quarter 1 (Q1) is defined to be January–March, quarter 2 (Q2) April–June, quarter 3 (Q3) July–September, and quarter 4 (Q4) October–December. Cases with data missing on sex, report date, or diagnosis date were excluded.
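The imputation and quarterly aggregation described above can be sketched in a few lines of Python. This is a minimal illustration, not the Stata code used for the analysis: the function names, the record layout, and the three example cases are hypothetical; only the median incubation times (45, 90, and 183 days) and the quarter definitions come from the text.

```python
from collections import Counter
from datetime import date, timedelta

# Median incubation times in days, as stated in the Methods.
MEDIAN_INCUBATION_DAYS = {"primary": 45, "secondary": 90, "early_latent": 183}


def impute_infection_date(diagnosis_date, stage):
    """Subtract the stage-specific median incubation time from the diagnosis date."""
    return diagnosis_date - timedelta(days=MEDIAN_INCUBATION_DAYS[stage])


def quarter_label(d):
    """Map a date to a label such as 'Q1 1994' (Q1 = January-March)."""
    return f"Q{(d.month - 1) // 3 + 1} {d.year}"


# Hypothetical records: (stage, diagnosis date, report date).
cases = [
    ("primary", date(1995, 4, 10), date(1995, 4, 20)),
    ("secondary", date(1995, 4, 10), date(1995, 4, 12)),
    ("early_latent", date(1995, 4, 10), date(1995, 4, 11)),
]

infection_curve = Counter()  # keyed by (stage, quarter of imputed infection)
report_curve = Counter()     # keyed by (stage, quarter of report)
for stage, diagnosed, reported in cases:
    infected = impute_infection_date(diagnosed, stage)
    infection_curve[(stage, quarter_label(infected))] += 1
    report_curve[(stage, quarter_label(reported))] += 1
```

In this toy example, the early latent case diagnosed on April 10, 1995 is counted in Q4 1994 on the infection curve but in Q2 1995 on the report curve, which is the kind of shift the two curve types are meant to expose.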

Two epidemic curves were generated. A report curve was based on the date of report and an infection curve was based on an estimated date of infection. All curves were stratified by sex in addition to disease stage (primary, secondary and early latent).6 Using infection date curves, the growth period was defined as the period with the largest increases in incidence, and the hyperendemic period was defined as the period with the largest overall incidence.

Because of the individual variability of incubation periods, a sensitivity analysis was performed for each disease stage using a range of incubation periods. For primary syphilis, we examined incubation periods of 15, 21, 30, and 45 days; for secondary syphilis, 60, 90, and 120 days; and for early latent syphilis, 150, 183, and 210 days.4,7
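One way to picture the sensitivity analysis is to rebuild the infection curve once per candidate incubation period and compare the results. The sketch below assumes a simple list of diagnosis dates per stage; the helper names and the three example dates are hypothetical, while the incubation periods are those listed above.

```python
from collections import Counter
from datetime import date, timedelta

# Incubation periods (days) examined in the sensitivity analysis.
SENSITIVITY_DAYS = {
    "primary": [15, 21, 30, 45],
    "secondary": [60, 90, 120],
    "early_latent": [150, 183, 210],
}


def quarter_label(d):
    """Map a date to a label such as 'Q1 1994' (Q1 = January-March)."""
    return f"Q{(d.month - 1) // 3 + 1} {d.year}"


def infection_curves(diagnosis_dates, stage):
    """Return {incubation_days: Counter(quarter -> estimated infections)}."""
    curves = {}
    for days in SENSITIVITY_DAYS[stage]:
        shifted = (d - timedelta(days=days) for d in diagnosis_dates)
        curves[days] = Counter(quarter_label(d) for d in shifted)
    return curves


# Hypothetical diagnosis dates for primary syphilis.
primary_dx = [date(1994, 4, 5), date(1994, 4, 25), date(1994, 5, 20)]
curves = infection_curves(primary_dx, "primary")
for days, curve in curves.items():
    print(days, dict(curve))
```

Cases diagnosed shortly after a quarter boundary move into the previous quarter as the assumed incubation period lengthens, which is why the quarterly percentages can differ between incubation periods even when the overall shape of the curve is similar.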

To assess any effect of the time between diagnosis and report on the epidemic curves, the average number of days per yearly quarter between the date of syphilis diagnosis and assignment to a DIS was calculated for each stratum. Calculations of infection dates, means, and 95% confidence intervals were done using Stata Statistical Software, Version 7.0 (College Station, TX).8 Graphs were created in Microsoft Excel (Redmond, WA).
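The quarterly delay calculation can be illustrated as follows. This is a sketch under stated assumptions rather than a reproduction of the Stata analysis: the text does not say which date defines the quarter or how the confidence intervals were computed, so the code groups cases by quarter of diagnosis and uses a normal approximation (mean ± 1.96 standard errors). The function name and all example records are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import NormalDist, mean, stdev


def quarter_label(d):
    """Map a date to a label such as 'Q1 1994' (Q1 = January-March)."""
    return f"Q{(d.month - 1) // 3 + 1} {d.year}"


def delay_by_quarter(records):
    """records: iterable of (diagnosis_date, assignment_date).

    Returns {quarter: (mean_days, ci_lower, ci_upper)}.
    Assumption: cases are grouped by quarter of diagnosis and the 95% CI
    uses a normal approximation; neither choice is specified in the text.
    """
    delays = defaultdict(list)
    for diagnosed, assigned in records:
        delays[quarter_label(diagnosed)].append((assigned - diagnosed).days)

    z = NormalDist().inv_cdf(0.975)  # about 1.96
    summary = {}
    for quarter, days in delays.items():
        m = mean(days)
        if len(days) > 1:
            half_width = z * stdev(days) / len(days) ** 0.5
        else:
            half_width = float("nan")  # no interval from a single case
        summary[quarter] = (m, m - half_width, m + half_width)
    return summary


# Hypothetical records: (diagnosis date, date assigned to a DIS).
records = [
    (date(1995, 1, 10), date(1995, 1, 22)),
    (date(1995, 2, 1), date(1995, 2, 9)),
    (date(1995, 3, 1), date(1995, 3, 11)),
    (date(2001, 7, 10), date(2001, 7, 8)),  # assigned before diagnosis
    (date(2001, 8, 1), date(2001, 8, 1)),
    (date(2001, 9, 1), date(2001, 9, 3)),
]
summary = delay_by_quarter(records)
```

A negative delay arises when a case is assigned to a DIS before the diagnosis date is recorded, so quarterly means at or below zero are possible.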

Results

Of the 8409 syphilis cases reported to BCHD between January 1994 and June 2003, 7806 (92.8%) had a diagnosis of primary, secondary, or early latent syphilis. We excluded 143 (1.8%) cases due to missing data: 19 for missing sex data, 1 for a missing report date, and 123 for missing diagnosis dates. This left 7663 cases in the final analysis.

Primary Syphilis

Epidemic curves for males with primary syphilis are shown in Figure 1A. Using the infection curve, the growth period occurred between the first quarter of 1994 and the third quarter of 1995 (Q1 1994–Q3 1995); the hyperendemic period occurred between Q4 1995 and Q2 1997. Results for females with primary syphilis were excluded from this analysis because only 6% of all cases of early syphilis among women were diagnosed as primary syphilis.

Fig. 1

Early in the growth period, the initial increases in infections are not accounted for in the report curve; between Q1 and Q2 1994, infections increased by 167%, yet reports decreased by 13%. At the end of the growth period, although both infection and report curves showed a decline in cases, reports showed a larger decrease: between Q3 and Q4 1995, infections decreased by 27%, while reports decreased by 57%. The overall shape of the curves differs throughout most of the hyperendemic period. There is a lag of 3 months between the peak in infections in Q2 1997 and the peak in reports in Q3 1997, and each of the 2 peaks in the report curve is larger than the corresponding peak in the infection curve.
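The quarter-to-quarter percentages quoted here are simple relative changes in case counts. The sketch below shows the arithmetic; the underlying quarterly counts are not given in the text, so the counts used are hypothetical and were chosen only because they round to the percentages reported for Q1 to Q2 1994.

```python
def percent_change(previous, current):
    """Relative change from one quarter to the next, in percent."""
    return 100 * (current - previous) / previous


# Hypothetical quarterly counts (not taken from the study data).
infections_q1, infections_q2 = 3, 8
reports_q1, reports_q2 = 15, 13

infection_change = round(percent_change(infections_q1, infections_q2))
report_change = round(percent_change(reports_q1, reports_q2))
print(infection_change, report_change)
```

With small quarterly counts, a change of only a few cases produces a large percentage swing, which is worth keeping in mind when reading the stage-specific comparisons.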

During the growth period, reports underestimated the number of estimated infections that occurred. From Q1 1994 to Q3 1995, an estimated 171 infections occurred, while only 147 infections were reported (86%); however, the number of reports and estimated infections during the hyperendemic period are the same.

Secondary Syphilis

Gender-specific curves for secondary syphilis are shown in Figure 1B and C. In both males (Fig. 1B) and females (Fig. 1C), the infection curve shows the growth period to be from Q1 1994 to Q4 1995. In males, the hyperendemic period is from Q1 1996 to Q2 1997; in females, the hyperendemic period lasts 9 months longer, from Q1 1996 to Q1 1998.

In both curves, the most striking increase in infections occurred in the first half of 1995. In males, infections increased by 91% between Q1 and Q2 1995; 3 months later (Q2–Q3 1995), reports increased by only 19%. Similarly, between Q2 and Q3 1995, infections among males decreased by 21%, yet reports showed a 44% decrease in the corresponding period (Q3–Q4 1995). We found comparable discrepancies between the infection and report curves in females. From Q2 to Q3 1995, infections among females increased by 54%; 3 months later (Q3–Q4 1995), reports increased by 18%. Infections continued to increase by 30% between Q3 and Q4 1995; paradoxically, reports decreased by 45% between Q4 1995 and Q1 1996.

During the hyperendemic period, the incongruity of the infection and report curves differed by sex. In males, although the overall incidence peaks showed the expected 3-month lag period, the magnitudes of report curve peaks were notably larger than those in the infection curve. In females, the report curve suggests that a sharp peak in incidence occurred in Q4 1996, after which incidence began to subside. In contrast, the infection curve shows a broad peak between Q3 1996 and Q2 1997, which suggests that the incidence was sustained at a high level over a year-long period.

Among both sexes, the total number of reports underestimated infections during the growth period and overestimated infections during the hyperendemic period. During the growth period (Q1 1994–Q4 1995), an estimated 205 and 243 infections occurred among males and females, respectively. However, only 175 (85%) infections among males and 197 (81%) infections among females were reported in the corresponding period (Q2 1994–Q1 1996). During the hyperendemic period (Q1 1996–Q2 1997 males, Q1 1996–Q1 1998 females), 231 estimated infections occurred in males, and 261 (110%) infections were reported during the corresponding period (Q2 1996–Q3 1997). The extent of overreporting was similar among females (512 infections, 557 reports [109%]), despite the fact that the hyperendemic period was 9 months longer.
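The "corresponding period" for reports is the infection period moved forward by the expected lag. A minimal sketch of that shift is given below, assuming the lag is expressed in whole quarters (1 quarter for the 3-month lag of secondary syphilis, 2 quarters for the 6-month lag of early latent syphilis); the function names are hypothetical.

```python
def shift_quarter(year, quarter, n):
    """Move a (year, quarter) pair forward by n quarters."""
    index = year * 4 + (quarter - 1) + n
    return index // 4, index % 4 + 1


def corresponding_report_period(start, end, lag_quarters):
    """Shift both ends of an infection period by the expected lag."""
    return shift_quarter(*start, lag_quarters), shift_quarter(*end, lag_quarters)


# Secondary syphilis, males: hyperendemic period Q1 1996-Q2 1997, 1-quarter lag.
secondary = corresponding_report_period((1996, 1), (1997, 2), 1)

# Early latent syphilis: growth period Q1 1995-Q4 1995, 2-quarter lag.
early_latent = corresponding_report_period((1995, 1), (1995, 4), 2)

print(secondary, early_latent)
```

The first call yields Q2 1996 to Q3 1997 and the second yields Q3 1995 to Q2 1996, matching the report windows quoted for those strata.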

Early Latent Syphilis

Unlike in primary and secondary syphilis curves, the incidence of early latent syphilis infection among both men (Fig. 1D) and women (Fig. 1E) remained constant throughout 1994 and increased throughout 1995. However, the hyperendemic period is similar to that of both primary and secondary syphilis, although for both sexes, the duration of this period is shorter than for secondary syphilis (Q1 1996–Q1 1997 for males, Q1 1996–Q3 1997 for females).

During the growth period (Q1 1995–Q4 1995), the discrepancies between the infection and report curves in both males and females were similar to those found in secondary syphilis curves. Between Q1 and Q2 1995, the largest increase in infections among both men and women occurred; infections increased by 40% in males and by 96% in females. Six months later, between Q3 and Q4 1995, reports decreased by 48% and 46%, respectively. Furthermore, among females, the growth period of the report curve did not begin until the first quarter of 1996, 1 year after the initial rise in infections began.

During the hyperendemic period (Q1 1996–Q1 1997), the overall trends in incidence in males as shown by the infection curve are reflected in the report curve with a 6-month lag time; however, the overall magnitudes of these trends differ between the 2 curves. In females, the shape of the report curve does not correspond to that of the infection curve: peaks in incidence in the 2 curves occurred simultaneously in the middle of the hyperendemic period (Q4 1996) and were separated by only a 3-month lag time at the end (Q3 1997 infections, Q4 1997 reports).

Similar to secondary syphilis, reports of early latent syphilis also underestimate infections during the growth period and overestimate infections during the hyperendemic period. In 1995, an estimated 386 early latent infections in males occurred, but only 330 (85%) were reported during the corresponding timeframe (Q3 1995–Q2 1996). For females, 301 infections occurred, and 258 (86%) infections were reported. During the hyperendemic period, 602 infections in males and 797 infections in females occurred; 660 (110%) cases in males and 862 cases in females (108%) were reported.
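The percentages in this paragraph are reports expressed as a share of estimated infections over matched periods. The short check below recomputes them from the counts quoted above; the dictionary layout and function name are illustrative only.

```python
def reports_as_percent(reports, infections):
    """Reported cases as a percentage of estimated infections, rounded."""
    return round(100 * reports / infections)


# (reports, estimated infections) for early latent syphilis, from the text.
early_latent_counts = {
    ("male", "growth"): (330, 386),
    ("female", "growth"): (258, 301),
    ("male", "hyperendemic"): (660, 602),
    ("female", "hyperendemic"): (862, 797),
}

percentages = {
    key: reports_as_percent(reports, infections)
    for key, (reports, infections) in early_latent_counts.items()
}
for key, pct in percentages.items():
    print(key, f"{pct}%")
```

Values below 100% indicate that reports fell short of estimated infections in the matched window; values above 100% indicate that more cases were reported than were estimated to have been acquired in that window.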

Sensitivity Analysis of Incubation Periods

An assessment of a range of possible incubation periods, shown with the corresponding report curves, is found in Figure 2. For each disease stage, the infection curves are comparable to one another and diverge from the report curve in a similar manner, regardless of the incubation period used in calculating the infection curve. For example, during the growth period of primary syphilis (Fig. 2A), infections increased no matter what incubation period was used to calculate the infection curve. Between Q1 and Q2 1994, reports decreased by 13%, while infections increased by 85% with a 15-day incubation period, by 57% with a 21-day incubation period, by 200% with a 30-day incubation period, and by 167% with a 45-day incubation period. At the end of the growth period (Q3–Q4 1995), all 4 infection curves showed a smaller decrease in cases than the report curve. During the hyperendemic period, although the overall peak in each of the curves occurred during the same quarter, Q4 1996, the magnitude of the report peak is greater than any of the infection peaks. Furthermore, the pattern of underestimation of infections during the growth period is similar regardless of the length of the incubation period.

Fig. 2

Time Between Syphilis Diagnosis and Assignment of Case

Figure 3 displays the average number of days between the day of syphilis diagnosis and the day the case was assigned to a DIS at BCHD for men with primary syphilis. In general, the average number of days between diagnosis and case assignment stays above zero during the growth period, fluctuates during the hyperendemic period, and drops to zero or below as the epidemic subsides. This trend is similar across all stage and gender strata (data not shown).

Fig. 3

Discussion

The infection date and report date curves for early syphilis demonstrate important differences. During the growth period, the shape of the report curves did not accurately reflect the shape of the corresponding infection curves. During the hyperendemic period, peaks in report curves did not follow peaks in the infection curve by the appropriate lag time, and there was a tendency for reporting data to underestimate infections during the growth period and overestimate infections during the hyperendemic period.

In theory, the report and infection curves should be similar but shifted in time by the appropriate incubation period. However, in the data presented here, all 5 report curves failed to account for large increases in their respective infection curves during the growth of the epidemic; differences in the epidemic peaks were found in male primary syphilis and female secondary and early latent syphilis. Furthermore, reports underestimated infections during the growth period regardless of disease stage or gender and overestimated infections during the hyperendemic period for all but male primary syphilis.

Reassessment of the Baltimore syphilis epidemic using estimated infection dates resulted in 2 important findings. First, a lag time bias due to the stage-specific incubation period may be present when the epidemic period is defined by the report date rather than by the date on which disease transmission actually occurred. Second, the divergence in the shape of the 2 curves may indicate that reporting was not occurring promptly after a syphilis diagnosis was made.

The presence of lag time bias can skew assessments of demographic and social factors associated with pre-epidemic and epidemic periods. Being able to ascertain factors associated with the shift from a pre-epidemic to an epidemic period can provide insight into both causes of the epidemic and means of controlling it. However, even if reporting to the health department occurs within 1 day of diagnosis, as is currently recommended,5 epidemiologic analyses, especially those of risk factors at the community level, may still be biased by the incubation time of the syphilis disease stage. In our analysis, the pattern of reporting deviated from the pattern of infection by more than just the relative median incubation periods, especially during the initial development of the epidemic. This may lead to additional misclassification of the epidemic period and would increase the level of bias in analyses.

Comparison of the 2 sets of curves also indicated that reporting did not promptly follow diagnosis. This is shown by disproportionate increases and paradoxical decreases in reporting throughout 1995, when we would expect a large increase based on the infection curves, as well as by reports underestimating infections early in the epidemic. During this period, the average number of days between diagnosis and case assignment to a DIS remained more than 1 day regardless of sex or disease stage; for a considerable portion of the growth period, the average lag time was 10 days or more. The period between exposure and becoming infectious, the “critical period,” is optimal for partner notification and treatment. Delayed reporting impedes implementation of partner notification programs.

We believe the superimposed infection and reporting peaks and the overestimation of infections by reports during the epidemic period are likely due to increased efforts in controlling syphilis rates in the city, which included expansion of services and professional education. Improved reporting would occur due to increased physician awareness and more intense case-finding activities. Furthermore, infections that occurred throughout the growth period but were not captured as expected in the corresponding report curves were likely reported during the early stages of the hyperendemic period. This may be why overestimation of infections occurs throughout the hyperendemic period.

There are 2 main factors that influence the accuracy of epidemic curves: the ability to capture all cases and lag time between infection and report to the health department. First, in order to be captured by the health department, cases must seek treatment from a provider. Second, the lag time between infection and report is affected by the time between disease transmission and the onset of symptoms, the time between the onset of symptoms and diagnosis, and the time between the diagnosis and the report.

A significant proportion of syphilis infections are never reported because those infected never seek treatment, are never named as a partner, or are unable to be located. Typically, those in the highest-risk categories are never reached by health officials; thus, our analysis may not include cases among high-risk populations. Data from such groups are not included in either the infection or report curve; therefore, any bias on this analysis from their exclusion should be small. However, the effects of these data upon either curve are unknown. Effects of disease stage misclassification on infection or report curves are also unknown.

The stage-specific incubation periods for syphilis vary between individuals, and the precise incubation period for each individual is unknown. Although the natural history of syphilis has been well studied, median incubation periods are difficult to quantify due to the ethical ramifications of performing an appropriate study. In this report, we compared curves derived from a range of incubation periods to the appropriate report curves and found similar divergences with the report curve regardless of which incubation period was used.

The time between the onset of symptoms and diagnosis should vary greatly between individuals and was not available for this study. Symptoms of primary and secondary syphilis resolve within a few weeks,9 and it is likely that cases who may have significantly delayed seeking treatment after symptom onset were those who never sought treatment or were diagnosed later as early latent cases. However, the effect of lag time between disease onset and diagnosis on this analysis is unknown.

An estimate of the time between syphilis diagnosis and case assignment by the health department was calculated. Our results indicate that the average days between diagnosis and DIS assignment may be associated with periods of increasing incidence. Since negative and zero averages are indicative of extensive contact tracing by the health department, our results suggest that active case finding and prompt reporting of cases are associated with declining and stable incidence levels. We believe that a significant proportion of lag time bias in the report date based epidemic curve can be attributed to delayed reporting.

Estimated infection date as the basis for an epidemiologic investigation better represents the epidemic timeline than does report date. It more accurately differentiates the pre-epidemic and epidemic periods, adding insight into community characteristics that facilitate disease transmission. Understanding community dynamics at the time transmission occurs may be more useful in discerning causes of the epidemic and devising a means to control it. In addition, comparison of the 2 curves can serve as a check on communication between providers and the health department.

Syphilis can occur in a number of different populations with different risk behaviors, and outbreak investigations must determine the context in which the epidemic is occurring while disease is confined to a relatively small population, ideally during the “growth period.” Syphilis surveillance must be timely and accurate and therefore should be sensitive enough to accurately portray transmission dynamics within the community. Since most data are now stored electronically, producing timely data to generate curves quickly could be reasonably simple and inexpensive to implement and execute. Health departments should consider using a date of infection when investigating and evaluating epidemics.

References

1. Rolfs RT, Nakashima AK. Epidemiology of primary and secondary syphilis in the United States, 1981 through 1989. JAMA 1990; 264:1432–1437.
2. Wasserheit JN, Aral SO. The dynamic topology of sexually transmitted disease epidemics: implications for prevention strategies. J Infect Dis 1996; 174(suppl 2):S201–S213.
3. Outbreak of primary and secondary syphilis: Baltimore City, Maryland, 1995. MMWR Morb Mortal Wkly Rep 1996; 45:166–169.
4. Centers for Disease Control and Prevention. Sexually transmitted diseases treatment guidelines 2002. MMWR Recomm Rep 2002; 51: 1–78.
5. Division of STD Prevention. The National Plan to Eliminate Syphilis from the United States. Atlanta, Ga: Centers for Disease Control and Prevention, 1999.
6. Hutchinson CM, Rompalo AM, Reichart CA, Hook EW III. Characteristics of patients with syphilis attending Baltimore STD clinics: multiple high-risk subgroups and interactions with human immunodeficiency virus infection. Arch Intern Med 1991; 151:511–516.
7. Golden MR, Marra CM, Holmes KK. Update on syphilis: resurgence of an old problem. JAMA 2003; 290:1510–1514.
8. StataCorp. Stata Statistical Software: Release 7.0. College Station, TX: StataCorp, 2000.
9. Musher DM. Early Syphilis. In: Holmes KK, Mårdh P, Sparling PF, et al., eds. Sexually Transmitted Diseases. New York, NY: McGraw-Hill, 1999:479–485.
© Copyright 2005 American Sexually Transmitted Diseases Association