Word Use in the Poetry of Suicidal and Nonsuicidal Poets
Psychosomatic Medicine:
Original Articles

Wiltsey Stirman, Shannon MA; Pennebaker, James W. PhD

Author Information

From the University of Pennsylvania (S.W.S.), Philadelphia, Pennsylvania and The University of Texas at Austin (J.W.P.), Austin, Texas.

Address reprint requests to: James W. Pennebaker, PhD, Department of Psychology, The University of Texas, Austin, TX 78712. Email: Pennebaker@psy.utexas.edu

Received for publication August 1, 2000; revision received November 8, 2000.

Objective: The purpose of this study was to determine whether distinctive features of language could be discerned in the poems of poets who committed suicide and to test two suicide models by use of a text-analysis program.

Method: Approximately 300 poems from the early, middle, and late periods of nine suicidal poets and nine nonsuicidal poets were compared by use of the computer text analysis program, Linguistic Inquiry and Word Count (LIWC). Language use within the poems was analyzed within the context of two suicide models.

Results: In line with a model of social integration, writings of suicidal poets contained more words pertaining to the individual self and fewer words pertaining to the collective than did those of nonsuicidal poets. In addition, the direction of effects for words pertaining to communication was consistent with the social integration model of suicide.

Conclusions: The study found support for a model that suggests that suicidal individuals are detached from others and are preoccupied with self. Furthermore, the findings suggest that linguistic predictors of suicide can be discerned through text analysis.

LIWC = Linguistic Inquiry and Word Count.

Suicide rates are much higher among poets than among authors of other literary forms or in the general population (1). This phenomenon has variously been attributed to the types of writers who are naturally drawn to poetry as well as to the features of poetry itself. For example, there is retrospective evidence to suggest that many suicidal poets have suffered from some form of depressive disorder throughout their lives (1, 2). Poetry, it has been argued, may be a particularly appealing medium through which to cope with unpredictable mood swings.

Critics of poetry therapy suggest an opposite theory: writing poetry may be harmful to the psychological health of the poet. Silverman and Will (3) claimed that Sylvia Plath’s use of poetry as a coping technique may have undermined her basic control mechanisms, which, in the face of highly stressful life events, contributed to her death. Although Lester and Terry (4) emphasize the benefits of poetry, they also acknowledge that writing poetry can, at times, be stressful because of factors such as harsh reviews, writer’s block, and the social isolation of writing. Of course, the majority of poets do not commit suicide. In fact, among the 83 poets noted by Jamison (1) to suffer from cyclothymia, bipolar, or major depressive illness, only 25% committed suicide.

By studying the poetry of suicidal vs. nonsuicidal poets, we can begin to track the language of the poets over the course of their careers and to isolate which themes or linguistic features may predict future suicide attempts. Previous studies of poetry and its relationship to suicide suggest that there are detectable signs of suicidal ideation in the poetry of the poets who ultimately commit suicide. Silverman and Will (3) found that Sylvia Plath’s poetry shifted from traditional forms and mediated images to a more personal, expressive form over the years. Hoyle (5) pointed to evidence of Plath’s ambivalence toward death, as well as signs of disengagement from social concerns in his examination of her more life-affirming poetry. Similarly, Long (6) noted Anne Sexton’s shifting attitude toward death in her work.

These studies, however, can be faulted on multiple methodological and conceptual grounds. They have generally been limited to examinations of the works of single poets. The poems that have been studied have generally not been selected over the poet’s entire career and have usually been chosen on the basis of their theme or were examined according to their proximity to suicide attempts. Furthermore, in most poetry analyses, only the author(s) rated the poems themselves. Finally, these studies have lacked any sort of controls. That is, poems of the suicidal poets were not compared with the works of nonsuicidal poets who were living in the same country at the same time.

Recent text analytic strategies have suggested that we may be able to identify predictors of suicide by examining the word choices of writers. There is growing evidence that the frequency of word use in written text can be used as an indicator of psychological state (7). Researchers have used text analysis to accurately distinguish somatization (8), depression (9), and schizophrenia (10). Thomas and Duszynski (11) found that a text-analysis approach was an effective predictor of suicidal tendencies among medical students. A promising effort to study the language of “death” poetry was that of McDermott and Porter (12), which used the General Inquirer Computer Content Analysis Program. Emily Dickinson’s death poems were analyzed for factors that accounted for their therapeutic efficacy, by use of poems that spanned her career and a comparison group of psychiatric patients, as well as a control group of nondepressed volunteers (13).

Although promising, these earlier text analysis strategies relied on word-counting computer programs that analyzed a small group of poems in a relatively atheoretical way. The current approach involved systematically evaluating a circumscribed group of word categories among a broad sample of poems written by both suicidal and nonsuicidal poets. Furthermore, we sought to test the viability of Durkheim’s (14) social integration/disengagement model vs. a more traditional hopelessness model of suicide (15, 16) by examining the word choices of the two groups of poets across the span of their careers.

According to Durkheim’s model, the suicidal individual has failed to integrate into society sufficiently and is therefore detached from social life (17). Support for Durkheim’s ideas surrounding social integration has been found in a recent study of suicide in 21 countries (18). Theories of disengagement (19, 20) indicate that suicidal individuals detach from the source of their pain, withdraw from social relationships, and become more self-oriented. For poets who received a degree of recognition for their work, this disengagement may have been enhanced by an increase in self-awareness that can result from fame (21). Fame, Schaller argues, conveys entrance into a smaller, more exclusive group, and, therefore, those who become famous become increasingly self-attentive. In linguistic terms, we would predict that individuals who are disengaged and self-attentive would use more self-references and fewer references to others (21, 22).

A more traditional model of suicide is that of hopelessness. Hopelessness theories suggest that suicide takes place during extended periods of sadness and desperation. Accordingly, the pervasive feelings of hopelessness and helplessness and the tendency to think in terms of absolutes lead to the conclusion that suicide is the only solution (16). Consistent with general hopelessness views, one would also assume that those who were most hopeless would be most preoccupied with death (23). To the degree that feelings of hopelessness predict suicide, we would assume that suicidal poets would use more negative emotion terms (including sadness and anger words), fewer positive emotion words, and more references to death than nonsuicidal poets.

The text analytic strategy that allowed for word counting is based on the recently developed computerized text analysis program LIWC (24). LIWC was developed to provide an efficient method for studying the various emotional, cognitive, structural, and process components present in individuals’ verbal and written speech samples. The program was designed to analyze written text on a word-by-word basis and to calculate the percentage of words in the text that matched over 70 language dimensions.

LIWC captures approximately 80% of all words used in most writing and speech samples and can score words and word stems into multiple categories. For example, the word “cried” is part of four word categories: sadness, negative emotion, overall affect, and past tense verb. Table 1 provides a listing of LIWC categories that proved relevant to this study and examples of the words in each.
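
The word-by-word counting described above can be illustrated with a minimal sketch. It is not the LIWC program or its dictionary: the category names, the word lists in TOY_DICTIONARY, the trailing "*" stem convention, and all function names are assumptions made for illustration only. The sketch shows how a single word can count toward several categories and how each category is scored as a percentage of all words in a text.

```python
import re

# Toy dictionary for illustration only; NOT the actual LIWC dictionary.
# A trailing "*" marks a stem that matches any word beginning with it.
TOY_DICTIONARY = {
    "first_person_singular": ["i", "me", "my"],
    "first_person_plural": ["we", "us", "our"],
    "communication": ["talk*", "share*", "listen*"],
    "sadness": ["cried", "grief"],
    "negative_emotion": ["cried", "grief", "hate*"],
    "affect": ["cried", "grief", "hate*", "love*"],
    "past_tense": ["cried", "walked"],
}


def tokenize(text):
    """Lowercase the text and split it into words (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def matches(word, entry):
    """Return True if `word` matches a dictionary entry (exact word or stem*)."""
    if entry.endswith("*"):
        return word.startswith(entry[:-1])
    return word == entry


def category_percentages(text, dictionary=TOY_DICTIONARY):
    """Return, for each category, the percentage of words in `text` that match it."""
    words = tokenize(text)
    if not words:
        return {category: 0.0 for category in dictionary}
    result = {}
    for category, entries in dictionary.items():
        # A word is counted once per category, but may count in many categories.
        hits = sum(1 for w in words if any(matches(w, e) for e in entries))
        result[category] = 100.0 * hits / len(words)
    return result


sample = "I cried and we talked until my grief passed"
scores = category_percentages(sample)
for category, percent in scores.items():
    print(f"{category}: {percent:.1f}%")
```

In this toy dictionary, as in the example given in the text, the word "cried" contributes to four categories at once (sadness, negative emotion, overall affect, and past tense).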

Table 1

Each language category of LIWC was rated independently by judges during the development of the program and was subsequently validated (see Refs. 7 and 25 for psychometric information). The program has previously been used to correlate word choice with physical health, grade point average, adjustment to college, and adaptive bereavement (26).

The two models of suicide that were examined are suited to a word-by-word text-analysis approach, because there is very little overlap in word groups that would reflect each theory. For example, evidence in support of social integration or disengagement theories should be found in the percentage of references to self and communication with others. At the same time, evidence in support of hopelessness models should be found in the percentage of death and positive and negative emotion words, rather than in the categories that would show support for disengagement models.

We further predicted that the word profiles of the poems by the two groups of poets might also differ according to the period of their careers (early, middle, or late). If the differences in the language of the two groups of poets become more pronounced over time, it would suggest that the condition that led to suicide developed or became more pronounced as the poets’ careers progressed. We hypothesized that there would be an increase in hopelessness words and in references to self and a decrease in positive emotion words and references to communication with others in the later periods of the poets’ careers.

Poet and Poem Selection Procedures

A total of 156 poems by poets who committed suicide were analyzed for differences in language between early (written within 2–3 years of the poet’s first recorded poem), middle (written 2–5 years halfway between the earliest and final poems), and late (written within 1 year of suicide) poems. Criteria for selection were as follows: only published, well-known poets were included; poems had to be in English or translated into English; poets had to have written poems within 1 year of committing suicide; and poets had to have completed suicide and left a sufficient amount of material that could be reliably dated (at least two to three poems per period, and generally at least five per period). Only 9 of Jamison’s (1) 21 poets who committed suicide met these criteria. Poems were selected solely on the basis of the year they were written and were not reviewed for subject or theme before analysis. The time between early and late poems ranged from 10 years (Sexton) to 32 years (Berryman). See Table 2 for the suicidal poets represented and their matched control poets.

Table 2

Fifteen poems from each of nine nonsuicidal poets (N = 135), five from each period, were selected as controls. Each control poet was matched as closely as possible for nationality, era, education, and sex with one of the suicidal poets. Most of the poets in the control group were born within 10 years of the poet with whom they were matched.

Criteria for the early and middle periods of the control poets’ careers were the same as those for the suicide group. The one exception was Osip Mandelstam, for whom only one poem was available for the first 2 to 3 years of his career; the remainder of the poems comprising his early period were therefore selected from a publication that appeared 5 years after the first recorded poem. The late poems for the control poets were those written when the control poet was at the approximate age of death of the suicidal poet to whom he or she was matched (within 1–8 years). For example, poems from Sylvia Plath, who died at age 31, were selected from among her earliest to final works. Poems from Plath’s matched poet, Denise Levertov (who lived to be 74), were selected only from her early years, roughly corresponding to Plath’s writing period. By definition, then, breaking the control poets’ poems into three phases was more arbitrary, since they may have continued to write poems for many years after the suicidal poets’ deaths.

Presence of mental illness or other common contributors to suicidal behavior was not a factor in matching the poets. Nearly half of the control poets (Lowell, Millay, Mandelstam, and Pasternak) exhibited signs of the same depression and other mood disorders found in some of the suicidal poets (1). Thus, differences between the suicidal and control poets are more likely associated with suicide than with mental illness.

Poems were scanned from books of each poet’s collected works into text files and checked for accuracy. Each poem was then analyzed by use of the LIWC program. Because the number of poems in each phase of each poet’s life was different, the LIWC data were aggregated by phase and poet. For each poet, then, three data sets, corresponding to the three poetry phases, were available. Furthermore, because each suicidal poet was matched with a nonsuicidal poet, the two matched poets were linked together as part of the 2 (suicidal vs. nonsuicidal poet) × 3 (phase of career) within-subject design.
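
The aggregation step can be sketched as follows. The text does not state exactly how the per-poem LIWC scores were combined, so this sketch assumes a simple mean of per-poem percentages within each poet and phase; the poet labels, the scores, and the function name are all hypothetical and are not data from the study.

```python
from collections import defaultdict

# Hypothetical per-poem scores: (poet, phase, percentage of first-person singular words).
# The values are invented for illustration and are not data from the study.
poem_scores = [
    ("Poet A", "early", 4.0),
    ("Poet A", "early", 6.0),
    ("Poet A", "middle", 5.0),
    ("Poet A", "late", 7.0),
    ("Poet A", "late", 9.0),
    ("Poet A", "late", 8.0),
    ("Poet B", "early", 2.0),
    ("Poet B", "middle", 3.0),
    ("Poet B", "middle", 1.0),
    ("Poet B", "late", 2.0),
]


def aggregate_by_poet_and_phase(rows):
    """Average per-poem scores so that each (poet, phase) cell holds one value."""
    grouped = defaultdict(list)
    for poet, phase, score in rows:
        grouped[(poet, phase)].append(score)
    return {key: sum(values) / len(values) for key, values in grouped.items()}


cell_means = aggregate_by_poet_and_phase(poem_scores)
for (poet, phase), value in sorted(cell_means.items()):
    print(f"{poet} / {phase}: {value:.2f}")
```

Averaging within cells gives every poet the same weight in each phase regardless of how many poems were available, which is one reasonable way to handle unequal numbers of poems per phase.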

An initial 2 (suicidal vs. nonsuicidal) × 3 (phase of career) × 2 (word types: integration vs. hopelessness) × 3 (specific word categories within word types) omnibus repeated-measures analysis of variance (ANOVA) was computed on the six LIWC variables. An overall suicide by word-type interaction emerged, F (1,8) = 15.67, p < .01, which indicates that the suicidal and nonsuicidal poets differed in their relative use of integration vs. hopelessness words. In addition, a suicide by word type by word category interaction attained significance, F (2,16) = 5.04, p = .02. A significant word main effect and a type by word interaction (both p < .001) also emerged, as did a marginal phase by word-type interaction, F (2,16) = 3.13, p = .07. No other effects approached significance.

In order to understand the suicide effects, separate 2 (suicide) × 3 (phase) within-subject ANOVAs were computed on the word categories of interest. Table 1 summarizes the LIWC categories analyzed for each suicide model. Table 1 contains the means for each word type by phase, and results of the analyses are explained below.
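
A 2 (suicide) × 3 (phase) within-subject ANOVA of this kind can be sketched in plain Python. This is an illustrative implementation of the standard two-way repeated-measures decomposition, not the authors' analysis code; the function names and the synthetic data are assumptions. Each matched pair of poets is treated as one "subject", so with nine pairs the group effect is tested on (1,8) degrees of freedom and the phase and group by phase effects on (2,16).

```python
import random


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def rm_anova_2way(data):
    """Two-way within-subject ANOVA.

    data[s][a][b] is the score for matched pair s, group a, phase b.
    Returns {effect: (F, df_effect, df_error)}.
    """
    n, na, nb = len(data), len(data[0]), len(data[0][0])
    S, A, B = range(n), range(na), range(nb)

    # Grand mean and marginal means.
    g = mean(data[s][a][b] for s in S for a in A for b in B)
    m_a = [mean(data[s][a][b] for s in S for b in B) for a in A]
    m_b = [mean(data[s][a][b] for s in S for a in A) for b in B]
    m_s = [mean(data[s][a][b] for a in A for b in B) for s in S]
    m_ab = [[mean(data[s][a][b] for s in S) for b in B] for a in A]
    m_as = [[mean(data[s][a][b] for b in B) for s in S] for a in A]
    m_bs = [[mean(data[s][a][b] for a in A) for s in S] for b in B]

    # Sums of squares for the effects and their subject-interaction error terms.
    ss_a = n * nb * sum((m_a[a] - g) ** 2 for a in A)
    ss_b = n * na * sum((m_b[b] - g) ** 2 for b in B)
    ss_ab = n * sum((m_ab[a][b] - m_a[a] - m_b[b] + g) ** 2 for a in A for b in B)
    ss_as = nb * sum((m_as[a][s] - m_a[a] - m_s[s] + g) ** 2 for a in A for s in S)
    ss_bs = na * sum((m_bs[b][s] - m_b[b] - m_s[s] + g) ** 2 for b in B for s in S)
    ss_abs = sum(
        (data[s][a][b] - m_ab[a][b] - m_as[a][s] - m_bs[b][s]
         + m_a[a] + m_b[b] + m_s[s] - g) ** 2
        for s in S for a in A for b in B
    )

    def f_ratio(ss_effect, df_effect, ss_error, df_error):
        return (ss_effect / df_effect) / (ss_error / df_error), df_effect, df_error

    df_a, df_b, df_s = na - 1, nb - 1, n - 1
    return {
        "group": f_ratio(ss_a, df_a, ss_as, df_a * df_s),
        "phase": f_ratio(ss_b, df_b, ss_bs, df_b * df_s),
        "group_x_phase": f_ratio(ss_ab, df_a * df_b, ss_abs, df_a * df_b * df_s),
    }


rng = random.Random(0)  # fixed seed so the example is reproducible
# Synthetic percentages for 9 matched pairs x 2 groups x 3 phases (not study data).
data = [[[rng.uniform(0.0, 10.0) for _ in range(3)] for _ in range(2)] for _ in range(9)]
results = rm_anova_2way(data)
for effect, (f_value, df1, df2) in results.items():
    print(f"{effect}: F({df1},{df2}) = {f_value:.2f}")
```

Because the group factor has only two levels, its F ratio equals the square of a paired t statistic computed on each pair's difference in phase-averaged scores, which offers a quick sanity check on the calculation.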

Social Integration Theories

We predicted that suicidal poets would use more references to self and fewer references to communication with others than the nonsuicidal poets and tested the following categories for support of this hypothesis: first-person singular, first-person plural, references to others, and communication words. The data indicate that, overall, the suicidal group used more first-person singular words (eg, I, me, my) than did the control group, F (1,8) = 7.87, p = .02, but no significant differences emerged in the other categories.

Hopelessness Models

We predicted that the suicidal poets would use more negative emotion and death words, and fewer positive emotion words, than their nonsuicidal counterparts. No main effects or interactions emerged for overall positive or negative emotion words. The death variable approached significance, F (1,8) = 4.00, p = .08; that is, the suicidal poets tended to use death words more than the nonsuicidal poets did.

Word by Phase Interactions

Additionally, we hypothesized that the differences between the groups would become more pronounced during the later phases of the poets’ careers and ran phase by word type analyses for each category tested for the two models above. Although we found support for the first social integration hypothesis, that the suicide group used more references to self, we did not see evidence to support the hypothesis that their self-references would increase as they approached their deaths. There was a marginal suicide by phase interaction effect for the use of first-person plural, F (2,16) = 3.23, p = .06. That is, suicidal poets used the words we, us, and our more in the early and middle phases of their careers than did the nonsuicidal group, with the percentage of use dropping sharply below that of the nonsuicide group during the late periods. Although the suicide by phase interaction for the use of communication words was not significant, F (2,16) = 1.74, p = .20, the direction of effects was consistent with the disengagement/social integration model. As the use of communication words decreased across the suicidal group’s careers, it increased across the careers of the poets in the nonsuicide group. Our test of the hopelessness by phase interaction revealed no evidence to support decreases in positive emotion words or increases in negative emotion or death words as the suicidal poets approached the late phases of their careers.

Other Relevant Data

Post hoc, exploratory analyses revealed one other interesting pattern. The suicide group used a significantly higher percentage of sexual words throughout all phases of their careers, F (1,8) = 5.1, p = .05. Although not predicted by either suicide model tested, it is interesting to note that if this finding is not due to chance, stronger evidence was found for a preoccupation with sexual matters than with matters pertaining to death. Post hoc tests of other LIWC categories revealed no other between-group differences.

In general, the results indicate that certain features of poetry may be associated with suicide. Most intriguing was the combined linguistic evidence to support a social integration approach. That is, the suicide group’s poetry contained more first-person singular self-references throughout their careers. That self-references do not increase over time indicates that the suicidal poets’ level of preoccupation with self is not due to a factor such as increasing levels of fame or recognition of their work over time. Additionally, the use of the first-person plural, which might indicate an awareness of and an integration in social and personal relationships, was lower in the suicide group’s poetry than it was in that of the nonsuicide group. Consistent with social integration theories of suicide, the direction of the effects for communication words (eg, talk, share, listen) indicates that the suicide group might have shown a decreased interest in social relationships as they approached the last years of their lives. The poets who ultimately committed suicide also used more words associated with death than did the nonsuicidal group. Surprisingly, though, the amount of negative or positive emotion did not vary significantly between the two groups.

Although some of the distinctive features of the suicide group’s writing (I and death) were clearly present throughout the poets’ careers, from this study, it is less clear whether changes in social integration over time can be discerned and linked to suicide through text analysis. The current study suggests that there might be indications of a predisposition to suicide in the writing, or at least poetry, of particular individuals throughout their careers but that other indicators of suicide might emerge in writing as individuals draw closer to suicide. It should be noted, however, that the testing of each suicide theory involved counts of multiple categories. In light of the exploratory nature of this study and the small sample size, levels of significance were assessed without correcting for the number of analyses run. Further study, with a larger sample of poets, will be necessary to determine whether findings that indicated decreasing levels of social integration in this study can be replicated.

Although it is premature to fingerprint suicide potential by use of a text analysis approach on the basis of this exploratory study, it should be noted that certain configurations of language in poetry might be predictors of suicide. If poetry allows individuals to reveal their ideation in a more symbolic manner, text analysis is a method of sorting through a group of poems to reveal characteristics of writing that are associated with suicide. Future studies to compare the poetry, as well as other forms of writing, of a larger sample of suicidal and nonsuicidal individuals will be useful in determining patterns of language with the greatest predictive value and also to determine whether changes in language configuration over time are associated with suicide.

There are a number of factors that contribute to the decision to commit suicide. The text-analysis approach to analyzing the poetry of suicidal individuals indicates that a combination of factors can also be discerned in the writing of suicidal individuals. That is, text analysis can be used as a tool for understanding the way that psychological pain, preoccupation with death and self, and association between thought and feeling can be manifested in writing and potentially predict (or indicate the current state of) psychological and emotional health.

The research reported in this paper was funded, in part, by Grant MH52391 from the National Institutes of Health. We are indebted to Bob Josephs, who initially suggested the idea of examining poets, for making comments on an earlier draft. Thanks are also extended to Tom Lay for his help in data collection.

