In its use of individual-level data, epidemiologic methodology compares the frequency of occurrence of an event (either an exposure or an outcome) in groups of people with contrasting characteristics. The statistical result is a measure of association (e.g., a relative risk) that compares the groups. Social network analysis of group phenomena, in contrast, focuses on the relationships among persons in a group, and produces a set of statistics that examines the quality, density, position, and structure of such relationships. ^{1} The considerable stream of theory and analysis that has developed in the field since its first enunciation in the 1930s ^{2} includes such substantive areas as community decision making, diffusion and adoption of innovation, corporate interlocking, market analysis, scientific interactions, consensus and social influence, and coalition formation. ^{1} Although there are some antecedents, ^{3} the recent rapprochement of social network analysis and disease transmission dynamics can probably be attributed to the attempts to link men with a syndrome of acquired immunodeficiency during the early years of the HIV pandemic. ^{4,5}

In the years that have followed, two fundamental questions have motivated the investigation of networks as determinants of disease transmission ^{6–9} : “Do networks matter?” and, “What can we do about them?” Proponents vault over such questions; detractors raise the bar. In truth, the field is a work in progress and necessitates some willing suspension of disbelief. In this commentary, I propose a simple thought experiment that attempts to show how a network might matter, and points to the possibility of several network-informed interventions.

The probability of *acquiring* an infection is a function of partner choice and sexual practices—two different properties that are often subsumed under the term “behaviors.” The fundamental network hypothesis posits that these behaviors take place in a social context, a network of persons, that influences the risk of transmission and the propagation of disease. Similarly, the probability of *transmitting* the infection depends on these same behaviors, and is conditional on becoming infected. Therefore, prima facie, the probability of transmitting an infection should be lower than the probability of acquiring an infection. The real situation is more complex, however, because of the many factors that influence the passage of an organism from one person to another. Nevertheless, a simplification of the real situation that imposes rigid restrictions on these factors may illustrate the specific effect of network connections.

#### A Simple Construct

We imagine that a starting graph is a group of persons in a fixed, static network configuration that is then permitted to change in accordance with the probabilities of transmission (Figure 1). The persons in the diamond-shaped boxes are infected and the others are not. Contact lines between persons represent a necessary and sufficient encounter to transmit disease with a given probability (*P*_{T} = 0.5). Sexual practices are constant and identical for each person in the network, and gender is ignored. Therefore, with numerous trials, disease on average will be transmitted half of the time.

In this exercise, we will examine the effect of five changes to the network (Figure 2): [1] a single new partnership between uninfected persons at low risk; [2] a single new partnership between uninfected persons, one of whom is at higher risk; [3] a single new partnership between an infected and an uninfected person; [4] the creation of a microstructure (here, a clique of *n* = 3, which is equivalent to creating three persons with concurrent partnerships); and [5] the addition to the clique of a single new partnership between two uninfected persons. In network parlance, a clique is formally defined as a group of *n* persons, each of whom is connected to all the others. Other “microstructures” have more complex definitions, but all are meant to signal the presence of a small group within which an infectious agent can circulate.
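The clique definition can be checked mechanically. The following is a minimal sketch, assuming partnerships are stored as unordered pairs of labels; the function names and the four-person example are hypothetical and do not come from Figure 2.

```python
from itertools import combinations


def is_clique(members, partnerships):
    """Return True if every pair of members is directly connected."""
    edges = {frozenset(pair) for pair in partnerships}
    return all(frozenset(pair) in edges for pair in combinations(members, 2))


def pairings_in_clique(n):
    """Number of partnerships needed to fully connect n persons: n(n-1)/2."""
    return n * (n - 1) // 2


# Hypothetical network: A, B and C are mutually connected; D is linked only to C.
partnerships = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]

print(is_clique(["A", "B", "C"], partnerships))
print(is_clique(["A", "B", "D"], partnerships))
print(pairings_in_clique(3))
```

A clique of three requires three pairings, which is why creating it is equivalent to giving each of the three persons two concurrent partnerships.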

The quantities of interest are the probability that a person becomes infected and the probability that a person transmits an infection. The probability of becoming infected can be calculated using the general approach:

$$P(J+) = 1 - \prod_{I} \left[ 1 - P(I+)\,P_T \right] \qquad (1)$$

where the product is taken over all of *J*’s contacts *I*. That is, the probability that *J* becomes positive is one minus the probability that none of *J*’s contacts (whose number is *J*’s degree) infects *J*. Independence of *J*’s contacts with each *I* is assumed. The calculation of *P*(*J*+) depends on the *P*(*I*+), which in turn depends on the probability that *I*’s contacts are positive. Therefore, the process is an iterative calculation that proceeds from the border of the network (or the terminus of each pathway) to *J*.
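The iterative acquisition calculation can be sketched in a few lines. This is an illustration under stated assumptions rather than the author's own procedure: it takes the probability of becoming infected to be one minus the product, over independent contacts, of (1 − *P*_{T} × the contact's probability of being positive), and it assumes a network without cycles so that the recursion from the border inward terminates. The four-person chain and all function names are hypothetical.

```python
from math import prod

P_T = 0.5  # per-contact transmission probability used in the construct


def prob_acquire(contact_probs, p_t=P_T):
    """One minus the probability that no contact transmits (contacts independent)."""
    return 1.0 - prod(1.0 - p_t * p_i for p_i in contact_probs)


def prob_acquire_tree(node, network, infected, parent=None, p_t=P_T):
    """Work inward from the border of a cycle-free network toward `node`."""
    if node in infected:
        return 1.0  # already positive at the start
    upstream = [
        prob_acquire_tree(nbr, network, infected, parent=node, p_t=p_t)
        for nbr in network[node]
        if nbr != parent  # do not count infection that would have come from `node`
    ]
    return prob_acquire(upstream, p_t)


# Hypothetical chain A - B - C - D in which A and D start out infected.
network = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
infected = {"A", "D"}

p_b = prob_acquire_tree("B", network, infected)
print(f"P(B+) = {p_b:.3f}")
```

In this chain, B can be infected directly by A (probability 0.5) or through C after C has been infected by D (0.5 × 0.5), so the sketch gives 1 − (0.5)(0.75) = 0.625.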

The probability that *J* transmits an infection can be calculated with the following general approach:

$$P(J \rightarrow I) = P(J^{*}+)\,P_T\,P(I-)$$

where $P(J^{*}+) = P(J+ \mid I-)$. That is, the probability that *J* transmits to *I* is the probability that *J* is positive times the probability of transmission times the probability that *I* is negative. *J*’s probability of being positive has to be recalculated (as in equation 1) omitting *I*. *J*’s probability of transmitting to *I* cannot include an infection caused by *I*. In essence, these equations calculate (1) the probability that persons in the network acquire infection by considering the likelihood that they will be infected by the joint action of their partners; and (2) the likelihood that persons in the network transmit infection to any of their partners, given that some of their partners are already infected.
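The transmission calculation can be sketched the same way. The code below assumes that the probability that *J* transmits to *I* is the product of three terms: *J*'s probability of being positive recomputed without *I*, the transmission probability, and the probability that *I* is negative. The function names, the `j_infected_at_start` flag, and the example contacts are hypothetical.

```python
from math import prod

P_T = 0.5  # per-contact transmission probability used in the construct


def prob_positive_omitting(contact_probs, omit, j_infected_at_start=False, p_t=P_T):
    """J's probability of being positive, recomputed without the contact `omit`."""
    if j_infected_at_start:
        return 1.0
    return 1.0 - prod(
        1.0 - p_t * p for name, p in contact_probs.items() if name != omit
    )


def prob_transmit(contact_probs, target, p_target_negative,
                  j_infected_at_start=False, p_t=P_T):
    """(J positive, omitting target) x (transmission) x (target negative)."""
    p_j = prob_positive_omitting(contact_probs, target, j_infected_at_start, p_t)
    return p_j * p_t * p_target_negative


# Hypothetical J with two contacts: A is infected, I is uninfected at the start.
contacts_of_j = {"A": 1.0, "I": 0.0}
print(prob_transmit(contacts_of_j, target="I", p_target_negative=1.0))
```

Here *J* can only have been infected by A (probability 0.5), so the probability of passing infection on to *I* is 0.5 × 0.5 × 1.0 = 0.25. Had *J* been infected at the start, the same call with `j_infected_at_start=True` would return 0.5.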


This process requires calculation by inspection to determine the transmission potential from any person to any contact. The addition of a new partnership has the potential to change some or all of the calculations and quickly increases the complexity of the calculation. Taken over the entire network, the sum of the probabilities of becoming infected divided by the number of persons in the network will provide the disease prevalence after the transmission events have taken place. For each person, the sum of the probabilities of transmission to each of his or her contacts represents the reproductive number (R_{0}). The reproductive number is the average number of persons infected by a given individual, conditional on whether that individual becomes infected. A mean R_{0} (taken over all persons in the network) can then be calculated for each configuration of the network.
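Once the individual probabilities are in hand, the two summary measures are simple averages. The sketch below assumes the probabilities have already been computed by inspection. The four-person network (a chain A, B, C plus an unconnected person D) and its values are hypothetical, not those of Figure 1.

```python
def prevalence(p_infected):
    """Sum of the probabilities of being infected divided by network size."""
    return sum(p_infected.values()) / len(p_infected)


def individual_r0(p_transmit, person):
    """Sum of one person's transmission probabilities over his or her contacts."""
    return sum(p_transmit.get(person, {}).values())


def mean_r0(p_transmit, persons):
    """Average individual R0 over everyone in the network, transmitters or not."""
    return sum(individual_r0(p_transmit, p) for p in persons) / len(persons)


# Hypothetical values with P_T = 0.5: A starts infected, B is A's contact,
# C is B's contact, and D has no contacts.
p_infected = {"A": 1.0, "B": 0.5, "C": 0.25, "D": 0.0}
p_transmit = {"A": {"B": 0.5}, "B": {"C": 0.25}}

print(prevalence(p_infected))
print(individual_r0(p_transmit, "A"))
print(mean_r0(p_transmit, p_infected.keys()))
```

Because persons who cannot transmit still count in the denominator, the mean R_{0} is pulled well below the values for the individuals who can transmit.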

For this construct, the calculated R_{0}s are artificially small and the prevalences are artificially large because of the network assumptions. The initial network (Figure 1) is sharply bounded, and 12 of the 20 participants cannot transmit infection. The remaining eight participants would each have to transmit to 2.50 others to create an average R_{0} for the network of 1.0—clearly an impossibility. Therefore, the R_{0} should be thought of as a relative measure in this case, and we are interested in the way it changes with changing configurations. Similarly, we begin with a prevalence of 20% (4 of 20 persons), and the prevalence within this bounded group increases quickly.
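The 2.50 figure follows from simple averaging, assuming that the 12 participants who cannot transmit each contribute an individual R_{0} of zero. The symbol $r$ for the common value among the remaining eight is introduced here only for illustration.

$$\bar{R}_0 = \frac{8\,r + 12 \cdot 0}{20} = 1 \quad\Longrightarrow\quad r = \frac{20}{8} = 2.50$$

By the same averaging, a mean R_{0} of 0.250 over 20 persons would correspond to an average of $0.250 \times 20 / 8 = 0.625$ among the eight participants who can transmit.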

A single cycle of transmission within the original configuration (Figure 1) leads to a prevalence of 45.0% and an R_{0} of 0.250 (Table 1). The addition of a single new connection between two uninfected persons at lower risk (connection [1]) increases both prevalence and R_{0} by a small amount, but that amount is augmented by the current connections of the new partner (connection [2]) or, obviously, if the new partner is infected (connection [3]). What is perhaps more important is that the mean R_{0} for the entire network is increased by this single pairing. If three new pairings are introduced to create a clique of three uninfected persons at lower risk (configuration [4]), the effect on prevalence is the same as that of a single pairing between an infected and an uninfected person (configuration [3]). If an additional connection between uninfected persons is added to configuration [4], thereby increasing the complexity of connections, the ending prevalence and the R_{0} are both increased by 25%.

#### Comment

The numbers associated with this exercise have little absolute meaning and cannot be generalized directly. This lack of generalizability results from the complexity of network effects and from isolating one feature of transmission, network structure, while “magically” holding constant many others that vary in practice (e.g., transmission probabilities, temporal relationships, disease duration and healthcare seeking, partnership durations, injecting drug use and other risks). Given the complexity that even this simple approach generates, the exercise highlights the considerable challenge of separating the relative impact of network, behavioral, and other social factors on transmission. This approach does, however, provide some tentative insights into the transmission process.

##### Presumed Probability of Transmission

It is immediately apparent from the equations used that the probability of transmission is critical to network spread, because the “signal” for transmission decays geometrically along a pathway: each additional contact multiplies it by the transmission probability (with *P*_{T} = 0.5, it falls to 0.25 after two steps and 0.125 after three). For some sexually transmitted diseases, a probability of 0.5 ^{10} may be the correct order of magnitude, but even such a high probability does not generate uninhibited transmission. For a disease such as HIV, where the transmission probability is 0.002 to 0.01, ^{11} it is logical to contemplate the considerable amount of structure that must be necessary to support transmission.
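The contrast between the two ranges of transmission probability can be made concrete with a short sketch. It assumes a single unbranched pathway with one exposure per contact, so the chance that infection travels the whole pathway is the transmission probability raised to the number of steps. The function name is hypothetical.

```python
def path_signal(p_t, steps):
    """Probability that infection traverses an unbranched chain of `steps` contacts."""
    return p_t ** steps


# Compare the construct's value (0.5) with the range quoted for HIV.
for p_t in (0.5, 0.01, 0.002):
    signals = [path_signal(p_t, k) for k in (1, 2, 3)]
    print(p_t, ["%.3g" % s for s in signals])
```

With a probability of 0.5, one in eight three-step pathways carries infection from end to end. At 0.01 the figure is about one in a million, which suggests why many parallel pathways, or repeated exposures, would be needed to sustain spread.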

##### Relative Probability of Acquisition or Transmission

Under these circumstances, the probability of becoming infected is always greater than the probability of transmitting a single infection. However, the R_{0} for an individual, which is the summation for all of his or her contacts of the probabilities of transmission, can be higher than the probability of becoming infected. Therefore, the specific propensity to acquire or transmit will change with network properties, even in the absence of sexual behavior change. The same sex act, or behavioral complex, will have a different impact in different settings.
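A small numerical sketch shows how both statements can hold at once. It assumes a hypothetical person with one infected contact and several uninfected contacts, a transmission probability of 0.5, and uninfected contacts who are certainly negative at the start. The function names are illustrative.

```python
P_T = 0.5  # per-contact transmission probability used in the construct


def p_acquire(n_infected_contacts, p_t=P_T):
    """Probability of infection by at least one of several infected contacts."""
    return 1.0 - (1.0 - p_t) ** n_infected_contacts


def p_single_transmission(n_infected_contacts, p_t=P_T):
    """Probability of passing infection to one contact who starts out negative."""
    return p_acquire(n_infected_contacts, p_t) * p_t


def individual_r0(n_infected_contacts, n_uninfected_contacts, p_t=P_T):
    """Sum of single-transmission probabilities over the uninfected contacts."""
    return n_uninfected_contacts * p_single_transmission(n_infected_contacts, p_t)


print(p_acquire(1))              # acquisition
print(p_single_transmission(1))  # one onward transmission
print(individual_r0(1, 3))       # summed over three uninfected contacts
```

Each single transmission (0.25) is less likely than acquisition (0.5), yet with three uninfected contacts the individual R_{0} (0.75) exceeds the probability of becoming infected.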

##### Uncertainty in Predicting Network Effects

The impact of new connections may be difficult to predict a priori without accurate knowledge of the network. In this construct, the overall network risk is increased more by the formation of a microstructure among uninfected persons than by the addition of a direct connection between an infected and uninfected person. It is, however, possible to create circumstances in which the opposite might be true. Although perhaps possible in theory, a general algorithm for predicting transmission (i.e., predicting the size of R_{0} for a network) that counts pathways and recognizes the other factors that influence probabilities along those pathways is a daunting task. These considerations pose a considerable challenge to mathematical modelers in their efforts to interpret the results of stochastic sampling schemes. These issues also challenge network analysts and epidemiologists who must use summary measures of structure as a substitute for understanding the impact of specific sets of connections. Finally, intervention messages are predicated on risks that may be seriously misestimated (and usually magnified) if they are linked solely to personal behavior without regard to underlying network structure.

##### Uncertainty in Predicting Individual Infection

In fact, this construct suggests that predicting who will become infected may be subject to considerable error. In this exercise, the probability of acquiring the disease was defined as a point estimate (0.5), and the variance of the estimate was not considered. In real situations, other factors mentioned previously can only serve to increase that variance and make predictions about individuals even more difficult. An example from a study of the urban networks of persons at risk for HIV because of drug-using and sexual activity illustrates this point. ^{12} The groups studied had a moderate prevalence of HIV (13.3%) and a moderate incidence of 1.8% per year based on three seroconversions. Two of these seroconversions took place in the same community chain of connected persons. A visual display of the network of joint risk (sexual activity and needle sharing) illustrates the variability of network structure from moderate (Figure 3) to low (Figure 4) to moderate (Figure 5). Within these dynamics, one person converted between interviews 1 and 2, and the other between interviews 3 and 4 (diagram not shown). Unseen and unmeasured structure (where, in nonnetwork terms, structure refers to the Tinkertoy that can be constructed from the set of relationships in a subgroup of persons) must have played some role in determining their risk, but from the data available, predicting seroconversion among negative persons would not have been possible. However, predicting that persons in this network had an “average” risk of seroconversion of 1.8% per year is a defensible claim based on overall network characteristics. Although this average risk does not apply to all network members equally, it provides a potentially more useful approach than simply declaring them to be “at high risk” because of their behavior. It is clear that similar behavior in other settings ^{6,13} confers a markedly different risk.


##### Network-Informed Intervention

Most tools for the reduction of sexually transmitted disease and HIV transmission are oriented either to the individual or to an undifferentiated group. It is likely that interventions that are network informed (the goal of which is often personal behavior change) would use many of these same tools but would target them to people and groups based on network considerations. Examples of such activity, some of which are in more advanced states of development than others, would include the following:

* *Segmenting networks in which transmission takes place.* Although not originally offered as such, closing bathhouses frequented by men who have sex with men, or closing shooting galleries frequented by drug injectors who share needles, have a strong network basis. Social concerns, particularly regarding the former, have raised important objections to such segmentation, but should not obscure the underlying theoretical basis for these actions.

* *Influencing central figures in a network structure.* Focusing risk-reduction efforts on persons whose network centrality in social and sexual or drug-using activity is high may have amplifying effects on preventive practices (R. Trotter, personal communication, 1996). Although not formally tested, the notion that key figures within a group can have special influence on the group is attractive, and network evaluation (qualitative or quantitative) is needed to identify them.

* *Group intervention with small networks.* When networks consist of small closed groups, the circumstances that obtain in many sparsely populated areas (R. Trotter, personal communication, 1996), interventions can be addressed to the entire group. Again, although largely untested, such approaches have intuitive appeal and necessitate formal evaluation.

* *Network-informed approaches for enrollment in risk-reduction activities.* It is possible that many ongoing interventive school- or community-based programs make insufficient use of the natural groupings of persons within their settings. Asking current participants to recruit future participants from their personal network of friends, acquaintances, or at-risk partners can also serve as a targeting mechanism.

* *Directly targeted network risk-reduction activities.* Among groups whose risk may be presumed or increased, providing risk-reduction methods directly is eminently feasible. One programmatic example is the daily visit to commercial sex workers “on the stroll” to provide condoms at times of peak activity (J. Potterat, personal communication, 1992).

* *Network-informed case investigation and contact tracing.* Applying traditional methods of case investigation to groups believed to be at high risk has enormous potential for interrupting transmission. That such groups, and the nature of their interactions, can be identified is well established. ^{13–15} A more concerted epidemiologic and ethnographic effort to identify such groups in areas of high risk and to direct partner services to them has considerable appeal and merits further attention.

* *Network-informed behavioral messages.* As noted, many of the tools oriented to changing individual behavior can be used to carry network messages. One example is the introduction of messages that counsel persons about simultaneous, or concurrent, partnerships. ^{16} As noted previously, the addition of relationships, particularly concurrent ones, between uninfected persons can alter the probability of infection appreciably. Obviously, if one of the three partners in a concurrent relationship is infected, the likelihood that all three will become infected is greatly increased.
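Two of the ideas in this list, central figures and concurrent partnerships, can be read off a list of partnerships with very little machinery. The sketch below is only an illustration. It uses degree (the number of partners) as a stand-in for centrality, which is one of several possible measures, and it treats all listed partnerships as overlapping in time. The network and function names are hypothetical.

```python
from collections import Counter


def degree(partnerships):
    """Count the number of partners for each person."""
    counts = Counter()
    for a, b in partnerships:
        counts[a] += 1
        counts[b] += 1
    return counts


def most_central(partnerships, k=1):
    """The k persons with the most partners (degree as a proxy for centrality)."""
    return [person for person, _ in degree(partnerships).most_common(k)]


def with_concurrent_partners(partnerships):
    """Persons with two or more partners, assuming all partnerships overlap."""
    return sorted(p for p, d in degree(partnerships).items() if d >= 2)


# Hypothetical network: A has three partners, D has two, the rest have one.
partnerships = [("A", "B"), ("A", "C"), ("A", "D"), ("D", "E")]

print(most_central(partnerships))
print(with_concurrent_partners(partnerships))
```

In practice, identifying central figures would draw on qualitative as well as quantitative network evaluation, as noted above; a count of partners is only a starting point.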

In contrast to traditional epidemiologic studies, network approaches matter because they consider the influence of relationships within a group, and how the totality of these relationships can influence a person’s likelihood of getting or transmitting a disease. At least within the confines of this construct, the same behavior in different settings will lead to different outcomes. Further generalization necessitates considerable complexity, and is the current subject of both empirical and theoretical work. Some initial ideas suggesting that network information can inform interventions, even in the absence of intervening “on the network” per se, have emerged. Perhaps the most important of these is the need to focus on heterogeneous networks of people in real-life situations instead of focusing on groups such as “adolescents,” or “prostitutes,” or “gay men” who are connected only by a stereotypic label.