THE EPIDEMIOLOGY OF HIV has varied greatly between locations. Most infections in industrialized countries such as the United Kingdom are restricted to those with specific risk behaviors. In contrast, infection has spread widely in many developing countries, particularly in sub-Saharan Africa. ^{1} Mathematical models in combination with epidemiologic and behavior data can assist our understanding of the dramatic differences in observed HIV epidemics and help explore the impact of interventions. ^{2,3} Two approaches to developing models are possible. First, by making explicit our assumptions and then estimating parameter values, a framework can be created that can generate predictions based on a priori understanding. In the field of HIV epidemiology, such models have progressed from representations of average contact patterns to individual-based network simulations through increasingly detailed stratification of populations. ^{4–10} In a second approach, the description of observations can be used to construct a theoretical framework. In statistical physics, a traditional approach for understanding networks is to try to describe key characteristics of the network and then interpret the consequences of the network structure. Here we explore observed contact patterns using model descriptions of networks. If the data are consistent with model descriptions, this would help answer the questions as to whether behavior is sufficient to sustain a sexually transmitted HIV epidemic.

### Random Graphs

One fundamental model of networks is the random graph, in which every pair of nodes (individuals) is connected randomly with probability *P.* ^{11} From a starting point of a number of unconnected nodes, the probability of connections, *P*, can be increased to create a more dense network (or graph). Because connections are randomly assigned, the construction of the network leads to some people having no partner, whereas others have more than 1 partner. The latter nodes can form subgraphs such as triangles and trees. With a high enough value for the probability *P,* the graph is fully connected, ie, everyone has a connection with everyone else.
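The construction described above can be sketched in a few lines. This is a minimal illustration only, not the procedure used in the study; the function names `random_graph` and `degree_counts`, the population size, and the value of *P* are hypothetical choices.

```python
import random
from collections import Counter

def random_graph(n_nodes, p, seed=0):
    """Connect every pair of nodes independently with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n_nodes)}
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj

def degree_counts(adj):
    """Map each degree (number of partners) to the number of nodes with it."""
    return Counter(len(neighbours) for neighbours in adj.values())

# Hypothetical example: 200 individuals, connection probability 0.01.
adj = random_graph(200, 0.01)
counts = degree_counts(adj)
```

With a low *P*, `counts` typically contains nodes of degree 0 (no partner) alongside nodes with 2 or more partners, which is the heterogeneity the paragraph refers to.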

Networks have been described using 3 main properties: 1) the way the average distance between nodes changes with the size (ie, number of nodes) of the network; 2) the clustering of nodes; and 3) the degree distribution. ^{12,13}

#### Scaling of Network Distance With Size.

Random graphs display the “small-world behavior” found in some real-world systems. The small-world concept, which goes back to the work of Stanley Milgram, is that only a limited number of steps are needed to progress from 1 node (ie, person) in a network to any other. ^{14,15} More formally, the characteristic distances in the network scale significantly less than linearly with system size. In a random graph, the diameter scales with the logarithm of the system size. For example, if the size of the population is increased, eg, 17-fold, the diameter or characteristic path length (the average distance between 2 nodes) will roughly increase by only sixfold.

#### The Clustering of Nodes.

Another property of networks is their clique-structure as shown in friendship networks in which 2 people who know each other are likely to have friends in common. In graph theory, this property is frequently measured by the clustering coefficient, which is the probability that 2 connected nodes are both connected to another common node (forming a triangle). In sexual networks, this makes sense for homosexuals, but for heterosexuals, 2 people of the same sex would not be connected. For heterosexuals, the clustering coefficient could be defined as the probability that 2 people of 1 sex have 2 sex partners of the other sex in common. In a sexual partner network constructed in social and geographic space, a measure of clustering could be misleading because such close correlations could be rare, whereas common connections a few partners removed are likely.
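As an illustration of the triangle-based measure, the sketch below computes the commonly used local clustering coefficient (the fraction of a node's neighbor pairs that are themselves connected) and averages it over nodes. This standard formulation is assumed here; it is not necessarily the exact variant intended above, and it does not implement the alternative definition suggested for heterosexual networks.

```python
from itertools import combinations

def local_clustering(adj, node):
    """Fraction of pairs of neighbours of `node` that are linked to each other."""
    neighbours = adj[node]
    k = len(neighbours)
    if k < 2:
        return 0.0  # no neighbour pairs, so no triangle is possible
    linked = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
    return linked / (k * (k - 1) / 2)

def average_clustering(adj):
    """Mean of the local clustering coefficient over all nodes."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)

# Two toy networks: a closed triangle and an open chain of 3 people.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
chain = {0: {1}, 1: {0, 2}, 2: {1}}
```

The triangle gives the maximum value and the chain gives zero, because in the chain the 2 partners of the middle person are not connected to each other.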

#### The Degree Distribution.

The degree is the number of connections of a node, and the degree distribution describes the frequency with which nodes with a given number of connections appear in a network. This can be illustrated by a plot of the number of partners a person has and the probability of having that many partners. Many probability distributions for networks such as a random graph decay quite rapidly (exponentially) above a certain characteristic degree. However, some observed networks have a slowly decaying “fat” tail that requires an alternative model.

### Scale-Free Networks

The “scale-free network” describes the degree distribution as a power law, ie, the probability of having *k* partners *P(k)* is directly proportional to *k* to the minus γ ^{16–18}:

$$P(k) \propto k^{-\gamma} \qquad (1)$$

The existence of a power law means that a description of the system is similar regardless of our viewpoint. Plotted on a log–log scale, the degree distribution is a straight line whose slope is −γ. Scale-free networks have small-world properties, display a high clustering coefficient, and often provide a good description of observed systems.

One can construct a scale-free network (as with a random graph) by starting with a few unconnected nodes and adding new nodes with *n* edges (connections) per node. Edges are added “at random” to existing nodes, but with a probability weighted in favor of those nodes that already have many edges. This is called preferential attachment and, in mathematical terms, the probability p(*k* _{i}) that a new node will be connected to node *i* is a function of the number of edges this node already has divided by the total number of edges in the system. Numerical simulations as well as analytical calculations have shown that this method of construction leads to a scale-invariant network whose degree distribution follows a power law. In effect, scale invariance means that as one changes the “scale” (ie, the cutoff degree of the nodes, which is used to exclude them from an analysis), the pattern observed, in this case the rate of decay of the degree distribution, remains unchanged. This “scale-free” label can also be related to the concept of a characteristic size. In random graphs and many other models, the *P(k)* distributions have an exponential cutoff, and such networks have a characteristic size (here the average degree or connectivity <*k* >), which is a function of *p*, the probability of 2 nodes being connected. In contrast, the degree distribution of a scale-free model lacks a cutoff, and its average connectivity <*k* > is independent of *p*.
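The growth rule can be sketched as a short simulation. This is a minimal illustration assuming linear preferential attachment; the function name, the parameter `m` (the number of edges per new node, called *n* in the text), and the rule that the first new node links to all seed nodes (needed because unconnected seed nodes have zero degree) are assumptions made for the example.

```python
import random

def preferential_attachment(n_nodes, m, seed=0):
    """Grow a network; each new node links to m existing nodes chosen
    with probability proportional to their current degree."""
    if n_nodes <= m:
        raise ValueError("n_nodes must exceed m")
    rng = random.Random(seed)
    degree = [0] * n_nodes
    stubs = []                # node i appears here once per edge end it holds
    targets = list(range(m))  # assumption: first new node links to all seeds
    for new in range(m, n_nodes):
        for t in targets:
            degree[t] += 1
            degree[new] += 1
            stubs.extend((t, new))
        # Drawing uniformly from `stubs` selects nodes in proportion to degree.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(stubs))
        targets = list(chosen)
    return degree

degree = preferential_attachment(2000, 2)
mean_degree = sum(degree) / len(degree)
```

In runs of this kind the largest degree is many times the mean, which is the “fat” tail that a random graph of the same density does not produce.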

Scale-free behavior cannot be obtained without preferential attachment, but preferential attachment can be implemented in different ways. ^{12,13} For example, there could be a linear or nonlinear increase in the probability of an attachment to a node as a function of its prior degree. Alternatively, the probability could saturate or decline, with nodes being unable to accept more attachments. Different assumptions regarding preferential attachment generate different values for γ (the decay exponent in equation 1), which can be used to capture the structure of different observed networks. For a high value of γ, the degree distribution decreases steeply, describing the situation in which most people have only a small number of partners and very few have a great number of partners. If the value of γ is small, a substantial fraction of the population has a high number of partners. Thus, the value of γ reflects the frequency of high levels of sexual partner acquisition. Despite the potential differences in mechanisms for preferential attachment, in most models the exponent γ lies between 2 and 4. A minority of models lead to γ between 1 and 2. In real systems, γ was found to lie between 1 and 4. ^{12,13} Thus, models constructed using preferential attachment and described by equation 1 provide sufficient descriptions of different observed network data and offer a variety of explanations as to the mechanisms generating the networks and the behavior underlying the observed interactions.

### Critical-Spreading Rates

The exponent γ for a sexual partner network is directly related to the mean degree (number of partners) of the distribution and its variance, which are epidemiologically important and determine whether there is a critical or threshold spreading rate (ε_{c}) for an infection to persist. For an infection without a finite duration, the spreading rate would be the transmission probability per unit of time. The value of the transmission probability required for an epidemic is a function of the rate of contacts within the population, given by the ratio of the mean to the sum of the variance and the squared mean. (These 2 quantities are referred to as first and second moments, respectively.) It can be shown analytically that if γ lies between 2 and 3, the second moment goes to infinity, which means that the transmission probability required for infection to spread tends to zero. ^{19–23} Conceptually, this is because if γ is less than 3, then in an infinitely large population, there will always be some individuals with a high enough number of contacts to spread infection. In that case, there would be no threshold, and infections with very low rates of spread ε will be able to pervade the whole system, because ε > ε_{c}. (The same argument holds for values of γ of less than 2, but the case γ equal to 2 has to be excluded to ensure that the mean is finite.) If γ is bigger than 3, there is a threshold, and so the successful spreading of a disease depends on its transmission probability.
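The dependence of the threshold on γ can be illustrated numerically. The sketch below assumes a pure power law truncated at a hypothetical maximum degree `k_max` and computes the ratio of the first to the second moment, the quantity on which the required transmission probability depends. The function name and the chosen values of γ and `k_max` are illustrative only.

```python
def moment_ratio(gamma, k_max):
    """Return <k> / <k^2> for P(k) proportional to k**-gamma on k = 1..k_max.
    The normalising constant cancels in the ratio, so it is omitted."""
    first = sum(k ** (1 - gamma) for k in range(1, k_max + 1))
    second = sum(k ** (2 - gamma) for k in range(1, k_max + 1))
    return first / second

# gamma below 3: the ratio keeps shrinking as larger degrees are admitted.
low_small = moment_ratio(2.5, 100)
low_large = moment_ratio(2.5, 10_000)

# gamma above 3: the ratio settles at a finite, non-zero value.
high_small = moment_ratio(3.5, 100)
high_large = moment_ratio(3.5, 10_000)
```

For γ = 2.5 the ratio falls by roughly an order of magnitude when `k_max` rises from 100 to 10,000, whereas for γ = 3.5 it barely changes, which mirrors the absence and presence of a threshold described above.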

The concept of a critical spreading rate and its potential absence is important for understanding the epidemiology of an infection. However, the situation in which it is absent is identified mathematically and depends on a number of assumptions. For example, one of these assumptions is that the duration of infectiousness is long compared with the duration over which the network connections are formed.

### Are Sexual Partner Networks Scale-Free?

It has been suggested, based on the lifetime number of partners reported in the Swedish population, that sexual partner networks are scale-free and that there is no spreading threshold. ^{24,25} This finding has been debated with contradictory results derived for other data or other statistical means of analysis. ^{26,27} Here we seek to explore the validity of this finding in other populations, for alternative subsets of the population, and for different measures of numbers of sexual contacts. The possibility that the web of sexual contacts has a scale-free structure has a number of interesting implications in terms of prevention policy and disease persistence. The model suggests that control programs are best targeted toward the most sexually active people which is in accord with current public health strategies ^{7}; it also would follow that STDs could spread and persist even in the case of vanishingly small probabilities of transmission. In other words, the infection could never be eliminated through a biomedical intervention aimed at reducing transmission probabilities but would require alterations that reduce the network of contacts over which the infections spread.

## Methods

### Data

Four datasets are used in our analysis: the Natsal 1990 and 2000 surveys ^{28,29} (National Survey of Sexual Attitudes and Lifestyles, a household-based probability sample survey of men and women resident in Britain), a study of sexual behavior in rural Zimbabwe, ^{30} and a study over 6 years of gay men in London, the London Gay Men’s Sexual Health Survey (LGMSHS). ^{31} The Natsal data we used consist of interviews of 13765 respondents (7765 women, 6000 men aged 16–44) in 1990 (the *full* Natsal dataset 1990 comprises 18876 men and women aged 16–59) and 11161 respondents (4762 men, 6399 women aged 16–44) in 2000 from Britain, who were questioned using a combination of face-to-face and computer-assisted interviews. Topics covered included age at first intercourse, numbers of same and different sex partners over different time periods, sexual practices, and attitudes. With this probability sample method, one might not capture the very rare individuals with high numbers of partners. Comparison included only people aged 16 to 44 from both datasets. The Zimbabwe study included data on the age of onset of sexual activity, numbers of partners, concurrent partnerships, condom use, and partner characteristics of 9843 respondents (4419 men, 5424 women) in rural Manicaland. In the LGMSHS, individuals were recruited at commercial gay venues and genitourinary medicine clinics within inner London. The study is made up of data from 6 consecutive years that were combined, because neither the mean nor the median number of sex partners shows a significant trend over this period. Survey methods, questionnaires, and the main findings of these studies are presented elsewhere. ^{28–31} The methods allow a description of the sexual network from cross-sectional reports of behaviors.

### Statistical Analysis

In addition to exploring whether the network is scale-free, we focus on whether the network’s degree distribution has infinite variance. An infinite variance, with γ between 2 and 3 in the tail region, theoretically eliminates the epidemic threshold, allowing a pathogen of arbitrarily small transmissibility to be maintained. We focus here on the frequency distribution of partnerships (the number of partners) and the *cumulative* degree distribution or “survival function” (the proportion of individuals remaining within the sample with a number of partners greater than or equal to the cutoff). This is a survival function in which the attrition is caused by increasing the cutoff number of partners rather than the exposure time normally used. What is plotted is the frequency of respondents with *n or more* partners against the number of partners *n*. The degree distribution shows what fraction of people have a certain number of partners; the cumulative degree distribution is the sum of the former from the right. A power law behavior in the tail of each distribution is indicative of a scale-free network (ie, *P(k)* is directly proportional to *k* to the power minus γ). The cumulative degree distribution, although not of interest in its own right, was used to illustrate the frequency distributions whose γ we want to determine. Because it is constructed by summation from the degree distribution, it is smooth and not obscured by noise. From the smoothed illustration, it is possible to observe, on the basis of where the cumulative degree plot appears to be linear, the approximate point where the “tail region” of the distribution begins and whether an exponential cutoff exists. The power law is exhibited by the tail region, but because determining its start is subjective, we are conservative in our choice of starting points for the tail. The definition of the tail is only uncertain for reported lifetime numbers of partners and LGMSHS data, in which there is a plateau before the tail starts.
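The cumulative degree distribution described above can be computed by summing the frequency distribution from the right. The following sketch uses a small hypothetical list of reported partner numbers; the function name and the data are illustrative only.

```python
from collections import Counter

def cumulative_degree_distribution(partner_counts):
    """Proportion of respondents reporting n or more partners, for each
    reported value n (summation of the frequencies from the right)."""
    total = len(partner_counts)
    freq = Counter(partner_counts)
    survival = {}
    running = 0
    for n in sorted(freq, reverse=True):
        running += freq[n]
        survival[n] = running / total
    return dict(sorted(survival.items()))

# Hypothetical reports from 6 respondents.
reports = [1, 1, 1, 2, 2, 5]
survival = cumulative_degree_distribution(reports)
```

Here all 6 respondents report 1 or more partners, 3 of 6 report 2 or more, and 1 of 6 reports 5 or more, so the curve can only fall or stay level as *n* increases.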

We explore the best estimate of parameter values in an assumed scale-free model and also test whether this is a valid assumption by comparison with an exponential model. To estimate the values of γ and their confidence intervals, we used a likelihood ratio test. The degree distribution gives the probabilities *P* _{k} =*m* _{k}/*N* of respondents reporting *k* partners, where *N* equals the sum over *m* _{k} (*N*: total number of respondents, *m* _{k}: frequency of reporting *k* partners). The likelihood of finding these frequencies can be described by a multinomial distribution, which is the probability of finding the event *A* _{k} exactly *m* _{k} times in *N* independent trials, when in 1 trial only 1 event occurs, and this with probability *P* _{k}. Calculating the logarithm of the multinomial yields the log-likelihood function, which is maximal at the observed full model *P* _{k} =*m* _{k}/*N*. One can then calculate the log-likelihood function allowing the probabilities *P* _{k} to be a meaningful function of model parameter values. In our case, these functions are the power law *P(k)* =*Ik* ^{−γ}, with the scaling parameter γ and a “proportionality” constant *I*, and an exponential *P(k)* =*Ie* ^{−θk}. In both cases the proportionality constant *I* is the sum of *m* _{k} over *N* rather than an additional parameter. θ is the parameter of the exponential function, which determines the rate at which the tail of the partner degree distribution is curtailed. These are examples of reduced models whose goodness of fit is assessed in comparison to a full saturated model. Subtracting the log-likelihood function for the saturated model from that for the reduced model and multiplying by minus 2 yields a statistic *Q,* which can be used to test hypotheses, in this case whether a parameter value γ can be found so that all *P* _{k} can be expressed as following a power law or whether the data are better described by an exponential. If the reduced model holds, the distribution function of this statistic converges to a chi-square distribution with N-1-r (r: number of parameters) degrees of freedom. ^{32} The statistic *Q* can also be used to determine the best estimate of the value of the parameter γ and confidence intervals.
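The calculation of *Q* can be sketched as follows. This is an illustrative outline under stated assumptions, not the code used for the analysis: the constant *I* is taken here to normalize each model over the reported values of *k* in the tail, parameters are estimated by a simple grid search, confidence intervals are not computed, and the counts are synthetic.

```python
import math

def power_law_probs(ks, gamma):
    """P(k) = I * k**-gamma, with I chosen so the probabilities sum to 1."""
    weights = {k: k ** -gamma for k in ks}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

def exponential_probs(ks, theta):
    """P(k) = I * exp(-theta * k), normalised over the same k values."""
    weights = {k: math.exp(-theta * k) for k in ks}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

def log_likelihood(freq, probs):
    """Multinomial log-likelihood, dropping the constant combinatorial term."""
    return sum(m * math.log(probs[k]) for k, m in freq.items() if m > 0)

def q_statistic(freq, probs):
    """Q = -2 * (log-lik of reduced model - log-lik of saturated model)."""
    n = sum(freq.values())
    saturated = {k: m / n for k, m in freq.items() if m > 0}
    return -2.0 * (log_likelihood(freq, probs) - log_likelihood(freq, saturated))

def best_fit(freq, model, grid):
    """Grid search: return (smallest Q, parameter value that attains it)."""
    ks = list(freq)
    return min((q_statistic(freq, model(ks, x)), x) for x in grid)

# Synthetic tail counts m_k generated from an exact power law (gamma = 2.5).
freq = {k: round(100000 * k ** -2.5) for k in range(1, 21)}
q_power, gamma_hat = best_fit(freq, power_law_probs,
                              [1 + 0.01 * i for i in range(301)])
q_exp, theta_hat = best_fit(freq, exponential_probs,
                            [0.01 * i for i in range(1, 301)])
```

On these synthetic counts the grid search recovers a value of γ close to 2.5, and *Q* for the power law is far smaller than *Q* for the exponential, which is the kind of comparison the text describes.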

In addition, we performed a chi-squared goodness-of-fit test using the likelihood derived values for γ and *I* by grouping (or binning) the expected and the observed data geometrically and calculated the value of chi-square. The data are grouped to account for misreporting caused by recall biases and the tendency for heaping (where round multiples of 5 or 10, and so on, are reported). ^{33,34}
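The effect of geometric grouping on heaped reports can be shown with a small sketch. The bin ratio of 2, the function names, and the counts below are assumptions chosen for illustration; the bin boundaries used in the analysis are not stated here.

```python
def geometric_bins(k_min, k_max, ratio=2):
    """Half-open bins [lo, hi) whose widths grow geometrically,
    e.g. [1, 2), [2, 4), [4, 8), ... (requires k_min >= 1)."""
    edges = [k_min]
    while edges[-1] <= k_max:
        edges.append(edges[-1] * ratio)
    return list(zip(edges[:-1], edges[1:]))

def chi_square_binned(observed, expected, bins):
    """Sum of (O - E)^2 / E over bins, after pooling counts within each bin."""
    stat = 0.0
    for lo, hi in bins:
        o = sum(v for k, v in observed.items() if lo <= k < hi)
        e = sum(v for k, v in expected.items() if lo <= k < hi)
        if e > 0:
            stat += (o - e) ** 2 / e
    return stat

# Hypothetical heaping: everyone with 9 to 11 partners reports "10".
observed = {9: 0, 10: 30, 11: 0}
expected = {9: 10, 10: 10, 11: 10}
stat = chi_square_binned(observed, expected, geometric_bins(1, 11))
```

Because 9, 10, and 11 fall in the same bin, the heaping at 10 no longer counts against the model, whereas an ungrouped comparison of the same counts would show a large discrepancy.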

## Results

The frequency distributions for reported numbers of partners over a 1-year period for Britain in 2000 are presented in Figure 1. In this nonlogarithmic presentation, the low values in the tail of the distributions are not discernible, but it is clear that most people report a low number of partners. Furthermore, the range of reported numbers of partners and the speed with which the distribution declines to the tail are evident. The tail is better displayed in the cumulative distribution (Fig. 2), and the latter form is used to fit the model curves, which can additionally be plotted on logarithmic scales (Fig. 3). The inset in Figure 3 (Natsal 1990 data), with geometrically grouped data, is smooth and shows that the scatter of the ungrouped data is mainly the result of memory effects and misreporting.

Slightly different analyses were required depending on the pattern over the whole range of reported numbers of partners. To explore whether the networks are scale-free, we are mainly concerned with the tail of the distribution. In some distributions, the behavior is uniform from very low numbers of partners so a single model can be fitted for the entire distribution. In others, there is a level distribution before the probability falls off. In such cases, we use data well to the right for fitting the model. In the Natsal 1990 and 2000 data, the linear region was deemed to begin at a single partner for reports over a 1-year period and for homosexuals. For lifetime partner numbers reported by heterosexuals, we started the analysis of the tail at 16 reported partners. Similarly, for the data collected in Zimbabwe, the tail of the distribution of reported lifetime number of partners for men was analyzed from 16 partners, but in all other cases, the analysis commenced at 4 reported partners. In contrast to reports of MSM from the entire country from the Natsal data, the LGMSHS data had a plateau at low numbers of partners, so evaluation started at 12 partners over a 1-year period.

In addition to commencing our analysis well into the tail in some of the distributions, we note that 2 models can be fitted to these distributions, with the region before the tail also described as a power law with a very low decay exponent, γ, indicating a substantial pool of people with a few partners.

In a chi-squared goodness-of-fit test, the power law model was compared with observations and proved a reasonable fit for numbers of partners reported over 1 year from the 1990 Natsal (heterosexual men chi-squared = 17.2, df = 9; heterosexual women chi-squared = 4.24, df = 3) and for lifetime numbers of partners for MSM (chi-squared = 17.1, df = 6). Estimates for Natsal 2000 data were less likely to conform to a power law with the exception of MSM (chi-squared = 12.1, df = 5). Comparing an exponential model with the power law model as a description of the data revealed a smaller value of the test statistic Q for the power law in all cases (results not shown). The results of the likelihood ratio tests comparing the power law model with the saturated model are presented in Table 1 for each dataset, period of reporting partners for heterosexual and homosexual men and women.

The power law model provides a good fit for the available data, but it should be noted that the range of numbers of partners for which the model can be compared with data is limited. The number of orders of magnitude for which data are available reflects the extent to which the scale-free behavior can be confidently attributed to the system. However, the orders of magnitude covered in the data will be a function of the sample size because more extreme behaviors are rare. The scale-free region for lifetime partners of MSM covers 3 orders of magnitude. This is despite the small sample size and implies that there are relatively large numbers of MSM with many sex partners.

### Comparing the Observed Distributions and the Decay Constants

#### Heterosexuals.

In all cases, the power law model provides a better fit to the tail of the distribution than the exponential model (Table 1). The differences in goodness of fit between the power law and the exponential are more pronounced in descriptions of number of partners over 1 year than over a lifetime. The scaling exponent for Zimbabwean women falls in the region between 2 and 3 where there is no spreading threshold, whereas for the British population, it is around 3, the borderline for the existence of a transmission threshold. In Britain, values of γ for women over a 1-year period fell from 3.7 in 1990 to 3.1 in 2000, indicating an increase in the variance in numbers of sex partners. Despite this, both values are still significantly greater than that of the Zimbabwe women who show a markedly greater variance. An increase in variance between reports from 1990 and 2000 also took place among British men. The tail of the distribution of reported number of partners over 1 year for Zimbabwean men was well described by the power law model, but the evaluation began at 4 partners because there are substantial numbers of men reporting 2, 3, and 4 sex partners over 1 year: 41% of Zimbabwean men have more than 1 partner in a year, 12% more than 3 partners in a year, whereas women have predominantly only 1 partner over this period (5.7% >1 and 1.4% >3). This is in contrast to the long tail of the distribution for women, which extends to 80 partners in 1 year with only 1.4% of the women contributing approximately 14% of all partnerships/edges. The power law model provides a less convincing description of the reported number of partners of British men but is still somewhat better than the exponential model. The results with the Zimbabwean and British data show that values of γ determined by use of the lifetime data usually differ somewhat from the values as determined from 1-year data, indicating an increase in the numbers of sex partners over a time scale of decades.

In Britain, the value of γ for men is always smaller than that of women consistent with the maximum reported number of partners for women being much lower than that for men. This contrasts with the behaviors reported in Zimbabwe where the greatest number of partners reported by women and men in a lifetime are the same, but over 1 year, women report a maximum 3 times greater than that reported by men with a concomitantly smaller value of γ over 1 year. Over 1 year, British men appear to include more high-activity individuals compared with Zimbabwe men. Over a lifetime, however, the values of γ for Zimbabwean and British men are similar.

#### Homosexuals.

There are wide confidence intervals for the estimate of γ for homosexual women in the Natsal study because only a small number of such women were recruited into the study, and it is not possible to determine whether this group has a value of γ in the region between 2 and 3 for partners reported over 1 year. It is also not possible to distinguish between the power law and the exponential model. For lifetime numbers of partners, homosexual women have a degree distribution with a value for γ clearly below 3.

For homosexual men, the power law model is in all cases the better model as shown by the large differences in the test statistic. In contrast to data for homosexual men from the whole of Britain, data for those in London has a plateau at low numbers of partners. This indicates that MSM are as likely to have 2 sex partners as 1. This pattern resembles that reported for 1 year for heterosexuals in Zimbabwe and is probably not analogous to the plateau observed for the lifetime number of partners of British heterosexual men, because it occurs over different time scales. This latter plateau could be predominantly generated by partners a person is involved with prior to forming a long-term partnership. In MSM and Zimbabwean men, there is an order of magnitude greater rate at which partners are acquired and this could often lead to multiple sexual partnerships.

Values of γ for MSM are in the region between 1.5 and 2, which is fundamentally different and indicates a network with accelerated growth, ie, as the number of people in the network increases, the number of connections each one has also increases. In other words, the number of partners of each man who has sex with men increases as the population of MSM increases in size. The value of the decay constant γ depends on the size of the network and decreases linearly with it. The values for γ for reported number of partners of MSM over a single year and a lifetime were practically identical. If they were not identical, this would imply that the rate of sex partner change has changed over recent years.

## Discussion

In a range of different populations, we found that the distribution of numbers of partners appeared to be scale-free. However, a number of caveats need to be noted. First, the number of orders of magnitude over which the power law is observed for the behavior is limited. The range varies between the groups analyzed, from a low of 1 order of magnitude for women reporting their number of partners over a year to a wide range of more than 3 orders of magnitude for the lifetime number of partners of MSM. However, the maximum number of partners reported in the datasets did not seem to be restricted by an exponential cutoff in which the fraction reporting high numbers rapidly decays because the number possible/desirable saturates. Rather, it appears that we fail to sample individuals with very rare behaviors, implying that if the sample size was increased, we would expect a higher maximum number of reported partners. An important question is whether at some stage we would expect the power law to be replaced by an exponential cutoff rather than the network remaining scale-free at all scales. It seems likely that the number of partners will be bounded at some scale and there are a variety of reasons for networks to be bounded. Most obviously, the human population has a finite size and one imagines that there is a physiological limit to the number of people with whom one can have sex. The question is where we would observe this limit if our data provided a valid representation of the entire population. To better capture high-activity people with many partners, one would want to oversample them by, for example, sampling those who have sexually transmitted infections. It is also likely that the time window with which we view the network influences where the boundary occurs with a shorter period for accumulating partners lowering the maximum number of partners. 
A third source of finite size effects is demographic stochasticity, ie, the random effects that arise from the population consisting of a discrete set of individuals.

The behavior of the tail is crucial for the epidemiology of sexually transmitted infections because the invasion and spread of infection can be dominated by a few very high-risk individuals. The pattern of sexual partner acquisition over a short time period determines the creation of contacts for the short-lived bacterial sexually transmitted infections and for the rapid spread of a longer duration infection such as HIV or HSV-2. Looking at the distribution of partner numbers over the shorter period more relevant for the acute STIs, the decay constant γ has values in the region smaller than 3 in which there is unlikely to be a critical “spreading rate” (which is synonymous with an infinite R_{0}) for all the populations except for heterosexual women in Britain.

If we accept that there is a scale-free distribution and use a continuous approximation for analytical simplicity, as is common in the physics literature, then the moments of the distribution can be calculated for different values of γ. The validity of the continuous approximation depends on the maximum value of k; it holds best for large k and is more robust when we have observed a greater range in partner numbers. As to the conclusions we can derive from this model, it is crucial whether the maximum possible number of partners has a high probability and has been found. In our data, this is easier to accept in the MSM data, which extend over a longer period and have a less steep slope. Accepting these 2 restrictions, one can calculate the moments of the continuous approximation of the distribution of the number of sex partners. These can then serve to derive the basic reproductive number R_{0}, essentially the average number of successful offspring that an animal produces. In the case of microparasitic infections, R_{0} can be defined as the number of new infections generated by 1 infected individual that is introduced into a fully susceptible population. ^{4}

For a behaviorally homogeneous population, the basic reproductive number is given by R_{0} = β c *D*, where β is the transmission probability, c is the rate of forming new contacts, and *D* is the mean duration of infectiousness. ^{4} However, for a heterogeneous pattern of risk behavior with random mixing, the formula can be changed to include the disproportionate influence of those with a very high number of contacts by including the coefficient of variance (the variance σ^{2} divided by the mean number of partners, *m*) or the second moment of the distribution divided by the first moment ^{19,23,35}:

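$$R_{0} = \beta D \left( m + \frac{\sigma^{2}}{m} \right) = \beta D \, \frac{\langle k^{2} \rangle}{\langle k \rangle}$$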

In a scale-free network, the variance or second moment tends to infinity.

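$$\langle k^{2} \rangle \propto \int^{\infty} k^{2-\gamma} \, dk \rightarrow \infty \quad \text{for } \gamma \le 3$$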

The value of the basic reproductive number needs to be greater than 1 for an infection to invade a population. However, in the case of a scale-free network with an exponent less than 3, because <k^{2} > tends to infinity, R_{0} will always be greater than 1 regardless of the value of the transmission probability or the duration of infection. This means that changes to the transmission probability or duration of infectiousness would be unable to eliminate infection. However, the analysis depends on the sexual partnerships that create the scale-free network being formed and staying in place for a period commensurate with the duration of infectiousness. Thus, for more acute sexually transmitted infections, the scale-free network needs to exist over a wide range of values in a short period, which is less convincingly demonstrated in the data. In interpreting the implications of an infinite variance in contact patterns for the basic reproductive number, it is important to remember that the value determines whether invasion or persistence can occur, not the likely endemic prevalence. The more invasion is dependent on a high variance, the lower the likely prevalence associated with infection. Thus, if there is no threshold spreading rate, elimination will only be possible if the contact pattern is changed, but a reduced incidence would still be possible through alterations in transmission probability and duration of infection.
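The influence of the variance on the basic reproductive number can be made concrete with a short calculation. The sketch below takes the heterogeneous form to be R_{0} = β *D* (*m* + σ^{2}/*m*), following the verbal description above; the function names and all numerical values are hypothetical.

```python
def r0_homogeneous(beta, c, duration):
    """R0 = beta * c * D for a behaviourally homogeneous population."""
    return beta * c * duration

def r0_heterogeneous(beta, duration, contact_rates):
    """R0 = beta * D * (m + variance / m) under random mixing, where m is
    the mean contact rate; this equals beta * D * <k^2> / <k>."""
    n = len(contact_rates)
    mean = sum(contact_rates) / n
    variance = sum((c - mean) ** 2 for c in contact_rates) / n
    return beta * duration * (mean + variance / mean)

beta, duration = 0.1, 5.0  # hypothetical transmission probability and duration

# Two hypothetical populations with the same mean of 2 partners per unit time.
uniform = [2] * 10          # everyone has 2 partners
skewed = [1] * 9 + [11]     # 9 people with 1 partner, 1 person with 11

r0_uniform = r0_heterogeneous(beta, duration, uniform)
r0_skewed = r0_heterogeneous(beta, duration, skewed)
```

Both populations have the same mean, yet the skewed population has a basic reproductive number more than 3 times higher, because the single high-activity individual dominates the variance term.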

Our analysis draws attention to some interesting differences in distributions of partner numbers, which almost certainly have had important consequences for the epidemiology of HIV. The value of γ for British MSM lies between 1.5 and 2, which implies an accelerating network: an increase in the number of men increases the average number of partnerships. It is immediately apparent that unless the number of MSM is limited, there will be an infinite variance and no spreading threshold. This helps explain why sexually transmitted infections have readily invaded, and are difficult to eliminate from, the MSM community. We would expect a higher mean number of sex partners for MSM in a large population like London than in smaller local populations. If there were no geographic restrictions on partnerships, we could see the behavior of MSM inside and outside London as part of a single large network.
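As a rough illustration of why an exponent below 2 implies an accelerating network, consider a continuous approximation in which the proportion with k partners is proportional to k^{-γ} for 1 ≤ k ≤ k_{max}. The cutoff k_{max} and the continuous approximation are assumptions introduced here for illustration only. For 1 < γ < 2, the mean number of partners is

$$\langle k \rangle = \frac{\int_{1}^{k_{\max}} k^{1-\gamma} \, dk}{\int_{1}^{k_{\max}} k^{-\gamma} \, dk} = \frac{\gamma - 1}{2 - \gamma} \cdot \frac{k_{\max}^{\,2-\gamma} - 1}{1 - k_{\max}^{\,1-\gamma}}$$

As k_{max} increases, the denominator of the last factor approaches 1 while the numerator grows as k_{max}^{2-γ}. If the largest attainable partner number rises with the size of the population, the mean therefore rises with it rather than converging to a fixed value.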

The values of the decay constant are more difficult to interpret for heterosexuals in Britain. Although the value is less than 3 for men, it appears to be greater than 3 for women. Theoretically, it is unclear what combining these 2 different reported distributions implies for the heterosexual spread of an infection, but it appears likely that the transmission probability of infection and its duration will play an important role in determining the persistence of an STI in British heterosexuals. The same problem of interpreting 2 different distributions that combine in the spread of an infection exists for Zimbabwe. The scope for HIV spread is clearly not as great as that for MSM in Britain, but a comparison of the distributions of behaviors for men and women could in part explain the greater heterosexual spread of HIV in Zimbabwe than in the United Kingdom. In Zimbabwe, most men have more than 1 sex partner, whereas far fewer women have more than 1 partner. However, the tail of the partner number distribution for women does appear scale-free, with a γ less than 3. The sex partner distribution for Zimbabwean men can be interpreted as 2 power laws: a short one comprising only a few points (although many individuals), and a longer tail with a value of γ at the boundary of the at-risk region (γ = 3.0). The high number of men with more than 1 partner would provide a means for HIV to spread widely once it has invaded through the long tails of the distribution. The maximum number of sex partners per year among women is approximately 3 times that among men. Thus, the devastating HIV epidemic could be a consequence of this combination of a significant fraction of women with extreme numbers of sexual partners and a generally high level of non-monogamous behavior among men.
This conjecture requires further exploration in simulation models of infection in bipartite graphs in which the 2 types of nodes have different power laws describing observed contact distributions. When comparing populations, an accurate exploration of the tails of the partner number distributions is extremely difficult with random samples of the population. More focus is required on those likely to have many partners, for example, through studies of those with STIs or in high-risk occupations like sex work or truck driving.

Over the decade between the 2 British sexual behavior surveys, the network for heterosexuals appears to have changed to a slower decay rate, indicating that extreme numbers of partners are both more common and more extreme. It should be noted that the collection of data in both the United Kingdom and Zimbabwe occurred after concern about HIV arose, so reported behavior could have been influenced by concerns over acquiring infection.

These high-risk individuals have been thought of conceptually as a core group for the transmission of a sexually transmitted disease. However, it is possible to conceive of the core group either as individuals who are separate and other, or as individuals whose behavior is a continuation of that of the rest of the population. In Britain, for heterosexuals, the steady power law decline in reported number of partners suggests that there is no discontinuity, so the “core group” is simply a part of a continuous pattern of behaviors. For MSM, the same is true of partners over 1 year but not over a lifetime, where there is a plateau (a first power law) and a tail (a second power law). The same is the case for Zimbabwe. It would be possible to define a part of the population whose behaviors place them within the scale-free tail of the distribution. It is interesting to speculate whether, in the case of these discontinuous distributions, different behavioral mechanisms, which might be characterized as “settled” or “searching,” generate the observed outcome.

## References

1. Marais H, Wilson A. Report on the Global HIV/AIDS Epidemic. Geneva, Switzerland: UNAIDS, 2001.

2. Anderson RM, Garnett GP. Mathematical models of the transmission and control of sexually transmitted diseases. Sex Transm Dis 2000; 27:636–643.

3. Grassly NC, Garnett GP, Schwartlander B, Anderson RM. The effectiveness of HIV prevention and the epidemiological context. Bull World Health Organ 2001; 79:1121–1132.

4. Anderson RM, May RM. Infectious Diseases of Humans. Oxford: Oxford University Press, 1991.

5. Anderson RM, Medley GF, May RM, Johnson AM. A preliminary study of the transmission dynamics of the human immunodeficiency virus (HIV), the causative agent of AIDS. IMA Math Appl Med Biol 1986; 3:229–263.

6. May RM, Anderson RM. The transmission dynamics of human immunodeficiency virus (HIV). Philos T Roy Soc B 1988; 321:565–607.

7. Garnett GP, Anderson RM. Sexually transmitted diseases and sexual behaviour: Insights from mathematical models. J Infect Dis 1996; 174(suppl):S150–S161.

8. Kretzschmar M. Graphs and line graphs as a model for contact patterns. Zeitschrift F Angew Math Mech 1996; 76(suppl 2):433–436.

9. Ghani AC, Donnelly CA, Garnett GP. Sampling biases and missing data in explorations of sexual partner networks for the spread of sexually transmitted diseases. Stat Med 1998; 17:2079–2097.

10. Ghani A, Swinton J, Garnett GP. The role of sexual partnership networks in the epidemiology of gonorrhea. Sex Transm Dis 1997; 24:45–56.

11. Erdős P, Rényi A. On random graphs. Pub Math 1959; 6:290.

12. Dorogovtsev SN, Mendes JFF. Evolution of networks. Advanc Physics 2002; 51:1079–1187.

13. Albert R, Barabási A-L. Statistical mechanics of complex networks. Rev Mod Physics 2002; 74:47–97.

14. Milgram S. The small world problem. Psychol Today 1967; 1:60.

15. Watts DJ, Strogatz SH. Collective dynamics of ‘small-world’ networks. Nature 1998; 393:440–442.

16. Barabási A-L, Albert R. The emergence of scaling in random networks. Science 1999; 286:509–512.

17. Barabási A-L, Albert R, Jeong H. Mean-field theory for scale-free random networks. Physica A 1999; 272:173–187.

18. Dorogovtsev SN, Mendes JFF, Samukhin AN. Size-dependent degree distribution of a scale-free growing network. Physical Rev E 2001; 63:062101.

19. May RM, Lloyd AL. Infection dynamics on scale-free networks. Physical Rev E 2001; 64:066112.

20. Lloyd AL, May RM. How viruses spread among computers and people. Science 2002; 292:1316–1318.

21. Pastor-Satorras R, Vespignani A. Epidemic dynamics and endemic states in complex networks. Physical Rev E 2001; 63:066117.

22. Pastor-Satorras R, Vespignani A. Epidemic spreading in scale-free networks. Physical Rev Lett 2001; 86:3200–3203.

23. Pastor-Satorras R, Vespignani A. Epidemic dynamics in finite size scale-free networks. Physical Rev E 2002; 65:035108.

24. Liljeros F, Edling CR, Amaral LAN, Stanley HE, Aberg Y. The web of human sexual contacts. Nature 2001; 411:907–908.

25. Liljeros F, Edling CR, Stanley HE, Aberg Y. Social networks (communication arising): Sexual contacts and epidemic thresholds [Reply]. Nature 2003; 423:606.

26. Holland Jones J, Handcock MS. An assessment of preferential attachment as a mechanism for human sexual network formation. Proc R Soc Lond B 2003; 270:1123–1128.

27. Holland Jones J, Handcock MS. Social networks (communication arising): Sexual contacts and epidemic thresholds. Nature 2003; 423:605–606.

28. Johnson AM, Wadsworth J, Wellings K, Field J. Sexual Attitudes and Lifestyles. Oxford: Blackwell Scientific Publications, 1994.

29. Erens B, McManus S, Field J. National Survey of Sexual Attitudes and Lifestyles. II: Technical Report. London: National Centre for Social Research, 2001.

30. Gregson SJA, Zhuwau T, Ndlovu J, Nyamukapa CA. Methods to reduce social desirability bias in sex surveys in low-development settings: experience in Zimbabwe. Sex Transm Dis 2002; 29:568–575.

31. Dodds JP, Nardone A, Mercey DE, Johnson AM. Increase in high risk sexual behaviour among homosexual men, London 1996–8: Cross-sectional, questionnaire study. BMJ 2000; 320:1510–1511.

32. Aitkin M, Anderson D, Francis B, Hinde J. Statistical Modelling in GLIM. Oxford: Clarendon Press, 1989.

33. Morris M. Telling tails explain the discrepancies in sexual partner records. Nature 1993; 365:437–440.

34. Roberts JM, Brewer DD. Measures and tests of heaping in discrete quantitative distributions. J Appl Stat 2001; 28:889–896.

35. May RM, Gupta S, McLean AR. Infectious disease dynamics: what characterizes a successful invader? Phil Trans R Soc Lond B 2001; 356:901–910.