In 2006, UNAIDS declared that India had 5.7 million HIV-positive people. NACO put the figure at 5.2 million. And, finally, NFHS-3 put the HIV burden at 2.5 million. M Prasanna Kumar demystifies the numbers game
Estimates of HIV in India have always been challenged, either by the government or by other agencies. In 2006, the Joint United Nations Programme on HIV/AIDS (UNAIDS) declared that India had an estimated 5.7 million HIV cases (range: 3.4 to 9.4 million) (1), making it the country with the largest HIV burden in the world. The National AIDS Control Organisation (NACO) quickly countered that figure, saying it was no more than 5.206 million (2). That same year, UNAIDS stated that there had been 400,000 deaths due to AIDS in India in 2005. This number too was dismissed by the government. Not too long ago, a US Central Intelligence Agency report projecting that India would have 20-25 million HIV infections by 2010 (3) was widely circulated in Indian and international publications.
On the other hand, government estimates of the HIV burden in the country have been regarded with scepticism by international agencies and AIDS activists who believed them to be underestimates.
Such wide variations baffle the average reader. How exactly are HIV estimates made? How reliable are they?
The explanation, in a nutshell, is that the various estimates are based mostly on the same data but on different calculations. NACO’s calculations are made in consultation with a select team of experts including people from the World Health Organisation (WHO) and UNAIDS. However, UNAIDS uses a different modelling approach for making global estimates. Since this method is different from NACO’s, it gives a different estimate when applied to India. Differences in the methodology and algorithms used lie at the root of all disputes regarding HIV estimates.
The real question may be: what drives different groups to use different methods? Do they have an interest in inflating -- or downplaying -- figures?
In 2006, NACO calculated that there were 5.206 million people with HIV in India. This was based on the government’s sentinel surveillance data collected in 2005.
Then, on June 6, 2007, a beaming Anbumani Ramadoss, Union Health Minister, released the results of the third (and latest) National Family Health Survey (NFHS-3) which arrived at an HIV burden of only 2.5 million (range: 2 to 3.1 million), half the previous year’s figures (4). (By way of comparison, there are 1.5 to 2 million people with cancer (5) and 8.5 million people with TB (6) in the country at any given time.)
HIV prevalence in the 15-49 age-group is down from 0.91% to 0.28% (2). Prevalence among men is 60% more than in women, at 0.36% and 0.22% respectively (4).
What are we to make of this huge discrepancy between the earlier and latest estimates?
Sentinel surveillance and HIV trends in India
Every year since 1998, the National AIDS Control Organisation (NACO), which is under the Ministry of Health and Family Welfare, has released figures on India’s HIV burden. These figures are based on data from annual rounds of sentinel surveillance in which blood samples are taken from designated sentinel groups in every state and union territory in the country: pregnant women at government hospitals, clients visiting sexually transmitted diseases (STD) clinics, and groups with high-risk behaviour such as female sex workers, men having sex with men, and injecting drug users. Most surveillance sites are in urban areas, although a certain number of rural sentinel sites are also sampled to get an idea of HIV prevalence in rural areas. A specified number of samples is collected from each sentinel site (400 from the sites of pregnant women and 250 from others) and tested for HIV. India has the largest HIV sentinel surveillance programme in the world and it is being improved and extended every year. In 2005, it was carried out at 703 sites throughout the country. That number was increased to 1,122 in 2006.
Still, not all 611 districts in India are represented in the sentinel surveillance programme. There are plans to increase the number of sites further, to ensure greater geographical representation and to include all significant risk groups in an area. This would certainly mean closer monitoring of the local epidemic, and more accurate assessments of the trend of the epidemic.
Sentinel surveillance is used along with other measures to look at trends in HIV prevalence. Information from various sources is triangulated -- surveillance data, number of AIDS cases reported, number of AIDS deaths reported, age-specific mortality, blood bank data, and size of population of groups with high-risk behaviour. In any particular site, if data from multiple sources is available, it helps build a truer picture of the local epidemic. Such triangulation also helps to limit errors in noting trends. But one should never forget that in any study of this kind, where only selected small subsets of the population are sampled and the results of the sample used to make projections for the entire country, there is bound to be some degree of uncertainty in the final estimate. This is why the methodology is constantly being refined to reduce the margin of error.
Sentinel surveillance is a good tool for noting trends in HIV prevalence, and changes over the years. It is a costly and labour-intensive exercise (over Rs 3 crore is spent every year). In India, it is well carried out and generally well supervised. Preparations for each annual round of sentinel surveillance are made months ahead, all staff involved are trained, a standard protocol of data gathering and testing in all sites is followed, and there is monitoring by observers at all levels as well as external quality assurance in the testing of samples.
Sentinel surveillance and the HIV burden
The problem arises when sentinel surveillance data is used to estimate a country’s HIV burden. Sentinel surveillance is not designed to make estimates, but it has been used to provide rough estimates of the HIV burden for many years, for want of a better approach. One could ask why other methods were not used. But the problem is not so much in making rough estimates as in their dissemination and their use to support claims that they cannot support.
HIV is not very prevalent among the general population in India. (The latest figures suggest that the prevalence is half of the already low prevalence of less than 1%.) The accuracy of a sample survey depends partly on the prevalence of the condition. The lower the prevalence, the higher the minimum sample needed. Also, sampling biases are worsened when the condition has a low prevalence.
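The link between prevalence and sample size can be made concrete with the textbook normal-approximation formula for estimating a proportion. The sketch below is illustrative only: it assumes simple random sampling, a 95% confidence level and a tolerated error of one quarter of the true prevalence, none of which is taken from the actual NACO or NFHS designs, and the function name is hypothetical. The prevalence values are those quoted in this article.

```python
import math

def min_sample_size(prevalence, relative_error=0.25, z=1.96):
    """Smallest simple random sample whose 95% margin of error is at most
    relative_error * prevalence (normal approximation; an assumption,
    not the formula used by NACO or NFHS)."""
    margin = relative_error * prevalence
    return math.ceil(z ** 2 * prevalence * (1 - prevalence) / margin ** 2)

# Prevalence values quoted in the article: South African antenatal clinics
# in 2003 (28%), and the earlier (0.91%) and latest (0.28%) Indian figures.
for p in (0.28, 0.0091, 0.0028):
    print(f"prevalence {p:.2%}: at least {min_sample_size(p):,} samples")
```

Under these assumptions, a few hundred samples are enough where prevalence is 28%, while tens of thousands are needed at 0.28% for the same relative precision.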
Biases in the sampling process
There are various biases in the existing sampling process. However, these biases are not publicly acknowledged, as a result of which the public is misled.
For example, practically all sentinel sites are in government hospitals, whereas the majority of people use private services. We don’t know the HIV prevalence among those who attend private hospitals. Estimates of the overall HIV burden are mainly based on prevalence among pregnant women attending government hospitals. This excludes those who go to private hospitals for antenatal care -- and those who don’t receive any healthcare at all.
Further, the samples are of pregnant women and various groups with risk behaviour. They offer no direct information on other women or on men outside these groups.
In addition, samples taken from STD clinics are intrinsically biased -- they are taken from people with symptoms of a sexually transmitted disease who attend government STD clinics for treatment. These should not have been included when calculating the HIV burden of the country.
Finally, if the condition is unevenly distributed in the population, any sample taken will not be representative of the population. Representative samples are necessary in order to make projections or estimates, or else the results will be unreliable.
To illustrate, each of a state’s several antenatal clinic sites provides 400 samples for the annual surveillance. Just two or three positive samples among them could skew the overall results. In Uttar Pradesh, in 2003, at least eight of the 17 antenatal clinics did not have a single positive sample.
By contrast, in South Africa, the antenatal prevalence in 2003 was 28%. If they had used the same system as ours, they would have had 112 positive samples out of 400 samples at a single site.
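The arithmetic behind this contrast can be sketched with the binomial distribution. The example below assumes that each of the 400 samples at a site is an independent draw from a population with a fixed prevalence, which is a simplification; the 0.25% figure is a hypothetical low-prevalence value chosen for illustration, and the function name is not from any official method.

```python
from math import comb

def prob_positives(k, n, prevalence):
    """Binomial probability of exactly k positives among n independent samples."""
    return comb(n, k) * prevalence ** k * (1 - prevalence) ** (n - k)

n = 400  # samples per antenatal sentinel site

for prevalence in (0.0025, 0.28):  # hypothetical low-prevalence site vs South Africa 2003
    expected = n * prevalence
    p_zero = prob_positives(0, n, prevalence)
    print(f"prevalence {prevalence:.2%}: expected positives {expected:.0f}, "
          f"chance of no positives {p_zero:.1%}")

# How far the site-level estimate moves with one or two extra positives.
for positives in (0, 1, 2, 3):
    print(f"{positives} positives out of {n} -> observed prevalence {positives / n:.2%}")
```

With a true prevalence of 0.25%, a site expects a single positive sample and has roughly a 37% chance of finding none at all, so sites with no positive samples are unsurprising, and each additional positive shifts the site’s observed prevalence by a quarter of a percentage point.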
The only way to get an accurate picture of the HIV burden is through a head count -- that is, testing everyone -- which obviously is not possible. One therefore has to accept the limitations of sentinel surveillance data and, now, of the NFHS data, which is likely to be more accurate.
Assumptions behind the NACO algorithm
When NACO mentions an increase in HIV burden, it is referring to an estimated number based on a calculation. These estimates are affected by sampling errors and the number in a particular year does not tell you much. The numbers are useful only when one wants to look at trends over time, to assess the rate of growth of the epidemic.
Even here, there are questions. This is illustrated by looking at the estimates from 1998 to 2005, given in the table.
[Table: HIV estimate in lakhs and increase over previous year in lakhs, 1998 to 2005, with the 2003, 2004 and 2005 estimates made using the improved method]
Source: National AIDS Control Organisation. UNGASS India report, 2005.
How should we understand these annual estimates?
First, these estimates are based on an algorithm used by NACO which utilises sentinel surveillance data. The algorithm in turn is based on HIV prevalence among pregnant women, prevalence among STD clinic attendees, percentage of men and women between the ages of 15 and 49 in urban and rural areas, ratio of HIV prevalence among men and women, ratio of HIV prevalence in the urban population to that of the rural population, etc. When one is projecting from a sample subset to the entire population, one is forced to make some assumptions or use values which seem plausible for certain parameters. However, these could result in large margins of error in the final result (7). The most critical assumption is that HIV is uniformly present among the general population. This may not be true. Another important assumption is that there is a differential of 2.4:1 between the HIV prevalence in urban areas and rural areas. A third source of error is the assumed HIV prevalence differential between men and women: 1.2:1 in high-prevalence states, 2:1 in medium-prevalence states, and 3:1 in low-prevalence states. In defence, it must be said that without making these assumptions it is not possible to calculate the HIV burden of the country, when men are not sampled and the main input is that of HIV prevalence among pregnant women.
Then, from time to time, NACO also makes changes in the algorithm, taking new evidence into account to help make a better estimate. Changing the methodology or algorithm will alter the final estimates, so estimates made with different algorithms are not strictly comparable.
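To see how much such assumptions matter, consider a deliberately simplified extrapolation. This is not NACO’s actual algorithm: the sketch assumes that antenatal prevalence stands in for prevalence among urban women, applies the 2.4:1 urban-to-rural differential and the male-to-female ratios mentioned above, and uses an entirely hypothetical state population and antenatal prevalence. All names in the code are illustrative.

```python
def estimate_infections(anc_prevalence, population,
                        urban_rural_ratio=2.4, male_female_ratio=1.2):
    """Toy extrapolation from antenatal (ANC) prevalence to adult infections.
    Assumption: ANC prevalence equals prevalence among urban women."""
    urban_women_prev = anc_prevalence
    rural_women_prev = anc_prevalence / urban_rural_ratio
    women = (population["urban_women"] * urban_women_prev
             + population["rural_women"] * rural_women_prev)
    men = male_female_ratio * (population["urban_men"] * urban_women_prev
                               + population["rural_men"] * rural_women_prev)
    return women + men

# Hypothetical adult population (ages 15-49) of an imaginary state.
state = {"urban_women": 3_000_000, "rural_women": 7_000_000,
         "urban_men": 3_000_000, "rural_men": 7_000_000}

# Same hypothetical ANC prevalence (0.5%), different assumed male:female ratios.
for ratio in (1.2, 2.0, 3.0):
    total = estimate_infections(0.005, state, male_female_ratio=ratio)
    print(f"male:female ratio {ratio}:1 -> about {total:,.0f} infections")
```

In this toy model, moving the assumed male-to-female ratio from 1.2:1 to 3:1 raises the estimate by about 80% without any change in the underlying data, which is why estimates made with different algorithms cannot be compared directly.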
Is the epidemic stabilising?
In India, which has an overall low HIV prevalence and a non-uniform spread, when we make projections with only 400 samples from each site, there are bound to be uncertainties in the final estimate. No one can be sure what the margin of error actually is. But we can have some confidence in the fact that since 1998, when sentinel surveillance was first done at a national level, a number of states have registered only marginal increases in HIV prevalence and many have remained at the same level for years. Several large states such as Uttar Pradesh, Madhya Pradesh, Bihar, etc, have only 0-0.25% prevalence levels (8). Tamil Nadu has had a sustained HIV prevalence of under 0.7% and should no longer be considered a high-prevalence state. All these indicate that HIV is not spreading as rapidly as we once thought it would in our country; it is probably stabilising (9).
Are our preventive efforts paying off? Or is this just part of the long-term history of HIV infection? Will it become less virulent over time, as happened in the case of syphilis? This is something that we cannot know at this stage, as HIV is a new disease. One can be really certain only if a similar slowing is seen in subsequent years as well.
In order to improve HIV estimates, epidemiologists have started implementing population-based surveys or household surveys, in which blood samples are taken from the male and female members of randomly selected households. About 30 countries worldwide, mostly in Africa, have conducted such population-based surveys. In most cases, this has resulted in a downward revision of earlier HIV estimates based on sentinel surveillance. Kenya, Ethiopia, Cambodia and now India are the countries where the new method cut previous estimates by half. As these surveys are accepted as being more accurate than the previous ones, the total number of HIV infected in the world is being revised downwards constantly. In 2006, UNAIDS estimated the global HIV burden to be 38.6 million, with a range of 33.4 to 46.0 million (10). Now this will shrink by 2.5 million based on the latest estimates from India.
Third National Family Health Survey
The third National Family Health Survey was carried out throughout India in 2006. This community-based household survey was carried out to obtain data on indicators of population, health and nutrition, according to background characteristics. Information was collected about households, and interviews conducted with women aged 15-49 years and men aged 15-54 years. Blood tests were also conducted for anaemia and HIV for a sub-population of respondents.
A community-based survey is the ideal method of finding out the HIV burden of the country because it covers men as well as women in the reproductive age-groups, both married and unmarried, and not just pregnant women. This study is far more representative than sentinel surveillance, since it represents all adults. However, such a survey is most accurate in countries with a generalised epidemic -- more than 1% of adults must have HIV infection. In India, HIV infection is seen predominantly in vulnerable groups such as female sex workers, their male clients, men having sex with men, injecting drug users, their spouses and other sexual partners. Only five states have an HIV prevalence of over 1%, so it may not be true to say that India has a generalised epidemic.
Limitations of the NFHS
While the NFHS-3 estimate is believed to be more accurate than the annual estimates provided by NACO, we must remember that it too has certain limitations.
A community-based study may also introduce errors of various kinds. One is that as only members of households are sampled, people on the move, migrants, those who have no regular living place such as sex workers and similar groups at higher risk are excluded. In this survey, the sample sizes for the four high-prevalence states of Maharashtra, Karnataka, Andhra Pradesh and Manipur and the low-prevalence but high-population state of Uttar Pradesh, were adequate to provide state-level HIV estimates.
In the other 22 states, the sample size was good enough to provide HIV estimates at the national level but inadequate to provide state-level estimates.
In low-prevalence states, a very large number of samples is required to provide accurate HIV-prevalence estimates. Restricting the sample size in these states, a measure no doubt enforced by the need to keep costs down, reduces the validity of the study.
An advantage of sentinel surveillance is that it provides state-specific estimates. More than 100,000 blood samples were collected in the NFHS-3 survey. In contrast, 225,000 samples were collected in the 2005 sentinel surveillance round, and many more in the 2006 round, which covered over half as many sites again.
Another possible source of error in any community-based survey is the need to link blood samples to personal interviews and household surveys. The NFHS-3 survey used the Linked Anonymous method, in which an individual’s interview data can be linked to his or her HIV result. After the interview, which included sexual history-taking, every participant was informed about the purpose of blood testing and was asked to sign a consent form. Blood was drawn only from individuals consenting to participate.
Linked surveys tend to underestimate HIV prevalence. People who know they are HIV-positive or suspect they may be infected may refuse to provide a blood sample. Non-participation of infected individuals aggravates the error in low-HIV-prevalence situations. Most low-prevalence states have an HIV prevalence of 0.1% to 0.3%, which means that among 1,000 individuals only one to three may be positive. Even if a few such positive individuals do not participate in the survey, the estimate is thrown off considerably. By contrast, sentinel surveillance employs the Unlinked Anonymous method, in which HIV testing is done on blood samples which are routinely collected for other purposes, and participation error is minimised.
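The effect of selective refusal can be illustrated with a few lines of arithmetic. The refusal rates below are hypothetical, chosen only to show the direction and rough size of the bias; they are not NFHS-3 figures, and the function name is illustrative.

```python
def observed_prevalence(true_prevalence, refusal_positive, refusal_negative):
    """Prevalence measured among those who agree to give blood, when infected
    and uninfected people refuse at different (assumed) rates."""
    tested_positive = true_prevalence * (1 - refusal_positive)
    tested_negative = (1 - true_prevalence) * (1 - refusal_negative)
    return tested_positive / (tested_positive + tested_negative)

true_prevalence = 0.003  # upper end of the 0.1% to 0.3% range for low-prevalence states

# Equal refusal rates leave the estimate unbiased.
print(f"{observed_prevalence(true_prevalence, 0.10, 0.10):.3%}")

# Hypothetical case: 40% of infected people refuse, against 10% of the uninfected.
print(f"{observed_prevalence(true_prevalence, 0.40, 0.10):.3%}")
```

Under these assumed rates, a true prevalence of 0.3% would be measured as roughly 0.2%, an underestimate of about a third.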
It was speculated for some time that India’s HIV burden was overestimated. AIDS was not very visible in large parts of the country with low numbers of reported AIDS cases and AIDS deaths. When antiretroviral therapy was initiated, the uptake was remarkably low and the expected hordes of AIDS patients demanding ART did not materialise. The algorithm used for estimating the HIV burden had a lot of unvalidated assumptions. In a landmark study, Lalit Dandona et al (11) did a population-based HIV prevalence survey in the high-prevalence district of Guntur in Andhra Pradesh. They demonstrated that the sentinel surveillance methodology and the algorithm used for estimating HIV burden were overestimating the infected population.
The NFHS-3 study has provided valuable information on the state of the epidemic in the country. It provides a more accurate figure than previous estimates but it needs further refinement. While there is no doubt about the usefulness of the study, its limitations have to be addressed so that it is more informative and reliable.
What do we make of the different estimates?
Coming back to the question we started with, why do we have conflicting estimates?
One reason is that different organisations use different models and algorithms when arriving at estimates, even though they might use the same data.
One example is the estimate of AIDS-related deaths. Recording of AIDS deaths is important because they indicate the mortality toll of the disease. Historically, reported AIDS deaths were used to estimate HIV prevalence using the ‘back calculation’ approach.
UNAIDS’s estimate of AIDS deaths uses a projection method based on HIV prevalence data. This method, when used in low-prevalence regions where the epidemic mostly involves some vulnerable groups, can only provide death estimates with very wide margins of error (12). On the other hand, NACO does not make an estimate but gives the actual number of reported AIDS deaths in the country; only 8,097 AIDS deaths were reported till the end of 2005 (13). If cause-of-death reporting is nearly complete, reported AIDS deaths do give an indication of the state of the epidemic in the area. However, although AIDS death reporting is required, it is rarely carried out even by hospitals in India, with the result that only a tiny fraction of those who die of the disease are ever recorded as AIDS deaths.
One must also remember that all groups concerned -- whether government, international organisations or civil society organisations -- may have their own biases as well as their own interests in projecting a particular number.
When we hear civil society organisations claiming that the country’s HIV burden is much higher than NACO’s estimate, we must also remember that many civil society organisations see only a small part of the whole picture. They tend to see people who are symptomatic or have AIDS. The number of symptomatic people and people with AIDS is certainly increasing, since those infected years ago are now developing symptoms/AIDS.
There is no doubt that doctors and civil society organisations are seeing more people who need care. But a spate of AIDS cases does not mean an absolute increase in the number of people infected. All it means is that the epidemic is becoming more visible as the proportion of symptomatic patients increases.
There is another factor as well. With the global interest in AIDS, institutions of all kinds could have an interest in high estimates. If surveillance and other data show that the HIV epidemic is not increasing as rapidly as was expected, it could also mean a cut in funding. In 2004, Richard Feachem, Head of the Global Fund to Fight AIDS, Tuberculosis and Malaria, declared that official NACO HIV estimates were conservative and that “the HIV/AIDS epidemic in India is extremely grave… a ticking time-bomb” (14). UNAIDS has been accused, most notably by Dr James Chin, former Chief Epidemiologist of the Global Programme on AIDS, of consistently overestimating the HIV caseloads not just of India but of countries around the world, and of not being prompt enough to adopt more accurate methods of estimation. This, some say, has resulted in the AIDS programme getting a greater share of the limited funds available for international health. The Global Programme on AIDS had projected earlier that if the HIV epidemic was not contained early, the cost in human life and economic devastation would be on a massive scale. This could explain why the HIV epidemic was able to attract funding, though by many estimates, it is still far short of what is necessary.
The implications of low figures are two-fold, depending on how they are used. First, they could suggest that the problem is not as grave as was believed. Second, they could be used to argue that prevention programmes are making a difference, and therefore gain support for future funding.
My belief is that prevention programmes are working, at least to some extent, as shown by steep and sustained falls in HIV prevalence, especially among sex workers but also among pregnant women, in several states such as Tamil Nadu, Karnataka and Maharashtra.
At the end of the day, do we have a better sense of how AIDS has affected the country? Are the latest numbers more accurate? The answer: the latest calculations give us a better picture of the problem, but they too have their limitations. Maybe we shouldn’t take all these numbers too seriously.
(Dr M Prasanna Kumar, former Deputy Director of the Kerala State AIDS Control Society, is based in Thiruvananthapuram)
- Report on the Global AIDS Epidemic. Geneva 2006. Available at
- National AIDS Control Organisation. HIV/AIDS epidemiological surveillance and estimation report for the year 2005. New Delhi: Ministry of Health and Family Welfare, Government of India; 2006. Available at http://www.nacoonline.org/fnlapil06rprt.pdf
- Jim Fisher-Thompson. ‘CIA Expert Warns of Looming HIV/AIDS Threat in Africa, Asia. David Gordon bases dire predictions on “Next Wave” report’. Bureau of International Information Programmes. US Department of State. February 23, 2004. Available at
- Press release. Indian government releases final report for new national health survey. Demographic and Health Surveys. October 15, 2007. Available at http://www.measuredhs.com/aboutdhs/
- Government of India, Ministry of Health and Family Welfare. National Cancer Control Programme. Available at mohfw.nic.in/kk/95/i9/95i90e01.htm
- Gopi P G, Subramani R, Santha T, Chandrasekaran V, et al. ‘Estimation of burden of tuberculosis in India for the year 2000’. Indian Journal of Medical Research. September 2005
- National AIDS Control Organisation. HIV/AIDS estimates 2003. Available at http://www.nacoonline.org/facts_hivestimates.htm
- National AIDS Control Organisation. Observed HIV prevalence levels state-wise: 1998-2004. Available at
- UNAIDS briefing call. June 15, 2007. Available at
- 2006 report of the global AIDS epidemic. Executive summary. Available at http://data.unaids.org/pub/GlobalReport/2006/2006_GR-ExecutiveSummary_en.pdf
- Lalit Dandona, Vemu Lakshmi, Rakhi Dandona. ‘Is the HIV burden in India being overestimated?’ BMC Public Health. December 20, 2006; 6: 308
- Grassly N C, Morgan M, Walker N, Garnett G, Stanecki K A, Stover J, Brown T, Ghys P D. ‘Uncertainty in estimates of HIV/AIDS: the estimation and application of plausibility bounds’. Sex Transm Infect. August 2004. 80 Suppl 1:i31-38
- National AIDS Control Organisation. UNGASS India report, 2005. Available at
- N Gopal Raj. ‘A ticking time-bomb?’ The Hindu. December 5, 2004
InfoChange News & Features, January 2008