AOCOPM 2023 Midyear Educational Conference
259668 - Video 2
Video Transcription
It is my honor to introduce a distinguished veteran of two military services. First he made the mistake of going into the Army, corrected it, joined the Air Force, and never joined the Navy. Dr. Murray Berkowitz received his Doctor of Osteopathic Medicine degree from the College of Osteopathic Medicine and Surgery at what is now the Des Moines University Osteopathic Medical Center. He earned his Master of Public Health degree at the Johns Hopkins University Bloomberg School of Public Health in Baltimore, Maryland. Dr. Berkowitz is residency trained in general preventive medicine and public health. He is board certified in neuromusculoskeletal medicine and osteopathic manipulative medicine as well as preventive medicine and public health. And he holds the certificate of added qualification in occupational medicine. He earned his Master of Arts and Master of Science degrees from Columbia University and received his engineering degree at what is now Polytechnic School of Engineering of NYU. He also did advanced studies and research at the University of Texas at Arlington and George Mason University. Dr. Berkowitz is a graduate of the 2004-05 AOA Health Policy Fellowship. He is also currently a research fellow at the Osteopathic Research Center in Fort Worth, Texas. He earned his fellowship in the American Osteopathic College of Occupational and Preventive Medicine and is also a fellow of the American Academy of Osteopathy. All right, sir, the floor is yours. You have big shoes to fill after that last dynamic speaker, but you are just as dynamic a speaker.

Thank you. Thank you. Or at least I'm at least as loud. Good morning, all. I just want to say it's a privilege to be here. I am hoping to take this opportunity to present a review of the basics and give you some applications of them. Epidemiology is one of those areas that you study as you go through any of the preventive medicine specialties.
And as you are on your way to getting your MPH or equivalent degree portion in the academic year. Now, one of the things that a lot of people have trouble with is the esoterica that is there in this stuff. So let me proceed onward. I'm in private practice. Now, some of these slides are modified from some that I got from my mentor, [name unclear], who's an MD with a doctorate in public health and has done a lot of things in Israel. I have no actual or potential conflict of interest in relation to this activity or presentation. And I do not plan to discuss off-label use of drugs or other products. So, objectives. We've got a lot of them: be able to identify measures of disease frequency and excess risk; be able to apply these in the context of epidemiologic questions and problems; be able to describe the elements required for disease transmission; be able to describe the three types of prevention; be able to understand and apply epi curves to outbreak investigations. And basically to be able to understand and apply the calculation and application of screening test utilities, specifically morbidity and mortality measures, incidence, prevalence, attack rate, positive and negative predictive value, sensitivity, and specificity, and be able to identify measures of disease frequency and excess risk. Be able to understand and apply the calculation and application of odds ratios and relative risk, and be able to identify and apply the study designs to questions and problems in epidemiology. Obviously the big three are randomized clinical trials, cohort studies, and case-control studies. And finally, be able to identify and describe the factors involved in causation and identify the essential factor. So let us continue on. Here's an outbreak. Two major medical centers in suburban Washington, D.C., report approximately 110 new patients seen in ERs with GI problems, and they reported them to whom?
They report them to the State of Maryland Outbreak and Investigation Office. All the patients went to the hospitals approximately between 10 p.m. on Saturday and 2 a.m. Sunday, or as we say in the military, between 2200 and 0200. So what needs to be done here? Let's look at this in the framework of epidemiology with regard to disease control and health services programs. If you take the word epidemiology, epi means on or upon, demos comes from the word meaning people, and ology means study of. So, the WHO definition, and please note it's the W-H-O, it's not the Who. The Who is a rock group from my generation. Okay. Think of the rock opera Tommy. I digress. According to the WHO, epidemiology is the study of the distribution and determinants of health and disease in human populations to enable health services to be planned rationally, disease surveillance to be carried out, and preventive and control programs to be implemented and evaluated. And one can argue whether they fulfilled that in the recent pandemic. So, if you will, epidemiology is the study of how disease distributes in populations, and it also helps us to determine what factors influence or determine that particular distribution. How do we use epidemiology then? We try to elaborate the etiology, or the causes, of a disease. It supposedly helps us choose interventions to modify the natural history, and it evaluates the effectiveness of health services to provide the basis for clinical and public health policy in dealing with any kind of outbreak. So you get into a thing that Leon Gordis called epidemiologic reasoning. I was very fortunate to study under a lot of the really, really big people in epidemiology, especially at Hopkins. However, a lot of them, when presenting this stuff, made it very esoteric and not really understandable to those of us who are going to engage in public health and preventive medicine.
That's different than the folks who are going to do research into epidemiology. Now, in looking at this then, how do we determine whether or not there's a statistical association? And I'm saying association, not causation: an association between a factor or characteristic and the development of a disease. One way is by studying the characteristics of the group, and the other is studying the characteristics of the individuals who make up that group. It might seem like we're splitting hairs here, but in reality we're really not. What are the things that impact that? How does the overall grouping, or the concept of that group or population, subsume the characteristics of the individuals? Then we derive appropriate inferences regarding a possible causal relationship from the patterns of statistical association that have been found. Okay. How do we identify the groups at high risk of disease, and why do we want to do that? We want to do that in order to be able to study the factors associated with the increased risk, and also to direct preventive efforts and screening programs for early detection to an appropriate population. All right. So what we're really saying, the bottom line of all of this, is that you have to determine what it is that is to be studied, and how we then develop a screening program to screen for the characteristics that are of interest to us and that we would like to prevent. That leads us into a segue. So let's look at the history of deaths from cholera per 10,000 inhabitants by elevation of residence above sea level in London in the 1848-49 period. The source of this data is an article by Farr, published in 1852. If you look at this, you'd see that there's an inverse relationship, if you will, between the number of deaths from cholera and your elevation above sea level. All right. So the higher you are, the less likely you are to die of cholera.
At under 20 feet, it's the highest number: 102 per 10,000 inhabitants in London. At 340 to 360 feet, there were only eight. If you remember one of the common things of cholera, okay, then this gives credence to the old expression, shit rolls downhill. Okay. So if you look at this by water supply: for the Southwark and Vauxhall Company, the number of houses was 40,000 and change, and the number of deaths from cholera was 1,263. So the deaths per 10,000 houses was 315. Lambeth Company, 26,000 houses. When you go through it again, 37 per 10,000. And in the other districts in London, there were a little over a quarter of a million households. The number of deaths from cholera was over 1,400, but the number of deaths per 10,000 was 59. And this is from Snow. This is the famous Snow article in the early 1930s, or the mid-1930s, I should say. Lambeth Company is clearly where you wanted to live if you wanted to avoid cholera. Notice only 37 per 10,000 houses. All right. To review the three types of prevention, primary, secondary, and tertiary: primary is, how do we prevent it to begin with? We typically look at things in terms of immunization, or reducing exposure to a risk factor. Secondary prevention is early detection of existing disease to reduce the severity and the complications, so screening for cancer. And tertiary is then reducing the impact of the disease, so let's say rehabilitation from stroke. Taking the last one as an example, primary prevention would be preventing high blood pressure: diet, exercise, good health patterns. If that fails, then secondary prevention, if you will, would be controlling the elevated blood pressure that's been detected, in an effort to prevent the event in question. The tertiary prevention would be to reduce the impact of the stroke, so rehabilitation for somebody who's suffered a stroke. All right.
Looking at preventable deaths, I took this data from 1990 because it's well-documented and not arguable. Tobacco: the estimated number of deaths was 400,000, and this constituted 19% of the total deaths during that time period. When you look at diet and activity patterns, 300,000. Alcohol, 100,000. And as you go through all of these, tobacco, diet and activity patterns, alcohol, microbial agents, toxic agents, firearms, sexual behaviors, motor vehicle accidents, illicit use of drugs, you can clearly see that the bottom three are a very, very small percent of the total number of deaths. And let's say this is deaths of over a million people in 1990 in the United States. These are the biggest of them. So, looking at the epidemiologic concepts of transmission of disease, in describing a disease, these are the aspects. You've got a vector at the center, and a host, the environment, and an agent all playing on this stuff. One of the things that I would very often ask students of mine is, what is the host for this disease? Let's say malaria. What's the host for malaria? And most of the students would maybe say it was the mosquito. And it's not. Who's the host? It's us. It's people. Okay? The vector is then the mosquito. The environment could be things like swamps and what have you, humid areas where the Anopheles mosquitoes can grow and thrive. And the agent, then, is the falciparum microbe that actually is the causal agent, if you will. So looking at host factors, then, these are things that all relate to it. A person's age, a person's sex, race, religion, customs, occupation. What are their hereditary factors? Now, I know you can't ask about some of the genetic relationships, but there are things that you do that. Marital status. Married people tend to live longer than unmarried people. Now, the jokes notwithstanding. Little old Jewish men, classic joke.
They don't really live longer than... It just feels like it. Family background. And again, that's related to things like hereditary status, your marital status, your customs. Previous disease. What was your previous status? Was there any previous infection with this? Now, we all know that, as was mentioned in other places, sometimes you get a disease and you survive the disease. You then have lifelong immunity as a result of that. Measles, rubella, chickenpox, I should say. I mean, we can go along with any of those kinds of things. Things that used to be called, quote, usual childhood illnesses. We now prevent those by vaccinating kids at various early stages of their life, and we go on from there. So the agent factors: are they bacterial? Are they chemical, physical, like a cough or radiation? Are they nutritional, a lack of or an excess of nutrition? If you eat too much, you get heavy, you get fat, and obesity has its own attendant problems as well, I know. Environmental factors. The temperature, the humidity, altitude. Again, what are the situations like crowding in the population? Places like major cities, New York, LA, Chicago. Housing. What housing is available for all sorts of people? Or the lack of housing. The neighborhood. And again, that's a collection of all of the people living in that neighborhood. Very often, historically, when people immigrated into this country, they wanted to be near people of similar customs and backgrounds. So like in New York, or Boston, or Philadelphia, all of the Jewish folks congregated together. All of the Greeks congregated together. The Italians, the blacks, the Puerto Ricans. Everybody did that because there was a level of comfort, a shared set of customs, and so on. Other environmental factors include the prevalence of water, milk, and food, and radiation, air pollution, and noise. Hey, big cities have noise pollution, lots of it. Infectious disease process, the etiologic factors.
What constitutes a reservoir, or a place where that agent can basically reside? What's the portal of exit? How do things get out of the host and into the environment? How do things get transmitted? Again, a vector. And what's the portal of entry? How does something get into the new host? How does it get into the new person who gets the infection? And do you have a susceptible host? Having had the measles and having lifelong immunity, I can walk through a whole group of people who have active measles cases going on. I have no risk. I'm not a susceptible host. Take somebody who wasn't vaccinated for it, put them in there. They're very likely to become infected with measles. So the question now is, what's the difference between endemic, epidemic, and pandemic? Well, endemic just means that, okay, this is the stuff that's habitually present in a particular locale or particular type of population or subgroup. The epidemic is the occurrence of a disease that exceeds the level that we would expect just from it being endemic. In some cases, by the way, an epidemic is one case, okay? And a pandemic is just a worldwide epidemic, right? So we go, ooh, pandemic. Think of just an epidemic writ large, and you can take some of the fear of the word pandemic out of everything that's going on. The reservoir, we talked about it. Humans as hosts. Do we have an acute clinical case? Do we have people who are carriers, so-called? Think of the Typhoid Mary situation. They have inapparent infections, if you will, subclinical cases. They could also be incubatory. They might be going through the incubation stage that it takes before they break out with a fulminant case. Are there people in a convalescent stage, and are they able to transmit the infection? And then are there people who have a chronic form of whatever the illness is? Animals could be a reservoir. Or the agent could be free-living in the environment.
An example of the last one is tetanus. It just is in the environment, and you can get it from, of course, the rusty nail. And I don't know how Scotch and Drambuie got such a bad rep for being the cause of tetanus, but it's not a rusty nail. It's the environment and the other things surrounding that. Portal of escape. Typically, GU, respiratory, alimentary. Okay, you can puke it up. Skin, it can escape through our sweat. All right. Superficial lesions or percutaneous. And then there are others where it passes between mother and child. Herd immunity is something we heard about during the pandemic. Herd immunity formally is the resistance of a group to attack by a disease to which a large proportion of its members are already immune. That then lessens the likelihood that a susceptible person might come in contact with an illness and therefore break out with the disease. Okay, that's what they're talking about. Requirements. You have a disease agent restricted to a single host species. Relatively direct transmission from one member of the host species to another, like person to person. And the infections must induce solid immunity. Outbreaks occur only in randomly mixing populations. Those are the formal requirements for herd immunity. The question now becomes, what's an epidemic curve? That's the distribution of the times of onset of a disease. Okay. So typically, in a single-exposure, common-vehicle epidemic, the epidemic curve represents the distribution of the incubation periods. Let's take an example of diarrheal illness experienced by passengers aboard the MS Sun Viking, by their day of onset, during the period of September 11th through 30th in 1976. As you can see, on September 11th, it was virtually nothing. The main part of this was about the 15th, and then on the 16th it began to taper off and so on. And then there's another little blip later on.
So basically, this is your classic epidemic curve, if you will, for a point source outbreak. One could surmise, just by looking at this frequency plot, that the incubation period is about four to five days. And it could range anywhere from approximately two and a half or three days to about seven. Okay. And that gives you the idea of what this is. So looking at this by levels of exposure, you can see that most people break out with whatever this thing is by the fourth day. A large number have already broken out with it by day three. And then it tapers off very, very strongly right after that, a day later. So for this one, you would say the incubation period, looking at the epidemic curve, would simply be between three and four days. Again, that's basically the interval from the receipt of the infection to the time of the illness being fulminant in a clinical outbreak situation. So what do you have? You have to look at it in terms of the time of disease onset, the time of the exposure, and the incubation period, with a susceptible host being necessary. What are their general factors of resistance? Now, in looking at the question of specific acquired immunity, we have natural immunity, which is either active, by an antibody response because you got the disease, or passive, transplacental or non-specific, which could be between mom and child in the womb. Then artificial, which would be active, as in immunization or vaccination, or passive, things like breastfeeding or gamma globulin shots, GG shots. Let's now talk about some of the other ends of things, and that is measures of morbidity. Remember, morbidity is disease. Mortality is death. So we need to go on with that.
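Going back to the point source epidemic curve for a moment, here is a minimal Python sketch of how onset dates turn into an epidemic curve and a rough incubation estimate. The exposure date and the onset counts are invented for illustration; they are not the ship outbreak data from the slide.

```python
from collections import Counter
from datetime import date
from statistics import median

# Hypothetical point-source outbreak: one shared exposure, then onsets by day.
exposure = date(2023, 3, 4)
onsets = (
    [date(2023, 3, 6)] * 2
    + [date(2023, 3, 7)] * 9
    + [date(2023, 3, 8)] * 14
    + [date(2023, 3, 9)] * 5
    + [date(2023, 3, 10)] * 1
)

# The epidemic curve is the count of new cases by time of onset.
curve = Counter(onsets)
for day in sorted(curve):
    print(day.isoformat(), "#" * curve[day])

# With a single known exposure, onset minus exposure is the incubation period.
incubation_days = [(d - exposure).days for d in onsets]
peak_day, peak_cases = max(curve.items(), key=lambda item: item[1])

print("peak day:", peak_day.isoformat(), "cases:", peak_cases)
print("incubation range (days):", min(incubation_days), "to", max(incubation_days))
print("median incubation (days):", median(incubation_days))
```

In a real investigation the exposure time is usually what you are trying to find, so you would work backward from the peak using the known incubation period of the suspected agent.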
Now, in general, just to review with you, formally, a rate is the probability of an occurrence of some particular event, and it's typically expressed as X over Y times K. X is the number of events or cases; Y is the total population at risk, coming back to that notion; and K is the base number, like per hundred, per thousand, per 10,000, per hundred thousand, per million, and so on. The conditions, the time, the place, and the population, must be specified for every type of rate. Okay? So, morbidity rates measure the rate of illness, mortality rates measure the rate of death, and natality rates measure the birth rate. We've got a number of things that we're going to look at. I'm going to get kind of specific here to lay some groundwork on this. The incidence rate is, and let's define it per 1,000, the number of new cases of a disease occurring in a population during a specified period of time, divided by the number of people at risk of developing the disease during that period, times 1,000, because we're doing this rate per thousand. So, what's the risk of becoming sick if you're healthy? It's similar to, but not identical to, an attack rate, and it's used with acute diseases. Why? Because you're going to have an incident, and you're going to get better from it, or not, but you're not going to be afflicted with the disease forever. Again, let's go back relatively recently, or even 100 years ago. There were no vaccinations, people came down with the measles, and they either got better or they died from it. They didn't have chronic measles. The prevalence is the number of cases of a disease present in the population at a specified point in time, divided by the number of people in the population at that specified point in time. There's no implication of risk in prevalence.
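The X over Y times K definition, and the difference between the incidence and prevalence denominators, can be sketched in a few lines of Python. This is only an illustration: the function name `rate` and all of the town figures are hypothetical, not data from the talk.

```python
def rate(events: int, population: int, base: int = 1_000) -> float:
    """Return X / Y * K: the number of events per `base` people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events / population * base


# Hypothetical town over one calendar year (all figures invented).
new_cases = 30            # new cases that arose during the year
at_risk = 12_000          # people free of the disease who could develop it
existing_cases = 240      # everyone who has the disease on the survey date
total_population = 12_500

# Incidence: new cases over the population at risk (carries a notion of risk).
incidence_per_1000 = rate(new_cases, at_risk)

# Point prevalence: existing cases over everyone, at one point in time.
prevalence_per_1000 = rate(existing_cases, total_population)

print(f"incidence:  {incidence_per_1000:.1f} per 1,000 per year")
print(f"prevalence: {prevalence_per_1000:.1f} per 1,000")
```

Keeping K as a parameter makes the base explicit, which guards against the easy mistake of defining a rate per 100,000 and then multiplying by 1,000.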
It's used with chronic diseases and has no meaning for acute diseases. As I said, there's no such thing as chronic measles. You get better in a week and a half, or a week, or you don't, okay? And we use prevalence for planning purposes, for the most part. So, here's that point where we can transition from how we measure certain things to how we formulate programs based upon that data. So, what are the types of prevalence? There's point prevalence. Here, the time isn't specified. It's the most common usage of the word. Then period prevalence, and cumulative or lifetime prevalence. So, let's give examples of these. Do you currently have asthma? There you go, that's point prevalence. Go back to the MEPS thing. It's different than, have you had asthma in the last five years, 10 years, whatever, which is period prevalence, versus cumulative or lifetime prevalence: have you ever had asthma? All right. Now, when I entered the service, one of the biggies was, have you ever had hay fever? And the magic cutoff was 12 years old. If you didn't have hay fever or allergic rhinitis at any time after 12, you were good to go. It did not matter. You were cleared for just everything. Now, what factors influence the observed prevalence rates? One, you can increase the prevalence by having a longer duration of the disease. So, if you live longer with cancer, you're going to have more prevalence of cancer. A prolongation of the life of patients without a cure, which is just the example I gave you a moment ago. An increase in new cases, so an increase in the incidence. If I have more cases of something coming into the population, then I'm increasing the prevalence. One of the questions is, am I also bringing in or migrating in cases from other areas? One of the concerns in recent times has been, are some of the people immigrating to the United States bringing with them COVID or other kinds of disease?
What about people who are healthy who leave the population or move out of the area of concern, and therefore are no longer in the denominator? That's now going to increase the overall prevalence. Also, improving diagnostic facilities, better reporting. Do we really have more cases of people needing glasses, or do we just have a better way of detecting problems in people with vision issues and therefore can give them glasses to correct the problem? On the opposite side of this, prevalence is decreased by a shorter duration of the disease. A higher case fatality rate, if people die from the disease sooner, okay. A decrease in new cases: I find a way to prevent them. So, I stop new cases. I lower the incidence. Migration of healthy people from other areas into the population that I'm concerned with. So, say I'm the health officer for Baltimore County, which is different than Baltimore City. And I get people coming in from, let's say, Anne Arundel County. That changes the dynamic if the people from Anne Arundel aren't afflicted with the same kind of illness as the people in Baltimore County. Migration out of cases, people leaving the area who happen to be sick with it, or being cured. It's great. Let's cure you of whatever the disease happens to be. So, the problem with numerators is this variability in defining an illness. Are they sick or disabled, et cetera? How do we define that? And how do we handle reporting of a specific disease? Where do we report? How frequently do we report? What medical records are available to us for review?
And with the advent of, and the increasing flexibility and coordination between, EHRs and EMRs, there is an ability to review records more readily than when we had only paper records and you had to be able to access them from whoever the current possessor of those records was. The other is using specially planned data collection, like interviews and direct examination of people. We get their labs. We get the physical exam that I perform at this particular date and time on individual John Smith. So the quality of the interview data matters, though. Then there's the problem with the denominator: how do I define who's the population at risk? Prevalence is the incidence times the duration of the disease, for diseases that are in a steady state. For things that are acute, no, that's not going to be an applicable formula. Hypothetical example. Let's have a chest x-ray screening. Let's look at the first situation. I've got a screened population of 1,000 people in Hightown and 1,000 people in Lowtown. And the number of people who have been screened and found to have a positive finding on a chest x-ray was 100 in Hightown and 60 in Lowtown. Where do you want to live? So, the question is, what's the duration of the disease? What's the risk here of acquiring whatever this illness is? Well, in Hightown, the number of people with a positive x-ray was 100, and the point prevalence was 100 per 1,000. In Lowtown, it was 60, so 60 per 1,000. Okay. So, you can see how we can now apply some of these data to begin determining certain kinds of rates, to determine how big of a problem something is. Now, let's look at another issue with the same data. Let's look at the point prevalence again. We've got 100 people per 1,000 in Hightown, 60 in Lowtown. But in Hightown, the incidence is four per year. Well, that says that those people, if they're afflicted with this illness, live an average of 25 years with this illness.
Whereas for the people in Lowtown, who have a point prevalence of 60, if their incidence is 20 per year, the average duration is three years. So, the question is, where do you want to live: in Lowtown, where there are fewer people with this disease, or in Hightown? Well, now that you have a little more data, you say, well, my possibility of acquiring the disease in Hightown is only four versus 20, so it's five times as much in Lowtown. And if I acquire it, gee, I'm going to live 25 years with it in Hightown. I'm only going to live three years in Lowtown. Now, we can all go to the viewpoint of, oh, this must be a really, really bad disease. Or maybe it's a bad disease, and the people in Lowtown get it more, but what happens to them is they get cured quicker. So, it changes how you interpret the data. The same numbers, the same data, can be used in different ways depending upon what the circumstances are. Remember, the numbers are just that, they're numbers. It depends upon the circumstance, okay? So, we like to do surveillance. And I think it was in 1986, when the CDC was not as controversial an organization as it has been in recent days, that it defined epidemiologic surveillance as the ongoing systematic collection, analysis, and interpretation of health data essential to the planning, implementation, and evaluation of public health practice, as well as the timely dissemination of these data to those who need to know, like public health officers out in the communities, okay? The WHO, again, has similar verbiage. And here is a list of examples of health events that are very often under surveillance in a community at different times, okay? So, in Maryland, when I was doing things and I was at Hopkins, the doctor had 24 to 48 hours to report something to the local health officer or the local health office, if it was one of the diseases that was considered reportable.
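Going back to the Hightown and Lowtown example, the steady-state relationship (prevalence equals incidence times duration) can be rearranged to recover the average duration. The short Python sketch below uses the figures quoted in the talk and assumes the incidence figures are per 1,000 per year, which the talk implies but does not state outright; the function name is made up for the example.

```python
def average_duration_years(prevalence_per_1000: float,
                           incidence_per_1000_per_year: float) -> float:
    """Steady state: prevalence = incidence x duration, so duration = P / I."""
    return prevalence_per_1000 / incidence_per_1000_per_year


# Figures from the talk; the per-1,000 base for incidence is an assumption.
towns = {
    "Hightown": {"prevalence": 100, "incidence": 4},
    "Lowtown": {"prevalence": 60, "incidence": 20},
}

durations = {
    name: average_duration_years(t["prevalence"], t["incidence"])
    for name, t in towns.items()
}

for name, years in durations.items():
    print(f"{name}: average duration {years:.0f} years")

# Relative chance of acquiring the disease, Lowtown versus Hightown.
risk_ratio = towns["Lowtown"]["incidence"] / towns["Hightown"]["incidence"]
print(f"Lowtown incidence is {risk_ratio:.0f} times Hightown's")
```

The town with the higher prevalence turns out to have the lower risk of getting the disease, which is why prevalence alone should not be read as risk.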
The health office at the community level, of which there was one in each county and one in the city of Baltimore, for a total of 24 health offices in the state of Maryland, then had 24 hours to get it up to the outbreak investigations and epidemiology office at the state of Maryland's Department of Health, or Department of Health and Mental Hygiene, as it was then called, DHMH. Now, it's just simply the Department of Health. Passive surveillance: the advantages are that it's relatively cheap and relatively easy to develop, and the disadvantages are that you have a greater likelihood of under-reporting, and it's very easy to miss very, very small, local outbreaks if you're doing passive. Active surveillance: the advantages are that you get more accurate reporting and the local outbreaks are identified; the disadvantages are that it's more expensive and difficult to develop. So, measures of disease occurrence are things like, what are the rates? How fast is the disease occurring? What are the number of events? For example, new cases divided by the population at the time. Proportions are what proportion of the population is affected: the number of people affected divided by the total population. Now, inherent in this is the notion that, in a proportion, the numerator is part of the denominator. So, clearly, the number of people affected with an illness are also part of the total number of people in that population. So, that's a proportion. This is not like an incidence rate, where the denominator is just the people who have a chance of developing the disease. So, let's talk about different rates of mortality, or measures of mortality. Mortality rate, very simply, is the total number of deaths from all causes in one year divided by the number of people in the population at mid-year. Why? We take the population at mid-year as an average of the population.
Some people are going to be born at the beginning of the year, people are going to come into the population all throughout the year, and people who start out there are going to die throughout the year, from old age, disease, etc. So, we typically take it at the midpoint of our time period of interest. If we're looking at the year as the time period in question, then we look at it at mid-year. This is a crude death rate, okay. Age-specific mortality would be, what's the annual mortality rate from all causes for children under 10, per 1,000 population? So the numerator here is now the deaths from all causes in children under the age of 10. If they're 10 or above, 11, 12, etc., they're not in the numerator. That's divided by the number of children in the population who are also under 10. And if they're 10, 11, 12, 18, 20, whatever, they're not included in that denominator either. Times the 1,000, which was our base population of 1,000. Again, an example here: what's the annual mortality rate from lung cancer per 1,000? The number of deaths from lung cancer in one year, divided by the number of people in the population at mid-year; again, a crude death rate. What's the annual mortality rate from leukemia for children under 10? So the children now had to die from leukemia in that one year, and they had to be under 10 years old. Again, the denominator is the number of children in the population under 10 at mid-year. Now, case fatality rate is a percent. It's the number of individuals dying during a specified period of time after disease onset or diagnosis, divided by the number of people with that specific disease. And this gives you the rates for acute diseases. So for, let's say, COVID, it would be the number of individuals who were diagnosed with COVID, as verified, in the period in question, and who died from it, attributed to it and not something else, divided by the number of individuals who have that disease in that particular population.
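The three mortality measures just described differ only in which deaths go in the numerator and which people go in the denominator. A minimal Python sketch follows; every count in it is hypothetical and chosen only to make the arithmetic easy to follow.

```python
def per_base(numerator: int, denominator: int, base: int) -> float:
    """Generic numerator / denominator * base."""
    return numerator / denominator * base


# Hypothetical community, one calendar year (all counts invented).
deaths_all_causes = 850
midyear_population = 100_000

deaths_under_10 = 12            # all causes, children under 10 only
midyear_children_under_10 = 15_000

cases_of_disease = 600          # people diagnosed with a specific disease
deaths_from_disease = 18        # deaths attributed to that disease

# Crude death rate: all deaths over the whole mid-year population.
crude_rate = per_base(deaths_all_causes, midyear_population, 1_000)

# Age-specific rate: numerator AND denominator restricted to under-10s.
age_specific_rate = per_base(deaths_under_10, midyear_children_under_10, 1_000)

# Case fatality: deaths among those who have the disease, as a percent.
case_fatality_pct = per_base(deaths_from_disease, cases_of_disease, 100)

print(f"crude death rate:        {crude_rate:.1f} per 1,000")
print(f"under-10 mortality rate: {age_specific_rate:.1f} per 1,000")
print(f"case fatality:           {case_fatality_pct:.1f}%")
```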
Then there's now a standardized mortality ratio. Now, typically, this was done, originally, talk about occupational medicine, for people in the work environment. So it's the number of deaths occurring among men, originally, age 20 to 64, because 20 was considered you were working, and 64, you were going to retire at 65, if you lived that long, in a given occupation, expressed as a percentage of the number of deaths that might have been expected to occur if the given occupation had experienced, within each age group, the same death rates as the standard population. So the question becomes, okay, what's observed divided by what's expected? Okay, and this slide shows you exactly what I was talking about. So what are the advantages of crude, specific, and adjusted rates? So in a crude rate, you've got the actual summary rates. It's readily calculable for international studies. It's widely used, despite some limitations that they have. Specific rates, you're talking about homogeneous groups, children under 10, population of work age, men between 20 and 64, etc. Okay. Detailed rates are useful for epidemiologic and public health purposes. An adjusted rate now is a summary statement. It's when we take something, we say we're adjusting for age. Now the differences in the compositions of the groups are removed because of this, let's say, adjustment for age. And what ideally this does is permit an unbiased kind of comparison. Disadvantages are that with the crude rates, since the populations vary in age composition, differences are difficult to interpret. Specific rates, it's cumbersome to compare the many subgroups of two or more populations. And adjusted rates are, let's call it what it is. It's something that we did mathematically. It's a made up thing. Doesn't really exist. Okay. The absolute magnitude is dependent on the standard population chosen. 
And opposing trends in subgroups are masked because they've been eliminated from being visualized because we've adjusted by age, let's say. Evaluating diagnostic and screening tests: validity, predictive value, and reliability or repeatability. Well, validity is the ability of a test to indicate which individuals have the disease and which do not. And part of that is sensitivity and specificity. And these two terms are bandied about and misused by lots of physicians, never mind other health professionals. Sensitivity is the question of what's the ability of a test to correctly identify those who have the disease. Specificity is the ability of the test to correctly identify those who do not have the disease. Okay. So the concept of sensitivity and specificity is applied regularly through screening. So let's use an example here. Assume you have a population of 1,000 people of whom 100 people have a disease and 900 do not have the disease. So what we're going to do is we're developing some sort of a test or we have some sort of a test that screens to identify the 100 people in the population with the disease. Well, the way in which you set up a classic two-by-two table is important. And you need to understand this because very often the summary data graphs or tables will depict it in an orthogonal way. In other words, it won't have the data in the form you want. So the question here is the disease versus no disease, and that's put at the top of the table. Classically, you put the people with the disease on the left, people who don't have the disease on the right. For the test or the intervention in question, the rows are those who test positive and those who test negative, those who are detected and those who aren't detected. So in looking at this, we have the true characteristics of the population, people who have the disease and no disease. And if you look at this table – whoops, sorry, guys. If you look at – okay, I got a fat finger error here, guys. 
I was trying to get the area here. If you look over to the left on this table, of the 100 people who have the disease, 80 people test positive and 20 people test negative for the disease. Similarly, of the 900 people who do not have the disease, this test says 100 people have the disease and 800 don't. Okay. So what happens is the test says 180 people tested positive and 820 people tested negative for the illness in question. If you look at the sensitivity, it's 80 divided by 100, so the number of people who have the disease and test positive divided by the total number with the disease, and 80 percent is the sensitivity. And for specificity, it would be the 800 people who don't have the disease and test negative, divided by the 900 people who really don't have it, or 800 divided by 900, or about 89 percent. If you will, the people who have the disease and test positive are true positives. The people who don't have the disease and test negative are true negatives. People in the bottom left-hand corner are the people who test negative and really do have the disease. They are the false negatives. Then you have the people who don't have the disease and test positive. They're the false positives because they really don't have the disease. Okay. If you will, sensitivity can be written by the formula, the number of true positives divided by the sum of the true positives and false negatives. The specificity then is the true negatives divided by the sum of the true negatives plus the false positives. Okay. If you will, we can look at this by making positive. We could bring this down. Here's a more general two-by-two. The people who are positive with the disease are labeled A, the true positives. Those who are positive without the disease are B, the false positives. Those in C are the false negatives, they really have the disease but test negative. D is the people who are true negatives, who don't have the disease and test negative. Sensitivity then could be A over A plus C, and specificity is D over B plus D. 
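As a quick check of the arithmetic, here is a minimal Python sketch of the two-by-two table from the example (1,000 people, 100 with the disease). The cell letters follow the general table described above (A and C with the disease, B and D without); the function names are just illustrative.

```python
def sensitivity(a: int, c: int) -> float:
    """True positives (A) over everyone who has the disease (A + C)."""
    return a / (a + c)


def specificity(d: int, b: int) -> float:
    """True negatives (D) over everyone who does not have the disease (B + D)."""
    return d / (b + d)


# Cells of the example table: columns are disease / no disease,
# rows are test positive / test negative.
a = 80    # disease, test positive  (true positives)
b = 100   # no disease, test positive (false positives)
c = 20    # disease, test negative  (false negatives)
d = 800   # no disease, test negative (true negatives)

tested_positive = a + b  # 180
tested_negative = c + d  # 820

print(f"Sensitivity: {sensitivity(a, c):.0%}")
print(f"Specificity: {specificity(d, b):.0%}")
```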
Now, one of the big questions comes down to, what about sequencing tests in two stages? For example, you get the number of people who are going to be checked for diabetes during pregnancy. Basically, what we would do is we would do this test in two stages. Initially, you perform a blood glucose, and then those who test positive under the blood glucose, we might want to go ahead and then give them a more precise test, the glucose tolerance test, if they're positive. Let's assume that we have a disease prevalence of 5% of the population, and the population is 10,000 people. The blood glucose, let's say that test has a sensitivity of 70% with a specificity of 80%. Taking that 10,000 population and looking at this and breaking this up, 5% says 500 people have the disease. That means 9,500 don't, okay? Now, for those who test positive, at 70% sensitivity, I do 500 times 70% or 0.7, and I get 350 people who will test positive, should test positive, and of that 500, 150 people don't. Similarly, of the 9,500 people who don't have the disease or don't have diabetes, the specificity is 80%, so 9,500 times 0.8 gives me 7,600 who test negative. That difference is 1,900, who test positive even though they don't have the disease. Who am I going to test? I'm going to test only the top row for glucose tolerance. I'm only going to put that top row through that. What happens is, to do this, what I do is take that 350 true positives and 1,900 false positives, 2,250 people, and now I put them here. Now, the glucose tolerance test here, let's assume that it's 90% specificity and 90% sensitivity. 90% sensitivity says that, of the 350 who have the disease, 315 test positive and 35 people don't. They are basically people who have the disease but have tested negative. Then in the other case, the specificity is 90% of the 1,900, so that works out to 1,710. The net specificity is then 7,600 plus 1,710, because that's from both tests, divided by the entire 9,500 originally. That gives me about a 98% overall net specificity. 
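The two-stage walk-through above can be reproduced with a short Python sketch. It uses the numbers given in the talk (10,000 people, 5% prevalence, a first test at 70% sensitivity and 80% specificity, a second test at 90% and 90%); the helper name `apply_test` and the rounding to whole people are assumptions made for this illustration.

```python
def apply_test(diseased: int, healthy: int, sens: float, spec: float):
    """Split a group into (true_pos, false_neg, true_neg, false_pos) counts."""
    true_pos = round(diseased * sens)
    false_neg = diseased - true_pos
    true_neg = round(healthy * spec)
    false_pos = healthy - true_neg
    return true_pos, false_neg, true_neg, false_pos


population = 10_000
diseased = round(population * 0.05)  # 500 with diabetes
healthy = population - diseased      # 9,500 without

# Stage 1: blood glucose on everyone.
tp1, fn1, tn1, fp1 = apply_test(diseased, healthy, sens=0.70, spec=0.80)

# Stage 2: glucose tolerance test only on those positive at stage 1.
tp2, fn2, tn2, fp2 = apply_test(tp1, fp1, sens=0.90, spec=0.90)

# Net values are taken against the ORIGINAL diseased and healthy totals.
net_sensitivity = tp2 / diseased
net_specificity = (tn1 + tn2) / healthy

print(f"Stage 1 positives retested: {tp1 + fp1}")
print(f"Net sensitivity: {net_sensitivity:.0%}")
print(f"Net specificity: {net_specificity:.0%}")
```

The true negatives from both stages are added together because anyone negative on either test is counted as negative overall.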
The net sensitivity is going to drop, however, because it's 315 after the two tests over the original 500, and that's 63%. Now, we all learned the shortcut, or you should have all learned the shortcut. For net sensitivity of two sequential tests, it's the sensitivity of the first times the sensitivity of the second. You get the answer, 0.7 times 0.9, 0.63, boom, you're done. You don't have to go through this gyration. Unfortunately, for net specificity there isn't a nice simple formula. There is a formula, but it's really, really convoluted and complex, and it takes less time to do this out longhand. However, we use computers, and they do it for us. So, sequential testing, people, again, the sensitivity decreases, as I mentioned. Specificity increases. People with negative results on either or both tests are considered negative, and we ordinarily don't perform a second test if the first test is negative. There are numerous exceptions, or there are a number of exceptions. I'm going to say numerous, a number of exceptions to that, however. So, simultaneous testing, though, is where you're going to do both tests concurrently. The sensitivity here increases, and the specificity decreases when you do that. And here, you ordinarily don't perform a second test if the first test is positive. So, one positive, and you're done. Okay? Predictive values. There's a positive and a negative predictive value. Now, this is different. This is the question of, what's the probability that an individual patient who tests positive truly has the disease? Now, going back to the situation here, what you're doing is, and I'm going to jump ahead to the numbers, what I'm looking at is, I had 80 people who tested positive and had the disease, but 100 people who tested positive didn't have the disease, out of 180 who tested positive. So, the positive predictive value is 80 divided by 180, which is about 44%, or a little under half. However, the negative predictive value is the probability that somebody who's testing negative doesn't have the disease. 
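For the same 1,000-person table used earlier (80 true positives, 100 false positives, 20 false negatives, 800 true negatives), the predictive values can be worked out with a small Python sketch. The function names are illustrative only.

```python
def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Of everyone who tests positive, the fraction who truly have the disease."""
    return true_pos / (true_pos + false_pos)


def negative_predictive_value(true_neg: int, false_neg: int) -> float:
    """Of everyone who tests negative, the fraction who truly do not have it."""
    return true_neg / (true_neg + false_neg)


true_pos, false_pos = 80, 100   # the 180 people who tested positive
false_neg, true_neg = 20, 800   # the 820 people who tested negative

ppv = positive_predictive_value(true_pos, false_pos)
npv = negative_predictive_value(true_neg, false_neg)

print(f"Positive predictive value: {ppv:.1%}")
print(f"Negative predictive value: {npv:.1%}")
```

Unlike sensitivity and specificity, these are read across the rows of the table (everyone with a given test result), which is why they answer the individual patient's question.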
If I tell you, John Smith, hey, here's the negative predictive value for it, well, yeah, you tested positive, or rather, you tested negative. Well, the reality here is that, from the table, 800 of the 820 people who tested negative really don't have the disease, so you can be about 98% assured you really don't have that disease. Okay? So, there's a difference between sensitivity and specificity, which speaks to the test, versus the predictive value for how accurate might that test be in determining person A has it or doesn't have it. So, the positive predictive value of a test depends primarily on the prevalence of the disease in the population being tested. So, if the prevalence is higher, the number of people who test positive and truly have the disease is going to increase, and the positive predictive value goes up with it. The specificity and the sensitivity of that test also affect both the positive and negative predictive values. Okay? So, predictive values depend on the prevalence as well as on the specificity and sensitivity of the test. Let's talk a little bit about reliability, and then I'll probably give you a break here. Reliability is the repeatability of a test. Also, what's the inter-observer agreement? So, here, let's say I had two people that are taking an individual's blood pressure. What's the reliability of this test? So, the reading could be an abnormal blood pressure, a suspect blood pressure, doubtful or normal blood pressure, let's say. Well, I've got two individuals who are making this determination. The percent agreement on this is actually the diagonal, if you will, of this matrix. It's the readings where both observers agree, abnormal and abnormal, suspect and suspect, doubtful and doubtful, and then the normal compared with the normal, divided by the total number of readings, times 100. Okay? So, in a two-by-two, you typically ignore the D value, the cell where both observers say negative, so it's A over A plus B plus C on looking at this. Reliability has two main measures, the kappa statistic and the correlation coefficient. 
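Percent agreement and the kappa statistic can be sketched for two observers rating the same readings into the four categories mentioned (abnormal, suspect, doubtful, normal). The talk does not give a formula for kappa; the sketch below uses the standard Cohen's kappa, (observed agreement minus chance agreement) divided by (1 minus chance agreement), and the counts in the table are entirely hypothetical.

```python
CATEGORIES = ["abnormal", "suspect", "doubtful", "normal"]

# Hypothetical counts: rows are observer 1's reading, columns are observer 2's.
table = [
    [20, 5, 2, 1],
    [4, 15, 5, 2],
    [1, 4, 10, 6],
    [0, 2, 3, 20],
]


def percent_agreement(t):
    """Readings on the diagonal (both observers agree) over all readings, times 100."""
    total = sum(sum(row) for row in t)
    agreed = sum(t[i][i] for i in range(len(t)))
    return agreed * 100 / total


def cohen_kappa(t):
    """Agreement beyond chance: (observed - expected) / (1 - expected)."""
    total = sum(sum(row) for row in t)
    observed = sum(t[i][i] for i in range(len(t))) / total
    row_totals = [sum(row) for row in t]
    col_totals = [sum(row[j] for row in t) for j in range(len(t))]
    # Chance agreement: product of the two observers' marginal proportions.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return (observed - expected) / (1 - expected)


print(f"Percent agreement: {percent_agreement(table):.0f}%")
print(f"Kappa: {cohen_kappa(table):.2f}")
```

Kappa comes out lower than the raw percent agreement because some of the agreement on the diagonal would be expected by chance alone.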
For the correlation, the general relationship is the between-subject variability divided by the sum of the between-subject variability plus the within-subject variability. So, the estimate of reliability is always a number between zero and one. Zero says it's not reliable at all. One says you have perfect matches, 100%. In general, it's divided into fifths. So, above 0.80, it's excellent agreement. 0.61 to 0.80, good agreement. 0.41 to 0.60 is moderate. And then 0.21 to 0.40 is fair. And then it's poor agreement below 0.21. Okay? So, that's where a lot of this kind of stuff comes from. Let's take a break here for just physiologic purposes or a stretch, and we'll come back in, and I'll pick up with talking about study designs. This is a great lecture. This is one that should be given all the time, because this is the stuff that we need to review. You don't forget it. And then we'll do this next time. Well, hopefully, this will be out there. I've got another one that's going to be shorter. That's just going to talk about how to calculate. I'm going to do things where I do calculations, and I just refine that down to any calculable, so that people will be able to go back and review. That's one thing. This is a few basics. I'm glad that you're getting this recorded, because that means that people can review it. I'm going to have to go, because I've got another thing coming up. I just want to say thanks again. All right. It's great seeing you, as always. Thank you. You got to split. Okay. I'm getting that. I'm hungry. I'm going to see you later. I'll talk to you. Take care now, guys. We'll see you, too. All righty. Safe travel to both of you. Yeah. Okay, ladies and gentlemen, I am back. Hopefully you guys are still with me, but I'm going to continue onward. All right, looking at study designs, as you know, talking about study designs, there are a number of them, randomized clinical trials, cohort studies, case control, and case reports or case series. 
And in looking at the case control, you know, in looking at any of these, let's go back to our classic two-by-two table for epidemiologic studies. And you now look at the disease versus an exposure. So the disease is either present or absent, positive or negative. You have the disease, you don't have the disease. The question is now, instead of an intervention, were you exposed or were you not exposed to the disease? Okay, and we can move on with this. So why do trials? Clinical randomized trials only started in 1949. Okay, hear me again. Randomized clinical or randomized control trials only began in medicine in 1949. So why do we want to use them? Evaluate new forms of therapy and prevention, new drugs from other treatments, new medical or healthcare technology, new methods of primary prevention, new methods, new programs of screening and early detection of a disease, new ways of organizing and delivering health services, and measuring the impact of policies on healthcare and healthcare financing. And a lot of people need that. So in looking at clinical trials, look at the design. The design comes from some defined population, and we randomize a sample from that population. And they randomize to either getting the new treatment or the current treatment or the placebo or whatever you're comparing it to. And then looking at the new treatment, you're looking at, is it improved or not? Is the current treatment, has it shown improvement or no improvement of the underlying circumstance? So in other words, basically it's therapy versus not or therapy A versus therapy B. You can end up with either natural or non-experimental studies or experimental or randomized studies. And Stanley Campbell called all of these things or quasi-experimental studies or things like cohort and case control trials. So you're going to hear this by a rather large number and diverging points of definition. So in looking at the types of randomized clinical trials, very often it's superiority. 
Is one drug or treatment better than another? Are they equivalent? In other words, is giving AZT for eight weeks pre-partum, does it result in a better outcome or not than giving it for 16 weeks, let's say? You know, does giving AZT for four days pre-partum result in better outcomes or is it not as good as the eight-week time frame? Okay. Elements in the design of clinical control trials are who gets selected as a subject? And remember, they are no longer patients once they enter into a trial. They're subjects. How do I allocate the subjects to the treatment groups? Well, the idea is between randomization or sometimes stratification and randomization. Okay. Data collection on the subjects, and what we want to do is masking or blinding of the subjects. And the reason that I said this as masking is I was talking…we were talking about double-blind placebo-controlled trials, and my colleague was involved in doing double-blind placebo-controlled trials, but on ophthalmology issues. And in ophthalmology, the word blind has an entirely different connotation. So hence, I used the word masking in deference to Bob, and thank you very much, Bob. So what you're doing here is we're masking or blinding the subjects, the people who are getting the treatment or the placebo or whichever, on the data collectors and on the analysts. And what are we trying to do? We're trying to measure the outcomes, the need for explicit criteria. How do we know we got what the results are going to be? We need to be able to comparatively measure things in both groups, so they have to be the same kind of measurements. So why randomization? The idea behind it is to assure properly related or treated groups that we're comparing between the treated groups and the comparison groups. This is to lessen the probability of known potential biasing factors and also unknown potential biasing factors that we might not have thought of but are somehow into the system. So when is masking necessary? 
For the participants, when reporting of the outcome might be subject to bias. The observers, when recording of the outcome might be subject to bias. And the data analysts, when the analysis or interpretation might be subject to bias. Notice I didn't talk about the types of bias. It's immaterial to the type of bias because we're trying to reduce or mitigate the effects of any types of bias. So a single-blind or single-masking trial is one where the patients are the only folks who do not know the drug they're taking. In a double-blind, both the patients don't know what drug they're taking and those giving them the drugs, so typically the physicians, don't know what drug they're giving in looking at this. Now, one of the things that's there and needs to be studied more is, what's the placebo effect? We have found that just giving people something has a positive effect of some sort, often over doing nothing. And therefore, that's known as the placebo effect. But we've never really adequately studied it. There have been a lot of attempts to study. I wouldn't say a lot. There have been attempts to study the placebo effect. But I think what we need to do is still try to operationalize what we mean by that. In clinical evaluation of drugs, phase one studies are basically clinical pharmacology. You determine the level of toxicity and pharmacologic effect. Usually, you'll have a small number of patients, like maybe 20 to 80 people on the drug. You're just looking to see, do bad things happen when people are on it? The phase two studies, which is the big one that a lot of people talk about, come into play. This is what's called clinical investigation. Now, you're doing controlled clinical trials here for effectiveness and relative safety. And here, you typically are going to put anywhere between 100 and 200 people on that drug, just to see what's its effectiveness, what's the relative safety, one to the other. 
Phase three studies are the clinical trials themselves. These are the expanded controlled trials. These are performed after the effectiveness has been demonstrated to at least some sort of certain degree. Phase four studies are basically the post-marketing clinical trials. These are sometimes where you start to get the VAERS impact. In other words, those things where there are reports of poor outcomes after something has been in use for a while. Some of the ethical issues in randomized clinical trials are: is randomization actually ethical? Can you truly give informed consent if you don't know what you're being given? Are placebo-controlled trials even ethical themselves? And under what conditions should a randomized clinical trial be stopped earlier than originally planned? And an example of that was the Women's Health Initiative, which was supposed to go on and it was stopped after two years because they found that by giving the hormone replacement that they thought was going to be cardiac helpful, it turned out that it really wasn't at all and more people were dying who had the replacement than those who didn't. Here, let's set the question of reality, whether the treatments are not different or are different, against the decision, whether we conclude that the treatments are not different or that they are different. Well, if in reality the treatments are not different and the conclusion was made that they're not different, that was a correct decision. Similarly, if you concluded that the treatments are different and the treatments really are different, well, that's also a correct conclusion. However, if you conclude that the treatments are different and they're really not, now you have a type one error. On the other hand, if you conclude that the treatments are not different and they really are different, you have a type two error. Let me give you an example, totally out of medicine. Go back to the Old West and the 1968 movie, Hang 'Em High. 
Okay, so you go Hang 'Em High and basically the court determined that this person was not guilty and the person really was not guilty. Well, that's a correct decision. The court determined the person was guilty and they really were guilty and you hanged them. Okay, good thing. Therefore, you've made a correct decision. But what happens if you conclude that they are guilty and they're really not guilty? To commit that then is a type one error. But what if the other situation occurs and you basically think that they're not guilty and they really are guilty? You've committed a type two error. But in the justice system, compare allowing somebody who's really guilty to go free versus subjecting somebody who is really innocent, or I should say not guilty, to some sort of punishment, especially in the old days of the West, where the punishment was hang them high. You really didn't want to make that error. And you can always think of, well, which one's a type one error? Which one's a type two? Just think of type one error as worse. The type one error is the type where you concluded that the person was guilty when they really weren't. Type two is where you concluded they were not guilty when they really were. Okay. That hopefully will help people. Probability of committing a type one error is alpha. Probability of committing a type two error is beta. And the correct decision, the probability is one minus beta. Okay, which also has to do with the power. Power is one minus the probability of making a type two error, or one minus beta. All right, so this slide summarizes a lot of what I just mentioned. Okay, it's the probability of correctly concluding that the treatments do differ. Probability of detecting a difference between treatments if the treatments do, in fact, differ. Okay, let's look at cohort study. So here we determine an association. We got an environmental exposure and the disease or some other kind of an outcome. 
And the question is, did the environmental exposure cause the disease or other outcome? So what you do in the cohort design is you look at the things and you say, let's look at the people who are exposed versus not exposed. And then let's see who of the exposed developed the disease versus who doesn't versus the non-exposed. Some of them may actually develop that disease. Let's say lung cancer. You look at smoking versus non-smoking. Okay, some people are going to develop lung cancer who never smoked in their life. So in looking at cohort study, you have, again, disease develops, disease does not, exposed versus not exposed. And the totals are of the exposed, A plus B. The incidence rates are A over A plus B. It's the same thing that we had before. Non-exposed was C plus D. So we have C over C plus D. And the idea, remember, that's the incidence rate. So we first select exposure versus non-exposure. Then we go looking to see whether the disease develops or doesn't develop. So incidence in exposed, A over A plus B. Incidence in non-exposed, C over C plus D. So here's a different, more realistic example than the fake numbers I just gave you or the generalized slide. So the sample of a population to study the exposure to smoking on the development of coronary heart disease. Okay, in the people who smoked, 84, let's say there were 3,000 who smoked, and 5,000 people didn't smoke. So 84 people go on to develop CHD, and 2,916 don't. And then the people who did not smoke, 87 of those people go on to develop coronary heart disease, but 4,900 and change don't. All right. Well, you can see when you do the math, the 84 divided by 3,000 is 28 per 1,000. The people who didn't smoke, 87 per 5,000 is only 17.4 per 1,000 developed coronary heart disease. So now you have the question of how do I select study groups in epidemiologic studies? Clearly, again, experimental studies, you would go with a randomization, a random allocation. 
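The incidence calculation in the smoking and coronary heart disease example can be checked with a brief Python sketch. The counts are the ones quoted above (84 of 3,000 smokers and 87 of 5,000 non-smokers developed CHD); the function name is illustrative.

```python
def incidence_per_1000(developed_disease: int, group_total: int) -> float:
    """A / (A + B) for the exposed, or C / (C + D) for the non-exposed, per 1,000."""
    return developed_disease * 1_000 / group_total


smokers_total, smokers_chd = 3_000, 84
nonsmokers_total, nonsmokers_chd = 5_000, 87

# The remaining cells of the two-by-two table.
smokers_no_chd = smokers_total - smokers_chd           # 2,916
nonsmokers_no_chd = nonsmokers_total - nonsmokers_chd  # 4,913

exposed_incidence = incidence_per_1000(smokers_chd, smokers_total)
unexposed_incidence = incidence_per_1000(nonsmokers_chd, nonsmokers_total)

print(f"Incidence in smokers: {exposed_incidence:.1f} per 1,000")
print(f"Incidence in non-smokers: {unexposed_incidence:.1f} per 1,000")
```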
Observational studies, there's no randomization, but you still have control groups. The Framingham study, the penultimate, in my opinion, study, by the way, the guy who started the study in 1948 only died like eight years ago. So, objectives, to study the impact of several factors on the incidence of cardiovascular disease, and then look for exposures. They look for blood pressure, smoking, diet, physical activity, or lack thereof, age, gender, and they got multiple outcomes. P.S., this would never pass an IRB today. Oh, you want to just take a look and take a bunch of studies of this population, of these volunteers, and see what develops? Okay, yeah, sure. No, they wouldn't approve that. Okay, and yet, this is the classic study behind a lot of the stuff that we do in medicine, okay? So, it began in 1948 to study cardiovascular disease. 5,200-plus men and women were recruited, given comprehensive exams and labs, and then they returned every two years for follow-up. Their children, 5,100 and change, children and their spouses were recruited in 1971, and back in like 2006, I haven't updated the slide, unfortunately, the third generation was recruited from their children, the grandchildren of the people who were in the original study, and the goal there was to get at least 3,500 participants. So, what they found was really interesting. Cardiovascular, you know, CHD, coronary heart disease, increases with age. It occurs earlier and more frequently in males. People with hypertension develop CHD at a greater rate than those who have normal blood pressures. Elevated cholesterol is associated with an increased risk of CHD. 
Tobacco smoking and habitual use of alcohol are associated with increased incidence, notwithstanding that there were some studies that showed some benefits to drinking red wine, but that has been, is a moderated amount of alcohol, not the amount we were talking about in 1948, and there have been some recent studies that have followed up that said, maybe drinking red wine isn't all that good for you. I need to read more about those studies in detail, but I'm not predominantly a red wine drinker except on two days a year, and I have four glasses on the first night, four glasses on the second. Somebody laughed because they got the Passover Seder reference. Okay, so increase in body weight predisposes one to coronary heart disease, and there's an increased development in coronary heart disease in people who have diabetes. So here's the population, derivation of the Framingham study in 1971. Okay, so they recruited a total there of 3,000 plus men and 3,400 plus women. The respondents, the volunteers, people who responded to the request for the sample were 2,200, I'm sorry, 2,400 men and women, and volunteers were 300 and 400, respectively. Respondents who were free of coronary heart disease were a little under 2,000 in men and a little over 2,400 in women, and in the volunteers who were free of coronary heart disease were 307 versus 427 in women. Why is this important? Well, if you're looking to see who is going to develop coronary heart disease, they can't already have coronary heart disease, okay, because otherwise you can't tell whether they develop it or not. So the total free from coronary heart disease became the Framingham study group. Types of potential bias in cohort study is selection bias. Who's been exposed? Who's been not exposed? What about information bias? Now, this is particularly important in retrospective cohort studies, okay? Bias in assessing the outcomes. Non-response bias. 
Is there something that's different maybe in the group that didn't respond to your request or that you lost to follow-up? And is there any kind of inherent analytic bias that was introduced? So when is a cohort study warranted? When there's good evidence of an association of a disease with a certain exposure, when the exposure is rare but the incidence among the exposed is high, when the time between exposure and disease onset is short, and when the attrition of the study population can be minimized. When ample funds are available, you got to be able to go across the entire time. So in general, we don't study things typically longer than five years because we can't be assured that if we get the study grant that they're going to fund us for anything more than five years. And when the investigator has a long life expectancy, that usually helps. As I said, the guy who started this whole thing, the Framingham study, only died about eight years ago. He was like 98 or 99. Okay, well it depends upon how long you plan to do it. Maybe it's a motivating factor. The comment was, maybe I shouldn't start a long-term study. I have one of our colleagues here who's showing, like me, increasing sparsity of hair on top. As my barber says, my hair grows faster in my beard than on the top of my head. Okay, case control study designs. So case control, we've got the question of cases versus controls. Cases have the disease, controls, no. And now we're going to go ahead and look at exposed versus not exposed. So where we had the cohort study, we were looking for the exposure and then to see what developed in terms of disease. With case control, we're going to first select the cases and the controls. Who has the disease and who doesn't? Then we're going to go and measure the past exposures. How much were they exposed or not exposed? And the proportions would be determined. 
So let's look at an example of a case control study of ectopic pregnancy in relation to confirmed PID, confirmed pelvic inflammatory disease. So the number of cases are in the top and they are the columns. The PID are the rows, yes and no. And so this comes from a 1991 article by Closted and Associates. Basically, what they determined when looking at this was that 11% were the exposed and 3% of the controls basically versus the exposed on that point of matter. In a case control, you have a reference population and you pull the samples out, okay, the cases out. You then look for the controls. You have the reference population out of the total. And what are some types of non-hospital controls? Because hospital controls are fairly easy to get hold of, okay. Probability sample of the total population in some group, okay. Neighborhood, walking door to door. Well, I've heard of cases where people walking door to door doing sampling like that and interviewing ended up getting their crap beat out of them and ended up in the hospital. So maybe not so much. Random digit dialing, the idea that you would just get a whole thing and whether you would randomly call people. Too much like when I was a kid and we would call people and say, is your refrigerator running? Well, then you better catch it. We were mean kids, okay. But that was about the worst thing we did until our parents found out and that was it. I had more than one person, one woman go, Murray Berkowitz, is that you? I'm going to tell your mother. You're going to get in trouble. Anyway, I did. Worse would have been if she told my father. Best friend or an associate, spouse or sibling, birth certificate matching for childhood diseases. Classmate, again, matching for childhood diseases. When you match, you're selecting controls similar to the cases based on age, their gender or sex, race and so on. 
In matching, there's group matching, which is frequency matching or stratification, versus individual matching, where you have matched pairs. One problem with matching is that if you match on too many variables, it may be difficult to find an appropriate control. And you cannot explore the possible association of the disease with any variable on which the cases and the controls have been matched. That's critical. You cannot explore the possible association of a disease with any variable on which the cases and controls have been matched. In other words, if, say, you match for age, no, you can't compare for age now. Once matched, you can't generalize for that characteristic. And don't match on more than four or five variables because of these inherent problems. You're going to find things too limiting. Then there are the problems of recall in case control studies. There's a limitation in human ability to recall. There's recall bias: cases may remember their exposure more than the controls do. There also may be rumination bias. So, remembering details such as everything that happened during a pregnancy that resulted in a child who was born deformed, missing an arm or missing a leg or something like that. Well, rare again, but, you know, mom and dad might remember every little thing. Whereas for somebody born normally, they don't remember those details. Now, in looking at things, there's a terminology jungle. Case control study, retrospective study. Cohort study, prospective study, concurrent cohort, longitudinal study, or concurrent prospective study. Retrospective cohort study, historical or non-concurrent prospective study. Randomized clinical trial, experimental study. Cross-sectional study, which is a prevalence study. Ladies and gentlemen, there are so many different terms that you can find for these. And there are so many different things that you will read.
And if you have this table, then when somebody says, well, we've done a longitudinal study, and you go, wait, what the heck is a longitudinal study? You go back and say, oh, it's a concurrent cohort study. Fine. I got this. You know, so now the question becomes, how do I estimate risk? Well, let's go back to that outbreak that we started with at the beginning of this talk. 110 patients with GI symptoms presented to suburban medical centers overnight; ages were 15 to 17 years old. They were 110 out of 300 high school students attending a three-week student leadership camp at the University of Maryland. And they became ill approximately 10 to 14 hours after eating food at a picnic. Okay. So we can continue this. Let's go on. The attack rate is the number of sick divided by the number at risk or the number exposed. As I said, very similar to an incidence rate. Absolute risk is the incidence of the disease. And the question now comes down to, here's what the food was. So here's the table. Of the people who ate the egg salad, 83% got sick. Of the people who didn't eat the egg salad, 30% got sick. Okay. Macaroni, you know, 76% of those who ate the macaroni got sick. 67% of those who didn't got sick. And you can see that we have this all the way down, tuna salad, ice cream, et cetera, et cetera. In every case here, you can see that the people who ate one of these things got sick in a higher percentage. People who didn't eat it got sick less. So, differences in risk. The attributable risk is the risk in the exposed minus the risk in the non-exposed. The ratio of risks, or the relative risk, is the risk in the exposed divided by the risk in the non-exposed. And I'm going to go back to that previous slide just so you get an idea one more time. Okay. So going forward, what we had here is, number one, 83% there, and column one minus column two is 53%.
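As a rough illustration of the definitions above, the following Python sketch computes the overall attack rate for the camp outbreak and the attributable risk and relative risk for two of the foods. The function names are made up for this example, and the food percentages are read as the percent sick among those who ate and those who did not eat each food, which is an assumption about how the slide's table is laid out.

```python
def attack_rate(sick, at_risk):
    """Number who became ill divided by the number at risk (or exposed)."""
    if at_risk <= 0:
        raise ValueError("at_risk must be positive")
    return sick / at_risk


def attributable_risk(risk_exposed, risk_unexposed):
    """Risk in the exposed minus risk in the non-exposed."""
    return risk_exposed - risk_unexposed


def relative_risk(risk_exposed, risk_unexposed):
    """Risk in the exposed divided by risk in the non-exposed."""
    return risk_exposed / risk_unexposed


# 110 of the 300 students at the camp became ill.
overall = attack_rate(110, 300)
print(f"Overall attack rate: {overall:.1%}")

# Percent sick among those who ate / did not eat each food
# (assumed reading of the table quoted in the talk).
foods = {"egg salad": (83, 30), "macaroni": (76, 67)}
for food, (ate, did_not_eat) in foods.items():
    ar = attributable_risk(ate, did_not_eat)
    rr = relative_risk(ate, did_not_eat)
    print(f"{food}: attributable risk {ar} points, relative risk {rr:.2f}")
```

Egg salad stands out on both measures, which is why it becomes the leading suspect in the outbreak.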
So the attributable risk, okay, of eating the egg salad versus not eating the egg salad was 53%. Macaroni, only nine. Cheese, two. Ice cream, 14. And other things, only 22. So looking along with this, though, let's look at the relative risk. So here we have the table: in the two left columns we have the absolute risks, then column one minus column two, and then we have the relative risk. The risk for the people who ate the egg salad over the risk for the people who didn't was 2.77; macaroni, 1.13. Now, to understand what a relative risk is, let's look at the differences here. So in this fictitious example, the incidence in the exposed was 40% in population A and 90% in population B. In the non-exposed, it was 10% versus 60%. In both cases, the attributable risk, the risk in the exposed minus the risk in the non-exposed, is 30. 40 minus 10 is 30. 90 minus 60 is 30. However, the ratio of the incidence rates, the relative risk, in the first case, in population A, is 4.0. Meanwhile, it's 1.5 in population B. So a relative risk of one says there's no association. Doesn't matter. A relative risk greater than one says that the risk in the exposed is greater than in the unexposed, if you will. That's a positive association. The question is, is it causal? You have a positive association. Is it causal? A relative risk of less than one says that the risk in the exposed is less than the risk in the unexposed. It's a negative association. Question, is this now protective? So the greater the magnitude of the relative risk, the stronger the association. Remember, the absolute value of something is just the positive difference in their determination. All right. Let's go back to that cohort study. So we had the example of the risks associated with smoking and developing heart disease. I've repeated the two-by-two chart in the left two columns of this. Total, again, 3,000. The incidences per thousand were 28 versus 17.4.
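To make the contrast between the two measures concrete, here is a minimal Python sketch using the figures quoted in the talk: the fictitious populations A and B, and the smoking cohort's incidences per thousand. The function names and the wording returned by `interpret` are illustrative choices, not anything shown on the slides.

```python
def attributable_risk(risk_exposed, risk_unexposed):
    """Risk in the exposed minus risk in the non-exposed."""
    return risk_exposed - risk_unexposed


def relative_risk(risk_exposed, risk_unexposed):
    """Risk in the exposed divided by risk in the non-exposed."""
    return risk_exposed / risk_unexposed


def interpret(rr, tol=1e-9):
    """Label the direction of an association from its relative risk."""
    if abs(rr - 1.0) <= tol:
        return "no association"
    return "positive association" if rr > 1.0 else "negative association"


# (risk in exposed, risk in non-exposed), in the units used in the talk.
examples = {
    "population A (%)": (40, 10),
    "population B (%)": (90, 60),
    "smoking and CHD (per 1,000)": (28.0, 17.4),
}

for name, (exposed, unexposed) in examples.items():
    ar = attributable_risk(exposed, unexposed)
    rr = relative_risk(exposed, unexposed)
    print(f"{name}: AR = {ar:.1f}, RR = {rr:.2f}, {interpret(rr)}")
```

Populations A and B share the same attributable risk of 30 points, yet their relative risks are 4.0 and 1.5, so the two measures answer different questions and neither can stand in for the other.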
If you will, the relative risk was 28 divided by 17.4. So the relative risk is 1.61. There's an association there: the risk of developing CHD is higher in people who smoke than in people who don't smoke. So what's the incidence of CHD, though, that is attributable to their smoking? What's the attributable risk? It's 10.6 per thousand. So measures of excess risk are then relative risks or odds ratios. And in calculating the odds ratio in a cohort study, again, you look at these. What are the odds that an exposed person develops the disease? Well, it's A divided by B, those who develop it versus those who don't. Now, what are the odds that a non-exposed person develops the disease? It's C divided by D. Or, if you will, OR, the odds ratio, is the odds that an exposed person develops the disease divided by the odds that a non-exposed person develops the disease. The odds of the exposed person developing it, we already determined, was A over B. So if that becomes the numerator, then C over D is in the denominator of that quotient. If you cross-multiply, you see that AD divided by BC is going to give me my odds ratio. Now, calculating the odds ratio in a case control study, what are the odds that a case was exposed? It's A over C. Versus, what are the odds that a control was exposed? B over D. Again, the odds that a case was exposed, A over C, divided by the odds that a control was exposed, B over D in the denominator, is the quotient there. Again, cross-multiplying, it's the same, AD over BC. So, you can use the odds ratio as an estimate of relative risk when the following conditions hold. When the cases studied are representative of all people with the disease in the population from which the cases are drawn, when the controls studied are representative of all people without the disease with regard to the exposure, and when the disease being studied is not a frequent disease. In other words, it's low incidence, or rare. So, if the incidence is low, then A plus B pretty much approaches B, and C plus D approaches D.
So, in looking at this, then A divided by A plus B is going to approach A divided by B, and all of that's divided by C over C plus D. But again, C plus D is approaching D. So, here, if the incidence is low, this is the relative risk. Excuse me. And you can clearly see that the odds ratio is an approximation of it. And to show some examples of this, let's say that we have a disease where out of 10,000 people who are exposed, 200 develop the disease, and the other 9,800 don't. And in the non-exposed, 100 out of the 10,000 develop it, and 9,900 don't. So, the relative risk is 200 over 10,000 divided by 100 over 10,000. So, if you will, the relative risk is simply 2. On the other hand, look at the odds ratio: 200 times 9,900, all divided by 100 times 9,800. And when you do that math out, it's about 2.02. Clearly, it's pretty close, okay? And you can see that in formulating this I deliberately chose an incidence that was relatively low. The odds ratio is not a good estimate of the relative risk when the disease is not rare. So, let's say I have a disease where, out of 100 people who were exposed, 50 develop the disease and 50 don't. Whereas in 100 people who are not exposed, 25 people develop the disease and 75 don't. So, here, the relative risk is 50 divided by 100, all over 25 divided by 100, which is 2. And the odds ratio, sorry guys, the odds ratio is 50 times 75, A times D, divided by 50 times 25, B times C. So, that's simply 3. And you can see that 3 is not a good approximation of 2. So, the odds ratio is independent of study design, whereas relative risk, not so true. Let's go with this table here, or this slide here. You can see we talk about causation, and there are a number of associations or relationships. The first on this list is the temporal relationship. Cause precedes effect. This is the essential thing for causation.
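The two worked examples can be checked with a short Python sketch. It assumes the usual two-by-two layout, with A and B the exposed with and without the disease and C and D the non-exposed with and without the disease; the function names are illustrative. It also confirms that the cohort-style and case-control-style ways of setting up the odds both reduce to the same cross product AD over BC.

```python
import math


def relative_risk(a, b, c, d):
    """Risk in the exposed, a/(a+b), over risk in the non-exposed, c/(c+d)."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Cross-product ratio ad / bc."""
    return (a * d) / (b * c)


def odds_ratio_cohort(a, b, c, d):
    """Odds of disease in the exposed (a/b) over odds in the non-exposed (c/d)."""
    return (a / b) / (c / d)


def odds_ratio_case_control(a, b, c, d):
    """Odds that a case was exposed (a/c) over odds that a control was (b/d)."""
    return (a / c) / (b / d)


# (a, b, c, d) for the two examples in the talk.
rare = (200, 9_800, 100, 9_900)   # low incidence: 2% versus 1%
common = (50, 50, 25, 75)         # high incidence: 50% versus 25%

for label, table in (("rare", rare), ("common", common)):
    rr = relative_risk(*table)
    or_ = odds_ratio(*table)
    # Both ways of setting up the odds give the same cross product.
    assert math.isclose(or_, odds_ratio_cohort(*table))
    assert math.isclose(or_, odds_ratio_case_control(*table))
    print(f"{label} disease: RR = {rr:.2f}, OR = {or_:.2f}")
```

For the rare disease the odds ratio comes out at roughly 2.02 against a relative risk of 2, while for the common disease it is 3 against a relative risk of 2, which is the point of the rare-disease condition.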
If the cause didn't precede the effect, I don't care how strong the association, the odds ratio or relative risk, how good the dose-response relationship, how high the internal consistency of the association is, you don't have and cannot have causation. Absolutely the number one question for you in looking at possible causation is, did the cause precede the effect? Temporality, okay? The others are all things like, is this biologically plausible, et cetera. If the confidence interval of the relative risk or odds ratio includes the number one, 1.0, then it is not statistically significant. So, if something has an odds ratio with a confidence interval from 0.8 to 1.5, it is right away not statistically significant. So, let's go back to that outbreak. GI problems, most likely due to food poisoning, and the egg salad seems to be the most likely source. The causative agent, norovirus or Norwalk virus. So, looking at process outcomes. Efficacy: how well does the intervention work under ideal conditions? Effectiveness: how well does the intervention work when it's applied in the community? And efficiency: are the results achieved in keeping with the effort spent in time, money, and resources? And I have 10 minutes left, and you probably should be able to do all of these things. I'll open this up for questions.
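The confidence interval rule mentioned in the talk is simple enough to write down as a check. The sketch below is only an illustration: the function name is made up, and the second interval, 1.2 to 2.4, is a hypothetical value added for contrast, not a result from the lecture.

```python
def excludes_null(ci_lower, ci_upper, null_value=1.0):
    """Return True when the interval for a ratio measure excludes the null.

    For a relative risk or odds ratio, the null value is 1.0 (no association).
    """
    if ci_lower > ci_upper:
        raise ValueError("ci_lower must not exceed ci_upper")
    return not (ci_lower <= null_value <= ci_upper)


# The interval quoted in the talk: it contains 1.0, so it is not significant.
print(excludes_null(0.8, 1.5))   # False

# Hypothetical interval lying entirely above 1.0.
print(excludes_null(1.2, 2.4))   # True
```

An interval that excludes 1.0 only speaks to statistical significance; as the talk stresses, it says nothing by itself about whether the cause preceded the effect.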
Video Summary
Dr. Murray Berkowitz delivered an in-depth presentation on epidemiology, covering foundational concepts and methodologies critical to public health. He began with an introduction to his extensive background, emphasizing his dual military service and numerous academic achievements. The lecture tackled the fundamentals of epidemiology, delineating its role in understanding disease distribution and influencing public health policies. Dr. Berkowitz presented a detailed exploration of disease measures, distinguishing between rates of morbidity and mortality, and the importance of these metrics in assessing public health scenarios. He articulated complex topics like disease transmission, the distinction between endemic, epidemic, and pandemic, and the practical use of epidemiological studies in real-world outbreaks.

Dr. Berkowitz also outlined the design and analysis of epidemiological studies, including randomized clinical trials and cohort studies, stressing the importance of randomization and blinding to minimize bias. He explained the applications of sensitivity, specificity, and predictive values in evaluating screening tests, clarifying common misconceptions regarding their use.

The session further explored methods of estimating risk, using case studies to illustrate concepts such as incidence rates and relative risks. Dr. Berkowitz concluded with a focus on the interpretation of statistical data in understanding causation versus correlation in epidemiological research. This comprehensive review was designed to reinforce the audience's understanding and application of epidemiological principles in various health-related contexts, underscoring the necessity of accurate data interpretation in disease prevention and health policy formulation.
Keywords
epidemiology
public health
disease distribution
morbidity
mortality
disease transmission
endemic
epidemic
pandemic
randomized clinical trials
cohort studies
sensitivity
specificity
incidence rates
causation vs correlation