Friday, January 11, 2008
Writing the Research Report
Writing a research report is not feature writing. Use technical terms that are precisely defined, and avoid decorative adjectives such as "beautiful lady" or "extraordinary food." You need to learn the difference between the two styles. Again, take time to read and catch up on the things you've missed.
I highly recommend reading technical writing books or the research output of graduate schools such as Ateneo, La Salle, and state universities like UP and IIT. Stay updated with my blog posts, because I will be discussing Writing the Technical Report.
Wednesday, January 9, 2008
Deduction & Induction
Deductive reasoning works from the more general to the more specific. Sometimes this is informally called a "top-down" approach. We might begin by thinking up a theory about our topic of interest. We then narrow that down into more specific hypotheses that we can test. We narrow down even further when we collect observations to address the hypotheses. This ultimately leads us to be able to test the hypotheses with specific data -- a confirmation (or not) of our original theories.
Inductive reasoning works the other way, moving from specific observations to broader generalizations and theories. Informally, we sometimes call this a "bottom up" approach (please note that it's "bottom up" and not "bottoms up" which is the kind of thing the bartender says to customers when he's trying to close for the night!). In inductive reasoning, we begin with specific observations and measures, begin to detect patterns and regularities, formulate some tentative hypotheses that we can explore, and finally end up developing some general conclusions or theories.
These two methods of reasoning have a very different "feel" to them when you're conducting research. Inductive reasoning, by its very nature, is more open-ended and exploratory, especially at the beginning. Deductive reasoning is more narrow in nature and is concerned with testing or confirming hypotheses. Even though a particular study may look like it's purely deductive (e.g., an experiment designed to test the hypothesized effects of some treatment on some outcome), most social research involves both inductive and deductive reasoning processes at some time in the project. In fact, it doesn't take a rocket scientist to see that we could assemble these two approaches into a single circular model that continually cycles from theories down to observations and back up again to theories. Even in the most constrained experiment, the researchers may observe patterns in the data that lead them to develop new theories.
Structure of Research
Components of a Study
Most social research originates from some general problem or question. You might, for instance, be interested in what programs enable the unemployed to get jobs. Usually, the problem is broad enough that you could not hope to address it adequately in a single research study. Consequently, we typically narrow the problem down to a more specific research question that we can hope to address. The research question is often stated in the context of some theory that has been advanced to address the problem. For instance, we might have the theory that ongoing support services are needed to assure that the newly employed remain employed. The research question is the central issue being addressed in the study and is often phrased in the language of theory. For instance, a research question might be:
Is a program of ongoing support services effective in helping the newly employed remain employed?
The problem with such a question is that it is still too general to be studied directly. Consequently, in most research we develop an even more specific statement, called an hypothesis that describes in operational terms exactly what we think will happen in the study. For instance, the hypothesis for our employment study might be something like:
The Metropolitan Supported Employment Program will significantly increase rates of employment after six months for persons who are newly employed (after being out of work for at least one year) compared with persons who receive no comparable program.
Notice that this hypothesis is specific enough that a reader can understand quite well what the study is trying to assess.
In causal studies, we have at least two major variables of interest, the cause and the effect. Usually the cause is some type of event, program, or treatment. We make a distinction between causes that the researcher can control (such as a program) versus causes that occur naturally or outside the researcher's influence (such as a change in interest rates, or the occurrence of an earthquake). The effect is the outcome that you wish to study. For both the cause and effect we make a distinction between our idea of them (the construct) and how they are actually manifested in reality. For instance, when we think about what a program of support services for the newly employed might be, we are thinking of the "construct". On the other hand, the real world is not always what we think it is. In research, we remind ourselves of this by distinguishing our view of an entity (the construct) from the entity as it exists (the operationalization). Ideally, we would like the two to agree.
Social research is always conducted in a social context. We ask people questions, or observe families interacting, or measure the opinions of people in a city. An important component of a research project is the units that participate in the project. Units are directly related to the question of sampling. In most projects we cannot involve all of the people we might like to involve. For instance, in studying a program of support services for the newly employed we can't possibly include in our study everyone in the world, or even in the country, who is newly employed. Instead, we have to try to obtain a representative sample of such people. When sampling, we make a distinction between the theoretical population of interest to our study and the final sample that we actually measure in our study. Usually the term "units" refers to the people that we sample and from whom we gather information. But for some projects the units are organizations, groups, or geographical entities like cities or towns. Sometimes our sampling strategy is multi-level: we sample a number of cities and within them sample families.
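To make the multi-level idea concrete, here is a minimal sketch in Python (my own illustration, not from the original text; the city names and sample sizes are invented) of a two-stage strategy that first samples cities and then samples families within each sampled city:

    import random

    random.seed(42)  # fixed seed so the example draw is reproducible

    # Hypothetical sampling frame: each city mapped to the families living in it.
    frame = {
        "City A": ["A-family-%d" % i for i in range(1, 201)],
        "City B": ["B-family-%d" % i for i in range(1, 151)],
        "City C": ["C-family-%d" % i for i in range(1, 301)],
        "City D": ["D-family-%d" % i for i in range(1, 101)],
    }

    # Stage 1: sample two cities from the population of cities.
    sampled_cities = random.sample(list(frame), 2)

    # Stage 2: within each sampled city, sample ten families.
    sample = {city: random.sample(frame[city], 10) for city in sampled_cities}

    for city, families in sample.items():
        print(city, "->", families[:3], "...")

The final sample of families is thus nested inside a sample of cities, which is exactly the multi-level situation described above.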
Finally, in a causal study we usually are comparing the effects of our cause of interest (e.g., the program) relative to other conditions (e.g., another program or no program at all). Thus, a key component in a causal study concerns how we decide what units (e.g., people) receive our program and which are placed in an alternative condition. This issue is directly related to the research design that we use in the study. One of the central questions in research design is determining how people wind up in or are placed in various programs or treatments that we are comparing.
These, then, are the major components in a causal study:
- The Research Problem
- The Research Question
- The Program (Cause)
- The Units
- The Outcomes (Effect)
- The Design
SMC: 1st Draft Project Proposal
Now, let us begin using technology to bring my instruction into your respective homes and to the places where you will be assigned. Beginning today, you will submit your work on this blog as a comment, but you have to register at blogspot.com in order to publish your assignment. This means you need to copy your work from Microsoft Word and paste it here.
Your work will be read by your classmates and by people who are actively engaged in research. You might be wondering why everyone needs to see your work and why you must write out your thoughts. This is a good venue for everyone to participate in the discussion, including professional researchers. I am inviting people like Dr. Roberto Padua (former CHED Commissioner and a good friend), Dr. Chona Echavez (Research Associate, Research Institute for Mindanao Culture), Ms. Ivy Sabuga (Researcher, Asian Institute of Management Policy Center), and college deans to view our site and interact with us.
Obviously, I will be giving a grade for participating and for posting your questions on every entry that I post on this site. Ultimately, my goal is to seriously monitor your improvement and your credibility in writing technical reports. By this time, it should be unnecessary to teach you English grammar and logic, considering the units you've earned in your first and second years of college. My presence in your class is not to teach you basic education but rather the application of your basic education.
Research is about logic. Although I am aiming for a basic research output, which is technically difficult for amateurs, applied research will be accepted with due consideration for logic, relevance, and accuracy. This should be clear to you.
Further, your 2nd draft should be posted under this blog title on Wednesday, January 16, 2008. Please indicate your research problem and the members of your group. There is no need to print your assignment. No paper should reach my office until I instruct you otherwise.
Be good and see you.
Positivism & Post-Positivism
When most people in our society think about science, they think about some guy in a white lab coat working at a lab bench mixing up chemicals. They think of science as boring and cut-and-dried, and they think of the scientist as narrow-minded and esoteric (the ultimate nerd -- think of the humorous but nonetheless mad scientist in the Back to the Future movies, for instance). A lot of our stereotypes about science come from a period where science was dominated by a particular philosophy -- positivism -- that tended to support some of these views. Here, I want to suggest (no matter what the movie industry may think) that science has moved on in its thinking into an era of post-positivism where many of those stereotypes of the scientist no longer hold up.
Let's begin by considering what positivism is. In its broadest sense, positivism is a rejection of metaphysics (I leave it to you to look up that term if you're not familiar with it). It is a position that holds that the goal of knowledge is simply to describe the phenomena that we experience. The purpose of science is simply to stick to what we can observe and measure. Knowledge of anything beyond that, a positivist would hold, is impossible. When I think of positivism (and the related philosophy of logical positivism) I think of the behaviorists in mid-20th Century psychology. These were the mythical 'rat runners' who believed that psychology could only study what could be directly observed and measured. Since we can't directly observe emotions, thoughts, etc. (although we may be able to measure some of the physical and physiological accompaniments), these were not legitimate topics for a scientific psychology. B.F. Skinner argued that psychology needed to concentrate only on the positive and negative reinforcers of behavior in order to predict how people will behave -- everything else in between (like what the person is thinking) is irrelevant because it can't be measured.
In a positivist view of the world, science was seen as the way to get at truth, to understand the world well enough so that we might predict and control it. The world and the universe were deterministic -- they operated by laws of cause and effect that we could discern if we applied the unique approach of the scientific method. Science was largely a mechanistic or mechanical affair. We use deductive reasoning to postulate theories that we can test. Based on the results of our studies, we may learn that our theory doesn't fit the facts well and so we need to revise our theory to better predict reality. The positivist believed in empiricism -- the idea that observation and measurement were the core of the scientific endeavor. The key approach of the scientific method is the experiment, the attempt to discern natural laws through direct manipulation and observation.
OK, I am exaggerating the positivist position (although you may be amazed at how close to this some of them actually came) in order to make a point. Things have changed in our views of science since the middle part of the 20th century. Probably the most important has been our shift away from positivism into what we term post-positivism. By post-positivism, I don't mean a slight adjustment to or revision of the positivist position -- post-positivism is a wholesale rejection of the central tenets of positivism. A post-positivist might begin by recognizing that the way scientists think and work and the way we think in our everyday life are not distinctly different. Scientific reasoning and common sense reasoning are essentially the same process. There is no difference in kind between the two, only a difference in degree. Scientists, for example, follow specific procedures to assure that observations are verifiable, accurate and consistent. In everyday reasoning, we don't always proceed so carefully (although, if you think about it, when the stakes are high, even in everyday life we become much more cautious about measurement. Think of the way most responsible parents keep continuous watch over their infants, noticing details that non-parents would never detect).
One of the most common forms of post-positivism is a philosophy called critical realism. A critical realist believes that there is a reality independent of our thinking about it that science can study. (This is in contrast with a subjectivist who would hold that there is no external reality -- we're each making this all up!). Positivists were also realists. The difference is that the post-positivist critical realist recognizes that all observation is fallible and has error and that all theory is revisable. In other words, the critical realist is critical of our ability to know reality with certainty. Where the positivist believed that the goal of science was to uncover the truth, the post-positivist critical realist believes that the goal of science is to hold steadfastly to the goal of getting it right about reality, even though we can never achieve that goal!
Because all measurement is fallible, the post-positivist emphasizes the importance of multiple measures and observations, each of which may possess different types of error, and the need to use triangulation across these multiple errorful sources to try to get a better bead on what's happening in reality. The post-positivist also believes that all observations are theory-laden and that scientists (and everyone else, for that matter) are inherently biased by their cultural experiences, world views, and so on. This is not cause to give up in despair, however. Just because I have my world view based on my experiences and you have yours doesn't mean that we can't hope to translate from each other's experiences or understand each other. That is, post-positivism rejects the relativist idea of the incommensurability of different perspectives, the idea that we can never understand each other because we come from different experiences and cultures. Most post-positivists are constructivists who believe that we each construct our view of the world based on our perceptions of it. Because perception and observation are fallible, our constructions must be imperfect.
So what is meant by objectivity in a post-positivist world? Positivists believed that objectivity was a characteristic that resided in the individual scientist. Scientists are responsible for putting aside their biases and beliefs and seeing the world as it 'really' is. Post-positivists reject the idea that any individual can see the world perfectly as it really is. We are all biased and all of our observations are affected (theory-laden). Our best hope for achieving objectivity is to triangulate across multiple fallible perspectives! Thus, objectivity is not the characteristic of an individual; it is inherently a social phenomenon. It is what multiple individuals are trying to achieve when they criticize each other's work. We never achieve objectivity perfectly, but we can approach it. The best way for us to improve the objectivity of what we do is to do it within the context of a broader contentious community of truth-seekers (including other scientists) who criticize each other's work. The theories that survive such intense scrutiny are a bit like the species that survive in the evolutionary struggle. (This is sometimes called the natural selection theory of knowledge and holds that ideas have 'survival value' and that knowledge evolves through a process of variation, selection and retention). They have adaptive value and are probably as close as our species can come to being objective and understanding reality.
Clearly, all of this stuff is not for the faint-of-heart. I've seen many a graduate student get lost in the maze of philosophical assumptions that contemporary philosophers of science argue about. And don't take this to mean that I think this stuff is unimportant. But, in the end, I tend to turn pragmatist on these matters. Philosophers have been debating these issues for thousands of years and there is every reason to believe that they will continue to debate them for thousands of years more. Those of us who are practicing scientists should check in on this debate from time to time (perhaps every hundred years or so would be about right). We should think about the assumptions we make about the world when we conduct research. But in the meantime, we can't wait for the philosophers to settle the matter. After all, we do have our own work to do!
Unit of Analysis
One of the most important ideas in a research project is the unit of analysis. The unit of analysis is the major entity that you are analyzing in your study. For instance, any of the following could be a unit of analysis in a study:
- individuals
- groups
- artifacts (books, photos, newspapers)
- geographical units (town, census tract, state)
- social interactions (dyadic relations, divorces, arrests)
Why is it called the 'unit of analysis' and not something else (like, the unit of sampling)? Because it is the analysis you do in your study that determines what the unit is. For instance, if you are comparing the children in two classrooms on achievement test scores, the unit is the individual child because you have a score for each child. On the other hand, if you are comparing the two classes on classroom climate, your unit of analysis is the group, in this case the classroom, because you only have a classroom climate score for the class as a whole and not for each individual student. For different analyses in the same study you may have different units of analysis. If you decide to base an analysis on student scores, the individual is the unit. But you might decide to compare average classroom performance. In this case, since the data that goes into the analysis is the average itself (and not the individuals' scores) the unit of analysis is actually the group. Even though you had data at the student level, you use aggregates in the analysis. In many areas of social research these hierarchies of analysis units have become particularly important and have spawned a whole area of statistical analysis sometimes referred to as hierarchical modeling. This is true in education, for instance, where we often compare classroom performance but collect achievement data at the individual student level.
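Since this distinction trips up many students, here is a small sketch (my own, with invented scores) showing how the same student-level data supports two analyses with two different units of analysis:

    # Hypothetical achievement scores, collected at the individual student level.
    classrooms = {
        "Room 101": [78, 85, 92, 88, 70],
        "Room 102": [81, 79, 95, 84, 90],
    }

    # Unit of analysis = the individual child: one data point per student.
    for room, scores in classrooms.items():
        for i, score in enumerate(scores, start=1):
            print(room, "student", i, "scored", score)

    # Unit of analysis = the group: the analysis sees only one aggregate
    # (the class average) per classroom, even though student data exists.
    for room, scores in classrooms.items():
        print(room, "class average:", sum(scores) / len(scores))

What changes between the two analyses is not the raw data but what one row of the analysis represents.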
Variables
Variables aren't always 'quantitative' or numerical. The variable 'gender' consists of two text values: 'male' and 'female'. We can, if it is useful, assign quantitative values in place of the text values, but we don't have to assign numbers in order for something to be a variable. It's also important to realize that variables aren't only things that we measure in the traditional sense. For instance, in much social research and in program evaluation, we consider the treatment or program to be made up of one or more variables (i.e., the 'cause' can be considered a variable). An educational program can have varying amounts of 'time on task', 'classroom settings', 'student-teacher ratios', and so on. So even the program can be considered a variable (which can be made up of a number of sub-variables).
An attribute is a specific value on a variable. For instance, the variable sex or gender has two attributes: male and female. Or, the variable agreement might be defined as having five attributes:
1 = strongly disagree
2 = disagree
3 = neutral
4 = agree
5 = strongly agree
Another important distinction having to do with the term 'variable' is the distinction between an independent and dependent variable. This distinction is particularly relevant when you are investigating cause-effect relationships. It took me the longest time to learn this distinction. (Of course, I'm someone who gets confused about the signs for 'arrivals' and 'departures' at airports -- do I go to arrivals because I'm arriving at the airport or does the person I'm picking up go to arrivals because they're arriving on the plane!). I originally thought that an independent variable was one that would be free to vary or respond to some program or treatment, and that a dependent variable must be one that depends on my efforts (that is, it's the treatment). But this is entirely backwards! In fact the independent variable is what you (or nature) manipulates -- a treatment or program or cause. The dependent variable is what is affected by the independent variable -- your effects or outcomes. For example, if you are studying the effects of a new educational program on student achievement, the program is the independent variable and your measures of achievement are the dependent ones.
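As a sketch of the distinction (invented data, not the author's example), the independent variable below is the condition you manipulate, program versus control, and the dependent variable is the achievement score you expect it to affect:

    # Hypothetical records: (independent variable, dependent variable).
    records = [
        ("program", 82), ("program", 88), ("program", 91),
        ("control", 75), ("control", 80), ("control", 78),
    ]

    def mean_score(condition):
        # Group the outcome (dependent variable) by the condition
        # (independent variable) and average it.
        scores = [s for c, s in records if c == condition]
        return sum(scores) / len(scores)

    print("program mean:", mean_score("program"))
    print("control mean:", mean_score("control"))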
Finally, there are two traits of variables that should always be achieved. Each variable should be exhaustive: it should include all possible answerable responses. For instance, if the variable is "religion" and the only options are "Protestant", "Jewish", and "Muslim", there are quite a few religions I can think of that haven't been included. The list does not exhaust all possibilities. On the other hand, if you exhaust all the possibilities with some variables -- religion being one of them -- you would simply have too many responses. The way to deal with this is to explicitly list the most common attributes and then use a general category like "Other" to account for all remaining ones. In addition to being exhaustive, the attributes of a variable should be mutually exclusive: no respondent should be able to have two attributes simultaneously. While this might seem obvious, it is often rather tricky in practice. For instance, you might be tempted to represent the variable "Employment Status" with the two attributes "employed" and "unemployed." But these attributes are not necessarily mutually exclusive -- a person who is looking for a second job while employed would be able to check both attributes! But don't we often use questions on surveys that ask the respondent to "check all that apply" and then list a series of categories? Yes, we do, but technically speaking, each of the categories in a question like that is its own variable and is treated dichotomously as either "checked" or "unchecked", attributes that are mutually exclusive.
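Here is a minimal sketch (my own invented categories) of both traits in practice: an attribute list made exhaustive by closing it with "Other", and a "check all that apply" question stored as separate dichotomous variables, each of which is mutually exclusive on its own:

    # Exhaustive: list the most common attributes, then close with "Other".
    RELIGION = ["Protestant", "Catholic", "Jewish", "Muslim", "Other"]

    def code_religion(answer):
        # Any response outside the listed attributes falls into "Other",
        # so every possible answer can be coded.
        return answer if answer in RELIGION else "Other"

    print(code_religion("Buddhist"))  # -> "Other"

    # Mutually exclusive: each "check all that apply" category becomes its
    # own variable with exactly two attributes, checked or unchecked.
    respondent = {
        "employed": True,          # each category is its own variable...
        "looking_for_work": True,  # ...so no single variable ever holds
    }                              # two attributes at the same time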
Tuesday, January 8, 2008
Validity: the best available approximation to the truth of a given proposition, inference, or conclusion
We make lots of different inferences or conclusions while conducting research. Many of these are related to the process of doing research and are not the major hypotheses of the study. Nevertheless, like the bricks that go into building a wall, these intermediate process and methodological propositions provide the foundation for the substantive conclusions that we wish to address. For instance, virtually all social research involves measurement or observation. And, whenever we measure or observe we are concerned with whether we are measuring what we intend to measure or with how our observations are influenced by the circumstances in which they are made. We reach conclusions about the quality of our measures -- conclusions that will play an important role in addressing the broader substantive issues of our study. When we talk about the validity of research, we are often referring to the many conclusions we reach about the quality of different parts of our research methodology.
We subdivide validity into four types. Each type addresses a specific methodological question. In order to understand the types of validity, you have to know something about how we investigate a research question. Because all four validity types are really only operative when studying causal questions, we will use a causal study to set the context.
There are really two realms involved in research. The first is the land of theory. It is what goes on inside our heads as researchers. It is where we keep our theories about how the world operates. The second is the land of observations. It is the real world into which we translate our ideas -- our programs, treatments, measures and observations. When we conduct research, we are continually flitting back and forth between these two realms, between what we think about the world and what is going on in it. When we are investigating a cause-effect relationship, we have a theory (implicit or otherwise) of what the cause is (the cause construct). For instance, if we are testing a new educational program, we have an idea of what it would look like ideally. Similarly, on the effect side, we have an idea of what we are ideally trying to affect and measure (the effect construct). But each of these, the cause and the effect, has to be translated into real things, into a program or treatment and a measure or observational method. We use the term operationalization to describe the act of translating a construct into its manifestation. In effect, we take our idea and describe it as a series of operations or procedures. Now, instead of it only being an idea in our minds, it becomes a public entity that anyone can look at and examine for themselves. It is one thing, for instance, for you to say that you would like to measure self-esteem (a construct). But when you show a ten-item paper-and-pencil self-esteem measure that you developed for that purpose, others can look at it and understand more clearly what you intend by the term self-esteem.
Now, back to explaining the four validity types. They build on one another, with two of them (conclusion and internal) referring to the land of observations, one of them (construct) emphasizing the linkages between observation and theory, and the last (external) being primarily concerned with the range of our theory. Imagine that we wish to examine whether use of a World Wide Web (WWW) Virtual Classroom improves student understanding of course material. Assume that we took these two constructs, the cause construct (the WWW site) and the effect (understanding), and operationalized them -- turned them into realities by constructing the WWW site and a measure of knowledge of the course material. Here are the four validity types and the question each addresses:
1. Conclusion Validity: In this study, is there a relationship between the two variables?
In the context of the example we're considering, the question might be worded: in this study, is there a relationship between the WWW site and knowledge of course material? There are several conclusions or inferences we might draw to answer such a question. We could, for example, conclude that there is a relationship. We might conclude that there is a positive relationship. We might infer that there is no relationship. We can assess the conclusion validity of each of these conclusions or inferences.
2. Internal Validity: Assuming that there is a relationship in this study, is the relationship a causal one?
Just because we find that use of the WWW site and knowledge are correlated, we can't necessarily assume that WWW site use causes the knowledge. Both could, for example, be caused by the same factor. For instance, it may be that wealthier students who have greater resources would be more likely to have access to a WWW site and would excel on objective tests. When we want to make a claim that our program or treatment caused the outcomes in our study, we can consider the internal validity of our causal claim.
3. Construct Validity: Assuming that there is a causal relationship in this study, can we claim that the program reflected well our construct of the program and that our measure reflected well our idea of the construct of the measure?
In simpler terms, did we implement the program we intended to implement and did we measure the outcome we wanted to measure? In yet other terms, did we operationalize well the ideas of the cause and the effect? When our research is over, we would like to be able to conclude that we did a credible job of operationalizing our constructs -- we can assess the construct validity of this conclusion.
4. External Validity: Assuming that there is a causal relationship in this study between the constructs of the cause and the effect, can we generalize this effect to other persons, places or times?
We are likely to make some claims that our research findings have implications for other groups and individuals in other settings and at other times. When we do, we can examine the external validity of these claims.
Notice how the question that each validity type addresses presupposes an affirmative answer to the previous one. This is what we mean when we say that the validity types build on one another: you can picture the idea of cumulativeness as a staircase, with the key question for each validity type on each step.
For any inference or conclusion, there are always possible threats to validity -- reasons the conclusion or inference might be wrong. Ideally, one tries to reduce the plausibility of the most likely threats to validity, thereby leaving as most plausible the conclusion reached in the study. For instance, imagine a study examining whether there is a relationship between the amount of training in a specific technology and subsequent rates of use of that technology. Because the interest is in a relationship, it is considered an issue of conclusion validity. Assume that the study is completed and no significant correlation between amount of training and adoption rates is found. On this basis it is concluded that there is no relationship between the two. How could this conclusion be wrong -- that is, what are the "threats to validity"? For one, it's possible that there isn't sufficient statistical power to detect a relationship even if it exists. Perhaps the sample size is too small or the measure of amount of training is unreliable. Or maybe assumptions of the correlational test are violated given the variables used. Perhaps there were random irrelevancies in the study setting or random heterogeneity in the respondents that increased the variability in the data and made it harder to see the relationship of interest. The inference that there is no relationship will be stronger -- have greater conclusion validity -- if one can show that these alternative explanations are not credible. The distributions might be examined to see if they conform with assumptions of the statistical test, or analyses conducted to determine whether there is sufficient statistical power.
The theory of validity, and the many lists of specific threats, provide a useful scheme for assessing the quality of research conclusions. The theory is general in scope and applicability, well-articulated in its philosophical suppositions, and virtually impossible to explain adequately in a few minutes. As a framework for judging the quality of evaluations it is indispensable and well worth understanding.
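To make the statistical power threat concrete, here is a sketch (not from the original text; it uses invented data and assumes NumPy and SciPy are available) that first runs the training-use correlation and then simulates how often a sample this small would detect a modest true relationship:

    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(0)

    # Hypothetical data: hours of training and subsequent uses of the technology.
    training = np.array([2, 5, 1, 4, 3, 6, 2, 5, 4, 3])
    use_rate = np.array([3, 6, 2, 4, 5, 7, 2, 4, 6, 3])

    r, p = pearsonr(training, use_rate)
    print("observed r = %.2f, p = %.3f" % (r, p))

    # Threat check: with n = 10, how often would we detect a true r of 0.3?
    n, true_r, trials, detections = 10, 0.3, 2000, 0
    for _ in range(trials):
        x = rng.standard_normal(n)
        y = true_r * x + np.sqrt(1 - true_r**2) * rng.standard_normal(n)
        if pearsonr(x, y)[1] < 0.05:
            detections += 1
    print("estimated power:", detections / trials)

If the estimated power comes out low, a failure to find a significant correlation says very little, which is exactly the threat to conclusion validity described above.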
Philosophy of Research
Before the modern idea of research emerged, philosophers had a term for what we now call research -- logical reasoning. So, it should come as no surprise that some of the basic distinctions in logic have carried over into contemporary research. In Systems of Logic we discuss how two major logical systems, the inductive and deductive methods of reasoning, are related to modern research.
OK, you knew that no introduction would be complete without considering something having to do with assumptions and philosophy. (I thought I very cleverly snuck in the stuff about logic in the last paragraph). All research is based on assumptions about how the world is perceived and how we can best come to understand it. Of course, nobody really knows how we can best understand the world, and philosophers have been arguing about that very question for at least two millennia now, so all we're going to do is look at how most contemporary social scientists approach the question of how we know about the world around us. We consider two major philosophical schools of thought – Positivism and Post-Positivism -- that are especially important perspectives for contemporary social research (OK, I'm only considering positivism and post-positivism here because these are the major schools of thought. Forgive me for not considering the hotly debated alternatives like relativism, subjectivism, hermeneutics, deconstructivism, constructivism, feminism, etc. If you really want to cover that stuff, start your own Web site and send me your URL to stick in here).
Quality is one of the most important issues in research. We introduce the idea of validity to refer to the quality of various conclusions you might reach based on a research project. Here's where I've got to give you the pitch about validity. When I mention validity, most students roll their eyes, curl up into a fetal position or go to sleep. They think validity is just something abstract and philosophical (and I guess it is at some level). But I think if you can understand validity -- the principles that we use to judge the quality of research -- you'll be able to do much more than just complete a research project. You'll be able to be a virtuoso at research, because you'll have an understanding of why we need to do certain things in order to assure quality. You won't just be plugging in standard procedures you learned in school -- sampling method X, measurement tool Y -- you'll be able to help create the next generation of research technology. Enough for now -- more on this later.
Patterns of Relationships
First, we have the positive relationship. In a positive relationship, high values on one variable are associated with high values on the other and low values on one are associated with low values on the other. In this example, we assume an idealized positive relationship between years of education and the salary one might expect to be making.
On the other hand a negative relationship implies that high values on one variable are associated with low values on the other. This is also sometimes termed an inverse relationship. Here, we show an idealized negative relationship between a measure of self esteem and a measure of paranoia in psychiatric patients.
These are the simplest types of relationships we might typically estimate in research. But the pattern of a relationship can be more complex than this. For instance, consider a curvilinear relationship, one that changes over the range of both variables. Imagine plotting dosage of a drug for an illness against a severity of illness measure. As dosage rises, severity of illness goes down. But at some point, the patient begins to experience negative side effects associated with too high a dosage, and the severity of illness begins to increase again.
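A short sketch (my own, assuming NumPy) can generate idealized versions of all three patterns and shows why the pattern matters: a plain correlation coefficient captures the positive and negative cases but can miss the curvilinear one entirely:

    import numpy as np

    rng = np.random.default_rng(1)
    x = np.linspace(0, 10, 100)  # e.g., years of education, or drug dosage

    positive = 2 * x + rng.normal(0, 1, x.size)             # education vs. salary
    negative = -2 * x + rng.normal(0, 1, x.size)            # self esteem vs. paranoia
    curvilinear = (x - 5) ** 2 + rng.normal(0, 1, x.size)   # dosage vs. severity

    for name, y in [("positive", positive), ("negative", negative),
                    ("curvilinear", curvilinear)]:
        r = np.corrcoef(x, y)[0, 1]
        print("%-12s r = %+.2f" % (name, r))

    # The curvilinear r lands near zero even though the relationship is
    # strong, because severity falls and then rises over the dosage range.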
Then we have to consider defining some basic terms like variable, hypothesis, data, and unit of analysis. If you're like me, you hate learning vocabulary, so we'll quickly move along to consideration of two of the major fallacies of research, just to give you an idea of how wrong even researchers can be if they're not careful (of course, there's always a certain probability that they'll be wrong even if they're extremely careful).
Types of Relationships
A relationship refers to the correspondence between two variables. When we talk about types of relationships, we can mean that in at least two ways: the nature of the relationship or the pattern of it.
Two Research Fallacies
The ecological fallacy occurs when you make conclusions about individuals based only on analyses of group data. For instance, assume that you measured the math scores of a particular classroom and found that they had the highest average score in the district. Later (probably at the mall) you run into one of the kids from that class and you think to yourself "she must be a math whiz." Aha! Fallacy! Just because she comes from the class with the highest average doesn't mean that she is automatically a high-scorer in math. She could be the lowest math scorer in a class that otherwise consists of math geniuses!
An exception fallacy is sort of the reverse of the ecological fallacy. It occurs when you reach a group conclusion on the basis of exceptional cases. This is the kind of fallacious reasoning that is at the core of a lot of sexism and racism. The stereotype is of the guy who sees a woman make a driving error and concludes that "women are terrible drivers." Wrong! Fallacy!
Both of these fallacies point to some of the traps that exist in both research and everyday reasoning. They also point out how important it is that we do research. We need to determine empirically how individuals perform (not just rely on group averages). Similarly, we need to look at whether there are correlations between certain behaviors and certain groups (you might look at the whole controversy around the book The Bell Curve as an attempt to examine whether the supposed relationship between race and IQ is real or a fallacy).
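A tiny example with invented numbers makes the ecological fallacy concrete: the class with the highest average can still contain the lowest individual scorer:

    # Hypothetical math scores for two classrooms.
    class_a = [98, 99, 97, 96, 55]  # four very high scorers, one low scorer
    class_b = [80, 82, 78, 81, 79]

    def average(scores):
        return sum(scores) / len(scores)

    print("Class A average:", average(class_a))  # 89.0, the higher average
    print("Class B average:", average(class_b))  # 80.0

    # Group-level conclusion: Class A outperforms Class B. True.
    # Individual-level conclusion: any Class A child is a math whiz. False:
    print("Lowest score overall:", min(class_a + class_b))  # 55, from Class A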
Types of Data
Personally, while I find the distinction between qualitative and quantitative data to have some utility, I think most people draw too hard a distinction, and that can lead to all sorts of confusion. In some areas of social research, the qualitative-quantitative distinction has led to protracted arguments with the proponents of each arguing the superiority of their kind of data over the other. The quantitative types argue that their data is 'hard', 'rigorous', 'credible', and 'scientific'. The qualitative proponents counter that their data is 'sensitive', 'nuanced', 'detailed', and 'contextual'.
For many of us in social research, this kind of polarized debate has become less than productive. And, it obscures the fact that qualitative and quantitative data are intimately related to each other. All quantitative data is based upon qualitative judgments; and all qualitative data can be described and manipulated numerically. For instance, think about a very common quantitative measure in social research -- a self esteem scale. The researchers who develop such instruments had to make countless judgments in constructing them: how to define self esteem; how to distinguish it from other related concepts; how to word potential scale items; how to make sure the items would be understandable to the intended respondents; what kinds of contexts it could be used in; what kinds of cultural and language constraints might be present; and on and on. The researcher who decides to use such a scale in their study has to make another set of judgments: how well does the scale measure the intended concept; how reliable or consistent is it; how appropriate is it for the research context and intended respondents; and on and on. Believe it or not, even the respondents make many judgments when filling out such a scale: what is meant by various terms and phrases; why is the researcher giving this scale to them; how much energy and effort do they want to expend to complete it, and so on. Even the consumers and readers of the research will make lots of judgments about the self esteem measure and its appropriateness in that research context. What may look like a simple, straightforward, cut-and-dried quantitative measure is actually based on lots of qualitative judgments made by lots of different people.
On the other hand, all qualitative information can be easily converted into quantitative, and there are many times when doing so would add considerable value to your research. The simplest way to do this is to divide the qualitative information into units and number them! I know that sounds trivial, but even that simple nominal enumeration can enable you to organize and process qualitative information more efficiently. Perhaps more to the point, we might take text information (say, excerpts from transcripts) and pile these excerpts into piles of similar statements. When we do something even as easy as this simple grouping or piling task, we can describe the results quantitatively. For instance, if we had ten statements and we grouped these into five piles, we could describe the piles using a 10 x 10 table of 0's and 1's. If two statements were placed together in the same pile, we would put a 1 in their row-column juncture. If two statements were placed in different piles, we would use a 0. The resulting matrix or table describes the grouping of the ten statements in terms of their similarity. Even though the data in this example consists of qualitative statements (one per card), the result of our simple qualitative procedure (grouping similar excerpts into the same piles) is quantitative in nature.
"So what?" you ask. Once we have the data in numerical form, we can manipulate it numerically. For instance, we could have five different judges sort the 10 excerpts and obtain a 0-1 matrix like this for each judge. Then we could average the five matrices into a single one that shows the proportions of judges who grouped each pair together. This proportion could be considered an estimate of the similarity (across independent judges) of the excerpts. While this might not seem too exciting or useful, it is exactly this kind of procedure that I use as an integral part of the process of developing 'concept maps' of ideas for groups of people (something that is useful!).
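Here is a runnable sketch of that pile-sort procedure (the pile assignments and the five judges are invented, and NumPy is assumed): it builds a 0/1 co-occurrence matrix for each judge and averages the matrices into similarity proportions:

    import numpy as np

    N = 10  # ten statements, one per card

    # Each judge's sort: position i holds the pile label given to statement i.
    judges = [
        [0, 0, 1, 1, 2, 2, 3, 3, 4, 4],
        [0, 0, 0, 1, 2, 2, 3, 3, 4, 4],
        [0, 1, 1, 1, 2, 2, 3, 4, 4, 4],
        [0, 0, 1, 2, 2, 2, 3, 3, 4, 4],
        [0, 0, 1, 1, 1, 2, 3, 3, 3, 4],
    ]

    def cooccurrence(sort):
        # 1 where two statements share a pile, 0 where they don't.
        m = np.zeros((N, N))
        for i in range(N):
            for j in range(N):
                m[i, j] = 1.0 if sort[i] == sort[j] else 0.0
        return m

    # Average the five 0/1 matrices: each cell becomes the proportion of
    # judges who grouped that pair together, an estimate of similarity.
    similarity = np.mean([cooccurrence(s) for s in judges], axis=0)
    print(similarity.round(2))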
Hypotheses
Actually, whenever I talk about an hypothesis, I am really thinking simultaneously about two hypotheses. Let's say that you predict that there will be a relationship between two variables in your study. The way we would formally set up the hypothesis test is to formulate two hypothesis statements, one that describes your prediction and one that describes all the other possible outcomes with respect to the hypothesized relationship. Your prediction is that variable A and variable B will be related (you don't care whether it's a positive or negative relationship). Then the only other possible outcome would be that variable A and variable B are not related. Usually, we call the hypothesis that you support (your prediction) the alternative hypothesis, and we call the hypothesis that describes the remaining possible outcomes the null hypothesis. Sometimes we use a notation like HA or H1 to represent the alternative hypothesis or your prediction, and HO or H0 to represent the null case. You have to be careful here, though. In some studies, your prediction might very well be that there will be no difference or change. In this case, you are essentially trying to find support for the null hypothesis and you are opposed to the alternative.
If your prediction specifies a direction, and the null therefore is the no difference prediction and the prediction of the opposite direction, we call this a one-tailed hypothesis. For instance, let's imagine that you are investigating the effects of a new employee training program and that you believe one of the outcomes will be that there will be less employee absenteeism. Your two hypotheses might be stated something like this:
HO: As a result of the XYZ company employee training program, there will either be no significant difference in employee absenteeism or there will be a significant increase.
which is tested against the alternative hypothesis:
HA: As a result of the XYZ company employee training program, there will be a significant decrease in employee absenteeism.
Pictured graphically on a hypothetical distribution of absenteeism differences, the alternative hypothesis -- your prediction that the program will decrease absenteeism -- occupies one tail, while the null must account for the other two possible conditions: no difference, or an increase in absenteeism. The term "one-tailed" refers to that single tail of the distribution on the outcome variable.
When your prediction does not specify a direction, we say you have a two-tailed hypothesis. For instance, let's assume you are studying a new drug treatment for depression. The drug has gone through some initial animal trials, but has not yet been tested on humans. You believe (based on theory and the previous research) that the drug will have an effect, but you are not confident enough to hypothesize a direction and say the drug will reduce depression (after all, you've seen more than enough promising drug treatments come along that eventually were shown to have severe side effects that actually worsened symptoms). In this case, you might state the two hypotheses like this:
The null hypothesis for this study is:
HO: As a result of 300mg./day of the ABC drug, there will be no significant difference in depression.
which is tested against the alternative hypothesis:
HA: As a result of 300mg./day of the ABC drug, there will be a significant difference in depression.
Pictured graphically, this two-tailed prediction covers both tails of the distribution for your outcome variable -- a significant difference in either direction. Again, notice that the term "two-tailed" refers to the tails of the distribution for your outcome variable.
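As a sketch of how this plays out in an analysis (invented data; it assumes a reasonably recent SciPy, whose ttest_ind accepts the alternative keyword), the same comparison yields different p-values depending on whether you committed to a one-tailed or two-tailed alternative in advance:

    from scipy.stats import ttest_ind

    # Hypothetical days absent per employee, with and without the training.
    trained   = [3, 2, 4, 1, 3, 2, 2, 3]
    untrained = [5, 4, 6, 3, 5, 4, 6, 5]

    # Two-tailed: HA is "a significant difference in either direction".
    t, p_two = ttest_ind(trained, untrained)
    print("two-tailed p = %.4f" % p_two)

    # One-tailed: HA is "trained employees are absent less".
    t, p_one = ttest_ind(trained, untrained, alternative="less")
    print("one-tailed p = %.4f" % p_one)  # half of p_two when the result
                                          # falls in the predicted tail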
The important thing to remember about stating hypotheses is that you formulate your prediction (directional or not), and then you formulate a second hypothesis that is mutually exclusive of the first and incorporates all possible alternative outcomes for that case. When your study analysis is completed, the idea is that you will have to choose between the two hypotheses. If your prediction was correct, then you would (usually) reject the null hypothesis and accept the alternative. If your original prediction was not supported in the data, then you will accept the null hypothesis and reject the alternative. The logic of hypothesis testing is based on these two basic principles:
- The formulation of two mutually exclusive hypothesis statements that, together, exhaust all possible outcomes.
- The testing of these so that one is necessarily accepted and the other rejected.
OK, I know it's a convoluted, awkward and formalistic way to ask research questions. But it encompasses a long tradition in statistics called the hypothetical-deductive model, and sometimes we just have to do things because they're traditions. And anyway, if all of this hypothesis testing was easy enough so anybody could understand it, how do you think statisticians would stay employed?
Time in Research
A further distinction is made between two types of longitudinal designs: repeated measures and time series. There is no universally agreed upon rule for distinguishing these two terms, but in general, if you have two or a few waves of measurement, you are using a repeated measures design. If you have many waves of measurement over time, you have a time series. How many is 'many'? Usually, we wouldn't use the term time series unless we had at least twenty waves of measurement, and often far more. Sometimes the way we distinguish these is with the analysis methods we would use. Time series analysis requires that you have at least twenty or so observations. Repeated measures analyses (like repeated measures ANOVA) aren't often used with as many as twenty waves of measurement.
Types of Research Questions
Descriptive. When a study is designed primarily to describe what is going on or what exists. Public opinion polls that seek only to describe the proportion of people who hold various opinions are primarily descriptive in nature. For instance, if we want to know what percent of the population would vote for a Democratic or a Republican in the next presidential election, we are simply interested in describing something.
Relational. When a study is designed to look at the relationships between two or more variables. A public opinion poll that compares what proportion of males and females say they would vote for a Democratic or a Republican candidate in the next presidential election is essentially studying the relationship between gender and voting preference.
Causal. When a study is designed to determine whether one or more variables (e.g., a program or treatment variable) causes or affects one or more outcome variables. If we did a public opinion poll to try to determine whether a recent political advertising campaign changed voter preferences, we would essentially be studying whether the campaign (cause) changed the proportion of voters who would vote Democratic or Republican (effect).
The three question types can be viewed as cumulative. That is, a relational study assumes that you can first describe (by measuring or observing) each of the variables you are trying to relate. And, a causal study assumes that you can describe both the cause and effect variables and that you can show that they are related to each other. Causal studies are probably the most demanding of the three.
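A small sketch (invented poll responses, my own illustration) shows how the first two question types differ in the analysis they call for; the causal type needs, in addition, a design that controls who is exposed to the cause:

    # Hypothetical poll: (gender, preferred party) for each respondent.
    poll = [
        ("male", "Democratic"), ("male", "Republican"), ("male", "Republican"),
        ("female", "Democratic"), ("female", "Democratic"), ("female", "Republican"),
    ]

    # Descriptive: what proportion would vote Democratic?
    dem = sum(1 for _, party in poll if party == "Democratic") / len(poll)
    print("Democratic share: %.0f%%" % (100 * dem))

    # Relational: does that proportion differ by gender?
    for gender in ("male", "female"):
        votes = [party for g, party in poll if g == gender]
        share = votes.count("Democratic") / len(votes)
        print("%s: %.0f%% Democratic" % (gender, 100 * share))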
Getting to know the Jargon
Five Big Words
Research involves an eclectic blending of an enormous range of skills and activities. To be a good social researcher, you have to be able to work well with a wide variety of people, understand the specific methods used to conduct research, understand the subject that you are studying, be able to convince someone to give you the funds to study it, stay on track and on schedule, speak and write persuasively, and on and on.
Here, I want to introduce you to five terms that I think help to describe some of the key aspects of contemporary social research. (This list is not exhaustive. It's really just the first five terms that came into my mind when I was thinking about this and thinking about how I might be able to impress someone with really big/complex words to describe fairly straightforward concepts).
I present the first two terms -- theoretical and empirical -- together because they are often contrasted with each other. Social research is theoretical, meaning that much of it is concerned with developing, exploring or testing the theories or ideas that social researchers have about how the world operates. But it is also empirical, meaning that it is based on observations and measurements of reality -- on what we perceive of the world around us. You can even think of most research as a blending of these two terms -- a comparison of our theories about how the world operates with our observations of its operation.
The next term -- nomothetic -- comes (I think) from the writings of the psychologist Gordon Allport. Nomothetic refers to laws or rules that pertain to the general case (nomos in Greek) and is contrasted with the term "idiographic" which refers to laws or rules that relate to individuals (idiots in Greek???). In any event, the point here is that most social research is concerned with the nomothetic -- the general case -- rather than the individual. We often study individuals, but usually we are interested in generalizing to more than just the individual.
In our post-positivist view of science, we no longer regard certainty as attainable. Thus, the fourth big word that describes much contemporary social research is probabilistic, or based on probabilities. The inferences that we make in social research have probabilities associated with them -- they are seldom meant to be considered covering laws that pertain to all cases. Part of the reason we have seen statistics become so dominant in social research is that it allows us to estimate probabilities for the situations we study.
The last term I want to introduce is causal. You've got to be very careful with this term. Note that it is spelled causal not casual. You'll really be embarrassed if you write about the "casual hypothesis" in your study! The term causal means that most social research is interested (at some point) in looking at cause-effect relationships. This doesn't mean that most studies actually study cause-effect relationships. There are some studies that simply observe -- for instance, surveys that seek to describe the percent of people holding a particular opinion. And, there are many studies that explore relationships -- for example, studies that attempt to see whether there is a relationship between gender and salary. Probably the vast majority of applied social research consists of these descriptive and correlational studies. So why am I talking about causal studies? Because for most social sciences, it is important that we go beyond just looking at the world or looking at relationships. We would like to be able to change the world, to improve it and eliminate some of its major problems. If we want to change the world (especially if we want to do this in an organized, scientific way), we are automatically interested in causal relationships -- ones that tell us how our causes (e.g., programs, treatments) affect the outcomes of interest.
We'll only do a few for now, to give you an idea of just how esoteric the discussion can get (but not enough to cause you to give up in total despair). We can then take on some of the major issues in research like the types of questions we can ask in a project.