Term
A study of 2007 model automobiles was conducted. In the study the following variables were considered: the REGION in which the car was manufactured (Europe, North America, Asia); the TYPE of automobile (compact, midsize, large); the fuel economy in city driving of the automobile (MPG-CITY), volume of the engine in liters (ENGSIZE), and the type of FUEL used (regular, premium, 85% Ethanol). The variables REGION, TYPE, MPG-CITY, ENGSIZE, and FUEL are, respectively: (Categorical/Quantitative for each variable) |
|
Definition
categorical, categorical, quantitative, quantitative, categorical. |
|
|
Term
Which of the following statements is FALSE?
a) The distribution of a categorical variable lists the categories and gives the counts or the percent of individuals in each category.
b) Bar graphs, unlike histograms, have blank spaces between the bars to separate the items being compared.
c) A stemplot is particularly valuable for displaying the shape of the distribution of a categorical variable when there are few observations.
d) A histogram shows the distribution of counts or percents among the values of a single quantitative variable. |
|
Definition
c) A stemplot is particularly valuable for displaying the shape of the distribution of a categorical variable when there are few observations. |
|
|
Term
When examining a distribution of a quantitative variable, what features do we look for?
a) Overall shape, center, and spread b) Symmetry or skewness c) Deviations from overall patterns such as outliers d) All of the above |
|
Definition
d) All of the above |
|
|
Term
A scatterplot of a variable y versus a variable x produces a horizontal line. The value of y for all values of x is exactly 1.0. What do we know about the correlation between x and y?
a) It is +1 because the points lie perfectly on a line b) It is either +1 or -1 because the points lie perfectly on a line c) It is 0 because y does not change as x increases d) None of the above |
|
Definition
c) It is 0 because y does not change as x increases |
|
|
Term
To assess the effect of regular exercise on cholesterol, a researcher sampled 50 people from a local gym who exercise regularly and 50 people from the surrounding community who do not exercise regularly. Each subject reported to a clinic to have their cholesterol measured. The average cholesterol levels of the two samples were compared. What type of study is this?
a) An observational study b) An experiment |
|
Definition
a) An observational study |
|
|
Term
A market research firm has been asked to survey the population of people in a particular city who are 16 years of age or over with respect to their preferences for television programming. To do this the firm divides the list of the target population into 5 age groups: 16 to 25, 26 to 35, 36 to 45, 46 to 55, and 56 or older. From each of these age groups a simple random sample of 225 people is selected for a total sample size of 1125 individuals. Assume that all of these selected individuals respond honestly to survey questions. The resulting sample is an example of what kind of sample design?
a) Multistage sample design b) Volunteer sample design c) Stratified random sample design d) Simple random sample design |
|
Definition
c) Stratified random sample design |
|
|
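The design in the card above (a separate simple random sample inside each age group) can be sketched in a few lines. This is only an illustration: the population list, the ID strings, and the helper name `stratified_sample` are made up here, and only the five age groups and the sample size of 225 per group come from the card.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample of the same size from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical population list: 1,000 made-up IDs in each age group.
age_groups = ["16-25", "26-35", "36-45", "46-55", "56+"]
strata = {g: [f"{g}#{i}" for i in range(1000)] for g in age_groups}

sample = stratified_sample(strata, n_per_stratum=225, seed=1)
total = sum(len(chosen) for chosen in sample.values())
print(total)  # 5 strata x 225 people = 1125
```

The key point is that randomization happens within each stratum, so every age group is guaranteed exactly 225 respondents, which a single simple random sample of 1125 would not guarantee.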
Term
A student organization at a local college posted a poll on its website. After a semester, the results were tallied and it was found that 95% of the respondents were in favor of raising fees to increase funding for student organizations. This conclusion was based on data collected from 5,000 votes cast on the website. Which of the following statements is true?
a) The results of this poll are biased due to the voluntary responses b) The results are reliable since the sample size is quite large c) The results are reliable since 95% is a very high proportion d) This is an example of random sampling since everybody has a chance to vote on the website |
|
Definition
a) The results of this poll are biased due to the voluntary responses |
|
|
Term
A study was conducted to determine the effect of weather on varieties of corn grown for grazing by cattle in Alberta, Canada. To conduct the study, 3 varieties of corn were considered (one grazing, one short-stature, and one conventional), with harvesting of crops conducted on 3 occasions (one date in September, one in December, and one in January). The experimental treatments were assigned at random to plots of land at each of two different locations in Alberta. The crude protein concentration of the harvested corn was determined and compared.
In this experiment the factor(s), number of treatments, and the responses are: |
|
Definition
2 factors (variety of corn, harvest time); 9 treatments (3 varieties × 3 harvest times); the response is crude protein concentration. |
|
|
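The count of treatments follows from crossing the levels of the two factors. A minimal sketch that enumerates them, using the variety and harvest-month labels from the card (the variable names are illustrative):

```python
from itertools import product

varieties = ["grazing", "short-stature", "conventional"]
harvest_times = ["September", "December", "January"]

# Each treatment is one combination of a variety and a harvest time.
treatments = list(product(varieties, harvest_times))

for variety, month in treatments:
    print(variety, month)
print(len(treatments))  # 3 x 3 = 9
```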
Term
A study where experimental treatments are assigned at random to two locations that are similar has a ______ design.
a) completely random b) matched pairs c) block d) double-blind e) observational |
|
Definition
|
|
Term
Following the analysis of a well-designed completely randomized experiment, it was reported that the observed effect was "statistically significant". Which of the following statements best explains the meaning of the phrase "statistically significant"?
a) The observed result made sense to the experimenter since it was what was hoped would happen b) The observed effect happened because the experiment was properly designed and carried out without bias c) The experimenter carefully employed the basic principles of experimental design in conducting the study d) The laws of probability say that this observed result would be expected to happen by chance e) The observed effect was sufficiently large so that it would rarely occur simply by chance |
|
Definition
e) The observed effect was sufficiently large so that it would rarely occur simply by chance |
|
|
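One way to make "would rarely occur simply by chance" concrete is a permutation test: reshuffle the group labels many times and count how often chance alone gives a difference at least as large as the one observed. The sketch below is an illustration only. The two small data sets are invented, and a permutation test is one of several ways to assess significance, not necessarily the method used in the study the card describes.

```python
import random

def permutation_p_value(group_a, group_b, reps=5000, seed=0):
    """Estimate how often random relabeling gives a mean difference
    at least as large (in absolute value) as the observed one."""
    rng = random.Random(seed)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed:
            extreme += 1
    return extreme / reps

# Hypothetical responses for a treatment group and a control group.
treatment = [12, 14, 15, 16, 18, 19]
control = [8, 9, 10, 11, 12, 13]

p_value = permutation_p_value(treatment, control)
print(p_value)  # a small value: such a difference is rare under chance alone
```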
Term
Each subject receives a label from 01 to 12. Use the list of random digits to determine the labels of the first five participants.
81507 27102 56027 55892 3063 41842 81868 71035 09001 23367 49497
a) 08 01 05 00 07 b) 08 01 05 07 02 c) 02 06 10 09 12 d) 02 02 06 10 09 |
|
Definition
|
|
Term
In conducting a randomized comparative experiment an experimenter will often include a group of subjects, known as a control group, to receive a sham treatment. What is the reason such a control group might be included in the experiment?
a) To increase the number of subjects and hence increase the sensitivity of the experiment in detecting effects of the treatment b) To gain a more complete understanding of any biases that might be present in the experiment c) To enable the experimenter to control the effects of outside variables, such as lurking variables or confounding variables, on the outcome of the experiment d) To repeat the experiment over a greater number of experimental subjects e) To enable the experimenter to more easily make a random assignment of subjects to the groups |
|
Definition
c) To enable the experimenter to control the effects of outside variables, such as lurking variables or confounding variables, on the outcome of the experiment |
|
|
Term
Central Middle School has calculated a 95% confidence interval for the mean height (mu) of 11-year-old boys at the school and found it to be 56 ± 2 inches, that is, (54, 58). Which statement is true?
a) There is a 95% probability that mu is between 54 and 58. b) There is a 95% probability that the true mean is 56, and there is a 95% chance that the true margin of error is 2. c) If we took many additional random samples of the same size, approximately 95% of the time the sample means would fall between 54 and 58. d) If we took many additional random samples of the same size and from each computed a 95% confidence interval for mu, approximately 95% of these intervals would contain mu. |
|
Definition
d) If we took many additional random samples of the same size and from each computed a 95% confidence interval for mu, approximately 95% of these intervals would contain mu. |
|
|
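The "95% of these intervals would contain mu" interpretation can be checked by simulation. The sketch below rests on assumptions that are not in the card: heights are Normal with true mean 56 inches and known standard deviation 4 inches, and each sample has 16 boys, so the margin of error is close to 2 inches. The function name `coverage` is illustrative.

```python
import random
from statistics import NormalDist

def coverage(mu=56.0, sigma=4.0, n=16, reps=2000, level=0.95, seed=0):
    """Fraction of simulated confidence intervals that contain mu."""
    rng = random.Random(seed)
    z_star = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    margin = z_star * sigma / n ** 0.5              # about 2 inches here
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / reps

rate = coverage()
print(rate)  # close to 0.95
```

Note that mu never moves in this simulation. Only the intervals change from sample to sample, which is why statement (a) in the card is not the correct reading.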
Term
Explain what is wrong.
"An approximate 99% confidence interval for an unknown proportion p is p̂ plus or minus its standard error."
a) The margin of error equals z times the standard deviation. b) A confidence interval cannot be found for data that describe population proportions. c) All confidence intervals are exact, not approximate. d) The proportions of a population will never be unknown. e) The margin of error equals z* times the standard error. |
|
Definition
e) The margin of error equals z* times the standard error. |
|
|
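A short sketch of the corrected statement: the margin of error is z* times the standard error, not the standard error alone. The counts (120 successes in 400 trials) are hypothetical, and `proportion_ci` is an illustrative helper using the large-sample interval.

```python
from statistics import NormalDist

def proportion_ci(successes, n, level=0.99):
    """Large-sample confidence interval for a population proportion p."""
    p_hat = successes / n
    se = (p_hat * (1 - p_hat) / n) ** 0.5           # standard error of p-hat
    z_star = NormalDist().inv_cdf(0.5 + level / 2)  # about 2.576 for 99%
    margin = z_star * se                            # margin of error
    return p_hat - margin, p_hat + margin, z_star, se

low, high, z_star, se = proportion_ci(120, 400)
print(round(z_star, 3))              # 2.576
print(round(low, 3), round(high, 3))
```

Using p̂ ± SE alone corresponds to z* = 1, which gives roughly 68% confidence rather than 99%.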
Term
Explain what is wrong.
"When performing a large-sample significance test for a population proportion, the t distribution is used to compute the P-value."
a) The P-value is calculated using the t distribution for a small-sample significance test. b) A P-value is not calculated when performing a large-sample significance test for a population proportion. c) A significance test cannot be used for data that describe population proportions. d) Use Normal distributions (and a z test statistic) for significance tests involving proportions. e) The t distribution is only used for small samples. |
|
Definition
d) Use Normal distributions (and a z test statistic) for significance tests involving proportions. |
|
|
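A minimal sketch of the large-sample z test for a proportion, with the P-value taken from the standard Normal distribution rather than a t distribution. The data (100 successes in 400 trials) and the null value p0 = 0.2 are hypothetical, and the function name is illustrative.

```python
from statistics import NormalDist

def one_proportion_z_test(successes, n, p0):
    """Two-sided large-sample z test of H0: p = p0."""
    p_hat = successes / n
    se0 = (p0 * (1 - p0) / n) ** 0.5   # standard error computed under H0
    z = (p_hat - p0) / se0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = one_proportion_z_test(100, 400, p0=0.2)
print(round(z, 3), round(p_value, 4))  # z is about 2.5
```

The hypothesis is stated about the population proportion p. The sample proportion p̂ appears only in the test statistic.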
Term
Explain what is wrong.
"A significance test is used to evaluate H0: p̂ = 0.2 versus the two-sided alternative."
a) p̂ will never have a proportion of 0.2. b) We do not use null or alternative hypotheses with proportions. c) We are actually testing the alternative hypothesis, so it should be Ha: p̂ = 0.2. d) Since we are evaluating H0: p̂ = 0.2, this is a one-sided test, not a two-sided test. e) H0 should refer to p (the population proportion), not p̂ (the sample proportion). |
|
Definition
e) H0 should refer to p (the population proportion), not p̂ (the sample proportion). |
|
|
Term
State the null hypothesis H0 and the alternative hypothesis Ha.
"A university gives credit in French language courses to students who pass a placement test. The language department wants to know if students who get credit in this way differ in their understanding of spoken French from students who actually take the French courses. Experience has shown that the mean score of students in the courses on a standard listening test is 24. The language department gives the same listening test to a sample of 39 students who passed the credit examination to see if their performance is different."
a) H0: μ = 39; Ha: μ ≠ 39 b) H0: μ > 24; Ha: μ = 24 c) H0: μ = 24; Ha: μ ≠ 24 d) H0: μ < 24; Ha: μ = 24 e) H0: μ = 39; Ha: μ > 39 |
|
Definition
c) H0: μ = 24; Ha: μ ≠ 24 |
|
|
Term
State the null hypothesis H0 and the alternative hypothesis Ha.
"Experiments on learning in animals sometimes measure how long it takes a mouse to find its way through a maze. The mean time is 26 seconds for one particular maze. A researcher thinks that playing rap music will cause the mice to complete the maze faster. She measures how long each of 18 mice takes with the rap music as a stimulus."
a) H0: μ = 26 seconds; Ha: μ > 26 seconds b) H0: μ = 18 seconds; Ha: μ < 18 seconds c) H0: μ = 26 seconds; Ha: μ < 26 seconds d) H0: μ = 26 seconds; Ha: μ ≠ 26 seconds e) H0: μ > 18 seconds; Ha: μ = 18 seconds |
|
Definition
c) H0: μ = 26 seconds; Ha: μ < 26 seconds |
|
|
Term
State the null hypothesis H0 and the alternative hypothesis Ha.
"The average square footage of one-bedroom apartments in a new student-housing development is advertised to be 420 square feet. A student group thinks that the apartments are smaller than advertised. They hire an engineer to measure a sample of apartments to test their suspicion."
a) H0: μ > 420 ft²; Ha: μ = 420 ft² b) H0: μ < 420 ft²; Ha: μ = 420 ft² c) H0: μ = 420 ft²; Ha: μ > 420 ft² d) H0: μ = 420 ft²; Ha: μ ≠ 420 ft² e) H0: μ = 420 ft²; Ha: μ < 420 ft² |
|
Definition
e) H0: μ = 420 ft²; Ha: μ < 420 ft² |
|
|
Term
State what is wrong.
"A sampling distribution describes the distribution of some characteristic in a population."
a) The sampling distribution describes the distribution of all the characteristics of a population, not just one. b) The sampling distribution describes the variation of the characteristic of a sample. A characteristic of a population does not vary; it is a fixed number. c) The population distribution describes the variation of the characteristic of a population. A characteristic of a sample does not vary; it is a fixed number. d) The sampling distribution describes the distribution of all the characteristics of a sample, not just one. |
|
Definition
b) The sampling distribution describes the variation of the characteristic of a sample. A characteristic of a population does not vary; it is a fixed number. |
|
|
Term
State what is wrong.
"A statistic will have a large amount of bias whenever it has high variability."
a) Statistics are always highly biased regardless of the variability. b) Bias is always high when variability is low, not when variability is high. c) Bias and variability are independent; any combination of high/low bias and high/low variability is possible. d) Statistics are never biased because they rely on empirical data; the variability does not matter. |
|
Definition
c) Bias and variability are independent; any combination of high/low bias and high/low variability is possible. |
|
|
Term
State what is wrong.
"The variability of a statistic based on a small sample from a population will be the same as the variability of a large sample from the same population."
a) For a given population, variability is independent of sample size, so there is no way to know how the variability will change for the large sample. b) For a given population, variability is lowest when the sample size is moderate; a small or large sample size means that the variability will be larger. c) For a given population, variability decreases with increasing sample size, so the variability for the large sample will be smaller. d) For a given population, variability increases with increasing sample size, so the variability for the large sample will be larger. |
|
Definition
c) For a given population, variability decreases with increasing sample size, so the variability for the large sample will be smaller. |
|
|
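The claim that variability decreases with increasing sample size can be seen by simulating many sample means at two sample sizes. This sketch assumes a standard Normal population (mean 0, standard deviation 1), which is not part of the card. Under that assumption the standard deviation of the sample mean should be near 1/√n.

```python
import random
from statistics import stdev

def sd_of_sample_means(n, reps=1000, seed=0):
    """Standard deviation of the sample mean across many simulated samples."""
    rng = random.Random(seed)
    means = [sum(rng.gauss(0, 1) for _ in range(n)) / n for _ in range(reps)]
    return stdev(means)

small = sd_of_sample_means(10)    # expected near 1/sqrt(10), about 0.32
large = sd_of_sample_means(250)   # expected near 1/sqrt(250), about 0.06
print(small, large)
```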
Term
Describe the population and the sample.
A survey of 17,096 students in U.S. four-year colleges reported that 19.4% were binge drinkers. |
|
Definition
Population: all students at U.S. four-year colleges. Sample: the 17,096 students surveyed. |
|
|
Term
Describe the population and the sample.
In a study of work stress, 100 restaurant workers were asked about the impact of work stress on their personal lives. |
|
Definition
Population: all restaurant workers. Sample: the 100 restaurant workers who were asked. |
|
|
Term
Describe the population and the sample.
A tract of forest has 584 longleaf pine trees. The diameters of 40 of these trees were measured. |
|
Definition
Population: the 584 longleaf pine trees in the tract. Sample: the 40 trees whose diameters were measured. |
|
|
Term
Explain to someone who knows no statistics what it means to say that xbar is an "unbiased" estimator of μ.
a) xbar is not systematically higher than or lower than μ; that is, it has no particular tendency to underestimate or overestimate μ. b) xbar is not systematically higher than or lower than μ; that is, it has a tendency to underestimate or overestimate μ. c) xbar is always equal to μ; that is, it has no particular tendency to underestimate or overestimate μ. d) xbar is systematically lower than μ; that is, it has no particular tendency to overestimate μ. e) xbar is systematically higher than μ; that is, it has no particular tendency to underestimate μ. |
|
Definition
a) xbar is not systematically higher than or lower than μ; that is, it has no particular tendency to underestimate or overestimate μ. |
|
|
Term
The sample result xbar is an unbiased estimator of the population truth μ no matter what size SRS the study chooses. Explain to someone who knows no statistics why a large sample gives more trustworthy results than a small sample.
a) With large samples, xbar is less likely to be close to μ, because with a larger sample comes more information and therefore less uncertainty. b) With large samples, xbar is more likely to be close to μ, because with a larger sample comes more information and therefore less uncertainty. c) With large samples, xbar is less likely to be close to μ, because with a smaller sample comes more information and therefore less uncertainty. d) With small samples, xbar is more likely to be close to μ, because with a small sample comes more information and therefore less uncertainty. e) With large samples, xbar is more likely to be close to μ, because with a smaller sample comes more information and therefore less uncertainty. |
|
Definition
b) With large samples, xbar is more likely to be close to μ, because with a larger sample comes more information and therefore less uncertainty. |
|
|
Term
Product preference depends in part on the age, income, and gender of the consumer. A market researcher selects a large sample of potential car buyers. For each consumer, she records gender, age, household income, and automobile preference. Which of these variables are categorical and which are quantitative?
Gender; Age; Household Income; Automobile Preference |
|
Definition
Gender: categorical; Age: quantitative; Household Income: quantitative; Automobile Preference: categorical |
|
|
Term
What type of graph or graphs would you plan to make in this study?
What makes of cars do students drive?
bar graph; pie chart; histogram; stemplot; boxplot; timeplot; normal quantile plot; none of the above |
|
Definition
|
|
Term
What type of graph or graphs would you plan to make in this study?
How old are cars among student drivers?
bar graph; pie chart; histogram; stemplot; boxplot; timeplot; normal quantile plot; none of the above |
|
Definition
|
|
Term
What type of graph or graphs would you plan to make in this study?
How many hours per week do students study?
bar graph; pie chart; histogram; stemplot; boxplot; timeplot; normal quantile plot; none of the above |
|
Definition
|
|
Term
What type of graph or graphs would you plan to make in this study?
How does the number of study hours among students change during a semester?
bar graph; pie chart; histogram; stemplot; boxplot; timeplot; normal quantile plot; none of the above |
|
Definition
|
|
Term
What type of graph or graphs would you plan to make in this study?
Which radio stations are most popular with students?
bar graph; pie chart; histogram; stemplot; boxplot; timeplot; normal quantile plot; none of the above |
|
Definition
|
|
Term
What type of graph or graphs would you plan to make in this study?
When many students measure the concentration of the same solution for a chemistry course laboratory assignment, do their measurements follow a normal distribution?
bar graph; pie chart; histogram; stemplot; boxplot; timeplot; normal quantile plot; none of the above |
|
Definition
|
|
Term
Suppose that you and your friends emptied your pockets of coins and recorded the year marked on each coin. The distribution of dates would be skewed to the left. Explain why.
The distribution of coin years would be left-skewed because newer coins are more common than older coins. The distribution of coin years would be left-skewed because the sample is not random. The distribution of coin years would be left-skewed because the sample is small. The distribution of coin years would be left-skewed because older coins are more common than newer coins. The distribution of coin years would be left-skewed because newer coins are worth more than older coins. |
|
Definition
The distribution of coin years would be left-skewed because newer coins are more common than older coins. |
|
|
Term
When a histogram is skewed right, where is the mean located in relationship to the median? |
|
Definition
The mean is to the right of the median |
|
|
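A tiny numeric illustration of why the mean sits to the right of the median in a right-skewed distribution: a few large values pull the mean upward but barely move the median. The data are invented for the example.

```python
from statistics import mean, median

# Hypothetical right-skewed data: most values are similar, one is very large.
values = [28, 30, 31, 33, 35, 36, 40, 150]

print(median(values))  # 34.0
print(mean(values))    # 47.875
```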
Term
Explain what is wrong with the following statement.
"There is a high correlation between the gender of American workers and their income."
The relationship between gender and American workers' income is curved, and correlation measures the strength of only the linear relationship between two variables. Gender would not have any association with American workers' income. Because gender has a categorical (nominal) scale, we cannot compute the correlation between sex and anything. Income of American workers cannot be correlated with other variables because it is too complex. |
|
Definition
Because gender has a categorical (nominal) scale, we cannot compute the correlation between sex and anything. |
|
|
Term
Explain what is wrong with the following statement.
"We found a high correlation (r = 1.09) between students' ratings of faculty teaching and ratings made by other faculty members."
Because students' ratings of faculty teaching has a categorical (nominal) scale, we cannot compute the correlation between these ratings and anything. A correlation r = 1.09 is impossible because −1 ≤ r ≤ 1 always. The relationship between students' ratings of faculty teaching and ratings made by other faculty members would have more of a curved relationship and correlation measures the strength of only the linear relationship between two variables. The correlation between students' ratings of faculty teaching and ratings made by other faculty members would not be this high. |
|
Definition
A correlation r = 1.09 is impossible because −1 ≤ r ≤ 1 always. |
|
|
Term
Explain what is wrong with the following statement.
"The correlation between planting rate and yield of corn was found to be r = 0.27 bushel."
There is no correlation between planting rate and yield of corn. Correlation has no units, so r = 0.27 bushel is incorrect. The correlation between planting rate and yield of corn would be much higher. Because planting rate has a categorical (nominal) scale, we cannot compute the correlation between planting rate and anything. |
|
Definition
Correlation has no units, so r = 0.27 bushel is incorrect. |
|
|
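Two properties from the cards above, that r has no units and that −1 ≤ r ≤ 1, can be checked numerically. The planting rates and yields below are invented, and the conversion factor is only an illustrative change of units. `pearson_r` is a hand-written helper, not a library function.

```python
def pearson_r(xs, ys):
    """Pearson correlation computed from sums of squared deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical planting rates and corn yields (in bushels).
planting_rate = [12, 16, 20, 24, 28]
yield_bushels = [130, 142, 148, 146, 135]

# Re-express yield in another unit by multiplying by a positive constant.
yield_other_unit = [y * 25.4 for y in yield_bushels]

r_bushels = pearson_r(planting_rate, yield_bushels)
r_other = pearson_r(planting_rate, yield_other_unit)
print(r_bushels, r_other)  # the same value up to rounding, with no units
```

Because the unit of measurement cancels in the ratio, reporting "r = 0.27 bushel" mixes a unitless quantity with a unit.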
Term
Identify the population.
A college has changed its core curriculum and wants to obtain detailed feedback information from the students during each of the first 12 weeks of the coming semester. Each week, a random sample of 5 students will be selected to be interviewed.
all current students, or perhaps all current students who were enrolled during the year prior to the change; the 5 students that are interviewed in each week; all colleges that changed their core curriculum; the feedback from students during each of the 12 weeks; the weeks that the interviews with the students are conducted |
|
Definition
all current students, or perhaps all current students who were enrolled during the year prior to the change |
|
|
Term
Identify the population.
The American Community Survey (ACS) will replace the census "long form" starting with the 2010 census. The main part of the ACS contacts 250,000 addresses by mail each month, with follow-up by phone and in person if there is no response. Each household answers questions about their housing, economic, and social status.
all households that are involved in the U.S. census; all households in North America; the 250,000 households surveyed; all U.S. households; the upper-class households in the 250,000 households surveyed |
|
Definition
all U.S. households |
|
|
Term
Identify the population.
An opinion poll contacts 1161 adults and asks them, "Which political party do you think has better ideas for leading the country in the twenty-first century?" |
|
Definition
adult residents of the U.S. |
|
|
Term
An opinion poll in California uses random digit dialing to choose telephone numbers at random. Numbers are selected separately within each California area code. The size of the sample in each area code is proportional to the population living there.
What is the name for this kind of sampling design?
multistage sample; simple random sample; voluntary response sample; convenience sample; stratified random sample |
|
Definition
stratified random sample |
|
|
Term
Ability to grow in shade may help pines in the dry forests of Arizona resist drought. How well do these pines grow in shade? Investigators planted pine seedlings in a greenhouse in either full light or light reduced to 5% of normal by shade cloth. At the end of the study, they dried the young trees and weighed them.
Identify the experimental unit(s) or subject(s), factor(s), treatment(s), and variable(s). |
|
Definition
Experimental units: pine tree seedlings. Factor: amount of light. Treatments: full light, or light reduced to 5% of normal. Response variable: dry weight at the end of the study. |
|
|
Term
The Bayer Aspirin Web site claims that "Nearly five decades of research now link aspirin to the prevention of stroke and heart attacks." The most important evidence for this claim comes from the Physicians' Health Study, a large medical experiment involving 22,000 male physicians. One group of about 11,000 physicians took an aspirin every second day, while the rest took a placebo. After several years the study found that subjects in the aspirin group had significantly fewer heart attacks than subjects in the placebo group.
Identify the experimental subjects, the factor and its levels, and the response variable. What does it mean to say that the aspirin group had "significantly fewer heart attacks"? |
|
Definition
Subjects: the physicians. Factor: the medication, with levels aspirin and placebo. Response variable: the incidence of heart attacks among participants. "Significantly fewer": the difference in the number of heart attacks between the two groups was so great that it would rarely occur by chance if aspirin had no effect. |
|
|
Term
An ultramarathon, as you might guess, is a footrace longer than the 26.2 miles of a marathon. Runners commonly develop respiratory infections after an ultramarathon. Will taking 600 milligrams of vitamin C daily reduce these infections? Researchers randomly assigned ultramarathon runners to receive either vitamin C or a placebo. Separately, they also randomly assigned these treatments to a group of nonrunners the same age as the runners. All subjects were watched for 14 days after the big race to see if infections developed.
What is the name for this experimental design?
randomized comparative design; correlational design; case study design; block design; matched pairs design |
|
Definition
block design |
|
|
Term
Name the study.
A study to compare two methods of preserving wood started with boards of southern white pine. Each board was ripped from end to end to form two edge-matched specimens. One was assigned to Method A, the other to Method B. |
|
Definition
|
|
Term
Name the study.
A survey on youth and smoking contacted by telephone 300 smokers and 300 nonsmokers, all 14 to 22 years of age. |
|
Definition
|
|
Term
Name the study.
Does air pollution induce DNA mutations in mice? Starting with 40 male and 40 female mice, 20 of each sex were housed in a polluted industrial area downwind from a steel mill. The other 20 of each sex were housed at an unpolluted rural location 30 kilometers away. |
|
Definition
|
|
Term
A researcher studying the effect of price promotions on consumers' expectations makes up two different histories of the store price of a hypothetical brand of laundry detergent for the past year. Students in a marketing course are randomly assigned to view one or the other price history on a computer. Some students see a steady price, while others see regular promotions that temporarily cut the price. Then the students are asked what price they would expect to pay for the detergent.
Is this study an experiment? Why?
What are the explanatory and response variables? |
|
Definition
Yes. Each subject is randomly assigned to a treatment.
The explanatory variable is price history. The response variable is expected price. |
|
|