Term
data |
|
Definition
Data consist of information coming from observations, counts, measurements or responses. |
|
|
Term
statistics |
|
Definition
Statistics is the science of collecting, organizing, analyzing, and interpreting data in order to make decisions.
(Notice the "s" on the end: "statistics," not "statistic.") |
|
|
Term
population |
|
Definition
A population is the collection of all outcomes, responses, measurements, or counts that are of interest.
(Your sample might include 50 students at a high school, but the population you are interested in is all the students at the high school.) |
|
|
Term
sample |
|
Definition
A sample is a subset of a population. |
|
|
Term
parameter |
|
Definition
A Parameter is a numerical description of a Population characteristic.
(like mean, median, mode, range, variance, standard deviation, proportion ...) |
|
|
Term
statistic |
|
Definition
A Statistic is a numerical description of a Sample characteristic.
(like mean, median, mode, range, variance, standard deviation, proportion ...) |
|
|
Term
descriptive statistics |
|
Definition
Descriptive statistics is the branch of statistics that involves the organization, summarization, and display of data. |
|
|
Term
inferential statistics |
|
Definition
Inferential statistics is the branch of statistics that involves using a sample to draw conclusions about a population. A basic tool in the study of inferential statistics is probability. |
|
|
Term
qualitative data |
|
Definition
Qualitative data consist of attributes, labels, or nonnumerical entries.
(Latin qualis, "of what kind") |
|
|
Term
quantitative data |
|
Definition
Quantitative data consist of numerical measurements or counts.
(Latin quantus, "how many") |
|
|
Term
nominal level of measurement |
|
Definition
- qualitative only
- categorized using names, labels or qualities
- cannot be put in order
- cannot make computations with data
(Latin nomen, "name") |
|
|
Term
ordinal level of measurement |
|
Definition
- qualitative or quantitative
- can be arranged in order
(Latin ordo, "order") |
|
|
Term
interval level of measurement |
|
Definition
- quantitative only
- can categorize into ranges (1990–1993, 1994–1997, etc.)
- can calculate meaningful differences between data entries (You can subtract 'em from each other: Someone born in 1975 is three years younger than someone born in 1972 because 1975 – 1972 = 3.)
(Latin inter, "between") |
|
|
Term
ratio level of measurement |
|
Definition
- quantitative only
- can categorize into ranges (.5 to .74 grams, .75 to .99 grams, etc.)
- can calculate meaningful differences between data entries (You can subtract 'em from each other.)
- can divide one data entry by another and get a meaningful result
(Latin ratio, "calculation") |
|
|
Term
blinding |
|
Definition
a technique where the subject does not know whether he or she is receiving a treatment or a placebo |
|
|
Term
double-blind experiment |
|
Definition
an experiment in which neither the subject nor the experimenter knows if the subject is receiving a treatment |
|
|
Term
randomization |
|
Definition
the process of randomly assigning subjects to different treatment groups |
|
|
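The random assignment described on the card above can be sketched in Python. This is only an illustration: the function name `randomize`, the group labels, and the shuffle-then-deal approach are assumptions, not something the cards prescribe.

```python
import random

def randomize(subjects, groups, seed=None):
    """Shuffle the subjects, then deal them round-robin into the groups.

    `seed` is optional and only makes the illustration repeatable.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)      # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for i, subject in enumerate(shuffled):
        assignment[groups[i % len(groups)]].append(subject)
    return assignment

# Hypothetical subject IDs and group names, for illustration only.
subjects = [f"subject_{i}" for i in range(1, 11)]
assignment = randomize(subjects, ["treatment", "placebo"], seed=1)
for group, members in assignment.items():
    print(group, members)
```

Dealing round-robin after the shuffle keeps the group sizes as equal as possible, which is one common choice; other assignment schemes are equally valid.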
Term
replication |
|
Definition
the repetition of an experiment using a large group of subjects |
|
|
Term
confounding variable |
|
Definition
a variable other than the variable under study, whose change might interfere with the experimenter's ability to isolate the effects of the variable under study |
|
|
Term
measure of central tendency |
|
Definition
A measure of central tendency is a value that represents a typical, or central, entry of a data set.
The mean, median and mode of a data set are measures of central tendency. |
|
|
Term
mean |
|
Definition
The mean of a data set is the sum of the data entries divided by the number of entries. The mean is not resistant to outlying values.
population mean: μ=Σx/N
sample mean: x bar = Σx/n |
|
|
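A small Python sketch of the mean formula above, using made-up scores, shows why the mean is not resistant to outlying values. The helper name `mean` and the data are assumptions for illustration.

```python
def mean(values):
    """Sum of the data entries divided by the number of entries."""
    return sum(values) / len(values)

scores = [70, 72, 74, 76, 78]      # made-up data
with_outlier = scores + [200]      # same data plus one outlying value

print(mean(scores))        # 74.0
print(mean(with_outlier))  # 95.0
```

One extreme entry pulls the mean from 74 to 95, even though five of the six values are below 80.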
Term
median |
|
Definition
The median of a data set is the value that lies in the middle of the data when the data set is ordered.
The median is resistant to outlying values. |
|
|
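The median can be sketched the same way. The card does not say what to do when the number of entries is even; the version below assumes the usual convention of averaging the two middle entries. The data are made up.

```python
def median(values):
    """Middle value of the ordered data (mean of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

scores = [70, 72, 74, 76, 78]      # made-up data
with_outlier = scores + [200]      # same data plus one outlying value

print(median(scores))        # 74
print(median(with_outlier))  # 75.0
```

Adding the outlying value 200 moves the median only from 74 to 75, which is what "resistant" means here.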
Term
mode |
|
Definition
The mode of a data set is the data entry that occurs with the greatest frequency. If no data entry is repeated, the data set has no mode. If two entries occur with the same greatest frequency, each entry is a mode and the data set is called bimodal.
For qualitative data, the mode is the only measure of central tendency. |
|
|
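The three cases in the mode definition (no mode, one mode, bimodal) can be sketched as follows. The function name `modes` and the choice to return a list are assumptions made for this illustration.

```python
from collections import Counter

def modes(values):
    """Return every entry that occurs with the greatest frequency.

    Returns an empty list when no entry is repeated (no mode).
    """
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:
        return []                  # no entry repeats, so there is no mode
    return [value for value, count in counts.items() if count == top]

print(modes([1, 2, 3]))                # []
print(modes([1, 2, 2, 3, 3]))          # [2, 3]  (bimodal)
print(modes(["red", "blue", "red"]))   # ['red'] (works for qualitative data)
```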
Term
symmetric distribution |
|
Definition
A frequency distribution is symmetric when a vertical line can be drawn through the middle of a graph of the distribution and the resulting halves are approximately mirror images. |
|
|
Term
uniform distribution |
|
Definition
A frequency distribution is uniform when all entries, or classes, in the distribution have equal or approximately equal frequencies. A uniform distribution is also symmetric. |
|
|
Term
skewed distribution |
|
Definition
A frequency distribution is skewed if the "tail" of the graph elongates more to one side than to the other.
A distribution is skewed left (negatively skewed) if the tail extends to the left. A distribution is skewed right (positively skewed) if the tail extends to the right. |
|
|
Term
measure of variation |
|
Definition
A measure of variation is a numerical description of how "spread out" a data set is.
Range, variance and standard deviation are measures of variation. |
|
|
Term
range |
|
Definition
The range of a data set is the difference between the maximum and minimum data entries in the set.
range = (maximum entry) – (minimum entry) |
|
|
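As a quick check of the range formula, here is a minimal Python sketch with made-up data. The name `data_range` is an assumption, chosen to avoid shadowing Python's built-in `range`.

```python
def data_range(values):
    """(maximum entry) - (minimum entry)."""
    return max(values) - min(values)

data = [3, 7, 8, 12, 15]   # made-up data
print(data_range(data))    # 12
```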
Term
deviation |
|
Definition
The deviation of an entry x in a data set is the difference between the entry and the mean of the data set.
population: deviation of x = x – μ
sample: deviation of x = x – x bar |
|
|
Term
squared deviation (expression) |
|
Definition
population: (x – μ)²
sample: (x – x bar)² |
|
|
Term
sum of squared deviations (expression) |
|
Definition
population: Σ(x – μ)²
sample: Σ(x – x bar)² |
|
|
Term
variance (equations) |
|
Definition
population: σ² = Σ(x – μ)²/N
sample: s² = Σ(x – x bar)²/(n-1) |
|
|
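The chain from deviation to squared deviation to sum of squared deviations to variance can be traced step by step in Python. This is a sketch with made-up data; the function names are assumptions. It treats the same list once as a whole population (divide by N) and once as a sample (divide by n - 1) purely to show the difference between the two formulas.

```python
def deviations(values):
    """Each entry minus the mean of the data set."""
    m = sum(values) / len(values)
    return [x - m for x in values]

def sum_of_squared_deviations(values):
    return sum(d ** 2 for d in deviations(values))

def population_variance(values):
    """Sum of squared deviations divided by N."""
    return sum_of_squared_deviations(values) / len(values)

def sample_variance(values):
    """Sum of squared deviations divided by n - 1."""
    return sum_of_squared_deviations(values) / (len(values) - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]            # made-up data, mean = 5
print(deviations(data))                    # [-3.0, -1.0, -1.0, -1.0, 0.0, 0.0, 2.0, 4.0]
print(sum_of_squared_deviations(data))     # 32.0
print(population_variance(data))           # 4.0
print(sample_variance(data))               # 32/7, about 4.57
```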
Term
standard deviation (equations) |
|
Definition
population: σ = √(Σ(x – μ)²/N)
sample: s = √(Σ(x – x bar)²/(n-1)) |
|
|
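The two standard deviation equations differ only in the divisor, which a short Python sketch makes visible. The function names and data are assumptions for illustration.

```python
import math

def population_std(values):
    """Square root of the sum of squared deviations divided by N."""
    mu = sum(values) / len(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))

def sample_std(values):
    """Square root of the sum of squared deviations divided by n - 1."""
    x_bar = sum(values) / len(values)
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (len(values) - 1))

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up data, mean = 5
print(population_std(data))       # 2.0
print(sample_std(data))           # about 2.14
```

The sample version is always slightly larger than the population version for the same data, because it divides by n - 1 instead of N.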
Term
measure of position |
|
Definition
A measure of position specifies where in a data set an entry is located.
Quartiles, percentiles and z-scores are measures of position. |
|
|
Term
five-number summary |
|
Definition
The five-number summary of a data set refers to the minimum entry, the first quartile (Q1), the median (Q2), the third quartile (Q3), and the maximum entry. |
|
|
Term
minimum |
|
Definition
The minimum is the smallest value in a data set. |
|
|
Term
maximum |
|
Definition
The maximum is the largest value in a data set. |
|
|
Term
quartile |
|
Definition
A quartile is one of three numbers (the first, second and third quartiles, Q1, Q2 and Q3) that divide a data set into four approximately equal parts. 75% of the values in a data set fall below the third quartile. |
|
|
Term
percentile |
|
Definition
A percentile is one of 99 numbers (P1 to P99) that divide a data set into 100 roughly equal parts. 99% of the data values fall below the 99th percentile. |
|
|
Term
interquartile range (IQR) |
|
Definition
The interquartile range (IQR) is the difference between the third and first quartiles.
IQR = Q3 – Q1 |
|
|
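The five-number summary and the IQR can be sketched together in Python. Quartile conventions vary between textbooks and software; the version below assumes the "median of each half" method (Q1 is the median of the lower half, Q3 is the median of the upper half, and the overall median is left out of both halves when the number of entries is odd). The function names and data are made up for illustration.

```python
def median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum); needs at least two entries."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Skip the middle entry when n is odd so it belongs to neither half.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

data = [7, 15, 36, 39, 40, 41, 42, 43, 47, 49]   # made-up data
minimum, q1, q2, q3, maximum = five_number_summary(data)
iqr = q3 - q1

print(minimum, q1, q2, q3, maximum)   # 7 36 40.5 43 49
print(iqr)                            # 7
```

Other tools may report slightly different quartiles for the same data because they interpolate between entries; the idea of splitting the ordered data into four roughly equal parts is the same.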
Term
z-score (or standard score) |
|
Definition
The z-score represents the number of standard deviations a given value falls from the mean.
z = (x – μ)/σ |
|
|
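The z-score formula translates directly into Python. The function name and the numbers below (a mean of 100 and a standard deviation of 15) are made up for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

print(z_score(130, 100, 15))   # 2.0  (two standard deviations above the mean)
print(z_score(85, 100, 15))    # -1.0 (one standard deviation below the mean)
print(z_score(100, 100, 15))   # 0.0  (exactly at the mean)
```

A positive z-score means the value is above the mean, a negative one means it is below.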