Term
|
Definition
the collection, display, analysis, and interpretation of numerical information. |
|
|
Term
|
Definition
methods of collecting, organizing, summarizing, and presenting data in an informative way. |
|
|
Term
|
Definition
methods used to interpret a sample and to draw conclusions about a population on the basis of that sample. |
|
|
Term
|
Definition
characteristic or attribute that can assume different values |
|
|
Term
|
Definition
values that variables can assume |
|
|
Term
|
Definition
collection of data values |
|
|
Term
|
Definition
variables whose values are determined by chance |
|
|
Term
|
Definition
the entire group of items that interests us |
|
|
Term
|
Definition
part of the population we actually observe |
|
|
Term
|
Definition
using the sample to draw conclusions about the characteristics of the population from which the sample came. Also involves making predictions about certain populations on the basis of studying the past and present. Uses probability theory to judge whether the sample is a good representation of the population. |
|
|
Term
|
Definition
used to look at the relation between the sample and the population. Needed because there is an element of chance in selecting a sample. Tells how likely it is that the sample proportions will be close to the population proportion, and lets us decide how confident we can be when claiming a relationship between 2 variables. |
|
|
Term
|
Definition
gender, brand of car, employment status, eye color, etc. |
|
|
Term
|
Definition
Discrete: assumes values that can be counted, e.g. children in a family, successful pitches. Continuous: assumes an infinite # of values in an interval between 2 boundaries; obtained by measuring and rounded to the nearest unit, e.g. weight, height, time, temperature. |
|
|
Term
measurement scales: nominal |
|
Definition
e.g. zip code, gender, eye color, nationality. No ranking or order. |
|
|
Term
measurement scales: ordinal |
|
Definition
e.g. judging, grade, rating scale, preference. Ranking, but no precise differences. |
|
|
Term
measurement scales: interval |
|
Definition
e.g. temperature, year/date on calendar. Precise differences and ranking, but no "true" zero (zero is arbitrary). |
|
|
Term
measurement scales: ratio-level |
|
Definition
e.g. height, weight, time, salary, age. There is a "true" zero. |
|
|
Term
|
Definition
aim to isolate one factor (independent variable) and to see if it has any effects on the dependent variable. Must conduct experiments under controlled conditions (control group/treatment group) |
|
|
Term
|
Definition
people's behavior and performance change following any new or increased attention |
|
|
Term
|
Definition
the effect on patient outcomes (improved or worsened) that may occur due to the expectation by a patient (or provider) that a particular intervention will have an effect |
|
|
Term
sampling techniques: random sampling |
|
Definition
subjects are selected by random numbers (computer generated or from random tables) |
|
|
Term
sampling techniques: systematic sampling |
|
Definition
subjects are selected by using every kth number after the first subject is randomly selected from 1 to k |
|
|
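A minimal Python sketch of the systematic sampling card above. The function name `systematic_sample`, the `rng` parameter, and the population of 100 numbered subjects are assumptions made for illustration.

```python
import random

def systematic_sample(population, k, rng=None):
    """Pick a random start among the first k subjects, then take every kth one."""
    rng = rng or random.Random()
    start = rng.randint(1, k)          # random position from 1 to k, inclusive
    # Convert the 1-based position to a 0-based index and step by k.
    return population[start - 1::k]

subjects = list(range(1, 101))         # hypothetical population of 100 subjects
sample = systematic_sample(subjects, k=10, rng=random.Random(42))
```

With 100 subjects and k = 10, the sample always has 10 members spaced exactly 10 apart; only the starting point is left to chance.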
Term
sampling techniques: stratified sampling |
|
Definition
subjects are selected by dividing up the population into groups (strata) and subjects w/n those groups are randomly selected |
|
|
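A possible Python sketch of the stratified sampling card above. The helper `stratified_sample`, its parameters, and the student data are hypothetical; the card does not say how many subjects to draw per stratum, so a fixed count is assumed here.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, per_stratum, rng=None):
    """Group subjects into strata, then randomly select within each stratum."""
    rng = rng or random.Random()
    strata = defaultdict(list)
    for subject in population:
        strata[stratum_of(subject)].append(subject)
    sample = []
    for members in strata.values():
        # Never ask for more subjects than the stratum contains.
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

# Made-up population: 30 freshmen and 20 seniors.
students = [("freshman", i) for i in range(30)] + [("senior", i) for i in range(20)]
picked = stratified_sample(students, stratum_of=lambda s: s[0],
                           per_stratum=5, rng=random.Random(0))
```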
Term
sampling techniques: cluster sampling |
|
Definition
subjects are selected by using an intact group that is representative of the population. |
|
|
Term
|
Definition
- unrepresentative samples (small, convenience samples)
- misleading representations of data (1% of pop. vs. 2,000,000 ppl)
- detached statistics (y brand has twice as much x)
- implied connections (studies suggest that using this will....)
- framing questions
|
|
|
Term
|
Definition
interested in how data is distributed |
|
|
Term
|
Definition
data divided into classes to measure frequency. can be used for qualitative or quantitative data |
|
|
Term
|
Definition
....of a class is the # of data values that fall into that specific class. |
|
|
Term
categorical frequency distribution |
|
Definition
used for data that can be placed in specific categories, such as nominal or ordinal-level data. |
|
|
Term
quantitative frequency distributions |
|
Definition
used for quantitative variables. Each class has a lower and upper limit. When the range of the data is large, we group data into classes that are more than 1 unit wide (a grouped frequency distribution). |
|
|
Term
|
Definition
add lower and upper boundaries and divide by 2 |
|
|
Term
|
Definition
- should be 5-20 classes
- class width should preferably be an odd #
- classes must be mutually exclusive (non-overlapping class limits)
- classes must be continuous
- classes must accommodate all data
- classes must be equal in width
|
|
|
Term
|
Definition
displays the frequencies of classes in the form of contiguous vertical bars of various heights (for continuous variables) |
|
|
Term
|
Definition
displays data by using lines that connect points plotted for the frequencies at the midpoints of the classes. |
|
|
Term
|
Definition
represents the cumulative frequencies for the classes in a frequency distribution |
|
|
Term
relative frequency histogram |
|
Definition
represents the relative frequencies for the classes w/ vertical bars |
|
|
Term
|
Definition
uses proportions instead of raw data as frequencies. divide the frequency of each class by the total of all the frequencies. |
|
|
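The relative frequency calculation described above can be sketched in Python as follows. The blood-type data are made up for illustration.

```python
from collections import Counter

blood_types = ["A", "B", "O", "O", "A", "O", "AB", "O", "A", "B"]  # made-up data

freq = Counter(blood_types)        # frequency of each class
total = sum(freq.values())         # total of all the frequencies
# Relative frequency = class frequency / total of the frequencies.
rel_freq = {category: count / total for category, count in freq.items()}
```

The relative frequencies always add up to 1, which is a quick check on the arithmetic.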
Term
|
Definition
a circle that is divided into sections according to the percentage of the frequencies in each category of the distribution. |
|
|
Term
|
Definition
represents data that occur over a specific period of time |
|
|
Term
|
Definition
represents a frequency distribution for a categorical variable. the frequencies are displayed by the heights of vertical bars, which are arranged in order from highest to lowest. |
|
|
Term
graphical integrity requires that... |
|
Definition
- design does not change inappropriately--shows data variation (not design)
- irregular use of scale is not used to misrepresent
- must not quote data out of context
- clear, detailed, and thorough labeling
- w/ money in a time series, deflated or standardized units of measurement better
|
|
|
Term
|
Definition
a characteristic or measure obtained by using all the data values from a sample |
|
|
Term
|
Definition
a characteristic or measure obtained by using all the data values from a population |
|
|
Term
|
Definition
the mean is the balance point of the data: the total distance from the mean of the observations above the mean equals the total distance of the observations below the mean |
|
|
Term
|
Definition
the half-way point or the middle value when the data is arranged in numerical order from the smallest to largest value. |
|
|
Term
|
Definition
the most common value (occurs the most in the data set). usually used for categorical data |
|
|
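The three measures of center defined in the preceding cards (balance point, middle value, most common value) can be computed with Python's standard `statistics` module. The scores below are made up; the last two lines check the balance-point property.

```python
import statistics

scores = [2, 3, 3, 5, 7]                 # made-up data, already in order

mean = statistics.mean(scores)           # balance point
median = statistics.median(scores)       # middle value of the ordered data
mode = statistics.mode(scores)           # value that occurs the most

# Balance point: total distance above the mean equals total distance below it.
above = sum(x - mean for x in scores if x > mean)
below = sum(mean - x for x in scores if x < mean)
```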
Term
|
Definition
highest value - lowest value |
|
|
Term
|
Definition
standard deviation divided by the mean. allows us to compare the standard deviations of 2 different variables, even when they are measured in different units. |
|
|
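A short Python sketch of the ratio described above. The function name `coefficient_of_variation` and both data sets are hypothetical, and the sample standard deviation (`statistics.stdev`) is assumed, since the card does not say whether the data are a sample or a population.

```python
import statistics

def coefficient_of_variation(data):
    """Standard deviation divided by the mean (sample s.d. assumed)."""
    return statistics.stdev(data) / statistics.mean(data)

heights_cm = [160, 170, 180]     # made-up data in centimetres
weights_kg = [50, 70, 90]        # made-up data in kilograms

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)
```

Because the units cancel in the ratio, the two results can be compared directly; here the weights vary more relative to their mean than the heights do.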
Term
|
Definition
for normal distributions:
- approx. 68% of data values will fall w/n 1 s.d. of the mean
- approx. 95% of data values will fall w/n 2 s.d. of the mean
- approx. 99.7% of data values will fall w/n 3 s.d. of the mean
|
|
|
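The three percentages above can be checked against the standard normal curve with `statistics.NormalDist` (Python 3.8+). The helper name `share_within` is an assumption for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Share of a normal distribution lying within k s.d. of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}
# shares[1], shares[2], shares[3] are roughly 0.68, 0.95, and 0.997
```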
Term
|
Definition
represents the number of standard deviations that a data value falls above or below the mean. used to compare position of different raw data. |
|
|
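As a minimal sketch of the card above, the standard score is the distance from the mean divided by the standard deviation. The function name `z_score` and the test scores are made up for illustration.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / std_dev

# Hypothetical comparison of two raw scores from different tests.
math_z = z_score(85, mean=75, std_dev=5)    # 2.0
art_z = z_score(90, mean=85, std_dev=10)    # 0.5
```

Although 90 is the higher raw score, the 85 is the stronger result relative to its own class because it sits further above its mean.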
Term
|
Definition
divides the data into 100 equal groups and indicates the position of an individual in a group. NOT THE SAME AS PERCENTAGES! gives position w/ respect to rest of data. |
|
|
Term
|
Definition
divide the distribution into 4 groups separated by Q1 (corresponding to 25th percentile), Q2 (50th percentile, the median), Q3 (75th percentile)
- interquartile range: the difference between Q3 and Q1 (Q3 - Q1); it is the range of the middle 50% of the data
|
|
|
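A hedged Python sketch of Q1, Q2, Q3, and the interquartile range using `statistics.quantiles` (Python 3.8+). The data are made up, and the function's default "exclusive" method is only one of several conventions; a textbook may use a slightly different rule and get slightly different cut points.

```python
import statistics

data = [2, 4, 6, 8, 10, 12, 14]          # made-up data

# n=4 returns the three cut points that split the data into 4 groups.
q1, q2, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                            # range of the middle 50% of the data
```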
Term
|
Definition
divide the distribution into 5 groups |
|
|
Term
|
Definition
divide distribution into 10 groups |
|
|
Term
|
Definition
the chance of an event occurring. can be expressed as fractions, decimals, or percentages. |
|
|
Term
|
Definition
a chance process that leads to well-defined results called outcomes. |
|
|
Term
|
Definition
the result of a single trial of a probability experiment. |
|
|
Term
|
Definition
set of all possible outcomes of a probability experiment. |
|
|
Term
|
Definition
consists of a set of outcomes of a probability experiment.
- simple event: an event w/ 1 outcome
- compound event: an event that consists of more than one outcome
|
|
|
Term
Classical/Theoretical Probability |
|
Definition
uses sample spaces to determine the numerical probability that an event will happen. It assumes that all outcomes in the sample space are equally likely to occur (have the same prob. of occurring).
- 0 ≤ P(E) ≤ 1
- if event can't occur, P(E)=0
- if event is certain, P(E)=1
- sum of all probabilities of the outcomes = 1
|
|
|
Term
|
Definition
the set of outcomes in the sample space that are not included in the outcomes of the event E. Its probability is 1 - P(E). |
|
|
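A small Python sketch tying together classical probability and the complement rule, assuming one roll of a fair six-sided die (an example not taken from the cards). The function name `classical_probability` is hypothetical.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}              # one roll of a fair die
event = {x for x in sample_space if x > 4}     # E = "roll a 5 or a 6"

def classical_probability(event, sample_space):
    """Outcomes in the event / outcomes in the sample space (equally likely)."""
    return Fraction(len(event), len(sample_space))

p_event = classical_probability(event, sample_space)
complement = sample_space - event              # outcomes not in E
p_complement = classical_probability(complement, sample_space)
```

Here P(E) = 1/3 and the complement has probability 2/3, which matches 1 - P(E).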
Term
|
Definition
relies on actual experience to determine the likelihood of outcomes. based on observation. |
|
|
Term
|
Definition
if the empirical prob. of getting tails is computed by using a small number of trials, it is usually not exactly 1/2. However, as the number of trials increases, the empirical probability of getting a tail will approach the theoretical probability of 1/2. |
|
|
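The coin-toss behavior described above can be simulated in Python. This is a sketch only: the function name `empirical_tails`, the seed, and the trial counts are arbitrary choices, and the exact proportions depend on the random draws.

```python
import random

def empirical_tails(n_trials, rng):
    """Proportion of tails observed in n_trials simulated fair coin tosses."""
    tails = sum(rng.random() < 0.5 for _ in range(n_trials))
    return tails / n_trials

rng = random.Random(2024)                # fixed seed so the run is repeatable
small = empirical_tails(10, rng)         # may be noticeably off from 1/2
large = empirical_tails(100_000, rng)    # should be very close to 1/2
```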
Term
|
Definition
personal or subjective definitions of probability assert probabilities on the basis of personal beliefs concerning the particular situation. if logically consistent, it provides a functional basis for prob. theory |
|
|
Term
|
Definition
two events are mutually exclusive if they cannot occur at the same time (no outcomes in common) |
|
|
Term
|
Definition
used to find the probability of two or more events that occur in sequence. |
|
|
Term
|
Definition
two events A and B are independent if the occurrence of A does not affect the probability of B occurring. |
|
|
Term
|
Definition
when the outcome or occurrence of the first event affects the outcome or the occurrence of the second event in such a way that the probability of the second event is changed. |
|
|
Term
|
Definition
of an event B in relationship to an event A is the probability that event B occurs after event A has already occurred. Written P(B|A). |
|
|
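A minimal Python sketch of P(B|A), using the standard relationship P(B|A) = P(A and B) / P(A), which the card itself does not state. The die example and the event names are assumptions for illustration.

```python
from fractions import Fraction

sample_space = set(range(1, 7))                  # one roll of a fair die
A = {x for x in sample_space if x % 2 == 0}      # A = "roll is even"
B = {x for x in sample_space if x > 3}           # B = "roll is greater than 3"

def probability(event):
    return Fraction(len(event), len(sample_space))

# Probability of B once we know A has occurred.
p_b_given_a = probability(A & B) / probability(A)
```

Since P(B|A) = 2/3 differs from P(B) = 1/2, knowing that A occurred changes the probability of B, so these two events are dependent.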
Term
|
Definition
used when we want to know the number of all possible outcomes for a sequence of events. |
|
|
Term
fundamental counting rule |
|
Definition
in a sequence of n events in which the first one has k1 possibilities of outcomes, and the second event has k2 possibilities, and the third has k3 possibilities, and so forth, the total number of possibilities of the sequence will be k1 · k2 · k3 · ... · kn |
|
|
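The counting rule above can be checked in Python by listing every possible sequence. The outfit example (3 shirts, 2 pants, 2 shoes) is made up for illustration.

```python
import math
from itertools import product

shirts = ["red", "blue", "green"]     # k1 = 3 possibilities
pants = ["jeans", "khakis"]           # k2 = 2 possibilities
shoes = ["boots", "sneakers"]         # k3 = 2 possibilities

# Fundamental counting rule: multiply the number of possibilities per event.
total = math.prod(len(options) for options in (shirts, pants, shoes))

# Cross-check by actually enumerating every sequence of choices.
outfits = list(product(shirts, pants, shoes))
```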
Term
|
Definition
for any counting number n, n! = n(n-1)(n-2)...1. By definition, 0! = 1. |
|
|
Term
|
Definition
an arrangement of n objects in a specific order. the arrangement of n objects in a specific order using r objects at a time is written nPr. nPr = n! / (n-r)! |
|
|
Term
|
Definition
a selection of distinct objects from n objects without regard to order. the number of combinations of r objects selected from n objects is denoted by nCr. nCr = n! / ((n-r)! r!) |
|
|
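A short Python check of the factorial, nPr, and nCr formulas from the preceding cards, using `math.perm` and `math.comb` (Python 3.8+). The values n = 5 and r = 2 are arbitrary.

```python
from math import comb, factorial, perm

n, r = 5, 2

# Formulas as written on the cards.
n_p_r = factorial(n) // factorial(n - r)                    # order matters
n_c_r = factorial(n) // (factorial(n - r) * factorial(r))   # order ignored

# The standard library gives the same counts directly.
same_perm = perm(n, r)
same_comb = comb(n, r)
```

Each combination of 2 objects can be ordered in 2! = 2 ways, which is why nPr is twice nCr here.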
Term
discrete probability distribution |
|
Definition
consists of values a random variable can assume and the corresponding probabilities of the values. probabilities are determined theoretically or by observation. |
|
|
Term
|
Definition
the expected value of a discrete random variable of a prob. distribution is the theoretical average (Mean) of the variable. |
|
|
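A minimal Python sketch of the theoretical average described above, computed as the sum of each value times its probability. The fair-die distribution and the function name `expected_value` are assumptions for illustration.

```python
from fractions import Fraction

# Discrete probability distribution of one roll of a fair die: value -> probability.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def expected_value(distribution):
    """Sum of value * probability over every value the variable can assume."""
    return sum(value * prob for value, prob in distribution.items())

mean_roll = expected_value(die)    # 7/2, i.e. 3.5
```

The expected value of 3.5 is not a face the die can show; it is the long-run average over many rolls.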
Term
|
Definition
a probability experiment whose outcomes are two or could be reduced to two. used to analyze categorical (ordinal or nominal) data. appropriate w/ sampling w/ replacement and w/o.
4 requirements:
- must be a fixed number n of (Bernoulli) trials
- each trial has 2 categories of outcomes (success or failure), not necessarily equally likely
- prob of success p is the same for each trial
- trials are independent.
|
|
|
Term
|
Definition
in n binomial trials with probability p of success in each trial, the prob. of exactly x successes is P(x successes) = [n! / (x!(n-x)!)] · p^x · q^(n-x), where q = 1 - p is the prob. of failure |
|
|
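The binomial formula above can be sketched in Python with `math.comb` (Python 3.8+) supplying the n! / (x!(n-x)!) factor. The function name `binomial_probability` and the coin-toss example are assumptions for illustration.

```python
from math import comb

def binomial_probability(n, x, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p                              # probability of failure
    return comb(n, x) * p**x * q**(n - x)

# Exactly 2 heads in 3 tosses of a fair coin: 3 * 0.25 * 0.5
p_two_heads = binomial_probability(n=3, x=2, p=0.5)

# Probabilities over every possible number of successes add up to 1.
total = sum(binomial_probability(3, x, 0.5) for x in range(4))
```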
Term
probability histogram/population histogram/probability distribution |
|
Definition
graph of complete set of probabilities. data and probability histograms are related by the fact that in the long run, the observed frequencies will be very close to the theoretical probabilities. describes what our data would look like if the experiment were repeated innumerable times. |
|
|
Term
major characteristics of normal probability distribution |
|
Definition
- bell-shaped; mean, median, and mode are all equal and located at center of distribution
- symmetrical about the mean
- curve is continuous w/ no gaps or holes, for each x there is a y
- falls smoothly from mean in either direction. it is asymptotic (curve gets closer and closer to x-axis but never touches it)
- location of normal distribution is determined by the mean. the dispersion is determined by the standard deviation.
- empirical normal rule applies
- the prob that the value of x will be in a specified interval is shown by the corresponding area under a probability density curve (total area=1)
|
|
|