Term
|
Definition
Rows. The individual items for which we record several different measurements. |
|
|
Term
|
Definition
Columns. The measurements (variables) about these individual items. |
|
|
Term
|
Definition
|
|
Term
|
Definition
The measurements recorded about each individual item, shown in the columns of the data table. They identify what has been measured for each item. |
|
|
Term
|
Definition
Records how much of something was measured and tells us exactly how far apart two individual items are. Examples: height, weight, salary, score, distance, time, GPA. These values have units. |
|
|
Term
|
Definition
It is not possible to specify exactly how far apart two individuals are. Examples: gender, race, nationality, hair color, student ID, university major, phone number, zip code, yes/no answers. |
|
|
Term
|
Definition
A unique identification assigned to each individual or item. It is listed in the first column of the data table and could be a name or an alphanumeric code. |
|
|
Term
|
Definition
Data that consist of the same item measured repeatedly over time. Examples: daily stock prices for the past 10 years, or the GDP of the US for the past 30 years. |
|
|
Term
|
Definition
Data that are measured at the same point in time. Examples: the GDP of all the European countries for 2013, or yesterday's closing price of all the stocks in the Nasdaq 100. |
|
|
Term
|
Definition
Sampling methods that over- or underemphasize certain characteristics of the population. The resulting sample differs systematically from the population it is trying to represent. A larger sample size does not correct the bias, since the underlying method has a systematic problem. |
|
|
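A minimal simulation sketch of why a larger sample does not correct a biased method. The population values and the "only above-median individuals can be reached" flaw are assumptions invented for illustration.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 individuals with values 1..10000.
population = list(range(1, 10001))
true_mean = mean(population)  # the population value we wish we knew

# Assumed flaw: the method can only reach individuals with values above 5000.
reachable = [x for x in population if x > 5000]

estimates = {}
for n in (100, 1000, 5000):
    # Every sample size draws from the same flawed list.
    estimates[n] = mean(rng.sample(reachable, n))
    print(n, round(estimates[n], 1), "vs true mean", true_mean)
```

Under these assumptions every estimate stays far above the true mean, no matter how large the sample is.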
Term
|
Definition
A “sample” that includes the entire population. Attempting a census is often less accurate than taking a sample. |
|
|
Term
|
Definition
The true, exact value in the population that we wish we knew. Examples: the true average salary of Terry alumni, the true average hours per week worked by accountants, or the true market share of cell-phone providers among US customers. |
|
|
Term
|
Definition
Any number computed from the sample. |
|
|
Term
|
Definition
All the objects are put onto one big list, and individual objects are selected completely at random from this list. Every individual has the same chance of being selected; nobody has a higher or lower priority because of certain characteristics. |
|
|
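A short sketch of drawing from one big list completely at random, using Python's standard `random` module. The frame of student labels and the helper name `simple_random_sample` are hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct individuals from the list, each equally likely."""
    rng = random.Random(seed)
    return rng.sample(list(frame), n)

# Hypothetical list of 100 individuals.
frame = [f"student_{i}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```

`random.Random.sample` selects without replacement, so no individual can appear twice.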
Term
|
Definition
The list of individuals from which the sample is drawn. |
|
|
Term
Stratified random sampling |
|
Definition
The population is divided into groups of similar individuals (strata), and a few individuals are randomly selected from every group. |
|
|
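One possible sketch of stratified sampling: split the population into groups, then draw a few individuals at random from every group. The class-year strata, the `stratified_sample` helper, and the fixed number per stratum are assumptions for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(population, stratum_of, per_stratum, seed=None):
    """Group individuals by stratum, then sample randomly within each group."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[stratum_of(person)].append(person)
    sample = []
    for members in strata.values():
        # Take at most per_stratum individuals from every stratum.
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

# Hypothetical population: 40 freshmen and 60 seniors.
population = [("freshman", i) for i in range(40)] + [("senior", i) for i in range(60)]
sample = stratified_sample(population, stratum_of=lambda p: p[0], per_stratum=5, seed=1)
print(sample)
```

Every stratum is guaranteed to appear in the sample, which a simple random sample cannot promise.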
Term
|
Definition
The population can be split into groups (clusters) such that each cluster is like a mini-population. Several clusters are chosen at random, and all objects within the chosen clusters are sampled. |
|
|
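A sketch of cluster sampling under the assumption that the clusters are already known. The dorm names and the `cluster_sample` helper are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters at random and keep every member of the chosen ones."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = [member for name in chosen for member in clusters[name]]
    return chosen, sample

# Hypothetical clusters: five dorms with 20 residents each.
clusters = {
    f"dorm_{d}": [f"dorm_{d}_resident_{i}" for i in range(20)]
    for d in "ABCDE"
}
chosen, sample = cluster_sample(clusters, 2, seed=1)
print(chosen, len(sample))
```

The randomness is applied to the clusters, not to the individuals inside them.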
Term
|
Definition
Created by selecting every 10th, 50th, 100th, etc. individual. The selection should start with a randomly selected individual. |
|
|
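A minimal sketch of selecting every k-th individual after a random start. The numbered frame and the `systematic_sample` helper are assumptions for illustration.

```python
import random

def systematic_sample(frame, k, seed=None):
    """Start at a random position among the first k, then take every k-th item."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # random starting index in 0..k-1
    return frame[start::k]

# Hypothetical list of 1000 numbered individuals; select every 100th.
frame = list(range(1, 1001))
sample = systematic_sample(frame, 100, seed=1)
print(sample)
```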
Term
|
Definition
The entire group you would like to study. |
|
|
Term
|
Definition
The list from which the sample will be chosen. |
|
|
Term
|
Definition
The people who respond and take part in the survey. |
|
|
Term
|
Definition
People who don’t respond differ in a systematic way from those who do respond. A good sample was initially chosen, but many of the chosen people can’t be reached or don’t respond. |
|
|
Term
|
Definition
The sample is entirely self-selected: people volunteer themselves to become part of the sample. |
|
|
Term
|
Definition
Certain characteristics of the population are entirely missing from the sample, or have a much smaller proportion than in the population. |
|
|
Term
|
Definition
Occurs when we ask individuals who are convenient without making any effort to randomize. For instance, asking the first 100 people you see rather than choosing a random sample. |
|
|
Term
|
Definition
Used for analyzing categorical data. It records the counts for each category. |
|
|
Term
|
Definition
Displays the percentages in each category rather than the counts. |
|
|
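A small sketch of building both tables from raw categorical answers: counts per category, then percentages per category. The survey answers are made up for illustration.

```python
from collections import Counter

# Hypothetical survey answers: each person's cell-phone provider.
answers = ["Verizon", "AT&T", "Verizon", "T-Mobile",
           "AT&T", "Verizon", "Other", "Verizon"]

counts = Counter(answers)            # counts for each category
total = sum(counts.values())
relative = {cat: 100 * c / total for cat, c in counts.items()}  # percentages

for cat, c in counts.most_common():
    print(f"{cat:10s} {c:3d} {relative[cat]:5.1f}%")
```

The percentages sum to 100 because each person gave exactly one answer.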
Term
|
Definition
Shows the counts for each category. A bar chart still works if people chose more than one answer or if there is a large number of categories. |
|
|
Term
|
Definition
Should be used when the focus is on percentages rather than actual counts, e.g. market share. It would be a problem if people chose more than one answer, since the percentages would then no longer sum to 100%. |
|
|
Term
|
Definition
Displays one variable in the columns and a second variable in the rows. |
|
|
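A sketch of a two-variable table of counts, with one variable in the rows and the other in the columns. The records (class year and car ownership) are hypothetical.

```python
from collections import Counter

# Hypothetical survey records: (class year, owns a car?)
records = [
    ("freshman", "yes"), ("freshman", "no"), ("freshman", "no"),
    ("senior", "yes"), ("senior", "yes"), ("senior", "no"),
    ("senior", "yes"),
]

table = Counter(records)  # count of each (row, column) combination
row_labels = sorted({row for row, _ in records})
col_labels = sorted({col for _, col in records})

# Print the column header, then one line per row category.
print(f"{'':10s}" + "".join(f"{c:>6s}" for c in col_labels))
for r in row_labels:
    print(f"{r:10s}" + "".join(f"{table[(r, c)]:6d}" for c in col_labels))
```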
Term
|
Definition
Peaks or humps seen in a histogram. One main peak is called unimodal, two peaks bimodal, and three or more multimodal. |
|
|
Term
|
Definition
Unlike a bar graph, there are no gaps between the bars. |
|
|
Term
|
Definition
All the bars are approximately the same height; there is no mode. |
|
|
Term
|
Definition
The halves on either side of the center look (approximately) like mirror images. |
|
|
Term
|
Definition
One tail stretches out longer than the other. The distribution is skewed to the side of the longer tail. |
|
|
Term
|
Definition
A good measure of center for unimodal, symmetric distributions. It is commonly referred to as the “average”. It is very sensitive to outliers and skewed data: in a skewed distribution it gets pulled towards the longer tail. |
|
|
Term
|
Definition
The value that splits the histogram into two equal pieces. It is "resistant" because it isn't affected by outliers. |
|
|
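A quick sketch of the difference in sensitivity between the two measures of center, using Python's `statistics` module. The salary figures (in thousands) are invented for illustration.

```python
from statistics import mean, median

# Hypothetical salaries in thousands of dollars.
salaries = [40, 42, 45, 47, 50]
with_outlier = salaries + [500]  # add one extreme value

print(mean(salaries), median(salaries))          # both near the middle
print(mean(with_outlier), median(with_outlier))  # only the mean jumps
```

A single extreme value drags the mean far toward the long tail, while the median barely moves.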
Term
|
Definition
The values that enclose the middle 50% of the data. 25% of the data fall below the first quartile, Q1, and 25% are above the third quartile, Q3. In order: Q1, median, Q3. |
|
|
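A sketch that checks the 25% / 50% / 25% split numerically. The data set is hypothetical, and the `method="inclusive"` setting of `statistics.quantiles` is only one of several quartile conventions; other software may give slightly different cut points.

```python
from statistics import quantiles

data = list(range(1, 101))  # hypothetical data: the values 1..100

# n=4 returns the three cut points Q1, median, Q3.
q1, med, q3 = quantiles(data, n=4, method="inclusive")

below_q1 = sum(x < q1 for x in data) / len(data)
above_q3 = sum(x > q3 for x in data) / len(data)
middle = sum(q1 <= x <= q3 for x in data) / len(data)

print(q1, med, q3)
print(below_q1, above_q3, middle)
```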
Term
|
Definition
|
|
Term
|
Definition
A measure of the average distance of points from the mean (center). |
|
|
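A step-by-step sketch of the calculation, assuming the common sample convention of dividing by n - 1 (some courses divide by n instead). The data values are made up.

```python
from math import sqrt
from statistics import mean, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements

center = mean(data)
# Squared distance of every point from the mean.
squared_devs = [(x - center) ** 2 for x in data]
# Average the squared distances (dividing by n - 1), then take the square root.
manual_sd = sqrt(sum(squared_devs) / (len(data) - 1))

print(center, manual_sd, stdev(data))  # stdev uses the same n - 1 convention
```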
Term
|
Definition
Use when the shape is skewed or when there are outliers. |
|
|
Term
Mean & Standard deviation |
|
Definition
Use when the shape is unimodal and symmetric and no outliers are present. |
|
|
Term
|
Definition
Consists of the minimum and maximum, Q1 and Q3, and the median. |
|
|
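A compact sketch that collects the five values in order. The helper name `five_number_summary` and the data are hypothetical, and the quartiles again assume the `inclusive` convention of `statistics.quantiles`.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    return (min(data), q1, med, q3, max(data))

data = [41, 12, 25, 17, 58, 20, 30, 15, 22]  # hypothetical, unsorted
print(five_number_summary(data))
```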
Term
|
Definition
|
|
Term
|
Definition
|
|
Term
|
Definition
Tells us how many standard deviations a data point is from its mean. A z-score greater than 3, or less than −3, is very unusual and provides another method of identifying outliers. |
|
|
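A sketch of computing z-scores and flagging values beyond 3 standard deviations. The data are invented, the helper name `z_scores` is hypothetical, and the sample standard deviation (n - 1) is an assumption.

```python
from statistics import mean, stdev

def z_scores(data):
    """Number of standard deviations each point lies from the mean."""
    m = mean(data)
    s = stdev(data)
    return [(x - m) / s for x in data]

# Hypothetical scores: twenty ordinary values plus one extreme value.
data = [48, 49, 50, 51, 52] * 4 + [100]
z = z_scores(data)

outliers = [x for x, score in zip(data, z) if abs(score) > 3]
print(outliers)
```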
Term
|
Definition
The probability of an event is its long-run relative frequency. It is a number between 0 and 1. |
|
|
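A simulation sketch of the long-run idea: the fraction of heads in repeated fair coin flips settles near 0.5 as the number of flips grows. The fair coin, the seed, and the helper name `heads_frequency` are assumptions for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def heads_frequency(n_flips):
    """Fraction of simulated fair-coin flips that come up heads."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

frequencies = {n: heads_frequency(n) for n in (10, 100, 10_000, 100_000)}
for n, f in frequencies.items():
    print(n, f)
```

Short runs can stray well away from 0.5; only the long-run frequency is stable.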