Stats Cheat Sheet

created 2006-01-27 09:50:08

A work in progress...

See below for deciding what test to use.

1 way ANOVA: I'm gonna find ya, I'm gonna get ya, get ya, get ya, get ya, get ya... A way of finding out if there is a significant difference somewhere among the means of more than 2 groups. For 2 groups, we can use a T-Test.
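A rough sketch of what the 1 way ANOVA boils down to: compare how far the group means sit from the overall mean (between-group variation) with how much values wobble inside each group (within-group variation). The function name and the scores are made up for illustration, and this only gives the F statistic, not the Significance that SPSS prints next to it.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the F statistic for a one-way ANOVA over a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of values

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each value around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Made-up scores for three groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat = one_way_anova_f(groups)
print(f_stat)  # about 13 for this made-up data
```

A big F means the groups differ more from each other than you'd expect from the noise inside them.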

Between Subject: See Independent Samples.

Central Limit Theorem: Comes from the fact that adding random numbers together (e.g. the total of 2 dice) is more likely to produce a total near the middle of the range than at either extreme. A mean is just a sum divided by a count, so the means of samples pile up around the population mean and form an approximately normal distribution, even when the population itself is one-sided. The usual rule of thumb is that you need a sample size of more than 30 for this to work well. Note that the CLT is about the distribution of sample means, not about the raw data. As means aren't necessarily the best way to know everything about a dataset though (hence the need for median, mode and standard deviation), don't you lose something about the data by transferring to means?

This also means that for sample sizes > 50, you can comfortably assume that the distribution of the sample mean is normal, even if the raw data are not.
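The dice idea can be tried out directly. This is only a simulation sketch (the seed, the number of repeats and the variable names are arbitrary choices): single rolls are flat from 1 to 6, but means of 30 rolls bunch tightly around 3.5.

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the run is repeatable

def roll():
    """One roll of a fair six-sided die."""
    return random.randint(1, 6)

# 10,000 single rolls: every face is about equally likely.
single_rolls = [roll() for _ in range(10_000)]

# 10,000 sample means, each from a sample of 30 rolls.
means_of_30 = [mean(roll() for _ in range(30)) for _ in range(10_000)]

print("single rolls: mean", round(mean(single_rolls), 2),
      "sd", round(stdev(single_rolls), 2))
print("means of 30:  mean", round(mean(means_of_30), 2),
      "sd", round(stdev(means_of_30), 2))
```

Both lists centre on about 3.5, but the means are far less spread out, and a histogram of them looks like a normal curve.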

Dependent Samples: Values being compared come from same subjects, aka Paired Samples and Within Subject. See also Independent Samples.

Distribution of Sample Means: What you get from taking many samples and plotting how the means of each are spread. The mean of these means sits at the mean of the population, and as sample size gets bigger the sample means bunch more tightly around it.
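A small sketch of how the spread of sample means shrinks as the sample size grows. The population here is invented (a deliberately skewed one), and the comparison value sd / sqrt(n) is the standard textbook formula for the standard error of the mean, which the notes above don't spell out.

```python
import random
from statistics import mean, pstdev

random.seed(2)  # fixed seed so the run is repeatable

# A made-up, strongly right-skewed population.
population = [random.expovariate(1.0) for _ in range(20_000)]
pop_mean = mean(population)
pop_sd = pstdev(population)

def sample_means(n, repeats=2_000):
    """Means of `repeats` random samples of size n (drawn with replacement)."""
    return [mean(random.choices(population, k=n)) for _ in range(repeats)]

# Spread of the sample means for a few sample sizes.
spread = {n: pstdev(sample_means(n)) for n in (5, 30, 100)}

for n, s in spread.items():
    # Observed spread next to the textbook prediction sd / sqrt(n).
    print(n, round(s, 3), round(pop_sd / n ** 0.5, 3))
```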

Equal Variances: Refers to whether 2 groups are "comparable" in the sense of having a similar spread of values (similar variance). Several tests, such as the Independent Samples T-Test and ANOVA, assume this. See Levene's Test for Equality of Variance.

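Error Bars: Show confidence intervals for each sample mean. If the bars don't overlap, the means are probably significantly different. If they overlap, the difference may well not be significant, so be ready to keep the Null Hypothesis, but run the actual test to be sure.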
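A minimal sketch of where the bars come from, assuming a normal approximation (mean plus or minus about 1.96 standard errors for 95%). For small samples a t-based interval would come out a bit wider, so treat the numbers as rough. The function names and the data are made up.

```python
from statistics import NormalDist, mean, stdev

def confidence_interval(sample, level=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    se = stdev(sample) / len(sample) ** 0.5    # standard error of the mean
    m = mean(sample)
    return m - z * se, m + z * se

def intervals_overlap(a, b):
    """True if two (low, high) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]

# Made-up groups.
group_a = [10, 12, 14, 16, 18]
group_b = [20, 22, 24, 26, 28]
group_c = [13, 15, 17, 19, 21]

ci_a = confidence_interval(group_a)
ci_b = confidence_interval(group_b)
ci_c = confidence_interval(group_c)

print(intervals_overlap(ci_a, ci_b))  # bars for a and b are far apart
print(intervals_overlap(ci_a, ci_c))  # bars for a and c share some range
```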

Error Type I: ...

Error Type II: ...

Independent Samples: Values don't come from same subject. AKA "Between subject". For example, when subjects have been divided according to a "Split File" or some kind of grouping division in SPSS. See also Dependent Samples.

Independent Samples T-Test: Assumes Normality, although for big samples the Central Limit Theorem applies, so this matters less. See also Levene's Test for Equality of Variance.

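Levene's Test for Equality of Variance: In the SPSS Independent Samples T-Test output, if sig < 0.05 the variances differ, so use the second row ("Equal variances not assumed"). If > 0.05 (not significant) use the first row ("Equal variances assumed").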

Kolmogorov-Smirnov: A test of Normality, more formal than eyeballing the Skewness and Kurtosis statistics. Used for larger (>= 50) sample sizes. In SPSS, if Significance of the test is < 0.05, the sample is not normally distributed (because the Null Hypothesis being tested is that it is normal). For smaller ones, see Shapiro-Wilk. Also see Kurtosis and Skewness.

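Kurtosis: A measure of how peaked or heavy-tailed a group's distribution is compared with a normal curve. Can be checked in the same way as the Skewness statistic (divide the statistic by its standard error to get a Z-Score). See also Kolmogorov-Smirnov, and Shapiro-Wilk.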

Mu: Mean of a population. See also X-bar.

Normality: A state of following a normal curve - symmetric (no skew) and bell-shaped. A dataset needs to be "normal" before you can run certain tests on it. If it is not, then you may be able to transform it into a normal distribution by taking the natural log of each value (this suits positive, right-skewed data). Tests that assume this are:

  • T-Tests
  • ...

To plot a histogram with normal curve in SPSS, use "Graph" -> "Histogram"...
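A quick sketch of the natural log trick. The numbers are made up, and built so that the logged version is perfectly symmetric, which real data won't be. In the raw data the long right tail drags the mean well above the median; after logging, mean and median line up.

```python
import math
from statistics import mean, median

# Made-up right-skewed data: most values are small, a few are large.
raw = [math.exp(v) for v in (-2, -1, -1, 0, 0, 0, 1, 1, 2)]

# Natural log of each value (only works for values above zero).
logged = [math.log(x) for x in raw]

print("raw:    mean", round(mean(raw), 3), "median", round(median(raw), 3))
print("logged: mean", round(mean(logged), 3), "median", round(median(logged), 3))
```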

Null Hypothesis: Always the "assumed" hypothesis that "nothing is out of the ordinary", as it were.

Paired Samples: See Dependent samples.

P-Value: Another name for Significance - how likely it is to get a result at least as extreme as the one observed if the Null Hypothesis is true.

Shapiro-Wilk: A test of Normality, more formal than eyeballing the Skewness and Kurtosis statistics. Used for small (< 50) sample sizes. In SPSS, if Significance of the test is < 0.05, the sample is not normally distributed (because the Null Hypothesis being tested is that it is normal). For larger ones, see Kolmogorov-Smirnov. Also see Kurtosis and Skewness.

Significance: A measure of how likely it is to see a difference at least as big as the one observed if the Null Hypothesis is true (e.g. if the sample really did come from the other sample or population). If the probability is less than 5%, we can assume (for the sake of argument) that the sample didn't come from the other sample/population. Also see P-Value.

Skewness: A measure of skew (lopsidedness) in a group. To find out if a group is skewed, use SPSS to get the skewness statistic, then find the Z-Score of skewness by dividing the skewness statistic by its standard error. A Z-Score beyond +/-1.96 suggests significant skew. See also Kurtosis, Kolmogorov-Smirnov, and Shapiro-Wilk. If the data are skewed, you may need to transform them into a normal distribution in order to perform certain tests - see Normality.
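A sketch of the skewness Z-Score worked out by hand. The two formulas (the adjusted sample skewness and its standard error) are the usual textbook ones; I'm assuming, not certain, that they are the same ones SPSS uses. Function names and data are made up.

```python
import math
from statistics import mean, stdev

def skewness(data):
    """Adjusted sample skewness (0 for perfectly symmetric data)."""
    n = len(data)
    m, s = mean(data), stdev(data)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in data)

def skewness_std_error(n):
    """Standard error of skewness; depends only on the sample size."""
    return math.sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))

def skewness_z(data):
    """Skewness statistic divided by its standard error."""
    return skewness(data) / skewness_std_error(len(data))

# Made-up data with a long right tail.
scores = [1, 1, 2, 2, 2, 3, 3, 4, 9, 15]
z = skewness_z(scores)
print(round(z, 2), "skewed" if abs(z) > 1.96 else "not clearly skewed")
```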

Standard Deviation: Roughly, the typical distance any given measurement will be from the mean. (Strictly, the square root of the average squared distance from the mean.)
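To see the difference between "average distance" and what the standard deviation actually measures, here is a hand calculation on made-up numbers, checked against Python's `statistics` module. The variable names are just for illustration.

```python
import math
from statistics import mean, pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up measurements
m = mean(data)

# Sum of squared distances from the mean.
squared = sum((x - m) ** 2 for x in data)

# Population version divides by n, sample version by n - 1.
pop_sd = math.sqrt(squared / len(data))
sample_sd = math.sqrt(squared / (len(data) - 1))

# The literal "average distance from the mean", for comparison.
mean_abs_dev = sum(abs(x - m) for x in data) / len(data)

print(pop_sd, round(sample_sd, 3), mean_abs_dev)
```

The literal average distance comes out smaller than the standard deviation, because squaring gives the far-out values extra weight.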

T-Test: Way of testing whether two groups are "statistically different", using the means and the standard deviations of those groups. Can be one-tailed (checking in one direction, the tail is 5%) or two-tailed (checking in both directions, each tail is 2.5%). If we know beforehand that a group is going to be lower or higher than the mean being compared to, we can concentrate on one tail only. If we only have the result from a 2-tailed test, we can halve its P-Value, provided the difference is in the direction we predicted.

Also 2 kinds of t-test, depending on what you're comparing:

  • One-Sample T-Test: compare a sample against a population.
  • Two-Sample T-Test: compare 2 samples (so in SPSS, if comparing 2 groups within a data set, neither is a population, so this test should be used - or "Independent/Paired Samples T-Test" as they're called).
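A sketch of the t statistic behind each kind of test, using only means and standard deviations as described above. The function names and data are invented, the independent version assumes Equal Variances (pooled), and none of these give the P-Value: for that you'd compare t against a t table or let SPSS do it.

```python
import math
from statistics import mean, stdev, variance

def one_sample_t(sample, population_mean):
    """t for a sample mean against a known population mean."""
    se = stdev(sample) / math.sqrt(len(sample))
    return (mean(sample) - population_mean) / se

def paired_t(before, after):
    """t for paired samples: a one-sample test on the differences."""
    diffs = [a - b for b, a in zip(before, after)]
    return one_sample_t(diffs, 0)

def independent_t(group_a, group_b):
    """t for two independent groups, assuming equal variances (pooled)."""
    na, nb = len(group_a), len(group_b)
    pooled = ((na - 1) * variance(group_a)
              + (nb - 1) * variance(group_b)) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / na + 1 / nb))

# Made-up data.
print(round(one_sample_t([10, 12, 14, 16, 18], 12), 3))
print(round(paired_t([1, 2, 3], [2, 4, 6]), 3))
print(round(independent_t([1, 2, 3], [4, 5, 6]), 3))
```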

Within Subject: See Dependent Samples.

X-bar: Mean of a sample. See also Mu.

Z-Score: How many standard deviations a value is from the mean, measured against the Normal Curve. Z-Score > 1.96 or < -1.96 corresponds to a two-tailed probability (see Significance) of < 5%.
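The 1.96 cut-off can be checked with the normal curve built into Python's `statistics` module. The helper names are made up; this is just a sketch of the Z-Score to probability link.

```python
from statistics import NormalDist

def z_score(value, mean, sd):
    """How many standard deviations `value` lies from `mean`."""
    return (value - mean) / sd

def two_tailed_p(z):
    """Probability of a Z-Score at least this far from 0, in either direction."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

print(round(two_tailed_p(1.96), 4))  # very close to 0.05
print(round(two_tailed_p(z_score(130, 100, 15)), 4))  # a value 2 sd above the mean
```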


What test should I use?

  1. Check whether variables are continuous or discrete (i.e. part of a range, or categories - if you can always find another value falling between 2 other values, it's continuous).
  2. If both are continuous (e.g. the same subjects measured twice), use a Paired Samples T-Test.
  3. If both are discrete, use a Chi Square test.
  4. If one is continuous and one is discrete, use an Independent Samples T-Test or a 1 way ANOVA, depending on whether you are comparing 2 groups (-> Independent Samples T-Test) or more than 2 (-> ANOVA).
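The steps above can be written down as a tiny lookup. This is just the cheat sheet's own rule of thumb in code form, with a made-up function name and arguments, not a complete guide to picking a test.

```python
def choose_test(first_is_continuous, second_is_continuous, groups=2):
    """Suggest a test following the numbered steps in this cheat sheet.

    `groups` is how many groups the discrete variable splits the data into;
    it only matters when one variable is continuous and the other discrete.
    """
    if first_is_continuous and second_is_continuous:
        return "Paired Samples T-Test"
    if not first_is_continuous and not second_is_continuous:
        return "Chi Square"
    if groups < 2:
        raise ValueError("need at least 2 groups to compare")
    return "Independent Samples T-Test" if groups == 2 else "1 way ANOVA"

print(choose_test(True, True))             # both continuous
print(choose_test(False, False))           # both discrete
print(choose_test(True, False, groups=2))  # one of each, 2 groups
print(choose_test(False, True, groups=4))  # one of each, more than 2 groups
```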
