# L2 Statistics preview

Introduction

1. We recently revised the Econ 378 curriculum. Formerly, we started with basic theory and the basic tools built on it, then introduced complex theory with complex tools, and then still more complex theory with still more complex tools. This seemed reasonable, but I realized: "Still to this day, I've never learned to build a car, but even without knowing how to build one, I managed to learn to drive one."

2. Now: learn the basic, complex, and more complex tools up front. Then go learn the underlying theory.


Spreadsheets

1. Unit of observation

2. Quantitative variables

3. Binary variables

   - Categorical variables as binary (or "dummy") variables

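
To make the "dummy variable" idea concrete, here is a minimal sketch in plain Python. The worker data, the column names, and the helper `add_dummies` are all hypothetical, invented only for illustration. Each row is one unit of observation, `wage` is quantitative, `union` is already binary, and the categorical `region` column is expanded into one 0/1 column per category.

```python
# Hypothetical spreadsheet: each row is one unit of observation (one worker).
rows = [
    {"name": "Ana", "wage": 21.5, "union": 1, "region": "West"},
    {"name": "Ben", "wage": 18.0, "union": 0, "region": "South"},
    {"name": "Cal", "wage": 25.0, "union": 1, "region": "West"},
    {"name": "Dee", "wage": 19.5, "union": 0, "region": "East"},
]


def add_dummies(rows, column):
    """Return new rows where `column` is replaced by one 0/1 variable per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every other column unchanged.
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one binary column per category, e.g. region_West.
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


encoded_rows = add_dummies(rows, "region")
for row in encoded_rows:
    print(row)
```

Every encoded row has exactly one of the `region_*` columns set to 1, so the original category can always be recovered from the dummies.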

Data Visualization

1. Single variables

   - Binary variables: pie charts

   - Quantitative variables: histograms

2. Interactions

   - Two binary variables: double pie charts

   - Binary and quantitative: bar chart

   - Two quantitative: scatter chart
     1. Quantitative & time: line graph

   - Three variables
     1. Two binary & quantitative: clustered bar chart
     2. Two quantitative & binary: color-coded scatter chart
     3. Three quantitative: bubble chart

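
The chart choices above amount to a lookup from variable types to chart type. The sketch below transcribes that lookup into Python. The function name `suggest_chart` and the labels `"binary"`, `"quantitative"`, and `"time"` are assumptions made for this illustration, not part of any library.

```python
from collections import Counter

# Keys are (number of binary, number of quantitative, number of time variables).
# The entries are transcribed from the outline above; nothing else is covered.
CHART_FOR = {
    (1, 0, 0): "pie chart",
    (0, 1, 0): "histogram",
    (2, 0, 0): "double pie chart",
    (1, 1, 0): "bar chart",
    (0, 2, 0): "scatter chart",
    (0, 1, 1): "line graph",
    (2, 1, 0): "clustered bar chart",
    (1, 2, 0): "color-coded scatter chart",
    (0, 3, 0): "bubble chart",
}


def suggest_chart(*kinds):
    """Return the outline's chart for these variable kinds, or None if not listed."""
    counts = Counter(kinds)
    unknown = set(counts) - {"binary", "quantitative", "time"}
    if unknown:
        raise ValueError(f"unknown variable kinds: {sorted(unknown)}")
    key = (counts["binary"], counts["quantitative"], counts["time"])
    return CHART_FOR.get(key)


print(suggest_chart("binary"))
print(suggest_chart("quantitative", "time"))
print(suggest_chart("quantitative", "binary", "quantitative"))
```

The order of the arguments does not matter, because only the count of each kind is used.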

Summary statistics

1. Proportions

2. Mean

   - From histogram, eyeball center of gravity

3. Median/percentiles

4. Mode

5. Standard deviation

   - Rule of thumb: two standard deviations (for bell-shaped data, about 95% of observations lie within two standard deviations of the mean)

   - Chebyshev's inequality: the share of the population more than $k$ standard deviations from the mean can't exceed $\frac{1}{k^{2}}$


6. Correlation coefficient

   - $\rho \in [-1, 1]$

   - $\rho^{2}$ gives the fraction of the variance in $Y$ that coincides with variation in $X$


7. Regressions

   - Slope & intercept
     1. Predict $y$ for any $x$
     2. Predict the future!
     3. Counterfactual "experiments" (way less costly than real experiments)

   - $R^{2}$ (coefficient of determination)

   - Error terms / detrended data

   - Multiple regression

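
The summary statistics listed above can be computed in a few lines. This is a minimal sketch using only the Python standard library. The data (hours studied and exam scores for eight students) and the pass mark of 60 are hypothetical, and population formulas (dividing by $n$) are assumed throughout. Multiple regression is not covered here.

```python
import statistics

# Hypothetical data: hours studied (x) and exam score (y) for eight students.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [52, 55, 61, 60, 68, 71, 75, 78]
n = len(x)

# A binary variable (assumed pass mark of 60); its mean is a proportion.
passed = [1 if score >= 60 else 0 for score in y]
proportion_passed = sum(passed) / n
mode_passed = statistics.mode(passed)

mean_x, mean_y = statistics.mean(x), statistics.mean(y)
median_y = statistics.median(y)
sd_x, sd_y = statistics.pstdev(x), statistics.pstdev(y)  # population SDs

# Chebyshev's inequality: the share of observations more than k SDs
# from the mean can never exceed 1 / k^2.
k = 2
share_outside = sum(abs(score - mean_y) > k * sd_y for score in y) / n
chebyshev_bound = 1 / k**2

# Correlation coefficient: covariance scaled by both standard deviations.
cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
rho = cov_xy / (sd_x * sd_y)

# Simple regression of y on x: slope and intercept.
slope = cov_xy / statistics.pvariance(x)
intercept = mean_y - slope * mean_x


def predict(x_new):
    """Predicted y for any x, including values outside the sample."""
    return intercept + slope * x_new


# Error terms (residuals) and R^2, the coefficient of determination.
residuals = [b - predict(a) for a, b in zip(x, y)]
ss_res = sum(e**2 for e in residuals)
ss_tot = sum((b - mean_y) ** 2 for b in y)
r_squared = 1 - ss_res / ss_tot

print(f"proportion passed = {proportion_passed:.2f}, mode = {mode_passed}")
print(f"mean = {mean_y:.2f}, median = {median_y:.2f}, sd = {sd_y:.2f}")
print(f"share outside {k} SDs = {share_outside:.2f} (bound {chebyshev_bound:.2f})")
print(f"rho = {rho:.3f}, rho^2 = {rho**2:.3f}, R^2 = {r_squared:.3f}")
print(f"predicted score after 10 hours = {predict(10):.1f}")
```

With a single regressor, $R^{2}$ equals $\rho^{2}$, which the printed values confirm. The call `predict(10)` is the counterfactual "experiment": it asks what the score would be for a value of $x$ that no one in the sample has.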

Estimation

1. Population / samples

   - Importance of representative sample

2. Point estimates, interval estimates / margin of error

3. Hypothesis test

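
As a rough illustration of the three estimation ideas, the sketch below works through a hypothetical poll of 1,000 people. It assumes a representative simple random sample, a 95% confidence level (critical value 1.96), and the normal approximation for a proportion. The poll numbers and the null value of 0.5 are invented for the example.

```python
import math

# Hypothetical sample: 1 = supports the policy, 0 = does not.
sample = [1] * 540 + [0] * 460
n = len(sample)

# Point estimate of the population proportion.
p_hat = sum(sample) / n

# Interval estimate: point estimate plus or minus the margin of error.
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
margin = 1.96 * standard_error  # assumed 95% confidence level
interval = (p_hat - margin, p_hat + margin)

# Hypothesis test of H0: p = 0.5 against a two-sided alternative.
p0 = 0.5
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


def normal_cdf(value):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(value / math.sqrt(2)))


p_value = 2 * (1 - normal_cdf(abs(z)))

print(f"point estimate = {p_hat:.3f}")
print(f"95% interval = ({interval[0]:.3f}, {interval[1]:.3f}), margin = {margin:.3f}")
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

If the sample is not representative, none of these numbers say much about the population, however large $n$ is.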