Appendix C — Skill Assignment 2: Defining your research question, dataset, & variables
Find the sample assignment here.
C.1 Research Question
To what extent can interest in political campaigns be explained by demographic and political variables?
C.2 Dataset Selected
2020 American National Election Study
C.3 Variables Selected for Initial Study
Purpose of Variable | Variable Name & Description | Level of Measurement |
---|---|---|
Dependent | V201006 - How interested are you in following political campaigns? | ordinal |
Independent | V201507x - age of the respondent | interval |
Independent | V201600 - gender of the respondent | nominal |
Independent | V201018 - party of the respondent | nominal |
Independent | V201200 - ideology of the respondent | ordinal |
Independent | V201549x - race of the respondent | nominal |
Independent | V203003 - region of the respondent | nominal |
C.3.1 Load packages
library(tidyverse)
library(haven)
library(naniar)
library(survey)
library(srvyr)
library(labelled)
library(sjmisc)
library(sjPlot)
library(gmodels)
library(gtsummary)
library(ggblanket)
C.3.2 Load the correct dataset
Need help? Go to chapter x in the webbook.
load("anes_timeseries_2020.RData")
C.3.3 Using the dataset you have loaded, select your variables, and save your new smaller dataset
Need help? Go to chapter 4 in the webbook.
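# Keep only the columns needed for this assignment, then drop value labels that no longer apply.
# Note: V201008 (where R is registered to vote) is selected here; the table in C.3 lists V201018 for party.
smaller_anes_2020 <- anes_timeseries_2020 |>
  select(V200001, V200016b, V201006, V201507x, V201600, V201008, V201200, V201549x, V203003) |>
  drop_unused_value_labels()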
save(smaller_anes_2020, file = "smaller_anes_2020.RData")
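The select-then-save step can be tried on a small scale before running it on the full ANES file. The sketch below is only an illustration: `toy_anes`, its values, and the extra column `V202999` are hypothetical stand-ins, and the file is written to a temporary directory rather than the project folder. It shows that `save()` stores the object under its own name and that `load()` restores it under that same name.

```r
library(dplyr)

# Hypothetical toy data standing in for the full ANES file
toy_anes <- data.frame(
  V200001 = c(1001, 1002, 1003),  # respondent ID
  V201006 = c(1, 2, 3),           # interest in campaigns
  V201600 = c(1, 2, 2),           # sex
  V202999 = c(9, 9, 9)            # hypothetical column we do not need
)

# Keep only the columns of interest
toy_small <- toy_anes |>
  select(V200001, V201006, V201600)

# Save to a temporary location, remove the object, then load it back
path <- file.path(tempdir(), "toy_small.RData")
save(toy_small, file = path)
rm(toy_small)

restored <- load(path)  # load() returns the names of the objects it restored
restored
names(toy_small)
```

Because `load()` recreates objects under their saved names, there is no need to assign its result to get the data back; the return value is only useful for checking what was restored.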
C.3.4 Run variable frequencies on your new smaller dataset
Need help? Go to chapter x in the webbook.
smaller_anes_2020 |>
  select(-V200001, -V200016b) |>
  frq()
PRE: How interested in following campaigns (V201006) <numeric>
# total N=8280 valid N=8279 mean=1.61 sd=0.70
Value | Label | N | Raw % | Valid % | Cum. %
-----------------------------------------------------------------
1 | 1. Very much interested | 4320 | 52.17 | 52.18 | 52.18
2 | 2. Somewhat interested | 2890 | 34.90 | 34.91 | 87.09
3 | 3. Not much interested | 1069 | 12.91 | 12.91 | 100.00
<NA> | <NA> | 1 | 0.01 | <NA> | <NA>
PRE: SUMMARY: Respondent age (V201507x) <numeric>
# total N=8280 valid N=7932 mean=51.59 sd=17.21
Value | Label | N | Raw % | Valid % | Cum. %
------------------------------------------------------------
18 | 18 | 35 | 0.42 | 0.44 | 0.44
19 | 19 | 52 | 0.63 | 0.66 | 1.10
20 | 20 | 46 | 0.56 | 0.58 | 1.68
21 | 21 | 51 | 0.62 | 0.64 | 2.32
22 | 22 | 57 | 0.69 | 0.72 | 3.04
23 | 23 | 75 | 0.91 | 0.95 | 3.98
24 | 24 | 92 | 1.11 | 1.16 | 5.14
25 | 25 | 104 | 1.26 | 1.31 | 6.45
26 | 26 | 108 | 1.30 | 1.36 | 7.82
27 | 27 | 132 | 1.59 | 1.66 | 9.48
28 | 28 | 120 | 1.45 | 1.51 | 10.99
29 | 29 | 131 | 1.58 | 1.65 | 12.64
30 | 30 | 142 | 1.71 | 1.79 | 14.44
31 | 31 | 109 | 1.32 | 1.37 | 15.81
32 | 32 | 117 | 1.41 | 1.48 | 17.28
33 | 33 | 123 | 1.49 | 1.55 | 18.84
34 | 34 | 142 | 1.71 | 1.79 | 20.63
35 | 35 | 152 | 1.84 | 1.92 | 22.54
36 | 36 | 144 | 1.74 | 1.82 | 24.36
37 | 37 | 149 | 1.80 | 1.88 | 26.24
38 | 38 | 152 | 1.84 | 1.92 | 28.15
39 | 39 | 151 | 1.82 | 1.90 | 30.06
40 | 40 | 139 | 1.68 | 1.75 | 31.81
41 | 41 | 151 | 1.82 | 1.90 | 33.71
42 | 42 | 113 | 1.36 | 1.42 | 35.14
43 | 43 | 116 | 1.40 | 1.46 | 36.60
44 | 44 | 111 | 1.34 | 1.40 | 38.00
45 | 45 | 116 | 1.40 | 1.46 | 39.46
46 | 46 | 119 | 1.44 | 1.50 | 40.96
47 | 47 | 106 | 1.28 | 1.34 | 42.30
48 | 48 | 105 | 1.27 | 1.32 | 43.62
49 | 49 | 123 | 1.49 | 1.55 | 45.17
50 | 50 | 154 | 1.86 | 1.94 | 47.11
51 | 51 | 128 | 1.55 | 1.61 | 48.73
52 | 52 | 111 | 1.34 | 1.40 | 50.13
53 | 53 | 117 | 1.41 | 1.48 | 51.60
54 | 54 | 123 | 1.49 | 1.55 | 53.15
55 | 55 | 140 | 1.69 | 1.77 | 54.92
56 | 56 | 127 | 1.53 | 1.60 | 56.52
57 | 57 | 136 | 1.64 | 1.71 | 58.23
58 | 58 | 145 | 1.75 | 1.83 | 60.06
59 | 59 | 154 | 1.86 | 1.94 | 62.00
60 | 60 | 168 | 2.03 | 2.12 | 64.12
61 | 61 | 139 | 1.68 | 1.75 | 65.87
62 | 62 | 154 | 1.86 | 1.94 | 67.81
63 | 63 | 156 | 1.88 | 1.97 | 69.78
64 | 64 | 155 | 1.87 | 1.95 | 71.73
65 | 65 | 180 | 2.17 | 2.27 | 74.00
66 | 66 | 170 | 2.05 | 2.14 | 76.15
67 | 67 | 142 | 1.71 | 1.79 | 77.94
68 | 68 | 140 | 1.69 | 1.77 | 79.70
69 | 69 | 158 | 1.91 | 1.99 | 81.69
70 | 70 | 126 | 1.52 | 1.59 | 83.28
71 | 71 | 147 | 1.78 | 1.85 | 85.14
72 | 72 | 145 | 1.75 | 1.83 | 86.96
73 | 73 | 147 | 1.78 | 1.85 | 88.82
74 | 74 | 94 | 1.14 | 1.19 | 90.00
75 | 75 | 93 | 1.12 | 1.17 | 91.17
76 | 76 | 89 | 1.07 | 1.12 | 92.30
77 | 77 | 81 | 0.98 | 1.02 | 93.32
78 | 78 | 64 | 0.77 | 0.81 | 94.13
79 | 79 | 63 | 0.76 | 0.79 | 94.92
80 | 80. Age 80 or older | 403 | 4.87 | 5.08 | 100.00
<NA> | <NA> | 348 | 4.20 | <NA> | <NA>
PRE: What is your (R) sex? [revised] (V201600) <numeric>
# total N=8280 valid N=8213 mean=1.54 sd=0.50
Value | Label | N | Raw % | Valid % | Cum. %
---------------------------------------------------
1 | 1. Male | 3763 | 45.45 | 45.82 | 45.82
2 | 2. Female | 4450 | 53.74 | 54.18 | 100.00
<NA> | <NA> | 67 | 0.81 | <NA> | <NA>
PRE: Where is R registered to vote (pre-election) (V201008) <numeric>
# total N=8280 valid N=8270 mean=1.27 sd=0.61
Value | Label | N | Raw % | Valid % | Cum. %
------------------------------------------------------------------------------
1 | 1. Registered at this address | 6787 | 81.97 | 82.07 | 82.07
2 | 2. Registered at a different address | 765 | 9.24 | 9.25 | 91.32
3 | 3. Not currently registered | 718 | 8.67 | 8.68 | 100.00
<NA> | <NA> | 10 | 0.12 | <NA> | <NA>
PRE: 7pt scale liberal-conservative self-placement (V201200) <numeric>
# total N=8280 valid N=8257 mean=17.90 sd=33.50
Value | Label | N | Raw % | Valid % | Cum. %
-----------------------------------------------------------------------------
1 | 1. Extremely liberal | 369 | 4.46 | 4.47 | 4.47
2 | 2. Liberal | 1210 | 14.61 | 14.65 | 19.12
3 | 3. Slightly liberal | 918 | 11.09 | 11.12 | 30.24
4 | 4. Moderate; middle of the road | 1818 | 21.96 | 22.02 | 52.26
5 | 5. Slightly conservative | 821 | 9.92 | 9.94 | 62.20
6 | 6. Conservative | 1492 | 18.02 | 18.07 | 80.27
7 | 7. Extremely conservative | 428 | 5.17 | 5.18 | 85.45
99 | 99. Haven't thought much about this | 1201 | 14.50 | 14.55 | 100.00
<NA> | <NA> | 23 | 0.28 | <NA> | <NA>
PRE: SUMMARY: R self-identified race/ethnicity (V201549x) <numeric>
# total N=8280 valid N=8178 mean=1.63 sd=1.24
Value | Label | N | Raw % | Valid % | Cum. %
----------------------------------------------------------------------------------------------------------------
1 | 1. White, non-Hispanic | 5963 | 72.02 | 72.92 | 72.92
2 | 2. Black, non-Hispanic | 726 | 8.77 | 8.88 | 81.79
3 | 3. Hispanic | 762 | 9.20 | 9.32 | 91.11
4 | 4. Asian or Native Hawaiian/other Pacific Islander, non-Hispanic alone | 284 | 3.43 | 3.47 | 94.58
5 | 5. Native American/Alaska Native or other race, non-Hispanic alone | 172 | 2.08 | 2.10 | 96.69
6 | 6. Multiple races, non-Hispanic | 271 | 3.27 | 3.31 | 100.00
<NA> | <NA> | 102 | 1.23 | <NA> | <NA>
SAMPLE: Census region (V203003) <numeric>
# total N=8280 valid N=8280 mean=2.64 sd=1.00
Value | Label | N | Raw % | Valid % | Cum. %
------------------------------------------------------
1 | 1. Northeast | 1396 | 16.86 | 16.86 | 16.86
2 | 2. Midwest | 1997 | 24.12 | 24.12 | 40.98
3 | 3. South | 3081 | 37.21 | 37.21 | 78.19
4 | 4. West | 1806 | 21.81 | 21.81 | 100.00
<NA> | <NA> | 0 | 0.00 | <NA> | <NA>
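One thing the frequencies above make visible is that V201200 reports a mean of 17.90 on a 7-point scale. That happens because the code 99 ("Haven't thought much about this") is counted as a numeric value. The sketch below rebuilds the variable from the counts printed in the table, using base R only, and compares the mean with and without the 99 code. Whether 99 should be treated as missing is an analytic choice for your own project; this is an illustration of the effect, not a prescribed recode, and the object names are hypothetical.

```r
# Values and counts copied from the V201200 frequency table above
values <- c(1, 2, 3, 4, 5, 6, 7, 99)
counts <- c(369, 1210, 918, 1818, 821, 1492, 428, 1201)

# Rebuild a plain numeric vector with one entry per valid respondent
ideology <- rep(values, times = counts)

# Mean with 99 treated as a real score (matches the frq() header)
mean_with_99 <- mean(ideology)

# Treat 99 as missing, then recompute the mean on the 1-7 scale
ideology_clean <- replace(ideology, ideology == 99, NA)
mean_without_99 <- mean(ideology_clean, na.rm = TRUE)

round(mean_with_99, 2)     # 17.9
round(mean_without_99, 2)  # 4.09
```

With 99 set aside, the mean falls near the scale midpoint of 4, which is far easier to interpret than 17.90.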