Do Proportional Electoral College Allocations Yield a More Representative Presidency?

Author

Cindy Li

Introduction

The U.S. Electoral College (EC) shapes presidential elections by design: votes are aggregated state by state rather than in a single nationwide popular vote. The system has long been debated, especially when its result diverges from the popular vote. The primary goal of this analysis is to assess how different allocation schemes affect election outcomes and whether any of them is biased in favor of one political party.

Data Ingesting

Data I: Election Data

Data Source: MIT Election Data Science Lab datasets. From the MIT Election Data Science Lab, we retrieve two data sets. The first contains votes from all biennial congressional races in all 50 states from 1976 to 2022. The second contains statewide presidential vote counts from 1976 to 2020. Both require a manual download from the source.

Load Libraries

if (!require("readr")) install.packages("readr")
if (!require("sf")) install.packages("sf")
if (!require("dplyr")) install.packages("dplyr")
if (!require("tidyr")) install.packages("tidyr")
if (!require("tidyverse")) install.packages("tidyverse")
if (!require("DT")) install.packages("DT")
if (!require("ggplot2")) install.packages("ggplot2")
if (!require("gt")) install.packages("gt")
if (!require("plotly")) install.packages("plotly")

library(readr)
library(sf)
library(dplyr)
library(tidyr)
library(tidyverse)
library(DT)
library(ggplot2)
library(gt)
library(plotly)

1976-2022 House Data

View Code

HOUSE <- read_csv("1976-2022-house.csv")
PRESIDENT <- read_csv("1976-2020-president.csv")

sample_n(HOUSE, 1000) |>
  DT::datatable()

1976-2020 President Data

View Code

sample_n(PRESIDENT, 1000) |>
  DT::datatable()

Data II: Congressional Boundary Files 1976 to 2012

Data Source: Jeffrey B. Lewis, Brandon DeVine, and Lincoln Pitcher with Kenneth C. Martis. This source gives us the shapefiles for all U.S. congressional districts from 1789 to 2012.

View Code

get_file <- function(fname){
  BASE_URL <- "https://cdmaps.polisci.ucla.edu/shp/"
  fname_ext <- paste0(fname, ".zip")
  fname_ext1 <- paste0(fname, ".shp")
  fname_extunzip <- gsub(".zip$", "", fname_ext)
  subfolder <- "districtshapes"  # Subfolder where the shapefile is located
  if(!file.exists(fname_ext)){
    FILE_URL <- paste0(BASE_URL, fname_ext)
    download.file(FILE_URL, 
                  destfile = fname_ext)
  }
  # Unzip the contents and save unzipped content
  unzip(zipfile = fname_ext, exdir = fname_extunzip)
  # Define File Path
  shapefile_path <- file.path(fname_extunzip, subfolder, fname_ext1)
  # Read the shapefile
  read_sf(shapefile_path)
}

# Download files by iterating through
start_congress <- 95
end_congress <- 114
for (i in start_congress:end_congress) {
  district_name <- sprintf("districts%03d", i)  # Formats as districts095, districts096, etc.
  district_data <- get_file(district_name)   # Download and read the shapefile
  assign(district_name, district_data, envir = .GlobalEnv)  # Assign the data frame to a variable in the global environment
}

Data III: Congressional Boundary Files 2014 to Present

Data Source: US Census Bureau This data source provides district boundaries for more recent congressional elections.

View Code

get_congress_file <- function(fname, year){
  BASE_URL <- sprintf("https://www2.census.gov/geo/tiger/TIGER%d/CD/", year) #replace %d with year
  fname_ext <- paste0(fname, ".zip")
  fname_ext1 <- paste0(fname, ".shp")
  fname_extunzip <- gsub(".zip$", "", fname_ext)
  
  # Download File
  if(!file.exists(fname_ext)){
    FILE_URL <- paste0(BASE_URL, fname_ext)
    download.file(FILE_URL, 
                  destfile = fname_ext)
  }
  # Unzip the contents and save unzipped content
  unzip(zipfile = fname_ext, exdir = fname_extunzip)
  # Define File Path
  shapefile_path <- file.path(fname_extunzip, fname_ext1)
  # Read the shapefile
  read_sf(shapefile_path)
}

# Download file for each district by iterating through each year
base_year = 2022
base_congress = 116  # Congress number used in the most recent (2022) file name
for (i in 0:10) {  # i will range from 0 (2022) to 10 (2012)
  year <- base_year - i
  if (year >= 2018) {congress <- 116} 
  else if (year >= 2016) {congress <- 115} 
  else if (year >= 2014) {congress <- 114} 
  else if (year == 2013) {congress <- 113} 
  else if (year == 2012) {congress <- 112}
  district_name <- sprintf("tl_%d_us_cd%d", year, congress)
  district_data <- get_congress_file(district_name, year)  # Download and read the shapefile
  assign(district_name, district_data, envir = .GlobalEnv)  # Assign the data frame to a variable in the global environment
  }
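As a quick check on the file names the loop above requests, the year-to-congress mapping can also be written as a vectorized function. This is only a sketch: `congress_for_year` and `tiger_files` are hypothetical names introduced here, and the cut-offs simply mirror the if/else chain above.

```r
library(dplyr)

# Hypothetical helper: map a TIGER vintage year to the congress number
# that appears in its file name (same cut-offs as the if/else chain above).
congress_for_year <- function(year) {
  case_when(
    year >= 2018 ~ 116L,
    year >= 2016 ~ 115L,
    year >= 2014 ~ 114L,
    year == 2013 ~ 113L,
    year == 2012 ~ 112L
  )
}

years <- 2022:2012

# One row per year with the file name the download loop would build
tiger_files <- tibble(
  year = years,
  congress = congress_for_year(years),
  file_name = sprintf("tl_%d_us_cd%d", year, congress)
)

print(tiger_files)
```

Years outside 2012 to 2022 fall through every branch and come back as NA, which makes an unexpected year easy to spot before any download is attempted.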

Exploration

1. Which states have gained and lost the most seats in the US House of Representatives between 1976 and 2022?

View Code

# Count the number of districts (aka seats) per state for each year
gains_losses <- HOUSE |>
  group_by(state, year) |>
  summarise(num_districts = n_distinct(district)) |>
  arrange(state, year) |>
  # Calculate seat changes for each state
  group_by(state) |>
  summarise(
    first_year_seats = first(num_districts),
    last_year_seats = last(num_districts),
    seat_change = last_year_seats - first_year_seats) |>
  filter(seat_change != 0) |>
  arrange(desc(seat_change))

# Plot the seat changes
ggplot(gains_losses, aes(x = reorder(state, seat_change), y = seat_change, fill = seat_change > 0)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  scale_fill_manual(values = c("red", "blue"), labels = c("Loss", "Gain")) +
  labs(
    title = "Gains and Losses of House Seats (1976 to 2022)",
    x = "State",
    y = "Change in Seats",
    fill = "Change"
  ) +
  theme_minimal()
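To make the seat-change logic easier to follow, here is a minimal sketch on a made-up table. `toy_house` and its two states are hypothetical and only imitate the `state`, `year`, and `district` columns of HOUSE.

```r
library(dplyr)

# Hypothetical miniature of HOUSE: one row per (state, year, district)
toy_house <- tibble(
  state    = c("A", "A", "A", "A", "A", "B", "B", "B", "B"),
  year     = c(1976, 1976, 1976, 2022, 2022, 1976, 2022, 2022, 2022),
  district = c(1, 2, 3, 1, 2, 1, 1, 2, 3)
)

toy_changes <- toy_house |>
  group_by(state, year) |>
  summarise(num_districts = n_distinct(district), .groups = "drop") |>
  arrange(state, year) |>            # first() and last() rely on this ordering
  group_by(state) |>
  summarise(seat_change = last(num_districts) - first(num_districts))

print(toy_changes)
```

State A goes from three districts to two and state B from one to three, so the sketch reports a loss of one seat and a gain of two.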

2. New York State has a unique “fusion” voting system where one candidate can appear on multiple “lines” on the ballot and their vote counts are totaled. Are there any elections in our data where the outcome would have been different if the “fusion” system were not used and candidates received only the votes from their “major party line” (Democrat or Republican) rather than their total number of votes across all lines?

View Code

# Summarize the votes for each candidate, total votes vs major party votes
fusion_summary <- HOUSE |>
  filter(!is.na(candidate)) |>
  group_by(year, state, state_po, district, candidate, fusion_ticket) |>
  summarise(
    total_candidate_votes = sum(candidatevotes, na.rm = TRUE),  # Total votes across all party lines
    major_party_votes = sum( # Major party votes (only votes from Democrat and Republican)
      if_else(party %in% c("DEMOCRAT", "REPUBLICAN"), candidatevotes, 0), na.rm = TRUE), .groups = "drop") |>
  select(year, state, state_po, district, candidate, total_candidate_votes, major_party_votes, fusion_ticket) |>
  arrange(state, district, candidate)

# Check if there would have been a different outcome without fusion voting
fusion_outcome_changes <- fusion_summary |>
  group_by(year, state, state_po, district) |>
  filter(any(fusion_ticket)) |> # keep every candidate in races where fusion voting was used
  summarise(
    # Find the winner based on total votes 
    winner_with_fusion = candidate[which.max(total_candidate_votes)],
    winner_without_fusion = candidate[which.max(major_party_votes)],
    total_votes_winner = max(total_candidate_votes),
    major_party_votes_winner = max(major_party_votes),
    .groups = "drop"
  ) |>
  # Ensure that major party votes winner is not zero and handle if no major party candidate ran
  mutate( major_party_votes_winner = ifelse(major_party_votes_winner == 0, NA, major_party_votes_winner), 
    # Check if the winners are the same or different based on fusion voting
    outcome_change = ifelse(winner_with_fusion != winner_without_fusion, "Yes", "No")
  ) |>
  arrange(year, state, district)

# Plot directly from fusion_outcome_changes without creating a summary
ggplot(fusion_outcome_changes, aes(x = outcome_change, fill = outcome_change)) +
  geom_bar(show.legend = FALSE) +  # Create the bar plot
  labs(
    title = "Impact of Fusion Voting on Election Outcomes",
    x = "Outcome Change",
    y = "Number of Elections"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
  )
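The comparison behind this plot can be illustrated with a single made-up race. The candidates and vote counts below are hypothetical; the sketch only shows how dropping minor-party lines can flip the winner.

```r
library(dplyr)

# Hypothetical race: SMITH appears on two ballot lines, JONES on one
toy_race <- tibble(
  candidate      = c("SMITH", "SMITH", "JONES"),
  party          = c("DEMOCRAT", "WORKING FAMILIES", "REPUBLICAN"),
  candidatevotes = c(48000, 5000, 50000)
)

toy_totals <- toy_race |>
  group_by(candidate) |>
  summarise(
    total_votes = sum(candidatevotes),   # all ballot lines combined
    major_party_votes = sum(
      if_else(party %in% c("DEMOCRAT", "REPUBLICAN"), candidatevotes, 0)
    ),
    .groups = "drop"
  )

winner_with_fusion    <- toy_totals$candidate[which.max(toy_totals$total_votes)]
winner_without_fusion <- toy_totals$candidate[which.max(toy_totals$major_party_votes)]

print(toy_totals)
print(c(with_fusion = winner_with_fusion, without_fusion = winner_without_fusion))
```

SMITH wins on combined lines (53,000 to 50,000) but loses once only the major-party line counts (48,000 to 50,000), which is the kind of case the "Yes" bar counts.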

3. Do presidential candidates tend to run ahead of or run behind congressional candidates in the same state? That is, does a Democratic candidate for president tend to get more votes in a given state than all Democratic congressional candidates in the same state? Are any presidents particularly more or less popular than their co-partisans?

Let’s take a glimpse at just the year 2020.

View Code

# Summarize presidential votes for Democrat and Republican candidates
presidential_votes <- PRESIDENT |>
  filter(party_simplified %in% c("DEMOCRAT", "REPUBLICAN")) |>
  group_by(year, state, party_simplified) |>
  summarise(
    presidential_total_votes = sum(candidatevotes, na.rm = TRUE),
    .groups = "drop"
  )

# Summarize congressional votes for Democrat and Republican candidates
congressional_votes <- HOUSE |>
  filter(party %in% c("DEMOCRAT", "REPUBLICAN")) |>
  group_by(year, state, party) |>
  summarise(
    congressional_total_votes = sum(candidatevotes, na.rm = TRUE),
    .groups = "drop"
  )

# Join the two datasets to compare presidential and congressional votes
vote_comparison <- left_join(presidential_votes, congressional_votes, by = c("year", "state", "party_simplified" = "party")) |>
  mutate(vote_difference = presidential_total_votes - congressional_total_votes,
    run_ahead = if_else(vote_difference > 0, "Presidential Ahead", "Presidential Behind")) |>
  arrange(run_ahead, vote_difference)
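A small sketch of the same join on made-up totals may help when reading the result. The state "X" and all vote counts are hypothetical, and `pct_difference` is an optional normalization added here for illustration; it is not part of the analysis above.

```r
library(dplyr)

# Made-up totals for one hypothetical state; not taken from the MIT data
toy_presidential <- tibble(
  year = 2020,
  state = "X",
  party_simplified = c("DEMOCRAT", "REPUBLICAN"),
  presidential_total_votes = c(1000, 900)
)

toy_congressional <- tibble(
  year = 2020,
  state = "X",
  party = c("DEMOCRAT", "REPUBLICAN"),
  congressional_total_votes = c(950, 940)
)

toy_comparison <- left_join(
  toy_presidential, toy_congressional,
  by = c("year", "state", "party_simplified" = "party")
) |>
  mutate(
    vote_difference = presidential_total_votes - congressional_total_votes,
    # Optional: express the gap relative to the congressional total
    pct_difference = 100 * vote_difference / congressional_total_votes,
    run_ahead = if_else(vote_difference > 0, "Presidential Ahead", "Presidential Behind")
  )

print(toy_comparison)
```

Raw differences grow with state size, so a relative measure such as `pct_difference` may be easier to compare across states; whether it suits this analysis is a judgment call.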

Does this trend differ over time? Does it differ across states or across parties?

View Code

# Plot the vote difference between presidential and congressional candidates frequencies by year
ggplot(vote_comparison, aes(x = vote_difference, fill = run_ahead)) +
  geom_histogram(binwidth = 100000, position = "identity", alpha = 0.7) +
  facet_wrap(~ year, ncol=3) +
  labs(title = "Presidential vs Congressional Vote Difference",
       x = "Vote Difference (Presidential - Congressional)",
       y = "Frequency",
       fill = "Vote Comparison") +
  theme_minimal() +
  theme(axis.text.x = element_text(size = 8, angle = 45, hjust = 1))  # Adjust x-axis tick labels

View Code

# Vote difference per year, state, and party (one row each at this stage)
presidential_comparison <- vote_comparison |>
  group_by(year, state, party_simplified) |>
  summarise(
    average_vote_difference = mean(vote_difference, na.rm = TRUE),
    .groups = "drop"
  )

# group by party_simplified and year to get the overall average for each party-year
president_ranking <- presidential_comparison |>
  group_by(party_simplified, year) |>
  summarise(
    average_vote_difference = mean(average_vote_difference, na.rm = TRUE),
    .groups = "drop"
  ) |>
  arrange(desc(average_vote_difference))

# Create a plot to visualize presidential popularity vs. co-partisan congressional candidates
ggplot(president_ranking, aes(x = reorder(party_simplified, average_vote_difference), y = average_vote_difference, fill = party_simplified)) +
  geom_bar(stat = "identity", show.legend = FALSE) +
  coord_flip() +
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red")) +
  labs( 
    title = "Presidential Popularity vs. Congressional Co-partisans",
    x = "President",
    y = "Average Vote Difference (Presidential - Congressional)",
    subtitle = "Higher values indicate greater presidential popularity relative to congressional candidates") +
  theme_minimal()
# Line plot with trend lines for each party
ggplot(president_ranking, aes(x = year, y = average_vote_difference, color = party_simplified, group = party_simplified)) +
  geom_line(linewidth = 1) +  # linewidth replaces size for lines in ggplot2 >= 3.4.0
  geom_point(size = 3) +  # Adds points at each year
  scale_color_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red")) + 
  labs(
    title = "Presidential Popularity vs. Congressional Co-partisans Over Time",
    x = "Year",
    y = "Average Vote Difference (Presidential - Congressional)",
    subtitle = "Line plot showing trends for each party"
  ) +
  theme_minimal()

Maps & Shapefiles

Choropleth Visualization of the 2000 Presidential Election Electoral College Results

Filter Election Data for 2000: To create a map of the results broken down by states, we will need to find the election results of each state. The first step involves filtering the election dataset PRESIDENT to get the results for the year 2000. We specifically focus on the U.S. Presidential election and filter for the two main candidates, George W. Bush and Al Gore. We then calculate the winner for each state based on who received the most votes and assign the appropriate party.

View Code

election_2000 <- PRESIDENT |>
  filter(year == 2000, office == "US PRESIDENT") |> # filter for 2000 and president office
  filter(candidate %in% c("BUSH, GEORGE W.", "GORE, AL")) |> # filter for Bush and Gore
  group_by(state) |>
  summarise( # Winner based on the candidate with the most votes
    winner = if_else(sum(candidatevotes[candidate == "BUSH, GEORGE W."]) > sum(candidatevotes[candidate == "GORE, AL"]),
      "Bush", "Gore"),
    winner_party = case_when(# Party based on the candidate
      winner == "Bush" ~ "Republican",
      winner == "Gore" ~ "Democrat"
    )) |>
  ungroup()

Join Election Data with Shapefiles: The next step is to join the election results with the geographical shapefile data. This step ensures that we can visualize the election results on a map by linking the state names in both datasets. The shapefile data is modified to ensure the state names are in uppercase to match the election data. After merging the data, we create a choropleth map of the contiguous U.S. states. We use geom_sf() to plot the states and color them based on the winning party (Republican or Democrat). The map is then customized to remove axis labels and grid lines for a clean visualization.

View Code

# join with shapefile
districts106$STATENAME <- toupper(districts106$STATENAME) # uppercase state name to match

dis_election_2000 <- left_join(districts106, election_2000, by = c("STATENAME" = "state"), relationship = "many-to-many")

main_us <- dis_election_2000 |> filter(!STATENAME %in% c("ALASKA", "HAWAII"))

ggplot(main_us, aes(geometry = geometry, fill = winner_party)) +
  geom_sf() + 
  scale_fill_manual(values = c("Republican" = "red", "Democrat" = "blue")) +
  theme_minimal() +
  labs(title = "U.S. Presidential Election Results by State in 2000",
       fill = "Winning Party") +
  theme(
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    panel.grid = element_blank()
  )

Add Insets for Alaska and Hawaii: Because Alaska and Hawaii are geographically distant from the mainland U.S., we create insets for these two states. The data for Alaska and Hawaii is filtered separately, and individual maps are created for each. These insets are then added to the main U.S. map.

View Code

# contiguous US 
main_us <- dis_election_2000 |> filter(!STATENAME %in% c("ALASKA", "HAWAII"))
map_us <- ggplot(main_us, aes(geometry = geometry, fill = winner_party)) +
  geom_sf() + 
  scale_fill_manual(values = c("Republican" = "red", "Democrat" = "blue")) +
  theme_minimal() +
  labs(title = "U.S. Presidential Election Results by State in 2000",
       fill = "Winning Party") +
  theme_void() +
  coord_sf(xlim = c(-130, -60), ylim = c(20, 50), expand = FALSE) 

# filter data for Alaska and Hawaii
alaska <- dis_election_2000 |> filter(STATENAME == "ALASKA")
hawaii <- dis_election_2000 |> filter(STATENAME == "HAWAII")

# Alaska Inset
inset_alaska <- ggplot(alaska, aes(geometry = geometry, fill = winner_party)) +
  geom_sf() +
  scale_fill_manual(values = c("Republican" = "red", "Democrat" = "blue")) +
  theme_void() +
  theme(legend.position = "none") + 
  coord_sf(xlim = c(-180, -140), ylim = c(50, 72), expand = FALSE)

# Hawaii Inset
inset_hawaii <- ggplot(hawaii, aes(geometry = geometry, fill = winner_party)) +
  geom_sf() +
  scale_fill_manual(values = c("Republican" = "red", "Democrat" = "blue")) +
  theme_void() +
  theme(legend.position = "none") +
  coord_sf(xlim = c(-161, -154), ylim = c(18, 23), expand = FALSE)

# Combine Maps
combined_map <- map_us +
  annotation_custom(ggplotGrob(inset_alaska),
                    xmin = -130, xmax = -120, # horizontal extent of the inset (longitude)
                    ymin = 15, ymax = 40) +  # vertical extent of the inset (latitude)
  annotation_custom(ggplotGrob(inset_hawaii),
                    xmin = -115, xmax = -100, # position
                    ymin = 20, ymax = 30)    # size
print(combined_map)

Choropleth Visualization of Electoral College Results Over Time

Data Preparation: First, we need to clean the data so that the datasets join properly. We convert all shapefiles to the same CRS. Then, we add a STATENAME column to the Census shapefiles based on STATEFP and change the STATENAME values in the older shapefiles to uppercase so that they match the election data.

View Code

# convert to the same crs
districts095 <- st_transform(districts095, crs = st_crs(districts112))
districts097 <- st_transform(districts097, crs = st_crs(districts112))
districts098 <- st_transform(districts098, crs = st_crs(districts112))
districts101 <- st_transform(districts101, crs = st_crs(districts112))
districts102 <- st_transform(districts102, crs = st_crs(districts112))
districts103 <- st_transform(districts103, crs = st_crs(districts112))
districts106 <- st_transform(districts106, crs = st_crs(districts112))
districts108 <- st_transform(districts108, crs = st_crs(districts112))
districts111 <- st_transform(districts111, crs = st_crs(districts112))
tl_2016_us_cd115 <- st_transform(tl_2016_us_cd115, crs = st_crs(districts112))
tl_2020_us_cd116 <- st_transform(tl_2020_us_cd116, crs = st_crs(districts112))

# convert state names to uppercase for the join
districts095$STATENAME <- toupper(districts095$STATENAME)  
districts097$STATENAME <- toupper(districts097$STATENAME)  
districts098$STATENAME <- toupper(districts098$STATENAME)  
districts101$STATENAME <- toupper(districts101$STATENAME)  
districts102$STATENAME <- toupper(districts102$STATENAME)  
districts103$STATENAME <- toupper(districts103$STATENAME)  
districts106$STATENAME <- toupper(districts106$STATENAME)  
districts108$STATENAME <- toupper(districts108$STATENAME)  
districts111$STATENAME <- toupper(districts111$STATENAME)  
districts112$STATENAME <- toupper(districts112$STATENAME)  

# add STATENAME column using statefp
# https://www.mercercountypa.gov/dps/state_fips_code_listing.htm 
tl_2020_us_cd116 <- tl_2020_us_cd116 |>
  mutate(STATENAME = case_when(
    STATEFP == "01" ~ "ALABAMA",
    STATEFP == "02" ~ "ALASKA",
    STATEFP == "04" ~ "ARIZONA",
    STATEFP == "05" ~ "ARKANSAS",
    STATEFP == "06" ~ "CALIFORNIA",
    STATEFP == "08" ~ "COLORADO",
    STATEFP == "09" ~ "CONNECTICUT",
    STATEFP == "10" ~ "DELAWARE",
    STATEFP == "11" ~ "DISTRICT OF COLUMBIA",
    STATEFP == "12" ~ "FLORIDA",
    STATEFP == "13" ~ "GEORGIA",
    STATEFP == "15" ~ "HAWAII",
    STATEFP == "16" ~ "IDAHO",
    STATEFP == "17" ~ "ILLINOIS",
    STATEFP == "18" ~ "INDIANA",
    STATEFP == "19" ~ "IOWA",
    STATEFP == "20" ~ "KANSAS",
    STATEFP == "21" ~ "KENTUCKY",
    STATEFP == "22" ~ "LOUISIANA",
    STATEFP == "23" ~ "MAINE",
    STATEFP == "24" ~ "MARYLAND",
    STATEFP == "25" ~ "MASSACHUSETTS",
    STATEFP == "26" ~ "MICHIGAN",
    STATEFP == "27" ~ "MINNESOTA",
    STATEFP == "28" ~ "MISSISSIPPI",
    STATEFP == "29" ~ "MISSOURI",
    STATEFP == "30" ~ "MONTANA",
    STATEFP == "31" ~ "NEBRASKA",
    STATEFP == "32" ~ "NEVADA",
    STATEFP == "33" ~ "NEW HAMPSHIRE",
    STATEFP == "34" ~ "NEW JERSEY",
    STATEFP == "35" ~ "NEW MEXICO",
    STATEFP == "36" ~ "NEW YORK",
    STATEFP == "37" ~ "NORTH CAROLINA",
    STATEFP == "38" ~ "NORTH DAKOTA",
    STATEFP == "39" ~ "OHIO",
    STATEFP == "40" ~ "OKLAHOMA",
    STATEFP == "41" ~ "OREGON",
    STATEFP == "42" ~ "PENNSYLVANIA",
    STATEFP == "44" ~ "RHODE ISLAND",
    STATEFP == "45" ~ "SOUTH CAROLINA",
    STATEFP == "46" ~ "SOUTH DAKOTA",
    STATEFP == "47" ~ "TENNESSEE",
    STATEFP == "48" ~ "TEXAS",
    STATEFP == "49" ~ "UTAH",
    STATEFP == "50" ~ "VERMONT",
    STATEFP == "51" ~ "VIRGINIA",
    STATEFP == "53" ~ "WASHINGTON",
    STATEFP == "54" ~ "WEST VIRGINIA",
    STATEFP == "55" ~ "WISCONSIN",
    STATEFP == "56" ~ "WYOMING"
  ))

tl_2016_us_cd115 <- tl_2016_us_cd115 |>
  mutate(STATENAME = case_when(
    STATEFP == "01" ~ "ALABAMA",
    STATEFP == "02" ~ "ALASKA",
    STATEFP == "04" ~ "ARIZONA",
    STATEFP == "05" ~ "ARKANSAS",
    STATEFP == "06" ~ "CALIFORNIA",
    STATEFP == "08" ~ "COLORADO",
    STATEFP == "09" ~ "CONNECTICUT",
    STATEFP == "10" ~ "DELAWARE",
    STATEFP == "11" ~ "DISTRICT OF COLUMBIA",
    STATEFP == "12" ~ "FLORIDA",
    STATEFP == "13" ~ "GEORGIA",
    STATEFP == "15" ~ "HAWAII",
    STATEFP == "16" ~ "IDAHO",
    STATEFP == "17" ~ "ILLINOIS",
    STATEFP == "18" ~ "INDIANA",
    STATEFP == "19" ~ "IOWA",
    STATEFP == "20" ~ "KANSAS",
    STATEFP == "21" ~ "KENTUCKY",
    STATEFP == "22" ~ "LOUISIANA",
    STATEFP == "23" ~ "MAINE",
    STATEFP == "24" ~ "MARYLAND",
    STATEFP == "25" ~ "MASSACHUSETTS",
    STATEFP == "26" ~ "MICHIGAN",
    STATEFP == "27" ~ "MINNESOTA",
    STATEFP == "28" ~ "MISSISSIPPI",
    STATEFP == "29" ~ "MISSOURI",
    STATEFP == "30" ~ "MONTANA",
    STATEFP == "31" ~ "NEBRASKA",
    STATEFP == "32" ~ "NEVADA",
    STATEFP == "33" ~ "NEW HAMPSHIRE",
    STATEFP == "34" ~ "NEW JERSEY",
    STATEFP == "35" ~ "NEW MEXICO",
    STATEFP == "36" ~ "NEW YORK",
    STATEFP == "37" ~ "NORTH CAROLINA",
    STATEFP == "38" ~ "NORTH DAKOTA",
    STATEFP == "39" ~ "OHIO",
    STATEFP == "40" ~ "OKLAHOMA",
    STATEFP == "41" ~ "OREGON",
    STATEFP == "42" ~ "PENNSYLVANIA",
    STATEFP == "44" ~ "RHODE ISLAND",
    STATEFP == "45" ~ "SOUTH CAROLINA",
    STATEFP == "46" ~ "SOUTH DAKOTA",
    STATEFP == "47" ~ "TENNESSEE",
    STATEFP == "48" ~ "TEXAS",
    STATEFP == "49" ~ "UTAH",
    STATEFP == "50" ~ "VERMONT",
    STATEFP == "51" ~ "VIRGINIA",
    STATEFP == "53" ~ "WASHINGTON",
    STATEFP == "54" ~ "WEST VIRGINIA",
    STATEFP == "55" ~ "WISCONSIN",
    STATEFP == "56" ~ "WYOMING"
  ))
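The two case_when() blocks above are identical, so the mapping could also live in one named vector that is reused for every Census shapefile. The sketch below is an alternative under stated assumptions, not the code used in this analysis: `fips_lookup` lists only a few codes for illustration, and `toy_districts` is a hypothetical stand-in for a shapefile's attribute table.

```r
library(dplyr)

# Illustrative subset only; a full lookup would list every state plus DC
fips_lookup <- c(
  "01" = "ALABAMA",
  "02" = "ALASKA",
  "04" = "ARIZONA",
  "36" = "NEW YORK"
)

# Hypothetical stand-in for the STATEFP column of a Census shapefile
toy_districts <- tibble(STATEFP = c("36", "01", "36", "72"))

# Index the named vector by code; codes without an entry become NA
toy_districts <- toy_districts |>
  mutate(STATENAME = unname(fips_lookup[STATEFP]))

print(toy_districts)
```

Codes that are not in the lookup (here "72", Puerto Rico) come back as NA, which matches what case_when() does for unmatched values.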

Creating a Systematic Election Data Function for Visualization: In this section, I create a function that systematically processes U.S. presidential election data for each election year. The function takes as input the election year and the corresponding shapefile data and returns a prepared dataset. This allows for easy handling of election data from multiple years, and it can be used to visualize and analyze the results for any given year.

The create_election_data function takes two arguments:
- election_year: the specific year of the presidential election (e.g., 2000, 2004, etc.).
- shapefile_data: the shapefile containing the geographical data for that election year.

It returns:
- dis_election: the merged dataset, which includes both the election results and the shapefile data. The per-year results are then combined and simplified into all_election_simplified.

View Code

# Function to create election data
create_election_data <- function(election_year, shapefile_data) {
  # Step 1: Filter for the specific year and the simplified party
  election_data <- PRESIDENT |>
    filter(year == election_year, office == "US PRESIDENT") |>  # Filter for the specific year and presidential election
    filter(party_simplified %in% c("DEMOCRAT", "REPUBLICAN")) |>
    group_by(state, state_fips, year) |>  # Group by state and year
    summarise(
      winner_party = if_else(sum(candidatevotes[party_simplified == "DEMOCRAT"]) > sum(candidatevotes[party_simplified == "REPUBLICAN"]),
                             "DEMOCRAT", "REPUBLICAN")) |>
    ungroup() |> 
    filter(!is.na(winner_party))
  
  # Step 2: Join with the shapefile data
  dis_election <- left_join(shapefile_data, election_data, by = c("STATENAME" = "state"), relationship = "many-to-many")
  #dis_election$year <- year # add year column
  return(dis_election)
}
# bind election data for each year into one file
all_election_data <- bind_rows(
  election_data_2020 <- create_election_data(2020, tl_2020_us_cd116),
  election_data_2016 <- create_election_data(2016, tl_2016_us_cd115),
  election_data_2012 <- create_election_data(2012, districts112),
  election_data_2008 <- create_election_data(2008, districts111),
  election_data_2004 <- create_election_data(2004, districts108),
  election_data_2000 <- create_election_data(2000, districts106),
  election_data_1996 <- create_election_data(1996, districts103),
  election_data_1992 <- create_election_data(1992, districts102),
  election_data_1988 <- create_election_data(1988, districts101),
  election_data_1984 <- create_election_data(1984, districts098),
  election_data_1980 <- create_election_data(1980, districts097),
  election_data_1976 <- create_election_data(1976, districts095)
)

# simplify map data
sf::sf_use_s2(FALSE)
all_election_simplified <- st_simplify(all_election_data, dTolerance = 0.01)

Creating the Election Results Map: With the combined and simplified election data, we can now create a series of maps to visualize the election results for each year. The code below creates a faceted map of the contiguous U.S. (excluding Alaska and Hawaii), with one panel per election year.

View Code

all_alaska <- all_election_simplified |> filter(STATENAME == "ALASKA")
all_hawaii <- all_election_simplified |> filter(STATENAME == "HAWAII") 
all_main_us <- all_election_simplified |> filter(!STATENAME %in% c("ALASKA", "HAWAII"), !is.na(winner_party))
  
# Main map for the contiguous U.S.
all_map_us <- ggplot(all_main_us, aes(geometry = geometry, fill = winner_party)) +
  geom_sf() + 
  scale_fill_manual(values = c("REPUBLICAN" = "red", "DEMOCRAT" = "blue")) +
  theme_minimal() +
  labs(title = "U.S. Presidential Election Results by State and Year",
       fill = "Winning Party") +
  theme_void() +
  facet_wrap(~ year, ncol=3) 

print(all_map_us)

Comparing the Effects of ECV Allocation Rules

The sections below compare different methods for distributing electoral college votes (ECVs) among candidates in U.S. presidential elections. We want to see whether the rules for distributing ECVs can significantly influence the outcome of an election. Before exploring each allocation scheme, we derive the number of ECVs per state from the House data:
* Each district has one House representative.
* Each state gets R + 2 ECVs, where R is its number of House representatives and the 2 corresponds to its senators.

View Code

# count number of House Representatives using count of unique districts grouped by year and state
ECV <- HOUSE |>
  group_by(state, year) |>  # Group by state and year
  summarise(house_reps = n_distinct(district),  # Count unique districts (House representatives)
            ecv = house_reps + 2, .groups = "drop")  # get ECV by adding 2

State-Wide Winner-Take-All

In this system, the candidate who wins the most votes in a state receives all of that state’s Electoral College votes, regardless of the margin of victory. In most states (all except Nebraska and Maine), if Candidate A wins 51% of the vote, they receive all of that state’s electoral votes, even though Candidate B got 49% of the vote. Each state has a certain number of electoral votes (ECVs), based on its representation in Congress (Senators + House Representatives). Under this system, only the winner of the popular vote in the state gets those votes.

View Code

state_wide_winner_take_all <- PRESIDENT |>
  group_by(state, year) |>
  filter(candidatevotes == max(candidatevotes)) |>
  left_join(ECV, by = c("state" = "state", "year" = "year")) |>
  select(state, year, candidate, party_simplified, ecv) |>
  filter(!is.na(ecv))

state_wide_winner_take_all <- state_wide_winner_take_all |> 
  group_by(year, candidate, party_simplified) |> 
  summarise(total_ecv = sum(ecv), .groups = "drop") |> # total ecv
  arrange(year, total_ecv) |> 
  group_by(year) |> 
  mutate(winner = if_else(total_ecv == max(total_ecv), "Yes", "No")) |>  # Mark winner
  ungroup() |>
  arrange(year, party_simplified)

state_wide_winner_take_all |> gt() |>
  tab_header(
    title = "State-Wide Winner-Take-All"
  ) |>
  cols_label( # display column names
    year = "Year",
    candidate = "Candidate",
    party_simplified = "Party",
    total_ecv = "Electoral Votes",
    winner = "Winning Candidate"
  )

State-Wide Winner-Take-All
Year	Candidate	Party	Electoral Votes	Winning Candidate
1976	CARTER, JIMMY	DEMOCRAT	294	Yes
1976	FORD, GERALD	REPUBLICAN	241	No
1980	CARTER, JIMMY	DEMOCRAT	87	No
1980	REAGAN, RONALD	REPUBLICAN	448	Yes
1984	MONDALE, WALTER	DEMOCRAT	10	No
1984	REAGAN, RONALD	REPUBLICAN	525	Yes
1988	DUKAKIS, MICHAEL	DEMOCRAT	109	No
1988	BUSH, GEORGE H.W.	REPUBLICAN	426	Yes
1992	CLINTON, BILL	DEMOCRAT	367	Yes
1992	BUSH, GEORGE H.W.	REPUBLICAN	168	No
1996	CLINTON, BILL	DEMOCRAT	376	Yes
1996	DOLE, ROBERT	REPUBLICAN	159	No
2000	GORE, AL	DEMOCRAT	264	No
2000	BUSH, GEORGE W.	REPUBLICAN	271	Yes
2004	KERRY, JOHN	DEMOCRAT	249	No
2004	BUSH, GEORGE W.	REPUBLICAN	286	Yes
2008	OBAMA, BARACK H.	DEMOCRAT	361	Yes
2008	MCCAIN, JOHN	REPUBLICAN	174	No
2012	OBAMA, BARACK H.	DEMOCRAT	329	Yes
2012	ROMNEY, MITT	REPUBLICAN	206	No
2016	CLINTON, HILLARY	DEMOCRAT	230	No
2016	TRUMP, DONALD J.	REPUBLICAN	305	Yes
2020	BIDEN, JOSEPH R. JR	DEMOCRAT	306	Yes
2020	TRUMP, DONALD J.	REPUBLICAN	232	No

View Code

ggplot(state_wide_winner_take_all, aes(x = factor(year), y = total_ecv, fill = party_simplified)) +
  geom_bar(stat = "identity", position = "dodge") +  #  keeps bars side-by-side
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red")) +
  theme_minimal() +
  labs(
    title = "Total ECV Votes for Each Candidate in U.S. Presidential Elections",
    x = "Year",
    y = "Total ECV",
    fill = "Party"
  )
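The winner-take-all rule in this section can be reduced to a toy example. The three states, two candidates, vote counts, and ECV values below are all made up; the sketch only shows how state-by-state allocation can diverge from the overall vote. It uses slice_max() with with_ties = FALSE, which keeps exactly one winner per state, whereas the filter on max(candidatevotes) above would keep every tied row.

```r
library(dplyr)

# Hypothetical vote counts in three made-up states
toy_votes <- tibble(
  state          = c("X", "X", "Y", "Y", "Z", "Z"),
  candidate      = c("A", "B", "A", "B", "A", "B"),
  candidatevotes = c(510, 490, 200, 800, 300, 100)
)

# Hypothetical electoral votes per state
toy_ecv <- tibble(state = c("X", "Y", "Z"), ecv = c(10, 4, 3))

toy_wta <- toy_votes |>
  group_by(state) |>
  slice_max(candidatevotes, n = 1, with_ties = FALSE) |>  # one winner per state
  ungroup() |>
  left_join(toy_ecv, by = "state") |>
  group_by(candidate) |>
  summarise(total_ecv = sum(ecv), .groups = "drop")

toy_popular <- toy_votes |>
  group_by(candidate) |>
  summarise(votes = sum(candidatevotes), .groups = "drop")

print(toy_wta)
print(toy_popular)
```

Candidate A narrowly carries the largest state and ends up with 13 of 17 ECVs while trailing 1,010 to 1,390 in total votes, the kind of divergence this analysis sets out to examine.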

District-Wide Winner-Take-All + State-Wide “At Large” Votes

This method allocates R ECVs to the popular vote winners of the individual congressional districts and the remaining 2 ECVs to the state-wide popular vote winner. This hybrid system is used in Maine and Nebraska. In each congressional district, the candidate who wins the popular vote gets one electoral vote. The state as a whole then gives two additional “at-large” ECVs to the candidate who wins the state-wide popular vote. For example, Nebraska has three districts and five ECVs. If Candidate A wins two districts, and Candidate B wins the third district and the state-wide popular vote, the electoral votes are split like this:
- Candidate A: 2 ECVs from the districts.
- Candidate B: 1 ECV from the district plus 2 ECVs for winning the state-wide vote, 3 in total.

This system allows a state’s ECVs to be split, unlike the winner-take-all system, where a candidate who wins the state by a narrow margin still receives all of the state’s votes.
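The district-plus-at-large arithmetic can be sketched for a single hypothetical three-district state. All vote counts are made up, and the sketch assumes district-level presidential votes are available, which differs from the analysis code, where House results stand in for them.

```r
library(dplyr)

# Hypothetical presidential votes by district in one three-district state
toy_districts <- tibble(
  district  = c(1, 1, 2, 2, 3, 3),
  candidate = c("A", "B", "A", "B", "A", "B"),
  votes     = c(60, 40, 55, 45, 10, 90)
)

# One ECV per district, awarded to the district winner
district_ecv <- toy_districts |>
  group_by(district) |>
  slice_max(votes, n = 1, with_ties = FALSE) |>
  ungroup() |>
  count(candidate, name = "district_ecv")

# Two at-large ECVs for the state-wide winner
at_large_ecv <- toy_districts |>
  group_by(candidate) |>
  summarise(state_votes = sum(votes), .groups = "drop") |>
  mutate(at_large_ecv = if_else(state_votes == max(state_votes), 2, 0))

toy_allocation <- full_join(district_ecv, at_large_ecv, by = "candidate") |>
  mutate(
    district_ecv = coalesce(district_ecv, 0L),  # candidates who won no district
    total_ecv = district_ecv + at_large_ecv
  )

print(toy_allocation)
```

Candidate A wins two districts for 2 ECVs; Candidate B wins one district and the state-wide vote (175 to 125) for 3 ECVs, so the state's five votes are split.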

View Code

# look at statewide winner - assign 2 ecv
state_wide_winner <- PRESIDENT |>
  group_by(state, year) |>
  mutate(statewide_winner = if_else(candidatevotes == max(candidatevotes), "Yes", "No")) |>  # Mark statewide winner
  ungroup() |>
  # Assign ECV based on who won the state
  mutate(ECV = if_else(statewide_winner == "Yes", 2, 0)) |> # assign the 2 ECV if statewide winner, else 0 ECV
  select(state, year, candidate, candidatevotes, ECV) |>
  filter(!is.na(candidate))

# look at winner of district - assign 1 ecv per district
# Assume that the presidential candidate of the same party as the congressional representative wins that election.
# Find the winner of each district in the HOUSE dataset
district_winners <- HOUSE |>
  filter(year %in% seq(1976, 2020, by = 4)) |>  # presidential election years only
  group_by(state, year, district) |>
  filter(candidatevotes == max(candidatevotes)) |>
  ungroup() |>
  mutate(ecv = 1)  # Assign 1 ECV for each winning district

# Join the district winners with the PRESIDENT dataset to match the party
ecv_assignment <- district_winners |>
  left_join(PRESIDENT, by = c("state", "year", "party" = "party_simplified"), relationship = "many-to-many") |>
  mutate(ecv_presidential = 1) |>
  select(state, year, district, candidate.y, party, ecv_presidential)

#  Find total ecv from districts
district_ecv_summary <- ecv_assignment |>
  group_by(state, year, candidate.y, party) |>
  summarise(district_total_ecv = sum(ecv_presidential), .groups = "drop")

#  Join the district-level ECV summary with the statewide ECVs
ecv_combined <- state_wide_winner |>
  left_join(district_ecv_summary, by = c("state", "year", "candidate" = "candidate.y")) |>
  # Add the statewide ECV to the district-level ECVs
  mutate(total_ecv = district_total_ecv + ECV) |>
  filter(!is.na(total_ecv)) 

ecv_combined <- ecv_combined |> 
  group_by(year, candidate, party) |> 
  summarise(total_ecv = sum(total_ecv), .groups = "drop") |> # total ecv
  arrange(year, total_ecv) |> 
  group_by(year) |> 
  mutate(winner = if_else(total_ecv == max(total_ecv), "Yes", "No")) |>  # Mark winner
  ungroup() 

ecv_combined |> gt() |>
  tab_header(
    title = "District-Wide Winner-Take-All + State-Wide At Large Votes"
  ) |>
  cols_label( # display column names
    year = "Year",
    candidate = "Candidate",
    party = "Party",
    total_ecv = "Electoral Votes",
    winner = "Winning Candidate"
  )

District-Wide Winner-Take-All + State-Wide At Large Votes
Year	Candidate	Party	Electoral Votes	Winning Candidate
1976	FORD, GERALD	REPUBLICAN	204	No
1976	CARTER, JIMMY	DEMOCRAT	362	Yes
1980	CARTER, JIMMY	DEMOCRAT	258	No
1980	REAGAN, RONALD	REPUBLICAN	287	Yes
1984	MONDALE, WALTER	DEMOCRAT	276	No
1984	REAGAN, RONALD	REPUBLICAN	283	Yes
1988	BUSH, GEORGE H.W.	REPUBLICAN	262	No
1988	DUKAKIS, MICHAEL	DEMOCRAT	292	Yes
1992	BUSH, GEORGE H.W.	REPUBLICAN	228	No
1992	CLINTON, BILL	DEMOCRAT	329	Yes
1996	CLINTON, BILL	DEMOCRAT	282	No
1996	DOLE, ROBERT	REPUBLICAN	283	Yes
2000	GORE, AL	DEMOCRAT	280	No
2000	BUSH, GEORGE W.	REPUBLICAN	290	Yes
2004	OTHER	DEMOCRAT	12	No
2004	KERRY, JOHN	DEMOCRAT	248	No
2004	BUSH, GEORGE W.	REPUBLICAN	299	Yes
2008	MCCAIN, JOHN	REPUBLICAN	224	No
2008	OBAMA, BARACK H.	DEMOCRAT	330	Yes
2012	OBAMA, BARACK H.	DEMOCRAT	275	No
2012	ROMNEY, MITT	REPUBLICAN	286	Yes
2016	CLINTON, HILLARY	DEMOCRAT	291	No
2016	TRUMP, DONALD J.	REPUBLICAN	313	Yes
2020	TRUMP, DONALD J.	REPUBLICAN	263	No
2020	BIDEN, JOSEPH R. JR	DEMOCRAT	275	Yes

View Code

ggplot(ecv_combined, aes(x = factor(year), y = total_ecv, fill = party)) +
  geom_bar(stat = "identity", position = "dodge") +  #  keeps bars side-by-side
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red")) +
  theme_minimal() +
  labs(
    title = "Total ECV Votes for Each Candidate in U.S. Presidential Elections",
    x = "Year",
    y = "Total ECV",
    fill = "Party"
  )

State-Wide Proportional

Under this system, electoral votes are distributed proportionally based on the percentage of votes each candidate receives in the state. If a candidate wins 60% of the vote in a state with 10 electoral votes, they get 60% of those electoral votes (6 ECVs). The approach here involves calculating the total number of votes for each candidate in each state, then determining the proportion of the state’s total vote that each candidate received.

Note: Because each candidate’s proportional share is rounded to a whole number, the allocated ECVs may not sum to the total number of ECVs available for that state or for the entire country. Here, I assign the difference to the candidate with the greatest proportion of votes in the state.
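The rounding step can be illustrated with a small worked example. This is a minimal sketch in base R, assuming a hypothetical state with 10 ECVs and three candidates; the vote counts and the variable names (`votes`, `state_ecv`, `final_ecv`) are made up for illustration.

```r
# Hypothetical state with 10 ECVs and three candidates (illustrative vote counts)
votes <- c(A = 5450, B = 3350, C = 1200)
state_ecv <- 10

# Proportional share of the state's ECVs, rounded to whole ECVs
vote_share <- votes / sum(votes)
rounded <- round(vote_share * state_ecv)

# ECVs left over (or over-allocated, if negative) after rounding
remaining <- state_ecv - sum(rounded)

# Assign the difference to the candidate with the largest vote share
final_ecv <- rounded
leader <- which.max(vote_share)
final_ecv[leader] <- final_ecv[leader] + remaining

print(rounded)
print(final_ecv)
```

Note that `remaining` can also be negative when rounding over-allocates, in which case the same adjustment removes the surplus from the leading candidate. Other remainder rules exist; this sketch only mirrors the "give it to the leader" choice described in the note above.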

View Code

# Allocate ECVs based on that proportion with rounding 
state_proportional_votes <- PRESIDENT |>
  group_by(state, year) |>
  mutate(vote_share = candidatevotes / sum(candidatevotes)) |> # Proportion of votes
  ungroup() |>
  left_join(ECV, by = c("state", "year")) |>
  mutate(proportional_ecv = round(vote_share * ecv))  # Round to allocate ECVs

# Summarize the total ECVs for each candidate by state and year
state_proportional_summary <- state_proportional_votes |>
  group_by(state, year, candidate, party_simplified) |>
  summarise(total_proportional_ecv = sum(proportional_ecv), .groups = "drop") |>
  arrange(state, year, total_proportional_ecv) |>
  group_by(state, year) |>
  # Mark the winner with the most ECVs in each state and year
  mutate(winner = if_else(total_proportional_ecv == max(total_proportional_ecv), "Yes", "No")) |> 
  ungroup() 

# When we use proportions and round, the rounded ECVs may not sum to the state's total
# Allocate ECVs proportionally, round to the nearest whole ECV, then assign the difference
state_wide_prop <- PRESIDENT |>
  group_by(state, year) |>
  mutate(vote_share = candidatevotes / sum(candidatevotes)) |> # Proportion of votes
  ungroup() |>
  left_join(ECV, by = c("state", "year")) |>
  mutate(prop_ecv = vote_share * ecv, round_prop_ecv = round(vote_share * ecv))  |>  # Round ECVs
  group_by(state, year) |>
  mutate(remaining_ecvs = ecv - sum(round_prop_ecv)) |>  # Calculate how many ECVs are left to allocate
  ungroup() |>
  
  # assign remainder to the max unrounded proportion
  group_by(state, year) |>

  mutate(final_ecv = ifelse(vote_share == max(vote_share), 
                            round_prop_ecv + remaining_ecvs, 
                            round_prop_ecv)) |>  # Allocate remaining ECVs to the candidate with max vote share
  ungroup() |>
  select(year, state, candidate, party_simplified, ecv, prop_ecv, round_prop_ecv, remaining_ecvs, final_ecv)

# Summarize the total allocated ECVs for each candidate
state_wide_prop_summary <- state_wide_prop |>
  group_by(state, year, candidate, party_simplified) |>
  summarise(total_prop_ecv = sum(final_ecv), .groups = "drop") |>
  group_by(year, state) |>
  mutate(winner = if_else(total_prop_ecv == max(total_prop_ecv), "Yes", "No")) |> 
  ungroup() |>
  filter(total_prop_ecv > 0) |>
  select(year, candidate, party_simplified, total_prop_ecv, winner)

# across states for the year
state_wide_totals <- state_wide_prop_summary |>
  group_by(year, candidate, party_simplified) |>
  summarise(total_ecv = sum(total_prop_ecv), .groups = "drop") |>
  group_by(year) |>
  mutate(winner = if_else(total_ecv == max(total_ecv), "Yes", "No")) |> 
  ungroup() |>
  filter(total_ecv > 0) |>
  select(year, candidate, party_simplified, total_ecv, winner) |>
  arrange(year, desc(total_ecv))

View Code

ggplot(state_wide_totals, aes(x = factor(year), y = total_ecv, fill = party_simplified)) +
  geom_bar(stat = "identity", position = "dodge") +  #  keeps bars side-by-side
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red", "LIBERTARIAN" = "beige", "OTHER" = "gray")) +
  theme_minimal() +
  labs(
    title = "Total ECV Votes for Each Candidate in U.S. Presidential Elections",
    x = "Year",
    y = "Total ECV",
    fill = "Party"
  )

View Code

state_wide_totals |> gt() |>
  tab_header(
    title = "State-Wide Proportional"
  ) |>
  cols_label( # display column names
    year = "Year",
    candidate = "Candidate",
    party_simplified = "Party",
    total_ecv = "Electoral Votes",
    winner = "Winning Candidate"
  )

State-Wide Proportional
Year	Candidate	Party	Electoral Votes	Winning Candidate
1976	CARTER, JIMMY	DEMOCRAT	270	Yes
1976	FORD, GERALD	REPUBLICAN	261	No
1976	FORD, GERALD	OTHER	2	No
1976	CARTER, JIMMY	OTHER	1	No
1976	OTHER	OTHER	1	No
1980	REAGAN, RONALD	REPUBLICAN	281	Yes
1980	CARTER, JIMMY	DEMOCRAT	220	No
1980	ANDERSON, JOHN B.	OTHER	31	No
1980	REAGAN, RONALD	OTHER	2	No
1980	CLARK, EDWARD ""ED""	LIBERTARIAN	1	No
1984	REAGAN, RONALD	REPUBLICAN	321	Yes
1984	MONDALE, WALTER	DEMOCRAT	211	No
1984	REAGAN, RONALD	OTHER	2	No
1984	MONDALE, WALTER	OTHER	1	No
1988	BUSH, GEORGE H.W.	REPUBLICAN	291	Yes
1988	DUKAKIS, MICHAEL	DEMOCRAT	242	No
1988	BUSH, GEORGE H.W.	OTHER	1	No
1988	DUKAKIS, MICHAEL	OTHER	1	No
1992	CLINTON, BILL	DEMOCRAT	226	Yes
1992	BUSH, GEORGE H.W.	REPUBLICAN	203	No
1992	PEROT, ROSS	OTHER	103	No
1992	BUSH, GEORGE H.W.	OTHER	2	No
1992	BLANK VOTE/SCATTERING	OTHER	1	No
1996	CLINTON, BILL	DEMOCRAT	262	Yes
1996	DOLE, ROBERT	REPUBLICAN	223	No
1996	PEROT, ROSS	OTHER	42	No
1996	NA	OTHER	4	No
1996	BLANK VOTE/SCATTERING	OTHER	1	No
1996	CLINTON, BILL	OTHER	1	No
1996	DOLE, ROBERT	OTHER	1	No
1996	NADER, RALPH	OTHER	1	No
2000	GORE, AL	DEMOCRAT	263	Yes
2000	BUSH, GEORGE W.	REPUBLICAN	262	No
2000	NADER, RALPH	OTHER	6	No
2000	BLANK VOTE/SCATTERING	OTHER	1	No
2000	BUSH, GEORGE W.	OTHER	1	No
2000	NOT DESIGNATED	OTHER	1	No
2000	NA	OTHER	1	No
2004	BUSH, GEORGE W.	REPUBLICAN	278	Yes
2004	KERRY, JOHN	DEMOCRAT	255	No
2004	BUSH, GEORGE W.	OTHER	1	No
2004	KERRY, JOHN	OTHER	1	No
2008	OBAMA, BARACK H.	DEMOCRAT	285	Yes
2008	MCCAIN, JOHN	REPUBLICAN	247	No
2008	MCCAIN, JOHN	OTHER	2	No
2008	OBAMA, BARACK H.	OTHER	1	No
2012	OBAMA, BARACK H.	DEMOCRAT	271	Yes
2012	ROMNEY, MITT	REPUBLICAN	261	No
2012	JOHNSON, GARY	LIBERTARIAN	1	No
2012	OBAMA, BARACK H.	OTHER	1	No
2012	ROMNEY, MITT	OTHER	1	No
2016	CLINTON, HILLARY	DEMOCRAT	265	Yes
2016	TRUMP, DONALD J.	REPUBLICAN	257	No
2016	JOHNSON, GARY	LIBERTARIAN	8	No
2016	CLINTON, HILLARY	OTHER	1	No
2016	MCMULLIN, EVAN	OTHER	1	No
2016	STEIN, JILL	OTHER	1	No
2016	TRUMP, DONALD J.	OTHER	1	No
2016	NA	OTHER	1	No
2020	BIDEN, JOSEPH R. JR	DEMOCRAT	273	Yes
2020	TRUMP, DONALD J.	REPUBLICAN	264	No
2020	JORGENSEN, JO	LIBERTARIAN	1	No

National Proportional

This system allocates ECVs based on the national popular vote, not state-by-state. Each candidate’s share of the total ECVs matches their share of all votes cast nationwide. If Candidate A wins 60% of the total national popular vote and Candidate B wins 40%, Candidate A would receive 60% of the total ECVs, and Candidate B would get 40%, regardless of how they performed in any individual state. This system would reduce the importance of individual states and the swing state effect, and would tie election outcomes more directly to the national popular vote.
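The 60/40 example can be worked through in a few lines. This is a minimal sketch in base R, assuming hypothetical national vote totals and 538 total ECVs; the names `national_votes` and `national_ecv` are illustrative only.

```r
# Hypothetical national vote totals (illustrative only)
national_votes <- c(A = 60e6, B = 40e6)
total_ecv <- 538

# Each candidate's share of the national vote
prop_vote <- national_votes / sum(national_votes)

# Share of the national vote times the total ECVs, rounded to whole ECVs
national_ecv <- round(prop_vote * total_ecv)
print(national_ecv)
```

With only two candidates the rounded values happen to add up to the full total here. With many small candidates, each share is rounded independently, so the rounded values need not sum exactly to the number of ECVs available.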

View Code

# Find total ECV for each year 
electoral_votes_available <- ECV |>
  group_by(year) |>
  summarize(total_ecv = sum(ecv)) # sum ecv

nation_wide_prop <- PRESIDENT |>
  select(year, state, candidate, candidatevotes, party_simplified) |>
  group_by(year, candidate, party_simplified) |>
  summarize(candidate_total = sum(candidatevotes)) |> # total votes nationwide per candidate per year
  group_by(year) |>
  mutate(nation_total = sum(candidate_total)) |>  # total votes nationwide per year
  ungroup() |>
  mutate(prop_vote = (candidate_total / nation_total)) |> # proportion of candidate votes to nationwide votes
  select(-candidate_total, -nation_total) |>
  left_join(electoral_votes_available, join_by(year == year)) |> # join with ECV
  mutate(prop_ecv = round(prop_vote * total_ecv, digits = 0)) |> # multiply proportion to total ecv that year
  select(-prop_vote, -total_ecv) |>
  group_by(year)

# Summarize the total allocated ECVs for each candidate
nation_wide_summary <- nation_wide_prop |>
  group_by(year) |>
  mutate(winner = if_else(prop_ecv == max(prop_ecv), "Yes", "No")) |> 
  ungroup() |>
  filter(prop_ecv > 0, !is.na(candidate)) |>
  select(year, candidate, prop_ecv, winner, party_simplified) |>
  arrange(year, desc(prop_ecv))

View Code

ggplot(nation_wide_summary, aes(x = factor(year), y = prop_ecv, fill = party_simplified)) +
  geom_bar(stat = "identity", position = "dodge") +  #  keeps bars side-by-side
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red", "LIBERTARIAN" = "beige", "OTHER" = "gray")) +
  theme_minimal() +
  labs(
    title = "Total ECV Votes for Each Candidate in U.S. Presidential Elections",
    x = "Year",
    y = "Total ECV",
    fill = "Party"
  )

View Code

nation_wide_summary |> gt() |>
  tab_header(
    title = "Nation-Wide Proportional"
  ) |>
  cols_label( # display column names
    year = "Year",
    candidate = "Candidate",
    party_simplified = "Party",
    prop_ecv = "Electoral Votes",
    winner = "Winning Candidate"
  )

Nation-Wide Proportional
Year	Candidate	Electoral Votes	Winning Candidate	Party
1976	CARTER, JIMMY	267	Yes	DEMOCRAT
1976	FORD, GERALD	255	No	REPUBLICAN
1976	MCCARTHY, EUGENE ""GENE""	4	No	OTHER
1976	FORD, GERALD	2	No	OTHER
1976	ANDERSON, THOMAS J.	1	No	OTHER
1976	CAMEJO, PETER	1	No	OTHER
1976	CARTER, JIMMY	1	No	OTHER
1976	MACBRIDE, ROGER	1	No	LIBERTARIAN
1976	MADDOX, LESTER	1	No	OTHER
1976	OTHER	1	No	OTHER
1980	REAGAN, RONALD	270	Yes	REPUBLICAN
1980	CARTER, JIMMY	219	No	DEMOCRAT
1980	ANDERSON, JOHN B.	35	No	OTHER
1980	CLARK, EDWARD ""ED""	5	No	LIBERTARIAN
1980	REAGAN, RONALD	2	No	OTHER
1980	COMMONER, BARRY	1	No	OTHER
1984	REAGAN, RONALD	313	Yes	REPUBLICAN
1984	MONDALE, WALTER	216	No	DEMOCRAT
1984	REAGAN, RONALD	2	No	OTHER
1984	BERGLAND, DAVID	1	No	LIBERTARIAN
1984	MONDALE, WALTER	1	No	OTHER
1988	BUSH, GEORGE H.W.	284	Yes	REPUBLICAN
1988	DUKAKIS, MICHAEL	244	No	DEMOCRAT
1988	PAUL, RONALD ""RON""	2	No	LIBERTARIAN
1988	BUSH, GEORGE H.W.	1	No	OTHER
1988	DUKAKIS, MICHAEL	1	No	OTHER
1988	FULANI, LENORA	1	No	OTHER
1992	CLINTON, BILL	229	Yes	DEMOCRAT
1992	BUSH, GEORGE H.W.	198	No	REPUBLICAN
1992	PEROT, ROSS	101	No	OTHER
1992	BUSH, GEORGE H.W.	2	No	OTHER
1992	BLANK VOTE/SCATTERING	1	No	OTHER
1992	MARROU, ANDRE	1	No	LIBERTARIAN
1996	CLINTON, BILL	263	Yes	DEMOCRAT
1996	DOLE, ROBERT	216	No	REPUBLICAN
1996	PEROT, ROSS	42	No	OTHER
1996	BROWNE, HARRY	3	No	LIBERTARIAN
1996	NADER, RALPH	3	No	OTHER
1996	BLANK VOTE/SCATTERING	1	No	OTHER
1996	CLINTON, BILL	1	No	OTHER
1996	DOLE, ROBERT	1	No	OTHER
1996	HAGELIN, JOHN	1	No	OTHER
1996	PHILLIPS, HOWARD	1	No	OTHER
2000	GORE, AL	258	Yes	DEMOCRAT
2000	BUSH, GEORGE W.	255	No	REPUBLICAN
2000	NADER, RALPH	13	No	OTHER
2000	BROWNE, HARRY	2	No	LIBERTARIAN
2000	BUCHANAN, PATRICK ""PAT""	2	No	OTHER
2000	BLANK VOTE/SCATTERING	1	No	OTHER
2000	BUSH, GEORGE W.	1	No	OTHER
2000	GORE, AL	1	No	OTHER
2000	NOT DESIGNATED	1	No	OTHER
2004	BUSH, GEORGE W.	271	Yes	REPUBLICAN
2004	KERRY, JOHN	258	No	DEMOCRAT
2004	BADNARIK, MICHAEL	2	No	LIBERTARIAN
2004	NADER, RALPH	2	No	OTHER
2004	BUSH, GEORGE W.	1	No	OTHER
2004	COBB, DAVID	1	No	OTHER
2004	KERRY, JOHN	1	No	OTHER
2004	OTHER	1	No	OTHER
2004	PEROUTKA, MICHAEL	1	No	OTHER
2008	OBAMA, BARACK H.	282	Yes	DEMOCRAT
2008	MCCAIN, JOHN	243	No	REPUBLICAN
2008	NADER, RALPH	3	No	OTHER
2008	BARR, BOB	2	No	LIBERTARIAN
2008	BALDWIN, CHARLES ""CHUCK""	1	No	OTHER
2008	MCCAIN, JOHN	1	No	OTHER
2008	MCKINNEY, CYNTHIA	1	No	OTHER
2008	OBAMA, BARACK H.	1	No	OTHER
2012	OBAMA, BARACK H.	272	Yes	DEMOCRAT
2012	ROMNEY, MITT	251	No	REPUBLICAN
2012	JOHNSON, GARY	5	No	LIBERTARIAN
2012	STEIN, JILL	2	No	OTHER
2012	OBAMA, BARACK H.	1	No	OTHER
2012	ROMNEY, MITT	1	No	OTHER
2016	CLINTON, HILLARY	257	Yes	DEMOCRAT
2016	TRUMP, DONALD J.	245	No	REPUBLICAN
2016	JOHNSON, GARY	16	No	LIBERTARIAN
2016	STEIN, JILL	5	No	OTHER
2016	MCMULLIN, EVAN	2	No	OTHER
2016	BLANK VOTE	1	No	OTHER
2016	CASTLE, DARRELL L.	1	No	OTHER
2016	CLINTON, HILLARY	1	No	OTHER
2016	OTHER	1	No	OTHER
2016	SCATTERING	1	No	OTHER
2016	TRUMP, DONALD J.	1	No	OTHER
2020	BIDEN, JOSEPH R. JR	276	Yes	DEMOCRAT
2020	TRUMP, DONALD J.	252	No	REPUBLICAN
2020	JORGENSEN, JO	6	No	LIBERTARIAN
2020	HAWKINS, HOWIE	1	No	OTHER

Evaluating Fairness of ECV Allocation Schemes

Fact Check Example: The 2000 U.S. Presidential Election

The 2000 U.S. presidential election between George W. Bush (Republican) and Al Gore (Democrat) provides a compelling case study of the Electoral College’s impact. Bush won the presidency despite losing the popular vote by approximately 500,000 votes. This resulted in widespread criticism of the electoral system.

State-Wide Winner-Take-All: Bush: 271 ECVs (Winner), Gore: 266 ECVs (Loser). Under this system, Bush wins because he narrowly carries key battleground states, including Florida, despite Gore’s national popular vote lead.

District-Wide Winner-Take-All + State-Wide At-Large Votes: Bush: 290 ECVs (Winner), Gore: 280 ECVs (Loser). In the table computed above, this hybrid system gives Bush a slightly larger margin due to the district-level distribution of votes, which tends to favor Republicans in many of the congressional districts.

State-Wide Proportional: Gore: 263 ECVs (Winner), Bush: 262 ECVs (Loser). This system allocates ECVs proportionally based on the percentage of the popular vote in each state. Gore edges out Bush by a single ECV because his share of the popular vote is larger nationwide, leading to a more direct representation of voter preferences.

National Proportional: Gore: 258 ECVs (Winner), Bush: 255 ECVs (Loser). In a national proportional system, Gore’s larger share of the national vote translates to a narrow lead, highlighting the disparity between the Electoral College and the popular vote outcome.

In the 2000 election, the State-Wide Winner-Take-All system favored George W. Bush, despite Al Gore winning the popular vote. The National Proportional system would have put Gore ahead, aligning the ECVs more closely with the popular vote. This highlights how winner-take-all methods can distort the will of the majority.

View Code

# State-Wide Winner-Take-All
state_wide_winner_only <- state_wide_winner_take_all |>
  group_by(year) |>
  mutate(winner_party = if_else(total_ecv == max(total_ecv), party_simplified, NA_character_),
         winning_candidate = if_else(total_ecv == max(total_ecv), candidate, NA_character_)) |>
  ungroup() |>
  select(year, party_simplified, winning_candidate, winner_party, total_ecv) |>
  filter(!is.na(winning_candidate))


# District-Wide Winner-Take-All + State-Wide At Large Votes
district_wide_winner_only <- ecv_combined |>
  group_by(year) |>
  mutate(winner_party = if_else(total_ecv == max(total_ecv), party, NA_character_),
         winning_candidate = if_else(total_ecv == max(total_ecv), candidate, NA_character_)) |>
  ungroup() |>
  select(year, winning_candidate, winner_party, total_ecv) |>
  filter(!is.na(winning_candidate))

# State-Wide Proportional
state_prop_winner_only <- state_wide_totals |>
  group_by(year) |>
  mutate(winner_party = if_else(total_ecv == max(total_ecv), party_simplified, NA_character_),
         winning_candidate = if_else(total_ecv == max(total_ecv), candidate, NA_character_)) |>
  ungroup() |>
  select(year, winning_candidate, winner_party, total_ecv) |>
  filter(!is.na(winning_candidate))

# Nation-Wide Proportional
nation_prop_winner_only <- nation_wide_summary |>
  group_by(year) |>
  mutate(winner_party = if_else(prop_ecv == max(prop_ecv), party_simplified, NA_character_),
         winning_candidate = if_else(prop_ecv == max(prop_ecv), candidate, NA_character_)) |>
  ungroup() |>
  select(year, winning_candidate, winner_party, prop_ecv) |>
  filter(!is.na(winning_candidate))

# Join the four datasets by 'year'
winners_comparison <- state_wide_winner_only |>
  left_join(district_wide_winner_only, by = "year", suffix = c("_state_wide", "_district_wide")) |>
  left_join(state_prop_winner_only, by = "year", suffix = c("", "_state_prop")) |>
  left_join(nation_prop_winner_only, by = "year", suffix = c("", "_nation_prop"))

# Select relevant columns for displaying the result
winners_comparison <- winners_comparison |>
  select(-party_simplified)

# Display the results for 2000
winners_comparison |> 
  filter(year == 2000) |>
  gt() |>
  tab_header(
    title = "Winning Candidates by ECV Allocation Method (2000)"
  ) |>
  cols_label( # Display column names
    year = "Year",
    winning_candidate_state_wide = "Winning Candidate",
    winner_party_state_wide = "Winner Party",
    total_ecv_state_wide = "Total ECV",
    winning_candidate_district_wide = "Winning Candidate",
    winner_party_district_wide = "Winner Party",
    total_ecv_district_wide = "Total ECV",
    winning_candidate = "Winning Candidate",
    winner_party = "Winner Party",
    total_ecv = "Total ECV",
    winning_candidate_nation_prop = "Winning Candidate",
    winner_party_nation_prop = "Winner Party",
    prop_ecv = "Total ECV"
  ) |>
  tab_spanner(
    label = "State-Wide Winner-Take-All",
    columns = c(winning_candidate_state_wide, winner_party_state_wide, total_ecv_state_wide)
  ) |>
  tab_spanner(
    label = "District-Wide Winner-Take-All + State-Wide At Large Votes",
    columns = c(winning_candidate_district_wide , winner_party_district_wide, total_ecv_district_wide)
  ) |>
  tab_spanner(
    label = "State-Wide Proportional",
    columns = c(winning_candidate, winner_party, total_ecv)
  ) |>
  tab_spanner(
    label = "Nation-Wide Proportional",
    columns = c(winning_candidate_nation_prop, winner_party_nation_prop, prop_ecv)
  ) |>
  tab_style(
    style = list(cell_text(weight = "bold")),
    locations = cells_column_labels(columns = 1:13)
  )

Winning Candidates by ECV Allocation Method (2000)
Year	State-Wide Winner-Take-All			District-Wide Winner-Take-All + State-Wide At Large Votes			State-Wide Proportional			Nation-Wide Proportional
Year	Winning Candidate	Winner Party	Total ECV	Winning Candidate	Winner Party	Total ECV	Winning Candidate	Winner Party	Total ECV	Winning Candidate	Winner Party	Total ECV
2000	BUSH, GEORGE W.	REPUBLICAN	271	BUSH, GEORGE W.	REPUBLICAN	290	GORE, AL	DEMOCRAT	263	GORE, AL	DEMOCRAT	258

The proportional allocation methods also tend to assign a small number of ECVs to parties other than Republican or Democrat, such as Other or Libertarian, as seen below. However, since these ECV totals are low relative to the national total, I filter them out in the second plot below.

View Code

winners_2000 <- bind_rows( # bind different allocation methods together to plot in a facet
  state_wide_winner_take_all |> filter(year == 2000) |>
    mutate(allocation_method = "State-Wide Winner-Take-All"),
  ecv_combined |> filter(year == 2000) |>
    mutate(allocation_method = "District-Wide Winner-Take-All + State-Wide At Large Votes") |>
    rename(party_simplified = party),
  state_wide_totals |> filter(year == 2000) |>
    mutate(allocation_method = "State-Wide Proportional"),
  nation_wide_summary |> filter(year == 2000) |>
    mutate(allocation_method = "Nation-Wide Proportional") |>
    rename(total_ecv = prop_ecv)
)

ggplot(winners_2000, aes(x = party_simplified, y = total_ecv, fill = party_simplified)) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(
    aes(label = total_ecv), # label with ECV
    position = position_stack(vjust = 0.5), color = "black", size = 3  # 
  )+
  facet_wrap(~allocation_method, scales = "free_y") +  # Facet by allocation method
  labs(
    title = "Total ECV Votes by Allocation Method (2000)",
    x = "Party",
    y = "Total ECV Votes",
    fill = "Party"
  ) +
  theme_minimal() +
  scale_fill_manual(
    values = c("REPUBLICAN" = "red", "DEMOCRAT" = "blue", "LIBERTARIAN" = "beige", "OTHER" = "gray")  
  ) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1), # Rotate party names for better visibility
    legend.position = "none" # Hide legend since it's already represented by colors
  )

View Code

winners_2000_na <- winners_2000 |>   filter(party_simplified %in% c("DEMOCRAT", "REPUBLICAN"))

ggplot(winners_2000_na, aes(x = party_simplified, y = total_ecv, fill = party_simplified)) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(
    aes(label = total_ecv), # label with ECV
    position = position_stack(vjust = 0.5), color = "black", size = 3  # 
  ) +
  facet_wrap(~allocation_method, scales = "free_y") +  # Facet by allocation method
  labs(
    title = "Total ECV Votes by Allocation Method (2000)",
    x = "Party",
    y = "Total ECV Votes",
    fill = "Party"
  ) +
  theme_minimal() +
  scale_fill_manual(
    values = c("REPUBLICAN" = "red", "DEMOCRAT" = "blue")  
  ) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1), # Rotate party names for better visibility
    legend.position = "none" # Hide legend since it's already represented by colors
  )

Winner-Take-All systems: Under these systems, narrow victories in key swing states have an outsized impact, making it possible for a candidate to lose the popular vote but win the Electoral College.
Proportional systems: These systems track the national popular vote more closely and reduce the disproportionate weight given to small states or swing states.

View Code

# Display the results for all years
winners_comparison |> 
  gt() |>
  tab_header(
    title = "Comparison of Winning Candidates and Parties by ECV Allocation System"
  ) |>
  cols_label( # Display column names
    year = "Year",
    winning_candidate_state_wide = "Winning Candidate",
    winner_party_state_wide = "Winner Party",
    total_ecv_state_wide = "Total ECV",
    winning_candidate_district_wide = "Winning Candidate",
    winner_party_district_wide = "Winner Party",
    total_ecv_district_wide = "Total ECV",
    winning_candidate = "Winning Candidate",
    winner_party = "Winner Party",
    total_ecv = "Total ECV",
    winning_candidate_nation_prop = "Winning Candidate",
    winner_party_nation_prop = "Winner Party",
    prop_ecv = "Total ECV"
  ) |>
  tab_spanner(
    label = "State-Wide Winner-Take-All",
    columns = c(winning_candidate_state_wide, winner_party_state_wide, total_ecv_state_wide)
  ) |>
  tab_spanner(
    label = "District-Wide Winner-Take-All + State-Wide At Large Votes",
    columns = c(winning_candidate_district_wide , winner_party_district_wide, total_ecv_district_wide)
  ) |>
  tab_spanner(
    label = "State-Wide Proportional",
    columns = c(winning_candidate, winner_party, total_ecv)
  ) |>
  tab_spanner(
    label = "Nation-Wide Proportional",
    columns = c(winning_candidate_nation_prop, winner_party_nation_prop, prop_ecv)
  ) |>
  tab_style(
    style = list(cell_text(weight = "bold")),
    locations = cells_column_labels(columns = 1:13)
  )

Comparison of Winning Candidates and Parties by ECV Allocation System
Year	State-Wide Winner-Take-All			District-Wide Winner-Take-All + State-Wide At Large Votes			State-Wide Proportional			Nation-Wide Proportional
Year	Winning Candidate	Winner Party	Total ECV	Winning Candidate	Winner Party	Total ECV	Winning Candidate	Winner Party	Total ECV	Winning Candidate	Winner Party	Total ECV
1976	CARTER, JIMMY	DEMOCRAT	294	CARTER, JIMMY	DEMOCRAT	362	CARTER, JIMMY	DEMOCRAT	270	CARTER, JIMMY	DEMOCRAT	267
1980	REAGAN, RONALD	REPUBLICAN	448	REAGAN, RONALD	REPUBLICAN	287	REAGAN, RONALD	REPUBLICAN	281	REAGAN, RONALD	REPUBLICAN	270
1984	REAGAN, RONALD	REPUBLICAN	525	REAGAN, RONALD	REPUBLICAN	283	REAGAN, RONALD	REPUBLICAN	321	REAGAN, RONALD	REPUBLICAN	313
1988	BUSH, GEORGE H.W.	REPUBLICAN	426	DUKAKIS, MICHAEL	DEMOCRAT	292	BUSH, GEORGE H.W.	REPUBLICAN	291	BUSH, GEORGE H.W.	REPUBLICAN	284
1992	CLINTON, BILL	DEMOCRAT	367	CLINTON, BILL	DEMOCRAT	329	CLINTON, BILL	DEMOCRAT	226	CLINTON, BILL	DEMOCRAT	229
1996	CLINTON, BILL	DEMOCRAT	376	DOLE, ROBERT	REPUBLICAN	283	CLINTON, BILL	DEMOCRAT	262	CLINTON, BILL	DEMOCRAT	263
2000	BUSH, GEORGE W.	REPUBLICAN	271	BUSH, GEORGE W.	REPUBLICAN	290	GORE, AL	DEMOCRAT	263	GORE, AL	DEMOCRAT	258
2004	BUSH, GEORGE W.	REPUBLICAN	286	BUSH, GEORGE W.	REPUBLICAN	299	BUSH, GEORGE W.	REPUBLICAN	278	BUSH, GEORGE W.	REPUBLICAN	271
2008	OBAMA, BARACK H.	DEMOCRAT	361	OBAMA, BARACK H.	DEMOCRAT	330	OBAMA, BARACK H.	DEMOCRAT	285	OBAMA, BARACK H.	DEMOCRAT	282
2012	OBAMA, BARACK H.	DEMOCRAT	329	ROMNEY, MITT	REPUBLICAN	286	OBAMA, BARACK H.	DEMOCRAT	271	OBAMA, BARACK H.	DEMOCRAT	272
2016	TRUMP, DONALD J.	REPUBLICAN	305	TRUMP, DONALD J.	REPUBLICAN	313	CLINTON, HILLARY	DEMOCRAT	265	CLINTON, HILLARY	DEMOCRAT	257
2020	BIDEN, JOSEPH R. JR	DEMOCRAT	306	BIDEN, JOSEPH R. JR	DEMOCRAT	275	BIDEN, JOSEPH R. JR	DEMOCRAT	273	BIDEN, JOSEPH R. JR	DEMOCRAT	276

View Code

winners_all <- bind_rows( # bind different allocation methods together to plot in a facet
  state_wide_winner_only |> 
    mutate(allocation_method = "State-Wide Winner-Take-All"),
  district_wide_winner_only |> 
    mutate(allocation_method = "District-Wide Winner-Take-All + State-Wide At Large Votes"),
  state_prop_winner_only |>
    mutate(allocation_method = "State-Wide Proportional"),
  nation_prop_winner_only |>
    mutate(allocation_method = "Nation-Wide Proportional") |>
    rename(total_ecv = prop_ecv)
)
ggplot(winners_all, aes(x = year, y = total_ecv, fill = winner_party)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(
    aes(label = total_ecv, y = total_ecv + 20), # label with ECV and offset labels above bars
    position = position_dodge(width = 0.8), color = "black", size = 3  # 
  ) +
  facet_wrap(~allocation_method, scales = "free_y") + # Facet by method, so each allocation method gets its own panel
  scale_fill_manual(
    values = c("REPUBLICAN" = "red", "DEMOCRAT" = "blue") )+
  labs(
    title = "Electoral College Votes by Method and Year",
    x = "Year",
    y = "Total ECV",
    fill = "Party"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 90, hjust = 1),
    legend.position = "bottom"
  )

View Code

if (!require("gganimate")) install.packages("gganimate")
library(gganimate)

ggplot(winners_all, aes(x = year, y = total_ecv, fill = winner_party)) +
  geom_bar(stat = "identity", position = "dodge") +  # Create bar plot with dodged bars for each party
  geom_text(
    aes(label = total_ecv, y = total_ecv + 20),  # Label ECV values with an offset to appear above bars
    position = position_dodge(width = 0.8), color = "black", size = 3  # Text properties
  ) +
  scale_fill_manual(
    values = c("REPUBLICAN" = "red", "DEMOCRAT" = "blue")  # Party colors
  ) +
  labs(
    title = "Electoral College Votes by Method and Year: {current_frame}",
    x = "Year",
    y = "Total ECV",
    fill = "Party"
  ) +
  theme_minimal() + 
  theme(axis.text.x = element_text(angle = 90, hjust = 1),  # Rotate x-axis labels for better readability
    legend.position = "bottom")  +  # Position the legend at the bottom 
  transition_manual(allocation_method) + # Animate by allocation_method
  enter_fade() +  # Fade in new bars for the new allocation method
  exit_fade()

The State-Wide Proportional and National Proportional allocation systems appear to be the fairest in terms of reflecting the popular vote: in the comparison above, both select Gore in 2000 and Clinton in 2016, the popular vote winners in those years. These systems let each vote contribute to the outcome, reducing the disproportionate influence of small states and the swing state effect. By contrast, the State-Wide Winner-Take-All and District-Wide Winner-Take-All systems can create a bias toward smaller states and swing states, which can then have an outsized impact on the outcome of the election.

What Does “Fairness” Mean?

When I think about fairness in the context of ECV allocation, it generally relates to how well the system:

Represents the popular vote: Does the distribution of ECVs accurately reflect the number of votes cast for each candidate? A system that over-represents or under-represents certain groups could be considered unfair.
Accounts for each state: In systems like the current winner-take-all method, small states with fewer voters might have a disproportionately large influence in electing a president compared to larger states. In contrast, a proportional system might mitigate this imbalance.
Minimizes “winner-take-all” advantages: A system where a candidate wins by just a small margin but takes all of a state’s ECVs could be seen as unfair because the losing candidate might have had broad support across the state, but doesn’t get any representation.
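One way to put a number on the first criterion is to compare each candidate's share of ECVs with their share of the popular vote. The sketch below uses the Loosemore-Hanby index (half the sum of absolute differences between the two sets of shares), a standard disproportionality measure that I am bringing in for illustration; it is not used elsewhere in this analysis. The vote shares, ECV counts, and the helper `disproportionality()` are hypothetical.

```r
# Hypothetical national vote shares and ECV counts under two schemes (illustrative only)
vote_share <- c(A = 0.49, B = 0.48, C = 0.03)
ecv_winner_take_all <- c(A = 300, B = 238, C = 0)
ecv_proportional    <- c(A = 264, B = 258, C = 16)

# Hypothetical helper: half the summed absolute gap between vote share and ECV share
# (0 means ECV shares match vote shares exactly; larger values mean a bigger mismatch)
disproportionality <- function(vote_share, ecv) {
  ecv_share <- ecv / sum(ecv)
  sum(abs(vote_share - ecv_share)) / 2
}

d_wta  <- disproportionality(vote_share, ecv_winner_take_all)
d_prop <- disproportionality(vote_share, ecv_proportional)
print(c(winner_take_all = d_wta, proportional = d_prop))
```

A lower value indicates that the ECV distribution follows the popular vote more closely. Applying the same helper to the allocation tables above would require each candidate's national vote share for the corresponding year.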

Truthfulness Score

Claim: The claim under evaluation is: “The Electoral College system is biased and over-represents smaller states, giving them an unfair advantage in electing the president.” I’ll use a 5-point scale, ranging from 1 to 5, where:

1. False – The claim is completely inaccurate, with no evidence to support it.
2. Mostly False – The claim contains significant inaccuracies or overgeneralizations, with some misleading aspects.
3. Half-True – The claim is partially accurate but misses key details or context that would provide a more complete picture.
4. Mostly True – The claim is largely accurate, with very minor misstatements or nuances that don’t change the overall truth.
5. True – The claim is completely accurate, with no significant errors or misleading information.

Score: 4 (Mostly True). Under the State-Wide Winner-Take-All system, smaller states do indeed have a disproportionately large impact because every state receives two ECVs for its Senate seats, regardless of population. This benefits candidates who win a disproportionate share of votes in less populous states.
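The effect of the two Senate-based ECVs can be seen with a quick calculation. This is a minimal sketch in base R using two hypothetical states; the House seat counts and populations are illustrative assumptions, not census figures.

```r
# Hypothetical states: seat counts and populations are illustrative, not census data
states <- data.frame(
  state       = c("Small", "Large"),
  house_seats = c(1, 50),
  population  = c(600000, 38000000)
)

# Each state's ECVs = House seats + 2 (one per Senate seat)
states$ecv <- states$house_seats + 2

# ECVs per million residents: a rough measure of per-capita voting weight
states$ecv_per_million <- states$ecv / (states$population / 1e6)
print(states)
```

In this made-up comparison, the small state has several times as many ECVs per resident as the large one, and most of that gap comes from the two ECVs every state receives regardless of population.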

Pew Research Center: In September 2024, Pew published the article “Majority of Americans Continue to Favor Moving Away from Electoral College.” Following the 2000 and 2016 elections, in which the winners of the popular vote received fewer Electoral College votes than their opponents, Pew surveyed 9,720 U.S. adults.