Do Proportional Electoral College Allocations Yield a More Representative Presidency?

Author

Cindy Li

Introduction

The U.S. Electoral College (EC) shapes presidential elections by design: a candidate wins through state-by-state electoral votes rather than a simple nationwide popular vote. The system is debated most when its result diverges from the popular vote. The primary goal of this analysis is to assess how different allocation schemes affect election outcomes and whether any of them is biased in favor of one political party.

Data Ingestion

Data I: Election Data

Data Source: MIT Election Data Science Lab datasets. We retrieve two datasets from the MIT Election Data Science Lab. The first contains votes from all biennial congressional races in all 50 states from 1976 to 2022. The second contains statewide presidential vote counts from 1976 to 2020. This requires a download from the link

Load Libraries
if (!require("readr")) install.packages("readr")
if (!require("sf")) install.packages("sf")
if (!require("dplyr")) install.packages("dplyr")
if (!require("tidyr")) install.packages("tidyr")
if (!require("tidyverse")) install.packages("tidyverse")
if (!require("DT")) install.packages("DT")
if (!require("ggplot2")) install.packages("ggplot2")
if (!require("gt")) install.packages("gt")
if (!require("plotly")) install.packages("plotly")

library(readr)
library(sf)
library(dplyr)
library(tidyr)
library(tidyverse)
library(DT)
library(ggplot2)
library(gt)
library(plotly)

1976-2022 House Data

View Code
HOUSE <- read_csv("1976-2022-house.csv")
PRESIDENT <- read_csv("1976-2020-president.csv")

sample_n(HOUSE, 1000) |>
  DT::datatable()

1976-2020 President Data

View Code
sample_n(PRESIDENT, 1000) |>
  DT::datatable()

Data II: Congressional Boundary Files 1976 to 2012

Data Source: Jeffrey B. Lewis, Brandon DeVine, and Lincoln Pitcher, with Kenneth C. Martis. This source gives us the shapefiles for all U.S. congressional districts from 1789 to 2012.

View Code
get_file <- function(fname){
  BASE_URL <- "https://cdmaps.polisci.ucla.edu/shp/"
  fname_ext <- paste0(fname, ".zip")
  fname_ext1 <- paste0(fname, ".shp")
  fname_extunzip <- gsub(".zip$", "", fname_ext)
  subfolder <- "districtshapes"  # Subfolder where the shapefile is located
  if(!file.exists(fname_ext)){
    FILE_URL <- paste0(BASE_URL, fname_ext)
    download.file(FILE_URL, 
                  destfile = fname_ext)
  }
  # Unzip the contents and save unzipped content
  unzip(zipfile = fname_ext, exdir = fname_extunzip)
  # Define File Path
  shapefile_path <- file.path(fname_extunzip, subfolder, fname_ext1)
  # Read the shapefile
  read_sf(shapefile_path)
}

# Download the files by iterating over Congresses 95 through 114
start_congress <- 95
end_congress <- 114
for (i in start_congress:end_congress) {
  district_name <- sprintf("districts%03d", i)  # Formats as districts095, districts096, etc.
  district_data <- get_file(district_name)   # Download and read the shapefile
  assign(district_name, district_data, envir = .GlobalEnv)  # Assign the data frame to a variable in the global environment
}
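
As a possible alternative to creating one global variable per Congress, the shapefiles could be collected in a single named list. The sketch below is only an illustration: read_districts is a hypothetical stand-in for get_file that returns a placeholder data frame instead of downloading anything, and district_list is a name introduced here.

```r
# Hypothetical stand-in for get_file(): returns a placeholder data frame
# instead of downloading and reading a shapefile
read_districts <- function(fname) {
  data.frame(source = fname, stringsAsFactors = FALSE)
}

congresses <- 95:114
district_names <- sprintf("districts%03d", congresses)

# One list element per Congress, named like the variables the loop creates
district_list <- lapply(district_names, read_districts)
names(district_list) <- district_names

# Access a single Congress by name
district_list[["districts106"]]$source
```

With a list, later steps that repeat the same operation for every Congress can use lapply instead of one line per object. The rest of this document keeps the global variables created by the loop.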

Data III: Congressional Boundary Files 2014 to Present

Data Source: US Census Bureau This data source provides district boundaries for more recent congressional elections.

View Code
get_congress_file <- function(fname, year){
  BASE_URL <- sprintf("https://www2.census.gov/geo/tiger/TIGER%d/CD/", year) #replace %d with year
  fname_ext <- paste0(fname, ".zip")
  fname_ext1 <- paste0(fname, ".shp")
  fname_extunzip <- gsub(".zip$", "", fname_ext)
  
  # Download File
  if(!file.exists(fname_ext)){
    FILE_URL <- paste0(BASE_URL, fname_ext)
    download.file(FILE_URL, 
                  destfile = fname_ext)
  }
  # Unzip the contents and save unzipped content
  unzip(zipfile = fname_ext, exdir = fname_extunzip)
  # Define File Path
  shapefile_path <- file.path(fname_extunzip, fname_ext1)
  # Read the shapefile
  read_sf(shapefile_path)
}

# Download the district file for each year, working backward from 2022 to 2012
base_year <- 2022
for (i in 0:10) {  # i ranges from 0 (year 2022) to 10 (year 2012)
  year <- base_year - i
  # The TIGER file name embeds the Congress number that applies to that year
  if (year >= 2018) {congress <- 116} 
  else if (year >= 2016) {congress <- 115} 
  else if (year >= 2014) {congress <- 114} 
  else if (year == 2013) {congress <- 113} 
  else if (year == 2012) {congress <- 112}
  district_name <- sprintf("tl_%d_us_cd%d", year, congress)
  district_data <- get_congress_file(district_name, year)  # Download and read the shapefile
  assign(district_name, district_data, envir = .GlobalEnv)  # Assign the data frame to a variable in the global environment
}
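
The year-to-Congress mapping in the loop can also be written as a small lookup, which makes the file names easy to check before downloading. This is a minimal sketch: tiger_congress and tiger_file_name are hypothetical helper names, and the break points simply mirror the if/else chain above.

```r
# Map a TIGER/Line year to the Congress number used in its file name.
# Break points mirror the if/else chain; years before 2012 are not handled.
tiger_congress <- function(year) {
  breaks   <- c(2012, 2013, 2014, 2016, 2018)
  congress <- c(112, 113, 114, 115, 116)
  congress[findInterval(year, breaks)]
}

# Build the file name (without extension) for one or more years
tiger_file_name <- function(year) {
  sprintf("tl_%d_us_cd%d", year, tiger_congress(year))
}

tiger_file_name(2012:2022)
```

Printing the names first is a cheap way to confirm that each year pairs with the intended Congress.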

Exploration

1. Which states have gained and lost the most seats in the US House of Representatives between 1976 and 2022?

View Code
# Count the number of districts (aka seats) per state for each year
gains_losses <- HOUSE |>
  group_by(state, year) |>
  summarise(num_districts = n_distinct(district)) |>
  arrange(state, year) |>
  # Calculate seat changes for each state
  group_by(state) |>
  summarise(
    first_year_seats = first(num_districts),
    last_year_seats = last(num_districts),
    seat_change = last_year_seats - first_year_seats) |>
  filter(seat_change != 0) |>
  arrange(desc(seat_change))

# Plot the seat changes
ggplot(gains_losses, aes(x = reorder(state, seat_change), y = seat_change, fill = seat_change > 0)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  scale_fill_manual(values = c("red", "blue"), labels = c("Loss", "Gain")) +
  labs(
    title = "Gains and Losses of House Seats (1976 to 2022)",
    x = "State",
    y = "Change in Seats",
    fill = "Change"
  ) +
  theme_minimal()
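
To make the seat-change logic concrete, here is a toy version of the same calculation. The data are made up (two fictional states, two years); only the column names follow the HOUSE dataset.

```r
library(dplyr)

# Toy data (hypothetical): one row per state-year-district
toy_house <- tibble(
  state    = c("A", "A", "A", "A", "A", "B", "B", "B"),
  year     = c(1976, 1976, 1976, 2022, 2022, 1976, 2022, 2022),
  district = c(1, 2, 3, 1, 2, 1, 1, 2)
)

toy_changes <- toy_house |>
  group_by(state, year) |>
  summarise(num_districts = n_distinct(district), .groups = "drop") |>
  arrange(state, year) |>   # first() and last() below rely on this ordering
  group_by(state) |>
  summarise(seat_change = last(num_districts) - first(num_districts))

toy_changes
```

State A goes from three districts to two (a loss of one seat) and state B from one to two (a gain of one seat).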

2. New York State has a unique “fusion” voting system where one candidate can appear on multiple “lines” on the ballot and their vote counts are totaled. Are there any elections in our data where the outcome would have been different if the “fusion” system were not used and candidates received only the votes from their “major party line” (Democrat or Republican) rather than their total number of votes across all lines?

View Code
# Summarize the votes for each candidate, total votes vs major party votes
fusion_summary <- HOUSE |>
  filter(!is.na(candidate)) |>
  group_by(year, state, state_po, district, candidate, fusion_ticket) |>
  summarise(
    total_candidate_votes = sum(candidatevotes, na.rm = TRUE),  # Total votes across all party lines
    major_party_votes = sum( # Major party votes (only votes from Democrat and Republican)
      if_else(party %in% c("DEMOCRAT", "REPUBLICAN"), candidatevotes, 0), na.rm = TRUE), .groups = "drop") |>
  select(year, state, state_po, district, candidate, total_candidate_votes, major_party_votes, fusion_ticket) |>
  arrange(state, district, candidate)

# Check if there would have been a different outcome without fusion voting
fusion_outcome_changes <- fusion_summary |>
  filter(fusion_ticket == TRUE) |> # out of the times when fusion voting was used
  group_by(year, state, state_po, district) |>
  summarise(
    # Find the winner based on total votes 
    winner_with_fusion = candidate[which.max(total_candidate_votes)],
    winner_without_fusion = candidate[which.max(major_party_votes)],
    total_votes_winner = max(total_candidate_votes),
    major_party_votes_winner = max(major_party_votes),
    .groups = "drop"
  ) |>
  # Ensure that major party votes winner is not zero and handle if no major party candidate ran
  mutate( major_party_votes_winner = ifelse(major_party_votes_winner == 0, NA, major_party_votes_winner), 
    # Check if the winners are the same or different based on fusion voting
    outcome_change = ifelse(winner_with_fusion != winner_without_fusion, "Yes", "No")
  ) |>
  arrange(year, state, district)

# Plot directly from fusion_outcome_changes without creating a summary
ggplot(fusion_outcome_changes, aes(x = outcome_change, fill = outcome_change)) +
  geom_bar(show.legend = FALSE) +  # Create the bar plot
  labs(
    title = "Impact of Fusion Voting on Election Outcomes",
    x = "Outcome Change",
    y = "Number of Elections"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
  )
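
The comparison is easier to follow on a single made-up race. In the sketch below, the candidate names and vote counts are hypothetical; the point is only to show how the winner can differ between total votes and major-party-line votes.

```r
library(dplyr)

# Hypothetical race: each candidate appears on a major and a minor party line
toy_race <- tibble(
  candidate      = c("SMITH", "SMITH", "JONES", "JONES"),
  party          = c("DEMOCRAT", "WORKING FAMILIES", "REPUBLICAN", "CONSERVATIVE"),
  candidatevotes = c(100, 30, 110, 10)
)

toy_totals <- toy_race |>
  group_by(candidate) |>
  summarise(
    total_candidate_votes = sum(candidatevotes),
    major_party_votes = sum(
      if_else(party %in% c("DEMOCRAT", "REPUBLICAN"), candidatevotes, 0)
    ),
    .groups = "drop"
  )

# Winner when all lines count vs. when only the major party line counts
winner_with_fusion    <- toy_totals$candidate[which.max(toy_totals$total_candidate_votes)]
winner_without_fusion <- toy_totals$candidate[which.max(toy_totals$major_party_votes)]
```

Here SMITH wins with fusion (130 to 120) but JONES wins on major party lines alone (110 to 100), which is the kind of case the analysis above flags as an outcome change.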

3. Do presidential candidates tend to run ahead of or run behind congressional candidates in the same state? That is, does a Democratic candidate for president tend to get more votes in a given state than all Democratic congressional candidates in the same state? Are any presidents particularly more or less popular than their co-partisans?

Let’s take a glimpse at just the year 2020.

View Code
# Summarize presidential votes for Democrat and Republican candidates
presidential_votes <- PRESIDENT |>
  filter(party_simplified %in% c("DEMOCRAT", "REPUBLICAN")) |>
  group_by(year, state, party_simplified) |>
  summarise(
    presidential_total_votes = sum(candidatevotes, na.rm = TRUE),
    .groups = "drop"
  )

# Summarize congressional votes for Democrat and Republican candidates
congressional_votes <- HOUSE |>
  filter(party %in% c("DEMOCRAT", "REPUBLICAN")) |>
  group_by(year, state, party) |>
  summarise(
    congressional_total_votes = sum(candidatevotes, na.rm = TRUE),
    .groups = "drop"
  )

# Join the two datasets to compare presidential and congressional votes
vote_comparison <- left_join(presidential_votes, congressional_votes, by = c("year", "state", "party_simplified" = "party")) |>
  mutate(vote_difference = presidential_total_votes - congressional_total_votes,
    run_ahead = if_else(vote_difference > 0, "Presidential Ahead", "Presidential Behind")) |>
  arrange(run_ahead, vote_difference)
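
Raw vote differences scale with state size, so one optional refinement is to also express the gap relative to the congressional vote. This is a sketch of that idea on made-up numbers, not part of the analysis above; pct_difference is a column name introduced here.

```r
library(dplyr)

# Hypothetical totals for one small and one large state
toy_comparison <- tibble(
  state                     = c("SMALL", "LARGE"),
  presidential_total_votes  = c(110000, 5200000),
  congressional_total_votes = c(100000, 5000000)
) |>
  mutate(
    vote_difference = presidential_total_votes - congressional_total_votes,
    # Gap as a percentage of the congressional vote
    pct_difference  = 100 * vote_difference / congressional_total_votes
  )

toy_comparison
```

The large state has the bigger raw gap (200,000 votes against 10,000), but the small state has the bigger relative gap (10% against 4%).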

Does this trend differ over time? Does it differ across states or across parties?

View Code
# Plot the vote difference between presidential and congressional candidates frequencies by year
ggplot(vote_comparison, aes(x = vote_difference, fill = run_ahead)) +
  geom_histogram(binwidth = 100000, position = "identity", alpha = 0.7) +
  facet_wrap(~ year, ncol=3) +
  labs(title = "Presidential vs Congressional Vote Difference",
       x = "Vote Difference (Presidential - Congressional)",
       y = "Frequency",
       fill = "Vote Comparison") +
  theme_minimal() +
  theme(axis.text.x = element_text(size = 8, angle = 45, hjust = 1))  # Adjust x-axis tick labels

View Code
# Calculate the average vote difference for each president (across all states and years)
presidential_comparison <- vote_comparison |>
  group_by(year, state, party_simplified) |>
  summarise(
    average_vote_difference = mean(vote_difference, na.rm = TRUE),
    .groups = "drop"
  )

# group by party_simplified and year to get the overall average for each party-year
president_ranking <- presidential_comparison |>
  group_by(party_simplified, year) |>
  summarise(
    average_vote_difference = mean(average_vote_difference, na.rm = TRUE),
    .groups = "drop"
  ) |>
  arrange(desc(average_vote_difference))

# Create a plot to visualize presidential popularity vs. co-partisan congressional candidates
ggplot(president_ranking, aes(x = reorder(party_simplified, average_vote_difference), y = average_vote_difference, fill = party_simplified)) +
  geom_bar(stat = "identity", show.legend = FALSE) +
  coord_flip() +
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red")) +
  labs( 
    title = "Presidential Popularity vs. Congressional Co-partisans",
    x = "President",
    y = "Average Vote Difference (Presidential - Congressional)",
    subtitle = "Higher values indicate greater presidential popularity relative to congressional candidates") +
  theme_minimal()
# Line plot with trend lines for each party
ggplot(president_ranking, aes(x = year, y = average_vote_difference, color = party_simplified, group = party_simplified)) +
  geom_line(linewidth = 1) +  # linewidth replaces size for lines in ggplot2 3.4.0 and later
  geom_point(size = 3) +  # Adds points at each year
  scale_color_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red")) + 
  labs(
    title = "Presidential Popularity vs. Congressional Co-partisans Over Time",
    x = "Year",
    y = "Average Vote Difference (Presidential - Congressional)",
    subtitle = "Line plot showing trends for each party"
  ) +
  theme_minimal()

Maps & Shapefiles

Choropleth Visualization of the 2000 Presidential Election Electoral College Results

Filter Election Data for 2000: To create a map of the results broken down by states, we will need to find the election results of each state. The first step involves filtering the election dataset PRESIDENT to get the results for the year 2000. We specifically focus on the U.S. Presidential election and filter for the two main candidates, George W. Bush and Al Gore. We then calculate the winner for each state based on who received the most votes and assign the appropriate party.

View Code
election_2000 <- PRESIDENT |>
  filter(year == 2000, office == "US PRESIDENT") |> # filter for 2000 and president office
  filter(candidate %in% c("BUSH, GEORGE W.", "GORE, AL")) |> # filter for Bush and Gore
  group_by(state) |>
  summarise( # Winner based on the candidate with the most votes
    winner = if_else(sum(candidatevotes[candidate == "BUSH, GEORGE W."]) > sum(candidatevotes[candidate == "GORE, AL"]),
      "Bush", "Gore"),
    winner_party = case_when(# Party based on the candidate
      winner == "Bush" ~ "Republican",
      winner == "Gore" ~ "Democrat"
    )) |>
  ungroup()

Join Election Data with Shapefiles: The next step is to join the election results with the geographical shapefile data. This step ensures that we can visualize the election results on a map by linking the state names in both datasets. The shapefile data is modified to ensure the state names are in uppercase to match the election data. After merging the data, we create a choropleth map of the contiguous U.S. states. We use geom_sf() to plot the states and color them based on the winning party (Republican or Democrat). The map is then customized to remove axis labels and grid lines for a clean visualization.

View Code
# join with shapefile
districts106$STATENAME <- toupper(districts106$STATENAME) # uppercase state name to match

dis_election_2000 <- left_join(districts106, election_2000, by = c("STATENAME" = "state"), relationship = "many-to-many")

main_us <- dis_election_2000 |> filter(!STATENAME %in% c("ALASKA", "HAWAII"))

ggplot(main_us, aes(geometry = geometry, fill = winner_party)) +
  geom_sf() + 
  scale_fill_manual(values = c("Republican" = "red", "Democrat" = "blue")) +
  theme_minimal() +
  labs(title = "U.S. Presidential Election Results by State in 2000",
       fill = "Winning Party") +
  theme(
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    panel.grid = element_blank()
  ) 

Add Insets for Alaska and Hawaii: Because Alaska and Hawaii are geographically distant from the mainland U.S., we create insets for these two states. The data for Alaska and Hawaii is filtered separately, and individual maps are created for each. These insets are then added to the main U.S. map.

View Code
# contiguous US 
main_us <- dis_election_2000 |> filter(!STATENAME %in% c("ALASKA", "HAWAII"))
map_us <- ggplot(main_us, aes(geometry = geometry, fill = winner_party)) +
  geom_sf() + 
  scale_fill_manual(values = c("Republican" = "red", "Democrat" = "blue")) +
  theme_minimal() +
  labs(title = "U.S. Presidential Election Results by State in 2000",
       fill = "Winning Party") +
  theme_void() +
  coord_sf(xlim = c(-130, -60), ylim = c(20, 50), expand = FALSE) 

# filter data for Alaska and Hawaii
alaska <- dis_election_2000 |> filter(STATENAME == "ALASKA")
hawaii <- dis_election_2000 |> filter(STATENAME == "HAWAII")

# Alaska Inset
inset_alaska <- ggplot(alaska, aes(geometry = geometry, fill = winner_party)) +
  geom_sf() +
  scale_fill_manual(values = c("Republican" = "red", "Democrat" = "blue")) +
  theme_void() +
  theme(legend.position = "none") + 
  coord_sf(xlim = c(-180, -140), ylim = c(50, 72), expand = FALSE)

# Hawaii Inset
inset_hawaii <- ggplot(hawaii, aes(geometry = geometry, fill = winner_party)) +
  geom_sf() +
  scale_fill_manual(values = c("Republican" = "red", "Democrat" = "blue")) +
  theme_void() +
  theme(legend.position = "none") +
  coord_sf(xlim = c(-161, -154), ylim = c(18, 23), expand = FALSE)

# Combine Maps
combined_map <- map_us +
  annotation_custom(ggplotGrob(inset_alaska),
                    xmin = -130, xmax = -120, # position (xmin must be smaller than xmax)
                    ymin = 15, ymax = 40) +  # size
  annotation_custom(ggplotGrob(inset_hawaii),
                    xmin = -115, xmax = -100, # position
                    ymin = 20, ymax = 30)    # size
print(combined_map)

Choropleth Visualization of Electoral College Results Over Time

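Data Preparation: Before joining, we need to clean the shapefiles so that they line up with each other and with the election data. First, we convert every layer to the same coordinate reference system (CRS). Then, we add a STATENAME column to the Census Bureau files based on STATEFP and change the STATENAME values in the older shapefiles to uppercase so that they match the election data.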

View Code
# convert to the same crs
districts095 <- st_transform(districts095, crs = st_crs(districts112))
districts097 <- st_transform(districts097, crs = st_crs(districts112))
districts098 <- st_transform(districts098, crs = st_crs(districts112))
districts101 <- st_transform(districts101, crs = st_crs(districts112))
districts102 <- st_transform(districts102, crs = st_crs(districts112))
districts103 <- st_transform(districts103, crs = st_crs(districts112))
districts106 <- st_transform(districts106, crs = st_crs(districts112))
districts108 <- st_transform(districts108, crs = st_crs(districts112))
districts111 <- st_transform(districts111, crs = st_crs(districts112))
tl_2016_us_cd115 <- st_transform(tl_2016_us_cd115, crs = st_crs(districts112))
tl_2020_us_cd116 <- st_transform(tl_2020_us_cd116, crs = st_crs(districts112))

# convert state names to uppercase for the join
districts095$STATENAME <- toupper(districts095$STATENAME)  
districts097$STATENAME <- toupper(districts097$STATENAME)  
districts098$STATENAME <- toupper(districts098$STATENAME)  
districts101$STATENAME <- toupper(districts101$STATENAME)  
districts102$STATENAME <- toupper(districts102$STATENAME)  
districts103$STATENAME <- toupper(districts103$STATENAME)  
districts106$STATENAME <- toupper(districts106$STATENAME)  
districts108$STATENAME <- toupper(districts108$STATENAME)  
districts111$STATENAME <- toupper(districts111$STATENAME)  
districts112$STATENAME <- toupper(districts112$STATENAME)  

# add STATENAME column using statefp
# https://www.mercercountypa.gov/dps/state_fips_code_listing.htm 
tl_2020_us_cd116 <- tl_2020_us_cd116 |>
  mutate(STATENAME = case_when(
    STATEFP == "01" ~ "ALABAMA",
    STATEFP == "02" ~ "ALASKA",
    STATEFP == "04" ~ "ARIZONA",
    STATEFP == "05" ~ "ARKANSAS",
    STATEFP == "06" ~ "CALIFORNIA",
    STATEFP == "08" ~ "COLORADO",
    STATEFP == "09" ~ "CONNECTICUT",
    STATEFP == "10" ~ "DELAWARE",
    STATEFP == "11" ~ "DISTRICT OF COLUMBIA",
    STATEFP == "12" ~ "FLORIDA",
    STATEFP == "13" ~ "GEORGIA",
    STATEFP == "15" ~ "HAWAII",
    STATEFP == "16" ~ "IDAHO",
    STATEFP == "17" ~ "ILLINOIS",
    STATEFP == "18" ~ "INDIANA",
    STATEFP == "19" ~ "IOWA",
    STATEFP == "20" ~ "KANSAS",
    STATEFP == "21" ~ "KENTUCKY",
    STATEFP == "22" ~ "LOUISIANA",
    STATEFP == "23" ~ "MAINE",
    STATEFP == "24" ~ "MARYLAND",
    STATEFP == "25" ~ "MASSACHUSETTS",
    STATEFP == "26" ~ "MICHIGAN",
    STATEFP == "27" ~ "MINNESOTA",
    STATEFP == "28" ~ "MISSISSIPPI",
    STATEFP == "29" ~ "MISSOURI",
    STATEFP == "30" ~ "MONTANA",
    STATEFP == "31" ~ "NEBRASKA",
    STATEFP == "32" ~ "NEVADA",
    STATEFP == "33" ~ "NEW HAMPSHIRE",
    STATEFP == "34" ~ "NEW JERSEY",
    STATEFP == "35" ~ "NEW MEXICO",
    STATEFP == "36" ~ "NEW YORK",
    STATEFP == "37" ~ "NORTH CAROLINA",
    STATEFP == "38" ~ "NORTH DAKOTA",
    STATEFP == "39" ~ "OHIO",
    STATEFP == "40" ~ "OKLAHOMA",
    STATEFP == "41" ~ "OREGON",
    STATEFP == "42" ~ "PENNSYLVANIA",
    STATEFP == "44" ~ "RHODE ISLAND",
    STATEFP == "45" ~ "SOUTH CAROLINA",
    STATEFP == "46" ~ "SOUTH DAKOTA",
    STATEFP == "47" ~ "TENNESSEE",
    STATEFP == "48" ~ "TEXAS",
    STATEFP == "49" ~ "UTAH",
    STATEFP == "50" ~ "VERMONT",
    STATEFP == "51" ~ "VIRGINIA",
    STATEFP == "53" ~ "WASHINGTON",
    STATEFP == "54" ~ "WEST VIRGINIA",
    STATEFP == "55" ~ "WISCONSIN",
    STATEFP == "56" ~ "WYOMING"
  ))

tl_2016_us_cd115 <- tl_2016_us_cd115 |>
  mutate(STATENAME = case_when(
    STATEFP == "01" ~ "ALABAMA",
    STATEFP == "02" ~ "ALASKA",
    STATEFP == "04" ~ "ARIZONA",
    STATEFP == "05" ~ "ARKANSAS",
    STATEFP == "06" ~ "CALIFORNIA",
    STATEFP == "08" ~ "COLORADO",
    STATEFP == "09" ~ "CONNECTICUT",
    STATEFP == "10" ~ "DELAWARE",
    STATEFP == "11" ~ "DISTRICT OF COLUMBIA",
    STATEFP == "12" ~ "FLORIDA",
    STATEFP == "13" ~ "GEORGIA",
    STATEFP == "15" ~ "HAWAII",
    STATEFP == "16" ~ "IDAHO",
    STATEFP == "17" ~ "ILLINOIS",
    STATEFP == "18" ~ "INDIANA",
    STATEFP == "19" ~ "IOWA",
    STATEFP == "20" ~ "KANSAS",
    STATEFP == "21" ~ "KENTUCKY",
    STATEFP == "22" ~ "LOUISIANA",
    STATEFP == "23" ~ "MAINE",
    STATEFP == "24" ~ "MARYLAND",
    STATEFP == "25" ~ "MASSACHUSETTS",
    STATEFP == "26" ~ "MICHIGAN",
    STATEFP == "27" ~ "MINNESOTA",
    STATEFP == "28" ~ "MISSISSIPPI",
    STATEFP == "29" ~ "MISSOURI",
    STATEFP == "30" ~ "MONTANA",
    STATEFP == "31" ~ "NEBRASKA",
    STATEFP == "32" ~ "NEVADA",
    STATEFP == "33" ~ "NEW HAMPSHIRE",
    STATEFP == "34" ~ "NEW JERSEY",
    STATEFP == "35" ~ "NEW MEXICO",
    STATEFP == "36" ~ "NEW YORK",
    STATEFP == "37" ~ "NORTH CAROLINA",
    STATEFP == "38" ~ "NORTH DAKOTA",
    STATEFP == "39" ~ "OHIO",
    STATEFP == "40" ~ "OKLAHOMA",
    STATEFP == "41" ~ "OREGON",
    STATEFP == "42" ~ "PENNSYLVANIA",
    STATEFP == "44" ~ "RHODE ISLAND",
    STATEFP == "45" ~ "SOUTH CAROLINA",
    STATEFP == "46" ~ "SOUTH DAKOTA",
    STATEFP == "47" ~ "TENNESSEE",
    STATEFP == "48" ~ "TEXAS",
    STATEFP == "49" ~ "UTAH",
    STATEFP == "50" ~ "VERMONT",
    STATEFP == "51" ~ "VIRGINIA",
    STATEFP == "53" ~ "WASHINGTON",
    STATEFP == "54" ~ "WEST VIRGINIA",
    STATEFP == "55" ~ "WISCONSIN",
    STATEFP == "56" ~ "WYOMING"
  ))
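
Since the same FIPS-to-name mapping is applied to both Census files, it could also be stored once as a named vector and reused. The sketch below is a hedged illustration with only a handful of codes; fips_to_state and add_state_name are hypothetical names, and a full version would list every state and the District of Columbia.

```r
# Partial lookup table for illustration only
fips_to_state <- c(
  "01" = "ALABAMA",
  "02" = "ALASKA",
  "04" = "ARIZONA",
  "36" = "NEW YORK",
  "56" = "WYOMING"
)

# Hypothetical helper: adds STATENAME from STATEFP. Codes missing from the
# lookup become NA, as they do when no case_when() condition matches.
add_state_name <- function(shapefile_data, lookup = fips_to_state) {
  shapefile_data$STATENAME <- unname(lookup[shapefile_data$STATEFP])
  shapefile_data
}

# Toy input standing in for a shapefile's attribute table
toy_shapes <- data.frame(STATEFP = c("36", "02", "72"), stringsAsFactors = FALSE)
add_state_name(toy_shapes)$STATENAME
```

Keeping the mapping in one place means a correction only has to be made once.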

Creating a Systematic Election Data Function for Visualization: In this section, I create a function that processes U.S. presidential election data for a given election year. The function takes the election year and the corresponding shapefile data as input and returns a prepared dataset, which makes it easy to handle, visualize, and analyze the results for any year.

The create_election_data function takes two arguments:
- election_year: the specific year of the presidential election (e.g., 2000, 2004, etc.).
- shapefile_data: the shapefile containing the geographical data for that election year.

It returns the merged dataset, which includes both the election results and the shapefile data. The per-year results are then combined and simplified into all_election_simplified.

View Code
# Function to create election data
create_election_data <- function(election_year, shapefile_data) {
  # Step 1: Filter for the specific year and the simplified party
  election_data <- PRESIDENT |>
    filter(year == election_year, office == "US PRESIDENT") |>  # Filter for the specific year and presidential election
    filter(party_simplified %in% c("DEMOCRAT", "REPUBLICAN")) |>
    group_by(state, state_fips, year) |>  # Group by state (state_fips and year are carried along)
    summarise(
      winner_party = if_else(sum(candidatevotes[party_simplified == "DEMOCRAT"]) > sum(candidatevotes[party_simplified == "REPUBLICAN"]),
                             "DEMOCRAT", "REPUBLICAN")) |>
    ungroup() |> 
    filter(!is.na(winner_party))
  
  # Step 2: Join with the shapefile data
  dis_election <- left_join(shapefile_data, election_data, by = c("STATENAME" = "state"), relationship = "many-to-many")
  #dis_election$year <- year # add year column
  return(dis_election)
}
# bind election data for each year into one file
all_election_data <- bind_rows(
  election_data_2020 <- create_election_data(2020, tl_2020_us_cd116),
  election_data_2016 <- create_election_data(2016, tl_2016_us_cd115),
  election_data_2012 <- create_election_data(2012, districts112),
  election_data_2008 <- create_election_data(2008, districts111),
  election_data_2004 <- create_election_data(2004, districts108),
  election_data_2000 <- create_election_data(2000, districts106),
  election_data_1996 <- create_election_data(1996, districts103),
  election_data_1992 <- create_election_data(1992, districts102),
  election_data_1988 <- create_election_data(1988, districts101),
  election_data_1984 <- create_election_data(1984, districts098),
  election_data_1980 <- create_election_data(1980, districts097),
  election_data_1976 <- create_election_data(1976, districts095)
)

# simplify map data
sf::sf_use_s2(FALSE)
all_election_simplified <- st_simplify(all_election_data, dTolerance = 0.01)

Creating the Election Results Map: With the combined and simplified election data, we can now create a series of maps to visualize the election results for each year. The code below maps the contiguous U.S. (excluding Alaska and Hawaii), with one panel per election year.

View Code
all_alaska <- all_election_simplified |> filter(STATENAME == "ALASKA")
all_hawaii <- all_election_simplified |> filter(STATENAME == "HAWAII") 
all_main_us <- all_election_simplified |> filter(!STATENAME %in% c("ALASKA", "HAWAII"), !is.na(winner_party))
  
  # Step 3: Main map for the contiguous U.S.
all_map_us <- ggplot(all_main_us, aes(geometry = geometry, fill = winner_party)) +
  geom_sf() + 
  scale_fill_manual(values = c("REPUBLICAN" = "red", "DEMOCRAT" = "blue")) +
  theme_minimal() +
  labs(title = "U.S. Presidential Election Results by State and Year",
       fill = "Winning Party") +
  theme_void() +
  facet_wrap(~ year, ncol=3) 

print(all_map_us)

Comparing the Effects of ECV Allocation Rules

We compare different methods for distributing electoral college votes (ECVs) among candidates in U.S. presidential elections, to see whether the rules for distributing ECVs can significantly influence the outcome of an election. Before exploring each allocation scheme, we need the number of ECVs per state, which we can find using the House data:
* Each congressional district has one House representative.
* Each state gets R + 2 ECVs, where R is its number of House representatives and the 2 corresponds to its Senators.

View Code
# count number of House Representatives using count of unique districts grouped by year and state
ECV <- HOUSE |>
  group_by(state, year) |>  # Group by state and year
  summarise(house_reps = n_distinct(district),  # Count unique districts (House representatives)
            ecv = house_reps + 2, .groups = "drop")  # get ECV by adding 2
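
As a quick arithmetic check on the R + 2 rule, the sketch below applies it to a few hypothetical district counts and adds up the national total. The helper name ecv_from_reps is introduced here for illustration. Because the rule only covers states, the 3 electoral votes of the District of Columbia have to be added separately to reach 538.

```r
# R + 2 rule: House seats plus two Senators
ecv_from_reps <- function(house_reps) house_reps + 2

# Hypothetical district counts for three states
ecv_from_reps(c(1, 26, 53))

# National totals: 435 House seats and 2 Senators for each of 50 states
states_total   <- 435 + 2 * 50
# The District of Columbia adds 3 electors
national_total <- states_total + 3
```

A state-only count therefore sums to 535, which may be why totals computed from the House data alone can fall 3 votes short of 538.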

State-Wide Winner-Take-All

In this system, the candidate who wins the most votes in a state receives all of that state’s Electoral College votes, regardless of the margin of victory. This is the rule in every state except Nebraska and Maine: if Candidate A wins 51% of the vote in a state, they receive all of that state’s electoral votes, even though Candidate B got 49% of the vote. Each state has a certain number of electoral votes (ECVs), based on its representation in Congress (Senators + House Representatives), and only the winner of the state’s popular vote gets them.

View Code
state_wide_winner_take_all <- PRESIDENT |>
  group_by(state, year) |>
  filter(candidatevotes == max(candidatevotes)) |>
  left_join(ECV, by = c("state" = "state", "year" = "year")) |>
  select(state, year, candidate, party_simplified, ecv) |>
  filter(!is.na(ecv))

state_wide_winner_take_all <- state_wide_winner_take_all |> 
  group_by(year, candidate, party_simplified) |> 
  summarise(total_ecv = sum(ecv), .groups = "drop") |> # total ecv
  arrange(year, total_ecv) |> 
  group_by(year) |> 
  mutate(winner = if_else(total_ecv == max(total_ecv), "Yes", "No")) |>  # Mark winner
  ungroup() |>
  arrange(year, party_simplified)

state_wide_winner_take_all |> gt() |>
  tab_header(
    title = "State-Wide Winner-Take-All"
  ) |>
  cols_label( # display column names
    year = "Year",
    candidate = "Candidate",
    party_simplified = "Party",
    total_ecv = "Electoral Votes",
    winner = "Winning Candidate"
  )

State-Wide Winner-Take-All

Year  Candidate            Party       Electoral Votes  Winning Candidate
1976  CARTER, JIMMY        DEMOCRAT    294              Yes
1976  FORD, GERALD         REPUBLICAN  241              No
1980  CARTER, JIMMY        DEMOCRAT    87               No
1980  REAGAN, RONALD       REPUBLICAN  448              Yes
1984  MONDALE, WALTER      DEMOCRAT    10               No
1984  REAGAN, RONALD       REPUBLICAN  525              Yes
1988  DUKAKIS, MICHAEL     DEMOCRAT    109              No
1988  BUSH, GEORGE H.W.    REPUBLICAN  426              Yes
1992  CLINTON, BILL        DEMOCRAT    367              Yes
1992  BUSH, GEORGE H.W.    REPUBLICAN  168              No
1996  CLINTON, BILL        DEMOCRAT    376              Yes
1996  DOLE, ROBERT         REPUBLICAN  159              No
2000  GORE, AL             DEMOCRAT    264              No
2000  BUSH, GEORGE W.      REPUBLICAN  271              Yes
2004  KERRY, JOHN          DEMOCRAT    249              No
2004  BUSH, GEORGE W.      REPUBLICAN  286              Yes
2008  OBAMA, BARACK H.     DEMOCRAT    361              Yes
2008  MCCAIN, JOHN         REPUBLICAN  174              No
2012  OBAMA, BARACK H.     DEMOCRAT    329              Yes
2012  ROMNEY, MITT         REPUBLICAN  206              No
2016  CLINTON, HILLARY     DEMOCRAT    230              No
2016  TRUMP, DONALD J.     REPUBLICAN  305              Yes
2020  BIDEN, JOSEPH R. JR  DEMOCRAT    306              Yes
2020  TRUMP, DONALD J.     REPUBLICAN  232              No

View Code
ggplot(state_wide_winner_take_all, aes(x = factor(year), y = total_ecv, fill = party_simplified)) +
  geom_bar(stat = "identity", position = "dodge") +  #  keeps bars side-by-side
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red")) +
  theme_minimal() +
  labs(
    title = "Total ECV Votes for Each Candidate in U.S. Presidential Elections",
    x = "Year",
    y = "Total ECV",
    fill = "Party"
  )
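
To see how winner-take-all can diverge from the popular vote, here is a toy example with three fictional states and two candidates. All numbers are made up; the steps follow the same pattern as the code above (pick each state's top candidate, attach the state's ECVs, sum by candidate).

```r
library(dplyr)

# Hypothetical state-level results
toy_votes <- tibble(
  state          = c("A", "A", "B", "B", "C", "C"),
  candidate      = rep(c("X", "Y"), 3),
  candidatevotes = c(510, 490, 300, 700, 620, 380)
)
toy_ecv <- tibble(state = c("A", "B", "C"), ecv = c(10, 6, 4))

toy_wta <- toy_votes |>
  group_by(state) |>
  filter(candidatevotes == max(candidatevotes)) |>  # state winner only
  ungroup() |>
  left_join(toy_ecv, by = "state") |>
  group_by(candidate) |>
  summarise(total_ecv = sum(ecv), .groups = "drop")

# National popular vote for comparison
toy_popular <- toy_votes |>
  group_by(candidate) |>
  summarise(popular_votes = sum(candidatevotes), .groups = "drop")
```

Candidate X takes states A and C for 14 ECVs against 6 for Y, even though Y has more votes nationwide (1,570 against 1,430).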

District-Wide Winner-Take-All + State-Wide “At Large” Votes

This method allocates R ECVs to the popular vote winners of the congressional districts and the remaining 2 ECVs to the state-wide popular vote winner. This hybrid system is used in Maine and Nebraska. In each congressional district, the candidate who wins the popular vote gets one electoral vote. Then, the state as a whole gives two additional “at-large” ECVs to the candidate who wins the overall state-wide popular vote. For example, if Candidate A wins two of Nebraska’s three districts, and Candidate B wins the remaining district and the state-wide popular vote, the electoral votes are split like this:
- Candidate A: 2 ECVs from the districts.
- Candidate B: 1 ECV from the district plus 2 ECVs for winning the state-wide vote, for 3 in total.

This system allows a state’s ECVs to be split, unlike the traditional winner-take-all system, where a candidate who wins the state by a narrow margin still receives all of the state’s votes.
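
The district-plus-at-large rule can be sketched on a hypothetical three-district state. The district winners and the state-wide winner below are made up, and the object names are introduced here for illustration.

```r
library(dplyr)

# Hypothetical district results for one state
toy_districts <- tibble(
  district = c(1, 2, 3),
  winner   = c("A", "A", "B")
)
statewide_winner <- "B"  # assumed winner of the state-wide popular vote

# One ECV per district won
district_counts <- toy_districts |>
  count(winner, name = "district_ecv")

# Add the two at-large ECVs for the state-wide winner
toy_allocation <- district_counts |>
  mutate(
    at_large_ecv = if_else(winner == statewide_winner, 2, 0),
    total_ecv    = district_ecv + at_large_ecv
  )

toy_allocation
```

Candidate A ends with 2 ECVs and Candidate B with 3, for 5 in total. A state-wide winner who carried no district would not appear in district_counts, so a fuller version would need to add that candidate before assigning the at-large votes.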

View Code
# look at statewide winner - assign 2 ecv
state_wide_winner <- PRESIDENT |>
  group_by(state, year) |>
  mutate(statewide_winner = if_else(candidatevotes == max(candidatevotes), "Yes", "No")) |>  # Mark statewide winner
  ungroup() |>
  # Assign ECV based on who won the state
  mutate(ECV = if_else(statewide_winner == "Yes", 2, 0)) |> # assign the 2 ECV if statewide winner, else 0 ECV
  select(state, year, candidate, candidatevotes, ECV) |>
  filter(!is.na(candidate))

# look at winner of district - assign 1 ecv per district
# Assume that the presidential candidate of the same party as the congressional representative wins that election.
# Find the winner of each district in the HOUSE dataset
district_winners <- HOUSE |>
  filter(year %in% seq(1976, 2020, by = 4)) |>  # keep presidential election years only
  group_by(state, year, district) |>
  filter(candidatevotes == max(candidatevotes)) |>
  ungroup() |>
  mutate(ecv = 1)  # Assign 1 ECV for each winning district

# Join the district winners with the PRESIDENT dataset to match the party
ecv_assignment <- district_winners |>
  left_join(PRESIDENT, by = c("state", "year", "party" = "party_simplified"), relationship = "many-to-many") |>
  mutate(ecv_presidential = 1) |>
  select(state, year, district, candidate.y, party, ecv_presidential)

#  Find total ecv from districts
district_ecv_summary <- ecv_assignment |>
  group_by(state, year, candidate.y, party) |>
  summarise(district_total_ecv = sum(ecv_presidential), .groups = "drop")

#  Join the district-level ECV summary with the statewide ECVs
ecv_combined <- state_wide_winner |>
  left_join(district_ecv_summary, by = c("state", "year", "candidate" = "candidate.y")) |>
  # Add the statewide ECV to the district-level ECVs
  mutate(total_ecv = district_total_ecv + ECV) |>
  filter(!is.na(total_ecv)) 

ecv_combined <- ecv_combined |> 
  group_by(year, candidate, party) |> 
  summarise(total_ecv = sum(total_ecv), .groups = "drop") |> # total ecv
  arrange(year, total_ecv) |> 
  group_by(year) |> 
  mutate(winner = if_else(total_ecv == max(total_ecv), "Yes", "No")) |>  # Mark winner
  ungroup() 

ecv_combined |> gt() |>
  tab_header(
    title = "District-Wide Winner-Take-All + State-Wide At Large Votes"
  ) |>
  cols_label( # display column names
    year = "Year",
    candidate = "Candidate",
    party = "Party",
    total_ecv = "Electoral Votes",
    winner = "Winning Candidate"
  )
District-Wide Winner-Take-All + State-Wide At Large Votes
Year Candidate Party Electoral Votes Winning Candidate
1976 FORD, GERALD REPUBLICAN 204 No
1976 CARTER, JIMMY DEMOCRAT 362 Yes
1980 CARTER, JIMMY DEMOCRAT 258 No
1980 REAGAN, RONALD REPUBLICAN 287 Yes
1984 MONDALE, WALTER DEMOCRAT 276 No
1984 REAGAN, RONALD REPUBLICAN 283 Yes
1988 BUSH, GEORGE H.W. REPUBLICAN 262 No
1988 DUKAKIS, MICHAEL DEMOCRAT 292 Yes
1992 BUSH, GEORGE H.W. REPUBLICAN 228 No
1992 CLINTON, BILL DEMOCRAT 329 Yes
1996 CLINTON, BILL DEMOCRAT 282 No
1996 DOLE, ROBERT REPUBLICAN 283 Yes
2000 GORE, AL DEMOCRAT 280 No
2000 BUSH, GEORGE W. REPUBLICAN 290 Yes
2004 OTHER DEMOCRAT 12 No
2004 KERRY, JOHN DEMOCRAT 248 No
2004 BUSH, GEORGE W. REPUBLICAN 299 Yes
2008 MCCAIN, JOHN REPUBLICAN 224 No
2008 OBAMA, BARACK H. DEMOCRAT 330 Yes
2012 OBAMA, BARACK H. DEMOCRAT 275 No
2012 ROMNEY, MITT REPUBLICAN 286 Yes
2016 CLINTON, HILLARY DEMOCRAT 291 No
2016 TRUMP, DONALD J. REPUBLICAN 313 Yes
2020 TRUMP, DONALD J. REPUBLICAN 263 No
2020 BIDEN, JOSEPH R. JR DEMOCRAT 275 Yes
View Code
ggplot(ecv_combined, aes(x = factor(year), y = total_ecv, fill = party)) +
  geom_bar(stat = "identity", position = "dodge") +  #  keeps bars side-by-side
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red")) +
  theme_minimal() +
  labs(
    title = "Total ECV Votes for Each Candidate in U.S. Presidential Elections",
    x = "Year",
    y = "Total ECV",
    fill = "Party"
  )

State-Wide Proportional

Under this system, electoral votes are distributed proportionally based on the percentage of votes each candidate receives in the state. If a candidate wins 60% of the vote in a state with 10 electoral votes, they get 60% of those electoral votes (6 ECVs). The approach here involves calculating the total number of votes for each candidate in each state and then determining the proportion of the state's total vote that each candidate received.

Note: Rounding in proportional allocation can leave a state's allocated ECVs short of (or above) its actual total, because the rounded values need not sum to the number of ECVs available for that state. Here, I assign the difference to the candidate with the greatest proportion of votes.
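
The rounding step can be seen on a small scale in the sketch below. It uses one hypothetical state with 10 ECVs and three invented candidates; `toy_state` and `state_ecv` are illustrative names, and the column names simply mirror the ones used in this analysis.

```r
library(dplyr)

# Hypothetical single-state result: 10 ECVs, three candidates (numbers invented)
toy_state <- tibble(
  candidate = c("Candidate A", "Candidate B", "Candidate C"),
  candidatevotes = c(4400, 4300, 1300)
)
state_ecv <- 10

toy_alloc <- toy_state |>
  mutate(
    vote_share = candidatevotes / sum(candidatevotes),
    round_prop_ecv = round(vote_share * state_ecv),    # rounds 4.4, 4.3, 1.3 to 4, 4, 1
    remaining_ecvs = state_ecv - sum(round_prop_ecv),  # ECVs not yet allocated
    final_ecv = if_else(vote_share == max(vote_share),
                        round_prop_ecv + remaining_ecvs,  # difference goes to the top vote share
                        round_prop_ecv)
  )

toy_alloc |> select(candidate, round_prop_ecv, final_ecv)
```

Rounding hands out only 9 of the 10 ECVs here, so the leftover vote goes to Candidate A. If rounding had overshot instead, `remaining_ecvs` would be negative and the same step would take ECVs away from the leading candidate.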

View Code
# Allocate ECVs based on that proportion with rounding 
state_proportional_votes <- PRESIDENT |>
  group_by(state, year) |>
  mutate(vote_share = candidatevotes / sum(candidatevotes)) |> # Proportion of votes
  ungroup() |>
  left_join(ECV, by = c("state", "year")) |>
  mutate(proportional_ecv = round(vote_share * ecv))  # Round to allocate ECVs

# Summarize the total ECVs for each candidate by state and year
state_proportional_summary <- state_proportional_votes |>
  group_by(state, year, candidate, party_simplified) |>
  summarise(total_proportional_ecv = sum(proportional_ecv), .groups = "drop") |>
  arrange(state, year, total_proportional_ecv) |>
  group_by(state, year) |>
  # Mark the winner with the most ECVs in each state and year
  mutate(winner = if_else(total_proportional_ecv == max(total_proportional_ecv), "Yes", "No")) |> 
  ungroup() 

# Rounding vote_share * ecv can leave a state's total above or below its ECV count
# Round to the nearest whole ECV, then assign the difference to the top vote share
state_wide_prop <- PRESIDENT |>
  group_by(state, year) |>
  mutate(vote_share = candidatevotes / sum(candidatevotes)) |> # Proportion of votes
  ungroup() |>
  left_join(ECV, by = c("state", "year")) |>
  mutate(prop_ecv = vote_share * ecv, round_prop_ecv = round(vote_share * ecv))  |>  # Round ECVs
  group_by(state, year) |>
  mutate(remaining_ecvs = ecv - sum(round_prop_ecv)) |>  # Calculate how many ECVs are left to allocate
  ungroup() |>
  
  # assign remainder to the max unrounded proportion
  group_by(state, year) |>

  mutate(final_ecv = ifelse(vote_share == max(vote_share), 
                            round_prop_ecv + remaining_ecvs, 
                            round_prop_ecv)) |>  # Allocate remaining ECVs to the candidate with max vote share
  ungroup() |>
  select(year, state, candidate, party_simplified, ecv, prop_ecv, round_prop_ecv, remaining_ecvs, final_ecv)

# Summarize the total allocated ECVs for each candidate
state_wide_prop_summary <- state_wide_prop |>
  group_by(state, year, candidate, party_simplified) |>
  summarise(total_prop_ecv = sum(final_ecv), .groups = "drop") |>
  group_by(year, state) |>
  mutate(winner = if_else(total_prop_ecv == max(total_prop_ecv), "Yes", "No")) |> 
  ungroup() |>
  filter(total_prop_ecv > 0) |>
  select(year, candidate, party_simplified, total_prop_ecv, winner)

# across states for the year
state_wide_totals <- state_wide_prop_summary |>
  group_by(year, candidate, party_simplified) |>
  summarise(total_ecv = sum(total_prop_ecv), .groups = "drop") |>
  group_by(year) |>
  mutate(winner = if_else(total_ecv == max(total_ecv), "Yes", "No")) |> 
  ungroup() |>
  filter(total_ecv > 0) |>
  select(year, candidate, party_simplified, total_ecv, winner) |>
  arrange(year, desc(total_ecv))
View Code
ggplot(state_wide_totals, aes(x = factor(year), y = total_ecv, fill = party_simplified)) +
  geom_bar(stat = "identity", position = "dodge") +  #  keeps bars side-by-side
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red", "LIBERTARIAN" = "beige", "OTHER" = "gray")) +
  theme_minimal() +
  labs(
    title = "Total ECV Votes for Each Candidate in U.S. Presidential Elections",
    x = "Year",
    y = "Total ECV",
    fill = "Party"
  )

View Code
state_wide_totals |> gt() |>
  tab_header(
    title = "State-Wide Proportional"
  ) |>
  cols_label( # display column names
    year = "Year",
    candidate = "Candidate",
    party_simplified = "Party",
    total_ecv = "Electoral Votes",
    winner = "Winning Candidate"
  )
State-Wide Proportional
Year Candidate Party Electoral Votes Winning Candidate
1976 CARTER, JIMMY DEMOCRAT 270 Yes
1976 FORD, GERALD REPUBLICAN 261 No
1976 FORD, GERALD OTHER 2 No
1976 CARTER, JIMMY OTHER 1 No
1976 OTHER OTHER 1 No
1980 REAGAN, RONALD REPUBLICAN 281 Yes
1980 CARTER, JIMMY DEMOCRAT 220 No
1980 ANDERSON, JOHN B. OTHER 31 No
1980 REAGAN, RONALD OTHER 2 No
1980 CLARK, EDWARD ""ED"" LIBERTARIAN 1 No
1984 REAGAN, RONALD REPUBLICAN 321 Yes
1984 MONDALE, WALTER DEMOCRAT 211 No
1984 REAGAN, RONALD OTHER 2 No
1984 MONDALE, WALTER OTHER 1 No
1988 BUSH, GEORGE H.W. REPUBLICAN 291 Yes
1988 DUKAKIS, MICHAEL DEMOCRAT 242 No
1988 BUSH, GEORGE H.W. OTHER 1 No
1988 DUKAKIS, MICHAEL OTHER 1 No
1992 CLINTON, BILL DEMOCRAT 226 Yes
1992 BUSH, GEORGE H.W. REPUBLICAN 203 No
1992 PEROT, ROSS OTHER 103 No
1992 BUSH, GEORGE H.W. OTHER 2 No
1992 BLANK VOTE/SCATTERING OTHER 1 No
1996 CLINTON, BILL DEMOCRAT 262 Yes
1996 DOLE, ROBERT REPUBLICAN 223 No
1996 PEROT, ROSS OTHER 42 No
1996 NA OTHER 4 No
1996 BLANK VOTE/SCATTERING OTHER 1 No
1996 CLINTON, BILL OTHER 1 No
1996 DOLE, ROBERT OTHER 1 No
1996 NADER, RALPH OTHER 1 No
2000 GORE, AL DEMOCRAT 263 Yes
2000 BUSH, GEORGE W. REPUBLICAN 262 No
2000 NADER, RALPH OTHER 6 No
2000 BLANK VOTE/SCATTERING OTHER 1 No
2000 BUSH, GEORGE W. OTHER 1 No
2000 NOT DESIGNATED OTHER 1 No
2000 NA OTHER 1 No
2004 BUSH, GEORGE W. REPUBLICAN 278 Yes
2004 KERRY, JOHN DEMOCRAT 255 No
2004 BUSH, GEORGE W. OTHER 1 No
2004 KERRY, JOHN OTHER 1 No
2008 OBAMA, BARACK H. DEMOCRAT 285 Yes
2008 MCCAIN, JOHN REPUBLICAN 247 No
2008 MCCAIN, JOHN OTHER 2 No
2008 OBAMA, BARACK H. OTHER 1 No
2012 OBAMA, BARACK H. DEMOCRAT 271 Yes
2012 ROMNEY, MITT REPUBLICAN 261 No
2012 JOHNSON, GARY LIBERTARIAN 1 No
2012 OBAMA, BARACK H. OTHER 1 No
2012 ROMNEY, MITT OTHER 1 No
2016 CLINTON, HILLARY DEMOCRAT 265 Yes
2016 TRUMP, DONALD J. REPUBLICAN 257 No
2016 JOHNSON, GARY LIBERTARIAN 8 No
2016 CLINTON, HILLARY OTHER 1 No
2016 MCMULLIN, EVAN OTHER 1 No
2016 STEIN, JILL OTHER 1 No
2016 TRUMP, DONALD J. OTHER 1 No
2016 NA OTHER 1 No
2020 BIDEN, JOSEPH R. JR DEMOCRAT 273 Yes
2020 TRUMP, DONALD J. REPUBLICAN 264 No
2020 JORGENSEN, JO LIBERTARIAN 1 No

National Proportional

This system allocates ECVs based on the national popular vote, not state by state: each candidate's share of the total ECVs is proportional to their share of all votes cast nationwide. If Candidate A wins 60% of the total national popular vote and Candidate B wins 40%, Candidate A would receive 60% of the total ECVs, and Candidate B would get 40%, regardless of how they performed in any individual state. This system would reduce the importance of individual states and the swing-state effect, and would tie the election outcome more directly to the national popular vote.
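
The 60/40 example can be worked through directly. The sketch below assumes 538 ECVs nationwide (the total in every election covered here) and uses hypothetical vote shares; `national_share` and `national_ecv` are illustrative names only.

```r
total_ecv <- 538  # ECVs available nationwide in each election from 1976 to 2020

# Hypothetical national vote shares
national_share <- c("Candidate A" = 0.60, "Candidate B" = 0.40)

# Each candidate's ECVs are their vote share times the national total, rounded
national_ecv <- round(national_share * total_ecv)
national_ecv
```

Candidate A's 322.8 rounds up to 323 and Candidate B's 215.2 rounds down to 215, which happens to sum to 538. With many small candidates, the rounded values need not add up to the full total.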

View Code
# Find total ECV for each year 
electoral_votes_available <- ECV |>
  group_by(year) |>
  summarize(total_ecv = sum(ecv)) # sum ecv

nation_wide_prop <- PRESIDENT |>
  select(year, state, candidate, candidatevotes, party_simplified) |>
  group_by(year, candidate, party_simplified) |>
  summarize(candidate_total = sum(candidatevotes)) |> # total votes nationwide per candidate per year
  group_by(year) |>
  mutate(nation_total = sum(candidate_total)) |>  # total votes nationwide per year
  ungroup() |>
  mutate(prop_vote = (candidate_total / nation_total)) |> # proportion of candidate votes to nationwide votes
  select(-candidate_total, -nation_total) |>
  left_join(electoral_votes_available, join_by(year == year)) |> # join with ECV
  mutate(prop_ecv = round(prop_vote * total_ecv, digits = 0)) |> # multiply proportion to total ecv that year
  select(-prop_vote, -total_ecv) |>
  group_by(year)

# Summarize the total allocated ECVs for each candidate
nation_wide_summary <- nation_wide_prop |>
  group_by(year) |>
  mutate(winner = if_else(prop_ecv == max(prop_ecv), "Yes", "No")) |> 
  ungroup() |>
  filter(prop_ecv > 0, !is.na(candidate)) |>
  select(year, candidate, prop_ecv, winner, party_simplified) |>
  arrange(year, desc(prop_ecv))
View Code
ggplot(nation_wide_summary, aes(x = factor(year), y = prop_ecv, fill = party_simplified)) +
  geom_bar(stat = "identity", position = "dodge") +  #  keeps bars side-by-side
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red", "LIBERTARIAN" = "beige", "OTHER" = "gray")) +
  theme_minimal() +
  labs(
    title = "Total ECV Votes for Each Candidate in U.S. Presidential Elections",
    x = "Year",
    y = "Total ECV",
    fill = "Party"
  )

View Code
nation_wide_summary |> gt() |>
  tab_header(
    title = "Nation-Wide Proportional"
  ) |>
  cols_label( # display column names
    year = "Year",
    candidate = "Candidate",
    party_simplified = "Party",
    prop_ecv = "Electoral Votes",
    winner = "Winning Candidate"
  )
Nation-Wide Proportional
Year Candidate Electoral Votes Winning Candidate Party
1976 CARTER, JIMMY 267 Yes DEMOCRAT
1976 FORD, GERALD 255 No REPUBLICAN
1976 MCCARTHY, EUGENE ""GENE"" 4 No OTHER
1976 FORD, GERALD 2 No OTHER
1976 ANDERSON, THOMAS J. 1 No OTHER
1976 CAMEJO, PETER 1 No OTHER
1976 CARTER, JIMMY 1 No OTHER
1976 MACBRIDE, ROGER 1 No LIBERTARIAN
1976 MADDOX, LESTER 1 No OTHER
1976 OTHER 1 No OTHER
1980 REAGAN, RONALD 270 Yes REPUBLICAN
1980 CARTER, JIMMY 219 No DEMOCRAT
1980 ANDERSON, JOHN B. 35 No OTHER
1980 CLARK, EDWARD ""ED"" 5 No LIBERTARIAN
1980 REAGAN, RONALD 2 No OTHER
1980 COMMONER, BARRY 1 No OTHER
1984 REAGAN, RONALD 313 Yes REPUBLICAN
1984 MONDALE, WALTER 216 No DEMOCRAT
1984 REAGAN, RONALD 2 No OTHER
1984 BERGLAND, DAVID 1 No LIBERTARIAN
1984 MONDALE, WALTER 1 No OTHER
1988 BUSH, GEORGE H.W. 284 Yes REPUBLICAN
1988 DUKAKIS, MICHAEL 244 No DEMOCRAT
1988 PAUL, RONALD ""RON"" 2 No LIBERTARIAN
1988 BUSH, GEORGE H.W. 1 No OTHER
1988 DUKAKIS, MICHAEL 1 No OTHER
1988 FULANI, LENORA 1 No OTHER
1992 CLINTON, BILL 229 Yes DEMOCRAT
1992 BUSH, GEORGE H.W. 198 No REPUBLICAN
1992 PEROT, ROSS 101 No OTHER
1992 BUSH, GEORGE H.W. 2 No OTHER
1992 BLANK VOTE/SCATTERING 1 No OTHER
1992 MARROU, ANDRE 1 No LIBERTARIAN
1996 CLINTON, BILL 263 Yes DEMOCRAT
1996 DOLE, ROBERT 216 No REPUBLICAN
1996 PEROT, ROSS 42 No OTHER
1996 BROWNE, HARRY 3 No LIBERTARIAN
1996 NADER, RALPH 3 No OTHER
1996 BLANK VOTE/SCATTERING 1 No OTHER
1996 CLINTON, BILL 1 No OTHER
1996 DOLE, ROBERT 1 No OTHER
1996 HAGELIN, JOHN 1 No OTHER
1996 PHILLIPS, HOWARD 1 No OTHER
2000 GORE, AL 258 Yes DEMOCRAT
2000 BUSH, GEORGE W. 255 No REPUBLICAN
2000 NADER, RALPH 13 No OTHER
2000 BROWNE, HARRY 2 No LIBERTARIAN
2000 BUCHANAN, PATRICK ""PAT"" 2 No OTHER
2000 BLANK VOTE/SCATTERING 1 No OTHER
2000 BUSH, GEORGE W. 1 No OTHER
2000 GORE, AL 1 No OTHER
2000 NOT DESIGNATED 1 No OTHER
2004 BUSH, GEORGE W. 271 Yes REPUBLICAN
2004 KERRY, JOHN 258 No DEMOCRAT
2004 BADNARIK, MICHAEL 2 No LIBERTARIAN
2004 NADER, RALPH 2 No OTHER
2004 BUSH, GEORGE W. 1 No OTHER
2004 COBB, DAVID 1 No OTHER
2004 KERRY, JOHN 1 No OTHER
2004 OTHER 1 No OTHER
2004 PEROUTKA, MICHAEL 1 No OTHER
2008 OBAMA, BARACK H. 282 Yes DEMOCRAT
2008 MCCAIN, JOHN 243 No REPUBLICAN
2008 NADER, RALPH 3 No OTHER
2008 BARR, BOB 2 No LIBERTARIAN
2008 BALDWIN, CHARLES ""CHUCK"" 1 No OTHER
2008 MCCAIN, JOHN 1 No OTHER
2008 MCKINNEY, CYNTHIA 1 No OTHER
2008 OBAMA, BARACK H. 1 No OTHER
2012 OBAMA, BARACK H. 272 Yes DEMOCRAT
2012 ROMNEY, MITT 251 No REPUBLICAN
2012 JOHNSON, GARY 5 No LIBERTARIAN
2012 STEIN, JILL 2 No OTHER
2012 OBAMA, BARACK H. 1 No OTHER
2012 ROMNEY, MITT 1 No OTHER
2016 CLINTON, HILLARY 257 Yes DEMOCRAT
2016 TRUMP, DONALD J. 245 No REPUBLICAN
2016 JOHNSON, GARY 16 No LIBERTARIAN
2016 STEIN, JILL 5 No OTHER
2016 MCMULLIN, EVAN 2 No OTHER
2016 BLANK VOTE 1 No OTHER
2016 CASTLE, DARRELL L. 1 No OTHER
2016 CLINTON, HILLARY 1 No OTHER
2016 OTHER 1 No OTHER
2016 SCATTERING 1 No OTHER
2016 TRUMP, DONALD J. 1 No OTHER
2020 BIDEN, JOSEPH R. JR 276 Yes DEMOCRAT
2020 TRUMP, DONALD J. 252 No REPUBLICAN
2020 JORGENSEN, JO 6 No LIBERTARIAN
2020 HAWKINS, HOWIE 1 No OTHER

Evaluating Fairness of ECV Allocation Schemes

Fact Check Example: The 2000 U.S. Presidential Election

The 2000 U.S. presidential election between George W. Bush (Republican) and Al Gore (Democrat) provides a compelling case study of the Electoral College’s impact. Bush won the presidency despite losing the popular vote by approximately 500,000 votes. This resulted in widespread criticism of the electoral system.

State-Wide Winner-Take-All: Bush: 271 ECVs (Winner), Gore: 266 ECVs (Loser). Under this system, Bush wins by narrowly carrying key battleground states, including Florida, despite Gore’s national popular vote lead.

District-Wide Winner-Take-All + State-Wide At-Large Votes: Bush: 290 ECVs (Winner), Gore: 280 ECVs (Loser). This hybrid system also produces a Bush win, because the district-level distribution of votes tends to favor Republicans in many of the congressional districts.

State-Wide Proportional: Gore: 263 ECVs (Winner), Bush: 262 ECVs (Loser). This system allocates each state's ECVs in proportion to the popular vote in that state. Gore finishes one vote ahead, with the remaining ECVs going to third-party candidates, so the result tracks the close popular vote much more directly.

National Proportional: Gore: 258 ECVs (Winner), Bush: 255 ECVs (Loser). In a national proportional system, Gore’s slightly larger share of the national vote translates into a narrow lead, highlighting the disparity between the Electoral College and the popular vote outcome.

In the 2000 election, the State-Wide Winner-Take-All system favored George W. Bush, despite Al Gore winning the popular vote. The National Proportional system would have resulted in a Gore victory, aligning the ECVs more closely with the popular vote. This highlights how winner-take-all methods can distort the will of the majority.

View Code
# State-Wide Winner-Take-All
state_wide_winner_only <- state_wide_winner_take_all |>
  group_by(year) |>
  mutate(winner_party = if_else(total_ecv == max(total_ecv), party_simplified, NA_character_),
         winning_candidate = if_else(total_ecv == max(total_ecv), candidate, NA_character_)) |>
  ungroup() |>
  select(year, party_simplified, winning_candidate, winner_party, total_ecv) |>
  filter(!is.na(winning_candidate))


# District-Wide Winner-Take-All + State-Wide At Large Votes
district_wide_winner_only <- ecv_combined |>
  group_by(year) |>
  mutate(winner_party = if_else(total_ecv == max(total_ecv), party, NA_character_),
         winning_candidate = if_else(total_ecv == max(total_ecv), candidate, NA_character_)) |>
  ungroup() |>
  select(year, winning_candidate, winner_party, total_ecv) |>
  filter(!is.na(winning_candidate))

# State-Wide Proportional
state_prop_winner_only <- state_wide_totals |>
  group_by(year) |>
  mutate(winner_party = if_else(total_ecv == max(total_ecv), party_simplified, NA_character_),
         winning_candidate = if_else(total_ecv == max(total_ecv), candidate, NA_character_)) |>
  ungroup() |>
  select(year, winning_candidate, winner_party, total_ecv) |>
  filter(!is.na(winning_candidate))

# Nation-Wide Proportional
nation_prop_winner_only <- nation_wide_summary |>
  group_by(year) |>
  mutate(winner_party = if_else(prop_ecv == max(prop_ecv), party_simplified, NA_character_),
         winning_candidate = if_else(prop_ecv == max(prop_ecv), candidate, NA_character_)) |>
  ungroup() |>
  select(year, winning_candidate, winner_party, prop_ecv) |>
  filter(!is.na(winning_candidate))

# Join the four datasets by 'year'
winners_comparison <- state_wide_winner_only |>
  left_join(district_wide_winner_only, by = "year", suffix = c("_state_wide", "_district_wide")) |>
  left_join(state_prop_winner_only, by = "year", suffix = c("", "_state_prop")) |>
  left_join(nation_prop_winner_only, by = "year", suffix = c("", "_nation_prop"))

# Select relevant columns for displaying the result
winners_comparison <- winners_comparison |>
  select(-party_simplified)

# Display the results for 2000
winners_comparison |> 
  filter(year == 2000) |>
  gt() |>
  tab_header(
    title = "Winning Candidates by ECV Allocation Method (2000)"
  ) |>
  cols_label( # Display column names
    year = "Year",
    winning_candidate_state_wide = "Winning Candidate",
    winner_party_state_wide = "Winner Party",
    total_ecv_state_wide = "Total ECV",
    winning_candidate_district_wide = "Winning Candidate",
    winner_party_district_wide = "Winner Party",
    total_ecv_district_wide = "Total ECV",
    winning_candidate = "Winning Candidate",
    winner_party = "Winner Party",
    total_ecv = "Total ECV",
    winning_candidate_nation_prop = "Winning Candidate",
    winner_party_nation_prop = "Winner Party",
    prop_ecv = "Total ECV"
  ) |>
  tab_spanner(
    label = "State-Wide Winner-Take-All",
    columns = c(winning_candidate_state_wide, winner_party_state_wide, total_ecv_state_wide)
  ) |>
  tab_spanner(
    label = "District-Wide Winner-Take-All + State-Wide At Large Votes",
    columns = c(winning_candidate_district_wide , winner_party_district_wide, total_ecv_district_wide)
  ) |>
  tab_spanner(
    label = "State-Wide Proportional",
    columns = c(winning_candidate, winner_party, total_ecv)
  ) |>
  tab_spanner(
    label = "Nation-Wide Proportional",
    columns = c(winning_candidate_nation_prop, winner_party_nation_prop, prop_ecv)
  ) |>
  tab_style(
    style = list(cell_text(weight = "bold")),
    locations = cells_column_labels(columns = 1:13)
  )
Winning Candidates by ECV Allocation Method (2000)
Year
State-Wide Winner-Take-All
District-Wide Winner-Take-All + State-Wide At Large Votes
State-Wide Proportional
Nation-Wide Proportional
Winning Candidate Winner Party Total ECV Winning Candidate Winner Party Total ECV Winning Candidate Winner Party Total ECV Winning Candidate Winner Party Total ECV
2000 BUSH, GEORGE W. REPUBLICAN 271 BUSH, GEORGE W. REPUBLICAN 290 GORE, AL DEMOCRAT 263 GORE, AL DEMOCRAT 258

The proportional allocation methods also assign a small number of ECVs to parties other than the Republicans and Democrats, such as Libertarian or Other, as seen below. Because these totals are small relative to the national count, I filter them out in the second plot below.

View Code
winners_2000 <- bind_rows( # bind different allocation methods together to plot in a facet
  state_wide_winner_take_all |> filter(year == 2000) |>
    mutate(allocation_method = "State-Wide Winner-Take-All"),
  ecv_combined |> filter(year == 2000) |>
    mutate(allocation_method = "District-Wide Winner-Take-All + State-Wide At Large Votes") |>
    rename(party_simplified = party),
  state_wide_totals |> filter(year == 2000) |>
    mutate(allocation_method = "State-Wide Proportional"),
  nation_wide_summary |> filter(year == 2000) |>
    mutate(allocation_method = "Nation-Wide Proportional") |>
    rename(total_ecv = prop_ecv)
)

ggplot(winners_2000, aes(x = party_simplified, y = total_ecv, fill = party_simplified)) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(
    aes(label = total_ecv), # label with ECV
    position = position_stack(vjust = 0.5), color = "black", size = 3  # 
  )+
  facet_wrap(~allocation_method, scales = "free_y") +  # Facet by allocation method
  labs(
    title = "Total ECV Votes by Allocation Method (2000)",
    x = "Party",
    y = "Total ECV Votes",
    fill = "Party"
  ) +
  theme_minimal() +
  scale_fill_manual(
    values = c("REPUBLICAN" = "red", "DEMOCRAT" = "blue", "LIBERTARIAN" = "beige", "OTHER" = "gray")  
  ) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1), # Rotate party names for better visibility
    legend.position = "none" # Hide legend since it's already represented by colors
  )

View Code
winners_2000_na <- winners_2000 |>   filter(party_simplified %in% c("DEMOCRAT", "REPUBLICAN"))

ggplot(winners_2000_na, aes(x = party_simplified, y = total_ecv, fill = party_simplified)) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(
    aes(label = total_ecv), # label with ECV
    position = position_stack(vjust = 0.5), color = "black", size = 3  # 
  ) +
  facet_wrap(~allocation_method, scales = "free_y") +  # Facet by allocation method
  labs(
    title = "Total ECV Votes by Allocation Method (2000)",
    x = "Party",
    y = "Total ECV Votes",
    fill = "Party"
  ) +
  theme_minimal() +
  scale_fill_manual(
    values = c("REPUBLICAN" = "red", "DEMOCRAT" = "blue")  
  ) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1), # Rotate party names for better visibility
    legend.position = "none" # Hide legend since it's already represented by colors
  )

  • Winner-Take-All systems: In these systems, narrow victories in key swing states have an outsized impact, making it possible for a candidate to lose the popular vote but win the Electoral College.
  • Proportional systems: These systems provide a more accurate reflection of the national popular vote and reduce the disproportionate weight given to small states or swing states.
View Code
# Display the results for all years
winners_comparison |> 
  gt() |>
  tab_header(
    title = "Comparison of Winning Candidates and Parties by ECV Allocation System"
  ) |>
  cols_label( # Display column names
    year = "Year",
    winning_candidate_state_wide = "Winning Candidate",
    winner_party_state_wide = "Winner Party",
    total_ecv_state_wide = "Total ECV",
    winning_candidate_district_wide = "Winning Candidate",
    winner_party_district_wide = "Winner Party",
    total_ecv_district_wide = "Total ECV",
    winning_candidate = "Winning Candidate",
    winner_party = "Winner Party",
    total_ecv = "Total ECV",
    winning_candidate_nation_prop = "Winning Candidate",
    winner_party_nation_prop = "Winner Party",
    prop_ecv = "Total ECV"
  ) |>
  tab_spanner(
    label = "State-Wide Winner-Take-All",
    columns = c(winning_candidate_state_wide, winner_party_state_wide, total_ecv_state_wide)
  ) |>
  tab_spanner(
    label = "District-Wide Winner-Take-All + State-Wide At Large Votes",
    columns = c(winning_candidate_district_wide , winner_party_district_wide, total_ecv_district_wide)
  ) |>
  tab_spanner(
    label = "State-Wide Proportional",
    columns = c(winning_candidate, winner_party, total_ecv)
  ) |>
  tab_spanner(
    label = "Nation-Wide Proportional",
    columns = c(winning_candidate_nation_prop, winner_party_nation_prop, prop_ecv)
  ) |>
  tab_style(
    style = list(cell_text(weight = "bold")),
    locations = cells_column_labels(columns = 1:13)
  )
Comparison of Winning Candidates and Parties by ECV Allocation System
Year
State-Wide Winner-Take-All
District-Wide Winner-Take-All + State-Wide At Large Votes
State-Wide Proportional
Nation-Wide Proportional
Winning Candidate Winner Party Total ECV Winning Candidate Winner Party Total ECV Winning Candidate Winner Party Total ECV Winning Candidate Winner Party Total ECV
1976 CARTER, JIMMY DEMOCRAT 294 CARTER, JIMMY DEMOCRAT 362 CARTER, JIMMY DEMOCRAT 270 CARTER, JIMMY DEMOCRAT 267
1980 REAGAN, RONALD REPUBLICAN 448 REAGAN, RONALD REPUBLICAN 287 REAGAN, RONALD REPUBLICAN 281 REAGAN, RONALD REPUBLICAN 270
1984 REAGAN, RONALD REPUBLICAN 525 REAGAN, RONALD REPUBLICAN 283 REAGAN, RONALD REPUBLICAN 321 REAGAN, RONALD REPUBLICAN 313
1988 BUSH, GEORGE H.W. REPUBLICAN 426 DUKAKIS, MICHAEL DEMOCRAT 292 BUSH, GEORGE H.W. REPUBLICAN 291 BUSH, GEORGE H.W. REPUBLICAN 284
1992 CLINTON, BILL DEMOCRAT 367 CLINTON, BILL DEMOCRAT 329 CLINTON, BILL DEMOCRAT 226 CLINTON, BILL DEMOCRAT 229
1996 CLINTON, BILL DEMOCRAT 376 DOLE, ROBERT REPUBLICAN 283 CLINTON, BILL DEMOCRAT 262 CLINTON, BILL DEMOCRAT 263
2000 BUSH, GEORGE W. REPUBLICAN 271 BUSH, GEORGE W. REPUBLICAN 290 GORE, AL DEMOCRAT 263 GORE, AL DEMOCRAT 258
2004 BUSH, GEORGE W. REPUBLICAN 286 BUSH, GEORGE W. REPUBLICAN 299 BUSH, GEORGE W. REPUBLICAN 278 BUSH, GEORGE W. REPUBLICAN 271
2008 OBAMA, BARACK H. DEMOCRAT 361 OBAMA, BARACK H. DEMOCRAT 330 OBAMA, BARACK H. DEMOCRAT 285 OBAMA, BARACK H. DEMOCRAT 282
2012 OBAMA, BARACK H. DEMOCRAT 329 ROMNEY, MITT REPUBLICAN 286 OBAMA, BARACK H. DEMOCRAT 271 OBAMA, BARACK H. DEMOCRAT 272
2016 TRUMP, DONALD J. REPUBLICAN 305 TRUMP, DONALD J. REPUBLICAN 313 CLINTON, HILLARY DEMOCRAT 265 CLINTON, HILLARY DEMOCRAT 257
2020 BIDEN, JOSEPH R. JR DEMOCRAT 306 BIDEN, JOSEPH R. JR DEMOCRAT 275 BIDEN, JOSEPH R. JR DEMOCRAT 273 BIDEN, JOSEPH R. JR DEMOCRAT 276
View Code
winners_all <- bind_rows( # bind different allocation methods together to plot in a facet
  state_wide_winner_only |> 
    mutate(allocation_method = "State-Wide Winner-Take-All"),
  district_wide_winner_only |> 
    mutate(allocation_method = "District-Wide Winner-Take-All + State-Wide At Large Votes"),
  state_prop_winner_only |>
    mutate(allocation_method = "State-Wide Proportional"),
  nation_prop_winner_only |>
    mutate(allocation_method = "Nation-Wide Proportional") |>
    rename(total_ecv = prop_ecv)
)
ggplot(winners_all, aes(x = year, y = total_ecv, fill = winner_party)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(
    aes(label = total_ecv, y = total_ecv + 20), # label with ECV and offset labels above bars
    position = position_dodge(width = 0.8), color = "black", size = 3  # 
  ) +
  facet_wrap(~allocation_method, scales = "free_y") + # Facet by allocation method: one panel per method
  scale_fill_manual(
    values = c("REPUBLICAN" = "red", "DEMOCRAT" = "blue") )+
  labs(
    title = "Electoral College Votes by Method and Year",
    x = "Year",
    y = "Total ECV",
    fill = "Party"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 90, hjust = 1),
    legend.position = "bottom"
  )

View Code
if (!require("gganimate")) install.packages("gganimate")
library(gganimate)

ggplot(winners_all, aes(x = year, y = total_ecv, fill = winner_party)) +
  geom_bar(stat = "identity", position = "dodge") +  # Create bar plot with dodged bars for each party
  geom_text(
    aes(label = total_ecv, y = total_ecv + 20),  # Label ECV values with an offset to appear above bars
    position = position_dodge(width = 0.8), color = "black", size = 3  # Text properties
  ) +
  scale_fill_manual(
    values = c("REPUBLICAN" = "red", "DEMOCRAT" = "blue")  # Party colors
  ) +
  labs(
    title = "Electoral College Votes by Method and Year: {current_frame}",
    x = "Year",
    y = "Total ECV",
    fill = "Party"
  ) +
  theme_minimal() + 
  theme(axis.text.x = element_text(angle = 90, hjust = 1),  # Rotate x-axis labels for better readability
    legend.position = "bottom")  +  # Position the legend at the bottom 
  transition_manual(allocation_method) + # Animate by allocation_method
  enter_fade() +  # Fade in new bars for the new allocation method
  exit_fade()

The State-Wide Proportional and National Proportional allocation systems appear to be the fairest in terms of reflecting the popular vote. These systems ensure that each vote contributes to the outcome, reducing the disproportionate influence of small states and the swing-state effect. The State-Wide Winner-Take-All and District-Wide Winner-Take-All systems, on the other hand, can create a bias toward smaller states and swing states, which can then have a disproportionate impact on the outcome of the election.

What Does “Fairness” Mean?

When I think about fairness in the context of ECV allocation, it generally relates to how well the system:

  1. Represents the popular vote: Does the distribution of ECVs accurately reflect the number of votes cast for each candidate? A system that over-represents or under-represents certain groups could be considered unfair.

  2. Accounts for each state: In systems like the current winner-take-all method, small states with fewer voters might have a disproportionately large influence in electing a president compared to larger states. In contrast, a proportional system might mitigate this imbalance.

  3. Minimizes “winner-take-all” advantages: A system where a candidate wins by just a small margin but takes all of a state’s ECVs could be seen as unfair because the losing candidate might have had broad support across the state, but doesn’t get any representation.
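
One way to put a number on the first criterion is to compare each candidate's share of ECVs with their share of the popular vote. The sketch below does this for a hypothetical two-candidate election; the figures in `toy_result` are invented, and the summary measure (half the sum of absolute gaps, often called the Loosemore-Hanby index) is a swapped-in illustration rather than a metric used elsewhere in this analysis.

```r
library(dplyr)

# Hypothetical national result for one election (numbers invented for illustration)
toy_result <- tibble(
  candidate = c("Candidate A", "Candidate B"),
  popular_votes = c(51e6, 49e6),
  ecv = c(250, 288)
)

toy_gap <- toy_result |>
  mutate(
    vote_share = popular_votes / sum(popular_votes),
    ecv_share = ecv / sum(ecv),
    gap = ecv_share - vote_share  # positive means over-represented in ECVs
  )

# Half the sum of absolute gaps: 0 means ECV shares match vote shares exactly
disproportionality <- sum(abs(toy_gap$gap)) / 2
disproportionality
```

In this toy election Candidate B is over-represented: B holds the larger share of ECVs despite the smaller share of votes. Computing the same measure for each allocation method and year would give one rough basis for comparing them.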

Truthfulness Score

Claim: The claim under evaluation is: “The Electoral College system is biased and over-represents smaller states, giving them an unfair advantage in electing the president.” I use a 5-point scale, ranging from 1 to 5, where:

  1. False – The claim is completely inaccurate, with no evidence to support it.
  2. Mostly False – The claim contains significant inaccuracies or overgeneralizations, with some misleading aspects.
  3. Half-True – The claim is partially accurate but misses key details or context that would provide a more complete picture.
  4. Mostly True – The claim is largely accurate, with very minor misstatements or nuances that don’t change the overall truth.
  5. True – The claim is completely accurate, with no significant errors or misleading information.

Score: 4. Under the State-Wide Winner-Take-All system, smaller states do have a disproportionately large impact, because every state receives two ECVs for its Senate seats regardless of population. This benefits candidates who win a disproportionate share of votes in less populous states.
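
The effect of the two Senate-based ECVs can be sketched with two hypothetical states. The populations and House seat counts in `toy_states` are invented for illustration; only the rule that a state's ECVs equal its House seats plus two comes from the Electoral College itself.

```r
library(dplyr)

# Two hypothetical states (populations and House seats invented for illustration)
toy_states <- tibble(
  state = c("Small", "Large"),
  population = c(600000, 30000000),
  house_seats = c(1, 40)
) |>
  mutate(
    ecv = house_seats + 2,                       # House seats plus two Senate seats
    ecv_per_million = ecv / (population / 1e6)   # ECVs per million residents
  )

toy_states
```

Here the small state has about 5 ECVs per million residents against 1.4 for the large state, which is the kind of imbalance the claim describes.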

Pew Research Center: In September 2024, Pew published the article “Majority of Americans Continue to Favor Moving Away from Electoral College.” Following the 2000 and 2016 elections, in which the winners of the popular vote received fewer Electoral College votes than their opponents, Pew surveyed 9,720 U.S. adults.