Do Proportional Electoral College Allocations Yield a More Representative Presidency?
Author
Cindy Li
Introduction
The U.S. Electoral College (EC) shapes presidential elections by design, making the path to victory more complex than a simple nationwide popular vote. The system is debated most intensely when its result diverges from the popular vote. The primary goal of this analysis is to assess how different allocation schemes affect election outcomes and whether any of them is biased in favor of one political party.
Data Ingestion
Data I: Election Data
Data Source: MIT Election Data Science Lab datasets. From the MIT Election Data Science Lab, we retrieve two data sets. The first contains votes from all biennial congressional races in all 50 states from 1976 to 2022. The second contains statewide presidential vote counts from 1976 to 2020. Both files must be downloaded manually from the source before running the code.
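library(tidyverse) # read_csv, dplyr verbs, ggplot2
library(sf)        # read_sf, st_simplify
library(DT)        # datatable
library(gt)        # gt tables

HOUSE <- read_csv("1976-2022-house.csv")
PRESIDENT <- read_csv("1976-2020-president.csv")

sample_n(HOUSE, 1000) |> DT::datatable()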
1976-2020 President Data
View Code
sample_n(PRESIDENT, 1000) |> DT::datatable()
Data II: Congressional Boundary Files 1976 to 2012
Data Source: Jeffrey B. Lewis, Brandon DeVine, and Lincoln Pitcher with Kenneth C. Martis. This source gives us the shapefiles for all US congressional districts from 1789 to 2012.
View Code
get_file <- function(fname) {
  BASE_URL <- "https://cdmaps.polisci.ucla.edu/shp/"
  fname_ext <- paste0(fname, ".zip")
  fname_ext1 <- paste0(fname, ".shp")
  fname_extunzip <- gsub("\\.zip$", "", fname_ext)
  subfolder <- "districtshapes" # Subfolder where the shapefile is located

  # Download the archive only if it is not already on disk
  if (!file.exists(fname_ext)) {
    FILE_URL <- paste0(BASE_URL, fname_ext)
    download.file(FILE_URL, destfile = fname_ext)
  }

  # Unzip the contents and save unzipped content
  unzip(zipfile = fname_ext, exdir = fname_extunzip)

  # Define File Path
  shapefile_path <- file.path(fname_extunzip, subfolder, fname_ext1)

  # Read the shapefile
  read_sf(shapefile_path)
}

# Download files by iterating through
start_congress <- 95
end_congress <- 114
for (i in start_congress:end_congress) {
  district_name <- sprintf("districts%03d", i) # Formats as districts095, districts096, etc.
  district_data <- get_file(district_name) # Download and read the shapefile
  assign(district_name, district_data, envir = .GlobalEnv) # Assign the data frame to a variable in the global environment
}
Data III: Congressional Boundary Files 2014 to Present
Data Source: US Census Bureau. This data source provides district boundaries for more recent congressional elections.
View Code
get_congress_file <- function(fname, year) {
  BASE_URL <- sprintf("https://www2.census.gov/geo/tiger/TIGER%d/CD/", year) # replace %d with year
  fname_ext <- paste0(fname, ".zip")
  fname_ext1 <- paste0(fname, ".shp")
  fname_extunzip <- gsub("\\.zip$", "", fname_ext)

  # Download File
  if (!file.exists(fname_ext)) {
    FILE_URL <- paste0(BASE_URL, fname_ext)
    download.file(FILE_URL, destfile = fname_ext)
  }

  # Unzip the contents and save unzipped content
  unzip(zipfile = fname_ext, exdir = fname_extunzip)

  # Define File Path
  shapefile_path <- file.path(fname_extunzip, fname_ext1)

  # Read the shapefile
  read_sf(shapefile_path)
}

# Download file for each district by iterating through each year
base_year <- 2022
base_congress <- 116 # Congress number used in the 2018-2022 file names
for (i in 0:10) { # i will range from 0 (2022) to 10 (2012)
  year <- base_year - i
  if (year >= 2018) {
    congress <- 116
  } else if (year >= 2016) {
    congress <- 115
  } else if (year >= 2014) {
    congress <- 114
  } else if (year == 2013) {
    congress <- 113
  } else if (year == 2012) {
    congress <- 112
  }
  district_name <- sprintf("tl_%d_us_cd%d", year, congress)
  district_data <- get_congress_file(district_name, year) # Download and read the shapefile
  assign(district_name, district_data, envir = .GlobalEnv) # Assign the data frame to a variable in the global environment
}
Exploration
1. Which states have gained and lost the most seats in the US House of Representatives between 1976 and 2022?
View Code
# Count the number of districts (aka seats) per state for each year
gains_losses <- HOUSE |>
  group_by(state, year) |>
  summarise(num_districts = n_distinct(district)) |>
  arrange(state, year) |>
  # Calculate seat changes for each state
  group_by(state) |>
  summarise(
    first_year_seats = first(num_districts),
    last_year_seats = last(num_districts),
    seat_change = last_year_seats - first_year_seats
  ) |>
  filter(seat_change != 0) |>
  arrange(desc(seat_change))

# Plot the seat changes
ggplot(gains_losses, aes(x = reorder(state, seat_change), y = seat_change, fill = seat_change > 0)) +
  geom_bar(stat = "identity") +
  coord_flip() +
  scale_fill_manual(values = c("red", "blue"), labels = c("Loss", "Gain")) +
  labs(
    title = "Gains and Losses of House Seats (1976 to 2022)",
    x = "State",
    y = "Change in Seats",
    fill = "Change"
  ) +
  theme_minimal()
2. New York State has a unique “fusion” voting system where one candidate can appear on multiple “lines” on the ballot and their vote counts are totaled. Are there any elections in our data where the outcome would have been different if the “fusion” system had not been used and candidates had received only the votes from their “major party line” (Democrat or Republican) rather than their total across all lines?
View Code
# Summarize the votes for each candidate, total votes vs major party votes
fusion_summary <- HOUSE |>
  filter(!is.na(candidate)) |>
  group_by(year, state, state_po, district, candidate, fusion_ticket) |>
  summarise(
    total_candidate_votes = sum(candidatevotes, na.rm = TRUE), # Total votes across all party lines
    # Major party votes (only votes from Democrat and Republican)
    major_party_votes = sum(if_else(party %in% c("DEMOCRAT", "REPUBLICAN"), candidatevotes, 0), na.rm = TRUE),
    .groups = "drop"
  ) |>
  select(year, state, state_po, district, candidate, total_candidate_votes, major_party_votes, fusion_ticket) |>
  arrange(state, district, candidate)

# Check if there would have been a different outcome without fusion voting
fusion_outcome_changes <- fusion_summary |>
  filter(fusion_ticket == TRUE) |> # out of the times when fusion voting was used
  group_by(year, state, state_po, district) |>
  summarise(
    # Find the winner based on total votes
    winner_with_fusion = candidate[which.max(total_candidate_votes)],
    winner_without_fusion = candidate[which.max(major_party_votes)],
    total_votes_winner = max(total_candidate_votes),
    major_party_votes_winner = max(major_party_votes),
    .groups = "drop"
  ) |>
  # Ensure that major party votes winner is not zero and handle if no major party candidate ran
  mutate(
    major_party_votes_winner = ifelse(major_party_votes_winner == 0, NA, major_party_votes_winner),
    # Check if the winners are the same or different based on fusion voting
    outcome_change = ifelse(winner_with_fusion != winner_without_fusion, "Yes", "No")
  ) |>
  arrange(year, state, district)

# Plot directly from fusion_outcome_changes without creating a summary
ggplot(fusion_outcome_changes, aes(x = outcome_change, fill = outcome_change)) +
  geom_bar(show.legend = FALSE) + # Create the bar plot
  labs(
    title = "Impact of Fusion Voting on Election Outcomes",
    x = "Outcome Change",
    y = "Number of Elections"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank()
  )
3. Do presidential candidates tend to run ahead of or run behind congressional candidates in the same state? That is, does a Democratic candidate for president tend to get more votes in a given state than all Democratic congressional candidates in the same state? Are any presidents particularly more or less popular than their co-partisans?
Let’s take a glimpse at just the year 2020.
View Code
# Summarize presidential votes for Democrat and Republican candidates
presidential_votes <- PRESIDENT |>
  filter(party_simplified %in% c("DEMOCRAT", "REPUBLICAN")) |>
  group_by(year, state, party_simplified) |>
  summarise(
    presidential_total_votes = sum(candidatevotes, na.rm = TRUE),
    .groups = "drop"
  )

# Summarize congressional votes for Democrat and Republican candidates
congressional_votes <- HOUSE |>
  filter(party %in% c("DEMOCRAT", "REPUBLICAN")) |>
  group_by(year, state, party) |>
  summarise(
    congressional_total_votes = sum(candidatevotes, na.rm = TRUE),
    .groups = "drop"
  )

# Join the two datasets to compare presidential and congressional votes
vote_comparison <- left_join(
  presidential_votes, congressional_votes,
  by = c("year", "state", "party_simplified" = "party")
) |>
  mutate(
    vote_difference = presidential_total_votes - congressional_total_votes,
    run_ahead = if_else(vote_difference > 0, "Presidential Ahead", "Presidential Behind")
  ) |>
  arrange(run_ahead, vote_difference)
Does this trend differ over time? Does it differ across states or across parties?
View Code
# Plot the vote difference between presidential and congressional candidates frequencies by year
ggplot(vote_comparison, aes(x = vote_difference, fill = run_ahead)) +
  geom_histogram(binwidth = 100000, position = "identity", alpha = 0.7) +
  facet_wrap(~ year, ncol = 3) +
  labs(
    title = "Presidential vs Congressional Vote Difference",
    x = "Vote Difference (Presidential - Congressional)",
    y = "Frequency",
    fill = "Vote Comparison"
  ) +
  theme_minimal() +
  theme(axis.text.x = element_text(size = 8, angle = 45, hjust = 1)) # Adjust x-axis tick labels
View Code
# Calculate the average vote difference for each president (across all states and years)
presidential_comparison <- vote_comparison |>
  group_by(year, state, party_simplified) |>
  summarise(
    average_vote_difference = mean(vote_difference, na.rm = TRUE),
    .groups = "drop"
  )

# group by party_simplified and year to get the overall average for each party-year
president_ranking <- presidential_comparison |>
  group_by(party_simplified, year) |>
  summarise(
    average_vote_difference = mean(average_vote_difference, na.rm = TRUE),
    .groups = "drop"
  ) |>
  arrange(desc(average_vote_difference))

# Create a plot to visualize presidential popularity vs. co-partisan congressional candidates
ggplot(president_ranking, aes(x = reorder(party_simplified, average_vote_difference), y = average_vote_difference, fill = party_simplified)) +
  geom_bar(stat = "identity", show.legend = FALSE) +
  coord_flip() +
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red")) +
  labs(
    title = "Presidential Popularity vs. Congressional Co-partisans",
    x = "President",
    y = "Average Vote Difference (Presidential - Congressional)",
    subtitle = "Higher values indicate greater presidential popularity relative to congressional candidates"
  ) +
  theme_minimal()

# Line plot with trend lines for each party
ggplot(president_ranking, aes(x = year, y = average_vote_difference, color = party_simplified, group = party_simplified)) +
  geom_line(size = 1) +
  geom_point(size = 3) + # Adds points at each year
  scale_color_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red")) +
  labs(
    title = "Presidential Popularity vs. Congressional Co-partisans Over Time",
    x = "Year",
    y = "Average Vote Difference (Presidential - Congressional)",
    subtitle = "Line plot showing trends for each party"
  ) +
  theme_minimal()
Maps & Shapefiles
Choropleth Visualization of the 2000 Presidential Election Electoral College Results
Filter Election Data for 2000: To create a map of the results broken down by states, we will need to find the election results of each state. The first step involves filtering the election dataset PRESIDENT to get the results for the year 2000. We specifically focus on the U.S. Presidential election and filter for the two main candidates, George W. Bush and Al Gore. We then calculate the winner for each state based on who received the most votes and assign the appropriate party.
View Code
election_2000 <- PRESIDENT |>
  filter(year == 2000, office == "US PRESIDENT") |> # filter for 2000 and president office
  filter(candidate %in% c("BUSH, GEORGE W.", "GORE, AL")) |> # filter for Bush and Gore
  group_by(state) |>
  summarise(
    # Winner based on the candidate with the most votes
    winner = if_else(
      sum(candidatevotes[candidate == "BUSH, GEORGE W."]) > sum(candidatevotes[candidate == "GORE, AL"]),
      "Bush", "Gore"
    ),
    # Party based on the candidate
    winner_party = case_when(
      winner == "Bush" ~ "Republican",
      winner == "Gore" ~ "Democrat"
    )
  ) |>
  ungroup()
Join Election Data with Shapefiles: The next step is to join the election results with the geographical shapefile data. This step ensures that we can visualize the election results on a map by linking the state names in both datasets. The shapefile data is modified to ensure the state names are in uppercase to match the election data. After merging the data, we create a choropleth map of the contiguous U.S. states. We use geom_sf() to plot the states and color them based on the winning party (Republican or Democrat). The map is then customized to remove axis labels and grid lines for a clean visualization.
View Code
# join with shapefile
districts106$STATENAME <- toupper(districts106$STATENAME) # uppercase state name to match
dis_election_2000 <- left_join(districts106, election_2000, by = c("STATENAME" = "state"), relationship = "many-to-many")

main_us <- dis_election_2000 |>
  filter(!STATENAME %in% c("ALASKA", "HAWAII"))

ggplot(main_us, aes(geometry = geometry, fill = winner_party)) +
  geom_sf() +
  scale_fill_manual(values = c("Republican" = "red", "Democrat" = "blue")) +
  theme_minimal() +
  labs(
    title = "U.S. Presidential Election Results by State in 2000",
    fill = "Winning Party"
  ) +
  theme(
    axis.text = element_blank(),
    axis.ticks = element_blank(),
    panel.grid = element_blank()
  )
Add Insets for Alaska and Hawaii: Because Alaska and Hawaii are geographically distant from the mainland U.S., we create insets for these two states. The data for Alaska and Hawaii is filtered separately, and individual maps are created for each. These insets are then added to the main U.S. map.
Choropleth Visualization of Electoral College Results Over Time
Data Preparation: First, we need to clean the data so that the datasets join properly. We convert the shapefiles to the same CRS, then add a STATENAME column based on STATEFP and change the STATENAME values to uppercase to match the election data.
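The code for this preparation step is not shown, so here is a minimal sketch of the STATEFP-to-STATENAME part only, under stated assumptions. The names fips_lookup and census_districts are hypothetical, the lookup lists just three FIPS codes for illustration, and a plain data frame stands in for the shapefile's attribute table. The CRS alignment is not reproduced here.

```r
# Hypothetical lookup table: only three state FIPS codes, for illustration.
fips_lookup <- data.frame(
  STATEFP = c("01", "02", "36"),
  STATENAME = c("Alabama", "Alaska", "New York"),
  stringsAsFactors = FALSE
)

# Stand-in for the attribute table of a Census shapefile (geometry omitted).
census_districts <- data.frame(
  STATEFP = c("36", "01", "36", "02"),
  district = c("01", "02", "12", "00"),
  stringsAsFactors = FALSE
)

# match() keeps the original row order while looking up each FIPS code.
idx <- match(census_districts$STATEFP, fips_lookup$STATEFP)

# Uppercase the names so they match the election data's state column.
census_districts$STATENAME <- toupper(fips_lookup$STATENAME[idx])

print(census_districts)
```

A code missing from the lookup would produce NA in STATENAME, which makes unmatched rows easy to spot before the join.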
Creating a Systematic Election Data Function for Visualization: In this section, I have created a function that systematically processes U.S. Presidential election data for each election year. The function takes as input the election year and the corresponding shapefile data and returns a prepared dataset. This allows for easy handling of election data from multiple years, and it can be used to visualize and analyze the results for any given year.
The create_election_data function takes two arguments:
- election_year: the specific year of the presidential election (e.g., 2000, 2004, etc.).
- shapefile_data: the shapefile containing the geographical data for that election year.
It returns the merged dataset (dis_election), which includes both the election results and the shapefile data. The per-year results are then bound together and simplified into all_election_simplified.
View Code
# Function to create election data
create_election_data <- function(election_year, shapefile_data) {
  # Step 1: Filter for the specific year and the simplified party
  election_data <- PRESIDENT |>
    filter(year == election_year, office == "US PRESIDENT") |> # Filter for the specific year and presidential election
    filter(party_simplified %in% c("DEMOCRAT", "REPUBLICAN")) |>
    group_by(state, state_fips, year) |> # Group by state and year
    summarise(
      winner_party = if_else(
        sum(candidatevotes[party_simplified == "DEMOCRAT"]) > sum(candidatevotes[party_simplified == "REPUBLICAN"]),
        "DEMOCRAT", "REPUBLICAN"
      )
    ) |>
    ungroup() |>
    filter(!is.na(winner_party))

  # Step 2: Join with the shapefile data
  dis_election <- left_join(shapefile_data, election_data, by = c("STATENAME" = "state"), relationship = "many-to-many")
  # dis_election$year <- year # add year column
  return(dis_election)
}

# bind election data for each year into one file
all_election_data <- bind_rows(
  election_data_2020 <- create_election_data(2020, tl_2020_us_cd116),
  election_data_2016 <- create_election_data(2016, tl_2016_us_cd115),
  election_data_2012 <- create_election_data(2012, districts112),
  election_data_2008 <- create_election_data(2008, districts111),
  election_data_2004 <- create_election_data(2004, districts108),
  election_data_2000 <- create_election_data(2000, districts106),
  election_data_1996 <- create_election_data(1996, districts103),
  election_data_1992 <- create_election_data(1992, districts102),
  election_data_1988 <- create_election_data(1988, districts101),
  election_data_1984 <- create_election_data(1984, districts098),
  election_data_1980 <- create_election_data(1980, districts097),
  election_data_1976 <- create_election_data(1976, districts095)
)

# simplify map data
sf::sf_use_s2(FALSE)
all_election_simplified <- st_simplify(all_election_data, dTolerance = 0.01)
Creating the Election Results Map: With the combined and simplified election data, we can now create a series of maps to visualize the election results for each year. The code below creates a map of the contiguous U.S. (excluding Alaska and Hawaii).
View Code
all_alaska <- all_election_simplified |>
  filter(STATENAME == "ALASKA")
all_hawaii <- all_election_simplified |>
  filter(STATENAME == "HAWAII")
all_main_us <- all_election_simplified |>
  filter(!STATENAME %in% c("ALASKA", "HAWAII"), !is.na(winner_party))

# Step 3: Main map for the contiguous U.S.
all_map_us <- ggplot(all_main_us, aes(geometry = geometry, fill = winner_party)) +
  geom_sf() +
  scale_fill_manual(values = c("REPUBLICAN" = "red", "DEMOCRAT" = "blue")) +
  theme_minimal() +
  labs(
    title = "U.S. Presidential Election Results by State and Year",
    fill = "Winning Party"
  ) +
  theme_void() +
  facet_wrap(~ year, ncol = 3)

print(all_map_us)
Comparing the Effects of ECV Allocation Rules
There are different methods for distributing electoral college votes (ECVs) among candidates in U.S. presidential elections. We want to see whether the rules for distributing ECVs can significantly influence the outcome of an election. Before exploring each allocation scheme, we find the ECVs per state using the House data:
* Each district has one House representative.
* Each state gets R + 2 ECVs, where R is its number of House representatives and the 2 corresponds to its senators.
View Code
# count number of House Representatives using count of unique districts grouped by year and state
ECV <- HOUSE |>
  group_by(state, year) |> # Group by state and year
  summarise(
    house_reps = n_distinct(district), # Count unique districts (House representatives)
    ecv = house_reps + 2, # get ECV by adding 2
    .groups = "drop"
  )
State-Wide Winner-Take-All
In this system, the candidate who wins the most votes in a state receives all of that state’s Electoral College votes, regardless of the margin of victory. In most states (except Nebraska and Maine), if Candidate A wins 51% of the vote in a state, they will receive all of that state’s Electoral Votes, even if Candidate B got 49% of the vote. Each state has a certain number of electoral votes (ECVs), based on its representation in Congress (Senators + House Representatives). Under this system, only the winner of the popular vote in the state gets those votes.
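The winner-take-all rule above can be sketched on toy data. This is an illustration under stated assumptions, not the code behind the document's own results: the data frame toy_votes, its two states, and its vote counts are all hypothetical, and base R is used so the example runs without extra packages.

```r
# Hypothetical two-state election; ecv is repeated on each row of a state.
toy_votes <- data.frame(
  state = c("A", "A", "B", "B"),
  candidate = c("X", "Y", "X", "Y"),
  candidatevotes = c(510, 490, 300, 700),
  ecv = c(10, 10, 4, 4),
  stringsAsFactors = FALSE
)

# Within each state, keep only the row of the candidate with the most votes.
state_winners <- do.call(
  rbind,
  lapply(split(toy_votes, toy_votes$state), function(d) {
    d[which.max(d$candidatevotes), ]
  })
)

# The state winner receives every ECV of that state; sum them per candidate.
wta_totals <- aggregate(ecv ~ candidate, data = state_winners, FUN = sum)
print(wta_totals)
```

Candidate X carries state A by a narrow 51% to 49% margin and still collects all 10 of its ECVs, which is the effect the paragraph describes.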
District-Wide Winner-Take-All + State-Wide At Large Votes
This method allocates R ECVs to the popular vote winners of the individual congressional districts and the remaining 2 ECVs to the state-wide popular vote winner. This hybrid system is used in Maine and Nebraska. In each congressional district, the candidate who wins the popular vote gets one electoral vote. Then, the state as a whole gives two additional “at-large” ECVs to the candidate who wins the overall state-wide popular vote. For example, Nebraska has three districts and five ECVs. If Candidate A wins two of the districts, and Candidate B wins the third district and the statewide popular vote, the electoral votes are split like this:
- Candidate A: 2 ECVs from the districts.
- Candidate B: 1 ECV from the district plus 2 ECVs for winning the state-wide vote, 3 in total.
This system allows for a split in how ECVs are allocated, unlike the traditional winner-take-all system where the candidate winning the state by a narrow margin would still receive all the state’s votes.
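To make the district-plus-at-large split concrete, here is a small sketch for a hypothetical three-district state with five ECVs. The vote counts and the names district_votes and hybrid_ecv are assumptions made up for illustration. Unlike the document's analysis, which infers district winners from House results, this toy example uses presidential votes per district directly.

```r
# Hypothetical three-district state: one row per district and candidate.
district_votes <- data.frame(
  district = c(1, 1, 2, 2, 3, 3),
  candidate = rep(c("A", "B"), times = 3),
  votes = c(60, 40, 55, 45, 10, 90),
  stringsAsFactors = FALSE
)

# One ECV per district goes to the candidate with the most votes there.
district_winner <- sapply(
  split(district_votes, district_votes$district),
  function(d) d$candidate[which.max(d$votes)]
)
district_ecv <- table(factor(district_winner, levels = c("A", "B")))

# Two at-large ECVs go to the state-wide popular vote winner.
statewide_totals <- tapply(district_votes$votes, district_votes$candidate, sum)
statewide_winner <- names(which.max(statewide_totals))

hybrid_ecv <- as.integer(district_ecv)
names(hybrid_ecv) <- names(district_ecv)
hybrid_ecv[statewide_winner] <- hybrid_ecv[statewide_winner] + 2L
print(hybrid_ecv)
```

Candidate A wins two districts narrowly, but B's large margin in the third district carries the state-wide vote, so A ends with 2 ECVs and B with 3.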
View Code
# look at statewide winner - assign 2 ecv
state_wide_winner <- PRESIDENT |>
  group_by(state, year) |>
  mutate(statewide_winner = if_else(candidatevotes == max(candidatevotes), "Yes", "No")) |> # Mark statewide winner
  ungroup() |>
  # Assign ECV based on who won the state
  mutate(ECV = if_else(statewide_winner == "Yes", 2, 0)) |> # assign the 2 ECV if statewide winner, else 0 ECV
  select(state, year, candidate, candidatevotes, ECV) |>
  filter(!is.na(candidate))

# look at winner of district - assign 1 ecv per district
# Assume that the presidential candidate of the same party as the congressional representative wins that election.
# Find the winner of each district in the HOUSE dataset
district_winners <- HOUSE |>
  filter(year %in% seq(1976, 2020, by = 4)) |> # presidential election years only
  group_by(state, year, district) |>
  filter(candidatevotes == max(candidatevotes)) |>
  ungroup() |>
  mutate(ecv = 1) # Assign 1 ECV for each winning district

# Join the district winners with the PRESIDENT dataset to match the party
ecv_assignment <- district_winners |>
  left_join(PRESIDENT, by = c("state", "year", "party" = "party_simplified"), relationship = "many-to-many") |>
  mutate(ecv_presidential = 1) |>
  select(state, year, district, candidate.y, party, ecv_presidential)

# Find total ecv from districts
district_ecv_summary <- ecv_assignment |>
  group_by(state, year, candidate.y, party) |>
  summarise(district_total_ecv = sum(ecv_presidential), .groups = "drop")

# Join the district-level ECV summary with the statewide ECVs
ecv_combined <- state_wide_winner |>
  left_join(district_ecv_summary, by = c("state", "year", "candidate" = "candidate.y")) |>
  # Add the statewide ECV to the district-level ECVs
  mutate(total_ecv = district_total_ecv + ECV) |>
  filter(!is.na(total_ecv))

ecv_combined <- ecv_combined |>
  group_by(year, candidate, party) |>
  summarise(total_ecv = sum(total_ecv), .groups = "drop") |> # total ecv
  arrange(year, total_ecv) |>
  group_by(year) |>
  mutate(winner = if_else(total_ecv == max(total_ecv), "Yes", "No")) |> # Mark winner
  ungroup()

ecv_combined |>
  gt() |>
  tab_header(title = "District-Wide Winner-Take-All + State-Wide At Large Votes") |>
  cols_label( # display column names
    year = "Year",
    candidate = "Candidate",
    party = "Party",
    total_ecv = "Electoral Votes",
    winner = "Winning Candidate"
  )
District-Wide Winner-Take-All + State-Wide At Large Votes
| Year | Candidate | Party | Electoral Votes | Winning Candidate |
|---|---|---|---|---|
| 1976 | FORD, GERALD | REPUBLICAN | 204 | No |
| 1976 | CARTER, JIMMY | DEMOCRAT | 362 | Yes |
| 1980 | CARTER, JIMMY | DEMOCRAT | 258 | No |
| 1980 | REAGAN, RONALD | REPUBLICAN | 287 | Yes |
| 1984 | MONDALE, WALTER | DEMOCRAT | 276 | No |
| 1984 | REAGAN, RONALD | REPUBLICAN | 283 | Yes |
| 1988 | BUSH, GEORGE H.W. | REPUBLICAN | 262 | No |
| 1988 | DUKAKIS, MICHAEL | DEMOCRAT | 292 | Yes |
| 1992 | BUSH, GEORGE H.W. | REPUBLICAN | 228 | No |
| 1992 | CLINTON, BILL | DEMOCRAT | 329 | Yes |
| 1996 | CLINTON, BILL | DEMOCRAT | 282 | No |
| 1996 | DOLE, ROBERT | REPUBLICAN | 283 | Yes |
| 2000 | GORE, AL | DEMOCRAT | 280 | No |
| 2000 | BUSH, GEORGE W. | REPUBLICAN | 290 | Yes |
| 2004 | OTHER | DEMOCRAT | 12 | No |
| 2004 | KERRY, JOHN | DEMOCRAT | 248 | No |
| 2004 | BUSH, GEORGE W. | REPUBLICAN | 299 | Yes |
| 2008 | MCCAIN, JOHN | REPUBLICAN | 224 | No |
| 2008 | OBAMA, BARACK H. | DEMOCRAT | 330 | Yes |
| 2012 | OBAMA, BARACK H. | DEMOCRAT | 275 | No |
| 2012 | ROMNEY, MITT | REPUBLICAN | 286 | Yes |
| 2016 | CLINTON, HILLARY | DEMOCRAT | 291 | No |
| 2016 | TRUMP, DONALD J. | REPUBLICAN | 313 | Yes |
| 2020 | TRUMP, DONALD J. | REPUBLICAN | 263 | No |
| 2020 | BIDEN, JOSEPH R. JR | DEMOCRAT | 275 | Yes |
View Code
ggplot(ecv_combined, aes(x = factor(year), y = total_ecv, fill = party)) +
  geom_bar(stat = "identity", position = "dodge") + # keeps bars side-by-side
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red")) +
  theme_minimal() +
  labs(
    title = "Total ECV Votes for Each Candidate in U.S. Presidential Elections",
    x = "Year",
    y = "Total ECV",
    fill = "Party"
  )
State-Wide Proportional
Under this system, electoral votes are distributed proportionally based on the percentage of votes each candidate receives in the state. If a candidate wins 60% of the vote in a state with 10 electoral votes, they get 60% of those electoral votes (6 ECVs). The approach here involves calculating the total number of votes for each candidate in each state, and then determining the proportion of the state's total vote that each candidate received.
Note: Rounding in proportional allocation can leave some ECVs unallocated, because the rounded allocations may not sum to the total number of ECVs available for that state or for the entire country. Here, I allocate the remaining ECVs to the candidate with the greatest proportion of votes.
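The rounding step and the remainder rule can be checked on a few numbers. The helper allocate_proportional below is a hypothetical name introduced for this sketch. It mirrors the rule described in the note (round each share, then give any leftover to the candidate with the largest vote share), but it is not the pipeline used for the results.

```r
# Hypothetical helper: proportional allocation with a remainder rule.
allocate_proportional <- function(votes, ecv) {
  share <- votes / sum(votes)        # each candidate's vote share
  rounded <- round(share * ecv)      # rounded proportional ECVs
  remainder <- ecv - sum(rounded)    # ECVs lost (or over-counted) by rounding
  top <- which.max(share)            # candidate with the largest vote share
  rounded[top] <- rounded[top] + remainder
  rounded
}

# The example from the text: 60% of the vote in a 10-ECV state.
two_way <- allocate_proportional(c(A = 600, B = 400), ecv = 10)
print(two_way)

# A three-candidate race in a 3-ECV state, where rounding loses one ECV.
three_way <- allocate_proportional(c(A = 480, B = 470, C = 50), ecv = 3)
print(three_way)
```

In the second case the raw allocations are 1.44, 1.41 and 0.15, which round to 1, 1 and 0. The one missing ECV goes to A, so the state still hands out all 3 ECVs. The same adjustment works in the other direction when rounding would allocate more ECVs than the state has.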
View Code
# Allocate ECVs based on that proportion with rounding
state_proportional_votes <- PRESIDENT |>
  group_by(state, year) |>
  mutate(vote_share = candidatevotes / sum(candidatevotes)) |> # Proportion of votes
  ungroup() |>
  left_join(ECV, by = c("state", "year")) |>
  mutate(proportional_ecv = round(vote_share * ecv)) # Round to allocate ECVs

# Summarize the total ECVs for each candidate by state and year
state_proportional_summary <- state_proportional_votes |>
  group_by(state, year, candidate, party_simplified) |>
  summarise(total_proportional_ecv = sum(proportional_ecv), .groups = "drop") |>
  arrange(state, year, total_proportional_ecv) |>
  group_by(state, year) |>
  # Mark the winner with the most ECVs in each state and year
  mutate(winner = if_else(total_proportional_ecv == max(total_proportional_ecv), "Yes", "No")) |>
  ungroup()

# When we use proportions and round, some ECV go unallocated
# Allocate ECVs proportionally, round to the nearest integer, then assign the remainder
state_wide_prop <- PRESIDENT |>
  group_by(state, year) |>
  mutate(vote_share = candidatevotes / sum(candidatevotes)) |> # Proportion of votes
  ungroup() |>
  left_join(ECV, by = c("state", "year")) |>
  mutate(
    prop_ecv = vote_share * ecv,
    round_prop_ecv = round(vote_share * ecv) # Round ECVs
  ) |>
  group_by(state, year) |>
  mutate(remaining_ecvs = ecv - sum(round_prop_ecv)) |> # Calculate how many ECVs are left to allocate
  ungroup() |>
  # assign remainder to the max unrounded proportion
  group_by(state, year) |>
  mutate(final_ecv = ifelse(vote_share == max(vote_share), round_prop_ecv + remaining_ecvs, round_prop_ecv)) |> # Allocate remaining ECVs to the candidate with max vote share
  ungroup() |>
  select(year, state, candidate, party_simplified, ecv, prop_ecv, round_prop_ecv, remaining_ecvs, final_ecv)

# Summarize the total allocated ECVs for each candidate
state_wide_prop_summary <- state_wide_prop |>
  group_by(state, year, candidate, party_simplified) |>
  summarise(total_prop_ecv = sum(final_ecv), .groups = "drop") |>
  group_by(year, state) |>
  mutate(winner = if_else(total_prop_ecv == max(total_prop_ecv), "Yes", "No")) |>
  ungroup() |>
  filter(total_prop_ecv > 0) |>
  select(year, candidate, party_simplified, total_prop_ecv, winner)

# across states for the year
state_wide_totals <- state_wide_prop_summary |>
  group_by(year, candidate, party_simplified) |>
  summarise(total_ecv = sum(total_prop_ecv), .groups = "drop") |>
  group_by(year) |>
  mutate(winner = if_else(total_ecv == max(total_ecv), "Yes", "No")) |>
  ungroup() |>
  filter(total_ecv > 0) |>
  select(year, candidate, party_simplified, total_ecv, winner) |>
  arrange(year, desc(total_ecv))
View Code
ggplot(state_wide_totals, aes(x = factor(year), y = total_ecv, fill = party_simplified)) +
  geom_bar(stat = "identity", position = "dodge") + # keeps bars side-by-side
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red", "LIBERTARIAN" = "beige", "OTHER" = "gray")) +
  theme_minimal() +
  labs(
    title = "Total ECV Votes for Each Candidate in U.S. Presidential Elections",
    x = "Year",
    y = "Total ECV",
    fill = "Party"
  )
Nation-Wide Proportional
This system allocates ECVs based on the national popular vote, not state-by-state. Each candidate's share of the total ECVs equals their share of all votes cast nationwide. If Candidate A wins 60% of the total national popular vote and Candidate B wins 40%, Candidate A would receive 60% of the total ECVs, and Candidate B would get 40%, regardless of how they performed in any individual state. This system would reduce the importance of individual states and the swing state effect, and might make the election outcomes more directly tied to the national popular vote.
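As a quick arithmetic check of the 60% / 40% example, the sketch below applies the national shares to a pool of 538 ECVs. The vote counts are hypothetical, and 538 is the present-day national total rather than the value the document derives from the House data for each year.

```r
# Hypothetical national vote totals matching the 60% / 40% example.
national_votes <- c(A = 60e6, B = 40e6)
total_ecv <- 538 # assumption: present-day national total of ECVs

# Each candidate's ECVs are their national vote share times the pool, rounded.
national_share <- national_votes / sum(national_votes)
national_ecv <- round(national_share * total_ecv)
print(national_ecv)
```

The raw allocations are 322.8 and 215.2, which round to 323 and 215 and still add up to 538. With more candidates the rounded values will not always sum to the pool.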
View Code
# Find total ECV for each year
electoral_votes_available <- ECV |>
  group_by(year) |>
  summarize(total_ecv = sum(ecv)) # sum ecv

nation_wide_prop <- PRESIDENT |>
  select(year, state, candidate, candidatevotes, party_simplified) |>
  group_by(year, candidate, party_simplified) |>
  summarize(candidate_total = sum(candidatevotes)) |> # total votes nationwide per candidate per year
  group_by(year) |>
  mutate(nation_total = sum(candidate_total)) |> # total votes nationwide per year
  ungroup() |>
  mutate(prop_vote = (candidate_total / nation_total)) |> # proportion of candidate votes to nationwide votes
  select(-candidate_total, -nation_total) |>
  left_join(electoral_votes_available, join_by(year)) |> # join with ECV
  mutate(prop_ecv = round(prop_vote * total_ecv, digits = 0)) |> # multiply proportion to total ecv that year
  select(-prop_vote, -total_ecv) |>
  group_by(year)

# Summarize the total allocated ECVs for each candidate
nation_wide_summary <- nation_wide_prop |>
  group_by(year) |>
  mutate(winner = if_else(prop_ecv == max(prop_ecv), "Yes", "No")) |>
  ungroup() |>
  filter(prop_ecv > 0, !is.na(candidate)) |>
  select(year, candidate, prop_ecv, winner, party_simplified) |>
  arrange(year, desc(prop_ecv))
View Code
ggplot(nation_wide_summary, aes(x = factor(year), y = prop_ecv, fill = party_simplified)) +
  geom_bar(stat = "identity", position = "dodge") + # keeps bars side-by-side
  scale_fill_manual(values = c("DEMOCRAT" = "blue", "REPUBLICAN" = "red", "LIBERTARIAN" = "beige", "OTHER" = "gray")) +
  theme_minimal() +
  labs(
    title = "Total ECV Votes for Each Candidate in U.S. Presidential Elections",
    x = "Year",
    y = "Total ECV",
    fill = "Party"
  )
Fact Check Example: The 2000 U.S. Presidential Election
The 2000 U.S. presidential election between George W. Bush (Republican) and Al Gore (Democrat) provides a compelling case study of the Electoral College’s impact. Bush won the presidency despite losing the popular vote by approximately 500,000 votes. This resulted in widespread criticism of the electoral system.
State-Wide Winner-Take-All: Bush: 271 ECVs (Winner), Gore: 266 ECVs (Loser). Under this system Bush wins by narrowly carrying key battleground states, including Florida, despite Gore’s national popular vote lead.
District-Wide Winner-Take-All + State-Wide At-Large Votes: Bush: 290 ECVs (Winner), Gore: 280 ECVs (Loser), per the results table for this method above. This hybrid system gives Bush a slightly larger margin due to the district-level distribution of votes, which tends to favor Republicans in many of the congressional districts.
State-Wide Proportional: Gore: 263 ECVs (Winner), Bush: Loser. This system allocates ECVs proportionally based on each candidate's percentage of the popular vote in each state. Gore wins more ECVs because his share of the popular vote is larger, leading to a more direct representation of voter preferences.
Nation-Wide Proportional: Gore: 258 ECVs (Winner), Bush: Loser. In a national proportional system, Gore’s larger share of the national vote translates into a victory, highlighting the disparity between the Electoral College and the popular vote outcome.
In the 2000 election, the State-Wide Winner-Take-All system favored George W. Bush, despite Al Gore winning the popular vote. The Nation-Wide Proportional system would have resulted in a Gore victory, aligning the ECVs more closely with the popular vote. This highlights how winner-take-all methods can distort the will of the majority.
Winning Candidates by ECV Allocation Method (2000)
| Year | Allocation Method | Winning Candidate | Winner Party | Total ECV |
|---|---|---|---|---|
| 2000 | State-Wide Winner-Take-All | BUSH, GEORGE W. | REPUBLICAN | 271 |
| 2000 | District-Wide Winner-Take-All + State-Wide At Large Votes | BUSH, GEORGE W. | REPUBLICAN | 290 |
| 2000 | State-Wide Proportional | GORE, AL | DEMOCRAT | 263 |
| 2000 | Nation-Wide Proportional | GORE, AL | DEMOCRAT | 258 |
The proportional allocation methods also tend to assign a small number of ECVs to parties other than the Republicans and Democrats, such as Libertarian or Other, as seen below. Since those totals are small by comparison, I filter them out in the second plot below.
View Code
winners_2000 <- bind_rows( # bind different allocation methods together to plot in a facet
  state_wide_winner_take_all |>
    filter(year == 2000) |>
    mutate(allocation_method = "State-Wide Winner-Take-All"),
  ecv_combined |>
    filter(year == 2000) |>
    mutate(allocation_method = "District-Wide Winner-Take-All + State-Wide At Large Votes") |>
    rename(party_simplified = party),
  state_wide_totals |>
    filter(year == 2000) |>
    mutate(allocation_method = "State-Wide Proportional"),
  nation_wide_summary |>
    filter(year == 2000) |>
    mutate(allocation_method = "Nation-Wide Proportional") |>
    rename(total_ecv = prop_ecv)
)

ggplot(winners_2000, aes(x = party_simplified, y = total_ecv, fill = party_simplified)) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(
    aes(label = total_ecv), # label with ECV
    position = position_stack(vjust = 0.5),
    color = "black",
    size = 3
  ) +
  facet_wrap(~allocation_method, scales = "free_y") + # Facet by allocation method
  labs(
    title = "Total ECV Votes by Allocation Method (2000)",
    x = "Party",
    y = "Total ECV Votes",
    fill = "Party"
  ) +
  theme_minimal() +
  scale_fill_manual(values = c("REPUBLICAN" = "red", "DEMOCRAT" = "blue", "LIBERTARIAN" = "beige", "OTHER" = "gray")) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1), # Rotate party names for better visibility
    legend.position = "none" # Hide legend since it's already represented by colors
  )
View Code
# Keep only the two major parties
winners_2000_na <- winners_2000 |>
  filter(party_simplified %in% c("DEMOCRAT", "REPUBLICAN"))

ggplot(winners_2000_na, aes(x = party_simplified, y = total_ecv, fill = party_simplified)) +
  geom_bar(stat = "identity", position = "stack") +
  geom_text(
    aes(label = total_ecv), # Label each bar with its ECV
    position = position_stack(vjust = 0.5), color = "black", size = 3
  ) +
  facet_wrap(~allocation_method, scales = "free_y") + # Facet by allocation method
  labs(
    title = "Total ECV Votes by Allocation Method (2000)",
    x = "Party",
    y = "Total ECV Votes",
    fill = "Party"
  ) +
  theme_minimal() +
  scale_fill_manual(
    values = c("REPUBLICAN" = "red", "DEMOCRAT" = "blue")
  ) +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1), # Rotate party names for better visibility
    legend.position = "none" # Hide legend since it's already represented by colors
  )
Winner-Take-All systems: In this system, narrow victories in key swing states have an outsized impact, making it possible for a candidate to lose the popular vote but win the Electoral College.
Proportional systems: These systems provide a more accurate reflection of the national popular vote and reduce the disproportionate weight given to small states or swing states.
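To make the contrast concrete, here is a minimal base R sketch for a single hypothetical state. The vote totals, the 11 ECV, and the helper names `allocate_wta` and `allocate_prop` are all made up for illustration, and the largest-remainder rounding is an assumption on my part; the allocation code earlier in this analysis may round differently.

```r
# Hypothetical state: 11 ECV; the vote totals are invented for illustration.
votes <- c(REPUBLICAN = 510000, DEMOCRAT = 480000, OTHER = 10000)
ecv <- 11

# Winner-take-all: the plurality winner receives every ECV.
allocate_wta <- function(votes, ecv) {
  out <- setNames(rep(0, length(votes)), names(votes))
  out[which.max(votes)] <- ecv
  out
}

# Proportional allocation with largest-remainder rounding (assumed rule):
# give each party the floor of its quota, then hand out leftover ECV
# to the parties with the largest fractional remainders.
allocate_prop <- function(votes, ecv) {
  quota <- votes / sum(votes) * ecv
  out <- floor(quota)
  leftover <- ecv - sum(out)
  if (leftover > 0) {
    top <- order(quota - out, decreasing = TRUE)[seq_len(leftover)]
    out[top] <- out[top] + 1
  }
  out
}

wta <- allocate_wta(votes, ecv)
prop <- allocate_prop(votes, ecv)
print(rbind(winner_take_all = wta, proportional = prop))
```

With a 51% to 48% split, winner-take-all hands all 11 ECV to one party, while the proportional rule splits them 6 to 5, which is the gap the two bullet points above describe.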
Comparison of Winning Candidates and Parties by ECV Allocation System

| Year | State-Wide Winner-Take-All: Winning Candidate | State-Wide Winner-Take-All: Winner Party | State-Wide Winner-Take-All: Total ECV | District-Wide Winner-Take-All + State-Wide At Large Votes: Winning Candidate | District-Wide Winner-Take-All + State-Wide At Large Votes: Winner Party | District-Wide Winner-Take-All + State-Wide At Large Votes: Total ECV | State-Wide Proportional: Winning Candidate | State-Wide Proportional: Winner Party | State-Wide Proportional: Total ECV | Nation-Wide Proportional: Winning Candidate | Nation-Wide Proportional: Winner Party | Nation-Wide Proportional: Total ECV |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1976 | CARTER, JIMMY | DEMOCRAT | 294 | CARTER, JIMMY | DEMOCRAT | 362 | CARTER, JIMMY | DEMOCRAT | 270 | CARTER, JIMMY | DEMOCRAT | 267 |
| 1980 | REAGAN, RONALD | REPUBLICAN | 448 | REAGAN, RONALD | REPUBLICAN | 287 | REAGAN, RONALD | REPUBLICAN | 281 | REAGAN, RONALD | REPUBLICAN | 270 |
| 1984 | REAGAN, RONALD | REPUBLICAN | 525 | REAGAN, RONALD | REPUBLICAN | 283 | REAGAN, RONALD | REPUBLICAN | 321 | REAGAN, RONALD | REPUBLICAN | 313 |
| 1988 | BUSH, GEORGE H.W. | REPUBLICAN | 426 | DUKAKIS, MICHAEL | DEMOCRAT | 292 | BUSH, GEORGE H.W. | REPUBLICAN | 291 | BUSH, GEORGE H.W. | REPUBLICAN | 284 |
| 1992 | CLINTON, BILL | DEMOCRAT | 367 | CLINTON, BILL | DEMOCRAT | 329 | CLINTON, BILL | DEMOCRAT | 226 | CLINTON, BILL | DEMOCRAT | 229 |
| 1996 | CLINTON, BILL | DEMOCRAT | 376 | DOLE, ROBERT | REPUBLICAN | 283 | CLINTON, BILL | DEMOCRAT | 262 | CLINTON, BILL | DEMOCRAT | 263 |
| 2000 | BUSH, GEORGE W. | REPUBLICAN | 271 | BUSH, GEORGE W. | REPUBLICAN | 290 | GORE, AL | DEMOCRAT | 263 | GORE, AL | DEMOCRAT | 258 |
| 2004 | BUSH, GEORGE W. | REPUBLICAN | 286 | BUSH, GEORGE W. | REPUBLICAN | 299 | BUSH, GEORGE W. | REPUBLICAN | 278 | BUSH, GEORGE W. | REPUBLICAN | 271 |
| 2008 | OBAMA, BARACK H. | DEMOCRAT | 361 | OBAMA, BARACK H. | DEMOCRAT | 330 | OBAMA, BARACK H. | DEMOCRAT | 285 | OBAMA, BARACK H. | DEMOCRAT | 282 |
| 2012 | OBAMA, BARACK H. | DEMOCRAT | 329 | ROMNEY, MITT | REPUBLICAN | 286 | OBAMA, BARACK H. | DEMOCRAT | 271 | OBAMA, BARACK H. | DEMOCRAT | 272 |
| 2016 | TRUMP, DONALD J. | REPUBLICAN | 305 | TRUMP, DONALD J. | REPUBLICAN | 313 | CLINTON, HILLARY | DEMOCRAT | 265 | CLINTON, HILLARY | DEMOCRAT | 257 |
| 2020 | BIDEN, JOSEPH R. JR | DEMOCRAT | 306 | BIDEN, JOSEPH R. JR | DEMOCRAT | 275 | BIDEN, JOSEPH R. JR | DEMOCRAT | 273 | BIDEN, JOSEPH R. JR | DEMOCRAT | 276 |
View Code
# Bind the different allocation methods together to plot them as facets
winners_all <- bind_rows(
  state_wide_winner_only |>
    mutate(allocation_method = "State-Wide Winner-Take-All"),
  district_wide_winner_only |>
    mutate(allocation_method = "District-Wide Winner-Take-All + State-Wide At Large Votes"),
  state_prop_winner_only |>
    mutate(allocation_method = "State-Wide Proportional"),
  nation_prop_winner_only |>
    mutate(allocation_method = "Nation-Wide Proportional") |>
    rename(total_ecv = prop_ecv)
)

ggplot(winners_all, aes(x = year, y = total_ecv, fill = winner_party)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(
    aes(label = total_ecv, y = total_ecv + 20), # Label with ECV, offset above the bars
    position = position_dodge(width = 0.8), color = "black", size = 3
  ) +
  facet_wrap(~allocation_method, scales = "free_y") + # One panel per allocation method
  scale_fill_manual(
    values = c("REPUBLICAN" = "red", "DEMOCRAT" = "blue")
  ) +
  labs(
    title = "Electoral College Votes by Method and Year",
    x = "Year",
    y = "Total ECV",
    fill = "Party"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 90, hjust = 1),
    legend.position = "bottom"
  )
View Code
library(gganimate)

ggplot(winners_all, aes(x = year, y = total_ecv, fill = winner_party)) +
  geom_bar(stat = "identity", position = "dodge") + # Create bar plot with dodged bars for each party
  geom_text(
    aes(label = total_ecv, y = total_ecv + 20), # Label ECV values with an offset to appear above bars
    position = position_dodge(width = 0.8), color = "black", size = 3 # Text properties
  ) +
  scale_fill_manual(
    values = c("REPUBLICAN" = "red", "DEMOCRAT" = "blue") # Party colors
  ) +
  labs(
    title = "Electoral College Votes by Method and Year: {current_frame}",
    x = "Year",
    y = "Total ECV",
    fill = "Party"
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 90, hjust = 1), # Rotate x-axis labels for better readability
    legend.position = "bottom" # Position the legend at the bottom
  ) +
  transition_manual(allocation_method) + # Animate by allocation_method
  enter_fade() + # Fade in new bars for the new allocation method
  exit_fade()
The State-Wide Proportional and Nation-Wide Proportional allocation systems appear to be the fairest in terms of reflecting the popular vote. Because ECVs are awarded in proportion to votes cast, each vote contributes to the outcome, which reduces the disproportionate influence of small states and the swing-state effect. The State-Wide Winner-Take-All and District-Wide Winner-Take-All systems, on the other hand, can create a bias toward smaller states and swing states, which can then disproportionately affect the outcome of the election.
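As a quick check on how often the alternative methods would have changed the outcome, the Winner Party columns of the comparison table can be compared against State-Wide Winner-Take-All. This is a sketch in base R; the data frame `winners` and its column names are hypothetical and simply transcribe the table by hand (D = DEMOCRAT, R = REPUBLICAN).

```r
# Winner party per election year, transcribed from the comparison table.
# The object and column names are illustrative, not part of the pipeline.
winners <- data.frame(
  year         = seq(1976, 2020, by = 4),
  state_wta    = c("D", "R", "R", "R", "D", "D", "R", "R", "D", "D", "R", "D"),
  district_wta = c("D", "R", "R", "D", "D", "R", "R", "R", "D", "R", "R", "D"),
  state_prop   = c("D", "R", "R", "R", "D", "D", "D", "R", "D", "D", "D", "D"),
  nation_prop  = c("D", "R", "R", "R", "D", "D", "D", "R", "D", "D", "D", "D"),
  stringsAsFactors = FALSE
)

# For each alternative method, list the years whose winning party
# differs from the State-Wide Winner-Take-All result.
methods <- c("district_wta", "state_prop", "nation_prop")
flips <- lapply(methods, function(m) {
  winners$year[winners[[m]] != winners$state_wta]
})
names(flips) <- methods
print(flips)
```

In this transcription, the district-based method disagrees with State-Wide Winner-Take-All in 1988, 1996, and 2012, while both proportional methods disagree in 2000 and 2016.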
What Does “Fairness” Mean?
When I think about fairness in the context of ECV allocation, it generally relates to how well the system:
Represents the popular vote: Does the distribution of ECVs accurately reflect the number of votes cast for each candidate? A system that over-represents or under-represents certain groups could be considered unfair.
Accounts for each state: In systems like the current winner-take-all method, small states with fewer voters might have a disproportionately large influence in electing a president compared to larger states. In contrast, a proportional system might mitigate this imbalance.
Minimizes “winner-take-all” advantages: A system where a candidate wins by just a small margin but takes all of a state’s ECVs could be seen as unfair because the losing candidate might have had broad support across the state, but doesn’t get any representation.
Truthfulness Score
Claim: The claim under evaluation is: “The Electoral College system is biased and over-represents smaller states, giving them an unfair advantage in electing the president.” I score it on a 5-point scale, ranging from 1 to 5, where:
1. False – The claim is completely inaccurate, with no evidence to support it.
2. Mostly False – The claim contains significant inaccuracies or overgeneralizations, with some misleading aspects.
3. Half-True – The claim is partially accurate but misses key details or context that would provide a more complete picture.
4. Mostly True – The claim is largely accurate, with very minor misstatements or nuances that don’t change the overall truth.
5. True – The claim is completely accurate, with no significant errors or misleading information.
Score: 4 (Mostly True). Under the State-Wide Winner-Take-All system, smaller states do indeed have a disproportionately large impact, because every state receives two electoral votes for its Senate seats regardless of population. This benefits candidates who win a disproportionate share of votes in less populous states.
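The arithmetic behind the small-state advantage can be sketched as follows. Each state's ECV equals its House seats plus its 2 Senate seats; the two states below, their populations, and their seat counts are illustrative round numbers that I have assumed, not census figures.

```r
# Illustrative (assumed) states: ECV = House seats + 2 Senate seats.
states <- data.frame(
  state       = c("Small state", "Large state"),
  population  = c(600000, 39000000),
  house_seats = c(1, 52),
  stringsAsFactors = FALSE
)

states$ecv <- states$house_seats + 2
states$ecv_per_million <- states$ecv / (states$population / 1e6)

# How many times more ECV per resident the small state has
weight_ratio <- states$ecv_per_million[1] / states$ecv_per_million[2]

print(states)
print(round(weight_ratio, 2))
```

Under these assumed figures the small state holds 5 ECV per million residents against roughly 1.4 for the large state, so a vote in the small state carries about 3.6 times the electoral weight.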
Pew Research Center
In September 2024, Pew published the article Majority of Americans Continue to Favor Moving Away from Electoral College. Following the 2000 and 2016 elections, in which the winners of the popular vote received fewer Electoral College votes than their opponents, Pew surveyed 9,720 U.S. adults.