Day 3: Binary Diagnostic


library(tidyverse)
library(here)
path_data <- here("2021/inputs/03-input.txt")
input <- tibble(x = read_lines(path_data))

This one was all about binary numbers. In general, I’m not trying to go fast, more trying to learn something. This time, I went in confident and Part 1 went pretty easily. Part 2 kicked my butt because I wanted to make a dplyr function to solve it. I find creating dplyr functions super confusing and this one checked all the boxes. No…I didn’t get frustrated, that would be silly. Why would THAT happen. I’m just trying to pass a simple VARIABLE NAME! I’M NOT YELLING, YOU’RE YELLING! (inhale) It’s fine. :)

Let’s take a look…

Part 1

This was pretty straightforward:

  • separate the strings into columns, one for each place
  • count up the ones and zeroes in each place column
  • filter for the most common value in each place

We end up with a dataframe with one row per place, holding the most common value (zero or one) and its count.

# split strings into columns, convert to numbers
# (type conversion was trouble on day 2, sort of happened again here)
data <- input %>% 
  separate(x, into = as.character(seq(1, 12)), 
           sep = seq(1, 11)) %>%   # 11 split positions give 12 columns
  mutate(across(.cols = everything(), ~as.numeric(.)))

# pivot, count columns, 
# filter most common value in each place
most_common <- data %>% 
  pivot_longer(cols = everything(), names_to = "place", values_to = "num") %>% 
  mutate(place = as.numeric(place)) %>% 
  group_by(place, num) %>% 
  summarise(total = n()) %>% 
  group_by(place) %>% 
  filter(total == max(total)) %>% 
  arrange(place)

most_common
## # A tibble: 12 × 3
## # Groups:   place [12]
##    place   num total
##    <dbl> <dbl> <int>
##  1     1     0   508
##  2     2     1   507
##  3     3     1   501
##  4     4     0   506
##  5     5     1   508
##  6     6     1   508
##  7     7     1   527
##  8     8     1   506
##  9     9     0   505
## 10    10     0   503
## 11    11     0   514
## 12    12     0   514

We can now pull out the “number” as a vector of bits. Quite proud of myself for flipping it round to get the least common version: adding 1 and taking the result modulo 2 turns every 0 into a 1 and every 1 into a 0. Finally, we can make a vector of the powers of two and multiply it with each bit vector (a dot product, done here with %*%) to get the gamma and epsilon rates.

# extract the binary number from df
binary_most_common <- most_common %>% 
  pull(num)

# I felt smart doing this :)
binary_least_common <- (binary_most_common + 1) %% 2

# Paul Rubin had a super helpful blog post on matrix multiplication
# made things very easy here
powers_of_two <- rev(2 ^ (0:11))

gamma <- binary_most_common %*% powers_of_two
epsilon <- binary_least_common %*% powers_of_two

gamma * epsilon
##         [,1]
## [1,] 4118544
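
To make the bits-to-decimal step concrete, here is a minimal base R sketch of Part 1 run on the 5-bit sample report from the puzzle statement. It is not the pipeline above: it assumes a plain matrix instead of a tibble, and `1 - x` stands in for the modulo trick.

```r
# Sample report from the puzzle statement (5-bit numbers)
report <- c("00100", "11110", "10110", "10111", "10101", "01111",
            "00111", "11100", "10000", "11001", "00010", "01010")

# one row per number, one column per place
bits <- do.call(rbind, lapply(strsplit(report, ""), as.integer))

# most common bit per place: 1 when at least half of the rows hold a 1
most_common <- as.integer(colMeans(bits) >= 0.5)
least_common <- 1 - most_common          # flips 0 <-> 1, same as (x + 1) %% 2

# place values, highest first: 16 8 4 2 1
powers_of_two <- 2^((ncol(bits) - 1):0)

gamma <- sum(most_common * powers_of_two)     # 10110 -> 22
epsilon <- sum(least_common * powers_of_two)  # 01001 -> 9
gamma * epsilon                               # 198
```

`sum(a * b)` is the same dot product as `a %*% b`, it just returns a plain number instead of a 1 × 1 matrix.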

Part 2

This one destroyed me for a bit because I had trouble making a dplyr function that would work. I’ve talked with other people with similar troubles, passing variable names to be used in a function. I find this all really dumb and unintuitive: enquo, quoting, masking, using variable names, etc. It’s kind of a mess. Feels like it should be simpler to make your own dplyr functions for chaining things grammatically.

I reuse the code from Part 1 to find the most common values. The challenge was that the column names from separating the string were numbers, basically the worst case when trying to use them in a function. So I prefix the column names with “X” to turn them into ordinary names, filter on them programmatically, then reset the names back to plain numbers.

Turned out !!sym(col) worked, casting the string as a symbol to use it as a filter column name. However, I can’t turn numbers into symbols, so I had to convert the place number to a temp string.
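
For comparison, a possible alternative to `!!sym(col)` is dplyr's `.data` pronoun, which looks a column up by a string name. This is a minimal sketch, not what the solution below does: `keep_where` and the toy tibble are hypothetical, and it assumes the numeric column names are kept as they are.

```r
library(dplyr)

# hypothetical helper: keep rows where the column named after `place` equals `value`
keep_where <- function(df, place, value) {
  col <- as.character(place)          # column names are "1", "2", ...
  df %>% filter(.data[[col]] == value)
}

# toy data with numeric-looking column names, like the separated input
toy <- tibble(`1` = c(0, 1, 1), `2` = c(1, 0, 1))

kept <- keep_where(toy, place = 1, value = 1)
kept
```

Because `.data[[col]]` takes any string, the rename to “X1”, “X2”, … and back might not be needed with this approach.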

# find the counts for a given type
# either oxygen or co2 (most or least common)
find_common <- function(df, type) {
  df %>% 
    pivot_longer(cols = everything(), names_to = "place", values_to = "num") %>% 
    mutate(place = as.numeric(place)) %>% 
    group_by(place, num) %>% 
    summarise(total = n()) %>% 
    group_by(place) %>% 
    filter(total == ifelse(type == 0, min(total), max(total))) %>% 
    arrange(place)
} 

# filter for a given place, using 
# tie_value to distinguish ox or co2
filter_place <- function(df, place, tie_value) {

  # get current stats for filtered values
  check_df <- find_common(df, type = tie_value)
  
  # rename the columns from numbers to strings
  # otherwise passing values led to trouble
  col <- paste0("X", place)
  raw <- df %>%
    set_names(paste0('X', names(.)))

  # check if the place has a tie:
  # if the 1s and 0s have identical counts,
  # there is only one unique group size
  # (also true when only one value is left in the column)
  group_sizes <- raw %>% 
    count(!!sym(col)) %>% 
    pull(n) %>% 
    unique() %>% 
    length()
  
  # either follow tie rules
  # or filter based on common stats
  if (group_sizes == 1) {
    raw %>%
      filter(!!sym(col) == tie_value) %>%
      set_names(seq(1,length(names(.))))
  } else {
    raw %>%
      filter(!!sym(col) == check_df[[place, "num"]]) %>%
      set_names(seq(1,length(names(.))))
  }
  
}

Now we can filter a bunch. There are certainly better ways to do this (like what? let me know on Twitter), but this copy-paste worked. Then we can unlist the one remaining row into a vector and do the binary conversion like before.

# filter till we get to one row for each one

binary_ox_gen <- data %>% 
  filter_place(place = 1, tie_value = 1) %>% 
  filter_place(place = 2, tie_value = 1) %>% 
  filter_place(place = 3, tie_value = 1) %>% 
  filter_place(place = 4, tie_value = 1) %>% 
  filter_place(place = 5, tie_value = 1) %>% 
  filter_place(place = 6, tie_value = 1) %>% 
  filter_place(place = 7, tie_value = 1) %>% 
  filter_place(place = 8, tie_value = 1) %>% 
  filter_place(place = 9, tie_value = 1) %>% 
  filter_place(place = 10, tie_value = 1) %>% 
  filter_place(place = 11, tie_value = 1) %>% 
  filter_place(place = 12, tie_value = 1) %>% 
  unlist(use.names = FALSE)

binary_co2_scrub <- data %>% 
  filter_place(place = 1, tie_value = 0) %>% 
  filter_place(place = 2, tie_value = 0) %>% 
  filter_place(place = 3, tie_value = 0) %>% 
  filter_place(place = 4, tie_value = 0) %>% 
  filter_place(place = 5, tie_value = 0) %>% 
  filter_place(place = 6, tie_value = 0) %>% 
  filter_place(place = 7, tie_value = 0) %>% 
  filter_place(place = 8, tie_value = 0) %>% 
  unlist(use.names = FALSE)

# get the numbers and multiply
ox_gen_rating <- binary_ox_gen %*% powers_of_two
co2_scrub_rating <- binary_co2_scrub %*% powers_of_two

ox_gen_rating * co2_scrub_rating
##         [,1]
## [1,] 3832770
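
One way the copy-paste chain could be replaced is a loop that walks the places and stops as soon as a single row is left. The sketch below is base R on the puzzle's 5-bit sample report, not the tibble pipeline above. `find_rating` is a hypothetical name, and skipping places where every remaining row holds the same bit is my own assumption to avoid filtering everything away.

```r
# Sample report from the puzzle statement (5-bit numbers)
report <- c("00100", "11110", "10110", "10111", "10101", "01111",
            "00111", "11100", "10000", "11001", "00010", "01010")

# one row per number, one column per place
bits <- do.call(rbind, lapply(strsplit(report, ""), as.integer))

# tie_value = 1 keeps the most common bit (oxygen),
# tie_value = 0 keeps the least common bit (CO2)
find_rating <- function(bits, tie_value) {
  for (place in seq_len(ncol(bits))) {
    if (nrow(bits) == 1) break              # done: one candidate left
    ones <- sum(bits[, place])
    zeros <- nrow(bits) - ones
    if (ones == 0 || zeros == 0) next       # assumption: nothing to split on
    most_common <- if (ones >= zeros) 1 else 0   # a tie counts as 1
    keep <- if (tie_value == 1) most_common else 1 - most_common
    bits <- bits[bits[, place] == keep, , drop = FALSE]
  }
  # convert the remaining row from bits to a decimal number
  sum(bits[1, ] * 2^((ncol(bits) - 1):0))
}

ox_gen <- find_rating(bits, tie_value = 1)     # 10111 -> 23
co2_scrub <- find_rating(bits, tie_value = 0)  # 01010 -> 10
ox_gen * co2_scrub                             # 230
```

The early `break` is what lets one function serve both ratings, even though they run out of candidates at different places.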

Phew! That was a lot. Hope you learned something!

How would you do it? What’s your shortcut? Please share!

Till next time!