The task asks us to find the first position where a stream of data contained 4 consecutive unique strings. How to best handle this efficiently?
The best way to to do this is to only hold 4 characters at a time, and a deque
makes it very efficient to add characters at the end and automatically drop the no-longer-needed 5th character from the other side.
How to track that they are unique then? While we could pass the deque
to a set()
each step and check if the length is still 4, you can better use a Counter()
as that can be used to keep track of all the characters currently being tracked and only have to update the counts for the new character added and the old character dropped. If there are then 4 counts in the Counter, we know they must all be unique.
from collections import Counter, deque
from itertools import islice
from typing import Iterable
def find_n_unique(stream: Iterable[str], n: int = 4) -> int:
values = iter(stream)
buffer = deque(islice(values, n), maxlen=n)
counts = Counter(buffer)
if all(v == 1 for v in counts.values()):
return n
for i, value in enumerate(values, n + 1):
counts -= Counter(buffer[0])
counts += Counter(value)
if len(counts) == n:
return i
buffer.append(value)
part1_tests = {
"bvwbjplbgvbhsrlpgdmjqwftvncz": 5,
"nppdvjthqldpwncqszvftbrmjlhg": 6,
"nznrnfrfntjfmvfwmzdfjlvtqnbhcprsg": 10,
"zcfzfwzzqfrljwzlrfnpqdbhtmscgvjw": 11,
}
for test, expected in part1_tests.items():
result = find_n_unique(test)
assert result == expected, f"{test}: {result} != {expected}"
import aocd
datastream = aocd.get_data(day=6, year=2022)
print("Part 1:", find_n_unique(datastream))
Part 1: 1034
Part two was very easy, as I already had made the number of characters a variable. Plus, the approach was already sound and efficient, so all I had to do was pass in an extra argument to solve part 2.
part2_tests = {
"mjqjpqmgbljsphdztnvjfqwrcgsmlb": 19,
"bvwbjplbgvbhsrlpgdmjqwftvncz": 23,
"nppdvjthqldpwncqszvftbrmjlhg": 23,
"nznrnfrfntjfmvfwmzdfjlvtqnbhcprsg": 29,
"zcfzfwzzqfrljwzlrfnpqdbhtmscgvjw": 26,
}
for test, expected in part2_tests.items():
result = find_n_unique(test, 14)
assert result == expected, f"{test}: {result} != {expected}"
print("Part 2:", find_n_unique(datastream, 14))
Part 2: 2472