
Dictionaries and sets

So far you have stored things in a list: a numbered row of items where you find something by its position (the first one, the third one, and so on). But very often you do not care about position at all. You want to find something by its name: "What is Sam's age?" "What is the price of milk?" "Which definition belongs to this word?" For that, programmers reach for a different container called a dictionary. This lesson teaches the two containers built for finding by name and for tracking what is unique: the dictionary and the set. Together they are among the most heavily used tools in programming, so we will go slowly and define every new word.

What a dictionary is

A dictionary (Python shortens this to dict) is a collection that maps keys to values. Those two words are the whole idea, so let us pin them down:

  • A key is the thing you look something up by — the label, the name you already know.
  • A value is the thing you get back — the information stored under that key.
  • To map a key to a value just means to connect them: "this key points to this value."

:::note The name is the analogy The name "dictionary" is a perfect analogy. In a real paper dictionary, you look up a word (the key) and you get back its definition (the value). You never flip to "page 412, the third entry" — you flip straight to the word. A contacts app works the same way: you look up a name (the key) and get back a phone number (the value). A dict is exactly this, in code. :::

Creating one and looking things up

You write a dictionary with curly braces {}, and inside you list each pair as key: value, separated by commas. The colon is what ties a key to its value:

ages = {"Sam": 30, "Lee": 25}
# keys are the names "Sam" and "Lee"
# values are the numbers 30 and 25

To get a value out, you write the dictionary's name followed by the key in square brackets — the same brackets you used for list positions, except now you hand it a key instead of a number:

print(ages["Sam"]) # 30 (look up the value stored under the key "Sam")
print(ages["Lee"]) # 25

Read ages["Sam"] out loud as "ages, at the key Sam." It is not asking for position 0 or position 1 — there are no positions here at all. It hands back whatever value was filed under that exact key. An empty dictionary, with nothing in it yet, is just {}; you fill it up as you go.

When the key isn't there

What if you ask for a key the dictionary has never heard of? Python does not quietly return a blank — it stops the program with an error called a KeyError (literally, "I don't have that key"):

print(ages["Mo"]) # KeyError: 'Mo' (there is no key "Mo" — the program crashes)

Because a crash is rarely what you want, there are two safe ways to check first. The first is the word in, which gives back True or False depending on whether the key exists. (Note: in checks the keys, never the values.)

print("Sam" in ages) # True ("Sam" is a key)
print("Mo" in ages) # False ("Mo" is not a key)

The second safe way is the .get method. (A method is a built-in action you run on a value by writing a dot and a name after it.) .get(key) returns the value if the key exists, and otherwise returns None — a special "nothing here" value — instead of crashing. You can also give it a default: a fallback value to hand back when the key is missing.

print(ages.get("Sam")) # 30 (key exists, so you get its value)
print(ages.get("Mo")) # None (key missing — no crash, just None)
print(ages.get("Mo", 0)) # 0 (key missing, so use the default 0)
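The checks above can be folded into a small helper. This is only a sketch: the function name describe_age and the sample data are made up for illustration and are not part of Python.

```python
# Hypothetical helper; the name and the data are illustrative only.
ages = {"Sam": 30, "Lee": 25}

def describe_age(ages, name):
    """Return a sentence about name's age without ever raising KeyError."""
    age = ages.get(name)  # None when the key is missing
    if age is None:
        return name + " is not in the dictionary"
    return name + " is " + str(age)

print(describe_age(ages, "Sam"))  # Sam is 30
print(describe_age(ages, "Mo"))   # Mo is not in the dictionary
```

Using .get here keeps the lookup to a single step; checking with in first and then indexing would work just as well.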

Adding, updating, and deleting

Putting something into a dictionary uses the same square-bracket syntax, now on the left of an =. Here is the one rule that surprises beginners: the same line both adds and overwrites. If the key is new, the pair gets added; if the key already exists, its old value is replaced. A dictionary holds each key only once.

ages["Mo"] = 40 # "Mo" is new -> ADDS it: {"Sam":30, "Lee":25, "Mo":40}
ages["Sam"] = 31 # "Sam" already exists -> OVERWRITES 30 with 31

To remove a pair, use del (short for "delete"). After this line, the key and its value are gone entirely:

del ages["Lee"] # removes the "Lee" pair -> {"Sam":31, "Mo":40}
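One thing to watch: del on a key that is not there raises the same KeyError as a failed lookup. The sketch below shows two ways to stay safe, using sample data invented for the example. The .pop method is a standard dict method that was not introduced above; it removes a key and hands back its value.

```python
ages = {"Sam": 31, "Lee": 25, "Mo": 40}

# Guard the delete with `in` so a missing key does not raise KeyError.
if "Lee" in ages:
    del ages["Lee"]

# .pop removes a key and returns its value. The second argument is a
# fallback that is returned when the key is missing, so nothing crashes.
removed = ages.pop("Zed", None)

print(ages)     # {'Sam': 31, 'Mo': 40}
print(removed)  # None
```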

Looping over a dictionary

A loop repeats an action for every item in a collection. With a dictionary there are three things you might want to walk through, and each has its own way:

Looping over the dict directly gives you the keys, one at a time:

for name in ages:
    print(name)
# Sam
# Mo

Using .items() hands you the key and its value together, so you can read both in one go (the comma in for name, age unpacks the pair into two variables):

for name, age in ages.items():
    print(name, age)
# Sam 31
# Mo 40

And .values() walks the values only, when you don't care about the keys:

for age in ages.values():
    print(age)
# 31
# 40
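As a small illustration of why .items() is handy, here is one way to find the oldest person in the dictionary. The variable names oldest_name and oldest_age are invented for this sketch.

```python
ages = {"Sam": 31, "Mo": 40}

oldest_name = None
oldest_age = None
for name, age in ages.items():
    # Keep the pair with the largest age seen so far.
    if oldest_age is None or age > oldest_age:
        oldest_name = name
        oldest_age = age

print(oldest_name, oldest_age)  # Mo 40
```

Because the loop needs both the key and the value at each step, .items() avoids a second lookup inside the loop body.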

Why dictionaries matter so much

Here is the property that makes the dictionary the workhorse of programming. Looking something up by key is effectively instant — and, crucially, it stays instant no matter how big the dictionary gets. Finding ages["Sam"] in a dictionary of three names takes the same tiny amount of time as finding it in a dictionary of a million names.

:::note Constant time, in plain words Programmers write this "constant time" speed as O(1). In plain words, O(1) means the time does not grow with the size of the data: one entry or a billion entries, a lookup costs about the same on average. Contrast this with a list: to find a name in a list, you may have to scan from the front, checking each item until you hit it. With a million items that could be a million checks. The dictionary skips all that scanning and jumps straight to the answer. That single difference is a big part of why the dict is one of the most-used data structures in real software and coding interviews alike. The clever machinery that pulls off this trick is called a hash map, and you'll meet it once you move on to data-structure patterns. :::
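To make the contrast with a list concrete, the sketch below counts how many items a front-to-back scan has to inspect, then does the same lookup with a dictionary. The function scan_steps and the made-up names are illustrative only; this counts checks rather than measuring real time.

```python
def scan_steps(items, target):
    """Count how many items a front-to-back scan inspects before finding target."""
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps

# Build a list of 100,000 made-up names and a dict that maps each name
# to its position in that list.
names = []
index_by_name = {}
for i in range(100_000):
    name = "person" + str(i)
    names.append(name)
    index_by_name[name] = i

print(scan_steps(names, "person99999"))  # 100000 (the scan checks every item)
print(index_by_name["person99999"])      # 99999 (found by key, no scan)
```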

Two small rules come out of how that machinery works. First, keys must be unique: the same key cannot appear twice (that is why re-assigning a key overwrites instead of duplicating). Second, keys must be "hashable," which in practice means values that cannot change after they are made, such as strings, numbers, or tuples whose items are themselves hashable. Things that can change, like lists, are not allowed as keys. Values, on the other hand, can be anything at all.
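The hashable rule is easy to see in action. In this sketch (the grid dictionary is invented for the example), a tuple works as a key while a list is rejected with a TypeError. The exact wording of the error message varies between Python versions, so it is not shown.

```python
grid = {}
grid[(0, 1)] = "wall"  # a tuple of numbers is hashable, so it works as a key
print(grid[(0, 1)])    # wall

rejected = False
try:
    grid[[0, 1]] = "wall"  # a list can change, so Python refuses it as a key
except TypeError:
    rejected = True

print(rejected)  # True
```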

Sets: a collection of unique things

A set is a close cousin of the dictionary. It is an unordered collection of unique values. "Unordered" means there are no positions, so you cannot ask for "the first one." "Unique" means it refuses to hold duplicates: put the same value in twice and the set keeps just one copy. You write a set with curly braces and plain values inside, no colons (the colons are what made it a dict). The one exception is the empty set: because {} already means an empty dictionary, an empty set is written set(). A set with values in it looks like this:

seen = {1, 2, 3}
print(2 in seen) # True (is 2 in the set? yes)
print(9 in seen) # False (9 is not in there)

A set answers one question superbly: "have I seen this before?" Just like dictionary lookup, checking x in some_set is O(1) — instant, no matter how large the set. You add new items with .add:

seen.add(4) # now {1, 2, 3, 4}
seen.add(2) # 2 is already in there -> no change, still unique

The other thing sets do beautifully is remove duplicates. Wrap any list in set(...) and every repeat is automatically thrown away — the one-line trick for "give me just the distinct values":

print(set([1, 1, 2])) # {1, 2} (the duplicate 1 is dropped)
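Because a set is unordered, set(...) does not promise to keep the items in their original order. When order matters, one common approach is to pair a set (for the "seen before?" check) with a list (for the result). The function name unique_in_order is made up for this sketch.

```python
def unique_in_order(items):
    """Return the distinct items, keeping each one's first position."""
    seen = set()   # values we have already kept
    result = []    # distinct values, in first-seen order
    for x in items:
        if x not in seen:
            seen.add(x)
            result.append(x)
    return result

print(unique_in_order(["b", "a", "b", "c", "a"]))  # ['b', 'a', 'c']
```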

:::note Set math Sets also do simple math on groups. Union (a | b) gives every value in either set; intersection (a & b) gives only the values in both. For example {1, 2} | {2, 3} is {1, 2, 3}, and {1, 2} & {2, 3} is {2}. :::
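Here is a small, made-up example of union and intersection at work: two friend lists stored as sets. The names are invented, and sorted(...) is used only so the printed order is predictable.

```python
sam_friends = {"Lee", "Mo", "Ana"}
lee_friends = {"Mo", "Ana", "Kit"}

everyone = sam_friends | lee_friends  # union: in either set
mutual = sam_friends & lee_friends    # intersection: in both sets

print(sorted(everyone))  # ['Ana', 'Kit', 'Lee', 'Mo']
print(sorted(mutual))    # ['Ana', 'Mo']
```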

Worked example 1: counting words with a dict

Here is the pattern interviewers ask for constantly: counting. We want to know how many times each word appears in a list. The idea: keep a dictionary where each key is a word and its value is that word's running count. For each word, if we have seen it before we add 1; if not, we start it at 1.

def word_counts(words):
    counts = {}                        # start with an empty dict
    for w in words:                    # walk every word in the list
        if w in counts:                # have we counted this word before?
            counts[w] = counts[w] + 1  # yes -> bump its count by one
        else:
            counts[w] = 1              # no -> first sighting, start at 1
    return counts

print(word_counts(["a", "b", "a"])) # {"a": 2, "b": 1}

Let's trace it step by step (a trace means walking the code by hand to see what changes):

  • Start: counts is {}.
  • See "a" — not in counts, so go to else: counts["a"] = 1. Now {"a": 1}.
  • See "b" — not in counts, so: counts["b"] = 1. Now {"a": 1, "b": 1}.
  • See "a" again — this time it is in counts, so bump it: counts["a"] = 1 + 1 = 2. Now {"a": 2, "b": 1}.
  • List is finished, so return the answer: {"a": 2, "b": 1}.

The reason this is fast even on a huge list of words is the O(1) lookup: each w in counts check and each counts[w] update is instant, so the whole thing just walks the list once.
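The same counting pattern can be written more compactly with .get and a default of 0, which covers the "first sighting" case without an if/else. This is an alternative sketch, and the name word_counts_with_get is invented here.

```python
def word_counts_with_get(words):
    counts = {}
    for w in words:
        # .get(w, 0) returns the current count, or 0 if w is new.
        counts[w] = counts.get(w, 0) + 1
    return counts

print(word_counts_with_get(["a", "b", "a"]))  # {'a': 2, 'b': 1}
```

Python's standard library also includes collections.Counter, which does this kind of counting for you once the pattern itself is familiar.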

Worked example 2: catching a repeat with a set

Now a set in action. We want to scan a list and report whether any value shows up twice. We keep a set of values we have already passed; before accepting each new value, we ask the set "have I seen you before?"

def has_repeat(items):
    seen = set()             # empty set of things we've passed
    for x in items:
        if x in seen:        # O(1) check: met this value already?
            return True      # yes -> we found a repeat, stop early
        seen.add(x)          # no -> remember it and keep going
    return False             # walked the whole list, no repeats

print(has_repeat([3, 1, 4, 1])) # True (the 1 repeats)
print(has_repeat([3, 1, 4])) # False (all distinct)

Trace the first call: seen starts empty. See 3 — not seen, add it: {3}. See 1 — not seen, add it: {3, 1}. See 4 — not seen, add it: {3, 1, 4}. See 1 — it is in seen, so return True right away. This "remember-with-a-set" idea is everywhere: deduping, finding the first repeat, checking membership against a known group.
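A small variation on the same idea reports which value repeated first, instead of just True or False. The function name first_repeat is made up for this sketch, and returning None for "no repeat" is one possible choice among several.

```python
def first_repeat(items):
    """Return the first value that shows up a second time, or None."""
    seen = set()
    for x in items:
        if x in seen:
            return x   # this value was already passed once
        seen.add(x)
    return None        # no value repeated

print(first_repeat([3, 1, 4, 1]))  # 1
print(first_repeat([3, 1, 4]))     # None
```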

:::tip Three containers, three jobs A list is for an ordered sequence you reach by position. A dict is for "look up a value by its key," instantly. A set is for "is this unique / have I seen it." A surprising number of programs — and interview answers — are just the right one of these three plus a single loop. :::

Why it matters

You have now met the three everyday containers: lists, dicts, and sets. A dict maps keys to values for instant, O(1) lookup by name; a set holds only unique values and answers "have I seen this?" just as fast. Both grow out of the same hash-map machinery, which is exactly why they show up in nearly every real program and coding interview. The next building block is one you'll touch in nearly every program: text. The next lesson, working with text: strings, shows how to look inside and reshape strings, and you'll see dicts and sets show up again to count letters and find unique characters.

Where this leads: dicts and sets are the gateway to the hash-map patterns that dominate data-structure work and coding interviews. See Where this leads next.
