Counter from Collections#

The Python collections module extends the built-in container types (lists, dictionaries, and tuples) with specialized classes such as Counter, deque, namedtuple, OrderedDict, and defaultdict.

In this tutorial, we’ll explore the Counter class from that module. It is beginner-friendly, so don’t worry if you’re new to Python programming. We’ll guide you through the basics step by step.

What is a Counter?#

A Counter is a Python class that counts the occurrences of elements in an iterable, such as a list, tuple, or string. It is a subclass of the built-in dict class: elements are stored as keys and their counts as values, so you can use it much like a regular dictionary.
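To make the dict relationship concrete, here is a minimal sketch. The string "banana" and the name letter_counts are illustrative choices, not part of the tutorial's running example.

```python
from collections import Counter

letter_counts = Counter("banana")

# A Counter is a dict subclass, so the usual dict tools work on it
print(isinstance(letter_counts, dict))   # True
print(sorted(letter_counts.keys()))      # ['a', 'b', 'n']
print(letter_counts["a"])                # 3
print(sum(letter_counts.values()))       # 6
```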

Why use a Counter?#

Counter is useful whenever you need to tally elements and keep the results in a dictionary-like object. It handles the counting logic for you, so you don’t have to write a loop that checks for each key and increments it.
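As a rough comparison, the sketch below counts the same list twice: once with a hand-written loop and once with Counter. The colors list and variable names are made up for illustration.

```python
from collections import Counter

colors = ["red", "blue", "red", "green", "blue", "red"]

# Manual approach: check for each key before incrementing it
manual_counts = {}
for color in colors:
    if color in manual_counts:
        manual_counts[color] += 1
    else:
        manual_counts[color] = 1

# Counter does the same bookkeeping in a single call
counter_counts = Counter(colors)

print(manual_counts)                          # {'red': 3, 'blue': 2, 'green': 1}
print(dict(counter_counts) == manual_counts)  # True
```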

Importing Counter#

Before we can use the Counter() class, we need to import it from the collections module. Here’s how to do that:

from collections import Counter

Creating a Counter#

To create a new Counter object, pass an iterable (like a list, tuple, or string) to the Counter() constructor. Let’s see an example:

words = ['to', 'be', 'or', 'not', 'to', 'be', 'that', 'is', 'the', 'question']
word_counter = Counter(words)

print(word_counter)
Counter({'to': 2, 'be': 2, 'or': 1, 'not': 1, 'that': 1, 'is': 1, 'the': 1, 'question': 1})

As you can see, the Counter object counts the occurrences of each word in the words list and stores the results as word-to-count pairs. When printed, the entries are shown from most to least common.
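A list is not the only possible input. The sketch below shows three other ways to build a Counter: from a string, from a mapping of elements to counts, and from keyword arguments. The variable names are illustrative.

```python
from collections import Counter

# From a string: each character is counted
from_string = Counter("hello")
print(from_string)     # Counter({'l': 2, 'h': 1, 'e': 1, 'o': 1})

# From a mapping of element -> count
from_mapping = Counter({"to": 2, "be": 2})
print(from_mapping)    # Counter({'to': 2, 'be': 2})

# From keyword arguments
from_keywords = Counter(to=2, be=2)
print(from_keywords == from_mapping)   # True
```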

Accessing Counts#

You can access the count of an element using square bracket notation, just as you would with a regular dictionary:

to_count = word_counter['to']
print(to_count)
2

In some cases, our code may look up a word that does not appear in the Counter. Rather than raising a KeyError as a regular dictionary would, Counter simply returns 0.

print(word_counter["and"])
0
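The difference from a plain dictionary can be sketched as follows. The short word list and the name plain_dict are illustrative. Note that looking up a missing element does not add it to the Counter.

```python
from collections import Counter

word_counter = Counter(["to", "be", "to"])
plain_dict = dict(word_counter)

print(word_counter["and"])      # 0
print("and" in word_counter)    # False: the lookup did not add the key

# The same lookup on a plain dict raises KeyError
try:
    plain_dict["and"]
    raised = False
except KeyError:
    raised = True

print(raised)                   # True
```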

Updating Counts#

To update the counts in a Counter, you can use the update() method. This method takes an iterable (or a mapping of elements to counts) and adds the new counts to the existing ones rather than replacing them:

more_words = ['to', 'be', 'or', 'not', 'to', 'be']
word_counter.update(more_words)

print(word_counter)
Counter({'to': 4, 'be': 4, 'or': 2, 'not': 2, 'that': 1, 'is': 1, 'the': 1, 'question': 1})
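The following sketch, using a smaller illustrative word list, shows update() with both an iterable and a mapping, along with subtract(), the counterpart method for lowering counts.

```python
from collections import Counter

word_counter = Counter(["to", "be", "to"])

# update() adds to existing counts rather than replacing them
word_counter.update(["to", "or"])
print(word_counter)          # Counter({'to': 3, 'be': 1, 'or': 1})

# A mapping of element -> count also works
word_counter.update({"be": 2})
print(word_counter["be"])    # 3

# subtract() lowers counts in the same way
word_counter.subtract(["to"])
print(word_counter["to"])    # 2
```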

Finding Most Common Elements#

Use the most_common() method to get a list of the n most common elements and their counts:

top_three_words = word_counter.most_common(3)
print(top_three_words) 
[('to', 4), ('be', 4), ('or', 2)]

# clear() removes every element and resets the Counter
word_counter.clear()
print(word_counter)
Counter()
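Two variations on most_common() are sketched below with an illustrative word list: calling it with no argument to get every element, and slicing the result from the end to get the least common elements.

```python
from collections import Counter

word_counter = Counter(["to", "be", "or", "not", "to", "be", "to"])

# With no argument, most_common() returns every element, highest count first
all_ranked = word_counter.most_common()
print(all_ranked)       # [('to', 3), ('be', 2), ('or', 1), ('not', 1)]

# Slicing backwards from the end gives the two least common elements
least_common = word_counter.most_common()[:-3:-1]
print(least_common)     # [('not', 1), ('or', 1)]
```

Elements with equal counts are returned in the order they were first encountered.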

Real Example: Finding Word Frequencies in Text#

Now that you have a basic understanding of how to use the Counter class, let’s look at a practical example from humanities research: counting the frequencies of words in a given text. There are better ways to tokenize a text (such as with spaCy), but for this example, the simple split() method will suffice.

text = """To be, or not to be, that is the question:
Whether 'tis nobler in the mind to suffer
The slings and arrows of outrageous fortune,
Or to take arms against a sea of troubles
And by opposing end them."""

# Convert the text to lowercase and split it into a list of words
words = text.lower().split()

# Create a Counter object to count the word occurrences
word_counter = Counter(words)

# Print the word frequencies
print(word_counter)
Counter({'to': 4, 'the': 3, 'be,': 2, 'or': 2, 'and': 2, 'of': 2, 'not': 1, 'that': 1, 'is': 1, 'question:': 1, 'whether': 1, "'tis": 1, 'nobler': 1, 'in': 1, 'mind': 1, 'suffer': 1, 'slings': 1, 'arrows': 1, 'outrageous': 1, 'fortune,': 1, 'take': 1, 'arms': 1, 'against': 1, 'a': 1, 'sea': 1, 'troubles': 1, 'by': 1, 'opposing': 1, 'end': 1, 'them.': 1})
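Notice that split() leaves punctuation attached, so 'be,' and 'question:' are counted as distinct tokens. One possible workaround, sketched here as an assumption rather than a full tokenizer, is to extract runs of letters and apostrophes with a regular expression. The pattern and the shortened text are illustrative.

```python
import re
from collections import Counter

text = """To be, or not to be, that is the question:
Whether 'tis nobler in the mind to suffer"""

# Keep runs of lowercase letters and apostrophes; commas and colons are dropped
tokens = re.findall(r"[a-z']+", text.lower())
word_counter = Counter(tokens)

print(word_counter["be"])            # 2
print("be," in word_counter)         # False
print(word_counter.most_common(2))   # [('to', 3), ('be', 2)]
```

This simple pattern only handles unaccented English letters; a dedicated tokenizer is a better fit for real corpora.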

With the word frequencies calculated, you can perform various analyses, such as finding the most common words, identifying themes, or comparing the usage of words across different texts. The Counter class is well suited to this kind of humanities research, offering an easy and efficient way to count and analyze elements in a collection.
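As one way to compare word usage across texts, Counter objects support set-like operators. The sketch below uses two short made-up phrases; the names text_a, text_b, shared, and only_in_a are illustrative.

```python
from collections import Counter

text_a = Counter("to be or not to be".split())
text_b = Counter("to do or not to do".split())

# & keeps words found in both texts, with the smaller of the two counts
shared = text_a & text_b
print(shared)       # Counter({'to': 2, 'or': 1, 'not': 1})

# - keeps words that occur more often in text_a than in text_b
only_in_a = text_a - text_b
print(only_in_a)    # Counter({'be': 2})
```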