Count Unique Values Per Unique Keys In Python Dictionary

I have a dictionary like this: yahoo.com|98.136.48.100 yahoo.com|98.136.48.105 yahoo.com|98.136.48.110 yahoo.com|98.136.48.114 yahoo.com|98.136.48.66 yahoo.com|98.136.48.71 yaho

Solution 1:

Use a defaultdict:

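from collections import defaultdict

d = defaultdict(set)

with open('somefile.txt') as the_file:
    for line in the_file:
        line = line.strip()  # drop the trailing newline so it never ends up in the key
        if line:
            # each line is "domain|ip"; this groups by the second field
            value, key = line.split('|')
            d[key].add(value)

for k, v in d.items():  # use d.iteritems() in Python 2
    print('{} - {}'.format(k, len(v)))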
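
To try the idea without a file, here is a minimal Python 3 sketch that runs on a few lines held in a list. The count_unique helper name is hypothetical, and treating the domain (first field) as the key is an assumption about what the question wants; swap the two fields to group the other way. The sample lines reuse pairs from the question, with one repeated on purpose to show that duplicates are counted once.

```python
from collections import defaultdict


def count_unique(lines):
    """Map each domain to the number of distinct IPs seen for it.

    Hypothetical helper; assumes each non-empty line is "domain|ip".
    """
    groups = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        domain, ip = line.split('|', 1)
        groups[domain].add(ip)  # a set ignores repeated IPs
    return {domain: len(ips) for domain, ips in groups.items()}


sample = [
    'yahoo.com|98.136.48.100\n',
    'yahoo.com|98.136.48.105\n',
    'yahoo.com|98.136.48.100\n',  # deliberate duplicate, counted once
    'yahoo.net|98.136.48.100\n',
]
counts = count_unique(sample)
for domain, n in counts.items():
    print('{} - {}'.format(domain, n))
```

Using a set per key means deduplication happens as the lines are read, so the final count is just the size of each set.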

Solution 2:

You can use the zip function to separate the domains and IPs into two tuples, then use set to get the unique entries in each:

>>> f = open('words.txt', 'r').readlines()
>>> zip(*[i.split('|') for i in f])
[('yahoo.com', 'yahoo.com', 'yahoo.com', 'yahoo.com', 'yahoo.com', 'yahoo.com', 'yahoo.com', 'yahoo.com', 'yahoo.net', 'g03.msg.vcs0'), ('98.136.48.100\n', '98.136.48.105\n', '98.136.48.110\n', '98.136.48.114\n', '98.136.48.66\n', '98.136.48.71\n', '98.136.48.73\n', '98.136.48.75\n', '98.136.48.100\n', '98.136.48.105')]
>>> [set(dom) for dom in zip(*[i.split('|') for i in f])]
[set(['yahoo.com', 'g03.msg.vcs0', 'yahoo.net']), set(['98.136.48.71\n', '98.136.48.105\n', '98.136.48.100\n', '98.136.48.105', '98.136.48.114\n', '98.136.48.110\n', '98.136.48.73\n', '98.136.48.66\n', '98.136.48.75\n'])]

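Then len gives the number of unique entries in each set, all in one line with a list comprehension. Note that these are totals for the whole file, not per-key counts, and because the lines are not stripped, '98.136.48.105\n' and '98.136.48.105' are counted as different values (hence 9 rather than 8):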

>>> [len(i) for i in [set(dom) for dom in zip(*[i.split('|') for i in f])]]
[3, 9]
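
As a rough Python 3 sketch of the same zip approach, the version below strips each line before splitting so that a trailing newline cannot make two copies of the same IP look different. The lines list stands in for the file contents and holds only a few pairs taken from the output above; the variable names are illustrative.

```python
lines = [
    'yahoo.com|98.136.48.100\n',
    'yahoo.com|98.136.48.105\n',
    'yahoo.net|98.136.48.100\n',
    'g03.msg.vcs0|98.136.48.105',  # last line, no trailing newline
]

# Strip each line before splitting so the newline never reaches the IP.
pairs = [line.strip().split('|') for line in lines if line.strip()]

# zip(*pairs) transposes the rows into one tuple of domains and one of IPs.
domains, ips = zip(*pairs)

# Count the distinct entries in each column.
unique_counts = [len(set(column)) for column in (domains, ips)]
print(unique_counts)
```

In Python 3, zip returns an iterator rather than a list, which is why the result is unpacked directly into domains and ips here.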
