Coder Perfect

Get the difference between two lists

Problem

In Python, I have two lists that look like this:

temp1 = ['One', 'Two', 'Three', 'Four']
temp2 = ['One', 'Two']

I need to create a third list containing the items from the first list that aren't in the second. For the example above, the result should be:

temp3 = ['Three', 'Four']

Is there a fast way to do this without loops and explicit membership checks?

Asked by Max Frai

Solution #1

To get elements which are in temp1 but not in temp2 :

In [5]: list(set(temp1) - set(temp2))
Out[5]: ['Four', 'Three']

Keep in mind that it is asymmetric:

In [5]: set([1, 2]) - set([2, 3])
Out[5]: set([1]) 

where you might expect (or want) the result to be set([1, 3]). If you do want set([1, 3]) as your answer, use the symmetric difference instead: set([1, 2]).symmetric_difference(set([2, 3])).
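
To make the asymmetry concrete, here is a small Python 3 sketch contrasting the one-sided difference with the symmetric difference. The names `a` and `b` are only illustrative, and `sorted()` is used just to get a predictable print order, since sets are unordered.

```python
a = {1, 2}
b = {2, 3}

one_sided = a - b                        # items in a that are not in b
other_side = b - a                       # items in b that are not in a
both_sides = a.symmetric_difference(b)   # items in exactly one of the sets (same as a ^ b)

print(sorted(one_sided))    # [1]
print(sorted(other_side))   # [3]
print(sorted(both_sides))   # [1, 3]
```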

Answered by ars

Solution #2

The existing solutions all offer either one or the other of:

- Faster than O(n*m) performance.
- Preserving the order of the input list.

But so far no solution has both. If you want both, try this:

s = set(temp2)
temp3 = [x for x in temp1 if x not in s]
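
Wrapped as a function, the idea above might look like the following sketch. The name `ordered_difference` is hypothetical, and the code assumes the items are hashable. Note that duplicates in the first list are kept, which is not the case with set subtraction.

```python
def ordered_difference(first, second):
    """Return the items of `first` that are not in `second`, in original order."""
    exclude = set(second)  # built once; membership tests are O(1) on average
    return [item for item in first if item not in exclude]


temp1 = ['One', 'Two', 'Three', 'Four', 'Three']
temp2 = ['One', 'Two']
print(ordered_difference(temp1, temp2))  # ['Three', 'Four', 'Three']
```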

Performance test

import timeit
init = 'temp1 = list(range(100)); temp2 = [i * 2 for i in range(50)]'
print(timeit.timeit('list(set(temp1) - set(temp2))', init, number = 100000))
print(timeit.timeit('s = set(temp2);[x for x in temp1 if x not in s]', init, number = 100000))
print(timeit.timeit('[item for item in temp1 if item not in temp2]', init, number = 100000))

Results:

4.34620224079 # ars' answer
4.2770634955  # This answer
30.7715615392 # matt b's answer

As well as preserving order, the method I presented is also (slightly) faster than set subtraction, because it does not need to build an unnecessary set from the first list. The performance difference becomes more noticeable when the first list is considerably longer than the second and hashing is expensive. A second test demonstrates this:

init = '''
temp1 = [str(i) for i in range(100000)]
temp2 = [str(i * 2) for i in range(50)]
'''

Results:

11.3836875916 # ars' answer
3.63890368748 # this answer (3 times faster!)
37.7445402279 # matt b's answer
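
For anyone rerunning these measurements on Python 3, a harness along the following lines should work. It is only a sketch: the helper name `time_approaches`, the labels, and the reduced default repeat count are my own choices, and the absolute numbers will differ from those above depending on machine and interpreter version.

```python
import timeit

# Same data as the second test: a long first list of strings, a short second list.
SETUP = '''
temp1 = [str(i) for i in range(100000)]
temp2 = [str(i * 2) for i in range(50)]
'''

STATEMENTS = {
    'set subtraction': 'list(set(temp1) - set(temp2))',
    'set lookup + comprehension': 's = set(temp2); [x for x in temp1 if x not in s]',
    'list lookup + comprehension': '[item for item in temp1 if item not in temp2]',
}


def time_approaches(number=10):
    """Return {label: total seconds} for `number` runs of each statement."""
    return {
        label: timeit.timeit(stmt, setup=SETUP, number=number)
        for label, stmt in STATEMENTS.items()
    }


if __name__ == '__main__':
    for label, seconds in time_approaches().items():
        print(f'{label}: {seconds:.4f} s')
```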

Answered by Mark Byers

Solution #3

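Python's ^ operator, applied to two sets, gives their symmetric difference (the items that appear in exactly one of the two lists). For the example lists this matches the requested result, because every item of temp2 is also in temp1: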

set(temp1) ^ set(temp2)

Answered by SuperNova

Solution #4

temp3 = [item for item in temp1 if item not in temp2]
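
One situation where this plain comprehension remains useful is when the items are not hashable (for example, a list of lists), since `set()` would raise a `TypeError` on them. A short sketch with made-up data; the names `rows1`, `rows2` and `rows3` are illustrative only:

```python
rows1 = [[1, 2], [3, 4], [5, 6]]
rows2 = [[1, 2]]

# set(rows2) would raise TypeError (lists are unhashable), so this falls back
# to the O(n*m) membership test against the list itself.
rows3 = [row for row in rows1 if row not in rows2]
print(rows3)  # [[3, 4], [5, 6]]
```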

Answered by matt b

Solution #5

The following simple function finds the symmetric difference between two lists (say, list1 and list2), that is, the items that appear in only one of them.

def diff(list1, list2):
    c = set(list1).union(set(list2))  # or c = set(list1) | set(list2)
    d = set(list1).intersection(set(list2))  # or d = set(list1) & set(list2)
    return list(c - d)

or

def diff(list1, list2):
    return list(set(list1).symmetric_difference(set(list2)))  # or return list(set(list1) ^ set(list2))

Using either function, the difference can be found with diff(temp2, temp1) or diff(temp1, temp2). Both return the same items, 'Four' and 'Three' (in no guaranteed order, since sets are unordered), so you don't have to worry about which list is passed first.


Answered by arulmr

Post is based on https://stackoverflow.com/questions/3462143/get-difference-between-two-lists