# The quickest way to see if a value in a list exists

## Problem

What is the quickest way to determine whether a value exists in a list (a list with millions of elements) and its index?

As in this example, I know that all of the values in the list are unique.

The first method I try is (in my real code, it takes 3.8 seconds):

``````a = [4,2,3,1,5,6]

if a.count(7) == 1:
b=a.index(7)
"Do something with variable b"
``````

The second way I try is (2x faster than my real code: 1.9 seconds):

``````a = [4,2,3,1,5,6]

try:
b=a.index(7)
except ValueError:
"Do nothing"
else:
"Do something with variable b"
``````

Methods suggested by a Stack Overflow member (2.74 seconds for my actual code):

``````a = [4,2,3,1,5,6]
if 7 in a:
a.index(7)
``````

The first method takes 3.81 seconds and the second method takes 1.88 seconds in my real code. It’s a step forward, but:

I’m new to Python and programming; is there a faster way to accomplish the same tasks and save time in the process?

A more detailed explanation of my application is as follows:

I can get a list of particles using the Blender API:

``````particles = [1, 2, 3, 4, etc.]
``````

I can get the position of a particle from there:

``````particles[x].location = [x,y,z]
``````

And for each particle, I check if it has a neighbor by scanning each particle’s location as follows:

``````if [x+1,y,z] in particles.location
"Find the identity of this neighbour particle in x:the particle's index
in the array"
particles.index([x+1,y,z])
``````

## Solution #1

``````7 in a
``````

Things’s the clearest and quickest method to do it.

You may alternatively use a set, but creating one from your list may take longer than the time saved by speedier membership testing. The only way to be positive is to conduct thorough benchmarking. (This, of course, is dependent on the operations you require.)

## Solution #2

For huge lists, as others have said, it can be very slow. Here are some performance comparisons for in, set, and bisect. It’s worth noting that the time (in seconds) is in log scale.

Code for testing:

``````import random
import bisect
import matplotlib.pyplot as plt
import math
import time

def method_in(a, b, c):
start_time = time.time()
for i, x in enumerate(a):
if x in b:
c[i] = 1
return time.time() - start_time

def method_set_in(a, b, c):
start_time = time.time()
s = set(b)
for i, x in enumerate(a):
if x in s:
c[i] = 1
return time.time() - start_time

def method_bisect(a, b, c):
start_time = time.time()
b.sort()
for i, x in enumerate(a):
index = bisect.bisect_left(b, x)
if index < len(a):
if x == b[index]:
c[i] = 1
return time.time() - start_time

def profile():
time_method_in = []
time_method_set_in = []
time_method_bisect = []

# adjust range down if runtime is too long or up if there are too many zero entries in any of the time_method lists
Nls = [x for x in range(10000, 30000, 1000)]
for N in Nls:
a = [x for x in range(0, N)]
random.shuffle(a)
b = [x for x in range(0, N)]
random.shuffle(b)
c = [0 for x in range(0, N)]

time_method_in.append(method_in(a, b, c))
time_method_set_in.append(method_set_in(a, b, c))
time_method_bisect.append(method_bisect(a, b, c))

plt.plot(Nls, time_method_in, marker='o', color='r', linestyle='-', label='in')
plt.plot(Nls, time_method_set_in, marker='o', color='b', linestyle='-', label='set')
plt.plot(Nls, time_method_bisect, marker='o', color='g', linestyle='-', label='bisect')
plt.xlabel('list size', fontsize=18)
plt.ylabel('log(time)', fontsize=18)
plt.legend(loc='upper left')
plt.yscale('log')
plt.show()

profile()
``````

## Solution #3

You may make a set out of your things. Set lookups are extremely time-saving.

Try:

``````s = set(a)
if 7 in s:
# do stuff
``````

edit You mention in a comment that you’d like to know the element’s index. Sets, on the other hand, have no concept of element position. Pre-sorting your list and then using binary search every time you need to find an entry is another option.

## Solution #4

``````def check_availability(element, collection: iter):
return element in collection
``````

Usage

``````check_availability('a', [1,2,3,4,'a','b','c'])
``````

This, I believe, is the quickest way to see if a given value is in an array.

## Solution #5

The question that started it all was:

As a result, there are two things to look for:

To do this, I changed the @xslittlegrass code to compute indexes in all circumstances and introduced a new method.

Results

Methods are:

Method 5 is the fastest, according to the results.

Intriguingly, the try and set approaches take the same amount of time.

Test Code

``````import random
import bisect
import matplotlib.pyplot as plt
import math
import timeit
import itertools

def wrapper(func, *args, **kwargs):
" Use to produced 0 argument function for call it"
# Reference https://www.pythoncentral.io/time-a-python-function/
def wrapped():
return func(*args, **kwargs)
return wrapped

def method_in(a,b,c):
for i,x in enumerate(a):
if x in b:
c[i] = b.index(x)
else:
c[i] = -1
return c

def method_try(a,b,c):
for i, x in enumerate(a):
try:
c[i] = b.index(x)
except ValueError:
c[i] = -1

def method_set_in(a,b,c):
s = set(b)
for i,x in enumerate(a):
if x in s:
c[i] = b.index(x)
else:
c[i] = -1
return c

def method_bisect(a,b,c):
" Finds indexes using bisection "

# Create a sorted b with its index
bsorted = sorted([(x, i) for i, x in enumerate(b)], key = lambda t: t[0])

for i,x in enumerate(a):
index = bisect.bisect_left(bsorted,(x, ))
c[i] = -1
if index < len(a):
if x == bsorted[index][0]:
c[i] = bsorted[index][1]  # index in the b array

return c

def method_reverse_lookup(a, b, c):
reverse_lookup = {x:i for i, x in enumerate(b)}
for i, x in enumerate(a):
c[i] = reverse_lookup.get(x, -1)
return c

def profile():
Nls = [x for x in range(1000,20000,1000)]
number_iterations = 10
methods = [method_in, method_try, method_set_in, method_bisect, method_reverse_lookup]
time_methods = [[] for _ in range(len(methods))]

for N in Nls:
a = [x for x in range(0,N)]
random.shuffle(a)
b = [x for x in range(0,N)]
random.shuffle(b)
c = [0 for x in range(0,N)]

for i, func in enumerate(methods):
wrapped = wrapper(func, a, b, c)
time_methods[i].append(math.log(timeit.timeit(wrapped, number=number_iterations)))

markers = itertools.cycle(('o', '+', '.', '>', '2'))
colors = itertools.cycle(('r', 'b', 'g', 'y', 'c'))
labels = itertools.cycle(('in', 'try', 'set', 'bisect', 'reverse'))

for i in range(len(time_methods)):
plt.plot(Nls,time_methods[i],marker = next(markers),color=next(colors),linestyle='-',label=next(labels))

plt.xlabel('list size', fontsize=18)
plt.ylabel('log(time)', fontsize=18)
plt.legend(loc = 'upper left')
plt.show()

profile()
``````