Coder Perfect

How to iterate through two lists at the same time

Problem

In Python, I have two iterables that I want to go over in pairs:

foo = (1, 2, 3)
bar = (4, 5, 6)

for (f, b) in some_iterator(foo, bar):
    print("f: ", f, "; b: ", b)

It should lead to:

f: 1; b: 4
f: 2; b: 5
f: 3; b: 6

Iterating over the indices is one method:

for i in range(len(foo)):
    print("f: ", foo[i], "; b: ", bar[i])

But that strikes me as a little unpythonic. Is there a more efficient method to accomplish it?

Asked by Nathan Fellman

Solution #1

for f, b in zip(foo, bar):
    print(f, b)

When the shorter of foo or bar comes to an end, zip comes to an end.

Zip, like itertools.izip in Python 2, returns an iterator of tuples in Python 3. Use list(zip(foo, bar)) to get a list of tuples. Itertools.zip longest is used to zip until both iterators are exhausted.

Zip returns a list of tuples in Python 2. When the foo and bar aren’t too big, this is fine. If they are both massive then forming zip(foo,bar) is an unnecessarily massive temporary variable, and should be replaced by itertools.izip or itertools.izip_longest, which returns an iterator instead of a list.

import itertools
for f,b in itertools.izip(foo,bar):
    print(f,b)
for f,b in itertools.izip_longest(foo,bar):
    print(f,b)

When both foo and bar are exhausted, the _longest ends. When the shorter iterator(s) are exhausted, izip longest returns a tuple with None in the iterator’s location. If you choose, you can use a different fillvalue than None. The complete story can be found here.

It’s also worth noting that zip and its zip-like brethren can take any number of iterables as arguments. As an example,

for num, cheese, color in zip([1,2,3], ['manchego', 'stilton', 'brie'], 
                              ['red', 'blue', 'green']):
    print('{} {} {}'.format(num, color, cheese))

prints

1 red manchego
2 blue stilton
3 green brie

Answered by unutbu

Solution #2

You’re looking for the zip function.

for (f,b) in zip(foo, bar):
    print "f: ", f ,"; b: ", b

Answered by Karl Guertin

Solution #3

The ‘zip’ function should be used. This is an example of how your own zip function may seem.

def custom_zip(seq1, seq2):
    it1 = iter(seq1)
    it2 = iter(seq2)
    while True:
        yield next(it1), next(it2)

Answered by Vlad Bezden

Solution #4

Using Python 3.6’s zip() functions, Python’s enumerate() function, a manual counter (see count() function), an index-list, and a special scenario where the elements of one of the two lists (either foo or bar) may be used to index the other list, I’ve compared the iteration performance of two identical lists. The timeit() method was used to assess their performance for printing and constructing a new list, respectively. The number of repeats used was 1000. The following is one of the Python scripts I wrote to do these investigations. The foo and bar lists ranged in size from 10 to 1,000,000 entries.

The amount of compute time per operation that is meaningful or significant must be determined by the programmer.

If this time criterion is 1 second, i.e. 10**0 sec, then looking at the y-axis of the graph on the left at 1 sec and projecting it horizontally until it reaches the monomials curves, we can see that lists with more than 144 elements will incur significant compute cost and significance to the programmer. That is, for smaller list sizes, any performance gained by the methodologies discussed in this study will be inconsequential to the programmer. The programmer will conclude that the zip() function performs similarly to the other approaches when it comes to iterating print statements.

Using the zip() function to iterate through two lists in parallel during list generation improves performance significantly. When iterating through two lists in parallel to print out their elements, the zip() function performs similarly to the enumerate() function, as well as when using a manual counter variable, an index-list, and in the special case where the elements of one of the two lists (either foo or bar) are used to index the other list.

import timeit
import matplotlib.pyplot as plt
import numpy as np


def test_zip( foo, bar ):
    store = []
    for f, b in zip(foo, bar):
        #print(f, b)
        store.append( (f, b) ) 

def test_enumerate( foo, bar ):
    store = []
    for n, f in enumerate( foo ):
        #print(f, bar[n])
        store.append( (f, bar[n]) ) 

def test_count( foo, bar ):
    store = []
    count = 0
    for f in foo:
        #print(f, bar[count])
        store.append( (f, bar[count]) )
        count += 1

def test_indices( foo, bar, indices ):
    store = []
    for i in indices:
        #print(foo[i], bar[i])
        store.append( (foo[i], bar[i]) )

def test_existing_list_indices( foo, bar ):
    store = []
    for f in foo:
        #print(f, bar[f])
        store.append( (f, bar[f]) )


list_sizes = [ 10, 100, 1000, 10000, 100000, 1000000 ]
tz = []
te = []
tc = []
ti = []
tii= []

tcz = []
tce = []
tci = []
tcii= []

for a in list_sizes:
    foo = [ i for i in range(a) ]
    bar = [ i for i in range(a) ]
    indices = [ i for i in range(a) ]
    reps = 1000

    tz.append( timeit.timeit( 'test_zip( foo, bar )',
                              'from __main__ import test_zip, foo, bar',
                              number=reps
                              )
               )
    te.append( timeit.timeit( 'test_enumerate( foo, bar )',
                              'from __main__ import test_enumerate, foo, bar',
                              number=reps
                              )
               )
    tc.append( timeit.timeit( 'test_count( foo, bar )',
                              'from __main__ import test_count, foo, bar',
                              number=reps
                              )
               )
    ti.append( timeit.timeit( 'test_indices( foo, bar, indices )',
                              'from __main__ import test_indices, foo, bar, indices',
                              number=reps
                              )
               )
    tii.append( timeit.timeit( 'test_existing_list_indices( foo, bar )',
                               'from __main__ import test_existing_list_indices, foo, bar',
                               number=reps
                               )
                )

    tcz.append( timeit.timeit( '[(f, b) for f, b in zip(foo, bar)]',
                               'from __main__ import foo, bar',
                               number=reps
                               )
                )
    tce.append( timeit.timeit( '[(f, bar[n]) for n, f in enumerate( foo )]',
                               'from __main__ import foo, bar',
                               number=reps
                               )
                )
    tci.append( timeit.timeit( '[(foo[i], bar[i]) for i in indices ]',
                               'from __main__ import foo, bar, indices',
                               number=reps
                               )
                )
    tcii.append( timeit.timeit( '[(f, bar[f]) for f in foo ]',
                                'from __main__ import foo, bar',
                                number=reps
                                )
                 )

print( f'te  = {te}' )
print( f'ti  = {ti}' )
print( f'tii = {tii}' )
print( f'tc  = {tc}' )
print( f'tz  = {tz}' )

print( f'tce  = {te}' )
print( f'tci  = {ti}' )
print( f'tcii = {tii}' )
print( f'tcz  = {tz}' )

fig, ax = plt.subplots( 2, 2 )
ax[0,0].plot( list_sizes, te, label='enumerate()', marker='.' )
ax[0,0].plot( list_sizes, ti, label='index-list', marker='.' )
ax[0,0].plot( list_sizes, tii, label='element of foo', marker='.' )
ax[0,0].plot( list_sizes, tc, label='count()', marker='.' )
ax[0,0].plot( list_sizes, tz, label='zip()', marker='.')
ax[0,0].set_xscale('log')
ax[0,0].set_yscale('log')
ax[0,0].set_xlabel('List Size')
ax[0,0].set_ylabel('Time (s)')
ax[0,0].legend()
ax[0,0].grid( b=True, which='major', axis='both')
ax[0,0].grid( b=True, which='minor', axis='both')

ax[0,1].plot( list_sizes, np.array(te)/np.array(tz), label='enumerate()', marker='.' )
ax[0,1].plot( list_sizes, np.array(ti)/np.array(tz), label='index-list', marker='.' )
ax[0,1].plot( list_sizes, np.array(tii)/np.array(tz), label='element of foo', marker='.' )
ax[0,1].plot( list_sizes, np.array(tc)/np.array(tz), label='count()', marker='.' )
ax[0,1].set_xscale('log')
ax[0,1].set_xlabel('List Size')
ax[0,1].set_ylabel('Performances ( vs zip() function )')
ax[0,1].legend()
ax[0,1].grid( b=True, which='major', axis='both')
ax[0,1].grid( b=True, which='minor', axis='both')

ax[1,0].plot( list_sizes, tce, label='list comprehension using enumerate()',  marker='.')
ax[1,0].plot( list_sizes, tci, label='list comprehension using index-list()',  marker='.')
ax[1,0].plot( list_sizes, tcii, label='list comprehension using element of foo',  marker='.')
ax[1,0].plot( list_sizes, tcz, label='list comprehension using zip()',  marker='.')
ax[1,0].set_xscale('log')
ax[1,0].set_yscale('log')
ax[1,0].set_xlabel('List Size')
ax[1,0].set_ylabel('Time (s)')
ax[1,0].legend()
ax[1,0].grid( b=True, which='major', axis='both')
ax[1,0].grid( b=True, which='minor', axis='both')

ax[1,1].plot( list_sizes, np.array(tce)/np.array(tcz), label='enumerate()', marker='.' )
ax[1,1].plot( list_sizes, np.array(tci)/np.array(tcz), label='index-list', marker='.' )
ax[1,1].plot( list_sizes, np.array(tcii)/np.array(tcz), label='element of foo', marker='.' )
ax[1,1].set_xscale('log')
ax[1,1].set_xlabel('List Size')
ax[1,1].set_ylabel('Performances ( vs zip() function )')
ax[1,1].legend()
ax[1,1].grid( b=True, which='major', axis='both')
ax[1,1].grid( b=True, which='minor', axis='both')

plt.show()

Answered by Sun Bear

Solution #5

You can bundle the nth elements into a tuple or list using comprehension, then pass them out with a generator function.

def iterate_multi(*lists):
    for i in range(min(map(len,lists))):
        yield tuple(l[i] for l in lists)

for l1, l2, l3 in iterate_multi([1,2,3],[4,5,6],[7,8,9]):
    print(str(l1)+","+str(l2)+","+str(l3))

Answered by Don F

Post is based on https://stackoverflow.com/questions/1663807/how-to-iterate-through-two-lists-in-parallel