Coder Perfect

Initialization of a NumPy array (fill with identical values)

Problem

I need to make an n-element NumPy array with each element being v.

Is there anything better than:

a = empty(n)
for i in range(n):
    a[i] = v

For v = 0 or 1, zeros and ones would work. I could use v * ones(n), but that won’t work when v is None, and it would also be much slower.

Asked by max

Solution #1

NumPy 1.8 introduced np.full(), which is a more direct method than empty() followed by fill() for creating an array filled with a certain value:

>>> np.full((3, 5), 7)
array([[ 7.,  7.,  7.,  7.,  7.],
       [ 7.,  7.,  7.,  7.,  7.],
       [ 7.,  7.,  7.,  7.,  7.]])

>>> np.full((3, 5), 7, dtype=int)
array([[7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7],
       [7, 7, 7, 7, 7]])

This is arguably the best way to create an array filled with a given value, because it states explicitly what is being done (and it can in principle be very efficient, since it performs a very specific task).
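To illustrate how np.full() chooses its dtype, here is a minimal sketch. It assumes a reasonably recent NumPy (1.12 or later, to my knowledge), where the dtype is inferred from the fill value when dtype is omitted; older releases, such as the one that produced the float output above, defaulted to float. The variable names are only illustrative.

```python
import numpy as np

# When dtype is omitted, it is taken from the fill value
ints = np.full((3, 5), 7)       # integer fill value -> integer array
floats = np.full((3, 5), 7.0)   # float fill value -> float64 array

# None is accepted as a fill value; the result is an object array
nones = np.full(4, None)

print(ints.dtype.kind, floats.dtype, nones.dtype)
```

Passing dtype explicitly, as in the second example above, avoids any dependence on the NumPy version.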

Answered by Eric O Lebigot

Solution #2

Updated for NumPy 1.7.0 (hat tip to @Rolf Bartstra):

a=np.empty(n); a.fill(5) is the quickest method.

In order of decreasing speed:

%timeit a=np.empty(10000); a.fill(5)
100000 loops, best of 3: 5.85 us per loop

%timeit a=np.empty(10000); a[:]=5 
100000 loops, best of 3: 7.15 us per loop

%timeit a=np.ones(10000)*5
10000 loops, best of 3: 22.9 us per loop

%timeit a=np.repeat(5,(10000))
10000 loops, best of 3: 81.7 us per loop

%timeit a=np.tile(5,[10000])
10000 loops, best of 3: 82.9 us per loop

Answered by Yariv

Solution #3

Filling is, in my opinion, the quickest way to do this:

a = np.empty(10)
a.fill(7)

You should also avoid iterating as you do in your example. A simple a[:] = v accomplishes what your loop does, using NumPy broadcasting.
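A small sketch of the slice assignment, in case it helps. The array shapes and the values are arbitrary choices for illustration, not part of the original answer.

```python
import numpy as np

v = 7

# A scalar on the right-hand side is broadcast to every element
a = np.empty(10)
a[:] = v

# The same idea works in more dimensions: a length-4 row is
# broadcast across all 3 rows of a (3, 4) array
b = np.empty((3, 4))
b[:] = [1, 2, 3, 4]

print(a)
print(b)
```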

Answered by Paul

Solution #4

I was planning to use np.array(n * [value]), but it turns out to be slower than all the other options for large enough n. In terms of readability and speed, the best is

np.full(n, 3.14)

The full comparison was made with perfplot (a pet project of mine).

The two empty options are still the fastest (with NumPy 1.12.1). For large arrays, full catches up.

The plot was created using the following code:

import numpy as np
import perfplot


def empty_fill(n):
    a = np.empty(n)
    a.fill(3.14)
    return a


def empty_colon(n):
    a = np.empty(n)
    a[:] = 3.14
    return a


def ones_times(n):
    return 3.14 * np.ones(n)


def repeat(n):
    return np.repeat(3.14, (n))


def tile(n):
    # use np.tile here; np.repeat is already covered by repeat() above
    return np.tile(3.14, [n])


def full(n):
    return np.full((n), 3.14)


def list_to_array(n):
    return np.array(n * [3.14])


perfplot.show(
    setup=lambda n: n,
    kernels=[empty_fill, empty_colon, ones_times, repeat, tile, full, list_to_array],
    n_range=[2 ** k for k in range(27)],
    xlabel="len(a)",
    logx=True,
    logy=True,
)
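Before comparing speeds, it can be worth confirming that the candidates really build the same array. The following is a minimal sketch of such a check; the size n = 1000, the fill value, and the dictionary labels are arbitrary assumptions for illustration.

```python
import numpy as np


def empty_fill(n, v):
    a = np.empty(n)
    a.fill(v)
    return a


def empty_colon(n, v):
    a = np.empty(n)
    a[:] = v
    return a


n, v = 1000, 3.14

# One result per construction method discussed in this thread
results = {
    "empty_fill": empty_fill(n, v),
    "empty_colon": empty_colon(n, v),
    "ones_times": v * np.ones(n),
    "repeat": np.repeat(v, n),
    "tile": np.tile(v, n),
    "full": np.full(n, v),
    "list_to_array": np.array(n * [v]),
}

# Compare every result against np.full as the reference
reference = results["full"]
all_equal = all(np.array_equal(reference, r) for r in results.values())
print(all_equal)
```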

Answered by Nico Schlömer

Solution #5

Not only absolute speeds, but also speed order (as observed by user1579844) appear to be machine-dependent; here’s what I discovered:

a=np.empty(10000); a.fill(5) is the fastest.

In descending speed order:

# Sizes are written as integers: recent NumPy versions reject float sizes such as 1e4.
timeit a=np.empty(10000); a.fill(5)
# 100000 loops, best of 3: 10.2 us per loop
timeit a=np.empty(10000); a[:]=5
# 100000 loops, best of 3: 16.9 us per loop
timeit a=np.ones(10000)*5
# 100000 loops, best of 3: 32.2 us per loop
timeit a=np.tile(5,[10000])
# 10000 loops, best of 3: 90.9 us per loop
timeit a=np.repeat(5,(10000))
# 10000 loops, best of 3: 98.3 us per loop
timeit a=np.array([5]*int(1e4))
# 1000 loops, best of 3: 1.69 ms per loop (slowest BY FAR!)

So, try to figure out what’s the fastest on your platform and use that.
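Outside IPython, one way to run such a comparison is with the standard-library timeit module. This is only a rough sketch: the array size, the repeat counts, and the labels are assumptions, and the numbers it prints will differ from machine to machine.

```python
import timeit

import numpy as np

n = 10_000

# Statements to compare; the labels are only for display
statements = {
    "empty + fill": "a = np.empty(n); a.fill(5)",
    "empty + [:]": "a = np.empty(n); a[:] = 5",
    "full": "a = np.full(n, 5.0)",
    "ones * v": "a = np.ones(n) * 5",
}

runs = 1000
timings = {}
for label, stmt in statements.items():
    # Best of 3 repeats of `runs` executions, converted to microseconds per call
    best = min(timeit.repeat(stmt, globals={"np": np, "n": n}, repeat=3, number=runs))
    timings[label] = best / runs * 1e6

# Print the fastest first
for label, us in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{label:>14}: {us:.2f} us per call")
```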

Answered by Rolf Bartstra

Post is based on https://stackoverflow.com/questions/5891410/numpy-array-initialization-fill-with-identical-values