
Populating A Numpy Ndarray Of An Arbitrary Shape (preferably Without Using For Loops)

Consider a NumPy ndarray A of floats with n dimensions and an arbitrary shape D=[d1,...,dn] (the di are nonnegative integers). How can I populate A to have, for example: A[j1,.

Solution 1:

The key observation is that broadcasting solves this problem efficiently. For the 2D case you could do:

import numpy

d1, d2 = (3, 4)
# A column of row indices times a row of column indices broadcasts to shape (d1, d2).
A = numpy.sqrt(numpy.arange(d1)[:, None] * numpy.arange(d2)[None, :])
# array([[0.        , 0.        , 0.        , 0.        ],
#        [0.        , 1.        , 1.41421356, 1.73205081],
#        [0.        , 1.41421356, 2.        , 2.44948974]])

Once you are comfortable using broadcasting for these outer products (or sums, comparisons, etc.), we can tackle the nD case.
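The same pattern covers the other operations mentioned above. A small sketch, where the names rows, cols, outer_sum and outer_less are only illustrative and not part of the original answer:

```python
import numpy as np

rows = np.arange(3)[:, None]   # shape (3, 1)
cols = np.arange(4)[None, :]   # shape (1, 4)

outer_sum = rows + cols        # shape (3, 4), entry [i, j] is i + j
outer_less = rows < cols       # shape (3, 4), entry [i, j] is (i < j)

print(outer_sum.shape, outer_less.shape)
```

In each case NumPy stretches the length-1 axes, so no explicit loop over i and j is needed.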

Looking at the input arrays of the above code we realize they have shapes

(d1,  1)
( 1, d2)

So to do this in nD we need to find a method that takes linear index arrays and automatically creates arrays of shapes

(d1,  1,  1, ...)
( 1, d2,  1, ...)
( 1,  1, d3, ...)

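NumPy offers such a function: numpy.meshgrid(..., sparse=True). Pass indexing='ij' so that the first input varies along the first axis; the default indexing='xy' swaps the first two axes.

numpy.meshgrid(numpy.arange(3), numpy.arange(4), sparse=True, indexing='ij')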
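To see that the returned arrays really have the shapes listed above, it may help to print them. This is only an inspection sketch; the variable names are made up for illustration:

```python
import numpy as np

axes = [np.arange(3), np.arange(4), np.arange(5)]

# Matrix-style indexing: the k-th array varies only along axis k.
ij_shapes = [g.shape for g in np.meshgrid(*axes, sparse=True, indexing='ij')]
print(ij_shapes)  # [(3, 1, 1), (1, 4, 1), (1, 1, 5)]

# Default Cartesian indexing swaps the roles of the first two axes.
xy_shapes = [g.shape for g in np.meshgrid(*axes, sparse=True)]
print(xy_shapes)  # [(1, 3, 1), (4, 1, 1), (1, 1, 5)]
```

With the default indexing the broadcast result would come out with shape (4, 3, 5) instead of (3, 4, 5), which is why indexing='ij' matters when the array has to match D exactly.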

Knowing this we can put it all together:

import functools
import numpy

D = (3, 4, 5)
grids = numpy.meshgrid(*[numpy.arange(d) for d in D], sparse=True, indexing='ij')
# Multiply the sparse grids pairwise so that broadcasting expands them to shape D.
# (numpy.prod(grids) needs a ragged array of arrays, which recent NumPy releases reject.)
numpy.sqrt(functools.reduce(numpy.multiply, grids))
# array([[[0.        , 0.        , 0.        , 0.        , 0.        ],
#         [0.        , 0.        , 0.        , 0.        , 0.        ],
#         [0.        , 0.        , 0.        , 0.        , 0.        ],
#         [0.        , 0.        , 0.        , 0.        , 0.        ]],
#
#        [[0.        , 0.        , 0.        , 0.        , 0.        ],
#         [0.        , 1.        , 1.41421356, 1.73205081, 2.        ],
#         [0.        , 1.41421356, 2.        , 2.44948974, 2.82842712],
#         [0.        , 1.73205081, 2.44948974, 3.        , 3.46410162]],
#
#        [[0.        , 0.        , 0.        , 0.        , 0.        ],
#         [0.        , 1.41421356, 2.        , 2.44948974, 2.82842712],
#         [0.        , 2.        , 2.82842712, 3.46410162, 4.        ],
#         [0.        , 2.44948974, 3.46410162, 4.24264069, 4.89897949]]])
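A quick way to gain confidence in the broadcast result is to compare it against a plain loop over every index tuple. The sketch below assumes the target is A[j1, ..., jn] = sqrt(j1 * ... * jn), which is what the answers here compute; sqrt_index_product is a hypothetical helper name, and math.prod needs Python 3.8 or later:

```python
import functools
import math

import numpy as np


def sqrt_index_product(shape):
    """Return A with A[j1, ..., jn] = sqrt(j1 * ... * jn) via broadcasting."""
    grids = np.meshgrid(*[np.arange(d) for d in shape], sparse=True, indexing='ij')
    return np.sqrt(functools.reduce(np.multiply, grids))


shape = (3, 4, 5)
A = sqrt_index_product(shape)

# Reference result built with an explicit (slow) loop over all index tuples.
expected = np.empty(shape)
for idx in np.ndindex(*shape):
    expected[idx] = math.sqrt(math.prod(idx))

print(A.shape, np.allclose(A, expected))
```

The loop is only there as a check; for large shapes the broadcast version is the one to use.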

Performance evaluation

To evaluate the performance of all three solutions, let's time them for several tensor sizes (myfunc2 and creation_function are defined in Solution 2 below):

D = (2, 3, 4, 5)

%timeit np.fromfunction(function=myfunc2, shape=D)
# 501 µs ± 9.34 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit np.fromfunction(function=creation_function, shape=D)
# 24.2 µs ± 455 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit numpy.sqrt(numpy.prod(numpy.meshgrid(*[numpy.arange(d) for d in D], sparse=True, indexing='ij')))
# 30.9 µs ± 1.02 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

D = (20, 30, 40, 50)

%timeit np.fromfunction(function=myfunc2, shape=D)
# 4.64 s ± 36.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit np.fromfunction(function=creation_function, shape=D)
# 36.7 ms ± 1.17 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit numpy.sqrt(numpy.prod(numpy.meshgrid(*[numpy.arange(d) for d in D], sparse=True, indexing='ij')))
# 9 ms ± 237 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

D = (200, 30, 40, 50)

%timeit np.fromfunction(function=myfunc2, shape=D)
# never completed
%timeit np.fromfunction(function=creation_function, shape=D)
# 508 ms ± 7.41 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit numpy.sqrt(numpy.prod(numpy.meshgrid(*[numpy.arange(d) for d in D], sparse=True, indexing='ij')))
# 88.1 ms ± 1.63 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

D = (200, 300, 40, 50)

%timeit np.fromfunction(function=myfunc2, shape=D)
# never completed
%timeit np.fromfunction(function=creation_function, shape=D)
# 5.8 s ± 565 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit numpy.sqrt(numpy.prod(numpy.meshgrid(*[numpy.arange(d) for d in D], sparse=True, indexing='ij')))
# 1.29 s ± 15.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
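The %timeit lines above only work inside IPython or Jupyter. For a plain Python script, something along these lines could be used instead. It is a rough sketch: the function names and the repeat counts are my own choices, the broadcast variant uses functools.reduce rather than numpy.prod, and the numbers it prints will differ from the ones quoted above depending on machine and NumPy version.

```python
import functools
import timeit

import numpy as np


def broadcast_version(shape):
    # Sparse index grids multiplied pairwise; broadcasting expands them to `shape`.
    grids = np.meshgrid(*[np.arange(d) for d in shape], sparse=True, indexing='ij')
    return np.sqrt(functools.reduce(np.multiply, grids))


def fromfunction_version(shape):
    # Dense index arrays, stacked and reduced along the new leading axis.
    return np.fromfunction(lambda *idx: np.sqrt(np.prod(np.array(idx), axis=0)), shape)


shape = (20, 30, 40, 50)
timings = {}
for name, func in [("broadcast", broadcast_version), ("fromfunction", fromfunction_version)]:
    # Best of 3 repeats with 5 calls each, converted to seconds per call.
    timings[name] = min(timeit.repeat(lambda: func(shape), number=5, repeat=3)) / 5
    print(f"{name}: {timings[name] * 1e3:.2f} ms per call")
```

The np.vectorize variant is left out here on purpose, since it already takes seconds at this size.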

Solution 2:

A modified version of ibarrond's solution that avoids lambda and works for more than 3 dimensions:

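import numpy as np

def myfunc(*J):
    # J holds the scalar indices (j1, ..., jn) of a single element.
    return np.sqrt(np.prod(np.array(J)))

# np.vectorize applies myfunc element by element over the index arrays.
myfunc2 = np.vectorize(myfunc)

D = (2, 3, 4, 5)

np.fromfunction(function=myfunc2, shape=D)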
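As far as I can tell, the reason this variant is so slow in the timings is that np.vectorize is essentially a Python-level loop: the wrapped function runs once per element. A small probe to illustrate this, where counting_func and calls are hypothetical names added only for the experiment:

```python
import numpy as np

calls = {"n": 0}


def counting_func(*J):
    # Same body as myfunc, plus a counter of how often Python calls it.
    calls["n"] += 1
    return np.sqrt(np.prod(np.array(J)))


D = (2, 3, 4, 5)
A = np.fromfunction(function=np.vectorize(counting_func), shape=D)

print(A.shape, calls["n"] >= A.size)  # (2, 3, 4, 5) True
```

So the cost grows with the total number of elements times the overhead of one Python call, which matches the timings getting out of hand for the larger shapes.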

P.S. Unfortunately, he has removed his answer, so I am copying it here for reference:

creation_function = lambda *args: np.sqrt(np.prod(np.array([*args]), axis=0))
np.fromfunction(function=creation_function, shape=D)
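For context on why this version sits between the other two in speed: np.fromfunction calls the function a single time and hands it one dense index array per dimension, each already of the full shape D. The probe below makes that visible; record_args and seen are illustrative names, not part of the original answer:

```python
import numpy as np

D = (2, 3, 4)
seen = []


def record_args(*args):
    # Remember the shapes of the index arrays np.fromfunction passes in.
    seen.append([a.shape for a in args])
    return np.sqrt(np.prod(np.array(args), axis=0))


A = np.fromfunction(function=record_args, shape=D)

print(len(seen), seen[0])  # 1 [(2, 3, 4), (2, 3, 4), (2, 3, 4)]
```

Everything is vectorized, but n full-size index arrays are built and stacked first, whereas the sparse meshgrid approach only keeps n small arrays around until the final multiplication.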
