Multidimensional algorithms and iteration in Julia

shoyer · on Feb 2, 2016

This is a fantastic step forward for Julia, and array-oriented computing in general.

Routines that smoothly generalize to arrays with an arbitrary number of dimensions is also a strength of NumPy -- it's the core idea behind NumPy's "universal functions" [1] like sin or exp.

Writing custom ufuncs in C is not easy, but Numba makes it straightforward to write fast ufuncs in Python [2]. For many of the use cases described here, it's a pretty solid alternative. In general, I think Numba's ufuncs are under-appreciated -- they make it easy to write fast code that works on any number of array dimensions, and even automatically respect metadata when called on array-like objects from third party libraries like pandas.

Unfortunately, we don't have abstractions quite as powerful as eachindex and CartesianRange in NumPy/Numba. This makes for some awkward work-arounds and optimization constraints (e.g., see my project Numbagg [3], and discussions on the Numba issue tracker [4]). That said, I don't think there's any reason why these abstractions couldn't also be implemented in Numba, and I would love to see that happen.

[1] http://docs.scipy.org/doc/numpy/reference/ufuncs.html [2] http://numba.pydata.org/numba-doc/0.23.1/user/vectorize.html [3] https://github.com/shoyer/numbagg [4] e.g., https://github.com/numba/numba/issues/841

tadlan · on Feb 2, 2016

What about dynd? Iirc doesn't use iteration st all.

jordigh · on Feb 2, 2016

    In many languages, writing a general (N-dimensional)
    implementation of this conceptually-simple algorithm is somewhat
    painful, but in Julia it’s a piece of cake:

Indeed, for the Octave implementation of bwlabeln[1] I had to do something like that for considering the n-dimensional neighbours of each voxel in C++. I ended up padding the Nd-array with zeros. What do people normally do in languages that are not Julia?

--

http://sourceforge.net/p/octave/image/ci/default/tree/src/bw...

jpfr · on Feb 1, 2016

Python has itertools.product to iterate over the cartesian product of two lists. Some of the most elegant code I ever wrote is due to this abstraction.

https://docs.python.org/2/library/itertools.html#itertools.p...

elcritch · on Feb 2, 2016

`itertools` comes in real handy in some cases. I just wish it didn't require a special import. Have you used the itertools.partition much? It's pretty handy as well.

The fun part with Julia is the ability to use these style of for-loops to generate code to run arbitrary N-dimensional computations based on the specific input array sizes. Not sure exactly what they're doing [here](https://medium.com/@acidflask/smoothing-data-with-julia-s-ge...) but it looks cool!

clickok · on Feb 2, 2016

It's always nice to get a chance to deploy some functional techniques.

You can get a Cartesian range (also called a von Neumann neighborhood) of size "n" and dimension "d" via:

  def cartesian_range(n, d):
	return product(*[range(-n, n+1) for _ in range(d)])

This returns a generator that yields all tuples of integers where each integer has absolute value less than or equal to "n". If you want the Moore neighborhood (tuples whose absolute sum is less than or equal to "n"), you can add a filtering step:

  def moore_neighborhood(n, d):
	seq = product(*[range(-n, n+1) for _ in range(d)])
	return filter(lambda x: sum(map(abs, x)) <= n, seq)

There's probably a more efficient way of doing this for Moore neighborhoods, but I didn't come across it when I was looking into such things. The problem lies in the fact that as "d" grows large, you're throwing away more and more of "seq", so while it looks elegant, you might need something a little less terse in practice.