I cringe when I see code like this:
if len(vals) in range(1, 5):
It seems like a harmless enough thing to do, but that code is effectively creating a new array with the values [1, 2, 3, 4, 5] and testing whether the result of len() is in that array by iterating over it. And this check happens multiple times for each draw call, every frame.
Sadly, I see this kind of stuff in Python all the time, and it just adds weight to the argument that Python is not a performant language. 1 <= len(vals) <= 5 would have been more Pythonic and certainly more efficient (and more obvious), but I have to wonder if under the hood it's just doing the same inefficient operation.
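For reference, the rewrite being suggested (though note the later comments disputing whether range(1, 5) even includes 5):

# original:
if len(vals) in range(1, 5):
    ...

# suggested:
if 1 <= len(vals) <= 5:
    ...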
That doesn't tell you about the speed of the code though.
$> python -m timeit "1 <= 4 <= 5"
10000000 loops, best of 3: 0.0877 usec per loop
$> python -m timeit "4 in range(1,5)"
1000000 loops, best of 3: 0.413 usec per loop
$> python -m timeit "4 in (1,2,3,4,5)"
10000000 loops, best of 3: 0.0776 usec per loop
Tested in Python 3.5.0. Even with the improvements to the range function, the chained comparison is almost 5x faster.
Eh? That dis output doesn't really tell us which is more efficient. To know that, you'd need to know how `COMPARE_OP` works for each of its arguments, which is the part @Negative1 believes will be less efficient (along with the range creation).
This output doesn't really provide any clarification.
It may be fewer bytecodes, but that does not mean better performance.
You have two function calls in the range version and only one in the non-range version; you're just hiding some of that code behind a dynamic function call. Also, the non-range version does two <= COMPARE_OPs, while the range version does a single "in" COMPARE_OP (which probably hides an element-by-element comparison, or a hash lookup if it is O(1)).
Fewer lines of code does not mean more efficient, regardless of whether it is source code or bytecode.
Bytecode count, while it matters more in a slow interpreter like CPython's, is not directly an indicator of performance. `LOAD_GLOBAL` and `CALL_FUNCTION` are comparatively expensive opcodes to run.
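For anyone who wants to reproduce the bytecode being argued over, this is one way to dump it (exact opcodes vary by CPython version):

import dis

dis.dis(compile("len(vals) in range(1, 5)", "<test>", "eval"))
dis.dis(compile("1 <= len(vals) <= 5", "<test>", "eval"))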
In Python 3, range() does not build a list but produces a range object. Testing x in range(a, b) actually calls the object's __contains__ method, which is O(1) for integer arguments.
The only overhead here may be the object creation.
See this http://stackoverflow.com/q/30081275
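A quick way to convince yourself: a membership test against an absurdly large range returns instantly, which it couldn't do if the range were actually iterated:

>>> 10**17 in range(10**18)
True
>>> -1 in range(10**18)
False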
To be pedantic (which I think is warranted here): range does not return a generator, it returns a sequence type called a range object. This object can be indexed, sliced, and (relevant to this discussion) supports the 'in' operator.
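For example, all of these work without iterating:

>>> r = range(1, 10, 2)
>>> r[2]
5
>>> r[1:3]
range(3, 7, 2)
>>> 7 in r
True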
"x in range(10)" will operate in constant time and memory in Python 3. Whether it is actually more efficient than "0 <= x < 10" is another matter, of course. I would expect the call to range() to dominate here, and indeed, it is much slower in this microbenchmark:
$ python3 -m timeit -s 'x = 8' 'x in range(10)'
1000000 loops, best of 3: 0.351 usec per loop
$ python3 -m timeit -s 'x = 8' '0 <= x < 10'
10000000 loops, best of 3: 0.0732 usec per loop
Even aside from this, I find the "0 <= x < 10" syntax to be clearer.
I'm certain that the call to range is significant there, but if you think about how this actually plays out, it may be doing a naive element-by-element compare. If a compare operation is your limiting operation, then the former has to do nine operations (is x == 0? is x == 1? ... is x == 8?) versus precisely two in the latter (is x >= 0? is x < 10?). The math on that is pretty close: 2/9 of 0.351 is 0.078, quite close to your result.
From a formal CS perspective, that's why this is wrong: 'in' is an O(n) operation, while the two comparisons are constant time. It may be optimized away by the range object's operator overloading, but that's a fairly narrow optimization for it to pull off.
On modern CPUs, branching operations like compares are usually your problem; an unconditional function call is easy to optimize, but a branch in a looping construct is guaranteed to lead to one or more mispredictions. Minimizing the surface area of comparisons is an important part of writing performant code.
To check your assessment, I tested it on my own machine:
python3 -m timeit -s 'x = 8' 'x in range(1000)'
1000000 loops, best of 3: 0.405 usec per loop
python3 -m timeit -s 'x = 8' '0 <= x < 1000'
10000000 loops, best of 3: 0.0742 usec per loop
python3 -m timeit -s 'x = 8' 'x in range(100000)'
1000000 loops, best of 3: 0.405 usec per loop
python3 -m timeit -s 'x = 1123' 'x in range(100000)'
1000000 loops, best of 3: 0.415 usec per loop
python3 -m timeit -s 'x = 1123' '0 <= x < 100000'
10000000 loops, best of 3: 0.0745 usec per loop
While the 0 <= x < n form seems to be about 5x faster, both appear to be independent of the actual values chosen.
Mainly, the call to range() has to construct something to return, which means allocating memory, and that is probably going to dwarf any other code involved here. (In the case of range(1, 5), I tend to assume that constructing [1, 2, 3, 4] would be marginally faster than constructing a range object.)
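If you want to test that assumption, you can time just the construction (numbers omitted since they vary by machine):

python3 -m timeit 'range(1, 5)'
python3 -m timeit '[1, 2, 3, 4]'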
Also, as noted in other comments, the range object has a special-cased O(1) implementation of 'in' for integers (range_contains_long() in Objects/rangeobject.c).
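Roughly, that fast path does something like this (a Python-level sketch of the idea, not the actual C code; non-integer arguments fall back to ordinary iteration):

def range_contains(r, x):
    # in bounds, for the direction of the step?
    if r.step > 0:
        in_bounds = r.start <= x < r.stop
    else:
        in_bounds = r.stop < x <= r.start
    # ...and aligned on the step
    return in_bounds and (x - r.start) % r.step == 0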
The CPython bytecode interpreter tries to minimize the amount of branching it causes by using some non-obvious tricks, but by definition it is still going to cause at least one essentially unpredictable branch per interpreted instruction, so optimizing for the number of branches in user Python code is a mostly pointless endeavor.
Even without going into performance concerns, it's just harder to understand than the <= form.
Case in point, in this very thread: half of the commenters read it as 1 <= len(vals) <= 5 and the other half read it as 1 <= len(vals) < 5...
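For the record, the stop is exclusive, so the original check matches lengths 1 through 4:

>>> list(range(1, 5))
[1, 2, 3, 4]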
What abstraction? Comparisons are ubiquitous, easy to read, and a one-liner as well. If the range were to change, it could affect performance. I've never seen anything written like the original check; it's not like it's a Python idiom or anything. I'd say it's the wrong way to do things.
The abstraction is in removing the cognitive overhead needed to understand this. The original programmer can now allocate the time saved to higher level concerns. That is what C did for assembly and what Python is doing for C. Is that not the theme of abstraction?
In this case I actually think the cognitive overhead is larger for the in range(1, 5) call. You have to know how that will behave: is it 1 <= x < 5 or 1 <= x <= 5? The comparisons are more explicit, and they are quite easy to read in this case.
I've been teaching a class (http://apprenticeworks.us) based on what looks like the same original repo you found, and I made a lot of the same changes you did. I ended up abstracting a lot of the "optimization"-type code (the sectors, for example) and changed the way the player's state is kept, to support multiple players in the future.
Not sure if you'd be interested, but I have some content I'd be happy to add for background-type information (degrees of freedom, calculating movement, frames of reference, etc.). It might make a good starting point for contributors, but I'm not sure if that's the direction you see your documentation taking.
Just found this on Hacker News now. Pull requests are very welcome. I don't personally have a lot of time to actually code stuff, as I'm busy with contracts, but if you post on our IRC someone might be interested in digging into your code base and pulling out the useful bits.
I've always wanted to have an MC clone in Python with GUI event hooks to run scripts. Like pull a lever and an email is sent sort of thing. Will this fork allow for something like that? I'm not against adding it myself if allowed, btw.
It's built on top of pyglet, which is built around event-based programming, so it already has the ability to do low-level callbacks from key/mouse events built in [1]. GUI widgets are a separate matter though, and would probably require a separate library to do OpenGL GUI elements on top of this.
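A minimal sketch of what those pyglet hooks look like (the lever/email part is hypothetical, you'd wire in whatever action you want):

import pyglet

window = pyglet.window.Window()

@window.event
def on_key_press(symbol, modifiers):
    # hypothetical hook: fire off your script/email here
    if symbol == pyglet.window.key.E:
        print("lever pulled, send that email")

pyglet.app.run()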
It will not. We're still very early, and definitely not feature complete with Minecraft. We're still working on some very fundamental things.
To do that, you'd first have to write code for pulling levers, which will require some digging into how pycraft handles textures to make it more generic.
That point was when Minecraft was worthless. It's currently somewhere around the $2.5b mark in value. But that's just splitting hairs; it won't likely ever be open-sourced now that MS has acquired it.
That said, they also promised a modding API too...
What he actually said was "Once sales start dying and a minimum time has passed, I will release the game source code as some kind of open source." [0]. Sales haven't died down and he doesn't own it anymore anyway.