Hacker News new | past | comments | ask | show | jobs | submit login
Pycraft: Minecraft engine in Python (github.com/traverseda)
147 points by danso on April 19, 2016 | hide | past | favorite | 47 comments



I cringe when I see code like this: if len(vals) in range(1, 5):

It seems like a harmless enough thing to do but that code is effectively creating a new array with values [1, 2, 3, 4, 5] and testing if the result of len() is in that array by iterating over it. This check is happening multiple times for each draw call every frame.

Sadly I see this kind of stuff in Python all the time and it just adds weight to the argument that Python is not a performant language. 1 <= len(vals) <= 5 would have been more pythonic and certainly more efficient (and obvious) but I have to wonder if under the hood it's just doing the same inefficient operation.


Why do you suspect 1 <= len(vals) <= 5 would do something inefficient under the hood?

You can actually get a very good idea for what Python is doing under the hood using the dis module:

    dis.dis(lambda: 1 <= len(vals) <= 5)
      1           0 LOAD_CONST               1 (1)
                  3 LOAD_GLOBAL              0 (len)
                  6 LOAD_GLOBAL              1 (vals)
                  9 CALL_FUNCTION            1
                 12 DUP_TOP             
                 13 ROT_THREE           
                 14 COMPARE_OP               1 (<=)
                 17 JUMP_IF_FALSE_OR_POP    27
                 20 LOAD_CONST               2 (5)
                 23 COMPARE_OP               1 (<=)
                 26 RETURN_VALUE        
            >>   27 ROT_TWO             
                 28 POP_TOP             
                 29 RETURN_VALUE        
Compare to:

    dis.dis(lambda: len(vals) in range(1, 5))
      1           0 LOAD_GLOBAL              0 (len)
                  3 LOAD_GLOBAL              1 (vals)
                  6 CALL_FUNCTION            1
                  9 LOAD_GLOBAL              2 (range)
                 12 LOAD_CONST               1 (1)
                 15 LOAD_CONST               2 (5)
                 18 CALL_FUNCTION            2
                 21 COMPARE_OP               6 (in)
                 24 RETURN_VALUE


That doesn't tell you about the speed of the code though.

    $> python -m timeit "1 <= 4 <= 5"
    10000000 loops, best of 3: 0.0877 usec per loop
    
    $> python -m timeit "4 in range(1,5)"
    1000000 loops, best of 3: 0.413 usec per loop

    $> python -m timeit "4 in (1,2,3,4,5)"
    10000000 loops, best of 3: 0.0776 usec per loop
Tested in python 3.5.0, Even with the improvements to the range function, the comparison is almost 5x faster.


Eh? That dis output doesn't really tell us which is more efficient. To know that, you'd need to know how `COMPARE_OP` works for each of it's arguments, which is what @Negative1 believes will be not-as-efficient (along with range creation).

This output doesn't really provide any clarification.


It may be a shorter number of bytecodes, but that does not mean it has better performance.

You have two function calls in the range version and only one in the non-range version. You are just hiding some of that code in a dynamic function call. Also, your compare_op consists of two <= in the non-range version and an "in" compare_op for the range version (which probably hides the element by element comparison or hash/bloom if it is O(1).

Fewer lines of code does not mean more efficient. Regardless of whether it is source code or bytecode.


There's no argument made in the comment you replied to. Certainly not a strident one in favor of range.

It appears to me they went ahead and posted the bytecode for both because they had spun up an interpreter to get the bytecode for one.


The bytecode count is, while significant in slower CPython interpreters, not directly an indicator of performance. `LOAD_GLOBAL` and `CALL_FUNCTION` are comparatively expensive opcodes to run.


In python 3 range() does not build a list but produce a range object. Testing x in range(a,b) will actually call the __contains__ method of the object which is O(1). The only overhead here may be the object creation. See this http://stackoverflow.com/q/30081275


It may be O(1), but the constant of a __contains__ method is almost certainly much higher than the constant of an if 1 <= len(vals) <= 5: statement.


This is in Python 3. xrange is now range, so only a generator is constructed.

Agreed that 1 <= len(vals) < 5 would be more Pythonic.


To be pedantic (which I think is warranted here), range does not return a generator, it returns a sequence called a range object. This object can be indexed, sliced, and (relevant to this discussion) supports the 'in' operator.

"x in range(10)" will operate in constant time and memory in Python 3. Whether it is actually more efficient than "0 <= x < 10" is another matter, of course. I would expect the call to range() to dominate here, and indeed, it is much slower in this microbenchmark:

  $ python3 -m timeit -s 'x = 8' 'x in range(10)'
  1000000 loops, best of 3: 0.351 usec per loop
  $ python3 -m timeit -s 'x = 8' '0 <= x < 10'
  10000000 loops, best of 3: 0.0732 usec per loop
Even aside from this, I find the "0 <= x < 10" syntax to be clearer.


I'm certain that the call to range is significant there, but if you think of how this actually plays out, it may be doing a naive list compare. if a compare operation is your limiting operation, then the former has to do nine operations (Is x == 0? Is x == 1? ... Is x == 8?) vs precisely two in the latter (is x >= 0? Is x < 10?). The math on that is pretty close - 2/9ths of 0.351 is 0.078, quite close to your result.

From a formal CS perspective, that's why this is wrong; It's because 'in' is an o(n) operation, while the two comparisons are constant time. It may be abstracted away by the range object overloading, but that's a fairly narrow optimization it's able to pull off.

On modern CPUS, branching operations like compare usually are your problem; An unconditional function call is easy to optimize, but a branch in a looping construct is guaranteed to lead to one or more mispredictions. Trying to minimize the surface area of comparisons is an important part of performant code.


To question your assessment I tested it on my own machine:

  python3 -m timeit -s 'x = 8' 'x in range(1000)'
  1000000 loops, best of 3: 0.405 usec per loop
  python3 -m timeit -s 'x = 8' '0 <= x < 1000'
  10000000 loops, best of 3: 0.0742 usec per loop
  python3 -m timeit -s 'x = 8' 'x in range(100000)'
  1000000 loops, best of 3: 0.405 usec per loop
  python3 -m timeit -s 'x = 1123' 'x in range(100000)'
  1000000 loops, best of 3: 0.415 usec per loop
  python3 -m timeit -s 'x = 1123' '0 <= x < 100000'
  10000000 loops, best of 3: 0.0745 usec per loop
While 0<=x<10 seems to be 5 times faster, both seem to be independent of the actual chosen values.


Mainly, the call to range() has to construct something that it returns, that means allocating memory, which is probably going to dwarf any other code involved in this. (In case of range(1,5) I tend to assume that constructing [1,2,3,4] is probably marginally faster than constructing rangeobject)

Also, as noted in other comments, range object has special cased O(1) implementation of in for integers (range_contains_long() in Objects/rangeobject.c)

CPython bytecode interpreter tries to minimize amount of branching it causes by using some non-obvious tricks, but still by definition it is going to cause at least one essentially unpredictable branch per interpreted instruction, so optimizing for number of branches in user python code is mostly pointless endeavor.


Even without going into performance concerns, it's just harder to understand than the <= form.

Example in this thread: half of the commenters consider that it means 1 <= len(vals) <= 5 and the other half consider that it means 1 <= len(vals) < 5...


The cost of abstraction.

If it doesn't affect performance then who's thinking is "wrong" here?


What abstraction? Comparisons are ubiquitous, easy to read, and a one-liner as well. If the range were to change it could affect performance. I've never seen anything written like the original loop, it's not like it's a Python idiom or anything. I'd say it's the wrong way to do things.


The abstraction is in removing the cognitive overhead needed to understand this. The original programmer can now allocate the time saved to higher level concerns. That is what C did for assembly and what Python is doing for C. Is that not the theme of abstraction?


In this case I actually think the cognitive overhead is larger for the in range(5) call. You have to know how that will behave, is it 1 <= x < 5 or 1 <= x <= 5? The comparisons are more explicit and they are quite easy to read in this case.


We're still pretty close to foglemans "minecraft in 500 lines of python" which was optimized for minimal lines.

There are already some people talking about Cython for some of the fundamental graphics stuff.

Basically, this is just a base so that people can see what they're working towards.


I've been teaching a class (http://apprenticeworks.us) based on the same original repo it seems like you found. I made a lot of the same changes as you as well. I ended up abstracting a lot of the "optimization" type code (the sectors for example) and changed the way the state of the player is kept to support multiple players in the future.

Not sure if you'd be interested, but I have some content I'd be happy to add for background type information (i.e. degrees of freedom, calculating movement, frames of reference, etc). Might make a good starting point for contributors but not sure if that's the direction you see your documentation taking.


Just found this on hackernews now. Pull requests are very welcome. I don't personally have a lot of time to actually code stuff, as I'm busy with contracts, but if you post on our irc someone might be interested in digging into your code base and pulling out the useful bits.


It's not my project but in sure the repo maintainers would be interested!


I've always wanted to have an MC clone in Python with GUI event hooks to run scripts. Like pull a lever and an email is sent sort of thing. Will this fork allow for something like that? I'm not against adding it myself if allowed, btw.


There is also ScriptCraft - https://github.com/walterhiggins/ScriptCraft Here you can hook JavaScript code into Minecraft events.


It's build on top of pyglet which is built around event-based programming, so it's already got the ability to do low-level callbacks from key/mouse events built in [1]. GUI widgets are a separate matter though and would probably require a separate library to do OpenGL GUI elements on top of this.

[1] https://pyglet.readthedocs.org/en/pyglet-1.2-maintenance/pro...


Reading the docs, thanks!


You could always give me a hand to get https://github.com/voltagex/minecraft-rebridge up and running again, and then use HTTP from Python to talk to Minecraft-proper ;)


Seems interesting! Would this run on a raspberry pi? I'd love to email to see if we can collaborate in some way. My address is in my profile. :)


If you can run a Forge-enabled standard Minecraft server on the Pi, then yes. Development on the Pi would be painful, though.


Oh, the Rpi is the target platform, not the dev environment. I will look into the Forge-enabled MC server, thanks!


It will not. We're still very early, and definitely not feature complete with minecraft. We're still working on some very fundamental things.

Do do that, you first have to write code for pulling levers, which will first require some digging around with how pycraft handles textures to make it more generic.

Check back in a few months.


I will definitely do so! Thank you. :) By the way, accepting contributors? Maybe I can help out in some way...


We are very much accepting contributions.


Isn't this already possible with normal Minecraft? I mean obviously you have to write your mod with Java, but still.


Yes, you are correct. But normal MC is not open source. :)


Even though Notch was supposed to make it open source at some point. Another broken promise.


That point was when Minecraft was worthless. It's currently somewhere around the $2.5b mark in value. But that's just splitting hairs, it won't likely ever be open-sourced after MS acquired it.

That said, they also promised a modding API too...


I dunno... Microsoft has been open sourcing some pretty awesome stuff lately.


What he actually said was "Once sales start dying and a minimum time has passed, I will release the game source code as some kind of open source." [0]. Sales haven't died down and he doesn't own it anymore anyway.

[0] http://web.archive.org/web/20100301103851/http://www.minecra...


Seems like selling it is the broken promise. If sales die down, and a minimum time passes, he still can't OS it because he sold it.

Still, as long as the idea's out there... I heard the codebase isn't actually that great...


Couldn't something like that be done in Minetest?


I like that it uses shaders and modern OpenGL and not fixed functions. It looks like very good clean OpenGL!


"Permissive policy on pull requests" makes me fearful for the long-term maintainability.


Compared to the original. We'll revisit that when we have something that offers any genuine advantages.

A lot of people have been doing pull requests for unit-tests, documentation, and continuous integration. So I'm not too worried.

We're setting up a policy for requests, but that policy is living in code.


Now to write a Python interpreter in Minecraft for the infinite translation loop...


I love python, but I wish this was a web/webgl app. Is there no way to run this in the browser? :(




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: