Not just TF: PyTorch, Keras, MXNet, ...
Like I said: We won't be the #1 framework data scientists use.
There's still a ton of churn in that space.
Granted, a ton of it is TF. Google is doing an amazing job now.
We're trying to be a moderate middle ground.
TF, PyTorch, and anything in Python tend to be interfaces.
The core logic still runs in C/C++.
That's great when you want speed, but you end up missing the benefits of the JVM (its tooling is great for monitoring and the other things folks need at scale).
So we do everything via JNI, pushing the math down one block at a time with minimal overhead, if any.
On top of that, if you want to write production code, you don't need to push logic down to C. You can do something JVM-based instead, which gives you Kotlin, Scala, Clojure, ...
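A rough sketch of the pattern in plain Java (the "native" side here is a pure-Java stand-in for what would be a JNI call into the C++ math library; the class and method names are made up for illustration): instead of crossing the boundary once per element, the whole buffer crosses once per operation.

```java
// Sketch: amortizing boundary-crossing cost by pushing a whole block down at once.
// In a real JNI setup the block-level call would be a native method into C++;
// here it is stubbed in pure Java to keep the example self-contained.
public class BlockPushdown {

    // Stand-in for a per-element native call: one boundary crossing per element.
    static double scaleOne(double x, double factor) {
        return x * factor;
    }

    // Stand-in for a block-level native call: one boundary crossing per buffer.
    static void scaleBlock(double[] block, double factor) {
        for (int i = 0; i < block.length; i++) {
            block[i] *= factor;
        }
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0, 4.0};

        // Chatty style: N crossings for N elements.
        double[] slow = new double[data.length];
        for (int i = 0; i < data.length; i++) {
            slow[i] = scaleOne(data[i], 2.0);
        }

        // Block style: one crossing for the whole buffer, same result.
        scaleBlock(data, 2.0);

        System.out.println(java.util.Arrays.equals(slow, data));
    }
}
```

The point is only the call shape: per-element calls pay the fixed crossing cost N times, the block call pays it once.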
I think this aspect of DL4J gets lost in the overall message. For me, this is much more powerful: "use DL4J if you want a seamless experience running TensorFlow/Keras/MXNet on Spark."
Because right now, it looks like it's TensorFlow vs. DL4J.
Honestly, it's hard for me to care. They can both compete as well as integrate. No offense, but model import, which already exists, will become a common thing as formats like ONNX become standard.
We play up what's unique about DL4J anyway. It's right in the name: we're squarely focused on the JVM.
We also hedge our bets on Spark. Most of our logic doesn't even run on Spark.
Spark is just a facilitator; it's not where anything that matters runs. We could just as easily use Flink or Apex here. Those are also JVM-based streaming engines.
We can't overplay Spark because most of our deployments won't even use Spark. We don't even use Spark in our own production inference tools; we just deploy as a microservice.
I have personally been campaigning for ONNX to merge/leverage Apache Arrow (https://news.ycombinator.com/item?id=15195658). It probably makes sense for the efforts to build on top of each other.
Integrating with arrow is on our bucket list as well.
We plan on integrating with their tensor data type and format.
I feel like you're trying to cross two worlds that shouldn't cross with Arrow/ONNX, though. You're trying to map a neural net description language to a columnar format; that doesn't make sense to me. In our case, our runtime will understand both, but for different reasons.
We will import the ONNX format for our neural nets and integrate with Arrow for our ETL -> tensor pipelines.
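As a rough sketch of what that ETL step amounts to (plain Java arrays rather than the real Arrow API, and the class/method names are made up for illustration): columnar feature vectors get transposed into one row-major buffer, which is the layout a tensor library can wrap directly.

```java
import java.util.Arrays;

// Sketch: converting columnar data (one array per feature, Arrow-style)
// into a single row-major buffer, i.e. the shape a tensor library expects.
// This uses plain arrays, not the actual Arrow Java API.
public class ColumnsToTensor {

    // columns[f][r] holds feature f of row r; returns a row-major buffer
    // laid out as [row0feat0, row0feat1, ..., row1feat0, row1feat1, ...].
    static double[] toRowMajor(double[][] columns) {
        int numFeatures = columns.length;
        int numRows = columns[0].length;
        double[] buffer = new double[numRows * numFeatures];
        for (int r = 0; r < numRows; r++) {
            for (int f = 0; f < numFeatures; f++) {
                buffer[r * numFeatures + f] = columns[f][r];
            }
        }
        return buffer;
    }

    public static void main(String[] args) {
        // Two feature columns, three rows each.
        double[][] columns = {
            {1.0, 2.0, 3.0},   // feature 0
            {10.0, 20.0, 30.0} // feature 1
        };
        // A 3x2 tensor would wrap the resulting buffer with shape [3, 2].
        System.out.println(Arrays.toString(toRowMajor(columns)));
    }
}
```

The neural-net format (ONNX) and the data format (Arrow) stay separate concerns: one describes the graph, the other feeds it.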