Not just TF: PyTorch, Keras, MXNet, ...
Like I said: We won't be the #1 framework data scientists use.
There's still a ton of churn in that space.
Granted, a ton of it is TF. Google is doing an amazing job now.
We're trying to be a moderate middle ground.
TF, PyTorch, and anything in Python tend to be interfaces.
The core logic still runs in C/C++.
That's great when you want speed, but you end up missing the benefits of the JVM (its tooling is great for monitoring and the other things folks need at scale).
So we do everything via JNI, pushing the math down one block at a time with minimal overhead, if any.
On top of that, if you want to write production code, you don't need to push logic down to C. You can do something JVM-based instead, which gives you Kotlin, Scala, Clojure, ...
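A rough sketch of the pattern in plain Java (the "native" side here is a pure-Java stand-in for what would be a JNI call into the C++ math library; the class and method names are made up for illustration): instead of crossing the boundary once per element, the whole buffer crosses once per operation.

```java
// Sketch: amortizing boundary-crossing cost by pushing a whole block down at once.
// In a real JNI setup the block-level call would be a native method into C++;
// here it is stubbed in pure Java to keep the example self-contained.
public class BlockPushdown {

    // Stand-in for a per-element native call: one boundary crossing per element.
    static double scaleOne(double x, double factor) {
        return x * factor;
    }

    // Stand-in for a block-level native call: one boundary crossing per buffer.
    static void scaleBlock(double[] block, double factor) {
        for (int i = 0; i < block.length; i++) {
            block[i] *= factor;
        }
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0, 4.0};

        // Chatty style: N crossings for N elements.
        double[] slow = new double[data.length];
        for (int i = 0; i < data.length; i++) {
            slow[i] = scaleOne(data[i], 2.0);
        }

        // Block style: one crossing for the whole buffer, same result.
        scaleBlock(data, 2.0);

        System.out.println(java.util.Arrays.equals(slow, data));
    }
}
```

The point is only the call shape: per-element calls pay the fixed crossing cost N times, the block call pays it once.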
I think this aspect of DL4J gets lost in the overall message. For me, this is much more powerful: "use DL4J if you want a seamless experience running TensorFlow/Keras/MXNet on Spark."
Because right now, it looks like it's TensorFlow vs. DL4J.
Honestly, it's hard for me to care. They can both compete as well as integrate. No offense, but model import, which already exists, will become a common thing as formats like ONNX become standard.
We play up what's unique about DL4J anyway. It's right in the name: we're squarely focused on the JVM.
We also hedge our bets on Spark. Most of our logic doesn't even run on Spark.
Spark is just a facilitator; it's not where anything that matters runs. We could just as easily use Flink or Apex here. Those are also JVM-based streaming engines.
We can't overplay Spark because most of our deployments won't even use Spark. We don't even use Spark in our own production inference tools; we just deploy as a microservice.
I have personally been campaigning for ONNX to merge/leverage Apache Arrow (https://news.ycombinator.com/item?id=15195658). It probably makes sense for the efforts to build on top of each other.
Integrating with arrow is on our bucket list as well.
We plan on integrating with their tensor data type and format.
I feel like you're trying to cross two worlds that shouldn't cross with Arrow/ONNX, though. You're trying to map a neural net description language to a columnar format; that doesn't make sense to me. In our case, our runtime will understand both, but for different reasons.
We will import the ONNX format for our neural nets and integrate with Arrow for our ETL -> tensor pipelines.
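As a rough sketch of what that ETL step amounts to (plain Java arrays rather than the real Arrow API, and the class/method names are made up for illustration): columnar feature vectors get transposed into one row-major buffer, which is the layout a tensor library can wrap directly.

```java
import java.util.Arrays;

// Sketch: converting columnar data (one array per feature, Arrow-style)
// into a single row-major buffer, i.e. the shape a tensor library expects.
// This uses plain arrays, not the actual Arrow Java API.
public class ColumnsToTensor {

    // columns[f][r] holds feature f of row r; returns a row-major buffer
    // laid out as [row0feat0, row0feat1, ..., row1feat0, row1feat1, ...].
    static double[] toRowMajor(double[][] columns) {
        int numFeatures = columns.length;
        int numRows = columns[0].length;
        double[] buffer = new double[numRows * numFeatures];
        for (int r = 0; r < numRows; r++) {
            for (int f = 0; f < numFeatures; f++) {
                buffer[r * numFeatures + f] = columns[f][r];
            }
        }
        return buffer;
    }

    public static void main(String[] args) {
        // Two feature columns, three rows each.
        double[][] columns = {
            {1.0, 2.0, 3.0},   // feature 0
            {10.0, 20.0, 30.0} // feature 1
        };
        // A 3x2 tensor would wrap the resulting buffer with shape [3, 2].
        System.out.println(Arrays.toString(toRowMajor(columns)));
    }
}
```

The neural-net format (ONNX) and the data format (Arrow) stay separate concerns: one describes the graph, the other feeds it.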