As a long-time Kettle user (probably close to 10 years) I must warn potential users that the learning curve is steep and that, as with any large body of code, it can sometimes behave unpredictably. I got good at diagnosing user-induced bugs in PDI transformations by reading the stacktrace, but that is not to everyone's liking.
To me, a major regression is the "new" UI, which switched from meaningful icons to a blue-and-white scheme that makes reading and discovering new transformations a real pain. Everything is a blur of blue, without the color cues you learned in the past: "ok, this is the icon for a merge from a source file and a database sent to an ES cluster" became "some stuff is read from blue sources and sent to some blue output".
I recently learned about the capability to run transformations on a Spark cluster, which replaces the original engine with a new Spark implementation and brings obvious compute optimizations for large enough datasets, but I don't have enough experience with it to speak of it positively or negatively.
@karmbahh - good to know. I've used Kettle for about 9 months in production and so far it's been pretty solid, but we are not going that far off the beaten path for most things. It's a big app, but at least there is documentation and some great users who have been very helpful, and the codebase is by and large very logically laid out.
We do use the Adaptive Execution Layer, though so far not with Spark (we use it with our own processing engine). It's working well for us, and it's great that we can switch engines as needed.
re: UI. I like a lot of the new look and feel, but I can see how it lost some visual semantics, and I can imagine any long-term user would find the changes frustrating. Having come to the tool much later, this has been less of an issue for me, and we tweak the presentation for our own workflows and plugins anyhow.
For us, Kettle/Pentaho PDI is a great open source project, but it will definitely be interesting to see how things evolve now that Hitachi has acquired Pentaho.