
I have a chain of applications. Postgres spits out data. A Java application makes a CSV from it. An Apache+PHP application takes that CSV, removes the first column (we don't want to publish the id), then sends it to the destination. Both Postgres and Java do significant transformations, but the bottleneck for that chain is the PHP code. I already optimized it, and it is now literally 3 lines of code: read line, remove everything up to and including the first ',', write line. This speeds things up enormously compared with the previous PHP doing fgetcsv/remove column/fputcsv, but removing the PHP (keeping only Apache) from the chain still doubles the speed of the CSV download.
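
For illustration, the read line / strip / write line loop described above could be sketched as follows. The actual PHP is not shown in the thread, so this is a Python stand-in under stated assumptions: the function name `strip_first_column` is hypothetical, and the id in the first field is assumed never to contain a quoted comma (otherwise a real CSV parser would be needed).

```python
import sys


def strip_first_column(line: str) -> str:
    """Drop everything up to and including the first comma.

    Assumption: the first field (the id) never contains a quoted comma.
    """
    _, sep, rest = line.partition(",")
    # If the line has no comma at all, pass it through unchanged.
    return rest if sep else line


def main(src=sys.stdin, dst=sys.stdout) -> None:
    # Stream line by line so the whole CSV is never held in memory.
    for line in src:
        dst.write(strip_first_column(line))


if __name__ == "__main__":
    main()
```

Run as a filter between the CSV producer and the web server, this does no CSV parsing at all, which is the same trade-off the 3-line version makes: much less work per line, at the cost of not handling quoted fields in the first column.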





Why does the Java application/Postgres output the id instead of omitting it in the first place?

Adding PHP to your stack in order to drop a column from a result is a WILD decision.

I don't want to be a "back seat driver", but it seems strange to me as well.

It could be that the original files are used by other processes, and that for some reason they don't want to create two separate files. Maybe it's an issue with office politics (works on a different team), or an attempt at saving disk space.


The Java thing is an off-the-shelf application; there is no way to turn off the generated id. So someone slapped some PHP glue code on it, PHP being the majority of the code. It is far from optimal, but good enough now to end up low on the priority list.

As it runs on Windows, I might move it to Linux and add sed or something to the chain, dropping the PHP completely. But for now, PHP it is.



