I had a problem where there was some fixed-width data in a SQL database. Basically, someone had put mainframe data in a database.
There were 3 fields that each held multiple fixed-width entries. I had to split the fields by width and recombine them, then also combine them with another set of data that had weird delimiters and rules.
It took a day and a half (not full time) to figure it out in Python. I can't even imagine tackling the problem in a non-REPL language.
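For a sense of what the splitting step looks like, here is a minimal Python sketch. The 8-character width, the space padding, and the sample value are assumptions for illustration, not the real data layout.

```python
def split_fixed_width(field: str, width: int) -> list[str]:
    """Cut a fixed-width field into entries of `width` characters each."""
    chunks = [field[i:i + width] for i in range(0, len(field), width)]
    # Strip the space padding and drop slots that were entirely blank.
    return [chunk.strip() for chunk in chunks if chunk.strip()]


# Hypothetical sample: three 8-character entries packed into one field.
packed = "ALPHA   BETA    GAMMA   "
entries = split_fixed_width(packed, 8)
print(entries)
```

Being able to poke at slices like this interactively is most of why a REPL helps here.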
Turns out a good database is really good at data munging. A solution for PostgreSQL might look something like this:
CREATE VIEW my_table_improved AS
SELECT SUBSTRING(the_column, 1, 8) AS col1,
SUBSTRING(the_column, 9, 8) AS col2,
SUBSTRING(the_column, 17, 8) AS col3
FROM my_table;
After which you can query my_table_improved as a normal table. col1, col2, and col3 contain the data split out from the_column. Note that SUBSTRING positions start at 1 in PostgreSQL, so the first 8 characters are `SUBSTRING(the_column, 1, 8)`. In practice you'd probably also want to do some type conversion, e.g. if col1 is supposed to be an integer you can simply update the view to select `CAST(SUBSTRING(the_column, 1, 8) AS INTEGER) AS col1` instead. In production use you might find this to be slow, at which point you will want to create indexes for your new columns. The index expression has to match the expression in the view (cast included) for it to be used. Something like this (adjusted to which queries you're running, of course) should work:
CREATE INDEX ON my_table ((SUBSTRING(the_column, 1, 8)));
Of course this is a lot of work if you're unfamiliar with SQL, and the examples above aren't complete for your use case, but they should give you an idea of why SQL is the right tool for the job here. That is somewhat the point of many commenters here: get yourself familiar with SQL and save yourself a metric tonne of work in the future.
An alternate approach might be to write a query that migrates the fixed-width format to one where each entry is in its own column. How easy that is mostly depends on whether applications depend on that column staying in its current format.
Side note: the SQL above is untested but should be roughly correct.
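If you want to try the idea without a PostgreSQL server, here is a rough sketch using Python's built-in sqlite3 module as a stand-in. SQLite is not PostgreSQL: it spells the function `substr`, but positions also start at 1. The sample row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (the_column TEXT)")
# Hypothetical row: three 8-character fields packed into one column.
conn.execute("INSERT INTO my_table VALUES (?)", ("00000042BETA    GAMMA   ",))

# substr() positions start at 1, so the fields begin at 1, 9 and 17.
conn.execute("""
    CREATE VIEW my_table_improved AS
    SELECT CAST(substr(the_column, 1, 8) AS INTEGER) AS col1,
           TRIM(substr(the_column, 9, 8))  AS col2,
           TRIM(substr(the_column, 17, 8)) AS col3
    FROM my_table
""")

row = conn.execute("SELECT col1, col2, col3 FROM my_table_improved").fetchone()
print(row)
```

The TRIM calls are an addition for the space-padded text fields; whether you want them depends on how the real data is padded.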
If it was just one column, that would be easy enough. Or if one object was on one row, it would be easy enough. In my case, up to 6 rows could be required to represent one object, and I had to slice 3 columns with an arbitrary number of slices.
Oh, and there were two types of sub-record per object, and they had to be processed in database order.
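To make that shape concrete, here is a heavily simplified Python sketch of the regrouping step. Everything about the layout is an assumption: the `(object_id, record_type, packed_field)` row shape, the 4-character width, and a single packed column instead of three. It only illustrates merging consecutive rows per object while keeping database order.

```python
from itertools import groupby
from operator import itemgetter


def split_fixed_width(field, width):
    """Cut a fixed-width field into stripped, non-blank entries."""
    chunks = (field[i:i + width] for i in range(0, len(field), width))
    return [c.strip() for c in chunks if c.strip()]


def assemble(rows, width):
    """Merge consecutive rows that share an object id, keeping row order.

    Each row is (object_id, record_type, packed_field); all three names
    are hypothetical stand-ins for the real schema.
    """
    objects = []
    # groupby only merges adjacent rows, so database order is preserved.
    for object_id, group in groupby(rows, key=itemgetter(0)):
        by_type = {}
        for _, record_type, packed in group:
            by_type.setdefault(record_type, []).extend(
                split_fixed_width(packed, width)
            )
        objects.append((object_id, by_type))
    return objects


# Made-up rows, already in database order; object 1 spans three rows.
rows = [
    (1, "A", "X1  X2  "),
    (1, "B", "Y1  "),
    (1, "A", "X3  "),
    (2, "A", "Z1  Z2  "),
]
result = assemble(rows, 4)
print(result)
```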
Well, as long as you can build a query to get the data in the right format (which you almost inevitably can), you can make a view out of it. But honestly, the true solution here would be to migrate away from such a brain-damaged format.