Automated Refactoring of a U.S. Department of Defense Mainframe to AWS (amazon.com)
233 points by Stwerner on May 6, 2019 | 99 comments



I wonder what 1 million lines of COBOL translated to Java looks like.

Let us stop and give some caring thoughts to the people who will maintain that code base.


> Rather than simply transliterating source COBOL code to target Java code, the tool executes a mature automated conversion and refactoring process by first constructing a comprehensive Intermediate Object Model of the legacy system in an intermediate translation language.

> Once modeled within the tool engine, SMEs employ an iterative process of applying rules and tuning to output the transformed code into the target Java language.

Seems like they manually construct an IR of the source system, then translate the IR to Java Code.

Automatically transpiling COBOL code to Java would lose all possibility of maintenance moving forward, so it was not considered in the first place, nor is it what they need. They need a semi-automated REWRITE of their legacy system, not just a retargeting to Java.


What would the total cost of this whole COBOL-to-Java conversion effort have been?

Even if it was $100 million, it seems like a successful project, given that the DoD mentioned potential savings of $25 million/year.

>Customer Benefits

> For the DoD, the component has been transformed from an expensive mainframe COBOL legacy system to an affordable, modern maintainable Java-based system. All valuable existing business rules have been preserved in the modernized system, while development, test, and production environments were migrated to AWS providing flexibility at reduced cost.

>The DoD’s projected cost savings is $25 million per year, and they are now able to use popular Java programmers to maintain and enhance the critical component.


There are commercial COBOL compilers available that compile to Java bytecode.


It's an odd phenomenon I've seen before. For some reason, execs think they need Java. I had one guy say that customers didn't want to hear "cobol"; so for him there was a marketing component. Otherwise, I think a lot of execs buy into the "cobol developers are rare and expensive" myth also put forth by this article. It's false; after 20 years of offshoring, there are 10,000s of unemployed cobol developers in the US. The shortage is cobol domain experts who wrote the original app, and whose jobs were offshored. That's the shortage.

Certainly there are cobol to jvm bytecode options and I'm sure they see some use. One other option I haven't seen yet is cobol to an intermediate DSL in Java that legacy developers could use and code with little trouble, Jobol if you will. This doesn't exist that I know of.


> 10,000s of unemployed cobol developers in the US.

You need to find that Cobol developer: one who can work at your office, who will still be working for at least the next 10 years, who won't ask for a ton of cash because he expects to keep a salary similar to what he had before, who is actually competent, etc...

On the other hand, you have 10,000 unemployed Java developers coming out of school within 500 km of your office every year who will do the job nicely for a pretty cheap salary because it's their first job.


Aha. That's why they keep inventing new languages.


A fun fact - Java is over 23 years old. Don't know if anyone would call Java "new" anymore.


It's actually almost the same amount of time C-to-Java as Java-to-now.


And same amount of time Cobol-to-C?


C started in 1972, so no. Cobol is close, but there were only 13 years between its appearance and C's.


Just like stegosaurus and T. rex.


Just like a fancy new javascript framework that gets propped up by big companies. Can't have experienced developers if the framework is brand new!


In all seriousness, I think there's a ton of waste involving languages - each language introduces its own package manager and doesn't interop with other languages, meaning you have to rewrite Python's requests library in #newlanguage2019, etc.


I see this happening more and more. And for me, it was my guess at why IBM bought Red Hat: since the government was likely moving off an IBM solution, this allows IBM to still "own" the platform (RHEL + IBM Java), and I presume at some point they will try to own the cloud as well.

It was interesting: while I was there helping to transition the Blekko stuff to IBM, I brought up how this sort of modernization of their cloud offering was in their strategic interest. IBM is so big, and so complex internally, that it was kind of like shouting at a crowd moving through Grand Central station in New York. People can hear you shouting and slowly start to comprehend what you're trying to get them to see. Change is hard in an organization of that size.


> I presume at some point they will try to own the cloud as well.

They already did that; they bought SoftLayer. But it's not working very well, since instead of going for "a cloud offering" they are doing "the same kind of IBM offering as usual, but in the cloud!" There is a reason no one is talking about using them on sites like HN: their entire target market seems to be companies moving from big "enterprise" hosting to big cloud "enterprise" hosting.

And they're not buying AWS any time soon.


I don't disagree with your assessment but if you are implying they aren't doing anything else I would disagree with that.

IBM has been around for over a hundred years. That is important because understanding how they have survived can give you an idea of the way they think about things. At several layers they tend to think in very long time frames.

I got to meet a number of the Softlayer managers when our startup was being integrated (Softlayer had people going through the same program) and I got to meet more of them when we did a project to host the Blekko crawler in a Softlayer data center on a number of machines there.

That was a good example for them of a project where the "Colocation/Hosting" model really fell down on the floor. A crawler needs (for example) a bunch of machines that have low latency access to each other as well as the Internet in order to support their storage and indexing models. AWS has 'service layer' clusters where you can subscribe to an "instance" of an elastic search cluster, which then allows you to talk to the service layer while the data center can optimize the communications layer. This is "Web 2.0" architecture which Google and Amazon both developed as part of their data center scale computers effort. The "policy" or "application" layer can be hosted on any machine because it needs only to dispatch requests of the underlying services, whether it be a storage service, an indexing service, or a network crawling service.

Back in 2014 when we were acquired this was still 'new stuff' to Softlayer and they were getting hammered both by our integration needs and the nascent Watson developer cloud for similar reasons. That was combined with billing and finance challenges (IBM is driven by the Finance group :-)) in order to understand costs, expenses, and ROI.

From what I could see from where I was sitting, it was at least a 5 to 10 year effort for them to get to where AWS was a decade ago in their thinking. But advancement from that point on could be more rapid by taking advantage of what had been tried and failed before. So many pieces of the company needed alignment, sales, finance, engineering, management, and operations. To be honest, if I was going to spend the end of my career somewhere it would have been a good project to take on.

So when I read the comment "it's not working very well" my response is "yet" and I agree that that "yet" can be a fairly arduous process within a company the size of IBM.


> it was at least a 5 to 10 year effort for them to get to where AWS was a decade ago in their thinking.

> how they have survived can give you an idea of the way they think about things

I don't care. The speed of innovation will outpace IBM forever, at this point. I'm not going to leverage them in my lifetime, nor would anyone else who is competent. There are lots of bottom feeders, and IBM's spectacular age is legacy entrenchment, not a magic sauce.


> There are lots of bottom feeders, and IBM's spectacular age is legacy entrenchment, not a magic sauce.

This can actually be said of almost every "modern" technology and hot startup now.


I write about them from time to time.

Here's one article from last year: AWS is now cheaper than SoftLayer https://thehftguy.com/2018/11/13/2018-cloud-pricing-aws-now-...


They did get OpenShift with the RH purchase. So they could put customers on that, potentially on AWS, with some sort of "hybrid / agnostic" pitch. Not that I think it's a terrific idea, but it's sticky.


Blekko is a name I haven't heard in a long time...


It will be interesting to see how this works out, but I doubt we will hear much publicly about it again.

I have been involved in a somewhat similar project where we had a large system written in an obsolete language. It was HUGE. And it ran fine. With millions of transactions. But it required expensive hardware to run.

The decision was made to convert it to Java. The effort took years, and the end result was not impressive, I thought.

In a lot of ways Java was a step down in abstraction from the more DSL-like original language. Implementing new rules took more time, and running it took a lot more orchestration.

I am not convinced it wouldn't have been better to hire and train a few devs in the old system and let it keep doing its thing for a few decades more.


The article implies it wasn't a very nice or well-structured codebase, so I guess they didn't make it much worse.

And they ripped out the COBOL Data management system and replaced it with a RDBMS. This alone probably is worth it since you can hook it up to business intelligence tools now to create reports and dashboards.


The old language was domain-specific; of course it was a better set of abstractions for what you care about.

But maintaining and extending the thing was only going to get more expensive. And once you've paid the price to modernize, you've bought yourselves a lot of savings for 1-2 decades, maybe more.


Did you? That system was already working and it sounds like it just needed a couple of decent Devs. I'm sure the hardware is not cheap, but surely it's cheaper than a full Java rewrite with new hardware?


Mainframe hardware is crazy expensive. There was an old saying from an ex-IBM'er that each one came with a free engineer. Because that's how much they cost.


A former employer had mainframes that weren't permitted to run at full capacity without paying even more. They were so entrenched they paid IBM thousands of dollars whenever they exceeded their allotment.


At one time IBM (and probably other manufacturers) installed RAM and hard drive capacity but didn't enable it. If you wanted more capacity, they sent out an engineer to basically "flip a switch" (more or less). The hardware was always there; you just couldn't use it until they enabled it.

We kinda see something similar today with Tesla - and we'll probably see more of it in the future.


Was it COBOL?

COBOL is a very niche language that is well specialized for banking. I don't envision applications being rewritten in something else easily, or for any significant benefit.


Could have been MUMPS


I am impressed with this solution, insofar as was described in the press release. But I also have concerns.

My concerns mainly lie with choosing such a proprietary solution over one using more open-source standards. Both in the language selected, along with the backend (Oracle and AWS) being used.

Will there or could there be problems in the future, should the need arise to migrate off one or more of those proprietary platforms?

What will or could happen if there is a security issue that affects or targets AWS, Java, or this system in particular - are we "stuck" again with a potentially "broken" system that can't be easily moved because of vendor lock-in of sorts?

Should Oracle make further moves in the direction of "closing" Java - will programmers continue to learn the language and support it, or will they move to other, more open, solutions?

What happens if or when AWS or Amazon ceases to exist?

I just wonder if we haven't traded a largely unwieldy but mostly known issue of a mainframe and COBOL, for a seemingly known (but questionable) issue of AWS, Java, and Oracle.

Will we just revisit this whole problem again in 50 years?

If so, maybe then we'll make a better decision to use more open and auditable solutions (but I'm not going to hold my breath to that)...


should the need arise to migrate off one or more of those proprietary platforms

Fair point given how things usually go in the private sector. In the public sector, especially DoD, the rules of the game are changed.

What happens if or when AWS or Amazon ceases to exist?

Well guess what, they can't. It's now a national security issue so the necessary services will continue to be provided by a KTLO crew or DoD would facilitate a transaction where another company could absorb the needed technology and be contracted to support it. DoD is the biggest company on earth. FY 2019 budget is $686B.

DoD has a pattern of keeping mission critical end-of-life systems running and supported.

I'm currently advising the US Navy on moving some COBOL systems to Java. Those systems have been lovingly supported by lifelong Navy civilian employees for 40-50 years. Those people are now at retirement age, and so they're "replatforming" to Java.

There really is no parallel in the private sector. The default for government systems has been that the people who built them stick around and maintain them until they retire. For DoD the biggest risk isn't old technologies and changes in the private sector, it's brain drain when people retire.

You might ask yourself, what 20-something wants to work for the US Navy, as a civilian, and work on a Java implementation (ported from COBOL) for the rest of their career? Well, I'd like to introduce you to them. These guys could work anywhere but they like the Navy. They like that their work matters, it's important, and it's stable. The pay is below market but it isn't terrible. Keep in mind DoD isn't all in DC, many offices are in inexpensive places to live.

Frankly I'd love to work on systems with the meaningful impact that theirs have (non-combat in this case). Read about 18F and what it was like for the people who joined and how they stuck with it anyway.


Most 20-something developers are likely of the opinion that that work is the opposite of useful. It's actively harmful.


Nothing is really future-proof, though. If it makes it 5 years before any major changes, then it will be good. I don't disagree that eventually this could lead to the same kind of vendor lock-in, but it is also performant, has support, and will work tomorrow with off-the-shelf parts. Of course you could go with OSS, and while that might be a good technical solution, it doesn't necessarily make it a good business decision. AWS is available now, and its performance characteristics are well understood.


There's more here too. Amazon isn't "American" even though its CEO is American and it has many American employees. Multinational corporations owe no loyalty to anyone other than their shareholders; I mean, Amazon the corporation paid nothing in federal taxes last year. What happens when their bottom line contradicts keeping this country safe?

This underlines how so much of the work and contracts at the DoD don't really revolve around defending the US so much as being another way to hand out taxpayer dollars to already well-off corporations.


As if government officials do what's best for the country and not what will allow them to get campaign contributions or make friends with companies so they can leave the government and go into the private sector.


Considering Amazon did all the work, I feel like the AWS target is reasonable. I suspect that the next time this sort of work needs to be done, it will be easier than this round was.


Was this based on the work one IRS engineer did?

https://federalnewsnetwork.com/tom-temin-commentary/2018/01/...

edit: Added link to patent application.

http://www.freepatentsonline.com/y2018/0253287.html


> Wang was working under streamlined critical pay authority the agency has had since its landmark 1998 restructuring. It gave the IRS 40 slots under which it could pay temporary, full-time employees higher than GS rates. Former Commissioner John Koskinen pointed out Congress did not re-up this authority in 2013...

> [Wang] says he applied to become a GS-15 or Senior Executive Service member so he could see through the assembler-to-Java project. But his approval didn’t come through until a week before his employment authority expired.

They lost a guy responsible for software that could save taxpayers how much -- tens of millions of dollars? More? All because they couldn't pay a guy a GS-15 salary, which translates to $100k-140k.

Man, people are really dumb about paying programmers well. They'd rather see a project go off the rails than pay someone (still slightly below) market rates.

Also:

> In many ways, assembler is still excellent for this application. Milholland said of the code, “The assembler is well written. It’s incredibly efficient and effective.” But a shrinking number of people understand it. And it’s not optimized for the online, transaction mode to which the IRS needs to keep moving. Java, relatively inefficient as it may be, is the current standard and has legions of people who know it.

That's like arguing mixing concrete by hand is "more efficient" because doing so allows you to be more frugal with concrete.


> That's like arguing mixing concrete by hand is "more efficient" because doing so allows you to be more frugal with concrete.

No, that's more like saying mixing concrete by hand starts to become attractive because there's only one guy on Earth who knows how to operate a cement mixer. The alternative is crap, but it's easy to find people to do it.


Honest question: do you really think we'd be better off using assembly for run-of-the-mill business logic applications, if there was a workforce that was well trained on it?


Assembly in itself is probably too low-level and doesn't have enough abstractions to be able to develop and maintain a typical business application, but the language itself isn't really the reason. What made companies switch to the cool kids that were Java, Python and Javascript is that the time it takes to learn these languages enough to be productive as a grunt developer is low. It's not so much about the workforce already being well trained or not (there definitely are experts in COBOL and other ancient languages), it's that getting new people up to speed is easy. They will mostly spend their time on the actual business logic instead of stumbling on syntax errors.

This is the reason I believe Go is going to be the new Java: a language with good-enough performance, lots of safeguards preventing sloppiness from breaking things, and a very, very flat learning curve.


I've worked on a similar project; we tested two different approaches:

Recompiling the COBOL codebase on Linux and accessing and loading it as shared objects (via JNI for JVM usage)

Running Hercules ( https://en.wikipedia.org/wiki/Hercules_%28emulator%29 ), an IBM mainframe emulator

The first one has lots of merits, as it allows for a progressive replacement of COBOL functionalities

The second one immediately lowers the TCO for obvious licensing reasons but does not plan for the future, particularly as COBOL developers retire and thus the resources are scarce.

(Anecdotally the customer chose both: recompilation & progressive replacement by Java code for core assets and emulation for "dead", non core, assets dev & staging environments)
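
For anyone curious what the first approach looks like from the Java side, here is a minimal sketch; the library name, program name, and record handling are all invented for illustration, not taken from that project:

    // Hypothetical JNI bridge to a COBOL program recompiled as libpayroll.so.
    // All names here (payroll, CALCPAY, the commarea layout) are made up.
    public class CobolBridge {
        static {
            System.loadLibrary("payroll"); // resolves libpayroll.so via java.library.path
        }

        // The C glue on the native side copies the byte[] into the COBOL
        // linkage-section record, CALLs CALCPAY, and returns its RETURN-CODE.
        private static native int callCalcPay(byte[] commarea);

        public static int calcPay(byte[] record) {
            return callCalcPay(record);
        }
    }

The appeal is that call sites like this can be retired one by one as individual programs get rewritten in plain Java, which is exactly the progressive-replacement path mentioned above.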


The second one immediately lowers the TCO for obvious licensing reasons

How true is that though? They'd still have software licensing, they still need an OS to run on Hercules after all.


Not just the OS, legacy applications tend to rely on DB2, CICS, IMS, and what not. All of those have expensive licenses and cannot readily run on emulation. Switching to Hercules does not solve the issue, unless your cobol apps are barebone batches.


It's been a long while.... At the time, RedHat was still "free".

Linux sysadmins were cheaper than their mainframe colleagues.

Hardware was orders of magnitude cheaper.

Floor occupancy was lower.

We did not go as far as watts consumed per functional task performed, but I doubt it would have tilted the balance.


Sorry I misread your comment!

You're obviously right for licensing inside the emulator! However, the costs are (were?) in hardware & floor occupancy (as in "decommission a whole room" cheaper).


No expensive IBM hardware support/services needed.


IBM won't let you run OS/390, z/OS, etc, in Hercules for free. There's an ancient version of MVS you may be able to run with Hercules, but I'd expect a lawsuit anyway.


Here's the main problem with this: Where are the requirements?

The COBOL doesn't have automated tests. When you port the COBOL to Java (whether automatically or manually) you aren't asking "where are the original requirements?". Contrast this with a "re-write". When you do a re-write you have to go back to the customer and whatever documentation you can find and figure out and re-document what the system should do. This can be far removed from what the system did in its original design docs, and can differ still from what people believe.

This step of codifying tacit knowledge of the behaviour is important in the evolution of systems.


> The COBOL doesn't have automated tests. When you port the COBOL to Java (whether automatically or manually) you aren't asking "where are the original requirements?".

You should be asking where the current documentation of the system specs is, and in some cases you might even get something reasonably current and complete when you ask that, depending on how well used the change management process has been.

> When you do a re-write you have to go back to the customer and whatever original documentation you have lying around and you have to figure out and document what the system does.

Both what it does and (and this is where the customer is involved) what it is currently supposed to do.


The system is the specification. Working software that is 65 years old is an organic and evolved entity that can't be manually ported or updated.

Another possible target for this is the IPPS/OPPS Medicaid/Medicare code calculator system, which was written in COBOL for the IBM 370, mostly by a single developer. Sadly, the developer passed away about 10 years ago, and the system has yet to be updated. The code was open sourced in a bid to gain assistance from the community in updating it. Automation such as this would help significantly.


Got a link to that code? Or more info...?


The info I have on the unfortunate death of the developer was word-of-mouth from people close to the industry when I was working on a project in the domain.

The code and data files are all here, enjoy :)

https://www.cms.gov/Medicare/Medicare-Fee-for-Service-Paymen...


I'm surprised that automatically translated Java code would be easier to maintain than human-written COBOL. Doesn't it become difficult to understand? I suppose 50-year-old COBOL is basically as cryptic as auto-generated Java...


> Phase 2 (12 months): Code advanced refactoring to remove COBOL design overtones

They spent 12 months making it less-COBOLy / autogenerated feeling


This is the part that piqued my interest. I've seen cobol-to-java translated source; it looks like java used as an assembler for cobol. Certainly there's a spectrum of cobol-y-ness. One would be hard pressed to go from cobol to even a rudimentary class/domain business/presentation object oriented architecture, for example. Cobol control flow and data representations are simply too different. I'd like to see what they ended up with.

There are many, many problems with these types of projects. One thing that often happens is leaving the new application in the hands of the legacy team to manage. They often lack even rudimentary skills in managing java runtimes. I saw one team struggle with performance issues that were easily solvable with a connection pool configuration change.
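
To illustrate the kind of one-line-ish fix that was involved, here is a hypothetical sketch of bounding database connections with a pooled DataSource (HikariCP chosen arbitrarily; the URL and numbers are placeholders, not what that team actually used):

    import com.zaxxer.hikari.HikariConfig;
    import com.zaxxer.hikari.HikariDataSource;

    public class PoolExample {
        // A bounded pool instead of opening a fresh JDBC connection per request.
        public static HikariDataSource newPool() {
            HikariConfig cfg = new HikariConfig();
            cfg.setJdbcUrl("jdbc:oracle:thin:@//dbhost:1521/APPDB"); // placeholder URL
            cfg.setMaximumPoolSize(20);        // cap concurrent connections
            cfg.setConnectionTimeout(30_000);  // ms to wait for a free connection
            return new HikariDataSource(cfg);
        }
    }

The point isn't the specific library; it's that teams coming from the mainframe world often haven't had to think about connection lifecycle at all.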

Maybe the worst case with these types of systems is enhancements going forward. You can guarantee there will be no more money spent on modernization, and any changes will be made to the Frankenstein java/cobol source, which will continue to accrue technical debt at an even greater rate.

The main project I saw that did this was eventually abandoned, and luckily the cobol application was still running.


Certainly there's a spectrum of cobol-y-ness. One would be hard pressed to go from cobol to even a rudimentary class/domain business/presentation object oriented architecture, for example. Cobol control flow and data representations are simply too different.

It's still a win if they can go from procedural COBOL to procedural idiomatic Java.

I saw one team struggle with performance issues that were easily solvable with a connection pool configuration change.

Seen it happen before. Someone will step in and charge those hourly consulting fees.

You can guarantee there will be no more money spent on modernization, and any changes will be made to the Frankenstein java/cobol source, which will continue to accrue technical debt at an even greater rate.

Not if the project includes a phase where they start modifying things to increase the degree of idiomatic Java code.


It's still a win if they can go from procedural COBOL to procedural idiomatic Java.

Goto's and "fall through PERFORMS" make idiomatic java impossible.


Seems like gotos would be fairly easy to clear out in the automated refactor. Control flow feels like one of the more trivial parts of automated refactoring, because that's straightforward AST modifications?


Control flow feels like one of the more trivial parts of automated refactoring, because that's straightforward AST modifications?

No. The AST part is wrong. Compilers do decompose code into basic blocks. Even the most spaghettified code decomposes into basic blocks. Any language with tail call optimization can handle the basic blocks rewritten as functions.

That won't get you to "clean" however.
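
To make that concrete: since Java has neither goto nor guaranteed tail calls, mechanically translated control flow often ends up as basic blocks behind a dispatch loop, something like this toy sketch (paragraph names and logic are invented, and this isn't necessarily what the article's tool emits):

    class ParaFlow {
        // GO TO / fall-through PERFORM flow encoded as labeled basic blocks.
        static void run(int amount, int limit) {
            int total = 0;
            int next = 0;                        // 0=PARA-100, 1=PARA-200, 2=PARA-300, -1=exit
            while (next >= 0) {
                switch (next) {
                    case 0:                      // PARA-100: initialize
                        total = 0;
                        next = 1;                // falls through to PARA-200
                        break;
                    case 1:                      // PARA-200: accumulate
                        total += amount;
                        next = (total > limit) ? 2 : -1;  // GO TO PARA-300, or exit
                        break;
                    case 2:                      // PARA-300: reject
                        System.out.println("REJECTED " + total);
                        next = -1;
                        break;
                }
            }
        }
    }

It's faithful and it runs, but nobody would call it clean.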


Even that involves basic blocks.


Possibly only easier to maintain as Java due to developer tools being more recently developed and actively used.


Very interesting write-up, especially for what one might expect to be a straight-up promotional piece. It's nice to hear that Amazon can guarantee the necessary uptime to run a system of such importance.

It'd be interesting to compare this approach to other modernization approaches in the government. For example, the Veterans Administration is moving away from their ancient but popular and decently well-regarded electronic health record system VistA (https://en.wikipedia.org/wiki/VistA) to Cerner.

https://www.hitechanswers.net/va-establishes-office-of-ehr-m...


My father works as an engineer for the VA. The hidden cost of this transition is the need to rewire almost the entire VA hospital system to comply with Cerner networking requirements, at a cost of hundreds of millions of dollars per hospital, before the software can even begin to be deployed.

The other hidden cost is that these ancient software systems are extremely efficient in terms of CPU/memory utilization compared to modern equivalents, so the hardware requirements go up considerably, which impacts power/energy efficiency and direct deployment costs for the replacement systems. Not saying modernization is not a win in the long term, but there are direct and indirect costs with these big transitions.


I wouldn't be surprised if Amazon is a national security component now, and that may be why they don't pay taxes, get their deliveries subsidized, etc. Can the government let them fail if they are critical infrastructure?


Dumb question - how automated is an automated solution that 'refactors COBOL to java' in 18 months? Impressed with the solution / outcome.


At a wild guess, I’d imagine getting something working probably took about 18 hours, with the rest of the time spent writing tests and defining behaviour of the legacy system to prove that the port actually worked


We migrated the core banking system of a small private bank, and that was 12 million lines of COBOL. The one in the article actually seems like a "small" system by comparison. We used the same approach (Java translation). Running the COBOL is actually the easy part. The hard part is when you start throwing in managing files the mainframe way, the database (it needs to quack like DB2), the transaction monitor (CICS), the presentation layer (3270-like), the scheduler, etc. And there is plenty of other technology usually deployed in such shops (IMS, MQ, etc.). And you need to support it all...


How many lines was the migrated version?


"Refactor Redundant Code to remove code redundancies" in the Phase 2 diagram was pretty great. I hope that was intentional.


> At the end of Phase 1, the Java program is “dirty” with remaining COBOL coding practices. At the end of Phase 2, the Java program is “clean” without COBOL remnants.

Great use to match up with the dirty Java logo. I’m sure they had a lot of fun designing that, and this dirty/clean terminology will probably stick in other source to source transformations.


I've done something like this using syntax-directed translation. One can proceed by changing the transformation granularity. First, you have naive direct transliterations plus scaffolding libraries, which result in non-idiomatic target-language code. (In the article's case, this would be non-idiomatic, machine-produced Java.) It runs and passes some of the tests.

Next step is to fix the bugs, and get all the tests to run.

The last step, to get to "clean" involves going through the code base, looking for project-specific idioms, then hand-tuning the syntax matching engine with rules that output idiomatic target language code. This might also involve refactorings to make the scaffolding libraries more idiomatic as well.

Since one is coding on the transformation engine, instead of directly on the application itself, production changes and maintenance can proceed, and no code freeze is needed.
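
For a feel of what a tuning rule can look like, here's a toy sketch; the Node type and both rules are invented for illustration and are not the engine described above:

    // Minimal made-up IR node: a statement type plus named arguments.
    class Node {
        String type;
        java.util.Map<String, String> args = new java.util.HashMap<>();
        String arg(String k) { return args.get(k); }
    }

    interface RewriteRule {
        boolean matches(Node n);
        String emit(Node n);
    }

    // Generic first-pass rule: every COBOL MOVE becomes a raw assignment.
    class MoveRule implements RewriteRule {
        public boolean matches(Node n) { return "MOVE".equals(n.type); }
        public String emit(Node n) { return n.arg("target") + " = " + n.arg("source") + ";"; }
    }

    // Project-specific rule added during tuning: MOVE SPACES TO X becomes
    // an idiomatic string reset instead of a literal blank-fill.
    class MoveSpacesRule implements RewriteRule {
        public boolean matches(Node n) { return "MOVE".equals(n.type) && "SPACES".equals(n.arg("source")); }
        public String emit(Node n) { return n.arg("target") + " = \"\";"; }
    }

Most of the "clean" phase is adding narrower rules like the second one, which win over the generic ones when they match.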


Alternate title: How the DOD was locked into depending on single vendor

(this rarely works out well for taxpayers in the long term..)


What is "architectural future state requirement"? Why didn't the emulation idea work? Isn't that the industry standard for moving old code to new hardware?


I have limited experience with mainframe emulation, but from what I've seen, trying to do it in a live business environment is challenging at best because all of that old mainframe software was written with a lot of assumptions that don't apply to non-mainframe environments. Things like blazing-fast I/O, virtually guaranteed uptime of the mainframe, etc.

As one example, when I was a systems engineer, I got a call from a team that was trying to run a business application from z/OS in emulation on an x86 server. It was running incredibly slowly, even on a fairly beefy server. I looked in its data directory, and there were hundreds of thousands of individual files in a single directory. Not a tree. One directory. That might work fine on a mainframe, but just doing a dir or ls (depending on the host OS) took something like ten minutes.

Similarly, I remember some folks being dumbfounded at the idea that their batch processes needed error-handling when they were doing things that involved talking over the network, because when everything was running within that one giant IBM system, failures were pretty rare.

Those both sound like they should be easy to tack on when running in an emulator, but they're just the tip of the iceberg. When you have code that's 30-50 years old, written for a vastly different platform, there are going to be a lot of gotchas like that.


The authors emphasized in multiple places that moving off COBOL was a first-order project requirement.

I did find it interesting that the original system, and delivered overhaul, both ran Oracle databases. Overall transactional volume is low (500k/day). Perhaps the DoD has significant investments in stored procedures? Or perhaps Aurora is not ready for prime time in AWS Gov Cloud? (It’s no secret that AWS really struggles with keeping feature parity between Gov and commercial.)


The DoD has a significant investment in Oracle by name. In over a decade working around their systems, I've yet to encounter anything that couldn't be served by a mid-sized PG instance instead. The VAST majority of systems don't even register on 'performance' thresholds - they serve 5-50k users, almost entirely internal, etc. The 'bigness' of the program is its importance, rather than its actual technical scale.


But there seems to be no business purpose justifying that requirement. They spent 30 months writing and refining a COBOL-to-Java transpiler instead.

Maybe the business purpose was to waste time and bill clients?


Which requirement? Moving off COBOL?

They mentioned numerous times that COBOL work was harder to staff and maintain.

Seems like a good reason to move to me.


I think that was somewhat implied in the article:

> "Difficulty finding qualified, affordable COBOL programmers."

Continuing to emulate the COBOL environment doesn't quite solve the problem that finding COBOL developers is becoming harder and harder.


Amazon, and AWS in particular, sure seems all in on building technology for the military and police.

Anyone else experience some cognitive dissonance in "the world's most customer centric company" also someday powering the facial recognition systems used by customs, border patrol, police, and military systems?

It seems ripped straight of a cyberpunk novel to me.


I, as a single person working on a maker conference thing, did a facial recognition system that relied on no GPU or internet anything. I did this in 2015.

That's just a lone hacker doing shit outside of work... And my solution can go through directories of images, realtime video stream, videos, and ipcams.

So yeah the cat is out of the bag. The bag was shredded, and the cat has had 3 litters of kittens.

Any video system could be retrofitted as a facial recognition system... Or even process videos after the fact.


Personally, I don't see any problems with providing services (including facial recognition) to the government and its structures.

disc: I'm an Amazon employee, not working on any of the facial recognition work. I'm also an immigrant, ~5 years in the USA.


Sure. I think that's a valid position. It's just not what comes to mind first when I think about Amazon as a retail company. (I worked for AWS for quite a while as well.)


The vast majority of the general public has no idea that Amazon is a logistics company (I'm including AWS in that definition) that also happens to do retail stuff. It's similar to how GM was for a long time a finance company that also happened to make cars.


Police are a great way to get into government business, as the paramilitary structure allows decisions to be made.


The software is going to be written anyway, I'd rather they rely on existing services and infrastructure to reduce taxpayer costs.


Right now it's not super scary, but in 10 or 20 years it's conceivable that Amazon becomes effectively a state-run entity like the telecoms are now, and then you could start imagining all sorts of awful cyberpunk storylines.


It's not that far-fetched of an idea IMO; considering there is talk in the government about breaking AMZN up, buddying up to the gov might be a good defense against that.


My main question is why this can't be done incrementally. The system can almost always be broken down into functional areas. Does Agile just not apply here?


From a national security point of view it seems a few well placed bombs in about 10 data centers will take out most of the US computing power.

I'm a bit surprised DoD encourages the consolidation, though maybe once the software is cloud enabled it will be more portable to small sites in the future.


> From a national security point of view it seems a few well placed bombs in about 10 data centers will take out most of the US computing power.

AWS has 13 US-based availability zones, each of which has, as I understand, multiple DCs.

GCP has a similar architecture with 16 US-based zones and 3 more standing up imminently.

Azure has a similar architecture, but separate AZs within regions are newer for Azure, and I can't find detailed information as readily, but they look to have 8 US regions, at least four of which are multi-AZ, so at least 12 zones, each with one or more DCs.

And that's 41 zones, many of which have multiple DCs, with just three big public cloud providers; I don't think 10 DCs gets you most of the computing power.


This project resides in GovCloud, which is just `us-gov-west-1` right now, I believe.


Looks like I am incorrect, us-gov-east exists as well. My project only has access to west right now. https://aws.amazon.com/blogs/aws/aws-govcloud-us-east-now-op...


> This project resides in GovCloud, which is just `us-gov-west-1` right now, I believe.

GovCloud East is also up, but the GovCloud West region's 3 AZs plus GovCloud East aren't all of US (or even “US government on public cloud”) compute (AWS also has Secret Region, Azure has two government non-secret regions and two secret regions; GCP doesn't have a government region but federal government agencies use regular commercial GCP regions—that’s true of AWS and Azure, too, the government offerings are approved for high-impact use whereas the general offerings have a lower-impact approval; Google is seeking to also get the High impact approval for their commercial offering, rather than having a distinct region IIRC.)


I don't know how you define datacenter. An AZ could be a different building on the same campus. These are Google's main locations. As you can see there are only 9:

https://www.google.com/about/datacenters/inside/locations/in...


AWS AZs are really separate from each other (separate power and network)

Quoting from the docs [1]:

Each AZ can be multiple data centers (typically 3), and at full scale can be hundreds of thousands of servers. They are fully isolated partitions of the AWS Global Infrastructure. With their own power infrastructure, the AZs are physically separated by a meaningful distance, many kilometers, from any other AZ, although all are within 100 km (60 miles) of each other.

All AZs are interconnected with high-bandwidth, low-latency networking, over fully redundant, dedicated metro fiber providing high-throughput, low-latency networking between AZs. The network performance is sufficient to accomplish synchronous replication between AZs.

1 - https://aws.amazon.com/about-aws/global-infrastructure/regio...



