Some consumers will want the latest and greatest content. To please everyone (other than the owner) you'd need to look at the content across time, versions, alternate world views, and so on. Thus "deep".
My central use case is that I might 'scrape' content from sources such as
https://en.wikipedia.org/wiki/List_of_U.S._states_and_territ...
and have the process be "repeatable" in the sense that:
1. The system archives the original inputs and the process used to create the refined data outputs.
2. If the inputs change, the system should normally be able to download the updated inputs, apply the same process, and produce good outputs.
3. If something goes wrong, there are sufficient diagnostics and tests to show that invariants are broken, or that the system can't tell how many fingers you are holding up.
4. And in that case you can revert to "known good" inputs; a minimal sketch of that loop follows this list.
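Concretely, the whole archive-refine-validate-revert loop can be quite small. Here is a minimal sketch in Python; the URL (the real one is truncated above), refine(), and the row-count invariant are hypothetical stand-ins, not a definitive implementation:

    # Hypothetical sketch: SOURCE_URL, refine(), and the row-count
    # invariant are placeholders, not the real pipeline.
    import hashlib
    import urllib.request
    from pathlib import Path

    ARCHIVE = Path("archive")                   # immutable store of raw inputs
    SOURCE_URL = "https://example.org/source"   # placeholder for the real source

    def fetch_and_archive(url):
        """Step 1: download the input and file it under its content hash."""
        raw = urllib.request.urlopen(url).read()
        digest = hashlib.sha256(raw).hexdigest()
        ARCHIVE.mkdir(exist_ok=True)
        path = ARCHIVE / (digest + ".html")
        path.write_bytes(raw)
        return path

    def refine(raw):
        """Step 2: apply the (versioned) process to the raw input."""
        return {"rows": raw.count(b"<tr")}      # stand-in for real parsing

    def invariants_hold(product):
        """Step 3: diagnostics that catch obviously broken outputs."""
        return product["rows"] >= 50            # e.g. expect ~56 states/territories

    def run(known_good=None):
        latest = fetch_and_archive(SOURCE_URL)
        product = refine(latest.read_bytes())
        if invariants_hold(product):
            return product
        if known_good is None:
            raise RuntimeError("new input failed validation; no known-good fallback")
        return refine(known_good.read_bytes())  # Step 4: revert to archived input

The point is the shape: every raw input is archived by content hash before any processing, so "revert to known good" is just re-running refine() on an earlier archive entry.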
I am thinking of data products here, but even if the 'product' is a paper, presentation, or report that involves human judgements, there should be a structured process to propagate changes.