Note -- these scripts disallow commercial use in their default license, and this page is separate from the widely-used ImageMagick library, which has a different, more permissive (though also sui generis) license
Interesting. I've never thought of pre-built CLI commands as something that would be licensable. I'm not sure why; it's not particularly different from an NPM module or something, I suppose. I can't put my finger on it, but it feels super weird to me.
Because with a programming language, you can learn how something works and implement it yourself without copying the implementation. Your implementation will have a similar underlying algorithm but enough differences that it wouldn't be a copy. The algorithm (which is probably not copyrightable) and the implementation (which is) are separate. You can use the implementation, or you can learn the algorithm and implement it yourself.
With CLI tools, however, there is only one way of doing most things. There is no OOP vs functional, no loop vs comprehension, no choice of data structure, nothing. So commands being copyrightable would imply that no one can do certain actions with the CLI without a license, because the only way to do those actions is with the exact command that was copyrighted.
So that means once you do something in a CLI tool and copyright it, it's game over for everyone else using said tool? I understand what you're saying, but I'm not sure that tracks. Technically you could argue that the tool's license also includes every combination of parameters, since the authors implemented them.
But if a command language is Turing complete, or some subset of it is, then I guess there is more than one way of doing something and the copyright is fair game: you can always reimplement it.
But couldn't using the script to learn the algorithm and write a commercial version also be a type of commercial use? I think we can never be too sure of anything when it comes to this lawyer stuff.
Patently wrong. Have you ever heard of the GNU operating system, which avoided copyright with a clean-room rewrite of old Unix tools? There's also the BusyBox coreutils on GPLv2 instead of v3.
911 requests, 14 MB, 43 seconds download time (HTML ready in 2s though). It's at the bottom of the front page, but the server is either already struggling or doesn't have much upload capacity to begin with.
I have a workflow that depends on complex ImageMagick commands that can take several minutes to run. GraphicsMagick is faster, but ImageMagick produces much higher quality results for my use case, so I stuck with it. Since they forked 20 years ago, neither is a drop-in replacement for the other and each has its own strengths.
I’ve been able to make larger image montages in less time with gm compared to im on my MacBook, so I switched. I was surprised by this and could be doing something wrong, but others around me had the same experience with similar workloads.
I still use GraphicsMagick. In the past it was a ton faster; apparently ImageMagick changed how they do things, so you can’t do a direct comparison any more.
Neither seems able to decompose an input image into smaller "tile" images.
I have some very large TIFFs that I cannot work with (I can't even open them in some cases!) and I would like to run them through a mill that can "paginate" the larger image into a number of smaller ones.
Take a look at rasterio in Python (best installed using Conda because you need a few tricky dependencies like GDAL). It's designed for geospatial data that can be too large for RAM. You usually open data windows that are smaller views into the image. Ideally you'd have a geotiff but I don't see why it couldn't open a normal one.
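To make the windowed approach concrete, here is a minimal sketch of the tile-grid arithmetic in plain Python, with the rasterio calls shown only as comments (so it runs without GDAL installed). The file names and the 4096-pixel tile size are made-up examples, not anything from rasterio's defaults.

```python
# Sketch of windowed tiling for images too large for RAM. The grid
# arithmetic below is plain Python; the commented section shows where
# rasterio's Window would plug in. "big.tif" is a hypothetical input.

def tile_windows(width, height, tile_w, tile_h):
    """Yield (col_off, row_off, w, h) tuples covering the whole image.
    Rightmost and bottom tiles are clipped to the image bounds."""
    for row_off in range(0, height, tile_h):
        for col_off in range(0, width, tile_w):
            w = min(tile_w, width - col_off)
            h = min(tile_h, height - row_off)
            yield col_off, row_off, w, h

# With rasterio, each tuple becomes a read window (untested sketch):
#
#   import rasterio
#   from rasterio.windows import Window
#   with rasterio.open("big.tif") as src:
#       for col, row, w, h in tile_windows(src.width, src.height, 4096, 4096):
#           block = src.read(window=Window(col, row, w, h))
#           # ... write `block` out as its own smaller TIFF

if __name__ == "__main__":
    # A 10000x7000 image in 4096x4096 tiles -> 3x2 grid with clipped edges
    tiles = list(tile_windows(10000, 7000, 4096, 4096))
    print(len(tiles))   # 6
    print(tiles[-1])    # (8192, 4096, 1808, 2904)
```

Because only one window's worth of pixels is in memory at a time, this scales to TIFFs far larger than RAM, matching how rasterio's windowed reads are meant to be used.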
I've used a variant of this code to tile large aerial surveys for processing. The snippet needs a bit of cleaning but it works out of the box:
Not sure this is of help for enormous files, but I slapped this ffmpeg script together with a friend a while ago for a similar need to split images for printing. Might point you in a direction at least. Excuse the virtue-signaling vibe on the repo; it was a while ago.
«If the x and y offsets are omitted, a set of tiles of the specified geometry, covering the entire input image, is generated. The rightmost tiles and the bottom tiles are smaller if the specified geometry extends beyond the dimensions of the input image.»
Is there a script example of de-duping a thousand photos, including ones that have been trimmed or re-sized?
I believe ImageMagick includes some of the hash-based methods, which often do a DCT reduction and then a 2D similarity analysis, emitting a hash code you can run a Hamming-distance comparison on. But that's a long way off from practical use to actually find, stack, rename, and identify the one golden copy to keep among all the dupes.
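The hash-and-Hamming-distance idea can be sketched in a few lines. This uses the simpler "average hash" rather than the DCT-based hash mentioned above, and assumes the image has already been downscaled to an 8x8 grayscale grid (real implementations use a library like Pillow for that step), so it needs no image dependencies.

```python
# Toy perceptual-hash sketch: average hash over a flattened 8x8 grid,
# compared via Hamming distance. Small distance ~= near-duplicate.

def average_hash(pixels):
    """pixels: 64 grayscale values (flattened 8x8 grid). Returns a
    64-bit int with one bit per pixel: 1 if above the mean brightness."""
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits

def hamming(h1, h2):
    """Count of differing bits between two hashes."""
    return bin(h1 ^ h2).count("1")

if __name__ == "__main__":
    img = [10] * 32 + [200] * 32    # top half dark, bottom half bright
    near = img[:]
    near[0] = 180                   # one pixel nudged past the mean
    far = [200] * 32 + [10] * 32    # inverted image
    print(hamming(average_hash(img), average_hash(near)))  # 1  -> near-dupe
    print(hamming(average_hash(img), average_hash(far)))   # 64 -> different
```

In practice you'd bucket photos whose distance falls under some threshold (a handful of bits) and then apply the kind of precedence rules discussed below to pick which copy to keep; note that average hash, like many hashes, ignores color.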
Deduplication is a hard problem: what constitutes a "dupe" is somewhat arguable.
My first approach just used metadata, but that doesn't aggregate files stripped of their original metadata, like what you get from a Google Takeout, so I had to add image hashing as well. I actually generate three mean hashes in L*a*b* color space for PhotoStructure (many hashes ignore color). I've also found that metadata needs to be normalized, including captured-at time and even exposure metadata. It's a lot of whack-a-mole, especially as new cameras and image formats are released every year.
I described more about what I've written for PhotoStructure (which does deduplication for both videos and images) here: https://photostructure.com/faq/what-do-you-mean-by-deduplica... -- it might help you avoid some of the pitfalls I've had to overcome.
Thank you. It looks interesting. I was heading to much the same place on the order of precedence for matches, quietly wondering if there is a class of bad edit that makes the post-modified file bigger, not smaller. Seems unlikely but not impossible.
A lot of my dupes are google dupes but across about 4 cameras with a mixture of original/compressed size.
A lot of my local copies had jhead run on them to "fix" the time, so they have modified EXIF.
A small number have me playing with IPTC to try to auto-name things for tag matching.
Your program looks to be the one which understands the corner cases.
If you are open to alternatives, there are tools like (macOS) https://macpaw.com/gemini that scan for duplicates and can find similar pictures. I would assume cropped photos are found unless they are VERY cropped.
Fred's ImageMagick Scripts - https://news.ycombinator.com/item?id=16668254 - March 2018 (100 comments)
Fred's ImageMagick Scripts - https://news.ycombinator.com/item?id=8912591 - Jan 2015 (13 comments)