BrannonKing's comments | Hacker News

Note: please don't turn your screenshots and digital art into JPG. JPEG compression is tuned for natural photographic images. It works well for photos, but it's the wrong choice where lossless, run-length-style encoding does much better (e.g. screenshots). Black text (or cartoon art) on a white background always looks lousy when converted to JPG.
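
To see the difference yourself, here's a quick sketch using Pillow (the file names are placeholders; any text-heavy screenshot will do):

    import os
    from PIL import Image

    # Save the same screenshot both ways and compare sizes and artifacts.
    img = Image.open("screenshot.png").convert("RGB")
    img.save("out.png", format="PNG", optimize=True)   # lossless
    img.save("out.jpg", format="JPEG", quality=85)     # lossy, DCT-based

    for path in ("out.png", "out.jpg"):
        print(path, os.path.getsize(path), "bytes")

For flat-color, sharp-edged images the PNG is typically smaller and artifact-free, while the JPEG shows ringing around the text at any quality setting.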


What should we use instead?


For final export? AVIF, JPEG XL, maybe even WebP (lossless mode).

PNG kinda sucks for high-resolution stuff because decoding is extremely slow. The way PNG does lossless compression also only really works for flat graphic design; anything with gradients or texture blows up the file size.


>> What should we use instead?

> AVIF, JPEG XL, maybe even WebP (lossless mode).

Definitely not lossless AVIF. It is less efficient than lossless WebP. WebP is supported everywhere, but is not much more efficient than optimized lossless PNG. Lossless JPEG XL has the best lossless compression but can't be used for web without fallbacks.

So, for offline archival: JPEG XL.

For web use without fallbacks: lossless WebP.

For web use with lossless WebP fallback: JPEG XL.


Generation Loss: JPEG, WebP, JPEG XL, AVIF [1]

Generation loss: FLIF vs WebP vs BPG vs JPEG vs MozJPEG [2]

[1] - https://www.youtube.com/watch?v=FtSWpw7zNkI

[2] - https://www.youtube.com/watch?v=YKmhZJ8H1Fc


PNG


This helps me: don't do social media or any news or any thumb-memory-based anything in the morning. Don't grab your phone first thing. Just let your mind slowly warm up to work -- the legit work that you need to do. When your brain is pulled in many directions it feels overwhelmed, and starting the day that way compounds it.


What I want: take a scan/photo of a document (including a full book), pass it to the language model, and get back a LaTeX document that matches the original exactly (minus the copier/camera glitches and angles). I feel like some kind of reinforcement learning model would be possible for this. It should be able to learn to generate LaTeX that reproduces the exact image, pixel for pixel (learning which pixels are just noise).
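
A rough sketch of the reward signal I have in mind, assuming the candidate LaTeX gets compiled with pdflatex, rasterized with pdftoppm, and compared to the scan with Pillow and NumPy (the helper names are made up for illustration):

    import pathlib
    import subprocess
    import tempfile

    import numpy as np
    from PIL import Image

    def render_latex_to_image(latex_source: str, dpi: int = 150) -> Image.Image:
        """Compile a one-page LaTeX document and rasterize it (needs pdflatex and pdftoppm)."""
        with tempfile.TemporaryDirectory() as tmp:
            (pathlib.Path(tmp) / "doc.tex").write_text(latex_source)
            subprocess.run(["pdflatex", "-interaction=batchmode", "doc.tex"],
                           cwd=tmp, check=True, capture_output=True)
            subprocess.run(["pdftoppm", "-r", str(dpi), "-png", "doc.pdf", "page"],
                           cwd=tmp, check=True, capture_output=True)
            return Image.open(pathlib.Path(tmp) / "page-1.png").convert("L").copy()

    def pixel_reward(candidate_latex: str, scan: Image.Image) -> float:
        """Reward = negative mean absolute pixel difference against the scanned page."""
        rendered = render_latex_to_image(candidate_latex).resize(scan.size)
        diff = np.abs(np.asarray(rendered, float) - np.asarray(scan.convert("L"), float))
        return -float(diff.mean())

Real training would need something smarter than a raw pixel difference (page alignment, a noise model), but that's the render-and-compare loop I'm imagining.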


A big difficulty there is typeface detection; some of these were never digital fonts. But even if it could detect them, you likely don't have those fonts on your computer, so you couldn't reassemble the document as a digital typesetting for anything but the most trivial fonts.


The tool could include all known open-source fonts, and for the rest a model could maybe recreate the missing non-patented fonts. While font files (.ttf, .otf, .woff, etc.) are copyrighted, typeface designs usually do not have design patents, so tracing and re-creating them is usually not an issue as far as I'm aware (not a lawyer). [1]

Though if it accidentally "traces" one of the few exceptions, then you've potentially committed a crime, and the big difficulty in typeface detection you mention increases those odds. That said, there are so few exceptions that even if the model couldn't properly identify a font, it might be able to identify whether a font is likely to have a design patent.

I do think getting an AI to create a high quality vector font from a potentially low-res raster graphic is going to be quite challenging though. Raster to vector tools I've tried in the past left a bit to be desired.

1. https://www.copyright.gov/comp3/chap900/ch900-visual-art.pdf

> As a general rule, typeface, typefont, lettering, calligraphy, and typographic ornamentation are not registrable. 37 C.F.R. § 202.1(a), (e). These elements are mere variations of uncopyrightable letters or words, which in turn are the building blocks of expression. See id. The Office typically refuses claims based on individual alphabetic or numbering characters, sets or fonts of related characters, fanciful lettering and calligraphy, or other forms of typeface. This is true regardless of how novel and creative the shape and form of the typeface characters may be.

> There are some very limited cases where the Office may register some types of typeface, typefont, lettering, or calligraphy, such as the following:

> • Pictorial or graphic elements that are incorporated into uncopyrightable characters or used to represent an entire letter or number may be registrable. Examples include original pictorial art that forms the entire body or shape of the typeface characters, such as a representation of an oak tree, a rose, or a giraffe that is depicted in the shape of a particular letter.

> • Typeface ornamentation that is separable from the typeface characters is almost always an add-on to the beginning and/or ending of the characters. To the extent that such flourishes, swirls, vector ornaments, scrollwork, borders and frames, wreaths, and the like represent works of pictorial or graphic authorship in either their individual designs or patterned repetitions, they may be protected by copyright. However, the mere use of text effects (including chalk, popup papercraft, neon, beer glass, spooky-fog, and weathered-and-worn), while potentially separable, is de minimis and not sufficient to support a registration.

> The Office may register a computer program that creates or uses certain typeface or typefont designs, but the registration covers only the source code that generates these designs, not the typeface, typefont, lettering, or calligraphy itself. For a general discussion of computer programs that generate typeface designs, see Chapter 700, Section 723.


Did you try mathpix? Not sure about full pages, but it is pretty good at eqn


Try fasting. Go 24 hours without food, with little to drink (water only) or nothing at all. Do this once a month when you don't have cancer, and maybe more when you do. The theory behind it: some cancers such as Chordoma are reliant on mTOR, which fasting inhibits or modulates. (This is also why Rapamycin is being researched for cancer treatment, though its mTOR effect is mild.) Theory part 2: tumors tend to hold a lot of glycogen, which is unavailable when fasting. The body still needs it, so it will pull some from tumors if necessary.


I went back to school for a PhD at the age of 41. I was burned out on programming, I was lacking in theory, and I was ready to see the other side of the country for a few years. I'm now in my fifth year of that. It might take me another year for the PhD or it might not happen, but it has been a great adventure either way. It has opened up some other opportunities for me and helped me to recognize the areas of programming that I truly enjoy.


Did you need to build a large nest egg first?


I sold a house when I moved and used that equity money to make the house payment at my new location while in school. If you don't have a lot of kids and/or your spouse brings in just a little money, you can make it okay on the graduate assistantship alone. My assistantships have been paying $2500/mo. The problem right now is that housing in the USA is very expensive almost everywhere, so you'll need a smart plan to deal with that.


I've pondered on this question some. Is there such a thing as an objective talent rating? For example, if you take a group like Dream Theater, and look at the skill of each member of that group vs the skill of each member in the typical heavy-metal band, there is a distinct and obvious difference in skill level. Whether or not you like their style or their music's message, Dream Theater's skill is astounding. And you don't have to like all of their songs to appreciate some of them. But the difference might not be noticeable to someone who hasn't attempted to master a musical instrument. Such a person might just go with the flavor of the day; they won't be able to incorporate the amount of work that went into that flavor into their appreciation of it. (Dream Theater's singer has lost his vocal prowess over the years, which is sad but not unusual.)

In a recent concert (at Carnegie Hall), Lang Lang brought out a melody in Chopin's first Scherzo that was subtle yet very innovative. If you have any appreciation at all for Chopin, you would want to hear that rendering. It made a showpiece into something both beautiful and astounding. But can a machine be made to tell the difference? The piece is so fast and the notes so blurred that it's almost noise from a reverse-engineering/transcription standpoint.

Similarly, Sarah Brightman (in her early years) and Loreena McKennitt have far superior vocal skills to the typical modern pop singer. But if you're never exposed to real vocal skills, or if you've never attempted to sing a pure open vowel on pitch and hold it there for a sustained amount of time, why would you care? You wouldn't even know what to appreciate. So is it the algorithm's job to expose people to artists of higher skill and talent? I would like it to. That's absolutely a feature I want. I want to be exposed to those people. I'd like it to go even beyond that. I want to be exposed to talented composers regardless of performer.


37 seems like a high number; turn some of them into apartments instead. Put shops on the bottom floor like they do in Europe.


So who manages PyPI? This document seemed vague on that. Maybe that's the problem with PyPI's progress in life.

Most packages on PyPI are complete crap. It's also heavily burdened with domain-specific applications and one-off student projects. They have no standards for what makes a useful package, and no ranking system aside from number of downloads. I think package maintainers should be required to push an update every other year or have their package get dropped. I think frameworks should be separate from applications. I think packages without a lot of downloads should rely on endorsements and code-cleanliness metrics.


PyPI’s policies are here: https://policies.python.org/pypi.org/Acceptable-Use-Policy/

Outside of abuse, PyPI does not impose editorial standards on packages. That would take an incredible amount of additional work, and it’s not clear to me that it would be “better”. How much does it really matter if there’s a university student project on there with virtually no downloads?

“I think package maintainers should be required to push an update every other year or have their package get dropped.”

Sometimes libraries really are “finished” - if you go through your dependency stack you may find a surprising number of packages with no new releases in the past 12 months, because they didn’t need a release.

I tried that myself just now; here are some of the packages I found that haven't had a release in a few years:

    decorator               2022-01-07
    rfc3986                 2022-01-10
    aiosignal               2022-11-08
    colorama                2022-10-25
    h11                     2022-09-25
    jmespath                2022-06-17
    mdurl                   2022-08-14
    rsa                     2022-07-20
    mergedeep               2021-02-05
    dictdiffer              2021-07-22
    janus                   2021-12-17
    conda-content-trust     2021-05-12
    six                     2021-05-05
    uritemplate             2021-10-13
    pytest-clarity          2021-06-11
    ptyprocess              2020-12-28
    backcall                2020-06-09
    text-unidecode          2019-08-30
    PySocks                 2019-09-20
    sphinxcontrib-jsmath    2019-01-21
    pprintpp                2018-07-01
    homebrew-pypi-poet      2018-02-23
    pickleshare             2018-09-25
    webencodings            2017-04-05
Script here: https://gist.github.com/simonw/6165948ce595d74c767ce2bce8465...
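
Not the script in the gist, but roughly the same check can be sketched with the standard library plus PyPI's JSON API (date handling simplified here):

    import json
    import urllib.request
    from importlib.metadata import distributions

    def latest_release_date(package: str) -> str:
        """Upload date of the newest release, via PyPI's JSON API."""
        url = f"https://pypi.org/pypi/{package}/json"
        with urllib.request.urlopen(url) as resp:
            data = json.load(resp)
        dates = [
            f["upload_time"]
            for files in data["releases"].values()
            for f in files
        ]
        return max(dates)[:10] if dates else "unknown"

    # Print every installed package whose newest release predates 2023.
    for dist in sorted(distributions(), key=lambda d: d.metadata["Name"].lower()):
        name = dist.metadata["Name"]
        try:
            date = latest_release_date(name)
        except Exception:
            continue
        if date < "2023":
            print(f"{name:24}{date}")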


Should there be an expectation of a package being particularly useful to be in a package repository?

You see the same in other places like npm or docker repositories and it is not a problem.

Manually checking things is very much out of scope for an open-source package service like this. Limiting it by arbitrary metrics like code cleanliness would also just give a false sense of quality. One thing that'd make sense to me would be asking for confirmation that the upload isn't better suited to TestPyPI than the main index. Not sure whether the tools already do that or not.

The major problems being somewhat worked on now are typosquatting, names taken up by old packages, and other security considerations around PyPI. Random packages being useless (or malware) doesn't fall under that in my mind, as you just won't (or shouldn't) be downloading completely random things.

Admittedly there isn't as much manpower dedicated to it as I think there should be, especially after seeing how much admin there is in the PSF with the recent CoC debacle.


I think you’re confusing two things: PyPI has maintainers and administrators, but that doesn’t mean that it’s a curated index. Like RubyGems, NPM, Cargo, etc., PyPI explicitly does not present a curated view of the packaging ecosystem. Doing so would require orders of magnitude more staffing than the index already has.

Python as a community prefers standards over implementations, which is why you could easily stand up your own curated alternative to PyPI if you wanted to. But I think you’ll discover that the overwhelming majority of users don’t want their resolutions breaking just because a particular package hasn’t needed an update in the last 6 months.


GLPK or GLPK_MI? GLPK is probably the worst tool you could pick. I tried it before on some of my problems, and it never finished. Use OR-Tools if it will work for your problem, or, if you really need a free tool, one of the other free solvers available through CVXPY may help you (https://www.cvxpy.org/tutorial/solvers/index.html).

GLPK should not be taken as representative of the general field. The commercial solvers will do infinitely better than GLPK.
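
As a rough sketch of what switching looks like, here's a toy mixed-integer model in OR-Tools' Python wrapper (the SCIP backend ships with OR-Tools; the constraints and objective are placeholders):

    from ortools.linear_solver import pywraplp

    # Create a solver backed by SCIP, which ships with OR-Tools.
    solver = pywraplp.Solver.CreateSolver("SCIP")

    # Two integer decision variables with simple bounds.
    x = solver.IntVar(0, 10, "x")
    y = solver.IntVar(0, 10, "y")

    # Placeholder constraints and objective.
    solver.Add(2 * x + y <= 14)
    solver.Add(x + 3 * y <= 15)
    solver.Maximize(3 * x + 4 * y)

    status = solver.Solve()
    if status == pywraplp.Solver.OPTIMAL:
        print("objective =", solver.Objective().Value())
        print("x =", x.solution_value(), "y =", y.solution_value())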


This book covers software engineering concepts that I never saw or heard anywhere else: "Expert C# Business Objects". Its concepts are language-agnostic, even though it has C# examples. You might try some videos by the author of that book too.

Related to this book is what we call "vertical feature slicing". (I'd be curious to know what other books cover this topic.) There are some YouTube videos on the topic. There was a great 2-hour video from 20 years ago that influenced me; unfortunately, I don't remember the title and cannot find it today.

The video "Simple Made Easy" had a profound influence on my programming: https://www.infoq.com/presentations/Simple-Made-Easy/

