
We use GNU Unifont in Solvespace for the text window/property browser. It's built right into the executable. This turned out to be amazingly useful. Some people have CJK stuff in their designs and it "just works" on all platforms. I was also looking into hole annotations in CAD and was pleased to see the symbols for counter-bore and counter-sink are both already there in unifont.

You can see unifont in the experimental web version here: https://cad.apps.dgramop.xyz/



I just requested access to the database, @freediver, so hopefully it will be integrated into https://hcker.news soon.

I appreciate Kagi's community-driven approach. The open Small Web list[0] is invaluable. Applying a smallweb filter[1] on HN brings a breath of fresh air to the frontpage.

0: https://github.com/kagisearch/smallweb

1: https://hcker.news/?smallweb=true


This was recently shared on HN: https://visualrambling.space/dithering-part-1/

For anyone interested in seeing how dithering can be pushed to the limits, play 'Return of the Obra Dinn'. After that, dithering will always remind you of this game.

- https://visualrambling.space/dithering-part-1

- https://store.steampowered.com/app/653530/Return_of_the_Obra...


Chartjunk. It took near-zero effort to find better charts all around this interesting and heavily researched topic, along with the real papers, per standard HN preference.

- https://www.zianet.com/wrucker/the%20energetic%20cost%20of%2...

- https://www.sciencedirect.com/science/article/pii/S258884042...

- https://www.nature.com/articles/ncomms1350


In a similar vein, this is one of the most interesting things I’ve come across on HN over the years:

https://www.linusakesson.net/programming/pipelogic/index.php

Past HN post: https://news.ycombinator.com/item?id=15363029


I encourage everyone with even a slight interest in the subject to download a random sample of Common Crawl (the chunks are ~100MB) and see for yourself what is being used for training data.

https://data.commoncrawl.org/crawl-data/CC-MAIN-2025-38/segm...
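
If you'd rather script it, here's a minimal Python sketch. The per-crawl wet.paths.gz index is my assumption from Common Crawl's published layout, so verify against their docs before relying on it (it decompresses one chunk in memory, which is fine for a one-off look):

    import gzip, random, urllib.request
    crawl = "https://data.commoncrawl.org/crawl-data/CC-MAIN-2025-38/"
    # Each crawl publishes an index of its extracted-text (WET) segments.
    with urllib.request.urlopen(crawl + "wet.paths.gz") as r:
        paths = gzip.decompress(r.read()).decode().splitlines()
    # Grab one ~100MB chunk at random and skim the plain-text records.
    with urllib.request.urlopen("https://data.commoncrawl.org/" + random.choice(paths)) as r:
        text = gzip.decompress(r.read()).decode("utf-8", "replace")
    print(text[:2000])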

I spotted a large number of things in there that it would be unwise to repeat here. But I assume the data cleaning process removes such content before pretraining? ;)

Although I have to wonder. I played with some of the base/text Llama models, and got very disturbing output from them. So there's not that much cleaning going on.


> The standard library provides the LazyLoader class to solve some of these inefficiency problems. It permits imports at the module level to work mostly like inline imports do.

The use of these sorts of Python import internals is highly non-obvious. The Stack Overflow Q&A I found about it (https://stackoverflow.com/questions/42703908/) doesn't result in an especially nice-looking UX.

So here's a proof of concept in existing Python for getting all imports to be lazy automatically, with no special syntax for the caller:

  import sys
  import threading # needed for python 3.13, at least at the REPL, because reasons
  from importlib.util import LazyLoader # this has to be eagerly imported!
  class LazyPathFinder(sys.meta_path[-1]): # <class '_frozen_importlib_external.PathFinder'>
      @classmethod
      def find_spec(cls, fullname, path=None, target=None):
          base = super().find_spec(fullname, path, target)
          if base is None or base.loader is None: # not found, or a namespace package
              return base # fall through so other finders and normal errors still work
          base.loader = LazyLoader(base.loader)
          return base
  sys.meta_path[-1] = LazyPathFinder
We've replaced the "meta path finder" (which implements the logic "when the module isn't in sys.modules, look on sys.path for source code and/or bytecode, including bytecode in __pycache__ subfolders, and create a 'spec' for it") with our own wrapper. The "loader" attached to the resulting spec is replaced with an importlib.util.LazyLoader instance, which wraps the base PathFinder's provided loader. When an import statement actually imports the module, the name will actually get bound to a <class 'importlib.util._LazyModule'> instance, rather than an ordinary module. Attempting to access any attribute of this instance will trigger the normal module loading procedure — which even replaces the global name.

Now we can do:

  import this # nothing shows up
  print(type(this)) # <class 'importlib.util._LazyModule'>
  rot13 = this.s # the module is loaded, printing the Zen
  print(type(this)) # <class 'module'>
That said, I don't know what the PEP means by "mostly" here.

When I first got together with my wife, I seemed a bit crazier than I am, because I've been a media hoarder for 30+ years. I don't have any VHS, DVDs, etc. lying around, because I only keep digital copies, but I have pretty decent archives. Nothing important really, just normal stuff and some rare or obscure stuff that disappears over time.

My wife was interested in the idea that I was running "Netflix from home" and enjoyed the lack of ads or BS when we watched any content. I never really thought I would be an "example" or anything like that; I fully expected everyone else to embrace streaming for the rest of time, because I didn't think those companies would make so many mistakes. For the last decade I've been telling people, "That's awesome. I watch using my own thing. What shows are your favorites? I want to make sure I have them."

In the last 2 years, more family members and friends have requested access to my Jellyfin and asked me to set up something similar, with less storage, underneath their TV in the living room or in a closet.

Recently-ish we have expanded our Jellyfin to have some YouTube content on it. Each channel just gets a directory and gets this command run:

    yt-dlp "$CHANNEL_URL" \
      --download-archive "downloaded.txt" \
      --playlist-end 10 \
      --match-filters "live_status = 'not_live' & webpage_url!*='/shorts/' & original_url!*='/shorts/'" \
      -f "bv*[height<=720]+ba/b[height<=720]" \
      --merge-output-format mp4 \
      -o "%(upload_date>%Y-%m-%d)s - %(title)s.%(ext)s"
It actually fails to do what I want here, which is to download h264 content, so I have it re-encoded, since I keep my media library in h264 until the majority of my devices support h265. None of that really matters, because these YouTube videos come in AV1 and none of my smart TVs support that yet, AFAIK.
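
For what it's worth, yt-dlp's format filters can at least prefer h264 when YouTube offers it. A hedged sketch via the Python API (option names as I understand them from yt-dlp's docs; the channel URL is hypothetical):

    import yt_dlp
    from yt_dlp.utils import match_filter_func
    opts = {
        # Prefer an avc1 (h264) stream up to 720p; fall back to the usual
        # selection when, as above, YouTube only serves AV1/VP9.
        "format": "bv*[vcodec^=avc1][height<=720]+ba/b[height<=720]",
        "merge_output_format": "mp4",
        "download_archive": "downloaded.txt",
        "playlistend": 10,
        "match_filter": match_filter_func("live_status = 'not_live'"),
        "outtmpl": "%(upload_date>%Y-%m-%d)s - %(title)s.%(ext)s",
    }
    with yt_dlp.YoutubeDL(opts) as ydl:
        ydl.download(["https://www.youtube.com/@example/videos"])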

https://pypi.org/p/torchruntime might help here; it's designed precisely for this purpose.

`pip install torchruntime`

`torchruntime install torch`

It figures out the correct torch to install on the user's PC, factoring in the OS (Win, Linux, Mac), the GPU vendor (NVIDIA, AMD, Intel) and the GPU model (especially for ROCm, whose configuration varies per generation and ROCm version).

And it tries to support quite a number of older GPUs as well, which are pinned to older versions of torch.

It's used by a few cross-platform torch-based consumer apps, running on quite a number of consumer installations.


This is epic :)

From: https://github.com/ioccc-src/winner/blob/master/2024/kurdyuk...

This code draws the current moon phase to the console. So if you’re a lycanthrope, you can monitor the phase of the moon.

#include <time.h>
#include <stdio.h>

        a,b=44,x,
     y,z;main()  {!a
   ?a=2551443,x=    -b
  ,y=2-b,z=((time     (
 0)-592531)%a<<9)/     a
 :putchar(++x>=a?x     =
 -b,y+=4,10:x<0?x=     x
 *x+y*y<b*b?a=1-x,     -
  1:x+1,32:"#."[(     x
   <a*(~z&255)>>    8)
     ^z>>8]),y>  b?0
        :main();}
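
For anyone squinting at the layout, the phase arithmetic is roughly this (a Python sketch with the constants lifted from the C source above; the rendering I'll leave to the original):

    import time
    SYNODIC = 2551443  # seconds in one synodic month (~29.53 days)
    EPOCH = 592531     # lines t=0 up with the 1970-01-07 new moon, it seems
    # z runs 0..511 over a lunation; the C code uses the low 8 bits to place
    # the terminator along each row of the disc, and the high bit to pick
    # which side is lit.
    z = ((int(time.time()) - EPOCH) % SYNODIC) * 512 // SYNODIC
    print(f"lunar cycle: {z / 512:.1%} complete")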

Funnily enough, I did a Sudoku one too (albeit with Poetry) a few years ago: https://github.com/mildbyte/poetry-sudoku-solver

Alternatively, if you don't want to run the whole Electron app, the money is this line:

  sudo.exec("/System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport en0 -z && ifconfig en0 ether `openssl rand -hex 6 | sed 's/\(..\)/\1:/g; s/.$//'`",

CardStock[0] isn’t mentioned in this article, but seems broadly similar in goals and approach to Scrappy. Unlike Scrappy (so far as I can tell) CardStock is open-source and can be run locally.[1]

Decker[2] (which is also open-source) has answers to several of the things outlined on Scrappy’s roadmap, including facilities for representing and manipulating tabular data with its query language and grid widgets and the ability for users to abstract collections of parts into reusable "Contraptions".

[0] https://cardstock.run

[1] https://github.com/benjie-git/CardStock

[2] http://beyondloom.com/decker/index.html


Here's a system prompt I tend to use:

    ## Instructions
    * Be concise
    * Use simple sentences. But feel free to use technical jargon.
    * Do NOT overexplain basic concepts. Assume the user is technically proficient.
    * AVOID flattering, corporate-ish or marketing language. Maintain a neutral viewpoint.
    * AVOID vague and/or generic claims which may seem correct but are not substantiated by the context.
This can't completely avoid hallucinations, and it's good to avoid AI for text that's used for human-to-human communication. But it makes AI answers to coding and technical questions easier to read.

As another example: I run https://shithub.us with shell scripts, serving a terabyte or so of data monthly (mostly due to AI crawlers that I can't be arsed to block).

I'm launching between 15 and 3000 processes per request. While Plan 9 is about 10x faster at spawning processes than Linux, it's telling that launching 3000 C processes from a shell is about as fast as starting one Python interpreter.


I’d like to share some little demos here.

Bitwise XOR modulo T: https://susam.net/fxyt.html#XYxTN1srN255pTN1sqD

Bitwise AND modulo T: https://susam.net/fxyt.html#XYaTN1srN255pTN1sqN0

Bitwise OR modulo T: https://susam.net/fxyt.html#XYoTN1srN255pTN1sqDN0S

Where T is the time coordinate. Origin for X, Y coordinates is at the bottom left corner of the canvas.

You can pause the animation anytime by clicking the ‘■’ button and then step through the T coordinate using the ‘«’ and ‘»’ buttons.


One problem I haven't found a mechanical solution for yet is how one could (simply) implement a state transition table. For a specific example, say you have 9 states, each mapping to one of 9 other states, with many-to-one mappings possible:

  1 -> 2
  2 -> 9
  3 -> 1
  4 -> 6
  5 -> 2
  6 -> 6
  7 -> 1
  8 -> 8
  9 -> 9
(This is the 3rd of 4 transition tables for an 8-state, 4-symbol Universal Turing Machine. These transitions apply if the 3rd symbol is read from tape at the current head position - with all 4 transition tables implemented you could select between them depending on the read symbol. 9 is the halt state.)

The mechanism should remain in one state and then go to the next as indicated by the table, repeatedly. How would you implement this mechanically? Perhaps a face cam with many grooves, starting and ending at different angles (https://i.imgur.com/aNPBcdh.png), while always moving a follower from the center of the wheel through the groove to its edge, with something like a Chebyshev lambda linkage (https://en.wikipedia.org/wiki/Chebyshev_lambda_linkage), so the wheel stops at the next angle representing the current state?

The fact that there seems to be no simple answer even for this partially explains why mechanical computers were quickly given up on.


https://www.learn-c.org/

If you have lots of time: https://hal.inria.fr/hal-02383654

If you can't be bothered reading a whole book: https://matt.sh/howto-c

Exercises: https://www.codestepbystep.com/problem/list/c and https://exercism.org/tracks/c

Once you have syntax and basic algorithms down well, watch this, the only 2 hour YouTube video I'll ever recommend: https://m.youtube.com/watch?v=443UNeGrFoM

Both r/cprogramming and r/C_programming are active, but also full of lazy students trying to get people to do their homework. If you come by, describe your problem well, with code. Say you're learning for yourself, not for school.

Together C & C++ is a good Discord if you prefer live chat: https://discord.gg/tccpp


I had actually done a writeup on it, and thought I had lost it. I found it, dated 2/15/2002:

---

Consider that any D app is completely specified by a list of .module files and the tools necessary to compile them. Assign a unique GUID to each unique .module file. Then, an app is specified by a list of .module GUIDs. Each app is also assigned a GUID.

On the client's machine is stored a pool of already downloaded .module files. When a new app is downloaded, what is actually downloaded is just a GUID. The client sees if that GUID is an already built app in the pool; if so, he's done. If not, the client requests the manifest for the GUID, a manifest being a list of .module GUIDs. Each GUID in the manifest is checked against the client pool, and any that are not found are downloaded and added to the pool.

Once the client has all the .module files for the GUIDs that make up an app, they can all be compiled, linked, and the result cached in the pool.

Thus, if an app is updated, only the changed .module files ever need to get downloaded. This can be taken a step further and a changed .module file can be represented as a diff from a previous .module.

Since .module files are tokenized source, two source files that differ only in comments and whitespace will have identical .module files.

There will be a master pool of .module files on WT's server. When an app is ready to release, it is "checked in" to the master pool by assigning GUIDs to its .module files. This master pool is what is consulted by the client when requesting .module files by GUID.

The D "VM" compiler, linker, engine, etc., can also be identified by GUIDs. This way, if an app is developed with a particular combination of tools, it can specify the GUIDs for them in the manifest. Hence the client will automatically download "VM" updates to get the exact tools needed to duplicate the app exactly.


Yep — some components currently rely on external APIs (e.g. OpenAI, MathPix), primarily for stability and ease of deployment during early release. But I’m planning to support fully local inference in the future to eliminate API key dependency.

The local pipeline would include:

• Tesseract or TrOCR for general OCR

• Pix2Struct, Donut, or DocTR for document structure understanding

• OpenAI CLIP for image-text semantic alignment

• Gemma / Phi / LLaMA / Mistral for downstream reasoning tasks

Goal is to make the system fully self-hostable for offline and private use.


Very neat! Reminds me of Tom Yeh's "AI By Hand" exercises [0].

[0] https://www.byhand.ai/


Note that the 'Jump Flood Algorithm' is O(N log N) where N is the number of pixels. There is a better O(N) algorithm which can be parallelized over the number of rows/columns of an image:

https://news.ycombinator.com/item?id=36809404

Unfortunately, it requires random access writes (compute shaders) if you want to run it on the GPU. But if CPU is fine, here are a few implementations:

JavaScript: https://parmanoir.com/distance/

C: https://github.com/983/df

C++: https://github.com/opencv/opencv/blob/4.x/modules/imgproc/sr...

Python: https://github.com/pymatting/pymatting/blob/afd2dec073cb08b8...
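
If anyone wants the gist of that O(N) method without reading the repos, here's a minimal NumPy sketch of the 1D squared-distance transform it builds on (Felzenszwalb-Huttenlocher's parabola envelope; an illustration, not a drop-in for the implementations above). The 2D transform is just this run over every row and then every column, which is where the per-row/per-column parallelism comes from:

    import numpy as np
    INF = 1e20  # finite "infinity" keeps the envelope arithmetic well-defined
    def dt1d(f):
        # Exact 1D squared distance transform in O(n): maintain the lower
        # envelope of the parabolas x -> (x - q)^2 + f[q].
        n = len(f)
        v = np.zeros(n, dtype=int)  # parabola sites on the envelope
        z = np.full(n + 1, INF)     # boundaries between envelope segments
        z[0] = -INF
        k = 0
        for q in range(1, n):
            s = ((f[q] + q*q) - (f[v[k]] + v[k]*v[k])) / (2*q - 2*v[k])
            while s <= z[k]:  # the new parabola buries the previous one(s)
                k -= 1
                s = ((f[q] + q*q) - (f[v[k]] + v[k]*v[k])) / (2*q - 2*v[k])
            k += 1
            v[k] = q
            z[k], z[k + 1] = s, INF
        d = np.empty(n)
        k = 0
        for q in range(n):  # read distances back off the envelope
            while z[k + 1] < q:
                k += 1
            d[q] = (q - v[k])**2 + f[v[k]]
        return d
    def dt2d(mask):
        # Squared Euclidean distance to the nearest True pixel: 1D transform
        # over every row (independent, parallelizable), then every column.
        f = np.where(mask, 0.0, INF)
        f = np.apply_along_axis(dt1d, 1, f)
        return np.apply_along_axis(dt1d, 0, f)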


Copy your session token into .token, then:

    curl "https://adventofcode.com/2024/day/$DAY/input" --header "Cookie: $(cat .token)" > input.txt


Ah that's great, because that's a category of apps that's not doing great in terms of privacy: https://foundation.mozilla.org/en/privacynotincluded/categor...

(And unfortunately this is also fairly sensitive data in some regions...)


Holy cow. This is one of the best writeups[1] of "the mess" I've ever seen.

Thanks for sharing.

1 - https://theyrule.net/so_what


Some exciting projects from the last months:

- 3d scene reconstruction from a few images: https://dust3r.europe.naverlabs.com/

- gaussian avatars: https://shenhanqian.github.io/gaussian-avatars

- relightable gaussian codec: https://shunsukesaito.github.io/rgca/

- track anything: https://co-tracker.github.io/ https://omnimotion.github.io/

- segment anything: https://github.com/facebookresearch/segment-anything

- good human pose estimation models (YOLOv8, Google's MediaPipe models)

- realistic TTS: https://huggingface.co/coqui/XTTS-v2, bark TTS (hit or miss)

- great open STT (mostly Whisper-based)

- machine translation (e.g. SeamlessM4T from Meta)

It's crazy to see how much is coming out of Meta's R&D alone.


This is an excellent tool to realize how an LLM actually works from the ground up!

For those reading it and going through each step: if by chance you get stuck on why there are 48 elements in the first array, please refer to model.py in minGPT [1].

It's an architectural decision that would be worth mentioning in the article, since people without much context might get lost.

[1] https://github.com/karpathy/minGPT/blob/master/mingpt/model....
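
(If my reading of that file is right, the relevant bit is the tiny preset the visualized model uses; n_embd is the embedding width:)

    gpt_nano = dict(n_layer=3, n_head=3, n_embd=48)  # minGPT's 'gpt-nano' preset
    # n_embd = 48 is the embedding width, hence the 48-element first array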


Yeah, like the other commenter said, everything is in this file here:

https://github.com/Dicklesworthstone/fast_vector_similarity/...

If you also make your project using Rust and Maturin, you can literally just copy and paste that into your project because it's totally generic, and if the repo is public, GitHub will just run it all for you for free.

The only thing is, you need to create an account on PyPI (pip) and add two-factor auth so you can generate an API key. Then you go into the repo settings, go to Secrets, and create a GitHub Actions secret named PYPI_API_TOKEN with your PyPI token as the value. That's it! It will not only compile all the wheels for you but even upload the project to PyPI, using the settings found in your pyproject.toml file, like this:

https://github.com/Dicklesworthstone/fast_vector_similarity/...


