Hacker News
Toolong: Terminal application to view, tail, merge, and search log files (github.com/textualize)
275 points by ingve 9 months ago | 54 comments



The code base seems like a good reference as a small Python project.

My fav option in this class of apps: https://lnav.org/ It lets you use journalctl with pipes as requested here: https://github.com/Textualize/toolong/issues/4


Plus it's possible to download lnav as a statically linked binary, which is very nice. (Would have been even better if it was in the official repos.) I'm not interested in installing things using yet another package manager, like pip or the like.


Mind elaborating why this would be a good reference? Not saying it isn't, just want to understand why you think so


Hi there! I'm mostly AFK for a couple days so replies might be delayed, but some notes:

- System tool niche that I find interesting so it's nice to see a fairly complete and yet relatively small project

- Textual seems to be their largest dependency, but other than that the project seems self-contained and relies on the standard library

- With some exceptions, the majority of methods are short and easy to parse

- Typing support :)


Makes total sense. I think I was a bit thrown off at first glance because there are so many classes and files [0] and it reads a bit like Java code.

But after a second glance it looks very well written compared to many other Python projects, which sometimes read like a 5000 line bash script.

And I can't argue with your points, especially the "minimal" dependencies and the typing.

Typing often helps for autocompletion and understanding what a variable/function "means", which makes it [1] easier to start hacking on it.

[0] not necessarily bad, just wasn't what I would expect to be a small reference project

[1] not always, sometimes types can be too verbose and start messing with your brain ;)


I donated to lnav, it's just soo good!


lnav is great


Congrats on the launch! I'm the author of https://logdy.dev. Seems like we had a similar problem with logs and decided to solve it, but in a slightly different way. Logdy works with pipes very well, and I'm wrapping up another version to be released soon.


I maintain a workflow manager, and had both Textual and logdy on my list of projects to try soon. Planning to add a TUI written in Textual, and was thinking of something like logdy (or using it directly). Not sure which way to go now; will play with both and see which version users like best. Thanks for logdy, and thanks to the creators of toolong too!


Thanks for making logdy, it looks properly awesome! I definitely plan on using it as much as I can. I love the jump out of the terminal and into the browser; it makes it that much friendlier to me. Best of luck to the project!


Hello. Author of Toolong here. Happy to answer any questions!


I don't really see myself using this app, but I loved Textual and even used it for a PoC a while back.

I appreciate the work you are doing by dogfooding Textual, and how consistent you (or your team, I didn't know there was more than one person behind it) have been.

I just wanted to say thanks so yeah, Thanks!


Cheers. We’re a small team. 4 developers in total. I’ll pass on your thanks!


Just wanted to say thanks for creating this. Wonderful tool!


De Nada


What's the relevance of the kookaburra?


We have a tradition of using a bird logo for our projects.


What I often do when I analyze logs is remove the timestamps, change unique identifiers to something more predictable, and diff the files to see when things started diverging from the norm, because the divergence often starts earlier than the eventual error/crash. Is there anything that does something like it?
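
To make it concrete, the normalization step I mean is roughly this (a rough sketch; the regexes and file names are made-up placeholders for whatever the logs actually contain):

  import re
  import subprocess

  # Strip leading timestamps and replace unique identifiers with stable
  # placeholders so two runs of the same workload become comparable.
  TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}[.,]?\d*\s*")
  UUID = re.compile(r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}", re.I)

  def normalize(src, dst):
      with open(src, errors="replace") as fin, open(dst, "w") as fout:
          for line in fin:
              line = TIMESTAMP.sub("", line)
              line = UUID.sub("<uuid>", line)
              fout.write(line)

  normalize("good_run.log", "good.norm")
  normalize("bad_run.log", "bad.norm")
  subprocess.run(["diff", "-u", "good.norm", "bad.norm"])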



https://github.com/harjoc/LogDiff - unmaintained, parses only ProcMon, but does the steps you describe, and uses Kdiff3 for diffs. With threaded apps it can be hard to identify automatically which thread is which.


Nice! I've found this kind of tool really useful and love the merge functionality. I've skimmed the README but maybe I've missed the info: does toolong support multiline logs like stacktraces? Or is it possible to customize the recognized formats?


It should work fine with multi-line logs.

It's relatively easy to add new formats in code, but there isn't yet a way of configuring that. In the future I might add a config file to make that easy.


Thank you!


How do I remember, in 2 weeks when I actually need it, that I installed this?


A physical post-it note on your monitor that you don't take off until you've used it three times.

Or some other similar reminder -- perhaps just a list of things you're going to install when you need them, so you don't install them until you have a use.

(I have this problem too)


When I have a similar conundrum I typically add it to my calendar at a random time as a helpful popup/reminder.

Usually I don’t even need the reminder, just writing it down is reminder enough.


Make this alias for whatever log viewer you use now.

  alias tail='echo "use toolong instead"'
Now using tail to view logs will print that reminder instead.


This looks great. I spend a good amount of time each week grepping through Kubernetes log files. Looking forward to trying this out next week. I particularly like the pretty-print and merge options.


A handy utility I’ve written a few times is a tool that can quickly extract a range of logs from a timestamped, ordered file without reading every byte. This would be a good feature to add to this.
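
The trick is a binary search over byte offsets rather than lines. Roughly (a sketch, assuming each line starts with a lexically sortable timestamp such as ISO-8601 followed by a space; the file name and timestamps below are made up):

  import os

  def stamp_at(f, m):
      """Timestamp prefix of the first complete line starting at or after byte m."""
      f.seek(max(m - 1, 0))
      if m > 0:
          f.readline()  # advance to the next line start
      line = f.readline()
      if not line:
          return None   # m is past the last line
      return line.split(b" ", 1)[0].decode("ascii", "replace")

  def offset_at_or_after(path, target):
      """Byte offset of the first line whose timestamp is >= target."""
      with open(path, "rb") as f:
          lo, hi = 0, os.path.getsize(path)
          while lo < hi:
              mid = (lo + hi) // 2
              s = stamp_at(f, mid)
              if s is None or s >= target:
                  hi = mid
              else:
                  lo = mid + 1
          f.seek(max(lo - 1, 0))
          if lo > 0:
              f.readline()
          return f.tell()

  start = offset_at_or_after("app.log", "2024-05-01T12:00:00")
  end = offset_at_or_after("app.log", "2024-05-01T13:00:00")
  with open("app.log", "rb") as f:
      f.seek(start)
      print(f.read(end - start).decode(errors="replace"), end="")

Each lookup touches only O(log n) blocks of the file instead of scanning it end to end.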


This looks cool. If I have a log file with a format that is like `[timestamp] SEVERITY { json content }`, can I use the feature that pretty prints the JSON part, or does the whole line need to be valid JSON? If not could I somehow write a plugin that would allow me to parse the lines in order to accomplish this? That would be really useful.
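
For illustration, something like this is what I have in mind (a rough sketch with a made-up example line; the regex would need to match the real format):

  import json
  import re

  # Made-up example line in the "[timestamp] SEVERITY { json }" shape.
  LINE = '[2024-05-01T12:00:00Z] ERROR {"user": "abc123", "msg": "boom", "retries": 3}'

  # Bracketed timestamp, a severity word, then everything from the first "{" as JSON.
  PATTERN = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<severity>\S+)\s+(?P<payload>\{.*\})\s*$")

  match = PATTERN.match(LINE)
  if match:
      print(match.group("ts"), match.group("severity"))
      print(json.dumps(json.loads(match.group("payload")), indent=2, sort_keys=True))
  else:
      print(LINE)  # not in the expected shape; show the raw line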


Not currently, but it would be relatively easy to add.

You might want to open an issue on the repo.


lnav can do this; may be worth a look.


I'd love to see a tool that lets you modify large files efficiently.

I had to replace line 4 of a 200 GB SQL dump. It took a substantial amount of compute time to perform the find / replace with sed, and it also required over double the disk space, since sed creates a temp file before it writes out the new file.

Using a hex editor could have worked but it seemed too risky because data integrity was really important.


There is this thing called dd.

Hold it like so:

  # replacement-line-4.sql: a file holding the new line 4 (placeholder name)
  dd if=replacement-line-4.sql of=sqldump bs=1 seek="$bytes_offset_until_line_4" conv=notrunc,sync
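Or the same idea without dd, as a rough Python sketch; either way it is only safe when the replacement is exactly as long as the bytes it overwrites (the offset and lines below are made up):

  # Overwrite a fixed-length span in place; no temp file, no full rewrite.
  offset = 123                         # made-up byte offset where line 4 starts
  old = b"CREATE DATABASE olddb;\n"    # made-up original line 4
  new = b"CREATE DATABASE newdb;\n"    # replacement, same length as old
  assert len(new) == len(old)

  with open("sqldump", "r+b") as f:
      f.seek(offset)
      assert f.read(len(old)) == old   # sanity check before clobbering anything
      f.seek(offset)
      f.write(new)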
Enjoy responsibly :)


Wouldn't something like this work for your specific use case:

  head -n4 sqldump > sqldump2
Then change line 4 in sqldump2 to whatever you want, and after that:

  tail -n+5 sqldump >> sqldump2

And now sqldump2 contains sqldump but with line 4 edited.

This still requires double the disk space, but at least shouldn't take 30 minutes.


That would work, but what would tail do differently than sed here? Both would still need to read and write the full file (minus 4 lines for tail), which, without testing, suggests the time would be about the same.


I thought that sed may have a "high" overhead, and that's why it took so long.

YMMV, but I just reproduced this on my machine, and the tail command took 3 minutes. Maybe you are limited by disk IO?

What was the sed command you used?

EDIT: sed was even a bit faster with:

  sed -i '4s/^.*$/your line four/' sqldump
Took 20s less than my tail solution, and does not occupy duplicated disk space.


I deleted the line with `sed -i "4d" sqldump`. The duplicated disk space comes from sed writing a temporary file out to the current directory, it does this in chunks as the command runs.

It makes sense for sed to write a temp file even with `-i` because what happens if your power goes out mid-way through the command without the temp file? You'd have data loss. To combat that sed will write a temp file out and when the operation is completed the last step is to move the temp file to the original file to overwrite it. You might not notice this unless you monitor the directory for changes while sed runs. You can spam `ls -la` as sed runs on a decently sized file to see the temp file.


What’s the challenge here? Loading the text efficiently into RAM?


It's mainly to save time.

I don't remember exactly how long it took but I remember it being something like 30 minutes of purely waiting for the find / replace to finish on a 4 CPU core / 8 GB of memory machine. Memory wasn't an issue fortunately.


Why would there be a market for something like this?


Once in a while things come up where you need to make a surgical edit in a really large file and it could be in a scenario where time matters.

For example if you're doing a SQL dump -> import, technically you could have downtime during this process to eliminate any chance of data loss. Having to wait ~30min for a command to finish is painful.

I'm not saying to go off and make this product to sell, but if such a tool existed and you positioned it at $39 or whatever, and it could eliminate half an hour of downtime for a business, it pays for itself. Especially if the alternative is to muck around with hex editing the file.

If 100,000 people have this problem and 3,000 of them would pay for it, that's $117,000. If you spent 6 months making a super polished tool that made it easy to know exactly what's being edited (or which lines are being deleted, etc.), that's pretty appealing. Even if the sales were half that, it's still solid, but it could also be 5x too, who knows. Maybe you can finish such an app in 3 months instead of 6. It's also an app that mostly feels like it could reach a "done" state with little maintenance, since you're editing text files, which is a well known topic.


I mean, if you knew you only wanted to find/replace on line 4, why not simply stop the search after the first match?

The syntax escapes me at the moment, but I'm quite sure I've used sed (or maybe awk?) in the past to do exactly that


As far as I know sed will still read and write the full file, even when you do a modification on a specific line. I've deleted a single targeted line with sed and it took quite some time too.


There's a big difference between "quite some time" and 30 minutes, though. Of course sed needs to read the file and write it back to disk, but that's capped by I/O speed which is very high on modern drives - we're talking seconds for a file in the tens of gigabytes on the fastest SSDs, a very far cry from half an entire hour.


This was run on an AWS gp3 SSD EBS volume with 3k IOPS. It was a CPU optimized c6i.xlarge machine. The file in question was 200 GB.

I used "quite some time" because this happened months ago and I don't recall the exact time on the delete. I don't think deleting a line was any faster than doing a find / replace on 1 line. If sed is reading and writing the file in both cases I'd expect both to have the same performance.


Sed takes ages to remove a single line from huge files too.


This would have been great when I used to work in embedded development and had to grep log files to root cause bugs. I had thought about writing something like this but was always too busy. The biggest feature is combining log files and organizing by timestamp. Well done!


In Indonesian “tolong” means help :)


Similar with Tagalog -- "tulong".


Is there a way to search/highlight multiple tokens at once?

Didn’t find this in the help/GitHub readme…


Nice example of a README done well. What, why, screenshots, how.


Looks very interesting and should solve a pretty common use case for me - am often trying to debug some issue over many log files. I will for sure test this next week.


For a GUI-based tool, I find klogg really good.



