ktsaou's comments (Hacker News)

systemd-journal has powerful log-centralization capabilities. Combined with Netdata, it makes for a capable log-management solution.


Journald messages have almost unbounded cardinality in their fields, and all of them are indexed, even if every single message carries its own unique fields and values.
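To make that concrete (the field name below is made up for illustration; `logger --journald` reads fields from stdin):

```shell
# Attach an arbitrary, never-before-seen field to a log entry.
logger --journald <<EOF
MESSAGE=payment request failed
CHECKOUT_STEP=card_validation
EOF

# journald indexes it anyway -- query by it directly:
journalctl CHECKOUT_STEP=card_validation -o json-pretty

# ...or list every value ever seen for that field:
journalctl -F CHECKOUT_STEP
```

No schema was declared anywhere; the index picks up whatever fields each message happens to carry.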

When you send journald logs to Loki, even if you use the `relabel_rules` argument to `loki.source.journal` with a JSON format, you need to specify which of the journald fields you want Loki to inherit. This means you lose all the flexibility journald provides: indexing on all fields and all their values.
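For illustration only (this is from memory of Grafana Alloy's documentation, so treat the component and label names as assumptions to verify), the shape of such a pipeline looks roughly like this:

```
loki.relabel "journal" {
  forward_to = []

  // Every journald field you want to keep must be promoted explicitly;
  // anything not listed here never reaches Loki's index.
  rule {
    source_labels = ["__journal__systemd_unit"]
    target_label  = "unit"
  }
}

loki.source.journal "read" {
  forward_to    = [loki.write.default.receiver]
  relabel_rules = loki.relabel.journal.rules
}
```

Every field has to be enumerated up front, which is exactly the constraint described above.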

Loki generally assumes that all logs are like a table. All entries in a stream share the same fields. But journald does exactly the opposite. Each log entry is unique and may have its own unique fields.

Of course, one could use filters and relabel rules to create multiple streams of logs out of a single journald stream (I never tried it), but it sounds like a lot of work, and again it assumes you know all the possible fields beforehand.

So, Loki and systemd-journal are good for different use cases. The good thing with systemd-journal is that you already have it and use it; it is already there, inside your systems.


There are ways to put syslog messages into the journals. This plugin works with whatever logs are in the journals.


Do not compare Netdata with other monitoring solutions that centralize everything to one place, or with single installation applications.

Netdata is a distributed application, and it is installed all over the place. So we needed to find a way to provide SSO.

There are a few alternatives:

1. PAM (then LDAP or a DB), but this would significantly increase the attack surface of your network, making Netdata an ideal target for probing your security. We didn't want this.

2. LDAP, similar to the above, with increased complexity. Probably too complex for the average user out there, and it would over-complicate things when you need to run Netdata in private and public clouds concurrently.

We chose to provide a free service to everyone using Netdata, where we manage all this complexity and simplify the process.

Netdata Cloud uses Google SSO, GitHub SSO, and email verification to authenticate users. It does not store user passwords. Combined with the claiming process of the Netdata Agents:

a) it ensures you are the admin of each server you want to manage,
b) it verifies your identity, and
c) it provides centralized control over which of the authenticated users have access to your servers.
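For reference, claiming an agent is a one-liner run on the server itself (the token and room ID below are placeholders you get from your Netdata Cloud space):

```shell
# Placeholders: MY_CLAIM_TOKEN and MY_ROOM_ID come from the Cloud UI.
netdata-claim.sh -token=MY_CLAIM_TOKEN \
                 -rooms=MY_ROOM_ID \
                 -url=https://app.netdata.cloud
```

Being able to run this on the box is what proves you are its admin.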

What happens when you use Netdata Cloud to access a Netdata agent is this: your web browser asks Netdata Cloud for access to that agent. Netdata Cloud verifies you, and if that succeeds and you have trusted the agent before, it asks the agent (via their link) to generate a unique token for you. That token is sent back to your browser and is then used as an authorization bearer to access the agent directly. So, your data do not flow through Netdata Cloud. You only get a token from the agent, via Netdata Cloud.
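A rough sketch of the final step (the hostname and endpoint here are illustrative; `/api/v1/info` is the agent's own API, and the token-minting handshake itself happens inside the Cloud link):

```shell
# The bearer token was minted by the agent and relayed via Netdata Cloud;
# from here on, requests go straight to the agent, not through the Cloud.
curl -H "Authorization: Bearer $AGENT_TOKEN" \
     "https://agent.example.internal:19999/api/v1/info"
```

The metric data in the response travels only between your browser and the agent.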


Please consider an offline approach for those willing to take it on.

I can see the benefit of what you've outlined. I really want good representation of journals.

Yet, I'm likely to not start using netdata, because it seems to be 'always online' / dependent on something external. If things are bad enough where I'm looking at logs... maybe I don't have Internet access.

While we retain the data, we don't retain full control over the access to it. This is a pro for some, con for others.

Sure, in this situation, we can go look at the data directly, but that kind of nullifies the point of collection/presentation...

I dare say that most who have needs complicated enough to warrant log collection have auth infrastructure, tooling, or knowledge to manage


ok, can we discuss how you see this working for you? How do you believe you can provide SSO to all your Netdata agents?

Please open a discussion here: https://github.com/netdata/netdata/discussions

Even if this may be a niche need, I am open to creating such a feature for those that need it, at a small price. But we need some specs.


I'll mull it over and follow up once I have something of more substance, thank you!


I started the discussion, and offered a solution too:

https://github.com/netdata/netdata/discussions/16136


Thanks!

How do you use it? How big an infrastructure do you monitor with it?


- Integrations Marketplace
- Monitoring systemd Journal logs
- Easier claiming of agents
- Quickly spot anomalies across the dashboard


Netdata v1.39 just released, introducing a major change in monitoring. Now Netdata not only trains multiple ML models for each and every metric collected, but also Netdata Charts use an ML-first approach, revealing all the power of ML instantly!


Hi. I am the founder of Netdata.

We complement the Netdata agent with Netdata.Cloud, a free-forever SaaS offering that maintains all the principles of the Netdata agent while providing infrastructure-level monitoring and several additional convenience features.

In Netdata.Cloud, infrastructure is organized in war-rooms. In each war-room you will find the "Overview" page, which provides a fully automated dashboard, very similar to the one provided by the agent, in which every chart aggregates data from all servers in the war-room! Magic! Zero configuration! Fully automated!

Keep in mind that Netdata.Cloud is a thin convenience layer on top of the Netdata agent. We don't aggregate your data. Your data stay inside your servers. We only collect and store a little metadata (how many Netdata agents you have, which metrics they collect, which alarms have been configured and when they triggered - but not the actual metric and log data of your systems).

Try it! You will be surprised!


>I am the founder of Netdata.

Awesome! I see the free tier is indeed looking generous. Just hooked up a node and looks good - I like the calculate correlations on alerts thing in particular.

>Keep in mind that Netdata.Cloud is a thin convenience layer on top of the Netdata agent.

I see. Didn't know/understand that.

On the claim node page - could you perhaps add the kickstart bash script code too? I find myself needing them one after the other yet they're on different pages


Good to hear metric correlations might be useful to you. Just as background, you can get more info here: https://www.netdata.cloud/blog/netdata-cloud-metric-correlat...

At the moment it's based on a short window of data so the focus is more for short term changes around an area of interest you have already found.

Longer term, it would be cool to use an anomaly score on the metrics themselves (or the fact that a lot of alarms happen to be going off) to automatically find such regions for you, so it's more like surfacing insights to you, as opposed to you having to already know the window of time you are interested in.
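As a toy sketch of the idea (not the actual algorithm Netdata uses; the helper and numbers are made up), scoring each metric by how much its average shifts between a baseline window and the highlighted window, then ranking by that score:

```shell
# score <baseline_avg> <highlight_avg>: relative change of the highlighted
# window versus the baseline; a higher score means a more interesting metric.
score() {
  awk -v b="$1" -v h="$2" 'BEGIN {
    d = (b == 0 ? 1 : b)
    diff = h - b; if (diff < 0) diff = -diff
    printf "%.2f\n", diff / d
  }'
}

score 100 900   # disk I/O jumped during the window -> 8.00
score 5 5.2     # CPU barely moved                  -> 0.04
```

Sorting metrics by this score surfaces the ones most likely related to the event you highlighted.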


>Keep in mind that Netdata.Cloud is a thin convenience layer on top of the Netdata agent. We don't aggregate your data.

I didn't get that from the website until just now. I was looking and looking for how much it would cost to subscribe for our 150 dev/stg/prod VMs -- usually that's the killer.


Thank you for this feedback. I am the founder of Netdata.

Netdata is about making our lives easier. If you need to tweak Netdata, please open a GitHub issue to let us know: it is a bug. Netdata should provide the best possible dashboards and alerts out of the box. If it does not for you, we missed something and we need your help to fix it, so please open a GitHub issue describing your use case. We want Netdata to be installed and effectively used with zero configuration, even mid-crisis. So although tweaking is possible, and we support plenty of it, it should not be required.

An "incident" is a way to organize people: an issue-management tool for monitoring, a collaboration feature. Netdata's primary goal, however, is exploring and understanding our infrastructure. We are trying to be amazingly effective at this by providing unlimited high-resolution metrics, real-time dashboards and battle-tested alarms. Our roadmap has many features that we believe will change the way we understand monitoring. We are changing even the most fundamental features of a chart.

Of course, at the same time we are trying to improve collaboration. This is why Netdata.Cloud, our free-forever SaaS offering that complements the open-source agent to provide out-of-the-box infrastructure-level monitoring alongside several convenience features, organizes our infra in war-rooms. In these war-rooms we have added metric-correlation tools that can help us find the metrics most relevant to something that got our attention: an alarm, a spike or a dive on a chart.

For Netdata, the high-level incident panel you are looking for will be based on a mix of charts and alarms. And we hope it is also going to be fully automated, autodetected and provided with zero configuration or tweaking. Stay tuned. We are baking it...


> our free-forever SaaS offering that complements the open-source agent

How do you make or plan to make money?


The same way GitHub, Slack or Cloudflare provide massively free-forever SaaS offerings while making money.

We believe that the world will greatly benefit by a monitoring solution that is massively battle tested, highly opinionated, incorporating all the knowledge and experience of the community for monitoring infrastructure, systems and applications. A solution that is installed in seconds, even mid-crisis and is immediately effective in identifying performance and stability issues.

The tricky part is to find a way to support this and sustain it indefinitely. We believe we nailed it!

So, we plan to avoid selling monitoring features. Our free offering will never have a limit on the number of nodes monitored, the number of users using it, the number of metrics collected, analyzed and presented, the granularity of data, the number of war-rooms, of dashboards, the number of alarms configured, the notifications sent, etc. All these will always be free.

And no, we are not collecting any data for ML or any other purpose. The opposite actually: we plan to release ML at the edge, so that each server will learn its own behavior.

We plan to eventually sell increased convenience features, enforcement of compliance to business policies and enterprise specific integrations, all of them on top of the free offering.


I was analyzing the activity in the netdata project, and what I found interesting was that this project is less active than I would have thought. See the following for insights into the project:

https://public-001.gitsense.com/insights/github/repos?q=wind...

In the last 30 days, there were 2 frequent and 3 occasional contributors. I honestly thought the number of frequent contributors would have been much higher, which leads me to believe the project is quite mature and they don't need a lot of people to work on netdata.

Based on Crunchbase, they've raised about 33 million so far, and if the number of people required to maintain netdata is low (relatively speaking, that is), I can see them not really needing to worry about making money. I'm guessing they are finding value in gathering data for ML.


> they've raised about 33 million

yes, this is right

> if the number of people required to maintain netbase is low (relatively speaking that is)

The Netdata agent is a robust and mature product. We maintain it and we constantly improve it, but:

- most of our efforts go to Netdata.Cloud

- most of the action in the agent is in internal forks we have. For example, we are currently testing ML at the edge. This will eventually go into the agent, but it is not there yet. Same with eBPF: we do a lot of work to streamline the process of providing the best eBPF experience out there.

> I can see them not really needing to worry about making money

We are going to make money on top of the free tier of Netdata.Cloud. We are currently building the free tier. In about a year from now we will start introducing new paid features to Netdata.Cloud. Whatever we have released by then will always be free.

> I'm guessing they are finding value in gathering data for ML

No, we are not gathering any data for any purpose. Our database is distributed. Your data are your data. We don't need them.


Hey thanks for the insights. I figured effort was being spent elsewhere and/or was not visible in the public repo.


Oh cool, that's a nice tool.

P.S. I am the only person working on ML at Netdata, and I can confirm we don't gather any data for ML purposes, which is actually my biggest challenge right now :) - convincing people the ML can be useful without having lots of nicely labeled data from real Netdata users to quantify that with typical metrics like accuracy. I'm hoping to introduce mainly unsupervised ML features into the product that don't rely on lots of labeled data, with thumbs up/down feedback, and we can then use that to figure out whether new ML-based features are working or being useful for users. So any models that get trained will be trained on the host and live on the host, as opposed to in Netdata Cloud somewhere.
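To make "unsupervised" concrete, here is a deliberately tiny sketch (not Netdata's actual model): flag a sample when it strays far from the mean of what has been seen so far, with no labeled data involved at any point:

```shell
# Flag a value as anomalous when it deviates from the running mean by more
# than 3 standard deviations (after a small warm-up of 5 samples).
printf '%s\n' 10 10 10 10 10 10 10 10 10 10 100 10 10 |
awk '{
  if (n >= 5) {
    mean = sum / n
    var = sumsq / n - mean * mean
    if (var < 0) var = 0            # guard against float round-off
    std = sqrt(var)
    dev = $1 - mean; if (dev < 0) dev = -dev
    if ((std == 0 && dev > 0) || (std > 0 && dev > 3 * std))
      print "sample " NR " (" $1 ") looks anomalous"
  }
  sum += $1; sumsq += $1 * $1; n++
}'
# prints: sample 11 (100) looks anomalous
```

Thumbs up/down feedback on flags like these is enough to gauge usefulness, without ever collecting the underlying metric data.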


> i am the only person working on ML at Netdata and i can confirm we don't gather any data for ML purposes, which is actually my biggest challenge right now :)

Yeah, I would imagine that would be an issue. This is just my personal opinion, but I think there should be a way to provide anonymized data for building anomaly-detection models. Maybe an opt-in feature, as it would benefit everybody using netdata.


This is a good question; their website doesn't seem to have any "Pricing" information anywhere, and everything is "get now" and "sign up for free"...


bash <(curl -Ss https://my-netdata.io/kickstart-static64.sh)

For any 64-bit Linux system. Check the wiki if bash is not available on the target system.

Wiki: https://github.com/firehol/netdata/wiki/Installation
Home: https://my-netdata.io/


The link doesn't really explain what Netdata is or why I might want to use it.


Agreed. The front-page gives more data:

https://github.com/firehol/netdata

It looks like a thing that stores and visualises "metrics".

But it looks like it runs only on a single host, which makes it less interesting than collectd/prometheus/etc, even if the visuals look a little nicer than grafana, etc.

I certainly wouldn't have the patience to open ten pages if I wanted to see the stats on ten hosts. I want all my metrics in one place - so I can graph "top five busiest hosts", etc.

