
My number one requirement for a tool like this is that the JSON content never leaves the machine it's on.

I can only imagine the kind of personal information or proprietary internal data that has been unwittingly transmitted due to tools like this.

If my objective were to gain the secrets of various worldwide entities, one of the first things I would do is set up seemingly innocent Pastebins, JSON checkers, and online file format converters, and permanently retain all submitted data.



Personal requirements aside (I have the same ones): just using this would, at the very least, constitute misconduct at my place of work.

Yes, it's a cool-looking tool, but there are certain requirements that ignorance doesn't exempt us from.

My pet gripe is all the seemingly local (open source) tools that phone home with opt-out metrics: it's not mentioned in the "getting started", disabling it takes some obscure flag, and it's just that little bit harder to do when running the de facto (containerised) build.


> My pet gripe is all the seemingly local (open source) tools that phone home with opt-out metrics: it's not mentioned in the "getting started", disabling it takes some obscure flag, and it's just that little bit harder to do when running the de facto (containerised) build.

Exhibit A: DotNet! https://learn.microsoft.com/en-us/dotnet/core/tools/telemetr...


Ouch, this is particularly egregious:

"...To opt out, set the DOTNET_CLI_TELEMETRY_OPTOUT environment variable before you install the .NET SDK"


Just to be clear: that is to opt out of the single telemetry message sent by the installer itself on successful install, not to opt out of .NET telemetry in general. You can do that at any time post-install by setting the env var; no need to remove and reinstall the entire SDK just to turn off telemetry.


I still consider this fairly egregious: if you've already installed it (but not used it) before finding this out, a bunch of details about your environment has already been reported. Does opting out before install also keep you opted out of all telemetry, or does that have to be done separately?

If a new starter did this on a company machine, boom, misconduct on their first day; though I hope the network/operations team have already put a block in place for that.

That is an extreme example, but it's kind of annoying that you have to look this up, and make sure you didn't typo the flag, for every piece of software you use/test.


That flag opts out of all telemetry.


It not being opt-in is the problem.


If it's not opt-in, the software's spyware, pure and simple, and ought to be lumped in with other malware that should be rejected, shamed, and marginalized until/unless that behavior changes.

I'm sticking to our much-better norms for this shit from c. 2000, damnit! It really is crazy how fast and completely that changed.


No idea why you're being downvoted. The only difference between telemetry and spyware is there's a "legitimate company" with "legitimate interests" behind it.


Well, it's not that simple. Telemetry can be something as benign as sending an error report when the app crashes (which is generally useful if it doesn't leak other data, and leads to a better app), or as intrusive as tracking every click.

Context is also important: the entire point of a game's beta test is to get that data to improve the product.


That's only for the telemetry that happens during the install process (if I've read the link correctly). Seems quite reasonable as long as we accept them sending telemetry during install. ("A single telemetry entry is also sent by the .NET SDK installer when a successful installation happens")

For telemetry during actual use, you can set that flag any time, and a message is shown on first use to inform you about it.

So seems relatively reasonable to me.


> That's only for the telemetry that happens during the install process (if I've read the link correctly).

Wrong. That's for the dotnet cli tool to phone home each and every time you run a command.

https://learn.microsoft.com/en-us/dotnet/core/tools/telemetr...

Microsoft even provides a page which showcases summaries of some of the metrics they collect from you if you don't disable this feature. These metrics even include MAC addresses.

https://dotnet.microsoft.com/en-us/platform/telemetry

> Seems quite reasonable as long as we accept them sending telemetry during install.

There is nothing reasonable about this. You should not be required to have tribal knowledge on how to use arcane tricks prior to running an application just to avoid being spied upon. It's a dark pattern, and one that conveys a motivation to spy upon unsuspecting users whether they approve it or not.


But they do mention that you can disable it at any time; it's only the telemetry sent by the installer that requires setting the flag beforehand (obviously):

> The .NET SDK telemetry feature is enabled by default. To opt out of the telemetry feature, set the DOTNET_CLI_TELEMETRY_OPTOUT environment variable to 1 or true.

> A single telemetry entry is also sent by the .NET SDK installer when a successful installation happens. To opt out, set the DOTNET_CLI_TELEMETRY_OPTOUT environment variable before you install the .NET SDK.


> These metrics even include MAC addresses.

MAC address SHA256 hashes, to be precise.


MAC addresses are only 48-bit and sparsely allocated (i.e. the first half identifies the vendor). I wouldn't be surprised if the hashes for all normal hardware (i.e. with known vendors) could be easily brute-forced.
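
To put a number on it: with a known 24-bit vendor prefix (OUI), only the 24-bit device half is unknown, i.e. about 16.7 million candidates per vendor. A rough sketch of inverting such a hash (TypeScript on Node; the exact string encoding being hashed is an assumption here, lowercase colon-separated is just one guess, and the OUI below is a placeholder, but the search-space argument holds either way):

    // crack-mac.ts - brute-force a SHA-256 MAC hash given a known vendor prefix.
    // Usage: npx tsx crack-mac.ts <sha256-hex>
    // Assumes the MAC was hashed as a lowercase, colon-separated string; a real
    // attack would simply retry with the handful of plausible encodings.
    import { createHash } from "node:crypto";

    function crackMacHash(targetHex: string, oui: string): string | null {
      for (let i = 0; i < 0x1000000; i++) {          // 2^24 device suffixes per vendor
        const s = i.toString(16).padStart(6, "0");
        const mac = `${oui}:${s.slice(0, 2)}:${s.slice(2, 4)}:${s.slice(4, 6)}`;
        if (createHash("sha256").update(mac).digest("hex") === targetHex) {
          return mac;                                // hash inverted, MAC recovered
        }
      }
      return null;                                   // not in this vendor's range
    }

    // "00:11:22" is a placeholder OUI; a single vendor's whole range falls in
    // seconds to minutes on one commodity core.
    console.log(crackMacHash(process.argv[2], "00:11:22"));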


Well, dotnet DOES mention it in getting started (first run):

   Telemetry
   ---------
   The .NET tools collect usage data in order to help us improve your experience. The data is collected by Microsoft and shared with the community. You can opt-out of telemetry by setting the DOTNET_CLI_TELEMETRY_OPTOUT environment variable to '1' or 'true' using your favorite shell.

  Read more about .NET CLI Tools telemetry: https://aka.ms/dotnet-cli-telemetry

https://learn.microsoft.com/en-us/dotnet/core/tools/telemetr...

The HN crowd associates telemetry with privacy invasion or cancer, and the T word alone draws shudders... but it is not always the case.

> Protecting your privacy is important to us. If you suspect the telemetry is collecting sensitive data or the data is being insecurely or inappropriately handled, file an issue in the dotnet/sdk repository or send an email to dotnet@microsoft.com for investigation.


It sends telemetry as part of the installation, too. It is implementing a deliberately dark pattern as well. It should be an option presented right on the installer screen, _and_ it should default to opted-out.


>_and_ it should default to opted-out.

Here I don't agree. It should certainly be visible, so you can make an informed decision whether or not to use it. But it's a sad fact of human nature how little we are willing to contribute back even when people give us something for free, even if it's just a click away (not to mention paying a small amount, filing a bug report...)


The "HN crowd" is right. We don't want these corporations exfiltrating any information about us. We don't really care what it is or what they're going to use it for. We want them to have exactly zero bits of information about us. Their attempts to collect data without our consent demonstrates a complete lack of respect for us and our wishes.


I worked at a $massive_tech_company_with_extreme_secrecy and using these tools was expressly forbidden because of the risk. Maybe one exists, but I would gladly pay $20 for a Mac app that could do all of this locally: like a Markdown Pro type app but for JSON formatting and validation. I want to simply open the app, paste in some json and have it format it to my requirements (spaces/tabs/pretty/etc.)
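
For what it's worth, `jq .` and `python -m json.tool` already cover the basics offline, and the core of such an app is tiny. A minimal local-only sketch (TypeScript for Node, run via e.g. npx tsx; the file name is made up):

    // pretty.ts - a fully local JSON formatter: npx tsx pretty.ts 4 < input.json
    // Nothing below touches the network; stdin goes in, formatted stdout comes out.
    import { readFileSync } from "node:fs";

    const indent = Number(process.argv[2] ?? 2);     // indentation width, default 2 spaces
    const raw = readFileSync(0, "utf8");             // fd 0 = stdin
    try {
      console.log(JSON.stringify(JSON.parse(raw), null, indent));
    } catch (err) {
      console.error(`Invalid JSON: ${(err as Error).message}`);
      process.exit(1);
    }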


I vaguely remember some tool that was just a collection of random tools like that, running as a local server (including a bunch of crypto primitives), but I can't remember the name now...


You’re probably thinking of the tool made by GCHQ, CyberChef

https://gchq.github.io/CyberChef/


Great tool that I'd recommend any software person, techie, reverse engineer, etc. self-host.

I host it in my kube cluster with all outbound connections blocked, just to be safe.


Yup, thanks!


Completely agree. I could actually get a lot of use out of a tool like this, but the fact that even the VSCode extension sends the JSON to their servers and opens it at a publicly accessible URL makes this a no-go for me. I wouldn't recommend anyone use this for any remotely sensitive data.


The extension can apparently be configured to use a locally running instance of the server. But yes, by default it uses the remote version, and thus you post the JSON publicly, which may or may not be ideal depending on what you're doing.


The fact that it needs a server at all seems unnecessary. It's all written in JavaScript and isn't doing anything that couldn't be done in a browser; I see no reason why this can't be an entirely client-side application.
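
The whole format-and-validate loop is a few lines of client-side code; a sketch (the element IDs are hypothetical stand-ins for the tool's UI):

    // All a client-side JSON formatter fundamentally needs; data never leaves the tab.
    const input = document.querySelector<HTMLTextAreaElement>("#json-in")!;
    const output = document.querySelector<HTMLPreElement>("#json-out")!;

    input.addEventListener("input", () => {
      try {
        output.textContent = JSON.stringify(JSON.parse(input.value), null, 2);
      } catch (err) {
        output.textContent = `Invalid JSON: ${(err as Error).message}`;
      }
    });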


Processing multi-GB files in the browser is... fun. Doing that kind of thing on a server is easier.

*I'm not justifying doing it on the server, especially for an application like this where yes: it can be done in the client.* But I do sympathize because I know from experience why it's easier to do it server-side, without any conspiracies.

I wrote Papa Parse[0] about 10 years ago, and back then at least, it was extremely difficult to stream large files in an efficient, reliable way. Web Workers make things slightly better, but there are so many issues with large-scale local compute in a browser tab.

A few examples:

- https://stackoverflow.com/questions/24708649/why-does-web-wo... (the answer actually came from Google+ which is still linked to, but no longer available; fortunately I summarized it in my post)

- https://stackoverflow.com/questions/27081858/how-can-i-make-...

You get deep enough into the weeds and eventually you realize you can make it work cross-browser if you know which browser you're using (YES, User-Agent does matter for things like this), and people will call you crazy for trying to find out:

- https://stackoverflow.com/questions/27084036/how-can-i-relia...
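
For contrast, here is roughly what the hand-off looks like with today's primitives, Blob.stream and TextDecoderStream, neither of which existed back then (a sketch; the file names, element ID, and the toy line-counting "parse" are all made up):

    // main.ts - hand the File to a worker so parsing can't jank the UI thread.
    const fileInput = document.querySelector<HTMLInputElement>("#file")!;
    const worker = new Worker(new URL("./parse.worker.ts", import.meta.url), { type: "module" });
    worker.onmessage = (e) => console.log("lines seen:", e.data);
    fileInput.addEventListener("change", () => {
      const file = fileInput.files?.[0];
      if (file) worker.postMessage(file);            // File handles structured-clone cheaply
    });

    // parse.worker.ts - stream the file chunk by chunk instead of loading it whole.
    self.onmessage = async (e: MessageEvent<File>) => {
      let lines = 0;
      const reader = e.data.stream().pipeThrough(new TextDecoderStream()).getReader();
      for (;;) {
        const { done, value } = await reader.read();
        if (done) break;
        lines += value.split("\n").length - 1;       // count newlines per decoded chunk
      }
      postMessage(lines);                            // only the result crosses back
    };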

Despite all this, I *100%* agree: local-only processing is a hard rule for me as well. (That's why JSON-to-Go[1] does it all client-side. `go fmt` even compiles to WASM and runs in the browser!)

[0]: https://www.papaparse.com/

[1]: https://mholt.github.io/json-to-go/


> Processing multi-GB files in the browser is... fun. Doing that kind of thing on a server is easier.

This sounds like a strawman. Not everyone wrangles multi-GB files, let alone JSON documents, and those who do are already well aware of the implications. I mean, some popular text editors even struggle with multi-GB plain-text files.

You don't need a server to handle JSON. There is no excuse.


"You don't need a server to handle JSON. There is no excuse."

No technical excuse, but lots of business reasons, I guess.


Ok, I think I get what you're saying - this is a VS Code extension, but because VS Code is an Electron application, it's still "running in a browser"?


"the extension apparently can be configured to use a locally running instance of the server" - well that sounds needlessly complicated, I mean, the code could be implemented directly in the extension (I know, that's probably easier than it sounds if you are trying to maintain both the extension and the online version with the same code base).

"you post publicly the json, which may or may not be ideal depending on what you're doing" - that's never ideal, it's just a smaller problem (if the JSON is publicly available anyway) or a much bigger problem (if it's sensitive personal data).


> or a much bigger problem (if it's sensitive personal data).

Personal data is a red herring. It's not the only thing that matters. For starters, using this at work with anything not explicitly public is likely a violation of your contract. In some contexts, it may even be gross misconduct or illegal, potentially exposing your employer to large fines.

And, in general, I'd say a tool like this that comes without an explicit, bold warning that it's shipping data off your machine is just being rude.


> Personal data is a red herring. It's not the only thing that matters. For starters, using this at work with anything not explicitly public is likely a violation of your contract. (...)

"Personal data" means the reddest of data. If a system collects and tracks personal information then it will be expected to collect highly sensitive information that is not personal. It makes absolutely no sense at all to try to downplay security problems by coming up with excuses such as "oh it's only leaking personal data".


I mean it the other way: I see the problems routinely downplayed with excuses like "it's not collecting personal data".

See e.g. this, elsewhere in this thread: https://news.ycombinator.com/item?id=33784919. What does the linked Microsoft page say? Quoting:

> The telemetry feature doesn't collect personal data, such as usernames or email addresses. It doesn't scan your code and doesn't extract project-level data, such as name, repository, or author. The data is sent securely to Microsoft servers using Azure Monitor technology, held under restricted access, and published under strict security controls from secure Azure Storage systems.

I.e. "we're not collecting personal data, so you have nothing to worry about". Plus the classic "the data is sent securely to our servers", as if that was supposed to be reassuring. It's one of the most common types of distraction I see: focusing on how the data in-flight won't leak to third parties, and ignoring the fact that it's the first party that shouldn't be getting this data in the first place.


> I mean it the other way: I see the problems routinely downplayed with excuses like "it's not collecting personal data".

You claimed that personal data was a red herring. It is not. Shipping personal data is the worst possible scenario. It's unthinkable to try to make the case that a data leak is not serious because it's just personal data.


> You claimed that personal data was a red herring. It is not. Shipping personal data is the worst possible scenario.

Which is exactly what makes it the red herring. Shipping personal data is one of the worst possible scenarios (I'd argue that, in a corporate context, shipping data that's subject to export controls is worse, as it could easily get you fired, the company fined, and potentially land someone in jail) - which makes it a perfect distraction from all the other data that's being exfiltrated. "We're not collecting personal data" is the equivalent of putting a "doesn't contain asbestos" label on food packaging.


Either you do not know the meaning of "red herring" or you're failing to understand the problem. Personal data is the reddest of data, even and especially in a corporate context.

You can also have other data that is red, but if your infosec policies fail to prevent or stop personal information being sent, which is the lowest of low-hanging fruit to spot, then you will assuredly be leaking other red data that is harder to spot.

It makes no sense to try to downplay the problem of leaking personal data. It's the most serious offense in any context, not only for the data itself but especially for what it says about the security policies in place.


> Either you do not know what's the meaning of "red herring" or you're failing to understand the problem.

Merriam-Webster: "red herring [noun] (...) 2. [from the practice of drawing a red herring across a trail to confuse hunting dogs] : something that distracts attention from the real issue"

English Wikipedia: "A red herring is something that misleads or distracts from a relevant or important question. It may be either a logical fallacy or a literary device that leads readers or audiences toward a false conclusion. A red herring may be used intentionally, as in mystery fiction or as part of rhetorical strategies (e.g., in politics), or may be used in argumentation inadvertently."

This is exactly the meaning I'm using, so I think I know it just fine. To reiterate once again: leaking personal data isn't the only way telemetry can be problematic - it's not even the major issue in practice, thanks to the associated risk of fines and bad PR (GDPR was quite helpful here). Saying that your telemetry is fine because it's not collecting personal data is just a way to distract the reader. It's the equivalent of advertising your heavily processed food product as safe "because it doesn't contain asbestos".


I agree, mostly. But since when isn’t it obvious that posting data with a browser will send that data somewhere? And the users here are (from what I can tell) developers.

I think this is a cool tool for public data and obviously I can’t paste private data sets on any public website, ever.


It's not obvious ever since some of those tools started to blur the line; there are plenty of such little utilities that do everything client-side, or at least claim so. I don't use them with anything but public data, as it takes one mistake or one silent update for the data to get shipped off my machine, but there's a whole generation of devs now who grew up with webapps and online-first software, so I can easily see some developers making this mistake.

Plus, they offer a VS Code extension. It's not so obvious that it's just the same public website underneath.

Additionally, developers who understand those concerns kind of expect that other developers also understand them, and thus would not create an on-line tool like this in the first place.


> I agree, mostly. But since when isn’t it obvious that posting data with a browser will send that data somewhere?

I'd be surprised if anyone at all, especially developers, expects a browser to do anything other than transfer data to/from the internet.


Eric here (one of the creators of JSON Hero) and this is a really good point. We built JSON Hero earlier this year and partly wanted to use it to try out Cloudflare Workers and Remix, hence the decision to store in KV and use that kind of architecture. We're keen to update JSON Hero with better local-only support for this reason, and to make it easier to self-host or run locally.


If the vscode extension did it all locally, I'd 100% install it in an instant!


To add to this, I'd probably pay for this too, if it wasn't too expensive.


There are instructions in the readme to 'run locally' - are you saying that even that version (running on localhost:8787) is sending something back to y'all, either from the client in the browser or via the locally-running server?

I was totally about to clone this repo and run it locally so I can play with some internal json.


> If the vscode extension did it all locally, I'd 100% install it in an instant!

Ditto.


Try WASM.


This reminds me of an "Online HTML Minifier" website that analyzed the text and included affiliate links for random words within the text.

They operated for years, until someone noticed links on their own website that they hadn't added themselves and tried to figure out how it happened, because nobody else had access to the website.

(Will update with a link, if I find it.)


I agree.

My tool flatterer (https://lite.flatterer.dev/) converts deeply nested JSON to csv/xlsx, entirely in WebAssembly in the browser.

It's hard to prove that it is not sending data to a server, and so that it can be trusted. I know people could check dev tools, but that is error-prone and some users may not be able to do it.

I wish there were an easy way to prove this to users, as it would make online tools like this much more attractive.


I think there is an easy way to prove this to users. Make your thing a single-page, self-contained HTML file which they save to their hard disk. Then they can trust the restricted permissions with which Chrome runs such local files.

If you have a tech-savvy audience, they can also view your thing in an iframe with only sandbox="allow-scripts" to prove that it's not making network requests.

I wrote an HTML/JS log viewer with those security models: https://GitHub.com/ljw1004/seaoflogs - it handles log files up to 10k lines decently, all locally.


Would be nice to have the option to switch tabs into offline mode, just like we can mute them.


You can do that with Chrome dev tools: Network -> No throttling -> Offline

Don't know how reliable this is though or whether a web developer could work around this.


It could be storing the data and sending it later, when you visit the same origin from a different tab.

Unless you were also using incognito or throwaway tab containers to discard stored data.


Good point, thanks.


Turn off wifi? Unplug the ethernet cable? Try it from my garden shed where there never seems to be connectivity no matter what I try.


No Dave, you can't upload this export-controlled document to this web tool. I don't care how convenient it is.


There's an issue on the GitHub repo requesting a local version:

https://github.com/apihero-run/jsonhero-web/issues/134


Just set up an online HL7 or better yet CCDA parser and let the PHI roll in.


Even more: it has to work completely offline! And if it makes ANY network calls, it is a huge red flag for some!


100% literally came here to make sure someone said this.


Yep, if I were a Bad Guy with nation-state resources, I'd be salivating over trying to get "in" at JetBrains, GitHub, and the like.


All those free online .PSD utilities make my spidey-sense tingle.


Better yet, build an operating system and link it to the cloud.



