Paragon submits 27k-line NTFS driver to Linux kernel (theregister.com)
203 points by kiyanwang on Aug 23, 2020 | 149 comments


I understand that reviewing 27k LoC is daunting and probably not very fun.

But unlike most patches that draw a similar response, it's not a narrowly useful patch that mostly serves the submitter. Proper NTFS support benefits a large proportion of Linux users (the jab from the article that there are more advanced file systems out there seems out of place; there are no signs that Windows is about to switch its default FS to something else).

Additionally this code has been used in production for years now (e.g. my 2015 router runs the closed source version of this driver in order to support NTFS formatted external drives) so most likely a lot of quality issues have already been found and addressed.

So I feel it's a bit unreasonable to respond with so much negativity to this contribution.


It's also no big deal from either side. Paragon sent in the patch and it's appreciated. There are a few problems to sort out before it can go in. Reviewers noted the issues and what would need to be done to get this through, and that process is happening.

Split your diff! and Fix your makefile! have to be two of the most benign and common pieces of diff feedback I've seen. I feel that you could make a media story about any submission to the Linux kernel based on there being comments in the review process.


Admittedly I didn't actually read the mailing list discussion. It's entirely possible that The Register made up a big drama where there was none.


I read the discussion. There's no drama at all. Paragon did an unreviewable code dump with intent to maintain, and they are warmly welcome in general. David laid out the path to review and probable acceptance: https://lore.kernel.org/linux-fsdevel/20200815190642.GZ2026@... If there was any fuss, it was because of the unreviewable nature of the patch, but especially by kernel standards the discussion was cordial. In fact, aside from Nikolay's outburst, it was a cordial discussion by any standard. (Someone should've gently told him this is no way to welcome newcomers, especially newcomers carrying such a gift.)

Others noted it needs to pass the existing test suite and that it is close.


Not sure if this counts as drama per Linux kernel mailing list standards: https://lore.kernel.org/linux-fsdevel/2911ac5cd20b46e397be50...

> So how exactly do you expect someone to review this monstrosity ?


It seems from the link that the kernel developers would rather have one patch per new file plus a patch that does the integration, instead of one big patch with everything. That's a bit unconventional, perhaps because they tend to use git alone instead of higher-level software that would help them break down a big single commit; but whatever they're doing clearly works for them.

The entire dispute seems to be that minor question of style, nothing substantive. I don't think anyone's especially unhappy on either side. The controversy seems manufactured, perhaps by a reporter who noticed the gruff language but lacked the technical knowledge to understand what's actually going on.

Most people developing free software (probably including both the submitters and recipients of this patch) could make a lot more money elsewhere, but have chosen instead to do work with considerable public benefit. That's thankless enough already without some reporter inventing drama for clicks.


> perhaps because they tend to use git alone instead of higher-level software that would help them break down a big single commit

git is capable of breaking down a large diff into manageable pieces (e.g., limiting a diff to a single file), but reviewing code in a mailing list means replying to the message that contains a patch and replying inline to certain parts to comment on it.

As for higher level software that could break down a large commit, what specifically do you have in mind? I can't think of any feature of other review tools like GitHub/GitLab, gerrit, reviewboard, phabricator, etc. that would make something like this easy to review.


I meant like GitHub and competitors, which let you attach comments to specific lines and files and such, and perhaps follow references into the full code faster than you could flipping between your mail client and your editor (and save you the effort of applying the patch to a local tree for that review). Since the kernel developers prefer to discuss on a plain mailing list and not use such tools, it makes sense that they prefer smaller chunks.

27 kLOC will be a big project to review no matter what, but I'd probably rather take them in a single commit--the files presumably depend on each other, and there's probably no order in which the files could be reviewed in isolation without reference to files not yet reviewed. (Obviously we try for hierarchical structure that would make that possible, but not usually with perfect success.)

That's a matter of personal preference, though, and people who want a project to merge their contributions should adhere to the maintainer's preferences. In any case, it seems Paragon intends to do exactly that. I doubt Paragon expected their reward for their contribution would be an article read by thousands of people that called it "half-baked" over this minor point, and I can't imagine such publicity encourages others to make similar contributions in future.


> I meant like GitHub and competitors, which let you attach comments to specific lines and files and such

Github does allow you to filter the diff down to the commit or jump to a particular file within the diff. Commenting on a line in the diff isn't really any different than positioning one's comment inline below the relevant line(s) of code in an email reply. I know that in Github, it's also possible to comment directly on a commit (though those comments are not displayed with any context in the general PR view), unlike an email reply to a particular patch series.

Depending on one's email client, it's certainly possible to search for things like /^diff/ or /^@@/ to jump from file to file or hunk to hunk within the compose window.

> perhaps follow references into the full code faster than you could flipping between your mail client and your editor (and save you the effort of applying the patch to a local tree for that review).

For some, the email client doubles as an editor (i.e., gnus). And, at least in my experience, it's far faster to navigate code in an editor compared to the web interface that GitHub/GitLab provides.

> 27 kLOC will be a big project to review no matter what, but I'd probably rather take them in a single commit--the files presumably depend on each other

While that's true, the dependency can be preserved when merging the branch of the series of commits in the mainline repository. Plus, many may find it easier to review declarations, definitions, and calls in that order.


> Most people developing free software (probably including both the submitters and recipients of this patch) could make a lot more money elsewhere

I think most free software developers are normal corporate employees. I work on tons of free software as my job, like most of my peers, but that’s normal in the industry. I don’t consider myself a free software developer.


Fair--depending on what "most" is weighted by, I may have overstated, and a Google employee who happens to get assigned to work on Chrome stuff is certainly making no personal sacrifice.

I meant independent volunteers or people working for free-software-focused companies (which I believe usually offer well below FAANG-level compensation, especially at the high end, though still enough to live quite well). Excluding hardware vendors porting Linux to their own products, I believe the core kernel developers tend to fall into that last category. I have no specific knowledge of their individual compensation, but the technical leads responsible for closed-source projects of similar scope make incredible amounts of money.


It’s a legitimate concern - I would not assume malice. How exactly would someone review a 25k loc .patch file?


It would be tough, no doubt.

But it's not like splitting the feature into 100 patches of 250 lines each would make it any quicker to review. Or merging code that was known not to work, as it was only a fraction of what was needed for the functionality.


> But it's not like splitting the feature into 100 patches of 250 lines each would make it any quicker to review. Or merging code that was known not to work, as it was only a fraction of what was needed for the functionality.

That would also be rejected, because the kernel maintainers aren't idiots and their standards aren't the stupid arbitrary rules you construe them to be. They generally want big changes to be broken up into logical, sensible chunks that each leave the tree in a usable state, so that git-bisect still works.


How do people merge big new filesystems in practice though? Especially one with years of pre-existing out-of-tree development?

I guess one could start by merging a skeleton of the filesystem which supports mount/unmount but then returns an IO error on every operation? And then a patch to add directory traversal (you can view the files but not their contents), and then a patch to add file reading, and then a patch to add file writing, and then a patch to add mkdir/rmdir, and then a patch to add rename/delete of regular files.

Breaking down an existing filesystem into a sequence of patches like that, no doubt it is doable, but it is going to be a lot of work.
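
(For a rough idea, such a skeleton first patch could be little more than registering a filesystem type whose mount always fails; everything below, including the ntfs3_skel name, is made up for illustration and has nothing to do with Paragon's actual code:)

    /* Hypothetical first patch in such a series: register a filesystem type
     * that refuses every mount with -EIO. Later patches would add a real
     * fill_super, then directory traversal, reads, writes, and so on.
     * Nothing here is taken from Paragon's submission. */
    #include <linux/module.h>
    #include <linux/fs.h>
    #include <linux/err.h>

    static struct dentry *ntfs3_skel_mount(struct file_system_type *fs_type,
                                           int flags, const char *dev_name,
                                           void *data)
    {
            /* A later patch would call mount_bdev() with a real fill_super();
             * for now every mount attempt simply fails. */
            return ERR_PTR(-EIO);
    }

    static struct file_system_type ntfs3_skel_fs_type = {
            .owner    = THIS_MODULE,
            .name     = "ntfs3_skel",
            .mount    = ntfs3_skel_mount,
            .fs_flags = FS_REQUIRES_DEV,
    };

    static int __init ntfs3_skel_init(void)
    {
            return register_filesystem(&ntfs3_skel_fs_type);
    }

    static void __exit ntfs3_skel_exit(void)
    {
            unregister_filesystem(&ntfs3_skel_fs_type);
    }

    module_init(ntfs3_skel_init);
    module_exit(ntfs3_skel_exit);
    MODULE_LICENSE("GPL");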


My guess is that given the history of this filesystem implementation, most of the review effort will be focused on the interface between this FS and the rest of the kernel. It's typical for all the changes touching communal files or introducing generic helper functions or data structures to be broken out into separate commits. If any of those helpers are a reinvention of stuff that's already in the kernel, there will need to be a justification for why NTFS needs its own special versions. It's not typical for a large patch series adding genuinely new stuff to be broken up into absurdly tiny commits. For the stuff that's truly internal to the filesystem implementation, it looks like one patch per file will be an acceptable granularity.


I didn't mean to imply otherwise. Drama is often not malice and rather due to legitimate concerns on one or both sides.


I assume malice because of the tone. The concern is legitimate, the tone is offputting.


Welcome to Linux kernel development. By lkml history, this tone is very mild. There are many examples of far worse commentary and personal attacks on devs. I’m not justifying this by the way. Linus can be a very smart jerk, and as a leader (THE leader) he sets the tone for what’s acceptable in the community.

https://www.zdnet.com/article/linux-developer-who-took-on-li...


There wasn't much drama in the mailing list discussion as I read it. Mostly comments asking "can you make this easier for us to review?".


There were two specific concerns in the initial review that I think were reasonable:

1) The Linux kernel already has an in-kernel read-only NTFS driver. What should be done about it? (There are a number of reasonable options here, including just getting rid of it and replacing it with Paragon's, but that requires at least some buy-in from the maintainers of the existing driver.)

2) The patch didn't actually build, which was a one-line Makefile fix, but raised some concern about how it was tested/how the patch was generated.


It's strange to me that there's no file system with basic features like journaling and support for files larger than a couple of GB, that is supported across all major desktop OSes (MacOS, Windows, Linux, and FreeBSD). All platforms support the same i/o standards like USB or DisplayPort, why did filesystems never make the cut to become a cross-system standard?

Imagine if you could have a backup drive (with reasonable modern data protections) that you could just plug into different systems and save all your files to. Isn't it odd that such a simple thing isn't possible? I guess network attached storage has gotten pretty accessible at this point so there's no need for it?


I think the basic problem is that FAT is generally "good enough," and in the increasingly common case where it isn't exFAT is close to universal and addresses the only problem that consumers frequently run into (file size limit).

While FAT/exFAT leave the possibility of a variety of different types of filesystem inconsistency, these seem to be fairly rare in actual practice, probably in good part due to Windows disabling write caching on devices it thinks are portable. This kind of special handling of external devices is sort of distasteful, and leads to some real downsides on Windows (e.g. LDM handling USB devices weirdly), but using a newer file system doesn't really eliminate that problem - NTFS and Ext* external devices require special handling on mounting to avoid the problems that come from file permissions traveling from machine to machine, for example.


> in good part due to Windows disabling write caching on devices it thinks are portable. This kind of special handling of external devices is sort of distasteful

What's distasteful is that on my Linux machine with a lot of RAM, when I copy a multi-gigabyte file to a USB key it "completes" the copy almost immediately, when actually all it has done is copy the file to a RAM buffer. Then when I try to disconnect the drive, it will hang for ages while it actually finishes the write. IMO Windows does it better here (although I never realised what exactly they did, nice to know).


You should try the "sync" command; this is exactly the problem it solves.


I'm talking about dragging files with the GUI file manager, I shouldn't have to use any commands :(.

What I end up doing is using a "watch" on some command I can't remember that shows overall dirty pages.


GUI file copy tools should be using O_DIRECT, or periodically calling f/sync(). An argument could also be made that the kernel write cache should have a size limit so that one-off write latency is masked, but very slow bulk I/O is not masked.
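
Something like this minimal userspace sketch (not taken from any real file manager; buffer and flush sizes are arbitrary) shows the periodic-fdatasync idea:

    /* Minimal sketch: copy src to dst but fdatasync() every SYNC_SZ bytes so
     * the page cache never runs far ahead of a slow target device. Sizes are
     * arbitrary; error paths are simplified. Not from any real file manager. */
    #include <fcntl.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define BUF_SZ  (1 << 20)       /* 1 MiB read/write chunks */
    #define SYNC_SZ (64 << 20)      /* flush after every 64 MiB written */

    int copy_with_periodic_sync(const char *src, const char *dst)
    {
            int in = open(src, O_RDONLY);
            int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
            char *buf = malloc(BUF_SZ);
            off_t since_sync = 0;
            ssize_t n;

            if (in < 0 || out < 0 || !buf)
                    return -1;

            while ((n = read(in, buf, BUF_SZ)) > 0) {
                    if (write(out, buf, n) != n)
                            return -1;
                    since_sync += n;
                    if (since_sync >= SYNC_SZ) {
                            fdatasync(out); /* progress now reflects the device */
                            since_sync = 0;
                    }
            }
            fdatasync(out);                 /* final flush before reporting done */
            close(in);
            close(out);
            free(buf);
            return n < 0 ? -1 : 0;
    }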


O_DIRECT seems like overkill, and the lack of write buffering could be a real detriment in some circumstances. Syncing at the end of each operation (from the user's perspective) should be the best mix of throughput and safety, but it makes it hard to do an accurate progress bar. Before the whole batch operation is finished, it may be useful to periodically use madvise or posix_fadvise to encourage the OS to flush the right data from the page cache—but I don't know if Linux really makes good use of those hints at the moment.

On really new kernels, it might work well to use io_uring to issue linked chains of read -> write -> fdatasync operations for everything the user wants to copy, and base the GUI's progress bar on the completion of those linked IO units. That will probably ensure the kernel has enough work enqueued to issue optimally large and aligned IOs to the underlying devices. (Also, any file management GUI really needs to be doing async IO to begin with, or at least on a separate thread. So adopting io_uring shouldn't be as big an issue as it would be for many other kinds of applications.)
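
A very rough liburing sketch of one such linked chain might look like the following; the chunk size, setup, and error handling are all simplified and purely illustrative:

    /* Rough liburing sketch: one linked read -> write -> fdatasync chain.
     * Short reads, buffer reuse and error handling are glossed over; the
     * chunk size is arbitrary. Not taken from any existing file manager. */
    #include <liburing.h>
    #include <stdlib.h>
    #include <sys/types.h>

    #define CHUNK (1 << 20)

    int copy_chunk_durably(int in_fd, int out_fd, off_t off)
    {
            struct io_uring ring;
            struct io_uring_sqe *sqe;
            struct io_uring_cqe *cqe;
            void *buf = malloc(CHUNK);

            if (!buf || io_uring_queue_init(8, &ring, 0) < 0)
                    return -1;

            /* IOSQE_IO_LINK chains the sqes: the write only runs if the read
             * succeeds, and the fdatasync only runs if the write succeeds. */
            sqe = io_uring_get_sqe(&ring);
            io_uring_prep_read(sqe, in_fd, buf, CHUNK, off);
            sqe->flags |= IOSQE_IO_LINK;

            sqe = io_uring_get_sqe(&ring);
            io_uring_prep_write(sqe, out_fd, buf, CHUNK, off);
            sqe->flags |= IOSQE_IO_LINK;

            sqe = io_uring_get_sqe(&ring);
            io_uring_prep_fsync(sqe, out_fd, IORING_FSYNC_DATASYNC);

            io_uring_submit(&ring);

            /* Wait for all three completions; the final (fsync) completion is
             * the point at which a progress bar should advance, because only
             * then is the chunk durably on the target device. */
            for (int i = 0; i < 3; i++) {
                    if (io_uring_wait_cqe(&ring, &cqe) < 0)
                            break;
                    io_uring_cqe_seen(&ring, cqe);
            }

            io_uring_queue_exit(&ring);
            free(buf);
            return 0;
    }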


If you syncfs every reasonable-unit-of-time, you can get a progress bar.


Not always. If you're reading from a SSD and writing to a slow USB 2.0 flash drive, you could end up enqueuing in one second a volume of writes that will take the USB drive tens of seconds to sync(), leading to a very unresponsive progress bar. You almost have to do a TCP-like ramp up of block sizes until you discover where the bottleneck is.
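
One illustrative way such a ramp-up could be implemented (thresholds and sizes are entirely made up, not a real design):

    /* Sketch: grow or shrink the batch written between flushes based on how
     * long each syncfs() takes, so the progress bar stays responsive even on
     * a slow USB 2.0 target. */
    #define _GNU_SOURCE             /* for syncfs() */
    #include <sys/time.h>
    #include <unistd.h>
    #include <stddef.h>

    static double now_sec(void)
    {
            struct timeval tv;
            gettimeofday(&tv, NULL);
            return tv.tv_sec + tv.tv_usec / 1e6;
    }

    /* Flush the target filesystem, then decide how many bytes to write before
     * the next flush. Cheap flush: write more next time; slow flush: back off. */
    size_t flush_and_adjust_batch(int out_fd, size_t batch_bytes)
    {
            double t0 = now_sec();
            double elapsed;

            syncfs(out_fd);                 /* wait for the device to catch up */
            elapsed = now_sec() - t0;

            if (elapsed < 0.5 && batch_bytes < ((size_t)256 << 20))
                    return batch_bytes * 2;
            if (elapsed > 2.0 && batch_bytes > ((size_t)1 << 20))
                    return batch_bytes / 2;
            return batch_bytes;
    }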


Which distro/desktop? My standard ubuntu 18.04 with gnome and mounted through the file manager doesn't do this and copying to a slow USB drive is as glacial as it should be, but copying between internal drives is instant and hidden.


Default gnome on Ubuntu 20.04. How much free ram do you normally have? If you don't have enough to buffer the whole operation, then it's not a problem.


Now that I think about it, this might actually explain some bugs I've seen when copying multiple files. Copying one file seems to work but then copying a second the progress sits at 0%, it's probably waiting for the first transfer to sync.


sync is your friend.


I don’t know if it’s just because I’m largely on Mac and Apple’s drivers suck, but I have run into lots of data corruption issues with exFAT on large drives (1TB+). More than enough for me to stop using it.

(Now I use NTFS and Tuxera’s commercial Mac driver, because I don’t know how else to have a cross-platform filesystem without a stupidly-low file-size limit.)


It's pretty widely accepted that exFAT should be avoided on Nintendo Switch because of data corruption issues. So I don't think it's just Apple.


> because I don’t know how else to have a cross-platform filesystem without a stupidly-low file-size limit

UDF is natively supported on all major OSes.

I haven't used it in macOS but on paper macOS seems to have even better support for it than Linux, so you can give it a shot.

Lastly, you can use https://github.com/JElchison/format-udf for creating the most compatible filesystem across different devices.


Oh, the optical disk filesystem! That would be a useful hack. I won’t have problems using it on a standard hdd or ssd?


It isn't a 1st class fs in any OS so it lacks some polish tooling-wise etc., but it should work fine for basic file transfer jobs.

I'm using it on an external HDD for copying/watching video files between Linux and Windows boxes and haven't had any problems yet.

On the other hand, 7-8 years ago I tried to use a UDF partition for sharing a common Thunderbird profile between Windows and Linux and ran into strange errors on the Windows side after a while. I didn't dig further, so it may have been a non-UDF OS or tooling issue, or it may have been solved in the meantime.


> why did filesystems never make the cut to become a cross-system standard?

Many reasons, the most important being patents and insistence by vendors to keep stuff proprietary (exfat, ntfs, HFS, ZFS).

Also, there are fundamental differences on the OS level:

- how users are handled: Unices generally use numerical UID/GID while Windows uses GUIDs

- Windows has/had a 255-char maximum length of the total path of a file

- Unices have the octal mode set for permissions, old Windows had four bits (write protected, archived, system, hidden) and that's it

- Windows, Linux and OS X have fundamentally different capabilities for ACLs - a cross platform filesystem needs to handle all of them and that in a sane way

- don't get me started on the mess of advanced fs features (sparse files, transparent compression, immutability of files, transparent encryption, checksums, snapshots, journals)

- Exotic stuff like OS X and NTFS additional metadata streams that don't even have a representation in Linux or most/all BSDs

And finally, embedded devices and bootloaders. Give me a can of beer and a couple weeks and I'll probably be able to hack together a FAT implementation from scratch. Everything else? No f...in' way. Stuff like journals is too advanced to handle in small embedded devices. The list goes on and on.


Filesystems have never been standardised. In mainframe/mini days manufacturers supplied a selection of OS options for the same hardware, and there was no expectation that the various filesystems would be compatible between different OSs.

Which is why we have abstraction layers like Samba (etc) on top of networked drives. They're descendants of vintage cross-OS utilities like PIP which provide a minimal interface that supports folder trees and basic file operations.

But a lot of OS-specific options remain OS-specific, and there's literally no way to design a globally compatible file system that implements them all.

This isn't to say a common standard is impossible, but defining its features would be a huge battle. And including next-gen options - from more effective security and permissions, to content database support, to some form of literally global file ID system, to smart versioning - would be even more of a challenge.


> Windows has/had a 255-char maximum length of the total path of a file

The actual path of a file in Windows can be practically unlimited, but going beyond the ~260-character limit requires either the special extended-path notation (which looks like a network path) or relative paths. Recent versions of Windows include a setting that removes the limitation in the APIs, added because of development issues like deeply nested node packages.


Windows uses SIDs, not GUIDs.


No idea who downvoted you, thanks for the correction!


UDF looks a lot like that file system on paper, but my understanding is that it practically falls apart because there are too many bugs and too much variation in which features are really supported among the various implementations. Everyone agrees on the DVD-ROM subset, but beyond that it seems like a crapshoot.


ISO 13346[0] is supported everywhere and can handle files larger than 4GB. It's used on DVDs, but it can also be used on a flash drive[1].

[0] https://en.wikipedia.org/wiki/Universal_Disk_Format [1] https://github.com/JElchison/format-udf


Using a non-FAT filesystem for portable storage opens up lots of issues regarding broken permissions. Try to create a file on an ext4-formatted USB flash drive using your current user; rest assured that, unless you remembered to set its permissions to 777, on another computer you'll have to chmod it because it still belongs to the possibly dangling creator's UID and GIDs. If you don't have root access, and unless coincidentally your user has the same UID as on your other machine, you're screwed and you have to go back grudgingly to a system where you have administrative privileges.

Same thing applies to NTFS, but I've seen that more often than not Windows creates files on removable drives with extremely open permissions, and in my experience NTFS-3G just straight ignores NTFS ACLs on the drives it mounts, so more often than not it JustWorks™ in common use cases.

I think a journaled exFAT-like filesystem would be perfect for this task, but given how hard it was for exFAT to even start to displace FAT32, even if it actually existed I wouldn't expect it to succeed any time soon.


Almost as if the corporations have a conflict of interest with their consumers.


Not standardising means being able to distinguish yourself easier, it's just that for anything that integrates with third-party stuff, standardisation is way cheaper. Apple doesn't make monitors or flash drives, so making their own specs for those wouldn't be beneficial to them and would increase prices of compatible products. But a filesystem is something they do make and being able to do whatever they want with it is quite beneficial, with no downsides as there are no third parties (that they care about) integrating with it that would have to put in extra work to support it (besides their direct competitors, which is a nice bonus).


There's a ZFS driver for Linux, macOS, and Windows.


In all three cases you list, it's a third-party module providing support, rather than it being a standard feature of the OS that can be expected to be generally available on more than 1% of the install base.


ZFS development moves so fast that it is common for my (FreeBSD-based) FreeNAS box to warn me when I upgrade my OS that certain actions will make it incompatible with the prior version of FreeNAS.

That is fine and appropriate for a drive that will be connected to the system for the foreseeable future.

That kind of compatibility concern makes me squeamish about using ZFS for a drive that I want to share between different systems. If it's easy to make it incompatible between two releases for the same system, that smells like a waiting nightmare trying to keep it compatible between Linux, FreeBSD and Windows.


Yeah, it should stabilize in the next couple of months with the release of OpenZFS 2.0, as that release is supposed to signify the unification of ZFS on Linux and FreeBSD. ZFS on FreeBSD is being rebased onto ZFS on Linux. There's also been some talk of adding macOS ZFS support to OpenZFS, but that's still up in the air.


This is great news


Agreed! I’ve been running FreeBSD on various computers for very close to a decade now, and still run it on my mail server, but one problem that I faced a couple of years ago when I sold my old laptop, which I was running FreeBSD on, was that my other computer at home at the time was running Linux but I had an external HDD that I’d been using with the laptop and which I was using GELI encryption on.

Since I didn’t have money for any more hard drives at the time, I couldn’t transfer the data to anything else. So then when I wanted to access that data I’d do so via a FreeBSD VM running in VirtualBox. The performance was... not great.

I took the data that I needed the most, and for the rest of the data I let it sit at rest.

This week I wanted to use the drive again, and in the end because I was doing general cleanup, I decided to install FreeBSD on my desktop temporarily.

I actually love FreeBSD but the reason that I prefer to have my desktop running Linux is in big part because I want software on the computer to be able to take advantage of CUDA with the GTX 1060 6GB graphics card that I have in it, and unfortunately only the Linux driver by Nvidia has CUDA, the FreeBSD driver by Nvidia does not.

I was actually looking at installing VMWare vSphere on the computer instead, so that I could easily jump between running Linux and running FreeBSD with what I understand will probably be good performance compared to VirtualBox at least. But the NIC in my machine is not supported and vSphere would not install. I found some old drivers, messed around with VMWare tooling which required PowerShell, and which turned out not to work with the open source version of PowerShell on any other operating system than Windows. So then I downloaded a VM image of Win 10 from Microsoft [0], and used that to try and make a vSphere installer with drivers for my NIC. No luck at first attempt unfortunately. A decade ago I probably would have kept trying to make that work, but at this point in my life I said ok fine fuck it. I ordered an Intel I350 NIC online second-hand for about $40 shipping included, and the guy I bought it from sent it the next day. It is expected to arrive tomorrow. Meanwhile, I installed FreeBSD on the desktop. When the NIC arrives I will do some benchmarking of vSphere to decide whether to use vSphere on the desktop or to stick to either FreeBSD for a while on that machine or to put it back to just Linux again.

Anyways, that’s a whole lot more about my life and the stuff that I spend my spare time on than anyone would probably care to know :p but the point that I was getting to is that, with OpenZFS 2.0 I will be able to use ZFS native encryption instead of GELI and I will be able to read and write to said HDD from both FreeBSD and Linux.

I still need to scrape together money for another drive first before I can switch from GELI + ZFS to ZFS with native encryption though XD

Oh, and one more thing, with the external drive I was having a lot of instability with the USB 3.0 connection on FreeBSD, leading to a bit of pain with transferring data because the drive would disconnect now and then and I’d have to start over. But yesterday I decided to shuck the drive – that is, to remove the enclosure and to connect the drive with SATA like you would any other regular internal drive. It worked out excellently, the WD Essentials enclosure was easier to pry open than I had feared, and a video on YouTube showed me how to do it [1]. As prying tools I used a couple of plastic rulers. As a bonus, it also looks like I/O performance is better with the direct SATA connection than what I was getting with the USB 3.0 connection.

Speaking of that, some people have reported finding that the drives in their WD Essentials external drives were WD Red HDDs. I didn’t have the same luck with mine; mine was WD Blue. But idk if WD Red is even common with the capacity that mine has anyways. Mine is “only” 5TB and I think the people that have been talking about finding WD Red drives in theirs has bought 8TB models often. Idk. The main thing for me anyways is just to have my data and someplace to store it ^^

[0]: https://developer.microsoft.com/en-us/microsoft-edge/tools/v...

[1]: https://youtu.be/QApvLyorr3g


There's a Btrfs driver for Windows [1]

[1] https://github.com/maharmstone/btrfs


While it works fine, it tends to BSOD more often than I'd like...


The ZFS driver is still early in development and quite unstable! I’ve used it in read-only mode just so I could have some access to my ZFS pool while booted into Windows, and although it kind of worked in that use case it would still do weird things like randomly refuse to open certain files.


One thing that occasionally causes data interoperability problems for me is forgetting that Windows can't have colons in filenames. Not really sure what a filesystem driver could do about that.


exFAT is such a file system. It doesn't have journaling but why do you need journaling on backup media anyway?


Because shit can happen even when you're backing up...?


what kind of shit can happen without you knowing when backing up?


Power loss. I mean, backing up is the literal definition of a nightly job...


So, how does a journaling FS help in a power loss? You just restart the backup anyway. What’s the concern there?


The health of the backup device. Restarting a backup is one thing, having to rebuild a disk is another.


You don't need to rebuild a disk because some clusters were orphaned. It doesn't mean disk was corrupt, it just means there is some allocated space that isn't used. It's trivial to fix too. There is no danger there. Actually, what journaling does is to rollback that allocated space. Doing that manually doesn't make it less safe. For backup scenarios exFAT is perfectly feasible.


What is feasible and what is desirable are two different things.

With journalling, I don't have to know or care about any of this: I restart and chances are it's all back to normal. This is desirable.

You can store your backups on stone tablets, with a machine that carves rock to write 1s and 0s and a conveyor belt that feeds new tablets. That is perfectly feasible. It is also not desirable.


You restart the backup process, maybe ?


Not if your target FS is corrupt.


Yes, there are still dreams about bringing your home directory on a stick anywhere with you and just plugging it into the nearest public toilet when you need it.

But it hasn't and won't work for obvious reasons.


Ironically, a fully encrypted Linux installation with Btrfs or ZFS on a removable drive is quite easy to make and it works really well. I've one I made on an old SSD I had lying around and a 2.5" USB-C caddy and it's wonderful, you can have your work environment everywhere you want and even plug it into a running machine and boot it up using Hyper-V or KVM.


In the meantime v2 of the patch has already been submitted, addressing some of the points mentioned in the article: https://lore.kernel.org/linux-fsdevel/904d985365a34f0787a451...


This is such an odd article. Perhaps Paragon isn't being entirely altruistic with this move, but TR are being quite scathing of someone's work submitting a kernel driver with no direct financial reward - and give no real praise for, hopefully, fixing one of the biggest out-of-the-box gripes that Windows / Linux desktop dual-boot environments have. It's no wonder people often complain about the hardships of writing and maintaining open-source software!


I don't get where TheRegister is getting the drama. The thread doesn't seem that scathing to me. [0]

It doesn't build, but the person who pointed it out also supplied a diff to make it happen.

It also fails a few tests, but Paragon are more than happy to see if they can make it a bit more compliant.

UBSan finds a few potential bugs, but again, Paragon are more than happy to fix the problems.

There's some style guide suggestions, which Paragon seem to immediately take on board:

> The patch will be splitted in v2 file-wise. Wasn't clear initially which way will be more convenient to review.

[0] https://lore.kernel.org/lkml/2911ac5cd20b46e397be506268718d7...


Yeah, The Register can often be witty and irreverent, and I wouldn't complain about that, but that kind of wit only works if there's an actual point behind it. Jumping on the apparently popular bandwagon of squeezing out "drama" from the Linux kernel mailing list isn't adding anything of value.

It also seems a bit ignorant to call the submission "half-baked" just because it needed to be worked on and because someone pointed that out (also) in a slightly irreverent way.

All of those points you bring up about the feedback they got seem like just business as usual for a larger merge of code to a carefully developed FOSS project.


Yeah, this seems like one of those 'dramas' that incites bystanders more than the actual participants.


I think the issue here is the context... Paragon has had a rocky relationship with the Linux community in the past, for example their whiny reaction to exFAT being proposed for mainline a few months ago, so I think in the eyes of The Register there is a certain amount of inherent mistrust in Paragon proposing their NTFS driver for the kernel---shortly after going on the offensive against another formerly-commercial FS going mainline.

That said, I think the tension is more imagined than real, as the LKML doesn't really seem to have responded any more negatively than they do to most other patches.

example: https://arstechnica.com/information-technology/2020/03/the-e...


There's no big issue, and there hasn't been one for many years. Ext2/3/4fs read/write support has existed for a long time for Windows, and FUSE/Dokan has a working NTFS driver with r/w support as well (also for a long time). It just doesn't work out of the box (though it does on some distributions).

It's gonna take a while till this driver is mainline in the Linux kernel, and till that kernel is included in distributions (especially LTS ones).

I haven't read The Register article, but in the past it has come to my attention they dramatize their articles, and I don't want to read such media.


This is why companies don’t open source their code though. Do it and everyone looks at it and says “oh I think you’re stupid”


But that didn’t happen here.



Thanks, maybe it was the expectations set by the tone of The Register article, but the mailing list discussion seemed mostly very reasonable...


Possibly a dumb question, but what are the plusses/minuses of something like this as compared to NTFS-3G?

https://en.wikipedia.org/wiki/NTFS-3G


Mostly better performance / less CPU usage. NTFS-3G is typically CPU-limited even for sequential access, which isn't exactly a great spot for a FS driver to be in.


NTFS-3G is a userspace (FUSE) driver, and those usually have worse performance than kernel drivers.


You'll be able to install Linux on NTFS root now!


I wish Linux on NTFS could share metadata with WSL1. That way you could boot it off the same WSL1 files you could access in Windows.


Nice idea, and with Wine we could have all the beautiful Windows AV-Software :-)


I don't think it's completely fair.

Don't look a gift horse in the mouth, especially when you need one.


On one hand yes, on the other hand if you want your patch to be included it's your job to make it reviewable and to make it pass all the various checks.

After all, we all lived long enough without their software, we can live without it a bit longer.

Yes it's a gift, but it's also code that has to be maintained and updated along with the kernel, so it's not really 100% a gift. If you then consider that they might keep on selling a proprietary version of the code (which - don't get me wrong - is 100% legit and fair) they might also get basically free labour: they could rebase onto the latest public gpl version, they might get notes of various issues and bugs...

Quite literally, it's free labour.


“Free as in puppy.”


It turns out someone (could be you?) wrote a Medium article about open-source software using the "free as in puppy" metaphor, although I don't think they really used it in the same way you are here, with regards to a corporation using the open-source community to functionally receive free labor. I'm definitely going to add it to my mental list...

[1] https://medium.com/swlh/free-as-in-puppy-5b7eb1bf3908


Wasn’t me. Gifts bearing obligations come in a lot of shapes and sizes, I just always found the puppy metaphor very compelling and so I like to use it.

(As a mental exercise sometime, go to a pet store and figure out how much a “$12 hamster” costs once you get everything you need to set up and maintain a habitat.)


This is a wonderful description of "free" to go along with "free as in beer" and "free as in speech", thanks!


It's a beautiful metaphor.

Puppies can bring a lot of joy, but they certainly bring obligations.

Really spot on.


I've not heard this one before. Like a White Elephant but useful.

https://en.m.wikipedia.org/wiki/White_elephant


This is such a perfect addition to free as in beer/speech. Really quite surprised I haven't seen this more often, going to be using it haha


> if you want your patch to be included it's your job to make it reviewable

I have a mid-junior level co-worker who submits PRs that are excessively large. As best I've been able to determine, he's not especially good at managing the dependencies in his code, and he doesn't want to submit a broken PR, so his default is to wait until he gets everything written instead of breaking it down into smaller pieces.


I think their (the kernel maintainers’) position is that a single patch of 27000 lines of diffs is a bit of a nightmare to do code review on. I’m not sure if you took a look at the patch file (available at https://dl.paragon-software.com/ntfs3/ntfs3.patch ).

I think their point is ‘man, how are we going to divide this up amongst the maintainers? Who gets to check which function or call?’

Paragon's response (more or less: will fix in v2): https://lore.kernel.org/linux-fsdevel/a8fa5b2b31b349f2858306...


Using language like "monstrosity" though, doesn't help.


The word has several meanings, and I took it to mean "frighteningly large" rather than any of the more negative ones. Perhaps not the best choice, but not overtly rude.


"Monstrosity" seems like a fair word, and to the point as well. The follow-up messages showed good will on the part of the kernel maintainers.


I guess that's not you who has to do the review.


Heh. I've certainly been handed stuff I didn't like. It's always been easier to work through if I avoided anything that came off as emotional or insulting.


Sorry, dad.

We’ll all try really hard to ignore our individual biology to satisfy the most sensitive sensibilities.

Cause that expectation is not manipulative at all.


Could you please stop posting flamebait and unsubstantive comments to HN? Also, could you please stop creating accounts for every few comments you post? We ban accounts that do that. This is in the site guidelines: https://news.ycombinator.com/newsguidelines.html.

You needn't use your real name, of course, but for HN to be a community, users need some identity for other users to relate to. Otherwise we may as well have no usernames and no community, and that would be a different kind of forum. https://hn.algolia.com/?sort=byDate&dateRange=all&type=comme...


If it's working code (assume it is) then requiring it to be turd-polished for the purposes of code review is absurd. Whoever is asking for that is either disingenuous or has no real experience as a software developer.


Taking up patches to code is also taking up vulnerabilities and bugs they carry. Even assuming good intentions, a gift of code is a gift with a burden attached, and should pass deep scrutiny before getting accepted.

Sad as it may sound, free code doesn't come for free.


This is like giving someone a litter of puppies as a gift and then walking away.

Hooray, free puppies. Now you just have to care for them for the next 10 years.


Paragon stated their intent to maintain and support the code in their initial email, and added themselves to the maintainer files.

There's no drama happening here. The Paragon guys are trying to give it "properly", the Linux guys want it "properly", and the only thing happening is defining "proper" in this context.


Intent, promises and reality aren’t always aligned. Especially when handed a slapdash dump of tens of thousands of lines of code that didn’t meet kernel contribution guidelines.


That's true... but nothing has happened yet. No one has yet failed to hold up their end. All parties seem to want this to work -- it certainly isn't a dead drop.


Someone else mentioned their commitment.

Additionally, the driver is mature, as is the FS.


How is that a good comparison? When you're given puppies you are being made responsible for living creatures; if you're given a bunch of code you can literally just drop it.


> "Don't look a gift horse in the mouth"

Interesting you chose that metaphor. I can think of a particular gift horse where you would have seen Greek soldiers if you had looked into its mouth. :)


Working on someone else's code is no fun. I do it, when I get paid properly, but can't say I enjoy it. So I understand that not everyone is enthusiastic about that gift.

And who needs it? There are alternatives to read or write the occasional file (e.g. if you happen to have forgotten the Admin's password) on an NT box, but I can't say that I missed the ability to create new files on NTFS. And why now? Surely those who actually needed that functionality needed it years ago and meanwhile found some other solution. Perhaps those who actually need it will still volunteer to ready that driver for inclusion into the kernel, or sponsor someone who can?


NTFS-3g can create files and directories etc. on NTFS (albeit it doesn't do journaling, which is why it requires you to have a clean journal). Having a proper kernel driver is mostly about performance, not features. The current built-in kernel driver for NTFS is completely read-only though.


> The current built-in kernel driver for NTFS is completely read-only though.

Not quite:

> CONFIG_NTFS_RW:

> This enables the partial, but safe, write support in the NTFS driver.

> The only supported operation is overwriting existing files, without changing the file length. No file or directory creation, deletion or renaming is possible. Note only non-resident files can be written to, so you may find that some very small files (<500 bytes or so) cannot be written to.


> especially when you need one.

It's very debatable that Linux "needs" more support for a proprietary 25-year-old filesystem that is, in many ways, obsolete.


I wonder if the reason for the huge line count is overly verbose code, or if it's just the inherent complexity of NTFS. For contrast, I wrote a FAT32 filesystem driver (read/write) for an embedded system a long time ago, and it was less than 1K lines --- of Asm.


NTFS is a way more complicated (and feature-rich) filesystem than the (rather barebones) FAT32.


It's just a single data point, but I tried Paragon's ext4 driver (the paid version) for Mac many years ago. It seemed to work, but when the drive was connected back to Linux, there were all kinds of fsck errors. Immediately deleted it.


This has also been my exact experience. I used Paragon's NTFS driver (paid) for Mac for my external SSD. After using it, when plugging the drive into Windows it would always find "errors" and recommend to Scan and Fix them.


Fwiw, I had the same problem with Paragon but have had a lot more success with Tuxera’s driver, if you’re still looking for a solution.


27k lines isn't that crazy. I've merged larger patches, it just takes a while. This is a very negative view of an open source contribution.


> I've merged larger patches,

In what kind of context? The Linux kernel?

Filesystem code is pretty tricky to begin with, and prone to very subtle bugs with very not-subtle consequences. And this isn't greenfield development of a new filesystem, but an implementation that needs to remain highly compatible with Microsoft's version. This FS driver has to be maintained to track changes to two operating systems. So this 27kloc can reasonably be expected to encompass a lot more complexity than your average 27kloc, and it requires a lot more review effort than something like 27kloc of GPU driver register definitions.


> In what kind of context? The Linux kernel?

Not the Linux kernel, but a large embedded system.

> Filesystem code is pretty tricky to begin with, and prone to very subtle bugs with very not-subtle consequences.

This code has been running in the wild for quite a while now, it has had a trial by fire. And there's no way around testing, subtle ext4 bugs still crop up despite the maturity of the filesystem.

> And this isn't greenfield development of a new filesystem, but an implementation that needs to remain highly compatible with Microsoft's version.

No amount of code review will stop Microsoft from adapting their version. Also, I doubt Microsoft themselves will change too much about the filesystem given the compatibility they themselves have to maintain with cold storage NTFS drives.

> This FS driver has to be maintained to track changes to two operating systems.

You make it sound as if Microsoft have a hand in any of this. Also, have you seen the state of the current NTFS driver? It's a bit flakey (no disrespect to the maintainers).


> No amount of code review will stop Microsoft from adapting their version.

Way to miss the point. Code review for the kernel isn't just about verifying that the code currently works. It's also about making sure the code is maintainable. Microsoft is relevant here because their actions will increase the maintenance burden of any Linux NTFS driver. Kernel developers rightly need to be concerned about how difficult it will be to extend the NTFS driver to handle new NTFS features that Microsoft introduces.


> Way to miss the point.

The point wasn't so clear, but I see what you're saying now. Maintainability is normal code review though.

> It's also about making sure the code is maintainable. [..] Kernel developers rightly need to be concerned about how difficult it will be to extend the NTFS driver to handle new NTFS features that Microsoft introduces.

Maintainability is one thing, extensibility is another. Preparing your code to implement some changes completely outside of your control seems like a waste of time and something that might bite you later on.


27k is 27 kB.

This is 27 kLOC.


I solemnly swear never to complain again about the size of PRs given to me for review.

It's going to take months to thoroughly assess the quality of this.


Ouch. That looks really useful, but at the same time it's possibly their first kernel submission?

Hope I’m wrong, but I think they’ll be fighting an initial bad impression for a while despite good intentions


I hope it's better than their ext2 drivers for macOS. I have had nothing but problems and data loss and their support is the worst.


Same on Windows regarding data loss. Terrible and I’ll never use their ext2 drivers again.


> "It looks as though that with NTFS being surpassed by other more advanced file-systems,

Is this remotely true?

Will my grandmother not be using NTFS in 10 years?


Microsoft has been trying to come up with a replacement for NTFS for a very long time. They've had mixed success with trying to extend NTFS with more advanced features like the now-deprecated TxF. There's little doubt that even its creators see NTFS as something of a dead-end. Whether your grandmother is still using it in 10 years depends primarily on whether Microsoft can get its act together to pick and ship a replacement.


> Whether your grandmother is still using it in 10 years depends primarily on whether Microsoft can get its act together to pick and ship a replacement.

So yes she will I guess.

Then it's vital Linux gets NTFS working well.

My partner is not going to use Linux things if every time they try and transfer the 8k 3D holographic photos of our CRISPR'ed dog learning to spell to my grandmother it doesn't work on her Holovision.

True story, last month I just lost about 1 in 10 of my media files on my Linux Share to NTFS issues. So now I run the box on Windows.


> True story, last month I just lost about 1 in 10 of my media files on my Linux Share to NTFS issues. So now I run the box on Windows.

Were you seriously running a Linux-based NAS with NTFS as the underlying filesystem, or did you mean something very different? I can't imagine why anyone would ever think their choice of disk filesystem on a server—hiding behind a network filesystem—should be influenced by what disk filesystems are supported by client devices.


Given my original question about my grandmother has been downvoted it says it all about the judginess of Linux users ;)

My setup was -

Windows and Ubuntu dual boot gaming PC with a NTFS 4G external hard disk, Windows network share. Default boot was to Ubuntu.

Torrents running off wireless laptop to the network share.

Amazon Fire stick with Kodi wireless running to network share.


At 200 LoC/hour, 27k lines is roughly 135 hours of review, i.e. 4-5 weeks of work for a single person


Does it need to be included in the Linux kernel? I understand that NTFS is 27 years old and exists mostly to support Microsoft file systems. Can't the driver be an external download?


In-tree drivers are the standard for Linux kernel modules.


As long as it builds as a module, uses only the standard filesystem apis and doesn't have code changes outside that, code review is far less important IMO...


The linux kernel is successful partly because by and large what's in the kernel works and doesn't generally break on new revisions.

Accepting a new driver which may in fact not work, breaks in ambiguous ways, interacts with other components poorly, or otherwise generates headaches for the kernel could be more trouble than it's worth. Ultimately this driver will eventually need to be modified by others, and ensuring it doesn't get a reputation as a nightmare right out of the gate is also worthwhile.


At the risk of stating the obvious, this is an exceptionally short-sighted approach. Committing poorly understood code is not that different from using binary blobs.


So you’d willingly allow tens of thousands of lines of code which you don’t understand or really know what it does into a kernel which is used by literally billions of devices?


> uses only the standard filesystem apis and doesn't have code changes outside that

That's literally why having to review 27K lines of code is a pain. You're not seriously going to trust, sight unseen, that there's nothing in the code that slips in a backdoor or catastrophic data-loss bug, are you?


I wish you weren't getting downvoted, but what you said is absolutely correct. If it's a module and separate from core, who cares. If it's modifying core kernel or mixed up with kernel space code, then nope.


Linux isn't a microkernel; a filesystem accepted into the kernel source tree will run as kernel space code whether or not it's compiled as a loadable module. So all 27k lines need to be audited for security purposes and to ensure they're interfacing with the rest of the kernel only in the approved ways, because there aren't a lot of technological barriers to the filesystem misbehaving.

But more important than that is the maintenance burden. NTFS will be around for a long time, but it's also a moving target because Microsoft hasn't replaced it yet. Kernel developers have to keep in mind how this code will look in a few decades, after all the original developers are retired. If it's written in a very different style from other Linux filesystems, will there be anyone left who both knows enough about the workings of the Linux IO stack 20 years hence, and understands Paragon's code conventions?



