Paragon submits 27k-line NTFS driver to Linux kernel (theregister.com)
203 points by kiyanwang on Aug 23, 2020 | 149 comments


I understand that reviewing 27k LoC is daunting and probably not very fun.

But unlike most patches that draw a similar response, it's not a narrowly useful patch that mostly serves the submitter. Proper NTFS support benefits a large proportion of Linux users (the jab from the article that there are more advanced file systems out there seems out of place; there are no signs that Windows is about to switch its default FS to something else).

Additionally this code has been used in production for years now (e.g. my 2015 router runs the closed source version of this driver in order to support NTFS formatted external drives) so most likely a lot of quality issues have already been found and addressed.

So I feel it's a bit unreasonable to respond with so much negativity to this contribution.


It's also no big deal from either side. Paragon sent in the patch and it's appreciated. There are a few problems to sort out before it can go in. Reviewers noted the issues and what would need to be done to get this through, and that process is happening.

Split your diff! and Fix your makefile! have to be two of the most benign and common pieces of diff feedback I've seen. I feel that you could make a media story about any submission to the Linux kernel based on there being comments in the review process.


Admittedly I didn't actually read the mailing list discussion. It's entirely possible that The Register made up a big drama where there was none.


I read the discussion. There's no drama at all. Paragon did an unreviewable code dump with intent to maintain, and they are warmly welcome in general. David laid out the path to review and probable acceptance: https://lore.kernel.org/linux-fsdevel/20200815190642.GZ2026@... If there was any fuss, it was because of the unreviewable nature of the patch, but especially by kernel standards the discussion was cordial. In fact, aside from Nikolay's outburst, it was a cordial discussion by any standard. (Someone should've gently told him this is no way to welcome newcomers, especially newcomers carrying such a gift.)

Others noted it needs to pass the existing test suite and that it is close.


Not sure if this counts as drama per Linux kernel mailing list standards: https://lore.kernel.org/linux-fsdevel/2911ac5cd20b46e397be50...

> So how exactly do you expect someone to review this monstrosity ?


It seems from the link that the kernel developers would rather have one patch per new file plus a patch that does the integration, instead of one big patch with everything. That's a bit unconventional, perhaps because they tend to use git alone instead of higher-level software that would help them break down a big single commit; but whatever they're doing clearly works for them.

The entire dispute seems to be that minor question of style, nothing substantive. I don't think anyone's especially unhappy on either side. The controversy seems manufactured, perhaps by a reporter who noticed the gruff language but lacked the technical knowledge to understand what's actually going on.

Most people developing free software (probably including both the submitters and recipients of this patch) could make a lot more money elsewhere, but have chosen instead to do work with considerable public benefit. That's thankless enough already without some reporter inventing drama for clicks.


> perhaps because they tend to use git alone instead of higher-level software that would help them break down a big single commit

git is capable of breaking down a large diff into manageable pieces (e.g., limiting a diff to a single file), but reviewing code in a mailing list means replying to the message that contains a patch and replying inline to certain parts to comment on it.

As for higher level software that could break down a large commit, what specifically do you have in mind? I can't think of any feature of other review tools like GitHub/GitLab, gerrit, reviewboard, phabricator, etc. that would make something like this easy to review.


I meant like GitHub and competitors, which let you attach comments to specific lines and files and such, and perhaps follow references into the full code faster than you could flipping between your mail client and your editor (and save you the effort of applying the patch to a local tree for that review). Since the kernel developers prefer to discuss on a plain mailing list and not use such tools, it makes sense that they prefer smaller chunks.

27 kLOC will be a big project to review no matter what, but I'd probably rather take them in a single commit--the files presumably depend on each other, and there's probably no order in which the files could be reviewed in isolation without reference to files not yet reviewed. (Obviously we try for hierarchical structure that would make that possible, but not usually with perfect success.)

That's a matter of personal preference, though, and people who want a project to merge their contributions should adhere to the maintainer's preferences. In any case, it seems Paragon intends to do exactly that. I doubt Paragon expected their reward for their contribution would be an article read by thousands of people that called it "half-baked" over this minor point, and I can't imagine such publicity encourages others to make similar contributions in future.


> I meant like GitHub and competitors, which let you attach comments to specific lines and files and such

Github does allow you to filter the diff down to the commit or jump to a particular file within the diff. Commenting on a line in the diff isn't really any different than positioning one's comment inline below the relevant line(s) of code in an email reply. I know that in Github, it's also possible to comment directly on a commit (though those comments are not displayed with any context in the general PR view), unlike an email reply to a particular patch series.

Depending on one's email client, it's certainly possible to search for things like /^diff/ or /^@@/ to jump from file to file or hunk to hunk within the compose window.

> perhaps follow references into the full code faster than you could flipping between your mail client and your editor (and save you the effort of applying the patch to a local tree for that review).

For some, the email client doubles as an editor (i.e., gnus). And, at least in my experience, it's far faster to navigate code in an editor compared to the web interface that GitHub/GitLab provides.

> 27 kLOC will be a big project to review no matter what, but I'd probably rather take them in a single commit--the files presumably depend on each other

While that's true, the dependency can be preserved when merging the branch of the series of commits in the mainline repository. Plus, many may find it easier to review declarations, definitions, and calls in that order.


> Most people developing free software (probably including both the submitters and recipients of this patch) could make a lot more money elsewhere

I think most free software developers are normal corporate employees. I work on tons of free software as my job, like most of my peers, but that’s normal in the industry. I don’t consider myself a free software developer.


Fair--depending on what "most" is weighted by, I may have overstated, and a Google employee who happens to get assigned to work on Chrome stuff is certainly making no personal sacrifice.

I meant independent volunteers or people working for free-software-focused companies (which I believe usually offer well below FAANG-level compensation, especially at the high end, though still enough to live quite well). Excluding hardware vendors porting Linux to their own products, I believe the core kernel developers tend to fall into that last category. I have no specific knowledge of their individual compensation, but the technical leads responsible for closed-source projects of similar scope make incredible amounts of money.


It’s a legitimate concern - I would not assume malice. How exactly would someone review a 25k loc .patch file?


It would be tough, no doubt.

But it's not like splitting the feature into 100 patches of 250 lines each would make it any quicker to review. Or merging code that was known not to work, as it was only a fraction of what was needed for the functionality.


> But it's not like splitting the feature into 100 patches of 250 lines each would make it any quicker to review. Or merging code that was known not to work, as it was only a fraction of what was needed for the functionality.

That would also be rejected, because the kernel maintainers aren't idiots and their standards aren't the stupid arbitrary rules you construe them to be. They generally want big changes to be broken up into logical, sensible chunks that each leave the tree in a usable state, so that git-bisect still works.


How do people merge big new filesystems in practice though? Especially one with years of pre-existing out-of-tree development?

I guess one could start by merging a skeleton of the filesystem which supports mount/unmount but then returns an IO error on every operation? And then a patch to add directory traversal (you can view the files but not their contents), and then a patch to add file reading, and then a patch to add file writing, and then a patch to add mkdir/rmdir, and then a patch to add rename/delete of regular files.

Breaking down an existing filesystem into a sequence of patches like that, no doubt it is doable, but it is going to be a lot of work.
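
(For a rough idea, such a skeleton first patch could be little more than registering a filesystem type whose mount always fails; everything below, including the ntfs3_skel name, is made up for illustration and has nothing to do with Paragon's actual code:)

    /* Hypothetical first patch in such a series: register a filesystem type
     * that refuses every mount with -EIO. Later patches would add a real
     * fill_super, then directory traversal, reads, writes, and so on.
     * Nothing here is taken from Paragon's submission. */
    #include <linux/module.h>
    #include <linux/fs.h>
    #include <linux/err.h>

    static struct dentry *ntfs3_skel_mount(struct file_system_type *fs_type,
                                           int flags, const char *dev_name,
                                           void *data)
    {
            /* A later patch would call mount_bdev() with a real fill_super();
             * for now every mount attempt simply fails. */
            return ERR_PTR(-EIO);
    }

    static struct file_system_type ntfs3_skel_fs_type = {
            .owner    = THIS_MODULE,
            .name     = "ntfs3_skel",
            .mount    = ntfs3_skel_mount,
            .fs_flags = FS_REQUIRES_DEV,
    };

    static int __init ntfs3_skel_init(void)
    {
            return register_filesystem(&ntfs3_skel_fs_type);
    }

    static void __exit ntfs3_skel_exit(void)
    {
            unregister_filesystem(&ntfs3_skel_fs_type);
    }

    module_init(ntfs3_skel_init);
    module_exit(ntfs3_skel_exit);
    MODULE_LICENSE("GPL");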


My guess is that given the history of this filesystem implementation, most of the review effort will be focused on the interface between this FS and the rest of the kernel. It's typical for all the changes touching communal files or introducing generic helper functions or data structures to be broken out into separate commits. If any of those helpers are a reinvention of stuff that's already in the kernel, there will need to be a justification for why NTFS needs its own special versions. It's not typical for a large patch series adding genuinely new stuff to be broken up into absurdly tiny commits. For the stuff that's truly internal to the filesystem implementation, it looks like one patch per file will be an acceptable granularity.


I didn't mean to imply otherwise. Drama is often not malice and rather due to legitimate concerns on one or both sides.


I assume malice because of the tone. The concern is legitimate, the tone is offputting.


Welcome to Linux kernel development. By lkml history, this tone is very mild. There are many examples of far worse commentary and personal attacks on devs. I’m not justifying this by the way. Linus can be a very smart jerk, and as a leader (THE leader) he sets the tone for what’s acceptable in the community.

https://www.zdnet.com/article/linux-developer-who-took-on-li...


There wasn't much drama in the mailing list discussion as I read it. Mostly comments asking "can you make this easier for us to review?".


There were two specific concerns in the initial review that I think were reasonable:

1) The Linux kernel already has an in-kernel read-only NTFS driver. What should be done about it? (There are a number of reasonable options here, including just getting rid of it and replacing it with Paragon's, but that requires at least some buy-in from the maintainers of the existing driver.)

2) The patch didn't actually build, which was a one-line Makefile fix, but raised some concern about how it was tested/how the patch was generated.


It's strange to me that there's no file system with basic features like journaling and support for files larger than a couple of GB, that is supported across all major desktop OSes (MacOS, Windows, Linux, and FreeBSD). All platforms support the same i/o standards like USB or DisplayPort, why did filesystems never make the cut to become a cross-system standard?

Imagine if you could have a backup drive (with reasonable modern data protections) that you could just plug into different systems and save all your files to. Isn't it odd that such a simple thing isn't possible? I guess network attached storage has gotten pretty accessible at this point so there's no need for it?


I think the basic problem is that FAT is generally "good enough," and in the increasingly common case where it isn't exFAT is close to universal and addresses the only problem that consumers frequently run into (file size limit).

While FAT/exFAT leave the possibility of a variety of different types of filesystem inconsistency, these seem to be fairly rare in actual practice, probably in good part due to Windows disabling write caching on devices it thinks are portable. This kind of special handling of external devices is sort of distasteful, and leads to some real downsides on Windows (e.g. LDM handling USB devices weirdly), but using a newer file system doesn't really eliminate that problem - NTFS and Ext* external devices require special handling on mounting to avoid the problems that come from file permissions traveling from machine to machine, for example.


> in good part due to Windows disabling write caching on devices it thinks are portable. This kind of special handling of external devices is sort of distasteful

What's distasteful is that on my Linux machine with a lot of RAM, when I copy a multi-gigabyte file to a USB key it "completes" the copy almost immediately, when actually all it has done is copy the file to a RAM buffer. Then when I try to disconnect the drive, it will hang for ages while it actually finishes the write. IMO Windows does it better here (although I never realised what exactly they did, nice to know).


You should try the "sync" command; this is exactly the problem it solves.


I'm talking about dragging files with the GUI file manager, I shouldn't have to use any commands :(.

What I end up doing is using a "watch" on some command I can't remember that shows overall dirty pages.


GUI file copy tools should be using O_DIRECT, or periodically calling f/sync(). An argument could also be made that the kernel write cache should have a size limit so that one-off write latency is masked, but very slow bulk I/O is not masked.
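
Something like this minimal userspace sketch (not taken from any real file manager; buffer and flush sizes are arbitrary) shows the periodic-fdatasync idea:

    /* Minimal sketch: copy src to dst but fdatasync() every SYNC_SZ bytes so
     * the page cache never runs far ahead of a slow target device. Sizes are
     * arbitrary; error paths are simplified. Not from any real file manager. */
    #include <fcntl.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define BUF_SZ  (1 << 20)       /* 1 MiB read/write chunks */
    #define SYNC_SZ (64 << 20)      /* flush after every 64 MiB written */

    int copy_with_periodic_sync(const char *src, const char *dst)
    {
            int in = open(src, O_RDONLY);
            int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
            char *buf = malloc(BUF_SZ);
            off_t since_sync = 0;
            ssize_t n;

            if (in < 0 || out < 0 || !buf)
                    return -1;

            while ((n = read(in, buf, BUF_SZ)) > 0) {
                    if (write(out, buf, n) != n)
                            return -1;
                    since_sync += n;
                    if (since_sync >= SYNC_SZ) {
                            fdatasync(out); /* progress now reflects the device */
                            since_sync = 0;
                    }
            }
            fdatasync(out);                 /* final flush before reporting done */
            close(in);
            close(out);
            free(buf);
            return n < 0 ? -1 : 0;
    }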


O_DIRECT seems like overkill, and the lack of write buffering could be a real detriment in some circumstances. Syncing at the end of each operation (from the user's perspective) should be the best mix of throughput and safety, but it makes it hard to do an accurate progress bar. Before the whole batch operation is finished, it may be useful to periodically use madvise or posix_fadvise to encourage the OS to flush the right data from the page cache—but I don't know if Linux really makes good use of those hints at the moment.

On really new kernels, it might work well to use io_uring to issue linked chains of read -> write -> fdatasync operations for everything the user wants to copy, and base the GUI's progress bar on the completion of those linked IO units. That will probably ensure the kernel has enough work enqueued to issue optimally large and aligned IOs to the underlying devices. (Also, any file management GUI really needs to be doing async IO to begin with, or at least on a separate thread. So adopting io_uring shouldn't be as big an issue as it would be for many other kinds of applications.)
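
A very rough liburing sketch of one such linked chain might look like the following; the chunk size, setup, and error handling are all simplified and purely illustrative:

    /* Rough liburing sketch: one linked read -> write -> fdatasync chain.
     * Short reads, buffer reuse and error handling are glossed over; the
     * chunk size is arbitrary. Not taken from any existing file manager. */
    #include <liburing.h>
    #include <stdlib.h>
    #include <sys/types.h>

    #define CHUNK (1 << 20)

    int copy_chunk_durably(int in_fd, int out_fd, off_t off)
    {
            struct io_uring ring;
            struct io_uring_sqe *sqe;
            struct io_uring_cqe *cqe;
            void *buf = malloc(CHUNK);

            if (!buf || io_uring_queue_init(8, &ring, 0) < 0)
                    return -1;

            /* IOSQE_IO_LINK chains the sqes: the write only runs if the read
             * succeeds, and the fdatasync only runs if the write succeeds. */
            sqe = io_uring_get_sqe(&ring);
            io_uring_prep_read(sqe, in_fd, buf, CHUNK, off);
            sqe->flags |= IOSQE_IO_LINK;

            sqe = io_uring_get_sqe(&ring);
            io_uring_prep_write(sqe, out_fd, buf, CHUNK, off);
            sqe->flags |= IOSQE_IO_LINK;

            sqe = io_uring_get_sqe(&ring);
            io_uring_prep_fsync(sqe, out_fd, IORING_FSYNC_DATASYNC);

            io_uring_submit(&ring);

            /* Wait for all three completions; the final (fsync) completion is
             * the point at which a progress bar should advance, because only
             * then is the chunk durably on the target device. */
            for (int i = 0; i < 3; i++) {
                    if (io_uring_wait_cqe(&ring, &cqe) < 0)
                            break;
                    io_uring_cqe_seen(&ring, cqe);
            }

            io_uring_queue_exit(&ring);
            free(buf);
            return 0;
    }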


If you syncfs every reasonable-unit-of-time, you can get a progress bar.


Not always. If you're reading from a SSD and writing to a slow USB 2.0 flash drive, you could end up enqueuing in one second a volume of writes that will take the USB drive tens of seconds to sync(), leading to a very unresponsive progress bar. You almost have to do a TCP-like ramp up of block sizes until you discover where the bottleneck is.
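
One illustrative way such a ramp-up could be implemented (thresholds and sizes are entirely made up, not a real design):

    /* Sketch: grow or shrink the batch written between flushes based on how
     * long each syncfs() takes, so the progress bar stays responsive even on
     * a slow USB 2.0 target. */
    #define _GNU_SOURCE             /* for syncfs() */
    #include <sys/time.h>
    #include <unistd.h>
    #include <stddef.h>

    static double now_sec(void)
    {
            struct timeval tv;
            gettimeofday(&tv, NULL);
            return tv.tv_sec + tv.tv_usec / 1e6;
    }

    /* Flush the target filesystem, then decide how many bytes to write before
     * the next flush. Cheap flush: write more next time; slow flush: back off. */
    size_t flush_and_adjust_batch(int out_fd, size_t batch_bytes)
    {
            double t0 = now_sec();
            double elapsed;

            syncfs(out_fd);                 /* wait for the device to catch up */
            elapsed = now_sec() - t0;

            if (elapsed < 0.5 && batch_bytes < ((size_t)256 << 20))
                    return batch_bytes * 2;
            if (elapsed > 2.0 && batch_bytes > ((size_t)1 << 20))
                    return batch_bytes / 2;
            return batch_bytes;
    }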


Which distro/desktop? My standard ubuntu 18.04 with gnome and mounted through the file manager doesn't do this and copying to a slow USB drive is as glacial as it should be, but copying between internal drives is instant and hidden.


Default gnome on Ubuntu 20.04. How much free ram do you normally have? If you don't have enough to buffer the whole operation, then it's not a problem.


Now that I think about it, this might actually explain some bugs I've seen when copying multiple files. Copying one file seems to work but then copying a second the progress sits at 0%, it's probably waiting for the first transfer to sync.


sync is your friend.


I don’t know if it’s just because I’m largely on Mac and Apple’s drivers suck, but I have run into lots of data corruption issues with exFAT on large drives (1TB+). More than enough for me to stop using it.

(Now I use NTFS and Tuxera’s commercial Mac driver, because I don’t know how else to have a cross-platform filesystem without a stupidly-low file-size limit.)


It's pretty widely accepted that exFAT should be avoided on Nintendo Switch because of data corruption issues. So I don't think it's just Apple.


> because I don’t know how else to have a cross-platform filesystem without a stupidly-low file-size limit

UDF is natively supported on all major OSes.

I haven't used it in macOS but on paper macOS seems to have even better support for it than Linux, so you can give it a shot.

Lastly, you can use https://github.com/JElchison/format-udf for creating the most compatible filesystem across different devices.


Oh, the optical disk filesystem! That would be a useful hack. I won’t have problems using it on a standard hdd or ssd?


It isn't a 1st class fs in any OS so it lacks some polish tooling-wise etc., but it should work fine for basic file transfer jobs.

I'm using it on an external HDD for copying/watching video files between Linux and Windows boxes and haven't had any problems yet.

On the other hand, 7-8 years ago I tried to use a UDF partition for sharing a common Thunderbird profile between Windows and Linux and ran into strange errors on the Windows side after a while. I didn't dig further, so it may have been a non-UDF OS or tooling issue, or it may have been solved in the meantime.


> why did filesystems never make the cut to become a cross-system standard?

Many reasons, the most important being patents and insistence by vendors to keep stuff proprietary (exfat, ntfs, HFS, ZFS).

Also, there are fundamental differences on the OS level:

- how users are handled: Unices generally use numerical UID/GID while Windows uses GUIDs

- Windows has/had a 255-char maximum length of the total path of a file

- Unices have the octal mode set for permissions, old Windows had four bits (write protected, archived, system, hidden) and that's it

- Windows, Linux and OS X have fundamentally different capabilities for ACLs - a cross platform filesystem needs to handle all of them and that in a sane way

- don't get me started on the mess of advanced fs features (sparse files, transparent compression, immutability of files, transparent encryption, checksums, snapshots, journals)

- Exotic stuff like OS X and NTFS additional metadata streams that don't even have a representation in Linux or most/all BSDs

And finally, embedded devices and bootloaders. Give me a can of beer and a couple weeks and I'll probably be able to hack together a FAT implementation from scratch. Everything else? No f...in' way. Stuff like journals is too advanced to handle in small embedded devices. The list goes on and on.


Filesystems have never been standardised. In mainframe/mini days manufacturers supplied a selection of OS options for the same hardware, and there was no expectation that the various filesystems would be compatible between different OSs.

Which is why we have abstraction layers like Samba (etc) on top of networked drives. They're descendants of vintage cross-OS utilities like PIP which provide a minimal interface that supports folder trees and basic file operations.

But a lot of OS-specific options remain OS-specific, and there's literally no way to design a globally compatible file system that implements them all.

This isn't to say a common standard is impossible, but defining its features would be a huge battle. And including next-gen options - from more effective security and permissions, to content database support, to some form of literally global file ID system, to smart versioning - would be even more of a challenge.


> Windows has/had a 255-char maximum length of the total path of a file

The actual path of a file in Windows can be practically unlimited, but going beyond the ~260-character limit requires either the special extended-path notation (which looks like a network path) or relative paths. Recent versions of Windows include a setting that removes the limitation in the APIs, added because of development issues like deeply nested node packages.


Windows uses SIDs, not GUIDs.


No idea who downvoted you, thanks for the correction!


UDF looks a lot like that file system on paper, but my understanding is that it practically falls apart because there are too many bugs and too much variation in which features are really supported among the various implementations. Everyone agrees on the DVD-ROM subset, but beyond that it seems like a crapshoot.


ISO 13346[0] is supported everywhere and can handle files larger than 4GB. It's used on DVDs, but it can also be used on a flash drive[1].

[0] https://en.wikipedia.org/wiki/Universal_Disk_Format [1] https://github.com/JElchison/format-udf


Using a non-FAT filesystem for portable storage opens up lots of issues regarding broken permissions. Try to create a file on an ext4-formatted USB flash drive using your current user; rest assured that, unless you remembered to set its permissions to 777, on another computer you'll have to chmod it because it still belongs to the possibly dangling creator's UID and GIDs. If you don't have root access, and unless coincidentally your user has the same UID as on your other machine, you're screwed and you have to go back grudgingly to a system where you have administrative privileges.

Same thing applies to NTFS, but I've seen that more often than not Windows creates files on removable drives with extremely open permissions, and in my experience NTFS-3G just straight ignores NTFS ACLs on the drives it mounts, so more often than not it JustWorks™ in common use cases.

I think a journaled exFAT-like filesystem would be perfect for this task, but given how hard it was for exFAT to even start to displace FAT32, even if it actually existed I wouldn't expect it to succeed any time soon.


Almost as if the corporations have a conflict of interest with their consumers.


Not standardising means being able to distinguish yourself easier, it's just that for anything that integrates with third-party stuff, standardisation is way cheaper. Apple doesn't make monitors or flash drives, so making their own specs for those wouldn't be beneficial to them and would increase prices of compatible products. But a filesystem is something they do make and being able to do whatever they want with it is quite beneficial, with no downsides as there are no third parties (that they care about) integrating with it that would have to put in extra work to support it (besides their direct competitors, which is a nice bonus).


There's a ZFS driver for Linux, macOS, and Windows.


In all three cases you list, it's a third-party module providing support, rather than it being a standard feature of the OS that can be expected to be generally available on more than 1% of the install base.


ZFS development moves so fast that it is common for my (FreeBSD-based) FreeNAS box to warn me when I upgrade my OS that certain actions will make it incompatible with the prior version of FreeNAS.

That is fine and appropriate for a drive that will be connected to the system for the foreseeable future.

That kind of compatibility concern makes me squeamish about using ZFS for a drive that I want to share between different systems. If it's easy to make it incompatible between two releases for the same system, that smells like a waiting nightmare trying to keep it compatible between Linux, FreeBSD and Windows.


Yeah, it should stabilize in the next couple of months with the release of OpenZFS 2.0, as that release is supposed to signify the unification of ZFS on Linux and FreeBSD. ZFS on FreeBSD is being rebased onto ZFS on Linux. There's also been some talk of adding macOS ZFS support to OpenZFS, but that's still up in the air.


This is great news


Agreed! I’ve been running FreeBSD on various computers for very close to a decade now, and still run it on my mail server, but one problem that I faced a couple of years ago when I sold my old laptop, which I was running FreeBSD on, was that my other computer at home at the time was running Linux but I had an external HDD that I’d been using with the laptop and which I was using GELI encryption on.

Since I didn’t have money for any more hard drives at the time, I couldn’t transfer the data to anything else. So then when I wanted to access that data I’d do so via a FreeBSD VM running in VirtualBox. The performance was... not great.

I took the data that I needed the most, and for the rest of the data I let it sit at rest.

This week I wanted to use the drive again, and in the end because I was doing general cleanup, I decided to install FreeBSD on my desktop temporarily.

I actually love FreeBSD but the reason that I prefer to have my desktop running Linux is in big part because I want software on the computer to be able to take advantage of CUDA with the GTX 1060 6GB graphics card that I have in it, and unfortunately only the Linux driver by Nvidia has CUDA, the FreeBSD driver by Nvidia does not.

I was actually looking at installing VMWare vSphere on the computer instead, so that I could easily jump between running Linux and running FreeBSD with what I understand will probably be good performance compared to VirtualBox at least. But the NIC in my machine is not supported and vSphere would not install. I found some old drivers, messed around with VMWare tooling which required PowerShell, and which turned out not to work with the open source version of PowerShell on any other operating system than Windows. So then I downloaded a VM image of Win 10 from Microsoft [0], and used that to try and make a vSphere installer with drivers for my NIC. No luck at first attempt unfortunately. A decade ago I probably would have kept trying to make that work, but at this point in my life I said ok fine fuck it. I ordered an Intel I350 NIC online second-hand for about $40 shipping included, and the guy I bought it from sent it the next day. It is expected to arrive tomorrow. Meanwhile, I installed FreeBSD on the desktop. When the NIC arrives I will do some benchmarking of vSphere to decide whether to use vSphere on the desktop or to stick to either FreeBSD for a while on that machine or to put it back to just Linux again.

Anyways, that’s a whole lot more about my life and the stuff that I spend my spare time on than anyone would probably care to know :p but the point that I was getting to is that, with OpenZFS 2.0 I will be able to use ZFS native encryption instead of GELI and I will be able to read and write to said HDD from both FreeBSD and Linux.

I still need to scrape together money for another drive first before I can switch from GELI + ZFS to ZFS with native encryption though XD

Oh, and one more thing, with the external drive I was having a lot of instability with the USB 3.0 connection on FreeBSD, leading to a bit of pain with transferring data because the drive would disconnect now and then and I’d have to start over. But yesterday I decided to shuck the drive – that is, to remove the enclosure and to connect the drive with SATA like you would any other regular internal drive. It worked out excellently, the WD Essentials enclosure was easier to pry open than I had feared, and a video on YouTube showed me how to do it [1]. As prying tools I used a couple of plastic rulers. As a bonus, it also looks like I/O performance is better with the direct SATA connection than what I was getting with the USB 3.0 connection.

Speaking of that, some people have reported finding that the drives in their WD Essentials external drives were WD Red HDDs. I didn’t have the same luck with mine; mine was WD Blue. But idk if WD Red is even common with the capacity that mine has anyways. Mine is “only” 5TB and I think the people that have been talking about finding WD Red drives in theirs has bought 8TB models often. Idk. The main thing for me anyways is just to have my data and someplace to store it ^^

[0]: https://developer.microsoft.com/en-us/microsoft-edge/tools/v...

[1]: https://youtu.be/QApvLyorr3g


There's a Btrfs driver for Windows [1]

[1] https://github.com/maharmstone/btrfs


While it works fine, it tends to BSOD more often than I'd like...


The ZFS driver is still early in development and quite unstable! I’ve used it in read-only mode just so I could have some access to my ZFS pool while booted into Windows, and although it kind of worked in that use case it would still do weird things like randomly refuse to open certain files.


One thing that occasionally causes data interoperability problems for me is forgetting that Windows can't have colons in filenames. Not really sure what a filesystem driver could do about that.


exFAT is such a file system. It doesn't have journaling but why do you need journaling on backup media anyway?


Because shit can happen even when you're backing up...?


what kind of shit can happen without you knowing when backing up?


Power loss. I mean, backing up is the literal definition of a nightly job...


So, how does a journaling FS help in a power loss? You just restart the backup anyway. What’s the concern there?


The health of the backup device. Restarting a backup is one thing, having to rebuild a disk is another.


You don't need to rebuild a disk because some clusters were orphaned. It doesn't mean disk was corrupt, it just means there is some allocated space that isn't used. It's trivial to fix too. There is no danger there. Actually, what journaling does is to rollback that allocated space. Doing that manually doesn't make it less safe. For backup scenarios exFAT is perfectly feasible.


What is feasible and what is desirable are two different things.

With journalling, I don't have to know or care about any of this: I restart and chances are it's all back to normal. This is desirable.

You can store your backups on stone tablets, with a machine that carves rock to write 1s and 0s and a conveyor belt that feeds new tablets. That is perfectly feasible. It is also not desirable.


You restart the backup process, maybe ?


Not if your target FS is corrupt.


Yes, there are still dreams about bringing your home directory on a stick anywhere with you and just plugging it into the nearest public toilet when you need it.

But it hasn't and won't work for obvious reasons.


Ironically, a fully encrypted Linux installation with Btrfs or ZFS on a removable drive is quite easy to make and it works really well. I've one I made on an old SSD I had lying around and a 2.5" USB-C caddy and it's wonderful, you can have your work environment everywhere you want and even plug it into a running machine and boot it up using Hyper-V or KVM.


In the meantime v2 of the patch has already been submitted, addressing some of the points mentioned in the article: https://lore.kernel.org/linux-fsdevel/904d985365a34f0787a451...


This is such an odd article. Perhaps Paragon isn't being entirely altruistic with this move, but TR are being quite scathing of someone's work submitting a kernel driver with no direct financial reward - and give no real praise for, hopefully, fixing one of the biggest out-of-the-box gripes that Windows / Linux desktop dual-boot environments have. It's no wonder people often complain about the hardships of writing and maintaining open-source software!


I don't get where TheRegister is getting the drama. The thread doesn't seem that scathing to me. [0]

It doesn't build, but the person who pointed it out also supplied a diff to make it happen.

It also fails a few tests, but Paragon are more than happy to see if they can make it a bit more compliant.

UBSan finds a few potential bugs, but again, Paragon are more than happy to fix the problems.

There's some style guide suggestions, which Paragon seem to immediately take on board:

> The patch will be splitted in v2 file-wise. Wasn't clear initially which way will be more convenient to review.

[0] https://lore.kernel.org/lkml/2911ac5cd20b46e397be506268718d7...


Yeah, The Register can often be witty and irreverent, and I wouldn't complain about that, but that kind of wit only works if there's an actual point behind it. Jumping on the apparently popular bandwagon of squeezing out "drama" from the Linux kernel mailing list isn't adding anything of value.

It also seems a bit ignorant to call the submission "half-baked" just because it needed to be worked on and because someone pointed that out (also) in a slightly irreverent way.

All of those points you bring up about the feedback they got seem like just business as usual for a larger merge of code to a carefully developed FOSS project.


Yeah, this seems like one of those 'dramas' that incites bystanders more than the actual participants.


I think the issue here is the context... Paragon has had a rocky relationship with the Linux community in the past, for example their whiny reaction to exFAT being proposed for mainline a few months ago, so I think in the eyes of The Register there is a certain amount of inherent mistrust in Paragon proposing their NTFS driver for the kernel---shortly after going on the offensive against another formerly-commercial FS going mainline.

That said, I think the tension is more imagined than real, as the LKML doesn't really seem to have responded any more negatively than they do to most other patches.

example: https://arstechnica.com/information-technology/2020/03/the-e...


There's no big issue, and there hasn't been one for many years. Ext2/3/4fs read/write support has existed for a long time for Windows, and FUSE/Dokan has a working NTFS driver with r/w support as well (also for a long time). It just doesn't work out of the box (though it does on some distributions).

It's gonna take a while till this driver is mainline in the Linux kernel, and till that kernel is included in distributions (especially LTS ones).

I haven't read The Register article, but in the past it has come to my attention they dramatize their articles, and I don't want to read such media.


This is why companies don’t open source their code though. Do it and everyone looks at it and says “oh I think you’re stupid”


But that didn’t happen here.



Thanks, maybe it was the expectations set by the tone of The Register article, but the mailing list discussion seemed mostly very reasonable...


Possibly a dumb question, but what are the plusses/minuses of something like this as compared to NTFS-3G?

https://en.wikipedia.org/wiki/NTFS-3G


Mostly better performance / less CPU usage. NTFS-3G is typically CPU-limited even for sequential access, which isn't exactly a great spot for a FS driver to be in.


NTFS-3G is a userspace (FUSE) driver, and those usually have worse performance than kernel drivers.


You'll be able to install Linux on NTFS root now!


I wish Linux on NTFS could share metadata with WSL1. That way you could boot it off the same WSL1 files you could access in Windows.


Nice idea, and with Wine we could have all the beautiful Windows AV-Software :-)


I don't think it's completely fair.

Don't look a gift horse in the mouth, especially when you need one.


On one hand yes, on the other hand if you want your patch to be included it's your job to make it reviewable and to make it pass all the various checks.

After all, we all lived long enough without their software, we can live without it a bit longer.

Yes it's a gift, but it's also code that has to be maintained and updated along with the kernel, so it's not really 100% a gift. If you then consider that they might keep on selling a proprietary version of the code (which - don't get me wrong - is 100% legit and fair) they might also get basically free labour: they could rebase onto the latest public gpl version, they might get notes of various issues and bugs...

Quite literally, it's free labour.


“Free as in puppy.”


It turns out someone (could be you?) wrote a Medium article about open-source software using the "free as in puppy" metaphor, although I don't think they really used it in the same way you are here, with regards to a corporation using the open-source community to functionally receive free labor. I'm definitely going to add it to my mental list...

[1] https://medium.com/swlh/free-as-in-puppy-5b7eb1bf3908


Wasn’t me. Gifts bearing obligations come in a lot of shapes and sizes, I just always found the puppy metaphor very compelling and so I like to use it.

(As a mental exercise sometime, go to a pet store and figure out how much a “$12 hamster” costs once you get everything you need to set up and maintain a habitat.)


This is a wonderful description of "free" to go along with "free as in beer" and "free as in speech", thanks!


It's a beautiful metaphor.

Puppies can bring a lot of joy, but they certainly bring obligations.

Really spot on.


I've not heard this one before. Like a White Elephant but useful.

https://en.m.wikipedia.org/wiki/White_elephant


This is such a perfect addition to free as in beer/speech. Really quite surprised I haven't seen this more often, going to be using it haha


> if you want your patch to be included it's your job to make it reviewable

I have a mid-junior level co-worker who submits PRs that are excessively large. As best I've been able to determine, he's not especially good at managing the dependencies in his code, and he doesn't want to submit a broken PR, so his default is to wait until he gets everything written instead of breaking it down into smaller pieces.


I think their (the kernel maintainers’) position is that a single patch of 27000 lines of diffs is a bit of a nightmare to do code review on. I’m not sure if you took a look at the patch file (available at https://dl.paragon-software.com/ntfs3/ntfs3.patch ).

I think their point is ‘man, how are we going to divide this up amongst the maintainers? Who gets to check which function or call?’

Paragon's response (more or less: will fix in v2): https://lore.kernel.org/linux-fsdevel/a8fa5b2b31b349f2858306...


Using language like "monstrosity" though, doesn't help.


The word has several meanings, and I took it to mean "frighteningly large" rather than any of the more negative ones. Perhaps not the best choice, but not overtly rude.


"Monstrosity" seems like a fair word, and to the point as well. The follow-up messages showed good will on the part of the kernel maintainers.


I guess that's not you who has to do the review.


Heh. I've certainly been handed stuff I didn't like. It's always been easier to work through if I avoided anything that came off as emotional or insulting.


Sorry, dad.

We’ll all try really hard to ignore our individual biology to satisfy the most sensitive sensibilities.

Cause that expectation is not manipulative at all.


Could you please stop posting flamebait and unsubstantive comments to HN? Also, could you please stop creating accounts for every few comments you post? We ban accounts that do that. This is in the site guidelines: https://news.ycombinator.com/newsguidelines.html.

You needn't use your real name, of course, but for HN to be a community, users need some identity for other users to relate to. Otherwise we may as well have no usernames and no community, and that would be a different kind of forum. https://hn.algolia.com/?sort=byDate&dateRange=all&type=comme...


If it's working code (assume it is) then requiring it to be turd-polished for the purposes of code review is absurd. Whoever is asking for that is either disingenuous or has no real experience as a software developer.


Taking up patches to code is also taking up vulnerabilities and bugs they carry. Even assuming good intentions, a gift of code is a gift with a burden attached, and should pass deep scrutiny before getting accepted.

Sad as it may sound, free code doesn't come for free.


This is like giving someone a litter of puppies as a gift and then walking away.

Hooray, free puppies. Now you just have to care for them for the next 10 years.


Paragon stated their intent to maintain and support the code in their initial email, and added themselves to the maintainer files.

There's no drama happening here. The Paragon guys are trying to give it "properly", the Linux guys want it "properly", and the only thing happening is defining "proper" in this context.


Intent, promises and reality aren’t always aligned. Especially when handed a slapdash dump of tens of thousands of lines of code that didn’t meet kernel contribution guidelines.


That's true... but nothing has happened yet. No one has yet failed to hold up their end. All parties seem to want this to work -- it certainly isn't a dead drop.


Someone else mentioned their commitment.

Additionally, the driver is mature, as is the FS.


How is that a good comparison? When you're given puppies you are being made responsible for living creatures; if you're given a bunch of code you can literally just drop it.


> "Don't look a gift horse in the mouth"

Interesting you chose that metaphor. I can think of a particular gift horse where you would have seen Greek soldiers if you had looked into its mouth. :)


Working on someone else's code is no fun. I do it, when I get paid properly, but can't say I enjoy it. So I understand that not everyone is enthusiastic about that gift.

And who needs it? There are alternatives to read or write the occasional file (e.g. if you happen to have forgotten the Admin's password) on an NT box, but I can't say that I missed the ability to create new files on NTFS. And why now? Surely those who actually needed that functionality needed it years ago and meanwhile found some other solution. Perhaps those who actually need it will still volunteer to ready that driver for inclusion into the kernel, or sponsor someone who can?


NTFS-3g can create files and directories etc. on NTFS (albeit it doesn't do journaling, which is why it requires you to have a clean journal). Having a proper kernel driver is mostly about performance, not features. The current built-in kernel driver for NTFS is completely read-only though.


> The current built-in kernel driver for NTFS is completely read-only though.

Not quite:

> CONFIG_NTFS_RW:

> This enables the partial, but safe, write support in the NTFS driver.

> The only supported operation is overwriting existing files, without changing the file length. No file or directory creation, deletion or renaming is possible. Note only non-resident files can be written to, so you may find that some very small files (<500 bytes or so) cannot be written to.


> especially when you need one.

It's very debatable that Linux "needs" more support for a proprietary 25-year-old filesystem that is, in many ways, obsolete.


I wonder if the reason for the huge line count is overly verbose code, or if it's just the inherent complexity of NTFS. For contrast, I wrote a FAT32 filesystem driver (read/write) for an embedded system a long time ago, and it was less than 1K lines --- of Asm.


NTFS is a way more complicated (and feature-rich) filesystem than the (rather barebones) FAT32.


It's just a single data point, but I tried Paragon's ext4 driver (the paid version) for Mac many years ago. It seemed to work, but when the drive was connected back to Linux, there were all kinds of fsck errors. Immediately deleted it.


This has also been my exact experience. I used Paragon's NTFS driver (paid) for Mac for my external SSD. After using it, when plugging the drive into Windows it would always find "errors" and recommend to Scan and Fix them.


Fwiw, I had the same problem with Paragon but have had a lot more success with Tuxera’s driver, if you’re still looking for a solution.


27k lines isn't that crazy. I've merged larger patches, it just takes a while. This is a very negative view of an open source contribution.


> I've merged larger patches,

In what kind of context? The Linux kernel?

Filesystem code is pretty tricky to begin with, and prone to very subtle bugs with very not-subtle consequences. And this isn't greenfield development of a new filesystem, but an implementation that needs to remain highly compatible with Microsoft's version. This FS driver has to be maintained to track changes to two operating systems. So this 27kloc can reasonably be expected to encompass a lot more complexity than your average 27kloc, and it requires a lot more review effort than something like 27kloc of GPU driver register definitions.


> In what kind of context? The Linux kernel?

Not the Linux kernel, but a large embedded system.

> Filesystem code is pretty tricky to begin with, and prone to very subtle bugs with very not-subtle consequences.

This code has been running in the wild for quite a while now, it has had a trial by fire. And there's no way around testing, subtle ext4 bugs still crop up despite the maturity of the filesystem.

> And this isn't greenfield development of a new filesystem, but an implementation that needs to remain highly compatible with Microsoft's version.

No amount of code review will stop Microsoft from adapting their version. Also, I doubt Microsoft themselves will change too much about the filesystem given the compatibility they themselves have to maintain with cold storage NTFS drives.

> This FS driver has to be maintained to track changes to two operating systems.

You make it sound as if Microsoft have a hand in any of this. Also, have you seen the state of the current NTFS driver? It's a bit flakey (no disrespect to the maintainers).


> No amount of code review will stop Microsoft from adapting their version.

Way to miss the point. Code review for the kernel isn't just about verifying that the code currently works. It's also about making sure the code is maintainable. Microsoft is relevant here because their actions will increase the maintenance burden of any Linux NTFS driver. Kernel developers rightly need to be concerned about how difficult it will be to extend the NTFS driver to handle new NTFS features that Microsoft introduces.


> Way to miss the point.

The point wasn't so clear, but I see what you're saying now. Maintainability is normal code review though.

> It's also about making sure the code is maintainable. [..] Kernel developers rightly need to be concerned about how difficult it will be to extend the NTFS driver to handle new NTFS features that Microsoft introduces.

Maintainability is one thing, extensibility is another. Preparing your code to implement some changes completely outside of your control seems like a waste of time and something that might bite you later on.


27k is 27 kB.

This is 27 kLOC.


I solemnly swear never to complain again about the size of PRs given to me for review.

It's going to take months to thoroughly assess the quality of this.


Ouch. That looks really useful, but at the same time it's possibly their first kernel submission?

Hope I’m wrong, but I think they’ll be fighting an initial bad impression for a while despite good intentions


I hope it's better than their ext2 drivers for macOS. I have had nothing but problems and data loss and their support is the worst.


Same on Windows regarding data loss. Terrible and I’ll never use their ext2 drivers again.


> "It looks as though that with NTFS being surpassed by other more advanced file-systems,

Is this remotely true?

Will my grandmother not be using NTFS in 10 years?


Microsoft has been trying to come up with a replacement for NTFS for a very long time. They've had mixed success with trying to extend NTFS with more advanced features like the now-deprecated TxF. There's little doubt that even its creators see NTFS as something of a dead-end. Whether your grandmother is still using it in 10 years depends primarily on whether Microsoft can get its act together to pick and ship a replacement.


> Whether your grandmother is still using it in 10 years depends primarily on whether Microsoft can get its act together to pick and ship a replacement.

So yes she will I guess.

Then it's vital Linux gets NTFS working well.

My partner is not going to use Linux things if every time they try and transfer the 8k 3D holographic photos of our CRISPR'ed dog learning to spell to my grandmother it doesn't work on her Holovision.

True story, last month I just lost about 1 in 10 of my media files on my Linux Share to NTFS issues. So now I run the box on Windows.


> True story, last month I just lost about 1 in 10 of my media files on my Linux Share to NTFS issues. So now I run the box on Windows.

Were you seriously running a Linux-based NAS with NTFS as the underlying filesystem, or did you mean something very different? I can't imagine why anyone would ever think their choice of disk filesystem on a server—hiding behind a network filesystem—should be influenced by what disk filesystems are supported by client devices.


Given my original question about my grandmother has been downvoted it says it all about the judginess of Linux users ;)

My setup was -

Windows and Ubuntu dual boot gaming PC with a NTFS 4G external hard disk, Windows network share. Default boot was to Ubuntu.

Torrents running off wireless laptop to the network share.

Amazon Fire stick with Kodi wireless running to network share.


At 200 LoC/hour, 27k lines is roughly 135 hours of review, i.e. 4-5 weeks of work for a single person


Does it need to be included in the Linux kernel? I understand that NTFS is 27 years old and exists mostly to support Microsoft file systems. Can't the driver be an external download?


In-tree drivers are the standard for Linux kernel modules.


As long as it builds as a module, uses only the standard filesystem apis and doesn't have code changes outside that, code review is far less important IMO...


The linux kernel is successful partly because by and large what's in the kernel works and doesn't generally break on new revisions.

Accepting a new driver which may in fact not work, breaks in ambiguous ways, interacts with other components poorly, or otherwise generates headaches for the kernel could be more trouble than it's worth. Ultimately this driver will eventually need to be modified by others, and ensuring it doesn't get a reputation as a nightmare right out of the gate is also worthwhile.


At the risk of stating the obvious, this is an exceptionally short-sighted approach. Committing poorly understood code is not that different from using binary blobs.


So you’d willingly allow tens of thousands of lines of code which you don’t understand or really know what it does into a kernel which is used by literally billions of devices?


> uses only the standard filesystem apis and doesn't have code changes outside that

That's literally why having to review 27K lines of code is a pain. You're not seriously going to trust, sight unseen, that there's nothing in the code that slips in a backdoor or catastrophic data-loss bug, are you?


I wish you weren't getting downvoted, but what you said is absolutely correct. If it's a module and separate from core, who cares. If it's modifying core kernel or mixed up with kernel space code, then nope.


Linux isn't a microkernel; a filesystem accepted into the kernel source tree will run as kernel space code whether or not it's compiled as a loadable module. So all 27k lines need to be audited for security purposes and to ensure they're interfacing with the rest of the kernel only in the approved ways, because there aren't a lot of technological barriers to the filesystem misbehaving.

But more important than that is the maintenance burden. NTFS will be around for a long time, but it's also a moving target because Microsoft hasn't replaced it yet. Kernel developers have to keep in mind how this code will look in a few decades, after all the original developers are retired. If it's written in a very different style from other Linux filesystems, will there be anyone left who both knows enough about the workings of the Linux IO stack 20 years hence, and understands Paragon's code conventions?



