I understand that reviewing 27k LoC is daunting and probably not very fun.
But unlike most patches that draw a similar response, it's not a narrowly useful patch that mostly serves the submitter. Proper NTFS support benefits a large proportion of Linux users (the jab from the article that there are more advanced file systems out there seems out of place; there are no signs that Windows is about to switch its default FS to something else).
Additionally, this code has been used in production for years now (e.g. my 2015 router runs the closed-source version of this driver in order to support NTFS-formatted external drives), so most likely a lot of quality issues have already been found and addressed.
So I feel it's a bit unreasonable to respond with so much negativity to this contribution.
It's also no big deal from either side. Paragon sent in the patch and it's appreciated. There are a few problems to resolve before it can go in. Reviewers noted the issues and what would need to be done to get this through, and that process is happening.
"Split your diff!" and "Fix your makefile!" are two of the most benign and common pieces of review feedback I've seen. I feel that you could make a media story about any submission to the Linux kernel based on there being comments in the review process.
I read the discussion. There's no drama at all. Paragon did an unreviewable code dump with intent to maintain, and they are warmly welcome in general. David laid out the path to review and probable acceptance: https://lore.kernel.org/linux-fsdevel/20200815190642.GZ2026@... If there was any fuss, it was because of the unreviewable nature of the patch, but even by kernel standards the discussion was cordial. In fact, aside from Nikolay's outburst, it was cordial by any standard. (Someone should've gently told him this is no way to welcome newcomers, especially newcomers carrying such a gift.)
Others noted it needs to pass the existing test suite and that it is close.
It seems from the link that the kernel developers would rather have one patch per new file plus a patch that does the integration, instead of one big patch with everything. That's a bit unconventional, perhaps because they tend to use git alone instead of higher-level software that would help them break down a big single commit; but whatever they're doing clearly works for them.
The entire dispute seems to be that minor question of style, nothing substantive. I don't think anyone's especially unhappy on either side. The controversy seems manufactured, perhaps by a reporter who noticed the gruff language but lacked the technical knowledge to understand what's actually going on.
Most people developing free software (probably including both the submitters and recipients of this patch) could make a lot more money elsewhere, but have chosen instead to do work with considerable public benefit. That's thankless enough already without some reporter inventing drama for clicks.
> perhaps because they tend to use git alone instead of higher-level software that would help them break down a big single commit
git is capable of breaking down a large diff into manageable pieces (e.g., limiting a diff to a single file), but reviewing code on a mailing list means replying to the message that contains a patch and quoting the relevant parts inline to comment on them.
As for higher-level software that could break down a large commit, what specifically do you have in mind? I can't think of any feature in other review tools (GitHub/GitLab, Gerrit, Review Board, Phabricator, etc.) that would make something like this easy to review.
I meant like GitHub and competitors, which let you attach comments to specific lines and files and such, and perhaps follow references into the full code faster than you could by flipping between your mail client and your editor (and save you the effort of applying the patch to a local tree for that review). Since the kernel developers prefer to discuss on a plain mailing list and not use such tools, it makes sense that they prefer smaller chunks.
27 kLOC will be a big project to review no matter what, but I'd probably rather take them in a single commit--the files presumably depend on each other, and there's probably no order in which the files could be reviewed in isolation without reference to files not yet reviewed. (Obviously we try for hierarchical structure that would make that possible, but not usually with perfect success.)
That's a matter of personal preference, though, and people who want a project to merge their contributions should adhere to the maintainer's preferences. In any case, it seems Paragon intends to do exactly that. I doubt Paragon expected their reward for their contribution would be an article read by thousands of people that called it "half-baked" over this minor point, and I can't imagine such publicity encourages others to make similar contributions in future.
> I meant like GitHub and competitors, which let you attach comments to specific lines and files and such
GitHub does allow you to filter the diff down to the commit or jump to a particular file within the diff. Commenting on a line in the diff isn't really any different from positioning one's comment inline below the relevant line(s) of code in an email reply. I know that in GitHub, it's also possible to comment directly on a commit (though those comments are not displayed with any context in the general PR view), unlike an email reply to a particular patch series.
Depending on one's email client, it's certainly possible to search for things like /^diff/ or /^@@/ to jump from file to file or hunk to hunk within the compose window.
> perhaps follow references into the full code faster than you could flipping between your mail client and your editor (and save you the effort of applying the patch to a local tree for that review).
For some, the email client doubles as an editor (e.g., Gnus). And, at least in my experience, it's far faster to navigate code in an editor compared to the web interface that GitHub/GitLab provides.
> 27 kLOC will be a big project to review no matter what, but I'd probably rather take them in a single commit--the files presumably depend on each other
While that's true, the dependency can be preserved when merging the branch of the series of commits in the mainline repository. Plus, many may find it easier to review declarations, definitions, and calls in that order.
> Most people developing free software (probably including both the submitters and recipients of this patch) could make a lot more money elsewhere
I think most free software developers are normal corporate employees. I work on tons of free software as my job, like most of my peers, but that’s normal in the industry. I don’t consider myself a free software developer.
Fair--depending on what "most" is weighted by, I may have overstated, and a Google employee who happens to get assigned to work on Chrome stuff is certainly making no personal sacrifice.
I meant independent volunteers or people working for free-software-focused companies (which I believe usually offer well below FAANG-level compensation, especially at the high end, though still enough to live quite well). Excluding hardware vendors porting Linux to their own products, I believe the core kernel developers tend to fall into that last category. I have no specific knowledge of their individual compensation, but the technical leads responsible for closed-source projects of similar scope make incredible amounts of money.
But it's not like splitting the feature into 100 patches of 250 lines each would make it any quicker to review. Or merging code that was known not to work, as it was only a fraction of what was needed for the functionality.
> But it's not like splitting the feature into 100 patches of 250 lines each would make it any quicker to review. Or merging code that was known not to work, as it was only a fraction of what was needed for the functionality.
That would also be rejected, because the kernel maintainers aren't idiots and their standards aren't the stupid arbitrary rules you construe them to be. They generally want big changes to be broken up into logical, sensible chunks that each leave the tree in a usable state, so that git-bisect still works.
How do people merge big new filesystems in practice though? Especially one with years of pre-existing out-of-tree development?
I guess one could start by merging a skeleton of the filesystem which supports mount/unmount but then returns an IO error on every operation? And then a patch to add directory traversal (you can view the files but not their contents), and then a patch to add file reading, and then a patch to add file writing, and then a patch to add mkdir/rmdir, and then a patch to add rename/delete of regular files.
Breaking down an existing filesystem into a sequence of patches like that, no doubt it is doable, but it is going to be a lot of work.
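To make the idea concrete, the very first patch could be little more than something like this (a rough sketch only: the "ntfs3_skel" name, the use of mount_nodev rather than a real block-device mount, and everything else here are my own assumptions, not anything from Paragon's actual series). It registers a filesystem whose root mounts fine but whose lookups just return -EIO:

    #include <linux/module.h>
    #include <linux/fs.h>

    /* Every operation fails until later patches add real support. */
    static struct dentry *skel_lookup(struct inode *dir, struct dentry *dentry,
                                      unsigned int flags)
    {
        return ERR_PTR(-EIO);
    }

    static const struct inode_operations skel_dir_iops = {
        .lookup = skel_lookup,
    };

    static int skel_fill_super(struct super_block *sb, void *data, int silent)
    {
        struct inode *root;

        sb->s_op = &simple_super_operations;

        root = new_inode(sb);
        if (!root)
            return -ENOMEM;
        root->i_ino = 1;
        root->i_mode = S_IFDIR | 0555;
        root->i_atime = root->i_mtime = root->i_ctime = current_time(root);
        root->i_op = &skel_dir_iops;
        root->i_fop = &simple_dir_operations;

        sb->s_root = d_make_root(root);
        return sb->s_root ? 0 : -ENOMEM;
    }

    static struct dentry *skel_mount(struct file_system_type *fs_type, int flags,
                                     const char *dev_name, void *data)
    {
        /* A real driver would use mount_bdev() and parse the NTFS volume here. */
        return mount_nodev(fs_type, flags, data, skel_fill_super);
    }

    static struct file_system_type skel_fs_type = {
        .owner   = THIS_MODULE,
        .name    = "ntfs3_skel",
        .mount   = skel_mount,
        .kill_sb = kill_anon_super,
    };

    static int __init skel_init(void)
    {
        return register_filesystem(&skel_fs_type);
    }

    static void __exit skel_exit(void)
    {
        unregister_filesystem(&skel_fs_type);
    }

    module_init(skel_init);
    module_exit(skel_exit);
    MODULE_LICENSE("GPL");

Subsequent patches would then replace the -EIO stubs with real directory traversal, reading, writing, and so on.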
My guess is that given the history of this filesystem implementation, most of the review effort will be focused on the interface between this FS and the rest of the kernel. It's typical for all the changes touching communal files or introducing generic helper functions or data structures to be broken out into separate commits. If any of those helpers are a reinvention of stuff that's already in the kernel, there will need to be a justification for why NTFS needs its own special versions. It's not typical for a large patch series adding genuinely new stuff to be broken up into absurdly tiny commits. For the stuff that's truly internal to the filesystem implementation, it looks like one patch per file will be an acceptable granularity.
Welcome to Linux kernel development. By lkml history, this tone is very mild. There are many examples of far worse commentary and personal attacks on devs. I’m not justifying this by the way. Linus can be a very smart jerk, and as a leader (THE leader) he sets the tone for what’s acceptable in the community.
There were two specific concerns in the initial review that I think were reasonable:
1) The Linux kernel already has an in-kernel read-only NTFS driver. What should be done about it? (There are a number of reasonable options here, including just getting rid of it and replacing it with Paragon's, but that requires at least some buy-in from the maintainers of the existing driver.)
2) The patch didn't actually build, which was a one-line Makefile fix, but raised some concern about how it was tested/how the patch was generated.
It's strange to me that there's no file system with basic features like journaling and support for files larger than a couple of GB that is supported across all major desktop OSes (macOS, Windows, Linux, and FreeBSD). All platforms support the same I/O standards like USB or DisplayPort, so why did filesystems never make the cut to become a cross-system standard?
Imagine if you could have a backup drive (with reasonable modern data protections) that you could just plug into different systems and save all your files to. Isn't it odd that such a simple thing isn't possible? I guess network attached storage has gotten pretty accessible at this point so there's no need for it?
I think the basic problem is that FAT is generally "good enough," and in the increasingly common case where it isn't exFAT is close to universal and addresses the only problem that consumers frequently run into (file size limit).
While FAT/exFAT leave the possibility of a variety of different types of filesystem inconsistency, these seem to be fairly rare in actual practice, probably in good part due to Windows disabling write caching on devices it thinks are portable. This kind of special handling of external devices is sort of distasteful, and leads to some real downsides on Windows (e.g. LDM handling USB devices weirdly), but using a newer file system doesn't really eliminate that problem - NTFS and Ext* external devices require special handling on mounting to avoid the problems that come from file permissions traveling from machine to machine, for example.
> in good part due to Windows disabling write caching on devices it thinks are portable. This kind of special handling of external devices is sort of distasteful
What's distasteful is that on my Linux machine with a lot of RAM, when I copy a multi-gigabyte file to a USB key it "completes" the copy almost immediately, when actually all it has done is copy the file to a RAM buffer. Then when I try to disconnect the drive, it will hang for ages while it actually finishes the write. IMO Windows does it better here (although I never realised what exactly they did; nice to know).
GUI file copy tools should be using O_DIRECT, or periodically calling fsync()/sync(). An argument could also be made that the kernel write cache should have a size limit, so that one-off write latency is still masked but very slow bulk I/O is not.
O_DIRECT seems like overkill, and the lack of write buffering could be a real detriment in some circumstances. Syncing at the end of each operation (from the user's perspective) should be the best mix of throughput and safety, but it makes it hard to do an accurate progress bar. Before the whole batch operation is finished, it may be useful to periodically use madvise or posix_fadvise to encourage the OS to flush the right data from the page cache—but I don't know if Linux really makes good use of those hints at the moment.
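Concretely, I'm picturing something like this (a minimal sketch; the chunk and flush sizes are arbitrary, and short writes plus most error handling are glossed over for brevity):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <unistd.h>

    #define CHUNK      (1 << 20)   /* 1 MiB per read/write          */
    #define SYNC_EVERY (64 << 20)  /* flush and advise every 64 MiB */

    /* Copy in_fd to out_fd, bounding how much dirty data can pile up. */
    static int copy_with_sync(int in_fd, int out_fd)
    {
        static char buf[CHUNK];
        off_t done = 0, last_sync = 0;
        ssize_t n;

        while ((n = read(in_fd, buf, sizeof buf)) > 0) {
            if (write(out_fd, buf, n) != n)   /* short writes ignored for brevity */
                return -1;
            done += n;
            if (done - last_sync >= SYNC_EVERY) {
                fdatasync(out_fd);            /* the data really is on the device now */
                posix_fadvise(out_fd, last_sync, done - last_sync,
                              POSIX_FADV_DONTNEED);  /* let the kernel drop those pages */
                last_sync = done;             /* a progress bar updated here is honest */
            }
        }
        fdatasync(out_fd);
        return n < 0 ? -1 : 0;
    }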
On really new kernels, it might work well to use io_uring to issue linked chains of read -> write -> fdatasync operations for everything the user wants to copy, and base the GUI's progress bar on the completion of those linked IO units. That will probably ensure the kernel has enough work enqueued to issue optimally large and aligned IOs to the underlying devices. (Also, any file management GUI really needs to be doing async IO to begin with, or at least on a separate thread. So adopting io_uring shouldn't be as big an issue as it would be for many other kinds of applications.)
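For one block of such a copy, that might look roughly like this with liburing (a sketch only: queue sizing, short reads, and the fact that a failed link cancels the rest of the chain are ignored here):

    #include <liburing.h>

    #define BLK (1 << 20)

    /* Queue one linked read -> write -> fdatasync chain for a single block.
     * The caller reaps the CQEs; the fdatasync completion is what the progress
     * bar should count, since only then is the block durable on the target. */
    static int queue_copy_block(struct io_uring *ring, int in_fd, int out_fd,
                                off_t off, void *buf)
    {
        struct io_uring_sqe *sqe;

        sqe = io_uring_get_sqe(ring);
        io_uring_prep_read(sqe, in_fd, buf, BLK, off);
        sqe->flags |= IOSQE_IO_LINK;   /* write won't start until the read completes */

        sqe = io_uring_get_sqe(ring);
        io_uring_prep_write(sqe, out_fd, buf, BLK, off);
        sqe->flags |= IOSQE_IO_LINK;   /* fdatasync won't start until the write completes */

        sqe = io_uring_get_sqe(ring);
        io_uring_prep_fsync(sqe, out_fd, IORING_FSYNC_DATASYNC);

        return io_uring_submit(ring);
    }

A real tool would keep several such chains in flight (after io_uring_queue_init with a deep enough queue) and advance the progress bar as the fdatasync completions come back.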
Not always. If you're reading from a SSD and writing to a slow USB 2.0 flash drive, you could end up enqueuing in one second a volume of writes that will take the USB drive tens of seconds to sync(), leading to a very unresponsive progress bar. You almost have to do a TCP-like ramp up of block sizes until you discover where the bottleneck is.
Which distro/desktop? My standard ubuntu 18.04 with gnome and mounted through the file manager doesn't do this and copying to a slow USB drive is as glacial as it should be, but copying between internal drives is instant and hidden.
Default gnome on Ubuntu 20.04. How much free ram do you normally have? If you don't have enough to buffer the whole operation, then it's not a problem.
Now that I think about it, this might actually explain some bugs I've seen when copying multiple files. Copying one file seems to work, but then when copying a second the progress sits at 0%; it's probably waiting for the first transfer to sync.
I don’t know if it’s just because I’m largely on Mac and Apple’s drivers suck, but I have run into lots of data corruption issues with exFAT on large drives (1TB+). More than enough for me to stop using it.
(Now I use NTFS and Tuxera’s commercial Mac driver, because I don’t know how else to have a cross-platform filesystem without a stupidly-low file-size limit.)
It isn't a 1st class fs in any OS so it lacks some polish tooling-wise etc., but it should work fine for basic file transfer jobs.
I'm using it on an external HDD for copying/watching video files between Linux and Windows boxes and haven't had any problems yet.
On the other hand, 7-8 years ago I tried to use a UDF partition for sharing a common Thunderbird profile between Windows and Linux, and got strange errors on the Windows side after a while. I didn't dig further, so it may have been a non-UDF OS or tooling issue, or it may have been solved in the meantime.
> why did filesystems never make the cut to become a cross-system standard?
Many reasons, the most important being patents and insistence by vendors to keep stuff proprietary (exfat, ntfs, HFS, ZFS).
Also, there are fundamental differences on the OS level:
- how users are handled: Unices generally use numerical UID/GID while Windows uses SIDs
- Windows has/had a ~260-char (MAX_PATH) maximum length for the total path of a file
- Unices have the octal mode set for permissions; old Windows had four bits (write-protected, archive, system, hidden) and that's it
- Windows, Linux and OS X have fundamentally different capabilities for ACLs - a cross platform filesystem needs to handle all of them and that in a sane way
- don't get me started on the mess of advanced fs features (sparse files, transparent compression, immutability of files, transparent encryption, checksums, snapshots, journals)
- Exotic stuff like OS X and NTFS additional metadata streams that don't even have a representation in Linux or most/all BSDs
And finally, embedded devices and bootloaders. Give me a can of beer and a couple weeks and I'll probably be able to hack together a FAT implementation from scratch. Everything else? No f...in' way. Stuff like journals is too advanced to handle in small embedded devices. The list goes on and on.
Filesystems have never been standardised. In mainframe/mini days manufacturers supplied a selection of OS options for the same hardware, and there was no expectation that the various filesystems would be compatible between different OSs.
Which is why we have abstraction layers like Samba (etc) on top of networked drives. They're descendants of vintage cross-OS utilities like PIP which provide a minimal interface that supports folder trees and basic file operations.
But a lot of OS-specific options remain OS-specific, and there's literally no way to design a globally compatible file system that implements them all.
This isn't to say a common standard is impossible, but defining its features would be a huge battle. And including next-gen options - from more effective security and permissions, to content database support, to some form of literally global file ID system, to smart versioning - would be even more of a challenge.
> Windows has/had a ~260-char (MAX_PATH) maximum length for the total path of a file
The actual path of a file on Windows can be practically unlimited, but exceeding the limit either requires a special prefix notation (which also covers network paths) or relative paths. Recent versions of Windows include a setting that removes the limitation in the APIs, added because of development pain points like Node's deeply nested packages.
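If I follow, the notation in question is the "\\?\" extended-length prefix understood by the wide-character APIs ("\\?\UNC\server\share" being the network form), which skips the MAX_PATH check. A hedged sketch, with a made-up path:

    #include <windows.h>
    #include <stdio.h>

    int main(void)
    {
        /* The "\\?\" prefix tells the Unicode APIs to skip MAX_PATH (260-char)
         * normalization, allowing paths up to roughly 32,767 characters.
         * The path below is illustrative only. */
        const wchar_t *path =
            L"\\\\?\\C:\\projects\\app\\node_modules\\a\\node_modules\\b\\index.js";

        HANDLE h = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                               OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
        if (h == INVALID_HANDLE_VALUE) {
            fprintf(stderr, "CreateFileW failed: %lu\n", GetLastError());
            return 1;
        }
        CloseHandle(h);
        return 0;
    }

The newer opt-in you mention is, as far as I know, the LongPathsEnabled registry/Group Policy setting plus a longPathAware entry in the application manifest, which lifts the limit for ordinary Win32 paths as well.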
UDF looks a lot like that file system on paper, but my understanding is that it practically falls apart because there are too many bugs and too much variation in which features are really supported among the various implementations. Everyone agrees on the DVD-ROM subset, but beyond that it seems like a crapshoot.
Using a non-FAT filesystem for portable storage opens up lots of issues regarding broken permissions. Try to create a file on an ext4-formatted USB flash drive using your current user; rest assured that, unless you remembered to set its permissions to 777, on another computer you'll have to chmod it because it still belongs to the (possibly dangling) creator's UID and GID. If you don't have root access, and unless your user coincidentally has the same UID as on your other machine, you're screwed and have to go back grumpily to a system where you have administrative privileges.
Same thing applies to NTFS, but I've seen that more often than not Windows creates files on removable drives with extremely open permissions, and in my experience NTFS-3G just straight ignores NTFS ACLs on the drives it mounts, so more often than not it JustWorks™ in common use cases.
I think a journaled exFAT-like filesystem would be perfect for this task, but given how hard it was for exFAT to even start to displace FAT32, even if it actually existed I wouldn't expect it to succeed any time soon.
Not standardising means being able to distinguish yourself easier, it's just that for anything that integrates with third-party stuff, standardisation is way cheaper.
Apple doesn't make monitors or flash drives, so making their own specs for those wouldn't be beneficial to them and would increase prices of compatible products.
But a filesystem is something they do make and being able to do whatever they want with it is quite beneficial, with no downsides as there are no third parties (that they care about) integrating with it that would have to put in extra work to support it (besides their direct competitors, which is a nice bonus).
In all three cases you list, it's a third-party module providing support, rather than it being a standard feature of the OS that can be expected to be generally available on more than 1% of the install base.
ZFS development moves so fast that it is common for my (FreeBSD-based) FreeNAS box to warn me when I upgrade my OS that certain actions will make it incompatible with the prior version of FreeNAS.
That is fine and appropriate for a drive that will be connected to the system for the foreseeable future.
That kind of compatibility concern makes me squeamish about using ZFS for a drive that I want to share between different systems. If it's easy to make it incompatible between two releases for the same system, that smells like a waiting nightmare trying to keep it compatible between Linux, FreeBSD and Windows.
Yeah, it should stabilize in the next couple of months with the release of OpenZFS 2.0, as that release is supposed to signify the unification of ZFS on Linux and FreeBSD. ZFS on FreeBSD is being rebased onto ZFS on Linux. There's also been some talk of adding macOS ZFS support to OpenZFS, but that's still up in the air.
Agreed! I’ve been running FreeBSD on various computers for very close to a decade now, and still run it on my mail server. But one problem I faced a couple of years ago, when I sold the old laptop I’d been running FreeBSD on, was that my other computer at home at the time was running Linux, while the external HDD I’d been using with the laptop was encrypted with GELI.
Since I didn’t have money for any more hard drives at the time, I couldn’t transfer the data to anything else. So then when I wanted to access that data I’d do so via a FreeBSD VM running in VirtualBox. The performance was... not great.
I took the data that I needed the most, and for the rest of the data I let it sit at rest.
This week I wanted to use the drive again, and in the end because I was doing general cleanup, I decided to install FreeBSD on my desktop temporarily.
I actually love FreeBSD but the reason that I prefer to have my desktop running Linux is in big part because I want software on the computer to be able to take advantage of CUDA with the GTX 1060 6GB graphics card that I have in it, and unfortunately only the Linux driver by Nvidia has CUDA, the FreeBSD driver by Nvidia does not.
I was actually looking at installing VMWare vSphere on the computer instead, so that I could easily jump between running Linux and running FreeBSD with what I understand will probably be good performance compared to VirtualBox at least. But the NIC in my machine is not supported and vSphere would not install. I found some old drivers, messed around with VMWare tooling which required PowerShell, and which turned out not to work with the open source version of PowerShell on any other operating system than Windows. So then I downloaded a VM image of Win 10 from Microsoft [0], and used that to try and make a vSphere installer with drivers for my NIC. No luck at first attempt unfortunately. A decade ago I probably would have kept trying to make that work, but at this point in my life I said ok fine fuck it. I ordered an Intel I350 NIC online second-hand for about $40 shipping included, and the guy I bought it from sent it the next day. It is expected to arrive tomorrow. Meanwhile, I installed FreeBSD on the desktop. When the NIC arrives I will do some benchmarking of vSphere to decide whether to use vSphere on the desktop or to stick to either FreeBSD for a while on that machine or to put it back to just Linux again.
Anyways, that’s a whole lot more about my life and the stuff that I spend my spare time on than anyone would probably care to know :p but the point that I was getting to is that, with OpenZFS 2.0 I will be able to use ZFS native encryption instead of GELI and I will be able to read and write to said HDD from both FreeBSD and Linux.
I still need to scrape together money for another drive first before I can switch from GELI + ZFS to ZFS with native encryption though XD
Oh, and one more thing, with the external drive I was having a lot of instability with the USB 3.0 connection on FreeBSD, leading to a bit of pain with transferring data because the drive would disconnect now and then and I’d have to start over. But yesterday I decided to shuck the drive – that is, to remove the enclosure and to connect the drive with SATA like you would any other regular internal drive. It worked out excellently, the WD Essentials enclosure was easier to pry open than I had feared, and a video on YouTube showed me how to do it [1]. As prying tools I used a couple of plastic rulers. As a bonus, it also looks like I/O performance is better with the direct SATA connection than what I was getting with the USB 3.0 connection.
Speaking of that, some people have reported finding that the drives in their WD Essentials external drives were WD Red HDDs. I didn’t have the same luck with mine; mine was WD Blue. But idk if WD Red is even common at the capacity that mine has anyway. Mine is “only” 5TB, and I think the people who have been talking about finding WD Red drives in theirs have often bought 8TB models. Idk. The main thing for me anyways is just to have my data and someplace to store it ^^
The ZFS driver is still early in development and quite unstable! I’ve used it in read-only mode just so I could have some access to my ZFS pool while booted into Windows, and although it kind of worked in that use case it would still do weird things like randomly refuse to open certain files.
One thing that occasionally causes data interoperability problems for me is forgetting that Windows can't have colons in filenames. Not really sure what a filesystem driver could do about that.
You don't need to rebuild a disk because some clusters were orphaned. It doesn't mean the disk was corrupt; it just means there is some allocated space that isn't used. It's trivial to fix, too, and there is no danger there. Actually, what journaling does is roll back that allocated space; doing it manually doesn't make it less safe. For backup scenarios exFAT is perfectly feasible.
What is feasible and what is desirable are two different things.
With journalling, I don't have to know or care about any of this: I restart and chances are it's all back to normal. This is desirable.
You can store your backups on stone tablets, with a machine that carves rock to write 1s and 0s and a conveyor belt that feeds new tablets. That is perfectly feasible. It is also not desirable.
Yes, there are still dreams about bringing your home directory on a stick anywhere with you and just plugging it into the nearest public toilet when you need it.
Ironically, a fully encrypted Linux installation with Btrfs or ZFS on a removable drive is quite easy to make and it works really well. I've one I made on an old SSD I had lying around and a 2.5" USB-C caddy and it's wonderful, you can have your work environment everywhere you want and even plug it into a running machine and boot it up using Hyper-V or KVM.
This is such an odd article. Perhaps Paragon isn't being entirely altruistic with this move, but TR is being quite scathing of someone's work submitting a kernel driver with no direct financial reward - and offers no real praise for, hopefully, fixing one of the biggest out-of-the-box gripes that Windows/Linux desktop dual-boot environments have. It's no wonder people often complain about the hardships of writing and maintaining open-source software!
Yeah, The Register can often be witty and irreverent, and I wouldn't complain about that, but that kind of wit only works if there's an actual point behind it. Jumping on the apparently popular bandwagon of squeezing out "drama" from the Linux kernel mailing list isn't adding anything of value.
It also seems a bit ignorant to call the submission "half-baked" just because it needed to be worked on and because someone pointed that out (also) in a slightly irreverent way.
All of those points you bring up about the feedback they got seem like just business as usual for a larger merge of code to a carefully developed FOSS project.
I think the issue here is the context... Paragon has had a rocky relationship with the Linux community in the past, for example their whiny reaction to exFAT being proposed for mainline a few months ago, so I think in the eyes of The Register there is a certain amount of inherent mistrust in Paragon proposing their NTFS driver for the kernel---shortly after going on the offensive against another formerly-commercial FS going mainline.
That said, I think the tension is more imagined than real, as the LKML doesn't really seem to have responded any more negatively than they do to most other patches.
There's no big issue, and there hasn't been one for many years. Ext2/3/4fs read/write support has existed for a long time for Windows, and FUSE/Dokan has a working NTFS driver with r/w support as well (also for a long time). It just doesn't work out of the box (though it does on some distributions).
It's going to take a while until this driver is mainline in the Linux kernel, and until that kernel is included in distributions (especially LTS ones).
I haven't read The Register article, but in the past it has come to my attention they dramatize their articles, and I don't want to read such media.
Mostly better performance / less CPU usage. NTFS-3G is typically CPU-limited even for sequential access, which isn't exactly a great spot for a FS driver to be in.
On one hand yes, on the other hand if you want your patch to be included it's your job to make it reviewable and to make it pass all the various checks.
After all, we all lived long enough without their software, we can live without it a bit longer.
Yes it's a gift, but it's also code that has to be maintained and updated along with the kernel, so it's not really 100% a gift. If you then consider that they might keep on selling a proprietary version of the code (which - don't get me wrong - is 100% legit and fair), they might also get basically free labour: they could rebase onto the latest public GPL version, and they might get notes of various issues and bugs...
It turns out someone (could be you?) wrote a Medium article about open-source software using the "free as in puppy" metaphor, although I don't think they really used it in the same way you are here with regards to a corporation using the open-source community to functionally receive free labor. I'm definitely going to add it to my mental list...
Wasn’t me. Gifts bearing obligations come in a lot of shapes and sizes, I just always found the puppy metaphor very compelling and so I like to use it.
(As a mental exercise sometime, go to a pet store and figure out how much a “$12 hamster” costs once you get everything you need to set up and maintain a habitat.)
> if you want your patch to be included it's your job to make it reviewable
I have a mid-junior level co-worker who submits PRs that are excessively large. As best I've been able to determine, he's not especially good at managing the dependencies in his code, and he doesn't want to submit a broken PR, so his default is to wait until he gets everything written instead of breaking it down into smaller pieces.
I think their (the kernel maintainers’) position is that a single patch of 27000 lines of diffs is a bit of a nightmare to do code review on. I’m not sure if you took a look at the patch file (available at https://dl.paragon-software.com/ntfs3/ntfs3.patch ).
I think their point is ‘man, how are we going to divide this up amongst the maintainers? Who gets to check which function or call?’
The word has several meanings, and I took it to mean "frighteningly large" rather than any of the more negative ones. Perhaps not the best choice, but not overtly rude.
Heh. I've certainly been handed stuff I didn't like. It's always been easier to work through if I avoided anything that came off as emotional or insulting.
Could you please stop posting flamebait and unsubstantive comments to HN? Also, could you please stop creating accounts for every few comments you post? We ban accounts that do that. This is in the site guidelines: https://news.ycombinator.com/newsguidelines.html.
You needn't use your real name, of course, but for HN to be a community, users need some identity for other users to relate to. Otherwise we may as well have no usernames and no community, and that would be a different kind of forum. https://hn.algolia.com/?sort=byDate&dateRange=all&type=comme...
If it's working code (assume it is) then requiring it to be turd-polished for the purposes of code review is absurd. Whoever is asking for that is either disingenuous or has no real experience as a software developer.
Taking up patches to code is also taking up vulnerabilities and bugs they carry. Even assuming good intentions, a gift of code is a gift with a burden attached, and should pass deep scrutiny before getting accepted.
Sad as it may sound, free code doesn't come for free.
Paragon stated their intent to maintain and support the code in their initial email, and added themselves to the MAINTAINERS file.
There's no drama happening here. The Paragon guys are trying to give it "properly", and the Linux guys want it "properly", and the only thing happening is defining "proper" in this context.
Intent, promises and reality aren’t always aligned. Especially when given a slapdash drop of tens of thousands of lines of code that didn’t meet kernel contribution guidelines.
That's true... but nothing has happened yet. No one has yet failed to hold up their end. All parties seem to want this to work -- it certainly isn't a dead drop.
How is that a good comparison? When you're given a puppy, you're being made responsible for a living creature; if you're given a bunch of code, you can literally just drop it.
Interesting that you choose that metaphor. I can think of a particular gift horse where you would have seen Greek soldiers if you had looked into its mouth. :)
Working on someone else's code is no fun. I do it, when I get paid properly, but can't say I enjoy it. So I understand that not everyone is enthusiastic about that gift.
And who needs it? There are alternatives for reading or writing the occasional file on an NT box (e.g. if you happen to have forgotten the Admin's password), but I can't say that I missed the ability to create new files on NTFS. And why now? Surely those who actually needed that functionality needed it years ago and have meanwhile found some other solution. Perhaps those who actually need it will volunteer to ready that driver for inclusion into the kernel, or sponsor someone who can?
NTFS-3G can create files and directories etc. on NTFS (though it doesn't do journaling, which is why it requires a clean journal). Having a proper kernel driver is mostly about performance, not features. The current built-in kernel driver for NTFS is completely read-only though.
> The current built-in kernel driver for NTFS is completely read-only though.
Not quite:
> CONFIG_NTFS_RW:
> This enables the partial, but safe, write support in the NTFS driver.
> The only supported operation is overwriting existing files, without changing the file length. No file or directory creation, deletion or renaming is possible. Note that only non-resident files can be written to, so you may find that some very small files (<500 bytes or so) cannot be written to.
I wonder if the reason for the huge line count is overly verbose code, or if it's just the inherent complexity of NTFS. For contrast, I wrote a FAT32 filesystem driver (read/write) for an embedded system a long time ago, and it was less than 1K lines --- of Asm.
It's just a single data point, but I tried Paragon's ext4 driver (the paid version) for Mac many years ago. It seemed to work, but when the drive was connected back to Linux, there were all kinds of fsck errors. Immediately deleted it.
This has also been my exact experience. I used Paragon's NTFS driver (paid) for Mac with my external SSD. After using it, plugging the drive into Windows would always find "errors" and recommend to Scan and Fix them.
Filesystem code is pretty tricky to begin with, and prone to very subtle bugs with very not-subtle consequences. And this isn't greenfield development of a new filesystem, but an implementation that needs to remain highly compatible with Microsoft's version. This FS driver has to be maintained to track changes to two operating systems. So this 27kloc can reasonably be expected to encompass a lot more complexity than your average 27kloc, and it requires a lot more review effort than something like 27kloc of GPU driver register definitions.
Not the Linux kernel, but a large embedded system.
> Filesystem code is pretty tricky to begin with, and prone
> to very subtle bugs with very not-subtle consequences.
This code has been running in the wild for quite a while now, it has had a trial by fire. And there's no way around testing, subtle ext4 bugs still crop up despite the maturity of the filesystem.
> And this isn't greenfield development of a new filesystem,
> but an implementation that needs to remain highly
> compatible with Microsoft's version.
No amount of code review will stop Microsoft from adapting their version. Also, I doubt Microsoft themselves will change too much about the filesystem given the compatibility they themselves have to maintain with cold storage NTFS drives.
> This FS driver has to be maintained to track changes to
> two operating systems.
You make it sound as if Microsoft have a hand in any of this. Also, have you seen the state of the current NTFS driver? It's a bit flakey (no disrespect to the maintainers).
> No amount of code review will stop Microsoft from adapting their version.
Way to miss the point. Code review for the kernel isn't just about verifying that the code currently works. It's also about making sure the code is maintainable. Microsoft is relevant here because their actions will increase the maintenance burden of any Linux NTFS driver. Kernel developers rightly need to be concerned about how difficult it will be to extend the NTFS driver to handle new NTFS features that Microsoft introduces.
The point wasn't so clear, but I see what you're saying now. Maintainability is part of normal code review, though.
> It's also about making sure the code is maintainable. [..]
> Kernel developers rightly need to be concerned about how
> difficult it will be to extend the NTFS driver to handle new
> NTFS features that Microsoft introduces.
Maintainability is one thing, extensibility is another. Preparing your code to implement some changes completely outside of your control seems like a waste of time and something that might bite you later on.
Microsoft has been trying to come up with a replacement for NTFS for a very long time. They've had mixed success with trying to extend NTFS with more advanced features like the now-deprecated TxF. There's little doubt that even its creators see NTFS as something of a dead-end. Whether your grandmother is still using it in 10 years depends primarily on whether Microsoft can get its act together to pick and ship a replacement.
> Whether your grandmother is still using it in 10 years depends primarily on whether Microsoft can get its act together to pick and ship a replacement.
So yes she will I guess.
Then it's vital Linux gets NTFS working well.
My partner is not going to use Linux things if every time they try and transfer the 8k 3D holographic photos of our CRISPR'ed dog learning to spell to my grandmother it doesn't work on her Holovision.
True story, last month I just lost about 1 in 10 of my media files on my Linux Share to NTFS issues. So now I run the box on Windows.
> True story, last month I just lost about 1 in 10 of my media files on my Linux Share to NTFS issues. So now I run the box on Windows.
Were you seriously running a Linux-based NAS with NTFS as the underlying filesystem, or did you mean something very different? I can't imagine why anyone would ever think their choice of disk filesystem on a server—hiding behind a network filesystem—should be influenced by what disk filesystems are supported by client devices.
Does it need to be included in the Linux kernel? I understand that NTFS is 27 years old and that this is mostly about supporting a Microsoft file system. Can't the driver be an external download?
As long as it builds as a module, uses only the standard filesystem apis and doesn't have code changes outside that, code review is far less important IMO...
The linux kernel is successful partly because by and large what's in the kernel works and doesn't generally break on new revisions.
Accepting a new driver which may in fact not work, breaks in ambiguous ways, interacts with other components poorly, or otherwise generates headaches for the kernel could be more trouble than it's worth. Ultimately this driver will eventually need to be modified by others, and ensuring it doesn't get a reputation as a nightmare right out of the gate is also worthwhile.
At the risk of stating the obvious, this is an exceptionally short-sighted approach. Committing poorly understood code is not that different from using binary blobs.
So you’d willingly allow tens of thousands of lines of code which you don’t understand or really know what it does into a kernel which is used by literally billions of devices?
> uses only the standard filesystem apis and doesn't have code changes outside that
That's literally why having to review 27K lines of code is a pain. You're not seriously going to trust, sight unseen, that there's nothing in the code that slips in a backdoor or catastrophic data-loss bug, are you?
I wish you weren't getting downvoted, but what you said is absolutely correct. If it's a module and separate from core, who cares. If it's modifying core kernel or mixed up with kernel space code, then nope.
Linux isn't a microkernel; a filesystem accepted into the kernel source tree will run as kernel space code whether or not it's compiled as a loadable module. So all 27k lines need to be audited for security purposes and to ensure they're interfacing with the rest of the kernel only in the approved ways, because there aren't a lot of technological barriers to the filesystem misbehaving.
But more important than that is the maintenance burden. NTFS will be around for a long time, but it's also a moving target because Microsoft hasn't replaced it yet. Kernel developers have to keep in mind how this code will look in a few decades, after all the original developers are retired. If it's written in a very different style from other Linux filesystems, will there be anyone left who both knows enough about the workings of the Linux IO stack 20 years hence, and understands Paragon's code conventions?