If you're going from a generator object (or any other lazy iterator, like the file object in this case) to a set object, memory usage grows with every distinct line added to the set.
What I meant is that you could, in theory, process a generator and omit duplicates with essentially no memory overhead (even for a file with millions of lines) by chaining generators together. This would be slower than using a set, but far more memory efficient; see the sketch below.
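Here's one way to make that concrete (a minimal sketch; `unique_lines` and the file name are hypothetical, and it assumes the input is a re-readable file where quadratic read time is acceptable): for each line, lazily re-scan the earlier portion of the file with a chained `islice` generator to decide whether it's a duplicate, so nothing is ever stored between iterations.

```python
from itertools import islice

def unique_lines(path):
    """Lazily yield each distinct line of `path` exactly once.

    Instead of remembering seen lines in a set, re-scan the portion
    of the file before the current line to check for an earlier
    occurrence. Memory stays constant; the cost is O(n^2) reads.
    """
    with open(path) as current:
        for index, line in enumerate(current):
            with open(path) as earlier:
                # islice(earlier, index) is a lazy generator over
                # only the lines that came before this one.
                if line in islice(earlier, index):
                    continue  # duplicate of an earlier line; skip it
            yield line

# Usage: iterate lazily, holding at most a couple of lines in memory.
for line in unique_lines("huge_file.txt"):
    print(line, end="")
```

The trade-off is exactly the one described above: a set gives O(n) time at the cost of storing every distinct line, while this approach keeps memory flat at the cost of re-reading the file once per line.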