I would back it up a little and say that any EDR tool should be capable of observing the source of the functions a program will run and detecting outliers. It's a great program to write, and everyone should give it a try! It can also be unexpectedly complicated to get all of the corner cases right, and you'll drive yourself mad once you start thinking of the ways your detection method can be circumvented.
Go deeper. You're right on the cusp of it; all of your questions are fantastic. Even digging into one of them will bring up highly relevant material for understanding how a Linux system functions.
> The sed command is not explained at all. Even the description is nonsense if you don't already know what they are talking about. What is this "default directory", why do you need to set it, why there and only there, and why does that command work? Even the command itself is something I need to check the manual for to figure out what the heck it does, because it's not a simple one.
By "default directory" they just mean that the upstream GCC source code has a file with default variables and they are modifying those variables with the sed command to use $prefix/lib and $prefix/usr/lib instead of $prefix/lib64 and $prefix/usr/lib64, e.g. lines that contain "m64=" and replacing "lib64" to be "lib". This is what sed is used for: To make string substitutions based on pattern matching. Think through ways of writing the sed command and testing on your own file to see how it behaves. This will lead you to more tools like diff, grep and awk.
> > --enable-default-pie and --enable-default-ssp
> The description is almost unusable. So we don't need it, but it's "cleaner", which in this context means exactly nothing. So what happens if I leave it out? Nothing? Then why should I care?
Go back and re-read the sections up to this point. Make note that you're in Chapter 5, which is the bootstrapping phase for the toolchain cross-compilers. Then look into the features that are mentioned. You can usually see descriptions by running ./configure --help or by looking up the GCC documentation. Those features are for security purposes, and in the perspective of the bootstrap phase they aren't needed, since the only purpose of the temporary GCC binaries is to compile the final GCC in a later phase. To your point, perform an experiment and enable those features to see if there really is a difference other than time spent to compile. GCC incrementally adds security features like this, and they are a big thing in hardened distributions.
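One quick way to see what those two switches actually change, using your host compiler (the tiny test program is just an illustration; a compiler built with those switches emits PIE binaries and stack-protected code by default):

    # PIE check: a PIE executable reports "DYN" in its ELF header, a non-PIE one "EXEC"
    echo 'int main(){char buf[64]; buf[0] = 0; return buf[0];}' > t.c
    gcc t.c -o t
    readelf -h t | grep 'Type:'
    # SSP check: with default stack protection the binary references __stack_chk_fail
    nm -D t | grep __stack_chk_fail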
> > --disable-multilib
> Okay, it doesn't support "something". I have no idea what multilib is, or why I should care. Basically the next argument's description tells me that, because otherwise the compilation wouldn't work. And then..
A great feature to look up! Check out https://gcc.gnu.org/install/configure.html and search for the option; you'll find that it has to do with supporting a variety of target calling conventions. I can see how that'd be pretty confusing. It has to do with the application binary interfaces (ABIs) the compiler can target for a single architecture, and it turns out you probably only need to support your native one (e.g. x86_64). That is, you're compiling your system from scratch and it'll only run on your native hardware (x86_64), but if you were a C/C++ programmer maybe you'd want the same compiler to also build, say, 32-bit binaries on that 64-bit host. So you can save time/space by disabling the defaults, and honestly the defaults it includes are just not all that relevant on most systems today.
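A quick way to see what multilib means in practice on an x86_64 host (the test file is illustrative; the -m32 step only works if the 32-bit support libraries are installed):

    # list the multilib variants your compiler was built with (only "." when multilib is disabled)
    gcc --print-multi-lib
    echo 'int main(){return 0;}' > t.c
    gcc -m64 t.c -o t64   # the native ABI always works
    gcc -m32 t.c -o t32   # the 32-bit ABI needs multilib support plus 32-bit libraries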
> > --disable-threads, --disable-libatomic, --disable-libgomp, --disable-libquadmath, --disable-libssp, --disable-libvtv, --disable-libstdcxx
> But why would they fail? I want to understand what's happening here, but I need to blindly trust the manual because they just tell me "they won't work, believe us".
Try it and find out. I would expect that they fail due to reliance on other dependencies that haven't been installed or included in this bootstrapped build. Or maybe because those components don't behave well with the LFS-based bootstrap methodology and ultimately aren't needed to bootstrap. Sure, trust the LFSers, but also think through a way to test your own assertions about the build process and try it out!
> > --enable-languages=c,c++
> Why are these the only languages which we need? What are the other languages?
GCC supports many language front-ends. See https://gcc.gnu.org/frontends.html. Only C/C++ is needed because you're bootstrapping only C- and C++-based package sources. You can validate this as you build the remaining sections. It's conceivable that if you needed other languages in the bootstrap you could include them.
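You can usually check which front-ends a given gcc binary was built with, since gcc -v echoes back the configure line it was built from:

    # the "Configured with:" line normally includes --enable-languages
    gcc -v 2>&1 | grep -o -- '--enable-languages=[^ ]*'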
> So in the end, the descriptions don't really help you understand what's happening if you don't know it already. The last time I started LFS (about 10 years ago), that was my main problem: you already need to know almost everything to understand what's really happening and why, or be reading manuals, or trying to find basically unsearchable information (like why compiling libatomic would fail at this step). So after a while I started the blind copy-pasting, because I didn't have the patience of literally months, and when I realized that this was pointless, I gave up.
It's a steep learning curve, especially for a bootstrap, which by its nature is circular! tbh, that's sort of the utility of LFS: it can take you up to a certain point, but there are so many options and pitfalls in building a Linux system (or really any complex piece of software) that the real usefulness is pushing through, picking something, and learning about it. Then you use what you learned to tackle the unknown. GCC is one of the most important packages too, so there's a lot to unpack in understanding anything about it, but the impact is large.
I believe the premise of the book is to expose the reader to elements of style that bridge different programming environments. Thus, "style" here is meant more in terms of a constraint that exposes a particular technique, not writing in idiomatic or latest-and-greatest Python.
I'd suggest this book even to experienced engineers as a way to focus on types or styles of programming they may not be familiar with. And when you recognize some of the styles, it reinforces that you've been doing something that is seen over and over again (much like a pattern language). I recall going over styles that are seen more in systems programming (like C). The book will contribute to an understanding of code architecture styles across different languages.
For the new engineer, I'd also highly suggest this book as you may begin to see a variety of ways to solve a problem. And I would not concern myself with idiomatic Pythonic styles as that will come with time.
An excellent and very fun read about compilation, interpretation, and programming semantics.
I don't think you have to know LISP already; although there isn't an introduction, knowing any programming would be enough to pick up the programs that are used as examples. Source code from the book is available, and you will end up programming as you move along; it is a delight to see yourself make a program that can interpret another program.
I think you mean Inline::Perl5: https://github.com/niner/Inline-Perl5. It doesn't parse Perl 5 and run it inside of Perl 6, though; it just runs the Perl 5 interpreter for you from a Perl 6 program and lets you marshal data between the interpreters. It's a legitimate technique.
There is the v5 project, which implements Perl 5 targeting NQP so it can be compiled inline with a Perl 6 program: https://github.com/rakudo-p5/v5
The project has mostly stalled since Inline::Perl5 became so effective. v5 had the issue of being unable to easily support binary XS-type modules, which means it could never support CPAN, unlike Inline::Perl5. It was meant for people to be able to copy their high-mental-cost algorithm code into a new Perl 6 project without fuss.
I've been prototyping with VoltDB because it has a pretty interesting model that should be able to achieve near-linear scaling of write operations for tables that are partitioned. After reading the docs on VoltDB [1] it became clear to me that they put the design constraints up front, and if you can work through those constraints [2] you can achieve some wicked scale. But it's a bit more complex than your typical single-host database.
The work that VoltDB makes you deal with up front is a lot like the work that would have to be done for a multi-master setup in PostgreSQL to function correctly. I like how VoltDB puts those problems up front, but I'm having trouble seeing VoltDB as a general-purpose solution. I can't see the old PG database I work on right now fitting into VoltDB, but maybe parts of it would fit OK.
I look forward to the tooling in PG getting better and better. It's a great community, and I do like the work that 2ndQuadrant is doing; I like how they approach the community with their work. I do think the BDR work is important to understand, so that when you're in a situation that calls for it you can take advantage of it (same with VoltDB).
If you have any question about whether VoltDB could fit or not, let us know. It's surely not as general purpose as PG, but yes, it's faster for many use cases and its clustering is fully consistent.
Thanks! I certainly will. It looks pretty cool. I'm certainly learning it and finding the technical details in the documentation to be quite interesting.
Yep, I think that's a pretty big part of the learning process. Realizing that you had a successful build but some things weren't stable. Naturally this would pose the question: What went wrong? To solve that problem you have to figure out a way to debug the system, make a decision as to how to rectify it, apply a fix, then figure out how to verify.
Now imagine the effort that has to go into making Debian work or what if the issue was related to sporadic kernel crashes. It's a lot of fun if you're into that sort of thing.
> What went wrong? To solve that problem you have to figure out a way to debug the system, make a decision as to how to rectify it, apply a fix, then figure out how to verify.
Agreed, and that's exactly the kind of stuff I was looking for in LFS and couldn't find. I don't blame the team too much; it's already fun the way it is, but it's confusing nonetheless.
My first LFS was around 2001, which I think would be LFS v3.0. I can't begin to express how amazing the project was to my learning and development as a programmer.
When I started I was completely new to Linux and the command line. Everything I do today is rooted in the foundation I built by doing a close read of the book. The book requires no coding, just the command line, but soon after my first and second reads I began to automate the build process, which took me down the path of understanding how distros are put together.
I've only read the book end-to-end once or twice (I think that's all that is needed), but I've done hundreds of builds. Make no mistake: the book is difficult to get through. It requires trial and error as well as a close read to understand why the build is done the way it is; that should be explained thoroughly, and if it isn't, I'm sure the editors want to hear about the difficulty you are having. There are also bound to be problems bootstrapping from your <insert_favorite_distro>. That's always been a challenge, and it's why LFS uses a build methodology that creates a stable foundation to build a proper system on (which relates to today's efforts around bit-identical reproducible builds).
I went through the same process, at around the exact same time. I haven't done much with LFS since 2001/2002, but it was invaluable as a resource for learning how a system is put together and what provides what, and as an exposure to bootstrapping through chroot.
Building an LFS system is the sysadmin equivalent of a programmer writing their own DB, ORM, web framework or the like. The vast majority of the time, you don't want to use the result in production, but the process is an invaluable learning experience.