I'd be very interested if the author could provide a post with a more in-depth view of the passes, as suggested!
Yes, please!
It seems that the terminology has evolved, as we now speak more broadly of frontends and backends.
So, I'm wondering if Bison and Flex (or equivalent tools) are still in use by modern compilers? Or are the parsers built directly into GCC, LLVM, ...?
There was some research on parsing C++ with GLR but I don't think it ever made it into production compilers.
Other, more sane languages with unambiguous grammars may still choose to hand-write their parsers for all the reasons mentioned in the sibling comments. However, I would note that, even when using a parsing library, almost every compiler in existence will use its own AST, and not reuse the parse tree generated by the parser library. That's something you would only ever do in a compiler class.
Also I wouldn't say that frontend/backend is an evolution of previous terminology, it's just that parsing is not considered an "interesting" problem by most of the community so the focus has moved elsewhere (from the AST design through optimization and code generation).
Personally I love the (Rust) combo of logos for lexing, chumsky for parsing, and ariadne for error reporting. Chumsky has options for error recovery and good performance, ariadne is gorgeous (there is another alternative for Rust, miette, both are good).
The only thing chumsky is lacking is incremental parsing. There is a chumsky-inspired library for incremental parsing called incpa, though.
Also, in what sense is it more conservative?
It uses ASCII for all output and replaces ZWJs to get consistent terminal output in the face of multi-codepoint emoji, to name two off the top of my head.
In typical modern compilers "frontend" is basically everything involving analyzing the source language and producing a compiler-internal IR, so lexing, parsing, semantic analysis and type checking, etc. And "backend" means everything involving producing machine code from the IR, so optimization and instruction selection.
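To make the split concrete, here is a minimal sketch, not how any real compiler is organized. It borrows Python's own `ast` module as a stand-in for the lexer and parser; the three-address IR tuples, the `frontend`/`backend` names, and the register-machine mnemonics are all made up for illustration, and there is no optimization pass at all.

```python
import ast

# Hypothetical IR: ("const", dest, value) or (op, dest, left, right).
BINARY_OPS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "mul"}


def frontend(source):
    """Analyze the source language and lower it to a compiler-internal IR."""
    tree = ast.parse(source, mode="eval")  # lexing + parsing, done by Python
    ir = []

    def lower(node):
        # "Semantic analysis" here is just rejecting anything that is not
        # integer arithmetic; a real frontend does far more.
        if (isinstance(node, ast.Constant) and isinstance(node.value, int)
                and not isinstance(node.value, bool)):
            dest = f"t{len(ir)}"
            ir.append(("const", dest, node.value))
            return dest
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            left = lower(node.left)
            right = lower(node.right)
            dest = f"t{len(ir)}"
            ir.append((BINARY_OPS[type(node.op)], dest, left, right))
            return dest
        raise SyntaxError(f"unsupported construct: {type(node).__name__}")

    lower(tree.body)
    return ir


def backend(ir):
    """Turn the IR into assembly-like text for an invented register machine."""
    lines = []
    for instr in ir:
        if instr[0] == "const":
            _, dest, value = instr
            lines.append(f"mov {dest}, {value}")
        else:
            op, dest, left, right = instr
            lines.append(f"{op} {dest}, {left}, {right}")
    return lines


if __name__ == "__main__":
    for line in backend(frontend("1 + 2 * 3")):
        print(line)
```

The point of the shape is that `backend` never sees source text and `frontend` never sees instruction names for the target, so either side could be swapped out as long as the IR stays the same.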
In the context of Rust, rustc is the frontend (and it is already a very big and complicated Rust program, much more complicated than just a Rust lexer/parser would be), and then LLVM (typically bundled with rustc though some distros package them separately) is the backend (and is another very big and complicated C++ program).
AFAIK the reason is solely error messages: the customization available with handwritten parsers is just way better for the user.
https://github.com/NixOS/nix/blob/master/src/libexpr/parser....
https://github.com/NixOS/nix/blob/master/src/libexpr/lexer.l
The hard part about compiling Rust is not really parsing, it's the type system including parts like borrow checking, generics, trait solving (which is Turing-complete itself), name resolution, drop checking, and of course all of these features interact in fun and often surprising ways. Also macros. Also all the "magic" types in the StdLib that require special compiler support.
This is why e.g. `rustc` has several different intermediate representations. You no longer have "the" AST, you have token trees, HIR, THIR, and MIR, and then that's lowered to LLVM or Cranelift or libgccjit. Each stage has important parts of the type system happen.
In particular, it makes parsing everything look like a huge difficult problem. This is my main problem with the Dragon Book.
In practice everyone uses hacky informal recursive-descent parsers because they're the only way to get good error messages.
Most roll their own for three reasons: performance, context, and error handling. Bison/Menhir et al. are easy to write a grammar and get started with, but in exchange you get less flexibility overall. It becomes difficult to handle context-sensitive parts, do error recovery, and give the user meaningful errors that describe exactly what’s wrong. Usually if there’s a small syntax error we want to try to tell the user how to fix it instead of just producing “Syntax error”, and that requires being able to fix the input and keep parsing.
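As a rough illustration of what "fix the input and keep parsing" can look like, here is a minimal sketch of a hand-written recursive-descent parser with panic-mode recovery. The toy grammar (`name = expr ;` statements with `+` and `-`), the `Parser` class, and the message wording are all invented for this example; production parsers are much more elaborate about choosing synchronization points.

```python
import re


class ParseError(Exception):
    """Raised to unwind to the statement level after an error is recorded."""


def tokenize(text):
    # Each token is (text, offset); a sentinel marks the end of input.
    tokens = [(m.group(), m.start())
              for m in re.finditer(r"\d+|[A-Za-z_]\w*|\S", text)]
    tokens.append(("<eof>", len(text)))
    return tokens


def is_name(tok):
    return tok[0].isalpha() or tok[0] == "_"


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.i = 0
        self.errors = []

    def peek(self):
        return self.tokens[self.i][0]

    def advance(self):
        tok = self.tokens[self.i][0]
        if tok != "<eof>":
            self.i += 1
        return tok

    def error(self, message):
        tok, offset = self.tokens[self.i]
        self.errors.append(f"offset {offset}: {message}, found {tok!r}")
        raise ParseError

    def expect(self, text, message):
        if self.peek() != text:
            self.error(message)
        self.advance()

    def program(self):
        statements = []
        while self.peek() != "<eof>":
            try:
                statements.append(self.statement())
            except ParseError:
                self.synchronize()  # keep going so later errors are reported
        return statements

    def synchronize(self):
        # Panic mode: skip to just past the next ';' and resume there.
        while self.peek() not in (";", "<eof>"):
            self.advance()
        if self.peek() == ";":
            self.advance()

    def statement(self):
        name = self.peek()
        if not is_name(name):
            self.error("expected a variable name at the start of a statement")
        self.advance()
        self.expect("=", "expected '=' after the variable name")
        value = self.expression()
        self.expect(";", "expected ';' to end the statement")
        return ("assign", name, value)

    def expression(self):
        left = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            left = (op, left, self.term())
        return left

    def term(self):
        tok = self.peek()
        if tok.isdigit():
            self.advance()
            return int(tok)
        if is_name(tok):
            self.advance()
            return tok
        self.error("expected a number or a name")


if __name__ == "__main__":
    parser = Parser("x = 1 + ; y = 2; z 3; w = x - y;")
    print(parser.program())
    print("\n".join(parser.errors))
```

Because each error site is ordinary code, it can say exactly what was expected at that point, and one run reports both broken statements while still parsing the two good ones.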
Menhir has a new mode where the parser is driven externally; this allows your code to drive the entire thing, which requires a lot more machinery than fire-and-forget but also affords you more flexibility.
The rest of the f*cking owl is the interesting part.
As did all the UNIXes that used to rule before companies started sponsoring Linux kernel development, and were quite happily taking BSD code into them, alongside UNIX System V original code.
GCC's approach is on purpose; plus, even if they wanted to change, who would put in the effort to make the existing C, C++, Objective-C, Objective-C++, Fortran, Modula-2, Algol 68, Ada, D, and Go frontends adopt the new architecture?
Even clang, with all the LLVM modularization, is going to take a couple of years to move from plain LLVM IR to an MLIR dialect for C-based languages: https://github.com/llvm/clangir
The idea is that you should link the front and back ends, to prevent out-of-process GPL runarounds. But because of that, the mingling of the front and back ends ended up winning out over attempts to stay modular.
[0]: https://lists.gnu.org/archive/html/emacs-devel/2015-02/msg00...
[1]: https://lists.gnu.org/archive/html/emacs-devel/2015-01/msg00...
Valid points, but also the reason people wanting to create a more modular compiler created LLVM under a different license - the ultimate GPL runaround. OTOH now we have two big and useful compilers!
If it's free software then I can modify and use it as I please. What's limited is redistributing the modified code (and, for the Affero GPL, offering a service to users over a network).
https://www.gnu.org/philosophy/free-sw.en.html#fs-definition
--- From the post:
I let this drop back in March -- please forgive me.
> Maybe that's the issue for GCC, but for Emacs the issue is to get detailed
> info out of GCC, which is a different problem. My understanding is that
> you're opposed to GCC providing this useful info because that info would
> need to be complete enough to be usable as input to a proprietary
> compiler backend.
My hope is that we can work out a kind of "detailed output" that is
enough for what Emacs wants, but not enough for misuse of GCC front ends.
I don't want to discuss the details on the list, because I think that would mean 50 messages of misunderstanding and tangents for each message that makes progress. Instead, is there anyone here who would like to work on this in detail?
If you're going to make it hard for anyone anywhere to integrate with your open source tooling for fear of commercial projects abusing them and not ever sharing their changes, why even use the GPL license?
Stallman's insistence that gcc needed to be deliberately made worse to keep evil things from happening ran completely counter to his own supposed raison d'etre. Which you could maybe defend if it had actually worked, but it didn't: it just made everyone pack up and leave for LLVM instead, which easily could've been predicted and reduced gcc's leverage over the software ecosystem. So it was user-hostile, anti-freedom behavior for no benefit.
I am not familiar enough with gcc to know how it impacts out-of-tree free projects or internal development.
The decision was taken a long time ago, it may be worth revisiting it.
That said, if Rust is going to continue entrenching itself in the open source software that is widely in use, it should at least be able to be compiled with by the mainline GPL compiler used and utilized by the open source community. Permissive licenses are useful and appreciated in some context, but the GPL’d character of the Linux stack’s core is worth fighting to hold onto.
It’s not Rust in open source I have a problem with, it is Rust being added to existing software that I use that I don’t want. A piece of software, open source, written in Rust is equivalent to proprietary software from my perspective. I’ll use it, but I will always prefer software I can control/edit/hack on as the key portions of my stack.
This is how I feel about C/C++; I find Rust a lot easier to reason about, modify, and test, so I'm always happy to see that something I'm interested in is written in Rust (or, to a far lesser extent, golang).
> So for me the less entrenched Rust remains the more ability I keep to work on the software I use.
For me, the more entrenched Rust becomes the more ability I gain to work on the software I use.
> if Rust is going to continue entrenching itself in the open source software that is widely in use, it should at least be able to be compiled with by the mainline GPL compiler used and utilized by the open source community
I don't see why this ideological point should have any impact on whether a language is used or not. Clang/LLVM are also open-source, and I see no reason why GCC is better for these purposes than those. Unless you somehow think that using Clang/LLVM could lead to Rust becoming closed-source (or requiring closed-source tools), which is almost impossible to imagine, the benefits of using LLVM outweigh the drawbacks dramatically.
> A piece of software, open source, written in Rust is equivalent to proprietary software from my perspective.
This just sounds like 'not invented here syndrome'. Your refusal to learn new things does not reflect badly on Rust as a technology or on projects adopting it, it reflects on you. If you don't want to learn new things then that's fine, but don't portray your refusal to learn it as being somehow a negative for Rust.
> I will always prefer software I can control/edit/hack on as the key portions of my stack
You can control/edit/hack on Rust code, you just don't want to.
To be blunt, you're coming across as an old fogey who's set in his ways and doesn't want to learn anything new and doesn't want anything to change. "Everything was fine in my day, why is there all this newfangled stuff?" That's all fine, of course, you don't need to change or learn new things, but I don't understand the mindset of someone who wouldn't want to.
> This is how I feel about C/C++; I find Rust a lot easier to reason about, modify, and test, so I'm always happy to see that something I'm interested in is written in Rust (or, to a far lesser extent, golang).
You have to do better than "NO U" on this. The comparison to C/C++ is silly, because there is no way you're going to avoid C/C++ being woven throughout your entire existence for decades to come.
> I don't see why this ideological point should have any impact on whether a language is used or not. Clang/LLVM are also open-source, and I see no reason why GCC is better for these purposes than those.
I hope you don't expect people to debate about your sight and your imagination. You know why people choose the GPL, and you know why people are repulsed by the GPL. Playing dumb is disrespectful.
> don't portray your refusal to learn it as being somehow a negative for Rust.
But your sight, however, we should be discussing?
edit: I really, really like Rust, and I find it annoying that the clearest, most respectful arguments in this little subthread are from the people who just don't like Rust. The most annoying thing is that when they admit that they just don't like it, they're criticized for not making up reasons not to like it. They made it very clear that their main objection to its inclusion in Linux is licensing and integration issues, not taste. The response is name calling. I'm surprised they weren't flagkilled.
Keywords right there. People who don’t-like-Rust are the most coddled anti-PL group. To the extent that they can just say: I really need to speak my mind here that I just don’t like it. End of story.
I don’t think anyone else feels entitled to complain about exactly nothing. I complain about languages. In the appropriate context. When it is relevant or germane to the topic.
A “genius” Rust program running on a supercomputer solving cancer would either get a golf-clap (“I don’t like Rust, but”) or cries that this means that the contagion is irreversibly spreading to their local supercomputer cluster.
One thing is people who work on projects where they would have to be burdened by at least (even if they don’t write it themselves) building Rust. That’s practical complaining, if that makes sense. Here people are whining about it entrenching itself in muh OSS.
You started your comment with "I don't like the language". I can't find any technical or even legal-like argumentation (there is zero legal encumbering for using Rust AFAIK).
Your entire comment is more or less "I dislike Rust".
Question to you: what is the ideal imagined outcome of your comment? Do you believe that the Rust community will collectively disband and apologize for rubbing you the wrong way? Do you expect the Linux kernel to undo their decision to stop flagging Rust as an experiment in its code base?
Genuine question: imagine you had all the power to change something here; what would you change right away? And, much more interestingly: why?
If you respond, can we stick to technical argumentation? "I don't like X" is not informative for any future reader. Maybe expand on your multiple levels of disagreement with Rust?
1) I had no ideal imagined outcome to writing that comment. The parent asked what the GP meant by not liking Rust but that at least Rust could be compiled by gcc. I was just explaining why it may be preferable to someone that does not use (or in this case "like") Rust to see it able to be compiled by a GPL piece of software that has been a part of the Linux core for almost all of Linux's existence. As to the rest of that question, of course, I don't think that anyone using/enjoying/designing/supporting Rust in any way would be convinced by anything I think or say (I'm just some guy on HN).
2) If I had the power to change what? The issue of Rust not being compilable with gcc, or, more broadly, things about Rust itself? I don't think a list of changes I'd make to Rust is what you wanted, so I'll assume you meant compiling Rust via gcc. If I had the power to move Rust from being compiled only with rustc to being primarily gcc-based, I would. And the why is not particularly interesting: I will always prefer actions and decisions that take mind and market share away from anything that can be used to advance the interests of multinational conglomerate corporations via permissive licensing of the core technologies of computing.
I know that is not a technical argument, but it is the reason I'd make the change. I will assert that such a reason is absolutely valid, but I don't take disagreement with my position to be a character flaw in someone.
I too am just one guy on HN, but when I go to certain threads I expect no emotional or preference-based comments, because I want to fill in my blind spots and emerge better educated. Obviously that does not mandate anything from you, but since we are expressing preferences, that's mine.
RE: the rest, I am trying to understand your POV but can't. Is your issue with the difference between GPL and whatever Rust is licensed under?
That I could somewhat understand. But your rather emotionally loaded language against Rust itself I simply cannot and will not take seriously. Apparently Rust has a unique talent for ticking people off on HN would be my tongue-in-cheek conclusion here, because it has been years since I saw what we might call a "zealot fiercely arguing in favor of Rust" here, so the reason should be somewhere else.
Feel free to elaborate on that, though I am fairly sure such a discussion would not go anywhere. Emotion resents reason; emotion just wants to express itself.
But I do get weirded out how many people treat Rust like it's coming to eat their kids and puppies.
Fair enough, but what are those disagreements? I was fully in the camp of not liking it, just because it was shoved down every project's throat. I used it, and it turns out it's fantastic once you get used to the syntax; it has replaced almost all other languages for me.
I just want to know if there are any actual pain points beyond syntax preference.
Edit: I partially agree with the compiler argument, but it's open source, and one of the main reasons the language is so fantastic IS the compiler, so I can stomach installing rustc and cargo.
Unlike a project's license, this situation is entirely in your control. Rust is just a programming language like any other. It's pretty trivial to pick up any programming language well enough to be productive in a couple hours. If you need to hack on a project, you go learn whatever environment it uses, accomplish what you need to do, and move on. I've done this with Python, Bash, CMake, C++, JavaScript, CSS, ASM, Perl, weird domain-specific languages, the list goes on. It's fine to like some languages more than others (I'd be thrilled if C++ vanished from the universe), but please drop the drama queen stuff. You look really silly.
But, that doesn’t have any bearing on my lack of desire to learn Rust. Several other comments basically demand I justify that dislike, and I may reply, but there is nothing wrong with not liking a language for personal or professional use. I have not taken any action to block Rust’s adoption in projects I use nor do I think I would succeed if I did try. I have occasionally bemoaned the inclusion of Rust in projects I use on forums, but even that isn’t taken well (my original comment as an example).
This is an irrelevant and disingenuous hacker jab (oh look, they’re not a “real hacker”).
The language itself I find wonderful, and I suspect that it will get significantly better. Being GPL-hostile, centralized without proper namespacing, and having a Microsoft dependency through Github registration is aggravating. When it all goes bad, all the people silencing everyone complaining about it will play dumb.
If there's anything I would want rewritten in something like Rust, it would be an OS kernel.
Never attribute to malice that which can be adequately explained by apathy. We have, unfortunately, reached a point where most people writing new software default to permissive and don't sufficiently care about copyleft. I wish we hadn't, but we have. This is not unique to Rust.
Ironically, we're better off when existing projects migrate to Rust, because they'll keep their licenses, while rewrites do what most new software does, and default to permissive.
Personally, I'm happy every time I see a new crate using the GPL.
> GPL-hostile
Rust is not GPL-hostile. LLVM was the available tool that spawned a renaissance of new languages; GCC wasn't. The compiler uses a permissive license; I personally wish it were GPL, but it isn't. But there's nothing at all wrong with writing GPLed software in Rust, and people do.
> having a Microsoft dependency through Github registration is aggravating
This one bugs a lot of us, and it is being worked on.
Not sure if it is particularly hostile. There are several GPL crates like Slint.
> Microsoft dependency through Github registration is aggravating
This one is concerning.
Many forget that Microsoft went from "FOSS is bad", to now having their fingers across many key FOSS projects.
They are naturally not the only ones; a developer has got to eat, and big tech gladly pays the bills when it fits their purposes.
Is libgccjit not “a nice library to give access to its internals?”
The MIDI protocol is pretty good for what it is designed for, and you can make it work for actual real networking, but the connections will be clunky, unergonomic, and will be missing useful features that you really want in a networking protocol.
Googling “slip over midi” gives a lot of fashion blogging about mini dresses and slips that one wears under them, so I’m not quite sure what you mean.
But if you mean “midi over slip”, then that is the inverse case from what I am suggesting. Midi over slip (and slip could be any tcpip substrate, such as ethernet) has midi messages as the payload, carried via tcpip.
I’m talking about using midi messages to carry tcpip payloads. You can absolutely do it, but it isn’t really what the protocol is designed for.
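For what it's worth, the clunkiness is easy to see in a sketch. MIDI data bytes must have the top bit clear, so arbitrary packet bytes cannot be sent as-is. One way around that, shown below purely as an assumption and not as an established protocol, is to pack every 7 payload bytes into 8 seven-bit bytes and wrap the result in a System Exclusive message. The function names, the packing layout, and the use of the 0x7D (non-commercial) manufacturer ID are illustrative choices.

```python
SYSEX_START = 0xF0
SYSEX_END = 0xF7
NON_COMMERCIAL_ID = 0x7D  # assumed here: the ID set aside for non-commercial use


def pack_7bit(payload: bytes) -> bytes:
    """Pack 8-bit bytes so that every output byte is below 0x80.

    Each group of up to 7 input bytes becomes one byte holding their
    top bits, followed by the low 7 bits of each input byte.
    """
    out = bytearray()
    for start in range(0, len(payload), 7):
        group = payload[start:start + 7]
        high_bits = 0
        for index, byte in enumerate(group):
            high_bits |= (byte >> 7) << index
        out.append(high_bits)
        out.extend(byte & 0x7F for byte in group)
    return bytes(out)


def unpack_7bit(data: bytes) -> bytes:
    """Inverse of pack_7bit."""
    out = bytearray()
    for start in range(0, len(data), 8):
        high_bits = data[start]
        group = data[start + 1:start + 8]
        for index, byte in enumerate(group):
            out.append(byte | (((high_bits >> index) & 1) << 7))
    return bytes(out)


def to_sysex(packet: bytes) -> bytes:
    """Wrap one packet (for example an IP datagram) in a SysEx message."""
    header = bytes([SYSEX_START, NON_COMMERCIAL_ID])
    return header + pack_7bit(packet) + bytes([SYSEX_END])


def from_sysex(message: bytes) -> bytes:
    """Recover the packet from a message produced by to_sysex."""
    if (len(message) < 3 or message[0] != SYSEX_START
            or message[1] != NON_COMMERCIAL_ID or message[-1] != SYSEX_END):
        raise ValueError("not a SysEx message in the expected format")
    return unpack_7bit(message[2:-1])


if __name__ == "__main__":
    packet = bytes([0x45, 0x00, 0xC0, 0xFF, 0x80, 0x7F, 0x01, 0xDB])
    message = to_sysex(packet)
    print(message.hex(" "))
    print(from_sysex(message) == packet)
```

Even in this toy form you pay at least one extra byte for every seven, and there is no addressing, flow control, or retransmission; all of that would have to be layered on top.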
And what google turns up when you enter those exact three words in a row is really none of my business.
https://en.wikipedia.org/wiki/Serial_Line_Internet_Protocol
https://en.wikipedia.org/wiki/MIDI
I suppose someone somewhere has done it (and I have always said that you can), but my best internet searches with a wide variety of terms don't show any old tutorials or products that explain how. Nor can I find anything else that uses MIDI to send tcpip packets over the MIDI connection. Not even a mention.
My Google-fu may be totally weak, but whatever.
I freely, happily and totally concede that people have in the past used MIDI to send SLIP packets, and it is well understood how. Great. You are totally correct.
But all of this just proves the original point. It either precedes anything on the internet today, or is so obscure that no search engine can find it. Either way, if no one uses it or even bothers to explain how, I think it is pretty fair to conclude that it is rather unergonomic, and hacky, and doesn't provide all the features one really wants in a network connection.