What does HackerNews think of ripgrep?

ripgrep recursively searches directories for a regex pattern while respecting your gitignore

Language: Rust

#4 in Node.js
BurntSushi (dev) built a grep replacement, `rg` (ripgrep), in Rust, that tends to run circles in terms of speed around grep itself and other grep replacements. The UX is also pretty nice. I install it everywhere now.

I'm curious (like others here) how `rak` is better than `rg`.

https://github.com/BurntSushi/ripgrep

Thanks for posting this, but I think this is mostly taken out of context. There are two really key bits here.

First is that this change is only in ripgrep on master. It's not in a release. It will be in the next release. If you want this in your ripgrep, then `cargo install --git https://github.com/BurntSushi/ripgrep ripgrep` should do the trick. (Add `--features pcre2` if you want that.)

Second is that this makes ripgrep around twice as fast for a certain (albeit pretty common) class of regexes. For example, it likely applies to the patterns `sherlock` and `\w+\s+sherlock\s+\w+`, but not to `sherlock|watson`. The latter uses a multi-substring search which isn't SIMD-ified on aarch64 yet. (That is, loosely speaking, my next step. Then hopefully release ripgrep 14 soon thereafter.)

I find Rust extremely unreadable and can detail exactly how:

I know ripgrep is written in Rust and works really well and "does one thing", so I'll go to its repo ( https://github.com/BurntSushi/ripgrep ) and read through a bit. First, I'm looking for a "src" folder, but that does not exist, so right off the bat I am a little uncomfortable. Nothing in the root folder looks like it would be the source. What are "complete" and "scripts" and "pkg" folders? Because those (in that order) would be what I check.

"complete" looks like command line completion. The "scripts" directory has one file, "copy-examples". "pkg" contains folders "brew" and "windows". Still lost...

I thought "crates" was like modules for Python, so I assume it only contains dependencies and stay out of there.

Finally, I relent and open "build.rs" which I suspect is something like "build.ninja" and I can figure out which node has source files. Stupid me. "build.rs" IS the source file.

Oh. No, actually, it's not all of ripgrep, in fact. It's the tip of the iceberg. The source is in crates/core/app.rs, and build does manpage compilation, registers shell completions, etc.

So, line one. I don't know what the return value App<'static, 'static> is. I understand it's an App object, and the tick (no idea what it's called; this is starting to feel like a fracture mechanics lecture) I know has to do with ownership. It's tough to see how there are two subtypes in App<> when the assignment of app has ten functions that assign attributes. I can certainly READ what is being assigned to the App object. I wouldn't be able to WRITE any of this so far. Unimportant, it's not the meat and potatoes of ripgrep.

Now I'm scrolling, looking for something cool to analyze and run across Vec<&'static str>. Okay, I thought tick was ownership, but I also thought ampersand was ownership-related. Maybe one is mutability? (This is like knowing how to play music really well but now learning a totally foreign instrument.)
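For readers with the same question, here is a minimal sketch of what the tick and the ampersand mean. The tick introduces a lifetime annotation, and `&` is a reference (a borrow); mutability is a third, separate thing, spelled `&mut`. The two parameters in `App<'static, 'static>` appear to be lifetimes as well, not subtypes. The function names below are made up for illustration and are not taken from ripgrep's source.

```rust
// `&` borrows a value without taking ownership of it. `'static` is a
// lifetime annotation: it says the borrowed data lives for the whole
// program. String literals are stored in the binary, so they have the
// type `&'static str`.
// (Hypothetical helper, not ripgrep code.)
fn flag_names() -> Vec<&'static str> {
    vec!["--ignore-case", "--hidden"]
}

// A shared borrow (`&`) can read the data but cannot modify it.
fn total_len(names: &[&str]) -> usize {
    names.iter().map(|name| name.len()).sum()
}

// A mutable borrow (`&mut`) is what allows changes through a reference.
fn add_flag(names: &mut Vec<&'static str>, flag: &'static str) {
    names.push(flag);
}
```

Calling `add_flag(&mut names, "-n")` on the vector returned by `flag_names()` grows it to three entries, while `total_len(&names)` only reads it.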

Skimming, it looks like this whole file is the argc/argv parser. I'll look for something interesting in main.rs, search.rs, and subject.rs (the last because it's an unusual name).

    struct Config {
        strip_dot_prefix: bool,
    }

Well. I can read that! Next line...

    impl Default for Config {
        ...

...not that. It's probably something like a class method.
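The guess is close. `Default` is a standard-library trait (roughly an interface), and `impl Default for Config` supplies the one function it requires, which then works much like a static constructor. A minimal sketch, where the default value chosen is an assumption for illustration and may not match ripgrep's:

```rust
#[derive(Debug, PartialEq)]
struct Config {
    strip_dot_prefix: bool,
}

// Implementing the `Default` trait gives the type a `Config::default()`
// associated function, callable without an existing instance.
impl Default for Config {
    fn default() -> Config {
        // Assumed value for illustration; the real default may differ.
        Config { strip_dot_prefix: false }
    }
}
```

With this in place, `Config::default()` builds a `Config` whose `strip_dot_prefix` is `false`.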

As I'm reading through, it feels like a LOT of kicking the can down the road and verbosity. There's a function binary_detection_explicit(self, detection) that just assigns detection to self.config.binary_explicit and returns self. binary_detection_implicit() does basically the same with a different assignment member. getters and setters, roadblocking readability since the dawn of computer science... (a few lines later, the function has_match() returns the value of has_match. ugh. This is why I prefer crudely written C to textbook C++.)
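What is described above looks like the builder pattern, which is common in Rust because the language has no default or named function arguments. A rough sketch follows, with plain `bool` fields standing in for whatever types ripgrep actually uses; the `Builder` and `build` names are assumptions for illustration.

```rust
#[derive(Debug, Default, PartialEq)]
struct Config {
    binary_explicit: bool,
    binary_implicit: bool,
}

// Hypothetical builder; only the two setter names come from the comment.
#[derive(Default)]
struct Builder {
    config: Config,
}

impl Builder {
    // Each setter takes the builder by value, changes one field, and hands
    // the builder back so that calls can be chained.
    fn binary_detection_explicit(mut self, detection: bool) -> Builder {
        self.config.binary_explicit = detection;
        self
    }

    fn binary_detection_implicit(mut self, detection: bool) -> Builder {
        self.config.binary_implicit = detection;
        self
    }

    // Consumes the builder and returns the finished configuration.
    fn build(self) -> Config {
        self.config
    }
}
```

The payoff is at the call site: `Builder::default().binary_detection_explicit(true).build()` sets one option and leaves the rest at their defaults.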

I see .unwrap() in a line. This tripped me up during the tutorial; still no idea what it does. Sure I can look it up, but there are a hundred other terms to learn: unwind, map, Result (OH! The two App return types are probably Ok and Err ... maybe?), convert, concat, const, continue, ...

subject.rs, btw, can probably be written in three lines of really dense Python or ten lines of really well-written Python with a few other lines of Doxygen. Instead, it's about 90 lines of Rust with another 50 lines of comments.

Finally, I find something noteworthy.

    fn search(...) ...
        fn iter(...) ...
            for subject in subjects {
                searched = true;
                let search_result = match searcher.search(&subject) {
                    ....

This is all extremely readable and just feels like a standard queue in any language. But then there are lines like:

    if let Some(ref mut stats) = stats {
        *stats += search_result.stats().unwrap();
    }

I give up. Are you assigning the variable stats to itself? by reference but mutable? If Rust doesn't have pointers, what is *stats? unwrap() is no longer the most confusing part of this.
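For anyone stuck on the same lines: nothing is assigned to itself. `if let` is a pattern match, not an assignment. Assuming `stats` is an `Option`, the block runs only when it holds a value, and inside the block the name `stats` is rebound to a mutable reference to that value. Rust does have references (and raw pointers), and `*` dereferences one. A small sketch of the same shape, with `u64` standing in for the real statistics type and a hypothetical function name:

```rust
// Adds `found` to the running total, but only if a total is being tracked.
fn accumulate(mut stats: Option<u64>, found: Option<u64>) -> Option<u64> {
    // If `stats` is `Some(..)`, bind a mutable reference to the number
    // inside it. The inner `stats` shadows the outer one within the block.
    if let Some(ref mut stats) = stats {
        // `stats` is a `&mut u64` here; `*stats` is the number it points at.
        // `unwrap()` panics if `found` is `None`.
        *stats += found.unwrap();
    }
    // Outside the block, `stats` is the (possibly updated) `Option` again.
    stats
}
```

So `accumulate(Some(2), Some(3))` yields `Some(5)`, and `accumulate(None, Some(3))` skips the block and yields `None`.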

-----

Postface, I'm originally a mechanical engineer who eventually got roped into writing driver software, so C is my comfort zone and I don't need anything much more complicated than "set bits here, read registers, use a 40-year-old communication protocol, handle errors".

There's simply no easy analog between Rust and Python ... or Rust and C ... or Rust and SQL ... or Rust and PRACTICE ... or Rust and any other language I've learned. I understand a lot of the CWE Top 25 software errors are mitigated by Rust and it's not just a theoretically correct but unusable language, but there hasn't been a strong reason to learn it, and the learning curve is made steeper by preconceived notions of every keyword.

-----

I forgot to compare with grep.c

Extremely readable. I know right away which part is args, where memory allocation happens, where file handling occurs, and then the "hard part"---where the magic happens---is the last page on my monitor. I can just stare at that and mentally deconstruct until pieces fall into place... Look up regcomp()? check. Look up grep_tree()? check. Read the kernel.org discussion on why grep_tree() was written and how it relates to ensure_full_index(), which explains some of the other variables.

The whole thing sits nicely in a few pages with terse but clear comments (unlike this post) and looks familiar even though I've never seen the source before.

-----

Last edit. I tried just now to refresh on .unwrap(). https://doc.rust-lang.org/rust-by-example/error/option_unwra...

Only because of the comments did I realize that calling drink.unwrap() will panic (throw an error?) if that argument is empty. So, unwrap is just the "if (... == nullptr) { return ...; }" of Rust, it seems.

But it's somehow an alternative to .expect() and has its own alternatives, unwrap_or/unwrap_or_else/unwrap_or_default, and it also seems optional. And my time in the rabbit hole is over; my children need to go to bed.
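A short sketch of how those variants differ, using made-up helper functions. One nuance: `unwrap()` does not return early the way the `nullptr` check would; it panics, which by default unwinds and ends the current thread. The alternatives exist so you can pick a fallback instead.

```rust
// `unwrap_or` substitutes a fallback value instead of panicking on `None`.
fn describe(drink: Option<&str>) -> String {
    let name = drink.unwrap_or("water");
    format!("I'll have {}", name)
}

// `unwrap_or_default` yields the type's default (0 for `u32`) when the
// parse fails, so bad input becomes 0 instead of a panic.
fn parse_count(text: &str) -> u32 {
    text.parse::<u32>().unwrap_or_default()
}

// `expect` behaves like `unwrap` but panics with the given message, which
// makes the failure easier to trace.
fn first_char(text: &str) -> char {
    text.chars().next().expect("text must not be empty")
}
```

Here `describe(None)` falls back to water and `parse_count("x")` falls back to 0, while `first_char("")` would panic with the message shown.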

Have you tried out using ripgrep [0] or fzf [1]? 200ms is quite a lot for history search; I'd expect search to take 20ms or less. In fact, it should be instantaneous, as it should all fit in memory. I've been using fzf and the searches are all faster than my monitor can display frames (so less than 16ms), although I only have 10k entries in my history as I clear it up once in a while.

[0]: https://github.com/BurntSushi/ripgrep

[1]: https://github.com/junegunn/fzf

Congratulations (?) for making the top five google results for "ripgrep capture groups" [1].

I cite sources way more often than not; this time I got lazy after dithering over whether to go with the definitive ripgrep source page [2] or a decent looking third party(?) tutorial. Pressed for time, I did neither.

[1] https://learnbyexample.github.io/learn_gnugrep_ripgrep/ripgr...

[2] https://github.com/BurntSushi/ripgrep

I'm unsure how to "sell" this idea. I don't want to force my view on others until I truly understand the problem that they're trying to solve with modules/microservices.

For searching multiple files, ripgrep works really well in neovim :)

[1] https://github.com/BurntSushi/ripgrep

I see no benefits in using Warp over a regular (and more supported) GPU-accelerated terminal emulator. It's just "prettier" out of the box, and it seems to market quite a bit to people that either don't know how to configure their shells, or don't know what they're doing.

  - If you want fuzzy command, history, file / contents search, use fzf [0] (you should probably be using fzf and ripgrep [1] anyway if you work daily in your terminal).
  - If you want sessions, use a multiplexer like tmux [2] or zellij [3].
  - If you need to have your own "cheatsheets" use navi [4]. If you want to sync them with your team, use whatever sync solution you like.
  - If you think you need a text editor in your shell's command line, reconsider. If you *really* want to edit and re-execute the last command in your editor of choice, use something like "fc $!" [5] or create your own shell solution for it.
  - If you want a sexy prompt use starship [6].
  - If you want terminal sharing use tty-share [7].
  - If you want to ask GPT for help, don't do it in your terminal. Open up ChatGPT (or whatever future UI will exist), ask your question, and check that there's nothing harmful in what you're about to execute. Sometimes friction is good.

For each of these ^ pieces of software there are tens and hundreds of alternatives.

If you want a terminal that's pretty out-of-the-box, where things are "clickable", you don't have the time or interest to invest your energy in learning tools that could massively boost your productivity for years to come, and you don't care about designing your own workflow and being able to swap parts of it at any point, without depending on any single "app", and if you don't mind "logging into your terminal" (what the actual fuck, excuse the language) or the terminal adding its own SSH wrapper and doing things you don't know to the hosts you connect to, then maybe Warp is OK for you. But then again, maybe you're not going in the right direction.

There are so many more awesome ways you could improve your shell experience than making things clicky. I don't understand what the market is for Warp, is it for wanna-be professionals that can't be bothered to become professionals? I completely fail to see how this could succeed as a paid product, especially with a subscription model.

  [0] https://github.com/junegunn/fzf
  [1] https://github.com/BurntSushi/ripgrep
  [2] https://github.com/tmux/tmux
  [3] https://github.com/zellij-org/zellij
  [4] https://github.com/denisidoro/navi
  [5] https://shapeshed.com/unix-fc
  [6] https://starship.rs/
  [7] https://github.com/elisescu/tty-share

The same thing happens to me and IMO almost always these are enough:

- HN Search website [0] (with the use of search operators). Almost always I remember a keyword from the title and restrict the search between dates (or sort by date if it is something recent).

- History Trends Unlimited Chrome extension [1], in case I remember the domain.

- A local "personal knowledge base" / "digital garden" / "second brain" [2], in case I want to remember, review, and/or update my notes. "Local" to speed up search (with `ripgrep`[3])

[0] https://hn.algolia.com/

[1] https://chrome.google.com/webstore/detail/history-trends-unl...

[2] https://en.wikipedia.org/wiki/Personal_knowledge_base

[3] https://github.com/BurntSushi/ripgrep

Even if sed and grep are available their weird syntax is enough to make people write modern replacements.

I don't care if they're not 100% feature complete; the fact I can remember how to use them for my simple everyday tasks (searching, finding/replacing across many files) without needing to consult a manpage or search online for answers is enough.

I used grep daily for years and _still_ it didn't feel like it made sense. I remember reading about Ack (https://beyondgrep.com/) some time in 2009-2010, installing it and switching over entirely within about five minutes.

Modern sed:

https://github.com/chmln/sd

Modern grep:

https://github.com/ggreer/the_silver_searcher

https://github.com/BurntSushi/ripgrep

What are you talking about? If you bothered to search for it and go to the shop, you'd see there is not only a sign, but a huge fucking window with several signs helpfully telling you what the shop offers and even going so far as telling you when you might not find the shop useful.

There's even a huge sign with only 12 words pithily explaining what the shop has inside.

https://github.com/BurntSushi/ripgrep

A grep alternative that optimizes for performance: https://github.com/BurntSushi/ripgrep . There are detailed performance comparisons and discussions in the readme there.

100%. That the author thinks it's "backwards" that they can't run a sustainable business by copying what's already been made but need to put in some actual work (and that building profitable software must inherently involve AI/ML experts and cloud specialists) speaks more about OP's bias than about actual requirements. Not meaning to point fingers at them specifically too much; they're most likely in a bubble where these views are implicitly assumed.

As for the specific example of Photoshop, well, yeah.. Adobe has put in at least that much work and resources behind it, so what do you expect is required to get a fair fight? Photoshop was an incredibly complex and refined piece of software with immense work behind it before they moved to the cloud. 99% you wouldn't pull that off in 2008 either. That GIMP is still mostly unheard of outside of enthusiast circles and never posed any threat to Adobe shows that it takes way more than a handful of skilled devs.

> Copies of grep

https://github.com/BurntSushi/ripgrep started in 2016 and went stable in 2019. It's not being sold as a SaaS subscription because why would it?

I often hear complaints about entitled users but I get some "entitled tech founder" vibes here, as if not being able to sustain on rent-seeking behavior is a defect.

Seems like ripgrep would be the optimal tool for this job.

https://github.com/BurntSushi/ripgrep