What does HackerNews think of corrode?

C to Rust translator

Language: Haskell

There is also https://github.com/jameysharp/corrode

I have not used it, but I doubt it's possible to translate C into idiomatic Rust mechanically.

I think you're missing an important point:

Checked C’s design is distinguished by its focus on backward-compatibility, incremental conversion, developer control, and enabling highly performant code

The problem is that there is a huge body of decades-old C code out there that we all depend on.

A typical Docker container with a web app contains maybe 10K lines of your own code in Python/Ruby/JS, 100K lines of framework code, and 1M or 10M lines of C code (i.e. the interpreter itself, the web server, SSL, the base image, etc.)

So I welcome new languages focused on incremental conversion.

That's what I'm doing with Unix shell: http://www.oilshell.org/blog/2018/01/28.html

I applaud projects like Corrode, a C to unsafe-Rust translator, although undoubtedly they would have an easier time if Rust had been designed for conversion in the first place.

https://github.com/jameysharp/corrode

So Checked C seems like a great idea to me.

I've noticed that most programmers seem to wildly overestimate the rate at which code gets rewritten. I think it's more accurate to say that code piles up over time. And unfortunately when the foundations are unstable, what you build on top is also unstable.

I'm the author of https://github.com/dropbox/rust-brotli and I certainly ran into the issues mentioned in the article: the initial versions of my brotli decoder and encoder were each almost 10x slower. I also worked around each and every one of them, and the result is something that performs at 80-90% of the speed of the original on average, and some files compress faster with Rust than with the original brotli codec.

allocator: I ran into the same situation and abstracted it with a generic allocator: https://github.com/dropbox/rust-alloc-no-stdlib. It's an ugly solution, but it works, and it can allow for significant perf improvements down the line by bundling all allocations of the same type together

benchmarks: I simply made a test that printed the time

primitive types: it's easy to write generic functions that do this

macro system: I found it to be amazing, but I never used it to compact data types

borrow checker: split_at_mut was super helpful...also core::mem::replace was useful for taking away pieces of a structure, manipulating them, then putting them back
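To make the two borrow-checker workarounds concrete, here is a minimal sketch. The `Decoder` struct and both function names are hypothetical and are not taken from rust-brotli; only `split_at_mut` and `mem::replace` (in `core::mem`, re-exported as `std::mem`) are real library calls.

```rust
use std::mem;

/// Adds each element of the first half of `data` to the matching element
/// of the second half, in place.
fn add_first_half_into_second(data: &mut [u32]) {
    let mid = data.len() / 2;
    // split_at_mut returns two non-overlapping mutable slices, so the
    // borrow checker lets us read one half while writing the other.
    let (left, right) = data.split_at_mut(mid);
    for (dst, src) in right.iter_mut().zip(left.iter()) {
        *dst += *src;
    }
}

/// Hypothetical decoder state, for illustration only.
struct Decoder {
    window: Vec<u8>,
    checksum: u32,
}

impl Decoder {
    fn record(&mut self, byte: u8) {
        self.checksum = self.checksum.wrapping_add(u32::from(byte));
    }

    fn checksum_window(&mut self) {
        // Iterating over `self.window` while calling `self.record` would
        // borrow `self` twice. Taking the buffer out (leaving an empty Vec
        // behind) sidesteps that; it is put back once the loop is done.
        let window = mem::replace(&mut self.window, Vec::new());
        for &byte in &window {
            self.record(byte);
        }
        self.window = window;
    }
}

fn main() {
    let mut data = [1, 2, 3, 4];
    add_first_half_into_second(&mut data);
    println!("{:?}", data);

    let mut decoder = Decoder { window: vec![1, 2, 3], checksum: 0 };
    decoder.checksum_window();
    println!("checksum = {}, window = {:?}", decoder.checksum, decoder.window);
}
```

Replacing with an empty `Vec` does not allocate, so the take-and-put-back pattern costs little beyond moving the buffer's pointer, length, and capacity.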

Though it wasn't all roses: to try to get SSE vectorization I had to convert this 6 line short string matching function https://github.com/dropbox/rust-brotli/blob/238c9c539b446d7d...

into this multi hundred line monster with macros and so forth https://github.com/dropbox/rust-brotli/blob/238c9c539b446d7d...

but in the end it was as fast as hand-tuned intrinsics in C

I also found myself scared by dependencies, including onto the std library, so I tried to get everything to remain within the core library. This should allow something as low-level as a codec to be used in kernel space or in another place where a custom allocator is needed.
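The generic-allocator idea from earlier in this comment can be sketched roughly as follows. The `BufAllocator` trait, `HeapAllocator`, and `expand_runs` are hypothetical names made up for illustration and are not the rust-alloc-no-stdlib API. The `Vec`-backed allocator is only a stand-in so the example runs; a kernel or embedded user would presumably supply an arena- or stack-backed implementation instead.

```rust
/// Hypothetical trait (not the rust-alloc-no-stdlib API): hands out and
/// takes back fixed-length buffers of `T`.
trait BufAllocator<T> {
    type Buf: AsRef<[T]> + AsMut<[T]>;
    fn alloc(&mut self, len: usize) -> Self::Buf;
    fn free(&mut self, buf: Self::Buf);
}

/// Heap-backed stand-in that counts live buffers.
struct HeapAllocator {
    live: usize,
}

impl<T: Default + Clone> BufAllocator<T> for HeapAllocator {
    type Buf = Vec<T>;

    fn alloc(&mut self, len: usize) -> Vec<T> {
        self.live += 1;
        vec![T::default(); len]
    }

    fn free(&mut self, buf: Vec<T>) {
        self.live -= 1;
        drop(buf);
    }
}

/// Codec-style routine that never names a concrete allocator: it expands
/// (byte, count) runs into a buffer obtained from whatever `A` provides.
fn expand_runs<A: BufAllocator<u8>>(alloc: &mut A, runs: &[(u8, usize)]) -> A::Buf {
    let total: usize = runs.iter().map(|&(_, n)| n).sum();
    let mut out = alloc.alloc(total);
    let mut pos = 0;
    for &(byte, n) in runs {
        for slot in &mut out.as_mut()[pos..pos + n] {
            *slot = byte;
        }
        pos += n;
    }
    out
}

fn main() {
    let mut alloc = HeapAllocator { live: 0 };
    let buf = expand_runs(&mut alloc, &[(b'a', 3), (b'b', 2)]);
    println!("{:?}, live buffers: {}", buf, alloc.live);
    alloc.free(buf);
    println!("live buffers after free: {}", alloc.live);
}
```

The routine itself only touches slices, which is what lets the allocation strategy be swapped without changing the codec logic.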

I also found that "rewriting in safe rust from C" was made easy by corrode https://github.com/jameysharp/corrode since C and rust can be interface compatible, it's easy to go one file at a time and turn it into working rust, then safe rust.
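As a rough illustration of the interface compatibility, a single C function can be replaced by a Rust function with the C calling convention that wraps a safe core. The `checksum` function and its C prototype are hypothetical, not taken from brotli or from Corrode output; actually exporting the symbol to a C linker would additionally need a no-mangle attribute, which is omitted here.

```rust
/// Safe core: ordinary Rust over a slice.
fn checksum_safe(bytes: &[u8]) -> u32 {
    bytes
        .iter()
        .fold(0u32, |acc, &b| acc.wrapping_mul(31).wrapping_add(u32::from(b)))
}

/// Hypothetical drop-in for a C function declared as
/// `uint32_t checksum(const uint8_t *data, size_t len);`
///
/// Safety: `data` must be null or point to `len` readable bytes.
pub unsafe extern "C" fn checksum(data: *const u8, len: usize) -> u32 {
    if data.is_null() {
        return 0;
    }
    // The only unsafe step: trusting the caller's pointer and length.
    let bytes = unsafe { std::slice::from_raw_parts(data, len) };
    checksum_safe(bytes)
}

fn main() {
    let input = b"abc";
    // Call through the C-compatible signature, as a C caller would.
    let via_abi = unsafe { checksum(input.as_ptr(), input.len()) };
    println!("{} {}", via_abi, checksum_safe(input));
}
```

Once every caller has been ported, the `extern "C"` shim can be dropped and only the safe function remains.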

Bonsoir, allow me to introduce Servant.

http://haskell-servant.readthedocs.io/en/stable/index.html

Servant allows you to write a type that describes a server/API, and automatically derives a specification for it. You can then write handlers for the different API endpoints, and it will only compile if every endpoint is correctly handled. That includes content types, status codes, and so on, everything is verified, and things are often derived for you. All (de)serialization to/from JSON is automatic (although you can write your own instances if you want your JSON to look a certain way).

For example, from the Servant docs:

    type API = "position"  :> Capture "x" Int :> Capture "y" Int :> Get  '[JSON] Position
          :<|> "hello"     :> QueryParam "name" String           :> Get  '[JSON] HelloMessage
          :<|> "marketing" :> ReqBody '[JSON] ClientInfo         :> Post '[JSON] Email

This is equivalent to an API that defines these three endpoints:

    GET  /position/:x/:y/      returns a Position object as JSON
    GET  /hello?name=whatever  returns a HelloMessage object as JSON
    POST /marketing            given a JSON request that represents a ClientInfo object,
                               returns an Email object as JSON

but the spec is code: code that the compiler can check for you!

You can then write functions of the following types

    handlePosition  :: Int -> Int   -> Handler Position
    handleHello     :: Maybe String -> Handler HelloMessage
    handleMarketing :: ClientInfo   -> Handler Email

where "Handler thing" effectively means that you can do something like write to a log or throw an exception while you return the thing. This will typecheck, and you have a server that must do what it says on the "tin", the tin being the API type above.

If you add a new API endpoint, and forget to write a Handler for it, you'll get to know. If you change the Position type to work with x, y, and z coordinates but forget to update your handlers, you'll get to know. If you'd like to also allow clients to request HTML instead for some endpoint, just change the '[JSON] to '[JSON,HTML]. The Haskell typesystem will make sure that someone requesting a text/html Content-Type doesn't get hit with a 500 or something.

Note the Maybe String there: if you wrote a handler for the /hello endpoint that had the type String -> Handler HelloMessage, Servant would complain: you're expecting a query parameter, which the client doesn't have to provide. This is in stark contrast to, say, the "NoneType has no attribute whatever" problems that one risks having to face with Django: if it compiles, it meets the spec.

Of course, static typing won't prevent you from responding to /hello with Lovecraft quotes.

--

> I start getting lost in layers of monad transformers

Handler is actually a monad transformer stack, and it's a great first one: it's essentially[0]

    type Handler a = ExceptT ServantErr (ReaderT Config IO) a

if memory serves, which means that a "Handler a" is a computation that can

  * throw a ServantErr
  * read values from a Config object
  * perform IO actions (read files, make other network requests, log things, "fire missiles")

when run, returning a value of type a. Anecdotally, monad transformers never clicked for me until I started muddling through actually using them in code like this.

After a couple of months, you realise it's not too wrong to say you understand how to use them, and you slowly begin to be able to rely on the type system for support. Libraries like Servant are a great example of how this can really, really help. I stopped to think last week how similar it is to pair-programming with a really intelligent but sometimes obtuse friend who likes pointing out how "this doesn't follow from your assumptions", all the way from

> "Wait, but you can't add two books together."

to

> "What if someone PUTs to /login?"

which is what we had above, to

> "And what happens if I try to withdraw all my money exactly when the payment I've made to you is getting executed?"

(in this case, you discover the wonders of software transactional memory[1]). A better typesystem allows you to let the compiler handle more of the busywork that you'd originally have written tests or comments for.

It is sometimes confusing starting out (when one discovers that 3 isn't an Int, but a "Num a => a", for instance), but it gets better. Once you get past the Project Euler/"look ma, infinite lists!" stuff, "real" Haskell does have a tendency to make people underestimate how much they really know, but as I've discovered, taking the plunge reveals that one has progressed much farther than one thinks.

Also, #haskell on Freenode is one of the friendliest places I've encountered on the internet[2].

[0]: http://haskell-servant.readthedocs.io/en/stable/tutorial/Ser...

[1]: https://www.schoolofhaskell.com/school/advanced-haskell/beau...

[2]: Here's a hilarious example (NSFW language warning): https://gist.github.com/quchen/5280339

--

Other stuff:

Corrode is a C-to-Rust converter written in literate Haskell (as in, it's like a blog post that you can compile and run, to give a slightly strained analogy).

https://github.com/jameysharp/corrode

webshit weekly actually "quoted" a previous HN comment of mine about it, in which I was falsely slandered: I'm not a Rust evangelist!

I wrote a small "script" to make tmux status bars a while back that was a good example of how fun "scripting"-style things can be in Haskell.

https://github.com/mrkgnao/remux

And there's always XMonad!

For automated tools that can convert a C codebase to Rust, what you're looking for is Corrode: https://github.com/jameysharp/corrode

Last I heard, Corrode can convert many C syntactic structures into their Rust equivalents; the results usually compile, and sometimes they even work. Even when the conversion works flawlessly, the resulting Rust does exactly what the original C code did, so there's no safety benefit (but you can begin refactoring to clean things up).
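To picture the "no safety benefit until you refactor" point, here is a hand-written sketch; it is not actual Corrode output. `scale_raw` mirrors a hypothetical C function `void scale(int *p, size_t n, int k)` line for line using raw pointers, and `scale` is the kind of safe version a later cleanup pass might arrive at. Using `wrapping_mul` is an assumption made here to give overflow a defined result; in C, signed overflow is undefined behaviour.

```rust
/// Direct, C-shaped translation: raw pointer plus length, no bounds checks.
///
/// Safety: `p` must point to `n` valid, writable `i32` values.
unsafe fn scale_raw(p: *mut i32, n: usize, k: i32) {
    let mut i = 0;
    while i < n {
        unsafe {
            *p.add(i) = (*p.add(i)).wrapping_mul(k);
        }
        i += 1;
    }
}

/// Refactored version: the slice carries its own length, so callers can no
/// longer pass a mismatched pointer and count.
fn scale(xs: &mut [i32], k: i32) {
    for x in xs.iter_mut() {
        *x = x.wrapping_mul(k);
    }
}

fn main() {
    let mut a = [1, 2, 3];
    unsafe { scale_raw(a.as_mut_ptr(), a.len(), 3) };

    let mut b = [1, 2, 3];
    scale(&mut b, 3);

    println!("{:?} {:?}", a, b);
}
```

Both functions compute the same thing; the difference is that only the second one lets the compiler rule out out-of-bounds writes.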

https://github.com/jameysharp/corrode is a C-to-Rust translator. It won’t be idiomatic Rust, though; not by a long shot. (Rust is much more restrictive in what it allows, for reasons of safety, though more expressive.) And it may not be correct, either; corrode is getting pretty good, but it’s not perfect yet.

Warning: rambling comment. A rambling post deserves a rambling answer, particularly if the rambling is in the wrong direction... http://xkcd.com/386/

> Every morning, I wake up, drink a glass of Soylent and recite the following: “Today, I will solve challenging problems. Tomorrow, I will also solve challenging problems. Every day, I will solve challenging problems, and then the robots will take over, and I will die a fulfilled man, and someone will post my obituary on Hacker News.”

That's this guy: https://alexvermeer.com/life-hacking/. Pretty sure he occupies a unique niche in the world though.

> I am creating

This statement is clearly false, as writing blog posts is in no way related to writing a transpiler. Typically all coding posts are post-mortems.

> a JavaScript to Rust transpiler in Haskell,

I guess this is a takeoff on the C-to-Rust translator (https://github.com/jameysharp/corrode)? (and all the compile-to-JS projects such as GWT, emscripten etc.). But those have actual use cases, whereas going from front-end to deep back-end / systems programming seems like a stretch.

It's true there are a lot of programming language and toolkit posts / flames on HN, and I admit they're often kind of shallow. But a small discussion is better than none at all, and it occasionally gets someone knowledgeable to contribute. But there's a qualitative difference between lots of people spending a little time on a subject versus a few people spending a lot of time on a subject. The first is often productive in a wisdom-of-the-crowds sort of way, while the second is only worthwhile if it's some kind of meeting or contract negotiation with external resources at stake. If you're exerting large amounts of energy arguing on HN then I would venture to say that you're using it wrong.

> will I be able to have conversations with normal human beings again?

I take offense at this. There is no reason you would want to talk to "normal human beings"; they are fundamentally disassociated from their internal desires. See https://youtu.be/eJ3RzGoQC4s?t=4107 (Century of the Self, part 2, Anna's project to create normal human beings). There are strong elements of this thinking in recent news surrounding the elections, so it's not surprising you would have been swept in. But the fact is that "normal human beings" are a phantom: http://ijr.com/2015/06/354635-epa-administrator-says-half-am.... Best you can do is segmentation, e.g. http://c.ymcdn.com/sites/dema.site-ym.com/resource/resmgr/Me....

> put an end to the godforsaken monotony day after fucking day > Will I be able to feel? If you cut me, will I bleed? > Please Just End My Suffering > Please save me

Sounds like a metal/rock band: https://www.youtube.com/watch?v=5NZsCYOM4j0 https://www.youtube.com/watch?v=2okd9UHLExY https://www.youtube.com/watch?v=XblNnon-XTc https://www.youtube.com/watch?v=BZg-72u-QpI. Maybe that was the point of this post, as some sort of experimental art.

> Be on the lookout for a stable release soon.

Software doesn't have stable releases: https://blog.codinghorror.com/the-infinite-version/. It has stable channels and (occasionally) pinned versions. But a transpiler is developer-oriented so would never have a stable (product) channel at all.

> I see clouds in the sky, and green grass. It has been fifty years and I am sitting in the park with my dog, feeding ducks and watching the local children at play. I haven’t uttered the phrase “type-safety” in years. All the startups are gone. I am free.

The Matrix Revolutions ending is better: https://www.youtube.com/watch?v=qTnBVDKuNdI

It's a shame, really, but if you could write a decent C to safe Rust transpiler, then Rust probably would not exist.

Lifetime inference etc. is a hard problem, and if it could be done right now, compilers would just have static checks for all of that.

There is a C to UNSAFE Rust transpiler, though: https://github.com/jameysharp/corrode.

> Hence the comment "With all the usual disclaimers about the inaccuracy of benchmarks". But it still gives a good, practical starting point for discussions. You're welcome to provide a counter data set to further the discussion.

There is nothing you can do in C++ that you can't do in Rust as far as memory is concerned. The compiler backends are even identical! You can drop libstd if you want in Rust, which you probably would in the microcontroller use case. You can even translate C to Rust [1], which should result in virtually identical LLVM IR!

That's what's so frustrating about throwing out benchmarks game numbers: the languages are isomorphic, so you end up ultimately comparing things like jemalloc implementations.

> Like ATMega? If an architecture is supported in modern versions of GCC, there's a good chance that it's needed by somebody. And that doesn't count the dozens of specialized C compilers for other non-standard architectures. Modern banks are still running Cobol on mainframes, after all, and microcontrollers are everywhere.

Why is COBOL on mainframes relevant? We're talking about C++ here. (Anyway, if you want to bring up mainframes, IBM has a working SystemZ backend for LLVM.)

Talking about "microcontrollers" is too broad of a brush. A lot of microcontrollers are ARM (or MIPS, etc.). Rust runs just fine on those.

> Ultimately, with enough time and money, Rust is capable of competing with C for the embedded space, making a lot of embedded developers happy. But not in the foreseeable future.

This is again way too strong of a statement, because we have people using Rust right now for embedded IoT use cases (including some at Mozilla!). It all depends on what you need. If LLVM supports your architecture (which it probably does) then great! If it doesn't, then let's talk about the specific architecture you need and what you need it for, rather than making blanket "Rust is a non-starter for embedded use" statements.

We're never going to get support for every architecture anyone can come up with that C has ever run on. But who cares? What matters is whether Rust runs on a platform you were seriously considering using Rust for. If it doesn't, then we can get that fixed; chances are if you want that architecture, someone else using LLVM does too.

[1]: https://github.com/jameysharp/corrode

I hate that way of looking at it because I think it's totally wrong.

C (this view is usually presented as only about c) isn't fundamental or bedrock in any way. Most of our current stacks just happen to be written in it.

The original Mac OS for instance was written in Pascal and C. The most popular open source compilers are written in c++. Most browsers are written in c++ not in c.

And there's no reason these days you couldn't rewrite LLVM (a set of compiler libraries and compilers mostly written in C++; note that the Rust compiler uses LLVM for code generation, but its frontend is in Rust) in Haskell or Java or JavaScript (they'd be slower, but they would work). Haskell has useful features for writing compilers and a lot of language analysis libraries already. If you rewrote it in Rust it'd be just as fast.

Hell, this guy (https://github.com/jameysharp/corrode) wrote a C-to-Rust source-to-source compiler in literate Haskell. Also, in a few years Servo will be a full browser in Rust, and Firefox is already adding components written in Rust.

Other than the OS interface and C binary API linking, there is nothing special about C or C++.