What does HackerNews think of hail?

Scalable genomic data analysis.

Language: Python

#164 in Python
#6 in Scala

I don't have any funding to hire right now, but I'm always happy to chat about the industry and my experience building Hail (https://hail.is, https://github.com/hail-is/hail), a tool widely used by folks with large collections of human sequences.

The other posters are not wrong about compensation. Total compensation is off by a factor of two to three.

However, it is absolutely possible to work with a group of top-notch engineers on serious distributed systems & compilers in service of an excellent scientific-user experience. I know because I do. We are lucky to have a PI who respects and hires a diversity of expertise within his lab.

I enjoy being deeply embedded with our users. I do not have to guess what they need or want because I help them do it every day.

I also enjoy enmeshing engineering with statistics, mathematics, and biology. Work is more interesting when so many disciplines conspire towards the end of improved human health.

Hail at the Broad Institute of MIT and Harvard | Software Engineer | Boston, MA | ONSITE | https://hail.is, https://broadinstitute.org

The Broad Institute of MIT and Harvard was launched in 2004 to improve human health by using genomics to advance our understanding of the biology and treatment of human disease, and to help lay the groundwork for a new generation of therapies.

The Hail team's mission is to build tools to enable rapid analysis and exploration of biological datasets (100s of TB and tripling yearly). We are committed to open science and everything we do is open source. We currently develop in Python, Scala/Java, and C/C++ and use Spark, Kubernetes, Google Cloud Platform (GCP) and AWS, but will use any tools we need to get the job done. Come help us build the future of big scientific data analysis.

We have two positions:

Update: The Site Reliability Engineer position has been filled.

We also have a front-end/designer position that will be posted shortly. My email is below; get in touch if you're interested.

You don't need experience in biology or our particular technologies. We work in a highly multi-disciplinary environment (with software engineers, biologists, bioinformaticians, doctors, operations, statisticians, etc.). Self-improvement is a fundamental part of our culture. You must be excited to be challenged and to learn new things.

I'm the hiring manager. Get in touch with me directly if you have any questions: [email protected].

You can learn more about the project here: https://hail.is, https://github.com/hail-is/hail

We are one of several software engineer groups at the Broad that are hiring. You can find more positions here: https://broadinstitute.wd1.myworkdayjobs.com/broad_institute

Hmm, I got the feeling that in genomics things were moving towards Python too, e.g. https://github.com/hail-is/hail.

Yes! I was just thinking about this, actually. We're building scalable tools for analyzing genetic data on Spark:

https://hail.is https://github.com/hail-is/hail

It's tricky because we need someone who can write, has technical knowledge (Python, the Apache big data stack), and has some knowledge of bioinformatics and statistical genetics. There might be an option for some paid work. What's the going rate for technical writing?

Hail

GitHub: https://github.com/hail-is/hail

dev gitter: https://gitter.im/hail-is/hail-dev

Starter issues (a bit sparse right now): https://github.com/hail-is/hail/labels/starter

Hail is a scalable framework for massive data analysis. It's written in Scala and built on Spark and the Hadoop ecosystem.

I'm a software engineer/mathematician, but we're embedded in a world-class genetics research lab. The first paper using Hail was put out recently:

http://biorxiv.org/content/early/2016/06/06/050195

with more in the pipeline. Hail is being used to analyze some of the largest genetic datasets out there (hundreds of thousands of exomes and tens of thousands of whole genomes). There's tons to do. Jump in or email me (see my profile) if you'd like to get involved. If you tell me what you're interested in, I will try to tailor a task for you. No bio knowledge needed.

License: MIT

P.S. We found one other contributor and our last hire through HN. So, thanks, HN!