Does anyone know how this differs from other languages like Stan?

Both use Hamiltonian Monte Carlo. As far as I know, Stan cannot model factors but can restrict a variable to integers. In that case, Turing.jl and Stan do the same thing and fall back to slower MCMC.

So why should I switch to Turing.jl?

Their wiki shows that Turing.jl is roughly 3 to 33 times slower than Stan, depending on the model (on the order of 10 times overall):

  Results
  Model                               Turing   Stan  Ratio
  Simple Gaussian Model                  1.2   0.06  20.24
  Bernoulli Model                       1.53   0.05  32.73
  School 8                              2.34    0.1  24.41
  Binormal (sampling from the prior)    0.88   0.11   8.37
  Kid                                  32.66   4.72   6.92
  LDA-vec                              23.94   3.78   6.34
  LDA                                  72.78   3.78  19.28
  Mixture-of-Categorical               22.28   6.41   3.48
  NOTE 1
  Numbers here are inference times in seconds; smaller numbers indicate better performance.
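
For anyone who wants to sanity-check the overall slowdown rather than eyeball it, here is a small Python sketch that summarizes the Ratio column of the table above. The numbers are copied from the table; the function names (`summarize`, `recomputed_ratios`) and the choice of the geometric mean as the summary are my own assumptions, not anything from the wiki.

```python
import math

# (model, Turing seconds, Stan seconds, ratio as reported in the table)
ROWS = [
    ("Simple Gaussian Model", 1.2, 0.06, 20.24),
    ("Bernoulli Model", 1.53, 0.05, 32.73),
    ("School 8", 2.34, 0.1, 24.41),
    ("Binormal (sampling from the prior)", 0.88, 0.11, 8.37),
    ("Kid", 32.66, 4.72, 6.92),
    ("LDA-vec", 23.94, 3.78, 6.34),
    ("LDA", 72.78, 3.78, 19.28),
    ("Mixture-of-Categorical", 22.28, 6.41, 3.48),
]


def summarize(rows):
    """Return min, max, median and geometric mean of the reported ratios."""
    ratios = sorted(row[3] for row in rows)
    n = len(ratios)
    mid = n // 2
    median = ratios[mid] if n % 2 else 0.5 * (ratios[mid - 1] + ratios[mid])
    # The geometric mean is the usual way to average ratios.
    geo_mean = math.exp(sum(math.log(r) for r in ratios) / n)
    return {"min": ratios[0], "max": ratios[-1], "median": median, "geo_mean": geo_mean}


def recomputed_ratios(rows):
    """Recompute Turing/Stan from the rounded timings shown in the table.

    These can differ slightly from the reported ratios, which were
    presumably computed from unrounded timings (an assumption).
    """
    return {name: turing / stan for name, turing, stan, _ in rows}


if __name__ == "__main__":
    for key, value in summarize(ROWS).items():
        print(f"{key}: {value:.2f}")
```

The geometric mean of the eight ratios comes out near 12, but the spread is wide enough that a single number hides a lot.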

We're setting up parameter estimation in DifferentialEquations.jl with both Stan.jl and Turing.jl, as separate but connected projects. On data generated from the Lotka-Volterra equations, Turing.jl seems to recover the four parameters well within minutes, whereas Stan ran for a few hours and didn't get them right. It's hard to know whether that comes from the ODE solver and its accuracy or from the choice of MCMC method. Stan has to use its internal solvers, whereas Turing.jl gets to use our whole stack. For testing we chose Tsit5, which should be roughly equivalent to their rk45, but without knowing the details of their solver it's hard to say how different it truly is. One major advantage of Turing.jl, though, is that it lets us use our full stack: Distributions.jl and all of JuliaDiffEq (ODE/SDE/DAE/DDE/etc. solvers), whereas Stan requires you to use one of its two ODE solvers. So we're going to finish this DiffEq->Stan automatic generation bridge for testing and benchmarking purposes, but the Turing.jl route looks much more promising. They have done a really good job.
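
To make the shape of that experiment concrete, here is a minimal, self-contained Python sketch of the same kind of problem: simulate Lotka-Volterra data, add noise, and sample the four parameters with MCMC. It is not the DifferentialEquations.jl, Turing.jl, or Stan code. It swaps in a fixed-step classical RK4 integrator for Tsit5/rk45 and a plain random-walk Metropolis sampler for HMC. The true parameter values, initial state, noise level, flat priors, step size, and all function names are illustrative assumptions.

```python
import math
import random


def lotka_volterra(x, y, params):
    """Right-hand side of the Lotka-Volterra equations."""
    a, b, c, d = params
    return a * x - b * x * y, -c * y + d * x * y


def solve_rk4(params, x0=1.0, y0=1.0, t_end=10.0, dt=0.05, save_every=20):
    """Integrate with fixed-step classical RK4; return the saved (x, y) states."""
    x, y = x0, y0
    saved = []
    for step in range(1, int(round(t_end / dt)) + 1):
        k1x, k1y = lotka_volterra(x, y, params)
        k2x, k2y = lotka_volterra(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y, params)
        k3x, k3y = lotka_volterra(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y, params)
        k4x, k4y = lotka_volterra(x + dt * k3x, y + dt * k3y, params)
        x += dt / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x)
        y += dt / 6.0 * (k1y + 2.0 * k2y + 2.0 * k3y + k4y)
        if step % save_every == 0:
            saved.append((x, y))
    return saved


def make_data(true_params, sigma, rng):
    """Simulate the model and add Gaussian noise with standard deviation sigma."""
    return [(x + rng.gauss(0.0, sigma), y + rng.gauss(0.0, sigma))
            for x, y in solve_rk4(true_params)]


def log_posterior(params, data, sigma, upper=5.0):
    """Gaussian log-likelihood plus a flat prior on (0, upper) per parameter."""
    if any(p <= 0.0 or p >= upper for p in params):
        return -math.inf
    sse = 0.0
    for (x, y), (dx, dy) in zip(solve_rk4(params), data):
        rx, ry = x - dx, y - dy
        sse += rx * rx + ry * ry
    # A diverged integration yields inf or nan; treat it as zero probability.
    return -0.5 * sse / (sigma * sigma) if math.isfinite(sse) else -math.inf


def metropolis(data, sigma, start, n_iter, step, rng):
    """Random-walk Metropolis; returns the list of sampled parameter tuples."""
    current = tuple(start)
    current_lp = log_posterior(current, data, sigma)
    chain = []
    for _ in range(n_iter):
        proposal = tuple(p + rng.gauss(0.0, step) for p in current)
        proposal_lp = log_posterior(proposal, data, sigma)
        # Accept with probability min(1, exp(proposal_lp - current_lp)).
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain


if __name__ == "__main__":
    rng = random.Random(0)
    true_params = (1.5, 1.0, 3.0, 1.0)  # assumed values, not from the thread
    sigma = 0.1                         # assumed noise level
    data = make_data(true_params, sigma, rng)
    chain = metropolis(data, sigma, start=(1.2, 1.2, 2.5, 1.2),
                       n_iter=2000, step=0.02, rng=rng)
    kept = chain[len(chain) // 2:]      # discard the first half as warm-up
    means = [sum(s[i] for s in kept) / len(kept) for i in range(4)]
    print("posterior means:", [round(m, 2) for m in means])
```

Every posterior evaluation costs one full ODE solve, so both the solver's accuracy and the sampler's efficiency feed directly into run time and into whether the parameters are recovered. That is why it is hard to attribute a difference between two systems to one or the other.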

I'm surprised you're able to get Turing to run faster, to be honest. I looked at Turing and was impressed by it, but in general my experience with MCMC libraries in Julia matches what the OP posted: they are much slower than Stan. MAMBA and Turing are the two Julia libraries I've tried, and both have seemed nice but have run significantly slower on the tasks I've used them for.

Incidentally, I haven't been able to get Turing to run on Julia 0.6, and am not going to reinstall an earlier version just for Turing. I've been waiting for a little while now.

My experience with Turing and MAMBA has somewhat diminished my enthusiasm for Julia. Both libraries represent what I was looking for in Julia (similar to what you mention about using your Julia stack), but the speed was a rude awakening. I'm coming to the opinion that LLVM-based languages need to demonstrate much more consistent performance before they're really ready to replace C (Rust, Nim, and Crystal seem like they might be on their way, though).

I recommend that anyone thinking about Julia pay attention to these benchmarks:

https://github.com/kostya/benchmarks

Yes, Julia is really fast with some things, but for other things it's much slower, and those slow parts become the lowest common denominator.

Hopefully things improve, though, because I do like Turing and MAMBA on paper.