What does HackerNews think of prql?
PRQL is a modern language for transforming data — a simple, powerful, pipelined SQL replacement
This is off-topic, but we're always looking for compiler people at PRQL (https://prql-lang.org/) to help us build a query language for the next 50 years.
Come and take a look if that's something that floats your boat: https://github.com/prql/prql
I'm hoping that PRQL [0] will one day become the universal API for tabular/relational data. We're targeting SQL as the backend in the first iteration given its universality, but there are early plans to support other backends.
You can already use PRQL with Pandas, the tidyverse, shell and pretty much any database. See my presentation [1]. PRQL reads very similarly to dplyr, and in my (biased) opinion, actually a bit better than dplyr because it can do away with some of the punctuation due to being its own language.
For questions, see our Discord [2], and if you would like to see PRQL in more places, file an issue on GitHub [3].
Some examples below:
## Pandas
```python
#!pip install pyprql
import pandas as pd
import pyprql.pandas_accessor
df = pd.read_csv("data/customers.csv")
df.prql.query('filter country=="Germany"')
```
## tidyverse
```sh
mkdir -p ~/.local/R_libs
R -q -e 'install.packages("prqlr", repos = "https://eitsupi.r-universe.dev", lib="~/.local/R_libs/")'
```
```R
library(prqlr, lib.loc="~/.local/R_libs/")
library("tidyquery")
"
from mtcars
filter cyl > 6
sort [-mpg]
select [cyl, mpg]
" |> prql_to_sql() |> query()
```
## PRQL
```prql
from employees
filter start_date > @2021-01-01
derive [
  gross_salary = salary + (tax ?? 0),
  gross_cost = gross_salary + benefits_cost,
]
filter gross_cost > 0
group [title, country] (
  aggregate [
    average gross_salary,
    sum_gross_cost = sum gross_cost,
  ]
)
filter sum_gross_cost > 100_000
derive id = f"{title}_{country}"
derive country_code = s"LEFT(country, 2)"
sort [sum_gross_cost, -country]
take 1..20
```
[0]: https://prql-lang.org/
[1]: https://github.com/snth/normconf2022/blob/main/notebooks/nor...
I think if you were going to do SQL over you would probably do it that way. SQL is definitely not perfect or optimal, just ubiquitous.
While it doesn't have the same spelling, pronunciation is the same.
`FROM tablename t SELECT t.`
and some form of autocomplete mechanism, either prefills all the column names from table `t` or suggests the list of columns and/or types associated with it.
This is much better than having to:
1. Run a SELECT with LIMIT statement just to get an idea of the layout.
2. Point and click through the IDE treeview.
Honestly, I don't think it helps a whole lot beyond this functionality, but I can see why folks who are accustomed to thinking in functional pipelines (from -> select -> map -> filter -> collect) can prefer this way of querying.
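The autocomplete argument above can be sketched with Python's built-in `sqlite3` module: once the table is known, a tool can look up its columns from the catalog instead of running a throwaway query. This is only an illustration under assumptions. The `customers` table, its columns, and the `columns_for` helper are hypothetical, and a real editor would talk to whatever database it is connected to.

```python
import sqlite3


def columns_for(conn, table):
    """Return (name, declared_type) pairs for `table`, in definition order."""
    # pragma_table_info is SQLite's table-valued form of PRAGMA table_info,
    # so the table name can be passed as a bound parameter.
    rows = conn.execute(
        "SELECT name, type FROM pragma_table_info(?) ORDER BY cid",
        (table,),
    ).fetchall()
    return [(name, col_type) for name, col_type in rows]


# Hypothetical table, created in memory purely for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, country TEXT)")

# What an editor could offer right after the user has typed the table name.
suggestions = columns_for(conn, "customers")
print(suggestions)
```

With the table named first, the lookup needs nothing but the table name, which is the point the comment makes about `FROM`-first syntax.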
I think PRQL is one attempt at building something this way.[1]
It supports:
- functions,
- using an alias in same `select` that defined it,
- trailing commas,
- date literals, f-strings and other small fixes for things we found unpleasant in SQL.
https://lang.prql.builders/introduction.html
The best part: it compiles into SQL. It's under development, though we will soon be releasing version 0.2, which would be the "you can check it out" version.
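To make "compiles into SQL" concrete, here is a toy sketch of the idea in Python. It is not PRQL's compiler and does not use any PRQL API. The `to_sql` function and the `(verb, argument)` step format are made up for illustration, and it only understands `from`, `filter`, `select` and `take`.

```python
def to_sql(steps):
    """Translate a list of (verb, argument) steps into one SELECT statement.

    Toy illustration only: it handles from/filter/select/take and nothing else.
    """
    table = None
    filters = []
    columns = ["*"]
    limit = None
    for verb, arg in steps:
        if verb == "from":
            table = arg
        elif verb == "filter":
            if limit is not None:
                # A filter after `take` would need a subquery; out of scope here.
                raise NotImplementedError("filter after take is not supported")
            filters.append(arg)
        elif verb == "select":
            columns = list(arg)
        elif verb == "take":
            limit = int(arg)
        else:
            raise ValueError(f"unknown step: {verb}")
    if table is None:
        raise ValueError("pipeline needs a 'from' step")

    # SQL wants its clauses in a fixed order, whatever order the steps came in.
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if filters:
        sql += " WHERE " + " AND ".join(f"({f})" for f in filters)
    if limit is not None:
        sql += f" LIMIT {limit}"
    return sql


# Steps written in pipeline order: source first, then each transformation.
steps = [
    ("from", "mtcars"),
    ("filter", "cyl > 6"),
    ("select", ["cyl", "mpg"]),
    ("take", 5),
]
print(to_sql(steps))  # SELECT cyl, mpg FROM mtcars WHERE (cyl > 6) LIMIT 5
```

The steps are written top to bottom in the order they apply, and the translator is what rearranges them into SQL's clause order.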
Listing columns first, then table, then join, then filter, then group bys, then limit.
The order of operations is out of whack and makes pipelining a little hard.
I found this to be closer to the LINQ way.
I hope in the near future databases will come with better query languages...
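For comparison, the pipeline ordering the comment prefers looks roughly like this in plain Python, where each step consumes the output of the previous one. The rows and field names are invented for the example.

```python
from itertools import islice

# Hypothetical rows standing in for a table.
rows = [
    {"title": "engineer", "country": "DE", "salary": 100},
    {"title": "intern", "country": "FR", "salary": 20},
    {"title": "manager", "country": "US", "salary": 150},
    {"title": "analyst", "country": "DE", "salary": 90},
]

# Source first, then filter, then a derived column, then a limit:
# the code reads in the same order in which the data flows.
source = iter(rows)
filtered = (r for r in source if r["salary"] > 50)
derived = ({**r, "id": f'{r["title"]}_{r["country"]}'} for r in filtered)
result = list(islice(derived, 2))

print([r["id"] for r in result])  # ['engineer_DE', 'manager_US']
```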
There's no server process: it's all built in GitHub Actions, hosted in GitHub Pages, and runs in the browser. The whole WASM code is 164 lines (of which half are comments).
[1]: https://github.com/prql/prql
[2]: https://lang.prql.builders/editor.html