There are things I've come to dislike and avoid when programming in general:

- Avoid programming in strings (especially in Bash, where nested quotes are full of pitfalls)

- Avoid magic switches that change behavior (like -F)

- Avoid terse or cryptic variable names (like $NF)

- Avoid terse and magical syntax (sorry Perl, happy to leave you behind)

- Avoid programs that are hard to read

- Avoid programs that are difficult to debug while writing them

- Avoid programs that ignore types

For these reasons, I prefer to avoid awk for anything except the most trivial of tasks. I think the prevalence of scripting languages and the speed of execution and debugging today have made awk less necessary than it may have been in the 70s. As to the first point, I'm aware you can write awk scripts in files, but I feel that if your script has gotten complex enough to need a file, you're creating something unmaintainable and unreadable that would be better written in a different programming language.

Edit: I should add this article is great and a good introduction to awk, regardless of my personal taste for the tool.

jrumbut

The thing that prevents awk from being a major part of my daily routine is that it (amazingly) has poor CSV support. Consider the following:

  col1,col2,col3
  1,2,3
  4,"hello, \"world\"",6
  "7 buckets",,9

To get the usual awk experience with this very common file format, exactly the type of thing you want to parse with awk, you first need to install gawk, then use a big FPAT regex that needs to be adjusted for any new CSV variant.

I would love to see awk with a "CSV mode", where it intelligently handles formats like this if you just pass a flag. I think awk would do well to differentiate itself with excellent 2D dataset parsing functionality, but at least catching up to the average scripting language would be great.

I'm half expecting someone to say "just pass -csv it does what you want" and if so I'll be very excited.
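
For comparison, here is a minimal sketch of what "the average scripting language" gives you for the sample above, using Python's standard csv module. The dialect settings (backslash as the escape character, no doubled quotes) are my assumption about this particular variant. A file that escapes quotes by doubling them would need different settings, which is the same "adjust for every CSV variant" problem, just with less regex.

```python
import csv
import io

# The sample from above; a raw string keeps the backslashes literal.
sample = r'''col1,col2,col3
1,2,3
4,"hello, \"world\"",6
"7 buckets",,9
'''

# Assumption: this variant escapes quotes with a backslash rather than
# doubling them, so tell the reader about that explicitly.
reader = csv.reader(io.StringIO(sample), escapechar="\\", doublequote=False)
rows = list(reader)

# Print the second column of every data row, roughly `{ print $2 }` in awk.
for row in rows[1:]:
    print(row[1])
```

The embedded comma and the escaped quotes in the second data row come through as part of a single field, and the empty middle field of the last row is preserved as an empty string.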

dbro

There is a small program I wrote called csvquote[1] that can be used to sanitize input to awk so it can rely on delimiter characters (commas) to always mean delimiters. The results from awk then get piped through the same program at the end to restore the commas inside the field values.

In principle:

  cat textfile.csv | csvquote | awk -f myprogram.awk | csvquote -u > output.csv

Also works for other text processing tools like cut, sed, sort, etc.
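
The sanitize/restore idea can be sketched roughly as follows. This is an illustration of the principle in Python, not a port of csvquote: the function names are made up for the example, the stand-in characters (ASCII unit and record separators) are an assumption, and it assumes quotes inside fields are escaped by doubling them.

```python
FIELD_SEP = ","
RECORD_SEP = "\n"
# Assumption: nonprinting stand-ins that should not occur in real data.
SUB_FIELD = "\x1f"
SUB_RECORD = "\x1e"


def sanitize(text: str) -> str:
    """Replace commas and newlines that occur inside double-quoted fields."""
    out = []
    in_quotes = False
    for ch in text:
        if ch == '"':
            # A doubled quote ("") toggles twice, so the state is unchanged.
            in_quotes = not in_quotes
            out.append(ch)
        elif in_quotes and ch == FIELD_SEP:
            out.append(SUB_FIELD)
        elif in_quotes and ch == RECORD_SEP:
            out.append(SUB_RECORD)
        else:
            out.append(ch)
    return "".join(out)


def restore(text: str) -> str:
    """Put the original commas and newlines back."""
    return text.replace(SUB_FIELD, FIELD_SEP).replace(SUB_RECORD, RECORD_SEP)


line = '4,"hello, ""world""",6'
safe = sanitize(line)
fields = safe.split(",")  # naive splitting, as a delimiter-based tool would do
print(len(fields))
print(restore(fields[1]))
```

After sanitizing, every remaining comma is a real delimiter, so the naive split sees three fields, and restoring afterwards gives back the original text.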

[1] https://github.com/dbro/csvquote