Integrated DB Access From the Shell

One of the main things I do in my scripts regularly is work with databases using their cli.

psql -d "$DSN" -t -c "select id, email from guest" | while IFS=\| read id email
do
    valid=$((is_email_valid $email))
    psql -d "$DSN" -t -c "update guest set is_valid='$valid' where id=$id"
done

Even though this is clunky, for a one-off script iterating on a few records, this approach works. It does not scale, however, due to violating the golden rule.

Some of the problems with this approach are:

  1. Ripe for SQL-injection attack. We are essentially building SQL queries by string concatenation like the early 2000s and are prone to SQL-injection attacks as a result.
  2. Poor DB Performance. No prepared queries and persistent DB connections means, each update will have to establish a new DB connection and prepare/execute the query.
  3. No transaction control. Since each update is forced to establish a new DB connection, you effectively cannot batch updates.
  4. Poor Scaling. Since each update has to launch a child process, the overhead for a large input set piles up quickly and will almost become unworkable for a production issue.
  5. Database Driver is Hard-Coded. Try changing the database driver from psql to isql and you end up doing a lot of manual refactoring of the code base.
  6. Not easily usable interactively. Using the native db utilities in this way is not very comfortable for a terminal user. For example, I might want the headers and labels if I am outputting to my terminal, but would want a CSV when redirecting to a file. As a user, I would need to remember the flags for each of those operations.

This approach also does not yield well to idiomatic shell scripting. Here is my attempt to do this in an idiomatic way.

DSN="postgres"
SELECT id, email from guest | while read id email
do
    valid=$((is_email_valid $email))
    echo "$id" "$valid"
done | UPDATE guest set is_valid = $2 where id = $1

This relies on the capability of modern shells to define custom built-ins. Combine this with KEYBD traps and you can do some fun things interactively too.

# Download the results to a csv file
$ OFS=, SELECT \* from guest > guest.csv
2 rows selected

$ cat guest.csv
1,test@test.com,Y
2,test1@test.com,N

# Delete the data
DELETE from guest
2 rows deleted

# Bulk load data from a csv file
$ IFS=, INSERT into guest < guest.csv
2 rows inserted

# Dump data on the terminal from a query
# (will automatically use PAGER to give a sane user-experience)
$ SELECT \* from guest

id          1
email       test@test.com
is_valid    Y

id          2
email       test1@test.com
is_valid    N

2 rows deleted

Here is the keyboard trap I use with ksh to make typing these commands on the keyboard convenient and auto-escaping some of the special characters.

## The trap to use for keyboard bindings (KEYBD) to properly escape SQL queries
function db_keybd
{
   ##############################################
   ## In the code below type Ctrl-V+Esc for ^[ ##
   ##############################################   
   if [[ ${.sh.edchar} == ' ' ]]; then
      ## Make the keyword consistent case,
      ## so you are not forced to type all upper-case for the keywords
      if [[ ${.sh.edtext} == [sS][eE][lL][eE][cC][tT] ]]; then
         .sh.edchar="^[0cwSELECT"
      elif [[ ${.sh.edtext} == [uU][pP][dD][aA][tT][eE] ]]; then
         .sh.edchar="^[0cwUPDATE"
      elif [[ ${.sh.edtext} == [dD][eE][lL][eE][tT][eE] ]]; then
         .sh.edchar="^[0cwDELETE"
      elif [[ ${.sh.edtext} == [iI][nN][sS][eE][rR][tT] ]]; then
         .sh.edchar="^[0cwINSERT"
      fi
   fi

   ## Escape the special characters, unless it is already escaped
   [[ ${.sh.edchar} == '*' || ${.sh.edchar} == $'\'' || ${.sh.edchar} == '"' || ${.sh.edchar} == '(' || ${.sh.edchar} == ')' ]] \
      && [[ ${.sh.edtext} == @(SELECT|UPDATE|DELETE|INSERT)\ * ]] \
      && [[ ${.sh.edtext:$((.sh.edcol-1)):1} != \\ ]] \
         && .sh.edchar=\\${.sh.edchar}
}

trap db_keybd KEYBD

In part 2 of this post we will look at how to develop the plugins to implement the actual functions.

Manuj Bhatia

Manuj Bhatia