Posts Tagged ‘awk’

Exclude items in a list in a bash script

Saturday, May 22nd, 2010

In a Linux bash script you can loop over a set of data, such as a list of directories or database table names.
But you do not always want to process every item in the list, so you need to filter it first.

In this example I will use a generated list of directory names.
Filtering the directories we want to exclude is done by removing them from the directory listing using sed.

DIRS=`ls -l --time-style="long-iso" $MYDIR | egrep '^d' | awk '{print $8}'`

# make a list of directories you want to exclude
DIREXCLUDE="dir1 dir3 dir5"

# now remove the excluded directories from the directory list:
# loop over the exclude list and strip each name from DIRS with sed
for EXCLUDE in $DIREXCLUDE
do
    DIRS=`echo $DIRS | sed "s/\b$EXCLUDE\b//g"`
done

# and finally loop over the cleaned up directory list.
for DIR in $DIRS
do
    echo "${DIR} :"
done
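
One thing to watch with the sed approach: \b is a word boundary, so an excluded name can also be cut out of a longer name (excluding dir1 would turn dir1-backup into -backup). Here is a minimal sketch of an alternative that compares whole names instead. The helper name filter_dirs and the sample directory names are made up for this illustration.

```shell
#!/bin/bash

# Hypothetical helper: print every name from the first list that does
# not appear in the second list. Both lists are whitespace-separated.
filter_dirs() {
    for name in $1
    do
        skip=0
        for exclude in $2
        do
            # Compare complete names, so "dir1" never matches "dir1-backup".
            if [ "$name" = "$exclude" ]
            then
                skip=1
                break
            fi
        done
        if [ "$skip" -eq 0 ]
        then
            printf '%s\n' "$name"
        fi
    done
}

# Sample data, made up for this example.
DIRS="dir1 dir1-backup dir2 dir3"
DIREXCLUDE="dir1 dir3"

for DIR in $(filter_dirs "$DIRS" "$DIREXCLUDE")
do
    echo "${DIR}"
done
```

With the sample data above only dir1-backup and dir2 are left. Like the sed version, this still assumes that directory names contain no spaces.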

* see a previous article about selecting directory names in a cron job.

List directory names in bash shell

Tuesday, April 6th, 2010

Here is a little trick you should know when selecting directories in a Linux bash script.

Getting a list of directory names in a bash shell is a simple task:

#!/bin/bash

MYDIR="/var/log"

DIRS=`ls -l $MYDIR | egrep '^d' | awk '{print $8}'`

# "ls -l $MYDIR"       = get a directory listing
# "| egrep '^d'"       = pipe to egrep and select only the directories
# "| awk '{print $8}'" = pipe the result from egrep to awk and print only the 8th field

# and now loop through the directories:
for DIR in $DIRS
do
    echo "${DIR}"
done

A script like this is usually needed for a job that needs to be scheduled with cron, for example to make nightly backups, scheduled svn updates etc.

But there is a little catch:
when a script runs from cron, its environment can differ from the one in your own shell, and the “ls -l” command may print its listing in a different layout.

The problem is the time format: it changes the number of columns in the listing, which throws off the field that the awk part of the script prints.

Sometimes you need to use: awk '{print $8}'

And in another environment you need to use: awk '{print $9}'
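
To see where the difference comes from, you can count the fields yourself. The two lines below are made-up samples, not real output, written the way “ls -l” might print the same directory with two different time formats.

```shell
#!/bin/bash

# Two made-up sample lines for the same directory in two time formats.
ISO_LINE='drwxr-xr-x 2 root root 4096 2010-04-06 10:15 apache2'
LOCALE_LINE='drwxr-xr-x 2 root root 4096 Apr  6 10:15 apache2'

# awk splits each line on runs of whitespace; NF is the number of fields.
ISO_FIELDS=$(echo "$ISO_LINE" | awk '{print NF}')
LOCALE_FIELDS=$(echo "$LOCALE_LINE" | awk '{print NF}')

# The name is the last field both times, but at a different position.
ISO_NAME=$(echo "$ISO_LINE" | awk '{print $8}')
LOCALE_NAME=$(echo "$LOCALE_LINE" | awk '{print $9}')

echo "iso format:    $ISO_FIELDS fields, name = $ISO_NAME"
echo "locale format: $LOCALE_FIELDS fields, name = $LOCALE_NAME"
```

The date in the first sample is one field and in the second it is two ("Apr" and "6"), so the directory name moves from field 8 to field 9.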

To avoid maintaining two versions of the same script, you need to make sure the time format is always the same, preferably without altering environment variables, which could have side effects.

The solution is something like this:
DIRS=`ls -l --time-style="long-iso" $MYDIR | egrep '^d' | awk '{print $8}'`

The --time-style="long-iso" option (a GNU ls option) makes sure that the date-time string has the same format in all environments.
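
If you would rather not depend on the layout of the listing at all, you can let the shell find the directories for you. This is only a sketch of that alternative, not the method used above: the helper name list_dirs is made up for this example, and /var/log is just the sample directory from the script.

```shell
#!/bin/bash

# Hypothetical helper: print the names of the subdirectories of "$1",
# one per line, using a shell glob instead of parsing "ls -l".
list_dirs() {
    for path in "$1"/*/
    do
        # Without subdirectories the pattern stays unexpanded; skip it.
        [ -d "$path" ] || continue
        path=${path%/}                # drop the trailing slash
        printf '%s\n' "${path##*/}"   # keep only the last path component
    done
}

MYDIR="/var/log"

for DIR in $(list_dirs "$MYDIR")
do
    echo "${DIR}"
done
```

Two differences to keep in mind: the glob also matches symbolic links that point to directories, and, like “ls -l” without extra options, it skips hidden directories. The final loop still splits on whitespace, so names with spaces remain a problem there.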