functions and error handling

Last time I tweaked the parser; tonight I'm going to move some of the code into functions. Many people think of functions as a way to reuse the same code, but they can also be used to separate logic into bite-sized chunks. This often leads to code that's easier to read and easier to modify later, both for you and for anyone else who wants to enhance the script.

#!/bin/bash
shopt -s nocaseglob
BASEDIR="/mnt/usb0/mp3/podCast"
FEEDS="${BASEDIR}/feeds.lst"
CACHEDIR="${BASEDIR}/cache"
LOGFILE="${BASEDIR}/log"

Not much has changed here, except that I've added two config variables, CACHEDIR and LOGFILE. The former will be used for caching the feed XML to help minimize network transfers, both for us and, more importantly, for the feed provider. The latter is for reporting errors: some of the functions will output lists of things to be piped into a loop, so we can't just echo the errors, or they would end up mixed into that output.

function log() {
    echo "${1}" >>"${LOGFILE}"
}

The first new function is the only one called more than once. It's nothing fancy; it just appends its first argument to the logfile. Be sure to enclose the error message in double quotes, otherwise you'll only see the first word in the log.
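To see why the quotes matter, here is a small sketch of the same log function writing to a throwaway file. The temporary LOGFILE and the MSG variable are assumptions made for the demo, not part of the script.

```bash
#!/bin/bash
# Hypothetical log location for this demo only.
LOGFILE="$(mktemp)"

function log() {
    echo "${1}" >>"${LOGFILE}"
}

MSG="wget failed for feed"
log ${MSG}      # unquoted: split into four arguments, only "wget" is ${1}
log "${MSG}"    # quoted: the whole message arrives as ${1}

cat "${LOGFILE}"
```

The first call leaves only the word wget in the log, while the second logs the full message.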

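function getlist() {
    # Quote the path so a feeds file with spaces in its name still works.
    if [ ! -f "${FEEDS}" ] ; then
        log "getlist cannot read feeds list: ${FEEDS}"
        return 1
    fi
    # Skip comment lines (starting with ; or #) and empty lines.
    grep -v -e '^[;#]' -e '^$' "${FEEDS}"
}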

Reading the feeds file has been moved from the end of the outer while loop into a function so we can add some error handling without affecting readability. In the future this could be expanded to handle a feed list that's more complex than one URL per line.
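As a rough illustration of how getlist behaves, the sketch below runs it once without a feeds file and once with a small sample list. The temporary directory and the example.com URLs are made up for the demo.

```bash
#!/bin/bash
# Demo paths are hypothetical; the real script derives them from BASEDIR.
WORKDIR="$(mktemp -d)"
FEEDS="${WORKDIR}/feeds.lst"
LOGFILE="${WORKDIR}/log"

function log() {
    echo "${1}" >>"${LOGFILE}"
}

function getlist() {
    if [ ! -f "${FEEDS}" ] ; then
        log "getlist cannot read feeds list: ${FEEDS}"
        return 1
    fi
    # Drop comment lines (starting with ; or #) and empty lines.
    grep -v -e '^[;#]' -e '^$' "${FEEDS}"
}

# First call: no list yet, so nothing is printed, the error goes
# to the log, and the return code is 1.
getlist
echo "return code without feeds.lst: $?"

# Second call: a sample list with comments and a blank line.
printf '%s\n' '# my feeds' 'http://example.com/a.xml' '' \
    '; disabled' 'http://example.com/b.xml' >"${FEEDS}"
getlist
```

Only the two URLs reach standard output, so a loop reading from getlist never sees the error text.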

function getfeed() {
    # If we aren't given a URL, bail out with a non-zero return code.
    if [[ "${1}" = "" ]] ; then
        log "getfeed called without an argument."
        return 1
    fi
    # Try creating the cache dir.
    if [ ! -d "${CACHEDIR}" ] ; then
        if ! mkdir -p "${CACHEDIR}" ; then
            log "Unable to make cachedir: ${CACHEDIR}"
            return 2
        fi
    fi
    # We have a URL and a place to put it, let's try to get it.
    # The cache file is named after the md5 hash of the URL.
    outfile="${CACHEDIR}/$(echo -n "${1}" | md5sum | awk '{print $1}')"
    if ! wget -N -q -O "${outfile}" "${1}" ; then
        log "wget exited with an error code trying to fetch: ${1}"
        return 3
    fi
    if [ ! -f "${outfile}" ] ; then
        log "cached feed went missing: ${outfile}"
        return 4
    fi
    cat "${outfile}"
}

It feels like I'm getting carried away with this function. From the outside it isn't much different from the old wget command in the second while loop of podcast.003.sh. On the inside I have added a cache directory for the XML. If the directory doesn't exist, we attempt to make it. The -N wget argument requests that wget check the Last-Modified header and not fetch the file unless the one on the server claims to be newer than the one on our disk. This should save bandwidth both for ourselves and, more importantly, for the provider of the feed. One caveat: the wget manual says that -N is not supported in combination with -O, so it is worth checking that the cached copy really is being reused.
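The cache file name is the md5 hash of the feed URL, which turns any URL into a flat name without slashes or query characters. A minimal sketch of just that step follows; cachepath is a hypothetical helper name, and the temporary CACHEDIR and the example.com URL are assumptions for the demo.

```bash
#!/bin/bash
# Hypothetical cache location for this demo only.
CACHEDIR="$(mktemp -d)"

# cachepath is a made-up helper, not part of the script above.
function cachepath() {
    # printf avoids a trailing newline, like echo -n in getfeed.
    printf '%s' "${1}" | md5sum | awk '{print $1}'
}

URL="http://example.com/feed.xml?format=rss"
outfile="${CACHEDIR}/$(cachepath "${URL}")"
echo "${URL} -> ${outfile}"
```

The same URL always maps to the same file, so a later run finds the copy left by an earlier one.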

function getmp3() {
    # Variables are quoted so spaces and wildcard characters print as-is.
    echo "Channel: ${CHANNEL}"
    echo "Title: ${TITLE}"
    echo "Link: ${LINK}"
    echo "Date: ${DATE}"
    echo "ENCL: ${ENCL}"
    echo "HREF: ${HREF}"
    echo ""
}

Still just a stub here.

# The href pattern lives in a variable and is used unquoted below,
# because since bash 3.2 a quoted pattern after =~ is matched literally.
HREF_RE='href=["'\''][^"'\'']*[.]mp3["'\'']'
while read URL ; do
    CHANNEL=""
    while read LINE ; do
        # Pull the tag name out of the first <...> on the line.
        TAG=$(echo "${LINE}" | sed -n 's/<\([^>\ ]*\).*/\1/p')
        # The first title seen in a feed is the channel title.
        if [[ "${CHANNEL}" = "" ]] && [[ "${TAG}" =~ "title" ]] ; then
            CHANNEL=$(echo "${LINE}" | sed -n -e 's/<title>\([^<]*\)<\/title>/\1/pi' | sed -e 's/[\r\n]//')
        fi
        case "${TAG}" in
            'title')
                TITLE=$(echo "${LINE}" | sed -n 's/<title>\([^<]*\)<\/title>/\1/pi')
                ;;
            'link')
                LINK=$(echo "${LINE}" | sed -n 's/.*<link>\([^<]*\)<\/link>/\1/pi')
                ;;
            'pubDate')
                DATE=$(echo "${LINE}" | sed -n 's/.*<pubDate>\([^<]*\)<\/pubDate>/\1/pi')
                ;;
            'enclosure')
                ENCL=$(echo "${LINE}" | sed -n 's/.*<enclosure url=["'\'']\([^"'\'']*\)["'\''].*/\1/pi')
                ;;
            '/item')
                getmp3
                ;;
        esac
        if [[ "${LINE}" =~ ${HREF_RE} ]] ; then
            HREF=$(echo "${LINE}" | sed -n 's/.*href=["'\'']\([^"'\'']*\).mp3["'\''].*/\1.mp3/pi')
        fi
    done < <(getfeed "${URL}")
done < <(getlist)

podcast.004.sh

Finally, the body of the script. You can see in the last two lines that we've replaced the commands with calls to the functions we defined above. I feel this is a lot cleaner and easier to read.
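The done < <(getlist) form at the end of each loop is process substitution. One practical effect, which is my reading rather than something stated above, is that the loop runs in the current shell, so variables set inside it are still there afterwards. A small sketch with a stand-in getlist that prints two made-up URLs:

```bash
#!/bin/bash
# Stand-in for the real getlist; the URLs are placeholders.
function getlist() {
    printf '%s\n' 'http://example.com/a.xml' 'http://example.com/b.xml'
}

# Piping into while runs the loop in a subshell by default,
# so the counter is lost when the loop ends.
COUNT=0
getlist | while read -r URL ; do
    COUNT=$((COUNT + 1))
done
echo "after pipe: ${COUNT}"

# Process substitution keeps the loop in the current shell.
COUNT=0
while read -r URL ; do
    COUNT=$((COUNT + 1))
done < <(getlist)
echo "after process substitution: ${COUNT}"
```

This matters for the script because getmp3 relies on CHANNEL, TITLE and the other variables set inside the loops.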

One thing I found out when I ran this script on an older computer is that the =~ operator seems to be a bash 3 thing, as it caused an error on bash 2.05b.0. If anyone knows in which version of Bash it was introduced, please send me a note. If you're stuck with an older version of Bash and cannot upgrade, the same results can be had using the following:

if echo "${LINE}" | grep 'href=["'\''][^"'\'']*[\.]mp3["'\'']' >/dev/null ; then
    HREF=$(echo "${LINE}" | sed -n 's/.*href=["'\'']\([^"'\'']*\).mp3["'\''].*/\1.mp3/pi')
fi
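To compare the two approaches side by side, here is a sketch that wraps each test in a small function and runs both on two sample lines. The function names hasmp3_regex and hasmp3_grep and the sample HTML are invented for the demo. The pattern is stored in a variable and used unquoted, because since Bash 3.2 a quoted pattern after =~ is matched as a literal string. On a Bash without =~, only the grep variant applies.

```bash
#!/bin/bash
# Same href pattern as the script, kept in a variable.
HREF_RE='href=["'\''][^"'\'']*[.]mp3["'\'']'

# Hypothetical helper: test a line with the =~ operator.
function hasmp3_regex() {
    [[ "${1}" =~ ${HREF_RE} ]]
}

# Hypothetical helper: the same test done with grep.
function hasmp3_grep() {
    echo "${1}" | grep "${HREF_RE}" >/dev/null
}

for LINE in '<a href="http://example.com/show.mp3">listen</a>' \
            '<a href="http://example.com/notes.html">notes</a>' ; do
    if hasmp3_regex "${LINE}" ; then
        echo "regex match: ${LINE}"
    fi
    if hasmp3_grep "${LINE}" ; then
        echo "grep match: ${LINE}"
    fi
done
```

Both helpers report the first line and stay silent on the second.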

I'm happy with how this script is turning out so far, and am actually using it to fetch podcasts with a primitive getmp3 function. In my next article I'll finish up the getmp3 function and add a simple playlist generator. I also plan to add the ability to add mp3tags as I've found too many podcasts have poor or no tags.
