This show has been flagged as Clean by the host.
Hi, this is your host, Archer72, for Hacker Public Radio.
In this episode I share some of my findings about a problem with the Newsboat naming of the HPR feeds,
which was brought up in comments about my Newsboat show, HPR4424.
hpr4424: How I use Newsboat for Podcasts: comment #6 : download-filename-format for HPR podcasts
Ken already had some findings of his own about the ccdn.php extension in the feed.
hpr4424: comment #10 : Summary of findings
I thought that this might be fixable on an individual basis, and set out to ask Claude.ai a few questions.
But first, some collaboration with Dave Morriss about a good renaming format. This was definitely more Dave’s work than mine, but together we came up with this.
You can tell Dave’s handiwork from the short variable names, which stem from his extensive experience on Unix-type machines in his university days.
#!/bin/bash
URL="$(cat /tmp/hpr-url.txt)"
echo "DEBUG URL: $URL" >> /tmp/hpr-debug.log
AUDIO_URL="$(curl -s "$URL" | grep -Eo 'https?://[^"]*\.(ogg|mp3)' | head -1)"
echo "DEBUG AUDIO: $AUDIO_URL" >> /tmp/hpr-debug.log
if [[ -z "$AUDIO_URL" ]]; then
echo "ERROR: Could not find audio URL from: $URL" >> /tmp/hpr-debug.log
exit 1
fi
# Changed destination to HPR-queue
DEST=~/podcasts/hub.hackerpublicradio.org/HPR-queue/
# Record files present before download
BEFORE="$(ls "$DEST"*.{ogg,mp3} 2>/dev/null | sort)"
wget -nc --content-disposition -P "$DEST" "$AUDIO_URL"
cd "$DEST" || exit 1
# Record filename just downloaded (new file not in BEFORE)
AFTER="$(ls "$DEST"*.{ogg,mp3} 2>/dev/null | sort)"
DOWNLOADED="$(comm -13 <(echo "$BEFORE") <(echo "$AFTER"))"
echo "DEBUG DOWNLOADED: $DOWNLOADED" >> /tmp/hpr-debug.log
~/bin/exif-rename-hpr-dave.sh
# Find renamed file — newest file that wasn't in BEFORE
AFTER_RENAME="$(ls "$DEST"*.{ogg,mp3} 2>/dev/null | sort)"
RENAMED="$(comm -13 <(echo "$BEFORE") <(echo "$AFTER_RENAME"))"
echo "DEBUG RENAMED: $RENAMED" >> /tmp/hpr-debug.log
if [[ -n "$RENAMED" ]]; then
echo "\"$AUDIO_URL\" \"$RENAMED\" downloaded" >> ~/.local/share/newsboat/queue
else
echo "WARN: Could not determine renamed file" >> /tmp/hpr-debug.log
echo "\"$AUDIO_URL\" \"$DOWNLOADED\" downloaded" >> ~/.local/share/newsboat/queue
fi
At first the question was about something simple. The input was a query on one of the lines from Kevie’s hpr4398 :: Command line fun: downloading a podcast, particularly the section "To get the latest episode of TuxJam":
wget `curl https://tuxjam.otherside.network/feed/podcast/ | grep -o 'https*://[^"]*ogg' | head -1`
Which I re-wrote to:
wget -nc -P ~/podcasts/TuxJam $(curl https://tuxjam.otherside.network/feed/ogg | grep -Eo 'https*://[^"]*ogg' | sort -u | xargs | head -1)
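The reason for using $() instead of backticks to enclose a command is that backticks are the older, discouraged form: the Bash manual calls it the old-style syntax, and the inner backticks need escaping when commands are nested.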
GNU Bash Reference Manual - 3.5.4 Command Substitution
-nc (--no-clobber) avoids re-downloading a podcast that is already present; -P specifies the download directory.
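As a small illustration of why $() is easier to live with, here is a sketch that nests one command substitution inside another, first with $() and then with backticks. The path is made up for the example and is not one from the show.

```shell
#!/bin/bash
# Hypothetical path, used only for illustration.
episode_path="/tmp/podcasts/TuxJam/episode.ogg"

# $() nests without any escaping.
feed_dir="$(basename "$(dirname "$episode_path")")"

# Backticks need the inner pair escaped with backslashes.
feed_dir_bt=`basename \`dirname "$episode_path"\``

echo "$feed_dir"
echo "$feed_dir_bt"
```

Both lines print the name of the directory holding the file, but the backtick version gets harder to read with every extra level of nesting.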
I went in a different direction than downloading TuxJam and asked to download the last 2 HPR shows, but head -2 did not work as expected. This turned out to be an issue with the placement of xargs, which was joining all the URLs onto one line and passing them to wget all at once.
Original:
wget -nc --content-disposition -P ~/podcasts/hub.hackerpublicradio.org/HPR-newsboat-test/ $(curl http://hackerpublicradio.org/hpr_ogg_rss.php | grep -Eo 'https*://[^"]*ogg' | sort -u | xargs | head -2)
New:
curl http://hackerpublicradio.org/hpr_ogg_rss.php | grep -Eo 'https?://[^"]*\.ogg' | sort -u | head -2 | xargs wget -nc --content-disposition -P ~/podcasts/hub.hackerpublicradio.org/HPR-newsboat-test/
Key fixes:
head -2 now runs before xargs, so head actually limits the list to 2 URLs
xargs wget ... moved to the end, after the list is already trimmed
https* → https? (the original would also match httpssss, etc.)
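Both fixes can be checked offline. This is only a sketch: the URLs are invented stand-ins for what grep would pull out of the feed, and nothing is downloaded.

```shell
#!/bin/bash
# Made-up URLs standing in for the grep output from the feed.
urls='https://example.org/hpr0001.ogg
https://example.org/hpr0002.ogg
https://example.org/hpr0003.ogg'

# xargs with no command runs echo, so all URLs end up on ONE line.
# head -2 then sees a single line and lets everything through.
joined_first="$(printf '%s\n' "$urls" | xargs | head -2)"

# head -2 first: the list is cut to two lines before anything joins it.
trimmed_first="$(printf '%s\n' "$urls" | head -2 | xargs)"

count_joined="$(printf '%s\n' "$joined_first" | wc -w | tr -d ' ')"
count_trimmed="$(printf '%s\n' "$trimmed_first" | wc -w | tr -d ' ')"
echo "xargs then head: $count_joined URLs"
echo "head then xargs: $count_trimmed URLs"

# 's*' means zero or more s, 's?' means zero or one.
bogus='httpssss://example.org/x.ogg'
loose="$(printf '%s\n' "$bogus" | grep -Eo 'https*://[^"]*ogg' || true)"
strict="$(printf '%s\n' "$bogus" | grep -Eo 'https?://[^"]*\.ogg' || true)"
echo "https* matched: '$loose'"
echo "https? matched: '$strict'"
```

The first pair of lines shows three URLs surviving when xargs comes first and two when head comes first. The looser pattern accepts the bogus scheme, while the stricter one matches nothing.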
Now I wanted the downloaded file to go to the queue file, located in ~/.local/share/newsboat/queue
After several iterations in Claude, it was determined that the audio URL was not being expanded by the %u in the download macro.
The solution was to add a /tmp file to hold the actual audio URL
macro d set browser "echo %u > /tmp/hpr-url.txt && ~/bin/download-and-rename-hpr.sh"; open-in-browser ; set browser "your-normal-browser"
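The script that this macro calls reads the saved URL back and searches that page for the first audio link. That extraction step can be tried without a network connection. The sketch below feeds an invented HTML fragment through the same grep pattern the script uses; the function name first_audio_url and the sample markup are my own assumptions for illustration.

```shell
#!/bin/bash
# first_audio_url is a hypothetical helper: it reads page text on stdin
# and prints the first http(s) URL ending in .ogg or .mp3, if any.
first_audio_url() {
    grep -Eo 'https?://[^"]*\.(ogg|mp3)' | head -1
}

# Invented sample markup; a real page would come from: curl -s "$URL"
sample_page='<a href="https://example.org/eps/hpr0001.mp3">MP3</a>
<a href="https://example.org/eps/hpr0001.ogg">OGG</a>
<a href="https://example.org/about.html">About</a>'

audio_url="$(printf '%s\n' "$sample_page" | first_audio_url)"
echo "$audio_url"
```

Because of head -1, whichever audio link appears first in the page wins, so here the mp3 link is picked rather than the ogg one.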
A few Claude questions later… After asking to add a function to the macro in order to add the resulting downloaded file to the queue, the file name hpr1234.ogg was being added to the queue instead of the renamed file.
Two things to fix:
ls -t runs before the rename has fully settled, or the glob isn’t matching the new filename format (which includes spaces and semicolons)
The queue file format is "url" "path" downloaded, so the entry needs to match that, with the downloaded status and the full renamed path including extension
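To make the queue line concrete, here is a minimal sketch that builds one entry in the "url" "path" downloaded layout. The helper name queue_line, the URL and the filename are all invented for the example, and it assumes the filename itself contains no double quotes.

```shell
#!/bin/bash
# queue_line is a hypothetical helper that prints one queue entry:
# quoted URL, quoted path, then the status word.
queue_line() {
    printf '"%s" "%s" downloaded\n' "$1" "$2"
}

# Invented URL and filename; the name has spaces and a semicolon on purpose.
entry="$(queue_line 'https://example.org/hpr0001.ogg' \
    '/tmp/HPR-test/hpr0001; A made up title.ogg')"
echo "$entry"
```

Quoting the path is what keeps the spaces and the semicolon in the renamed file from splitting the entry. In a real script the line would be appended to the queue file with >> rather than echoed.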
#!/bin/bash
# ~/bin/download-and-rename-hpr.sh
URL="$(cat /tmp/hpr-url.txt)"
echo "DEBUG URL: $URL" >> /tmp/hpr-debug.log
AUDIO_URL="$(curl -s "$URL" | grep -Eo 'https?://[^"]*\.(ogg|mp3)' | head -1)"
echo "DEBUG AUDIO: $AUDIO_URL" >> /tmp/hpr-debug.log
if [[ -z "$AUDIO_URL" ]]; then
echo "ERROR: Could not find audio URL from: $URL" >> /tmp/hpr-debug.log
exit 1
fi
DEST=~/podcasts/hub.hackerpublicradio.org/HPR-newsboat-test/
# Record files present before download
BEFORE="$(ls "$DEST"*.{ogg,mp3} 2>/dev/null | sort)"
wget -nc --content-disposition -P "$DEST" "$AUDIO_URL"
cd "$DEST" || exit 1
# Record filename just downloaded (new file not in BEFORE)
AFTER="$(ls "$DEST"*.{ogg,mp3} 2>/dev/null | sort)"
DOWNLOADED="$(comm -13 <(echo "$BEFORE") <(echo "$AFTER"))"
echo "DEBUG DOWNLOADED: $DOWNLOADED" >> /tmp/hpr-debug.log
~/bin/exif-rename-hpr-dave.sh
# Find renamed file — newest file that wasn't in BEFORE
AFTER_RENAME="$(ls "$DEST"*.{ogg,mp3} 2>/dev/null | sort)"
RENAMED="$(comm -13 <(echo "$BEFORE") <(echo "$AFTER_RENAME"))"
echo "DEBUG RENAMED: $RENAMED" >> /tmp/hpr-debug.log
if [[ -n "$RENAMED" ]]; then
echo "\"$AUDIO_URL\" \"$RENAMED\" downloaded" >> ~/.local/share/newsboat/queue
else
echo "WARN: Could not determine renamed file" >> /tmp/hpr-debug.log
echo "\"$AUDIO_URL\" \"$DOWNLOADED\" downloaded" >> ~/.local/share/newsboat/queue
fi
Using comm -13 on sorted before/after file lists is more reliable than ls -t: it finds exactly which file is new regardless of timing. Check /tmp/hpr-debug.log after the next run to see what DOWNLOADED and RENAMED show.
GNU Coreutils - Comm invocation
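The comm -13 trick can be tried safely in a scratch directory. This is a sketch with invented filenames, where touch stands in for the download and rename steps. It writes the two listings to files instead of using process substitution, which should give the same result.

```shell
#!/bin/bash
demo_dir="$(mktemp -d)"    # pretend download directory
list_dir="$(mktemp -d)"    # holds the two listings
touch "$demo_dir/old one.ogg" "$demo_dir/old two.mp3"

# Snapshot before the "download".
ls "$demo_dir" | sort > "$list_dir/before.txt"

# Stand-in for wget plus the rename script: a new, awkwardly named file.
touch "$demo_dir/hpr0001; A made up title.ogg"

# Snapshot after.
ls "$demo_dir" | sort > "$list_dir/after.txt"

# -1 hides lines only in the first list, -3 hides lines in both,
# so what is left is whatever appears only in the second list.
new_file="$(comm -13 "$list_dir/before.txt" "$list_dir/after.txt")"
echo "$new_file"

rm -r "$demo_dir" "$list_dir"
```

Only the newly created name is printed, spaces and semicolon included, no matter what the file timestamps are.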
download-filename-format "%t.%e"
macro d set browser "echo %u > /tmp/hpr-url.txt && ~/bin/download-and-rename-hpr.sh"; open-in-browser ; set browser lynx
Unless otherwise stated, our shows are released under a Creative Commons Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0) license.
The HPR Website Design is released to the Public Domain.