I want to download my favourite podcasts whenever new episodes come out.
Here is a Lil script that reads a podcast's XML feed, works out which episodes are missing from the current folder, and downloads them. (It is not general-purpose and won't work for every podcast feed.) I would love to know how to improve it!
# Get XML file
show["Downloading XML feed..."]
shell["curl URLGOESHERE > myfavpodcast.xml"]
# Get podcast list as table
x:read["myfavpodcast.xml"]
x:readxml[x]
x:(first x).children # descend into the root element (<rss>)
x:(first x).children # descend into its first child (<channel>)
x:table x
x:extract where tag = "item" from x
x:x.children
month: " " split "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec"
month: month dict (list "%02i") format 1+range 12 # i = signed integer
# Remove all HTML tags
on contentOnly d do
 if ("dict" = typeof d) "" fuse contentOnly@d.children else d end
end
on makeRecord x do
 x: table x
 x: x.tag dict "tag" drop x
 description: x.description.children
 description: " " fuse contentOnly@readxml[description]
 date: ("d","m","y") dict 3 take 1 drop " " split x.pubdate.children[0]
 date: date["y"], (month[date["m"]]), date["d"]
 date: "-" fuse date
 # Values are paired with (keys x) by position, so this assumes the item's
 # tags appear in this same order in the feed
 v: x.title.children[0],
    x.link.children[0],
    x["itunes:image"].attr.href,
    description,
    x.enclosure.attr.url,
    x.guid.children[0],
    date
 (keys x) dict v
end
x:table makeRecord@x
# Check which podcasts are not present in folder
x:table each r in rows x
 r.filename: "%s - %s.mp3" format r.pubdate, r.title
 r.filename: "-" fuse "/" split r.filename # Unix can't handle / in filename
end
x:x.filename dict x
x:dir["."].name drop x
# Second round of escaping, for building string to execute in shell
x: each r in range x
 r.filename: -1 drop 1 drop "%j" format r.filename # hack to escape double-quote
end
# Download missing podcasts
each r in x
 show["Downloading: %s" format r.filename]
 cmd:"curl \"%s\" > \"%s\"" format r.enclosure, r.filename
 shell[cmd]
end
exit[]
Questions:
- Why does this have a ? character in the output? readxml["<tag><a>abc</a>. </tag>"]
- How can I tell when square brackets are required (e.g. show[123]) and when they are not (e.g. count 1,2,3)?
- 'Each' works through a table's columns rather than its rows. Is there any advantage to using a table instead of a dict if I'm not using select/update/extract/etc.? I wanted to concatenate corresponding elements of two table columns (date and name) and fuse them. I tried to do it with extract, but ended up using "x:table each r in rows x ... end", where x is a table.
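To make the square-bracket question concrete, here is a small sketch of the distinction as I currently understand it (please correct me if it is wrong): built-in operators such as count and fuse are written without brackets, while functions, whether built in like show or defined with "on", are called with brackets. The function double is made up purely for illustration.

```lil
# Operators: no brackets, evaluated right to left
n: count 1,2,3            # n is 3
s: "-" fuse "a","b","c"   # s is "a-b-c"

# Functions: called with square brackets
show[n]
show[s]

# A user-defined function (hypothetical, for illustration only)
on double x do
 x*2
end
show[double[4]]

# A function can also be applied to each element of a list with @
show[double @ 1,2,3]
```

If that reading is right, the rule of thumb would be: if it is one of the language's built-in operator words, no brackets; if it is a function value, brackets (or @).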
Next steps will be to:
- somehow associate the podcast image with the mp3
- validate that the downloads completed (md5?).
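
For the validation step, a cheap first check might be curl's exit code rather than a checksum. This is a minimal sketch, assuming shell[] returns a dictionary with an "exit" field holding the process exit code (my reading of the Lilt docs); the helper name fetch is made up.

```lil
# Hypothetical helper: download url to file; returns 1 on success, 0 on failure.
# Assumes shell[] returns a dictionary with an "exit" field (the exit code).
on fetch url file do
 # --fail makes curl exit non-zero on HTTP errors; -L follows redirects
 cmd:"curl --fail -L \"%s\" -o \"%s\"" format url,file
 r:shell[cmd]
 if r.exit = 0
  show["Downloaded: %s" format file]
 else
  show["Download failed (curl exit %i): %s" format r.exit,file]
 end
 r.exit = 0
end
```

In the download loop, shell[cmd] could then be replaced by something like fetch[r.enclosure r.filename], using the already-escaped filename. This only shows that curl finished without an error; it does not prove the file is byte-for-byte complete, so a checksum or size comparison would still be a separate step.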
