ogg downloader

passstab

ogg downloader

Post by passstab »

I'm trying to make a script that will download all the ogg files from a specific Internet Archive (IA) album.
This is what I have so far; it doesn't work yet.
The site should be the HTTP download link (e.g. for http://www.archive.org/details/ball_and ... 1_librivox use http://ia600408.us.archive.org/2/items/ ... _librivox/).

Code: Select all

rm -f ./index.html 
rm -f ./sindex.html
echo 'please type name of site'
read SITE
FN=$(basename $SITE )
wget $SITE -O ./index.html
echo 'please type file extension'
read EXT
grep .$EXT ./index.html | cut -d "\"" -f 2 > ./itemnames
chmod a+wrx ./itemnames
cat ./itemnames | parallel -i -- "mkdir -p -m 777 ./$FN/{}  wget  -O -B $SITE -i  {} >> ./$FN/{}"
rm ./index.html
rm ./itemnames
I'm sure this already exists, but I'm trying to learn bash.
Does anyone know what's wrong with it?
Last edited by LockBot on Wed Dec 28, 2022 7:16 am, edited 2 times in total.
Reason: Topic automatically closed 6 months after creation. New replies are no longer allowed.
xenopeek
Level 25
Posts: 29590
Joined: Wed Jul 06, 2011 3:58 am

Re: ogg downloader

Post by xenopeek »

passstab wrote: this is what i have so far it doesn't work yet
I'm no BASH guru, but where does it go wrong? It would help if you could point out which part is working as it should and where it breaks. (I usually add some echo statements in between lines to find out whether things are going as they should.)
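A minimal sketch of that kind of echo debugging, under some assumptions: the helper name `debug` is made up for this example, and the URL is the one that appears later in the thread. Running the script as `bash -x script.sh` is another option, since bash then prints each command before executing it.

```shell
#!/bin/bash
# debug is a hypothetical helper, not part of the original script.
# It writes to stderr so the messages stay out of any redirected output.
debug() { printf 'DEBUG: %s\n' "$*" >&2; }

# Assumed sample input; the real script reads this with `read SITE`.
SITE='http://ia600408.us.archive.org/2/items/ball_and_cross_1001_librivox/'
FN=$(basename "$SITE")

debug "SITE=$SITE"
debug "FN=$FN"
```

Seeing the value of FN at this point makes it easier to check which paths the later lines will try to use.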

You stated you wanted to learn BASH, but for anybody who just wants a website downloader, HTTrack for Linux works very well for downloading websites or sets of files from websites.
passstab

Re: ogg downloader

Post by passstab »

Thanks.
The error I get is:

Code: Select all

sh: cannot create ./ball_and_cross_1001_librivox/{}: Directory nonexistent
I added one echo in the only place I can see it being useful:

Code: Select all

rm -f ./index.html 
rm -f ./sindex.html
echo 'please type name of site'
read SITE
FN=$(basename $SITE )
wget $SITE -O ./index.html
echo 'please type file extension'
read EXT
grep .$EXT ./index.html | cut -d "\"" -f 2 > ./itemnames
echo $(cat ./itemnames)
chmod a+wrx ./itemnames
cat ./itemnames | parallel -i -- "mkdir -p -m 777 ./$FN/{}  wget  -O -B $SITE -i  {} >> ./$FN/{}"
rm ./index.html
rm ./itemnames
xenopeek

Re: ogg downloader

Post by xenopeek »

Then this is your error:
cat ./itemnames | parallel -i -- "mkdir -p -m 777 ./$FN/{} wget -O -B $SITE -i {} >> ./$FN/{}"
As you can see from the error you get, the statement above resolves to the script trying to create {} inside your ./ball_and_cross_1001_librivox directory. As that directory doesn't exist yet, this fails. What are the {} for?
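A hedged sketch of why a missing directory produces this kind of error: the shell opens a redirect target before it starts the command, so an mkdir on the same line cannot create the directory in time. The directory and file names below (`album`, `track.log`) are assumptions for the demonstration only.

```shell
#!/bin/bash
# Work in a throwaway directory (hypothetical; any empty directory would do).
demo=$(mktemp -d)

# One command with a redirect: the shell tries to open album/track.log first.
# album/ does not exist yet, so the open fails and mkdir is never started.
( cd "$demo" && mkdir -p album/track >> album/track.log ) 2>/dev/null
first_status=$?
if [ -d "$demo/album" ]; then made_by_first=yes; else made_by_first=no; fi

# Two steps: create the directory, then append to a file inside it.
( cd "$demo" && mkdir -p album && echo "downloaded" >> album/track.log )
second_status=$?

echo "one command: exit $first_status (album created: $made_by_first)"
echo "two steps:   exit $second_status"
```

Separating the directory creation from the redirected command is the simplest way around this.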
passstab

Re: ogg downloader

Post by passstab »

Code: Select all

rm -f ./index.html 
rm -f ./itemnames
echo 'please type name of site'
read SITE
FN=$(basename $SITE )
wget $SITE -O ./index.html
echo 'please type file extension'
read EXT
grep .$EXT ./index.html | cut -d "\"" -f 2 > ./itemnames
echo $(cat ./itemnames)
chmod a+wrx ./itemnames
cat ./itemnames | xargs -I {} mkdir -p -m 777 ./$FN/{}  wget  -O -B $SITE -i  {} >> ./$FN/{}
rm ./index.html
rm ./itemnames
{} is supposed to be the xargs input (I ditched parallel because it froze my computer):
http://linux.die.net/man/1/xargs
mkdir has a -p (parents) option that I think should make the missing directory a non-issue:
https://secure.wikimedia.org/wikipedia/ ... r#Examples
But this is what I get:

Code: Select all

please type name of site
http://ia600408.us.archive.org/2/items/ball_and_cross_1001_librivox/
--2011-10-11 17:32:28--  http://ia600408.us.archive.org/2/items/ball_and_cross_1001_librivox/
Resolving ia600408.us.archive.org... 207.241.227.218
Connecting to ia600408.us.archive.org|207.241.227.218|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: `./index.html'

    [ <=>                                                                                                                ] 8,999       --.-K/s   in 0.08s   

2011-10-11 17:32:28 (104 KB/s) - `./index.html' saved [8999]

please type file extension
ogg
ballandcross_01_chesterton.ogg ballandcross_02_chesterton.ogg ballandcross_03_chesterton.ogg ballandcross_04_chesterton.ogg ballandcross_05_chesterton.ogg ballandcross_06_chesterton.ogg ballandcross_07_chesterton.ogg ballandcross_08_chesterton.ogg ballandcross_09_chesterton.ogg ballandcross_10_chesterton.ogg ballandcross_11_chesterton.ogg ballandcross_12_chesterton.ogg ballandcross_13_chesterton.ogg ballandcross_14_chesterton.ogg ballandcross_15_chesterton.ogg ballandcross_16_chesterton.ogg ballandcross_17_chesterton.ogg ballandcross_18_chesterton.ogg ballandcross_19_chesterton.ogg ballandcross_20_chesterton.ogg
../../adfgsa.sh: line 12: ./ball_and_cross_1001_librivox/{}: No such file or directory
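One way to see why that line fails, as a hedged sketch rather than the fix the thread ends up using: xargs only substitutes {} inside the arguments it receives, while `>> ./$FN/{}` is handled by the calling shell before xargs starts, so that path keeps a literal {}. Wrapping the per-item work in `sh -c` lets each item get its own redirect. The directory name `out`, the sample file names, and the echo standing in for the download are assumptions for the demonstration.

```shell
#!/bin/bash
# Throwaway working directory and sample names (assumed for illustration).
demo=$(mktemp -d)
out="$demo/out"
mkdir -p "$out"

# xargs replaces {} in its arguments; the inner sh sees the directory as $1
# and the item as $2, and performs the redirect itself, once per item.
printf '%s\n' track_01.ogg track_02.ogg |
    xargs -I {} sh -c 'echo "would download $2" >> "$1/$2.log"' _ "$out" {}

ls "$out"
```

Passing the item as a positional parameter instead of pasting {} into the quoted command also avoids trouble with file names that contain quotes or spaces.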

xenopeek

Re: ogg downloader

Post by xenopeek »

Perhaps add this as the first line of your script:

Code: Select all

#!/bin/bash
Instead of your single non-working line:
cat ./itemnames | xargs -I {} mkdir -p -m 777 ./$FN/{} wget -O -B $SITE -i {} >> ./$FN/{}
you can do something like the following (I'm not sure I've put your commands in properly):

Code: Select all

ITEMNAMES=$(cat ./itemnames)
for ITEMNAME in $ITEMNAMES; do
	mkdir -p -m 777 ./$FN/$ITEMNAME
	wget  -O -B $SITE -i $ITEMNAME >> ./$FN/$ITEMNAME
done
Yes, it's more lines of code. But six months from now you'll still understand what this does and how it works. At least I've found that I can write really obfuscated one-liners :wink: Six months later, heck, even six weeks later, I wish I had programmed them a little more clearly and added some comments.
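A hedged variant of such a loop, written as a dry run so it can be tried without downloading anything: it reads one name per line, plans a single mkdir for the album directory, and pairs each wget `-O` with a local file name. As far as I can tell, `-O` takes whatever follows it as the output file, so in `-O -B` the `-B` would be read as a file name. The sample values below are assumptions; in the thread they come from `read SITE` and ./itemnames.

```shell
#!/bin/bash
# Assumed sample values; in the thread they come from `read SITE` and ./itemnames.
SITE='http://ia600408.us.archive.org/2/items/ball_and_cross_1001_librivox/'
FN=$(basename "$SITE")
names=$(mktemp)
printf '%s\n' ballandcross_01_chesterton.ogg ballandcross_02_chesterton.ogg > "$names"

# Dry run: collect the commands that would be executed instead of running them.
plan=$(
    echo "mkdir -p \"./$FN\""
    while IFS= read -r ITEMNAME; do
        [ -n "$ITEMNAME" ] || continue   # skip empty lines
        # -O takes the local file name; the URL is the site plus the item name.
        echo "wget -O \"./$FN/$ITEMNAME\" \"$SITE$ITEMNAME\""
    done < "$names"
)
rm -f "$names"
echo "$plan"
```

Once the printed commands look right, the echo lines can be swapped for the real mkdir and wget calls.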
passstab

Re: ogg downloader

Post by passstab »

Got it to work!

Code: Select all

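#!/bin/bash
# Remove temp files that an earlier run may have left behind.
rm -f ./webpage ./basenames
echo 'please type name of site'
read -r URL
# Save the directory listing page.
wget "$URL" -O ./webpage
echo 'please type file extension'
read -r EXTENSION
# Keep the lines that mention ".EXTENSION" (the dot is escaped so it matches
# literally) and take the second "-delimited field, the link target here.
grep "\.$EXTENSION" ./webpage | cut -d '"' -f 2 > ./basenames
# -i reads the links from the file, -B resolves them against the URL,
# -x recreates the remote directories, -nH leaves out the host name directory.
wget -x -nH -B "$URL" -i ./basenames
rm ./webpage ./basenames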
The wget -x -nH options do something close enough to what I wanted.
(Ideally it would only create the last two directories, e.g. "/items/ball_and_cross_1001_librivox/" instead of "/2/items/ball_and_cross_1001_librivox/".)
I think I was giving it values where it wanted filenames (your suggestion didn't work, I think for the same reason).
Thanks for the help.
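For the directory layout mentioned above, wget also has a `--cut-dirs=N` option that drops N leading remote directory components; together with `-x -nH`, `--cut-dirs=1` should leave only items/ball_and_cross_1001_librivox/. The sketch below does not call wget at all. It uses a made-up helper, `local_path`, that only mimics the path mapping so the effect can be checked offline.

```shell
#!/bin/bash
# local_path is a hypothetical helper that only mimics how wget maps a URL to
# a local path with -x -nH --cut-dirs=N; it is not how wget is implemented.
local_path() {
    local path=${1#*://}     # drop the scheme
    path=${path#*/}          # drop the host name (-nH)
    local i
    for ((i = 0; i < $2; i++)); do
        path=${path#*/}      # drop one leading directory (--cut-dirs)
    done
    printf '%s\n' "$path"
}

URL='http://ia600408.us.archive.org/2/items/ball_and_cross_1001_librivox/'
local_path "${URL}ballandcross_01_chesterton.ogg" 0
local_path "${URL}ballandcross_01_chesterton.ogg" 1

# The corresponding download command (not run here) would be:
#   wget -x -nH --cut-dirs=1 -B "$URL" -i ./basenames
```

The first call shows the layout the working script produces, the second the layout with one leading directory cut.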
xenopeek

Re: ogg downloader

Post by xenopeek »

Yeah, wget has a lot of options :D