BashFu - Archive the Latest Files to Send Away

About writing shell scripts and making the most of your shell
Termy
Level 12
Posts: 4254
Joined: Mon Sep 04, 2017 8:49 pm
Location: UK

BashFu - Archive the Latest Files to Send Away

Post by Termy »

I couldn't be bothered to make a YouTube video (my audio is playing up at the moment anyway), but I just had to share my latest little adventure, because it's definitely going to be useful to somebody out there.

Here is the situation I just had:

I recently updated the cache used by my wcdl shell program in order to grab the latest wallpapers from an awesome wallpaper website, and I'd updated my collection accordingly, particularly the car category, which just so happens to be the one that interests my dad. I wanted to send him the latest ones, but unfortunately I'd completely forgotten to do so at the time, so they were now mixed in amongst my other 50,000+ wallpapers. OOPS. Luckily, I was able to get out of this pickle programmatically, as usual, because the shell is the best thing since sliced bread; hell, it's better than that!

I took a quick, semi-pure-shell approach to the task (in retrospect, leaning on more non-shell utilities would've been preferable for performance reasons), running this command:

Code:

7zr a $HOME/Desktop/NewCars.7z `while read -a X; do [ "${X[0]}" == PAGE: ] || find -name "${X[1]}" -ctime -2; done < $HOME/.wcdl/2017-11-24_23\:36\:54.log`
That one-liner reads each line of the log file into an array, one line at a time, in order to grab just the filenames (the second field of each line, skipping the PAGE: lines). Each filename is then handed to find, which simply checks that the file is both present and has a ctime (change time) within the last two days. The output of the command substitution (the bit within the backticks) is the list of matching paths, which gets expanded onto 7zr's command line, telling it exactly which files I want in the archive.
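
For illustration, once the command substitution has expanded, the end result is effectively a single 7zr invocation along these lines (the filenames below are made up):

Code:

7zr a $HOME/Desktop/NewCars.7z ./some_car_1920x1080.jpg ./another_car_1920x1080.jpg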

There are definitely at least a few ways to improve this, but I was just doing it on the fly. Spawning a find process for every single file is really, really bad practice, but as a one-off it worked fine for me. Were I to optimise it, I'd either ditch the while read approach and introduce grep, drastically speeding things up, or I'd use a loop to build up the correct -name FILE arguments and expressions, so that find does most of the legwork.
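
Purely to illustrate that second idea, something along these lines might do it; it assumes the same log layout as above (PAGE: lines to skip, with the filename in the second field of every other line) and, like the one-liner, that the filenames contain no whitespace:

Code:

#!/usr/bin/env bash

# The same wcdl log as in the one-liner above.
LogFile="$HOME/.wcdl/2017-11-24_23:36:54.log"

# Build one long `-name FILE -o -name FILE ...' expression from the log.
Args=()
while read -ra Line; do
    [ "${#Line[@]}" -lt 2 ] && continue
    [ "${Line[0]}" == 'PAGE:' ] && continue
    [ "${#Args[@]}" -gt 0 ] && Args+=('-o')
    Args+=('-name' "${Line[1]}")
done < "$LogFile"

# A single find(1) process now does all of the matching, instead of one per file.
[ "${#Args[@]}" -gt 0 ] && 7zr a "$HOME/Desktop/NewCars.7z" \
    $(find . -ctime -2 \( "${Args[@]}" \))

That keeps the read loop for parsing, but all of the file matching happens in one find call.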
I'm also Terminalforlife on GitHub.
Misko_2083

Re: BashFu - Archive the Latest Files to Send Away

Post by Misko_2083 »

Termy, by the look of the script, it seems to download the entire page.
It's no wonder your hard disk is full. :D

It would be easier to drag and drop the image thumbnails while browsing the listing pages, and download them later.
This is only an example that downloads 1920x1080 images to ~/Downloads.
Browse the listing pages (e.g. https://wallpaperscraft.com/all/page3), drag thumbnails from the web browser onto the drop area of the window, and click Get.
I have the latest yad; it will probably work with previous versions too.

Code:

#!/bin/bash

# Save downloads to ~/Downloads
cd "$HOME/Downloads" || exit 1

# Temporary File for file list
export urlist="$(mktemp -u --tmpdir url_list.XXXXXXX)"

# Temporary File for messages
export MSG="$(mktemp -u --tmpdir msg.XXXXXXXX)"

# Trap that removes temporary files on exit
trap "rm -f $urlist $MSG" EXIT

# Creates a named pipe
mkfifo $MSG

# Open the named pipe on file descriptor 3 for reading and writing
exec 3<> $MSG

# welcome msg
echo "   Hi Termy" >&3

function download() {

    # Clears text-info dialog
    echo -e "\f" >$MSG

    echo "Downloading..." >$MSG

    while read -r line
      do
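        # Rewrite the wallpaper page URL into the direct image URL for 1920x1080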
        echo "GET: ${line/'/wallpaper/'/'/image/'}_1920x1080.jpg" >$MSG
        wget --user-agent="Mozilla/5.0 Gecko/20100101" --timeout=30 --quiet --no-cookies "${line/'/wallpaper/'/'/image/'}_1920x1080.jpg"
        echo "SAVED: ${line##*/}_1920x1080.jpg" >$MSG
      done < "$urlist"

    # Empty the url list
    >$urlist

    echo "Done!" >$MSG
}
export -f download

function reset() {

    # Clears text-info dialog
    echo -e "\f" >$MSG

    # Empty the url list
    >$urlist

}
export -f reset

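# A (hopefully) unique key used to plug the two dialogs below into the paned window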
termykey=$(($RANDOM * $$))

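# Drop area: full URLs of dropped thumbnails are collected in the URL list,
# while a copy with the site prefix stripped is echoed to the log pane via FD 3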
yad --dnd \
    --plug=$termykey \
    --tabnum=1 \
    --text="\n\n\n\nput\nimages\nhere" \
    --text-align=center | tee >(sed -u 's#https://wallpaperscraft.com/wallpaper/##' >&3) >$urlist &

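# Log pane: a text-info dialog that tails whatever is written to the named pipe on FD 3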
yad --text-info \
    --plug=$termykey \
    --tabnum=2 \
    --tail <&3 &

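# Main paned window: embeds the two plugged dialogs and provides the Get, Clear and exit buttons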
yad --no-escape \
    --paned \
    --title "Termy's little project" \
    --key=$termykey \
    --orient=hor \
    --splitter=100 \
    --width=400 \
    --height=300 \
    --button="Get:bash -c download" \
    --button="Clear:bash -c reset" \
    --button="exit:0"

# Close file descriptor 3
exec 3<&-

exit 0
Have fun.
Termy
Level 12
Posts: 4254
Joined: Mon Sep 04, 2017 8:49 pm
Location: UK

Re: BashFu - Archive the Latest Files to Send Away

Post by Termy »

My drives are definitely not full. WCDL downloads and parses the pages as a "cache", which then gets stored (and cleared at the user's discretion), but it takes up a negligible amount of space, even if you cache the entire collection of 1080p images, which I've done. The user can download any currently available resolution, not just 1080p, which just so happens to be my native and preferred resolution.

WCDL downloads well over 65,000 images, so if you're happy dragging and dropping that many times, then this tool isn't for you; I prefer to download quickly and efficiently in bulk, by category and resolution (I see this approach as being much easier, as there's simply less to do). Again, if you just want a few wallpapers, then head over to the site and ignore this tool.

I've made many downloaders like this for various content and use WCDL a lot myself, so I'm happy with my approach, but I appreciate the reply. :)
I'm also Terminalforlife on GitHub.