Batch Downloading Image Files from a Webpage

user73

Batch Downloading Image Files from a Webpage

Post by user73 »

Is there a way to do this that doesn't require a package outside the software manager repository? Let's say I want to collect Star Wars concept art off a search engine query for my private collection. Is there a way I can batch download these images to disk without having to save them individually? I reckon there is a way to do this from the command line, similar to what this guy is doing from Windows: https://johnlouros.com/blog/batch-downl ... -a-website, but my bash skills are introductory at the moment. If someone could point me to some documentation, I would appreciate it.
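
For the sake of illustration, this is roughly the kind of command-line approach I have in mind (untested, just pieced together from wget's man page, and the URL is only a placeholder) - fetch one page and keep just the image files it links to:

# placeholder URL - swap in the actual page; -r -l 1: follow links one level deep; -H: allow images hosted on other domains (CDNs)
# -nd: don't recreate the site's directory tree; -A: keep only these extensions; -P: save everything into ./images
wget -r -l 1 -H -nd -A jpg,jpeg,png,gif -P ./images 'https://example.com/some-gallery-page'

I have no idea whether that survives contact with a JavaScript-heavy search results page, which is partly why I'm asking.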

While I'm on the topic of visual media, does anyone know of any software that might allow me to analyze a folder full of images and both locate and delete copies of the same image? That would be inevitable in a batch download operation.
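
As for the duplicates, I assume exact copies could be caught by comparing checksums - a rough sketch (untested, and it only finds byte-identical files, not resized or re-encoded copies):

# assuming the pictures are in ./images: list files that share an MD5 checksum with at least one other file
md5sum ./images/* | sort | uniq -w32 --all-repeated=separate

fdupes looks like it is in the repositories and would do much the same job interactively (fdupes -rd ./images), but I have not tried it, so pointers to anything smarter would be welcome.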
BenTrabetere
Level 7
Posts: 1890
Joined: Sat Jul 19, 2014 12:04 am
Location: Hattiesburg, MS USA

Re: Batch Downloading Image Files from a Webpage

Post by BenTrabetere »

user73 wrote: Wed Aug 07, 2019 12:34 pm Is there a way to do this that doesn't require a package outside the software manager repository? Let's say I want to collect Star Wars concept art off a search engine query for my private collection. Is there a way I can batch download these images to disk without having to save them individually?
I am not a lawyer and I have only a rudimentary understanding of copyright law, but I know enough to know that you really need to think before you proceed.

I do not know about the rest of the world, but in the US images are automatically copyrighted by their owners. Images do not need to carry an explicit copyright notice. Except in cases of "fair use" you cannot reproduce copyrighted images without the owner’s permission - you could be inviting unfriendly conversations with lawyers, cease-and-desist letters, and lawsuits. And you might have to deal with lawyers even in cases of fair use - I consider the whole concept to be "vaguely specific."

It is unlikely you will run into problems by snagging the occasional image. I suspect you will draw attention if you start downloading everything a search engine spits out.
RIH
Level 9
Level 9
Joined: Sat Aug 22, 2015 3:47 am

Re: Batch Downloading Image Files from a Webpage

Post by RIH »

An off-the-wall idea, user73, as I am sure someone will come up with a better way of doing it... :D

You could go to the web page and choose Print, but take the option to 'Save to PDF' rather than printing.

The resulting PDF file is often a real mess to try to read as an actual document, but it does contain the images from the web page that you have downloaded.
You could then use a PDF editor, or even LibreOffice Draw, to open the document and extract the pictures into some sort of readable order.
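
If you do go the PDF route, the pdfimages tool (part of poppler-utils, which should be in the repositories) might spare you the manual extraction step - a rough sketch, assuming you saved the page as page.pdf:

# page.pdf is whatever name you gave the saved PDF; dump every embedded image into ./extracted, keeping the original formats
mkdir -p extracted
pdfimages -all page.pdf extracted/img

I have not checked how well it copes with a print-to-PDF of a busy page, so treat it as a starting point.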

BenTrabetere, the OP says that he wants the pictures "for my private collection". As far as I am aware this would not infringe copyright - firstly because no one would know but him... :D
ColdBootII

Re: Batch Downloading Image Files from a Webpage

Post by ColdBootII »

It is easily doable; the sole problem is that you don't really want to mess with Google image results.

Look what no more than 10 seconds of downloading (I interrupted it) did to me - and mind you, these 237 images, totaling 146 MB of disk space, are what remained after I filtered out and deleted those smaller than 50 KB.

[Screenshot of the downloaded images]

All you need is an extension https://chrome.google.com/webstore/deta ... mjbibjnakm and Pix. Still, using a script or an extension makes no difference - it's sheer madness to attempt it. :mrgreen:

May the force be with you,
Cheers! :D
user73

Re: Batch Downloading Image Files from a Webpage

Post by user73 »

It is unlikely you will run into problems by snagging the occasional image. I suspect you will draw attention if you start downloading everything a search engine spits out.
I don't have the time for everything, just the first few pages usually. Yet I would reckon that, from a legal perspective, snagging the occasional album is no different from snagging the occasional image.

Besides, as RIH pointed out, this is purely for my private digital museum of art. I decided long ago to employ a strict NO-SHARING policy precisely because of this consideration. Any pictures whose legality I am uncertain about are never shared around online or offline and are certainly never sold, particularly with regard to commercial material.

I've been collecting visual media off the Internet ever since I learned how to use the "Save" function as a teenager. My album is several GB large. But if just possessing copyrighted material is a crime, then the rate at which I collect it is irrelevant.
The resulting PDF file is often a real mess to try to read as an actual document, but it does contain the images from the web page that you have downloaded.
You could then use a PDF editor, or even LibreOffice Draw, to open the document and extract the pictures into some sort of readable order.
Not a bad idea. It's not too different from what I have been doing, which is screenshotting the images. The only concern is whether the resolution is good enough to bother. I'll have to try it out.

Edit: Nah, that's not gonna work. Having to extract them all would take just as long as, if not longer than, collecting them manually.
All you need is an extension https://chrome.google.com/webstore/deta ... mjbibjnakm and Pix
Yeah, but I try to avoid using extensions for security reasons. I'm not that desperate. That's why I was inquiring about a command-line method.
ColdBootII

Re: Batch Downloading Image Files from a Webpage

Post by ColdBootII »

^Yes, and in addition to not being desperate about it, it also makes little sense to download a great number of images when you can always cherry-pick the few you actually like. What I mean is, we're not paying for internet service by the hour, where scraping images or portions of websites as fast as possible would be useful.

Cheers.
ugly
Level 5
Posts: 592
Joined: Thu Nov 24, 2016 9:17 pm

Re: Batch Downloading Image Files from a Webpage

Post by ugly »

Something like this should work:

https://github.com/RuanMuller/harx
user73

Re: Batch Downloading Image Files from a Webpage

Post by user73 »

ColdBootII wrote: Thu Aug 08, 2019 8:08 pm ^Yes, and in addition to not being desperate about it, it also makes little sense to download a great number of images when you can always cherry-pick the few you actually like. What I mean is, we're not paying for internet service by the hour, where scraping images or portions of websites as fast as possible would be useful.
It has more to do with the keystrokes involved. Deleting files is faster than saving them individually and is less stressful on the hand, because you're just tapping the same key repeatedly. Saving 20 images individually vs. batch saving an album of 80? I reckon I can probably delete 60 faster.
ColdBootII

Re: Batch Downloading Image Files from a Webpage

Post by ColdBootII »

Well, yes, if there are only 80. The problem is, especially when it's the output of a search engine, there are many more. I guess you'll be clicking much more to delete. :mrgreen:

Anyway, there are not many ready-made scripts available. For Google there's this one: https://github.com/hardikvasa/google-images-download There's also webhttrack in the repos, but it seems unmaintained and has a learning curve. I think the browser extension I mentioned is a better/simpler way of doing this.
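
For what it's worth, going by that script's README the usage looks roughly like this (untested on my side, and note the install is via pip rather than the software manager):

# flags taken from the project's README; install for the current user, then fetch a limited number of results for a query
pip3 install --user google_images_download
googleimagesdownload --keywords "star wars concept art" --limit 20

Keeping the limit small is probably wise, given what was said above about hammering a search engine.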

Edit: BTW, I have not tried the GitHub script myself, so I don't know whether it works as advertised, nor have I delved into its code to see if it's safe to use.

Cheers.