Is there any kind of software that will scrape a website for any and all URLs on the site? I've been searching for a while now and not really coming up with much. Any ideas appreciated.
Thanks!
Chris
how to scrape URLs from a website....
Forum rules
Before you post read how to get help. Topics in this forum are automatically closed 6 months after creation.
Last edited by LockBot on Wed Dec 28, 2022 7:16 am, edited 1 time in total.
Reason: Topic automatically closed 6 months after creation. New replies are no longer allowed.
Re: how to scrape URLs from a website....
drat - no thoughts? Back to cut and paste I suppose. Keeps me out of trouble anyway!
Chris
Re: how to scrape URLs from a website....
Hey all,
I need to ask this again, but let me add a little more info so you understand what I am doing. I own a small business selling card models, and I have a horrible time with pirates stealing my work. Not really my own work, but the work of all the designers who make the models I sell; I legally represent them. I need to go through an entire website and scrape all the URLs of the illegal files hosted on sites like hotfile, rapidshare, etc. I spent about 20 hours working on one site over the last week, and bam, overnight the links are replenished.
I really hope someone can help me find a good solution.
Thanks
Chris
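One crude way to attack this with standard tools: wget can spider a whole site without downloading anything, and its log can be filtered for links. This is only a sketch, under the assumption that the site lets itself be crawled; the function names (list_site_urls, urls_from_log) and the recursion depth are my own choices, not established commands:

```shell
#!/bin/bash

# Pull every http(s) URL out of whatever text is piped in, one per line,
# with duplicates removed.
urls_from_log() {
    grep -oE 'https?://[^ ]+' | sort -u
}

# Crawl a site without saving files and list every URL wget visits.
# --spider: check links only, download nothing; -r -l 5: recurse up to
# 5 levels deep. wget writes its log to stderr, hence the 2>&1.
list_site_urls() {
    wget --spider -r -l 5 "$1" 2>&1 | urls_from_log
}

# Example usage (fetches over the network, so results depend on the site):
# list_site_urls http://www.example.com | grep -E 'hotfile|rapidshare' > pirated-links.txt
```

The second grep in the usage line narrows the full list down to the file hosters you care about, so you end up with just the links you want to report.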
Re: how to scrape URLs from a website....
Well, I'm of the opinion that if you make the legal way to get something the most convenient, the vast majority of people will use it rather than the illegal method - online TV, for instance. Nevertheless, I found the exercise amusing, so add the following to ~/.bashrc: "function scrape { wget "$1" -qO - | sed 's/"/\n"\n/g' | sed '/http/!d'; }", and from then on you'll be able to use e.g. "scrape www.linuxmint.com" to get a list of addresses (not necessarily valid ones, though). The first sed splits the page at every double quote so each quoted attribute lands on its own line, and the second keeps only the lines containing "http".
Re: how to scrape URLs from a website....
To add the line to ~/.bashrc, press Alt+F2 and type "gedit .bashrc" into the run dialog, then copy and paste the line into the file before saving and exiting. You'll need to use the terminal to actually run the command, though, and either open a new terminal or run "source ~/.bashrc" first so the function is loaded.