I have access to a solar power converter over secured WiFi at http://160.190.0.1/home.
I can save its content from Firefox with a mouse click, which gives me home.html plus a home_bestanden folder. I can read that content, see the time plus Watt and kWh amounts, and copy/paste these into Calc to create graphs.
Using WGET for this only stores home.html, without any folder and with useless content. The reason I want to use WGET is to perform this action at a half-hour interval and have a program pull out the data automatically.
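A half-hour polling loop for this could look like the sketch below (the URL is the one above; the timestamped file naming is just an assumption):

```shell
# Hypothetical half-hour logger for the inverter page. Each fetch is
# saved under a timestamp so earlier snapshots are not overwritten.
fetch_once() {
    stamp=$(date +%Y%m%d-%H%M)          # e.g. 20180301-1430
    wget -q -T 10 -O "home-$stamp.html" http://160.190.0.1/home
}

# In practice, run it forever:
#   while true; do fetch_once; sleep 1800; done
# or call the function from a script started by cron every 30 minutes.
```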
When I use WGET on randomly chosen Internet sites it works just like the mouse-click copy/paste and produces frozen copies of the sites, just like by mouse click.
Any idea where it goes wrong?
These are the first lines I get from WGET; in the whole file only home.html is recognizable:
\8BZX\00home.html\00\ED\EDn\DB8\F2\B7\B8w`\D8b-#\F5G\92\A6\EDƖ\81n\A0A\AFM\B1\ED]\EF\AE\B2D\D9L$R\A5(\BB\DE /r\CFr\EFt\AFp3\A4$˲\9C\B4W,\D0b E\CEg\863CR\A3\BD\97\E7/>\FC\E3\DD
These are the first lines I get from copy/paste Firefox:
<!DOCTYPE html>
<html><head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<title>Home</title>
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width">
<!--meta name="MobileOptimized" content="320"-->
<link href="Home_bestanden/layout.css" rel="stylesheet" type="text/css">
And the content of the linked folder home_bestanden is:
dropdown.js zepto.js layout.css pic.png
WGET copies unreadable lines
Forum rules
Before you post read how to get help. Topics in this forum are automatically closed 6 months after creation.
Last edited by LockBot on Wed Dec 28, 2022 7:16 am, edited 1 time in total.
Reason: Topic automatically closed 6 months after creation. New replies are no longer allowed.
Re: WGET copies unreadable lines
Try using curl instead.
curl http://160.190.0.1/home -o result.html
Re: WGET copies unreadable lines
Will give it a try with cURL, but I think there is something else going on too. WGET --debug shows a line mentioning GZIP format. This would explain why the WGET output file shows no readable info besides the words home and html. Found more examples on the Internet, and questions about how to deal with that.
Re: WGET copies unreadable lines
Got more info but ......
cURL does not work either. For both WGET and cURL I find pages of settings, but until now nothing does the job ... besides one: WGET --debug.
There is one line that tells me the website does something with gzip. I checked the output file, named index.html, with several unzippers and found that 7-Zip was able to open it. It shows a zipped file called home.html that I can open and extract to get a file with HTML lines I can read. It looks almost the same as the copy/paste version from a mouse click in Firefox.
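If the server really always answers with gzip, 7-Zip is not actually needed: curl can decompress transparently, and a downloaded file can be unpacked with zcat regardless of its name. A small sketch (the inverter URL is the one from the thread; the demonstration uses a locally created stand-in file):

```shell
# 1. Let curl request and transparently unpack compressed content:
#      curl --compressed http://160.190.0.1/home -o home.html
# 2. Or decompress an already-downloaded gzip stream with zcat, which
#    ignores the file name and only looks at the gzip magic bytes.
#    Demonstrated here on a stand-in file:
printf '<!DOCTYPE html>\n<title>Home</title>\n' | gzip > index.html
zcat index.html > home.html
cat home.html
```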
But the next problem to work on is getting the required data too, as all I find is an uncolored and empty web page without the Device Information or the amount of kWh produced.
Re: WGET copies unreadable lines
Try adding a user agent, like:
curl -A 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:58.0) Gecko/20100101 Firefox/58.0' http://160.190.0.1/home -o result.html
Re: WGET copies unreadable lines
First of all: thank you very, very much for helping me.
Ran your cURL command, but the unzipped output still shows only the frame, as the folder is not saved like it is with copy/paste by mouse click from Firefox. Is that perhaps the reason no data is seen, because it is stored in that folder in JavaScript form? But on the other hand: I can tell Firefox to store the site in TXT format only and still see the data in the HTML lines.
Below is the output with data. Without it, after <tr style=""> it only shows <td></td><td></td><td></td><td class=""></td>
>>>>>>>>>>>>>>>
<td width="150"><span data-locale="sn">SN.</span></td>
<td width="120"><span data-locale="pacw">Pac(W)</span></td>
<td width="150"><span data-locale="etoday">E_Today(KWh)</span></td>
<td width="130"><span data-locale="status">Status</span></td>
<tr style="">
<td>BD36806011760040</td>
<td>158</td>
<td>0.70</td>
<td class="ok"></td>
etc.....
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
Unfortunately the Chinese developers of the Zeversolar converter do not want us to have direct access to the generated data. They prefer me to link to them far away, send all the data, and then see what happens over here.
Think I will have to find a web scraper that I can instruct to run every half hour, saving the Firefox view in timestamped files so I can extract the data from there, if necessary by hand. Using Wget or cURL would be a bit more relaxed for extracting data and storing it in sequential lines immediately after downloading. And it is free ...
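For the record, the three data cells quoted above can be reduced to one CSV line with standard tools; a sketch, demonstrated on a sample file (point it at the real saved page instead):

```shell
# Hypothetical extraction of SN, Pac(W) and E_Today(KWh) into CSV.
# Assumes the plain <td> cells appear in the order shown in the post.
cat > sample.html <<'EOF'
<tr style="">
<td>BD36806011760040</td>
<td>158</td>
<td>0.70</td>
<td class="ok"></td>
EOF
grep -o '<td>[^<]*</td>' sample.html \
    | sed 's/<[^>]*>//g' \
    | paste -sd, - >> power-log.csv
cat power-log.csv    # BD36806011760040,158,0.70
```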
Re: WGET copies unreadable lines
In the end I discovered that WGET and cURL should perhaps be able to activate the JavaScript, but how this is done remains a secret. Found HTTRACK, which does the trick without complaining and even thanks me for using it.
The command >>httrack 160.190.0.1<< showed after 5 seconds all I wanted to see, in a mirror folder on my disk where a large number of files with all details for further analysis is stored too. Even found that the JavaScript data is stored in a single line in home-2.html. Once this runs in a loop, appending that line to a text file, and Calc is told to use only 2 values from it, a power conversion graph appears within a few mouse clicks after a whole day of automated logging.
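That loop could be sketched like this (the mirror path under /tmp/zever and the serial-number grep pattern are assumptions; only the file name home-2.html comes from the post):

```shell
# Hypothetical HTTrack logging step: mirror the site, pull the data
# line out of home-2.html and append it with a timestamp.
log_once() {
    httrack http://160.190.0.1/ -O /tmp/zever -q
    line=$(grep -m1 'BD36806011760040' /tmp/zever/160.190.0.1/home-2.html)
    echo "$(date '+%F %T') $line" >> zever-log.txt
}

# Run every half hour:
#   while true; do log_once; sleep 1800; done
```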
Re: WGET copies unreadable lines
Are you using a version of wget at or newer than 1.19.2? Because they annoyingly (!!!) made it so wget by default requests compressed content, and guess what, it seems to be via gzip. Look in the man page for wget, as there's a flag you can use to tell the server not to offer compressed content; can't remember which one it is, I'm afraid.
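For reference, the option appears to be --compression (added in wget 1.19.2, when built with zlib); the header-based form works on any version. Both lines below are untested against this particular server:

```shell
# Tell wget not to advertise gzip support (flag per the wget manual;
# verify it exists in your build):
#   wget --compression=none http://160.190.0.1/home -O home.html
# Portable equivalent via an explicit request header:
#   wget --header='Accept-Encoding: identity' http://160.190.0.1/home -O home.html
```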
I'm also Terminalforlife on GitHub.
Re: WGET copies unreadable lines
Termy, thanks, but I have definitely switched over to HTTrack, which by default supplies me with an overload of information. Unzipping the Wget content is no issue, as 7-Zip does the trick in the background after renaming the *.html to *.7z and saving its content as *.html. But Wget doesn't give me the data I seek, as the JavaScript that adds it is not activated; I only get an empty and colorless home page, where HTTrack gives me the web page as Firefox shows it. And by the way: the same goes for cURL, which also doesn't start the JavaScript. One more note: when downloading HTTrack you also get a GUI version called WinHTTrack that can be used with copy/paste by mouse click.