One of the most commonly used utilities by sysadmins is wget. It can be very helpful during web-related troubleshooting.

What is the wget command?

wget is a popular Unix/Linux command-line utility for retrieving content from the web. It is free to use and offers a non-interactive way to download files from the Internet. The wget command supports the standard HTTPS, HTTP, and FTP protocols, and it also lets you use HTTP proxies.

How does wget help you with troubleshooting?

There are many ways.

As a system administrator, you usually work on a terminal, and when troubleshooting web applications you may not want to fetch the whole page, just check the connectivity. Or you may want to verify intranet websites, or download a specific page to verify its content.

wget is non-interactive, which means it can keep running in the background even after you log out. There may be many cases where you need to disconnect from the system while still retrieving files from the web. wget will carry on in the background and complete its assigned task.

It can also be used to mirror an entire website onto your local machine. It can follow links in XHTML and HTML pages to create a local version; to do this, the pages must be downloaded recursively. This is very useful, as you can download important pages or sites for offline viewing.

Let's see it in action. The syntax of wget is as below.
wget [option] [URL]
Download a web page

Let's try to download a page. For example: github.com
wget github.com
If the connectivity is good, the homepage will be downloaded and the output will be shown as below.

root@trends:~# wget github.com
URL transformed to HTTPS due to an HSTS policy
--2020-02-23 10:45:52--  https://github.com/
Resolving github.com (github.com)... 140.82.118.3
Connecting to github.com (github.com)|140.82.118.3|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘index.html’

index.html              [ <=>                ]  131.96K  --.-KB/s    in 0.04s

2020-02-23 10:45:52 (2.89 MB/s) - ‘index.html’ saved [135126]

root@trends:~#
Download multiple files

This is useful if you need to download multiple files at once. It can also give you an idea of how to automate file downloads through scripts.

Let's try downloading the Python 3.8.1 and 3.5.1 files.
wget https://www.python.org/ftp/python/3.8.1/Python-3.8.1.tgz https://www.python.org/ftp/python/3.5.1/Python-3.5.1.tgz
So, as you can guess, the syntax is as below.

wget URL1 URL2 URL3

You just need to make sure there is a space between the URLs.
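To automate this further, you can drive wget from a short script. Below is a minimal sketch that downloads every URL listed in a file, one per line; the file name `urls.txt` and the `fetch_all` helper are illustrative, not part of wget itself. The download command is a parameter (defaulting to `wget -q`) so the loop can be exercised without touching the network.

```shell
#!/bin/sh
# fetch_all -- download every URL listed in a file, one per line.
# Usage: fetch_all urls.txt
# An optional second argument overrides the download command
# (defaults to "wget -q"), which makes the loop easy to dry-run.
fetch_all() {
    list="$1"
    shift
    dl="${*:-wget -q}"
    while IFS= read -r url; do
        # Skip blank lines and comment lines
        case "$url" in ""|\#*) continue ;; esac
        $dl "$url" || echo "failed: $url" >&2
    done < "$list"
}
```

Note that wget also has a built-in `-i file` option that reads URLs from a file, which is usually the simplest choice; the script form is handy when you need per-URL logic such as logging or retries.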
Limit the download speed

This would be helpful if you want to check how much time it takes to download your file at different bandwidths.

Using the --limit-rate option, you can cap the download speed.

Here is the output of downloading the Node.js file.
root@trends:~# wget https://nodejs.org/dist/v12.16.1/node-v12.16.1-linux-x64.tar.xz
--2020-02-23 10:59:58-- https://nodejs.org/dist/v12.16.1/node-v12.16.1-linux-x64.tar.xz
Resolving nodejs.org (nodejs.org)... 104.20.23.46, 104.20.22.46, 2606:4700:10::6814:162e, ...
Connecting to nodejs.org (nodejs.org)|104.20.23.46|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14591852 (14M) [application/x-xz]
Saving to: ‘node-v12.16.1-linux-x64.tar.xz’
node-v12.16.1-linux-x64.tar.xz 100%[===========================================================================================>] 13.92M --.-KB/s in 0.05s
2020-02-23 10:59:58 (272 MB/s) - ‘node-v12.16.1-linux-x64.tar.xz’ saved [14591852/14591852]
Downloading the 13.92 MB file took 0.05 seconds. Now, let's try limiting the speed to 500K.
root@trends:~# wget --limit-rate=500k https://nodejs.org/dist/v12.16.1/node-v12.16.1-linux-x64.tar.xz
--2020-02-23 11:00:18-- https://nodejs.org/dist/v12.16.1/node-v12.16.1-linux-x64.tar.xz
Resolving nodejs.org (nodejs.org)... 104.20.23.46, 104.20.22.46, 2606:4700:10::6814:162e, ...
Connecting to nodejs.org (nodejs.org)|104.20.23.46|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14591852 (14M) [application/x-xz]
Saving to: ‘node-v12.16.1-linux-x64.tar.xz.1’
node-v12.16.1-linux-x64.tar.xz.1 100%[===========================================================================================>] 13.92M 501KB/s in 28s
2020-02-23 11:00:46 (500 KB/s) - ‘node-v12.16.1-linux-x64.tar.xz.1’ saved [14591852/14591852]
Reducing the bandwidth made the download take longer: 28 seconds. Imagine your users are complaining about slow downloads, and you know their network bandwidth is low. You can try --limit-rate to simulate the issue.
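The 28-second figure lines up with simple arithmetic: transfer time is roughly file size divided by rate. A quick shell check, using the byte count from the output above (and taking 1 KB as 1024 bytes):

```shell
# Expected transfer time = size / rate.
# 14591852 bytes (the Node.js tarball above) at 500 KB/s:
size=14591852
rate=$((500 * 1024))
echo "$((size / rate)) seconds"   # prints "28 seconds"
```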
Download in the background

Large files may take a while to download, as in the example above where a speed limit is also set. That is to be expected, but what if you don't want to stare at your terminal?

Well, you can use the -b argument to start wget in the background.
root@trends:~# wget -b https://slack.com
Continuing in background, pid 25430.
Output will be written to ‘wget-log.1’.
root@trends:~#
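Since a background wget writes its progress to a log file (wget-log by default), you can watch it with `tail -f wget-log`, or check for completion by looking for the final "saved" line. A small sketch, where `download_done` is an illustrative helper name:

```shell
#!/bin/sh
# Check whether a background wget download has finished.
# A completed transfer ends with a "... ‘file’ saved [bytes/bytes]" line
# in the log file that wget -b writes.
download_done() {
    grep -q "saved \[" "${1:-wget-log}"
}

# Simulate a finished log entry (the real file is written by wget -b):
printf "‘node.tar.xz’ saved [14591852/14591852]\n" > /tmp/wget-log.test
download_done /tmp/wget-log.test && echo "download complete"
```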
Ignore certificate errors

This is useful if you need to check intranet web applications that don't have the proper certificates. By default, wget raises an error if a certificate is not valid.
root@trends:~# wget https://expired.badssl.com/
--2020-02-23 11:24:59-- https://expired.badssl.com/
Resolving expired.badssl.com (expired.badssl.com)... 104.154.89.105
Connecting to expired.badssl.com (expired.badssl.com)|104.154.89.105|:443... connected.
ERROR: cannot verify expired.badssl.com's certificate, issued by ‘CN=COMODO RSA Domain Validation Secure Server CA,O=COMODO CA Limited,L=Salford,ST=Greater Manchester,C=GB’:
Issued certificate has expired.
To connect to expired.badssl.com insecurely, use `--no-check-certificate'.
The example above is for a URL whose certificate has expired. As you can see, it suggests using --no-check-certificate, which skips certificate validation.
root@trends:~# wget https://untrusted-root.badssl.com/ --no-check-certificate
--2020-02-23 11:33:45-- https://untrusted-root.badssl.com/
Resolving untrusted-root.badssl.com (untrusted-root.badssl.com)... 104.154.89.105
Connecting to untrusted-root.badssl.com (untrusted-root.badssl.com)|104.154.89.105|:443... connected.
WARNING: cannot verify untrusted-root.badssl.com's certificate, issued by ‘CN=BadSSL Untrusted Root Certificate Authority,O=BadSSL,L=San Francisco,ST=California,C=US’:
Self-signed certificate encountered.
HTTP request sent, awaiting response... 200 OK
Length: 600 [text/html]
Saving to: ‘index.html.6’
index.html.6 100%[===========================================================================================>] 600 --.-KB/s in 0s
2020-02-23 11:33:45 (122 MB/s) - ‘index.html.6’ saved [600/600]
root@trends:~#
Cool, right?
HTTP response header
View the HTTP response headers of a given site right on the terminal.

Using -S will print the headers, as you can see below for Coursera.
root@trends:~# wget https://www.coursera.org -S
--2020-02-23 11:47:01-- https://www.coursera.org/
Resolving www.coursera.org (www.coursera.org)... 13.224.241.48, 13.224.241.124, 13.224.241.82, ...
Connecting to www.coursera.org (www.coursera.org)|13.224.241.48|:443... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Content-Type: text/html
Content-Length: 511551
Connection: keep-alive
Cache-Control: private, no-cache, no-store, must-revalidate, max-age=0
Date: Sun, 23 Feb 2020 11:47:01 GMT
etag: W/"7156d-WcZHnHFl4b4aDOL4ZSrXP0iBX3o"
Server: envoy
Set-Cookie: CSRF3-Token=1583322421.s1b4QL6OXSUGHnRI; Max-Age=864000; Expires=Wed, 04 Mar 2020 11:47:02 GMT; Path=/; Domain=.coursera.org
Set-Cookie: __204u=9205355775-1582458421174; Max-Age=31536000; Expires=Mon, 22 Feb 2021 11:47:02 GMT; Path=/; Domain=.coursera.org
Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
X-Content-Type-Options: nosniff
x-coursera-render-mode: html
x-coursera-render-version: v2
X-Coursera-Request-Id: NCnPPlYyEeqfcxIHPk5Gqw
X-Coursera-Trace-Id-Hex: a5ef7028d77ae8f8
x-envoy-upstream-service-time: 1090
X-Frame-Options: SAMEORIGIN
x-powered-by: Express
X-XSS-Protection: 1; mode=block
X-Cache: Miss from cloudfront
Via: 1.1 884d101a3faeefd4fb32a5d2a8a076b7.cloudfront.net (CloudFront)
X-Amz-Cf-Pop: LHR62-C3
X-Amz-Cf-Id: vqvX6ZUQgtZAde62t7qjafIAqHXQ8BLAv8UhkPHwyTMpvH617yeIbQ==
Length: 511551 (500K) [text/html]
Manipulate the user agent

There may be a situation where you want to connect to a site using a custom user agent, or the user agent of a specific browser. This is possible by specifying --user-agent. The example below sets the user agent to MyCustomUserAgent.
root@trends:~# wget https://gf.dev --user-agent="MyCustomUserAgent"
Host header
When an application is still in development, you may not have a URL to test it with. Or maybe you want to test an individual HTTP instance by IP, but you need to supply the host header for the application to work correctly. In this situation, --header comes in handy.

Let's take an example of testing http://10.10.10.1 with the host header application.com.

wget --header="Host: application.com" http://10.10.10.1

Not just the host header; you can inject any header you want.
Connect through a proxy

If you are working in a DMZ environment, you may not be able to access Internet sites directly. But you can take advantage of a proxy to connect.

wget -e use_proxy=yes -e http_proxy=$PROXYHOST:PORT http://externalsite.com

Don't forget to replace $PROXYHOST:PORT with your actual proxy host and port.
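An equivalent approach is to set the standard proxy environment variables for the shell session; wget honors them, as do most other command-line tools. The host name and port below are placeholders:

```shell
# Route wget through a proxy for the current shell session.
# "proxyhost" and 3128 are placeholder values; substitute your real proxy.
export http_proxy="http://proxyhost:3128"
export https_proxy="http://proxyhost:3128"

# Subsequent downloads in this shell now go through the proxy, e.g.:
# wget http://externalsite.com
```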
Connect using a specific TLS protocol

Usually, I would recommend using OpenSSL to test the TLS protocol. But you can also use wget.

wget --secure-protocol=TLSv1_2 https://example.com

The above will force wget to connect over TLS 1.2.
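Building on that, you can loop over the accepted --secure-protocol values to probe which TLS versions a server will negotiate. This is a sketch: `probe_tls` is an illustrative helper, the TLSv1_3 value needs a reasonably recent wget, and the fetch command is a parameter (defaulting to wget) so the loop itself can be tested without network access.

```shell
#!/bin/sh
# probe_tls -- try each TLS version against a host and report the result.
# An optional second argument overrides the fetch command (defaults to
# wget), so the loop can be exercised without network access.
probe_tls() {
    host="$1"
    shift
    fetcher="${*:-wget -q -O /dev/null}"
    for proto in TLSv1 TLSv1_1 TLSv1_2 TLSv1_3; do
        if $fetcher --secure-protocol="$proto" "https://$host/" >/dev/null 2>&1; then
            echo "$proto: accepted"
        else
            echo "$proto: rejected"
        fi
    done
}

# Example (needs network): probe_tls example.com
```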
Conclusion

Knowing the necessary commands can help you at work. I hope the above gives you an idea of what you can do with wget.