Hi, Pentesters! In this article we are going to focus on the Kali Linux tool “CeWL”, which helps you create custom wordlists. Let’s explore this tool and learn about the other options it provides.
Table of Contents:
1. Introduction to Cewl
2. Default Procedure
3. Store this wordlist in a file
4. Generating wordlists of a certain length
5. Retrieval of emails from a website
6. Count the number of words repeated in a website
7. Increase spider depth
8. Verbose Mode
9. Alphanumeric Wordlist
10. Cewl with Digest/Basic Authentication
11. Lowercase all parsed words
12. Proxy Support
Introduction to Cewl:
CeWL (Custom Word List generator) is a Ruby program that crawls a given URL to a defined depth and returns a list of keywords, which password crackers such as John the Ripper, Medusa and WFuzz can then use to crack passwords. CeWL also has an associated command-line app, FAB, which uses the same metadata extraction techniques to generate author/producer lists from files that have already been downloaded.
CeWL comes preinstalled with Kali Linux. With this tool we can easily collect words and phrases from a target page; it is a robust program that can quickly scrape the web server of any website.
Open a terminal in Kali Linux and type “cewl -h” to see the list of all the options it accepts, with a complete description of each.
Syntax: cewl <url> [options]

General Options:
-h, --help: Show help.
-k, --keep: Keep the downloaded file.
-d <x>, --depth <x>: Depth to spider to, default 2.
-m, --min_word_length: Minimum word length, default 3.
-o, --offsite: Let the spider visit other sites.
-w, --write: Write the output to the file.
-u, --ua <agent>: User agent to send.
-n, --no-words: Don't output the wordlist.
--with-numbers: Accept words with numbers in as well as just letters.
-a, --meta: Include meta data.
--meta_file <file>: Output file for meta data.
-e, --email: Include email addresses.
--email_file <file>: Output file for email addresses.
-c, --count: Show the count for each word found.
-v, --verbose: Verbose.
--debug: Extra debug information.

Authentication:
--auth_type: Digest or Basic.
--auth_user: Authentication username.
--auth_pass: Authentication password.

Proxy Support:
--proxy_host: Proxy host.
--proxy_port: Proxy port, default 8080.
--proxy_username: Username for proxy, if required.
--proxy_password: Password for proxy, if required.
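These options can be combined freely in a single run. The sketch below is only illustrative; the target URL and output file names are examples you would replace with your own:

# Spider to depth 2, keep words of 5+ letters, grab emails and meta data,
# and write everything out to separate files
cewl http://www.vulnweb.com -d 2 -m 5 -e --email_file emails.txt -a --meta_file meta.txt -w dict.txt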
Default Procedure:
Use the following command to generate a list of words. It will spider the given URL to the default depth, and the resulting list can be used as a dictionary for cracking passwords.
Command: cewl http://www.vulnweb.com
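To get a quick feel for how much the spider found, you can pipe the output through standard shell tools; the count below is only approximate, since any banner or blank lines cewl prints are included as well:

# Rough count of candidate words returned by the default crawl
cewl http://www.vulnweb.com | wc -l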
Store this wordlist in a file:
To save the whole wordlist to a file for record keeping, efficiency and readability, use the -w option to write the output to a text file.
Command: cewl http://www.vulnweb.com -w dict.txt
Here dict.txt is the name of the file where the wordlist will be stored. Once the file has been created, you can open it to confirm that the output has been saved.
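Once the wordlist is on disk, you can preview it and feed it straight to a cracker. A minimal sketch, assuming you already have a file of captured hashes (hashes.txt here is hypothetical):

head dict.txt                          # preview the first few generated words
john --wordlist=dict.txt hashes.txt    # use the wordlist with John the Ripper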
Generating wordlists of a certain length:
If you want to create a wordlist with a specific minimum word length, use the -m option to set the minimum number of characters per keyword.
Command: cewl http://vulnweb.com -m 10 -w dict.txt
This creates a wordlist in which every word has at least 10 characters and stores the keywords in the file dict.txt. A screenshot is attached for your reference.
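As a quick sanity check that the minimum length was honoured, you can filter the saved file with awk (this is plain shell, nothing cewl-specific):

# Print any saved word shorter than 10 characters; ideally this prints nothing
awk 'length($0) < 10' dict.txt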
Retrieval of emails from a website:
To retrieve email addresses from a website we can use the -e option, while the -n option suppresses the wordlist that would otherwise be produced while crawling the site. As you can see in the attached screenshot, it found 1 email address on the website.
Command: cewl https://digi.ninja/contact.php -e -n
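If you prefer to keep the addresses for later use, the --email_file option listed earlier writes them to their own file; emails.txt below is just an example name:

cewl https://digi.ninja/contact.php -e -n --email_file emails.txt
cat emails.txt    # review the harvested addresses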
To count the number of words repeated in a website:
If you want to count the number of times each word is repeated on a website, use the -c option to enable the count parameter.
Command: cewl http://www.vulnweb.com -c
For your reference, a screenshot is added below showing the count for every keyword found on the website.
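If, as in current CeWL versions, each line of this output ends with a comma and the count, the busiest words can be sorted to the top with standard shell tools (the sorting itself is not part of cewl):

# Show the 20 most frequently occurring words first
cewl http://www.vulnweb.com -c | sort -t ',' -k 2 -rn | head -n 20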
Increase Spider depth:
You can use the -d option with a depth value to make the spider crawl deeper and more thoroughly, so that a larger list of words is created. The depth is set to 2 by default.
Command: cewl http://vulnweb.com -d 3
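A simple way to see the effect of the depth parameter is to compare the amount of output at two different depths (expect the deeper crawl to take noticeably longer):

cewl http://vulnweb.com -d 1 | wc -l    # shallow crawl
cewl http://vulnweb.com -d 3 | wc -l    # deeper crawl, usually many more words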
Verbose Mode:
CeWL has a -v option for verbose mode, which extends the crawling output and shows the full details of what is being retrieved from the website.
Command: cewl http://vulnweb.com -v
This displays the extended crawl output. A screenshot is attached below so that you get a clear idea.
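Assuming the verbose messages go to standard output, tee lets you watch them live and keep a copy for later review (crawl.log is just an example name):

# Save the wordlist with -w and keep the verbose crawl log in crawl.log
cewl http://vulnweb.com -v -w dict.txt | tee crawl.log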
Alphanumeric Wordlist:
Sometimes you may need an alphanumeric wordlist; for that, use the --with-numbers option to include words containing numbers as well as letters.
Command: cewl http://testphp.vulnweb.com/artists.php --with-numbers
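The --with-numbers option combines with the other flags in the usual way; a short sketch that also enforces a minimum length and saves the result (alnum.txt is an example name):

cewl http://testphp.vulnweb.com/artists.php --with-numbers -m 6 -w alnum.txt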
Cewl with Digest/Basic Authentication:
Sometimes a web application has an authentication page for login, and in that case the basic commands above will not give the desired results. To crawl past the login you need to supply credentials, using the command given below.
Command: cewl http://testphp.vulnweb.com/login.php --auth_type Digest --auth_user test --auth_pass test -v
In this command we have used the following options:
--auth_type: Digest or Basic
--auth_user: Authentication username
--auth_pass: Authentication password
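For a site protected by Basic authentication rather than Digest, the same options apply and only the --auth_type value changes; the credentials below are placeholders:

cewl http://testphp.vulnweb.com/login.php --auth_type basic --auth_user test --auth_pass test -w dict.txt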
Lowercase all parsed words:
When you need the keywords to be generated in lowercase, use the --lowercase option.
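A minimal sketch, reusing the earlier example target:

cewl http://www.vulnweb.com --lowercase -w dict.txt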
Proxy Support:
The default cewl command will not work properly if a proxy server sits between you and the target. We tried to access the application directly through its IP address, but because the proxy server is in the way we got a Forbidden error page. If we run the default cewl command here, it will simply generate a wordlist from that error page. To get the proper wordlist of the web application, we therefore pass the proxy details to cewl:
Command: cewl http://192.168.1.141 --proxy_host 192.168.1.141 --proxy_port 3128
In this command we have used the following options:
--proxy_host: your proxy host
--proxy_port: the port number of your proxy
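If the proxy itself requires credentials, the --proxy_username and --proxy_password options listed earlier can be added as well; the values below are placeholders:

cewl http://192.168.1.141 --proxy_host 192.168.1.141 --proxy_port 3128 --proxy_username user --proxy_password pass -w dict.txt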
Author: Divya Adwani is a researcher and technical writer who is keen and enthusiastic to learn ethical hacking.