A pentester is only as good as their tools, and when it comes to cracking passwords, stressing authentication panels, or even a simple directory bruteforce, it all boils down to the wordlists you use. Today we are going to understand wordlists, look around for some good ones, run some tools to manage them, and much more.
Table of Contents
· Introduction
· What are Wordlists?
· Built-in Wordlists
  o Kali Linux Wordlists
  o Dirb Wordlists
  o Rockyou Wordlist
  o Wfuzz Wordlists
· Online Wordlists
  o GitHub Wordlists
  o Seclists Wordlists
  o Assetnote Wordlists
  o Packet Storm Wordlists
· Cleaning Wordlists
· Crafting Wordlists
  o CeWL
  o Crunch
  o Cupp
  o Pydictor
  o Bopscrk
  o BEWGor
  o Dymerge
  o Mentalist
· Conclusion
Introduction
Ever since penetration testing emerged as a discipline, one scene has kept repeating: the attacker cracks the target's password and gets in. Movies and series show this situation in detail because it is the simplest attack to depict. However simple they may be, password cracking and Credential Stuffing were once a bane of Web Applications. Today we have gained some control over them with CAPTCHA and Rate Limiting, but they are still among the most effective attacks. The soul of such attacks is the wordlist.
What are Wordlists?
A wordlist is a file (a text file in most cases, but not limited to it) that contains a set of values that the attacker supplies to test a mechanism. That sounds a bit complex, so let's break it down. Whenever an attacker is faced with an Authentication Mechanism, they can try to work around it, but if that is not possible, the attacker has to try some well-known credentials against the Authentication Mechanism and guess. This list of well-known credentials is a wordlist. Instead of manually entering the values one by one, the attacker uses a tool or script to automate the process. Similarly, when cracking hash values, the tool hashes each wordlist entry with the same algorithm and then compares the result with the target hash. If a match is found, the hash is deemed cracked. It can be observed that the importance of wordlists is paramount in the Cyber Security World.
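To make the hash-cracking idea above concrete, here is a minimal sketch in Python. It assumes the target is an unsalted SHA-256 hash; the function name crack_hash and the sample words are hypothetical and not taken from any particular tool.

```python
import hashlib
from typing import Iterable, Optional


def crack_hash(target_hex: str, candidates: Iterable[str],
               algorithm: str = "sha256") -> Optional[str]:
    """Return the wordlist entry whose digest equals target_hex, or None."""
    target_hex = target_hex.lower()
    for word in candidates:
        word = word.rstrip("\r\n")  # wordlist lines usually end with a newline
        # Hash the candidate with the same algorithm as the target hash.
        digest = hashlib.new(algorithm, word.encode("utf-8")).hexdigest()
        if digest == target_hex:
            return word  # match found: the hash is considered cracked
    return None


if __name__ == "__main__":
    wordlist = ["123456", "password", "letmein"]  # stand-in for a real wordlist
    target = hashlib.sha256(b"letmein").hexdigest()
    print(crack_hash(target, wordlist))
```

Dedicated crackers such as Hashcat and John the Ripper do the same comparison far faster and support salted and iterated formats, so treat this only as an illustration of the principle.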
Wordlists in Kali Linux
Since Kali Linux was specially crafted to perform Penetration Testing, it ships with many kinds of wordlists. This is because of the various tools present in Kali Linux that perform Bruteforce Attacks on Logins, Directories, etc. Let's go through some of the wordlists from the huge arsenal that Kali Linux contains.
Wordlists are located inside the /usr/share/wordlists directory. Here, we have the dirb directory, which holds the wordlists used with the dirb tool to perform Directory Bruteforce. Then we have dirbuster, a similar tool that also performs Directory Bruteforce but with some additional options. Then we have the fern-wifi directory, which helps to break Wi-Fi Authentication. Then we have Metasploit, which uses wordlists for almost everything. Then there is the nmap wordlist, which contains entries that can be used while scanning some specific services. Then we have the rockstar of wordlists: rockyou. It is compressed by default, so you will have to extract it before using it. It is very large, with more than 14 million values that could be passwords for a lot of user accounts on the internet. Finally, we have the wfuzz directory, whose wordlists can be used together with wfuzz.
Location: /usr/share/wordlists
Dirb Wordlists
To take a closer look at one of the directories, we use the tree command to list all the wordlists inside the dirb directory. Here we have different wordlists that differ in size and language. There is an extensions wordlist too, so the attacker can also test for file extensions during a Directory Bruteforce. There are some application-specific wordlists as well, such as apache.txt and sharepoint.txt.
Location: /usr/share/wordlists/dirb
Rockyou Wordlist
Rockyou.txt is a set of compromised passwords from the social media application developer RockYou, which developed widgets for the Myspace application. In December 2009, the company experienced a data breach resulting in the exposure of more than 32 million user accounts. This was mainly because of the company's policy of storing passwords in cleartext.
Location: /usr/share/wordlists
On a fresh Kali Linux installation, the wordlist is compressed as a gz file. To decompress it, run the following command. The wordlist is then ready for use in any kind of attack you want.
gzip -d /usr/share/wordlists/rockyou.txt.gz
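If you would rather not decompress the file on disk, a script can read the compressed wordlist directly. The sketch below uses Python's gzip module; the function name iter_gz_wordlist is hypothetical, and the latin-1 encoding is an assumption made because rockyou.txt is commonly reported to contain bytes that are not valid UTF-8.

```python
import gzip
from typing import Iterator


def iter_gz_wordlist(path: str) -> Iterator[str]:
    """Yield the entries of a gzip-compressed wordlist, one per line."""
    # Assumption: latin-1 is used because it can decode any byte value,
    # so a stray non-UTF-8 entry does not abort the whole read.
    with gzip.open(path, mode="rt", encoding="latin-1") as handle:
        for line in handle:
            entry = line.rstrip("\r\n")
            if entry:  # skip blank lines
                yield entry
```

Calling iter_gz_wordlist("/usr/share/wordlists/rockyou.txt.gz") would then stream the entries lazily instead of loading the whole list into memory.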
Wfuzz Wordlists
The Wfuzz tool was developed to perform Bruteforce attacks on web applications. It can also be used to enumerate web applications: directories, files, scripts, and so on. It can change the request from GET to POST as well, which is helpful in a bunch of scenarios such as checking for SQL Injections. It comes with a set of predefined wordlists. These wordlists are designed to be used with wfuzz, but they can be used anywhere you desire. The wordlists are divided into categories such as general, Injections, stress, vulns, web services, and others.
Location: /usr/share/wordlists/wfuzz
Looking into the Injections directory, we see All_attack.txt, a fairly generic wordlist for testing injections. Then there are specific ones for SQL, Directory Traversal, XML, and XSS injections. Moving on to the general directory, we see big.txt, which we discussed in the Dirb section. We have common.txt, which is the default wordlist in many tools due to its small size. Then we have extensions_common.txt, which contains around 25 extensions that can be used to enumerate files considered low-hanging fruit. Then we have the http_methods.txt wordlist. It contains HTTP Methods such as POST, GET, and PUT. They can be used to test whether the target application has any misconfigured methods enabled, or methods that nobody disabled at the application and server level. mutations_common.txt contains a bunch of uncommon extensions that could lead to the enumeration of rare artifacts.
Then we have the spanish.txt wordlist for, as you have guessed, Spanish words, names, and passwords. The others directory contains common passwords and names that can be used to enumerate usernames or passwords on a forgot-password form that responds with messages revealing whether the user exists. Let's move on to the stress directory. It contains wordlists designed to stress test a mechanism, made up of alphabets, numbers, special characters, and their hex codes. Then we have the vulns directory, which contains wordlists made specifically for testing a particular vulnerability. We have the apache wordlist, CGI wordlist, directory wordlist, iis wordlist, oracle9 wordlist, SharePoint wordlist, tomcat wordlist, and many more. Use these wordlists in a specific scenario where you have confirmed the framework and version information, and use them to target a particular entry point.
GitHub Wordlists
We learned about the huge collection that Kali Linux contains. But sometimes those wordlists are not as up to date as we require. This can happen when a new 0-day has been discovered: there will be no entry for it in those dictionaries. We could go wild searching the internet, but it is vast and that takes time. This is where we can snoop around GitHub, as many people might have created such a dictionary. So, searching GitHub might give you those new and fresh dictionaries, or it can help you find the specific dictionary that you require to fuzz a specific framework.
Link: GitHub Wordlists
Seclists
Seclists is a collection of multiple types of wordlists that can be used during Penetration Testing or Vulnerability Assessment, all collected in one place. These wordlists can contain usernames, passwords, URLs, sensitive data patterns, fuzzing payloads, web shells, etc. To install it on Kali Linux, we use the apt command followed by seclists, as shown below.
GitHub: Seclists
apt install seclists
The installation will create a directory named seclists inside the /usr/share location. Going through it, we can see the different categories of wordlists, such as Discovery, Fuzzing, IOCs, Misc, Passwords, Pattern Matching, Payloads, Usernames, and Web-Shells.
Assetnote Wordlists
Assetnote Wordlists releases specially curated wordlists for a wide range of areas, such as subdomain discovery or special artifact discovery. The best part is that they get updated on the 28th of each month, as per their website. This is the next best thing released since Seclists. To download all the wordlists at once, anybody can use the following wget command.
Website: Assetnote Wordlists
wget -r --no-parent -R "index.html*" https://wordlists-cdn.assetnote.io/ -nH
Packet Storm Wordlists
Packet Storm Security is an information security website that offers current and historical computer security tools, exploits, and security advisories. It is operated by a group of security enthusiasts who publish new security information and offer tools for educational and testing purposes. Much to our surprise, it also publishes wordlists. Any user who has crafted a specialized wordlist can submit it to the website. So, if you are looking for a unique wordlist, be sure to check it out.
Link: Packet Storm Security Wordlists
Cleaning Wordlists
So far we have seen multiple wordlists that contain thousands and thousands of entries. During penetration testing on your own vulnerable server or in a CTF, that is probably fine, as they are designed to handle this kind of bruteforce, but in a real-life scenario things get a little complicated. In real life, no development team or owner is going to permit you to bruteforce with thousands upon thousands of entries, as this can hamper the quality of service for other customers. So, we should reduce the number of wordlist entries. I know it sounds counterproductive, but it is not. The wordlists might contain payloads that exceed 100 characters or are too specific to extract anything directly. Then there are payloads so similar to each other that swapping one for another does not change the result. Jon Barber created a script that can remove noisy characters such as ! ( , % and tidy the wordlist so that it can be more effective.
GitHub: CleanWordlist.sh
./clean_wordlists.sh HTML5sec-Injections-Jhaddix.txt
We can check the lines that were removed from the HTML5 Injection wordlist using the diff command.
diff HTML5sec-Injections-Jhaddix.txt_cleaned <(sort HTML5sec-Injections-Jhaddix.txt) | more
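The cleaning idea described in this section can be sketched in a few lines of Python. This is not the logic of the script mentioned above; it is a simplified illustration that assumes we drop any entry that is empty, longer than 100 characters, or contains one of the noisy characters named in the text, and then de-duplicate and sort. The names clean_wordlist and NOISY_CHARS are hypothetical.

```python
from typing import Iterable, List

# Assumption: the noisy characters are just the examples given in the text.
NOISY_CHARS = set("!(,%")


def clean_wordlist(entries: Iterable[str], max_len: int = 100) -> List[str]:
    """Return a sorted, de-duplicated wordlist without long or noisy entries."""
    cleaned = set()
    for raw in entries:
        entry = raw.strip()
        if not entry or len(entry) > max_len:
            continue  # drop empty and overly long payloads
        if any(ch in NOISY_CHARS for ch in entry):
            continue  # drop entries that contain a noisy character
        cleaned.add(entry)  # the set removes exact duplicates
    return sorted(cleaned)
```

The set of removed entries is then simply set(original) - set(cleaned), which plays the same role as the diff command above.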
Crafting Wordlists: CeWL
CeWL is a Ruby application that spiders a given URL to a specified depth, optionally following external links, and returns a list of words that can then be used with password crackers such as John the Ripper. CeWL also has an associated command-line app, FAB (Files Already Bagged), which uses the same metadata extraction techniques to create author/creator lists from already downloaded files. Here we are running CeWL against the target URL and saving the output into a wordlist by the name of dict.txt.
GitHub: CeWL - Custom Word List generator
Learn More: Comprehensive Guide on CeWL Tool
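To illustrate the idea of turning a page into a wordlist, here is a small Python sketch that works on an HTML string that has already been downloaded. It is not how CeWL is implemented and it does no spidering; the function name words_from_html and the min_length parameter are hypothetical.

```python
import re
from html.parser import HTMLParser
from typing import List


class _TextCollector(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def words_from_html(html: str, min_length: int = 3) -> List[str]:
    """Return unique words from the page text, in order of first appearance."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    words = re.findall(r"[A-Za-z0-9]+", " ".join(parser.chunks))
    seen = set()
    unique = []
    for word in words:
        if len(word) >= min_length and word not in seen:
            seen.add(word)
            unique.append(word)
    return unique
```

Writing the returned list to a file, one word per line, gives a site-specific wordlist in the same spirit as dict.txt.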
Crafting Wordlists: Crunch
Crunch is a wordlist generator where you can use a standard character set or specify your own. Crunch can generate all possible combinations and permutations. Here, we used crunch to craft a wordlist with a minimum of 2 and a maximum of 3 characters, writing the output to a wordlist by the name of dict.txt.
Learn More: Comprehensive Guide on Crunch Tool
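The exhaustive generation described above can be sketched with Python's itertools.product. This is an illustration of the technique and not Crunch itself; the function name generate_words is hypothetical, and the lowercase default character set is an assumption.

```python
from itertools import product
from typing import Iterator


def generate_words(min_len: int, max_len: int,
                   charset: str = "abcdefghijklmnopqrstuvwxyz") -> Iterator[str]:
    """Yield every string over charset with length min_len..max_len."""
    for length in range(min_len, max_len + 1):
        # product(...) enumerates all length-sized tuples of characters.
        for combo in product(charset, repeat=length):
            yield "".join(combo)
```

With 26 lowercase letters and lengths 2 to 3, this yields 26² + 26³ = 676 + 17,576 = 18,252 entries, which shows how quickly exhaustive wordlists grow as the maximum length increases.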
Crafting Wordlists: Cupp
A weak password might be very short or only use alphanumeric characters, making it simple to crack. A weak password can also be easily guessed by someone profiling the user, such as a birthday, nickname, address, name of a pet or relative, or a common word such as God, love, money, or password. This is where Cupp comes into use, as it can be used in situations like legal penetration tests or forensic crime investigations. Here, we are creating a wordlist that is specific to a person named Raj. We enter the details and, upon submission, we have a wordlist generated especially for this user.
GitHub: CUPP - Common User Passwords Profiler
Learn More: Comprehensive Guide on Cupp – A Wordlist Generating Tool
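The profiling idea can be sketched as combining personal words with pieces of a birthdate. This is a simplified illustration, not Cupp's actual algorithm; the function name profile_candidates, the DDMMYYYY input format, and the chosen suffixes are all assumptions.

```python
from itertools import product
from typing import Iterable, List


def profile_candidates(words: Iterable[str], birthdate: str) -> List[str]:
    """Combine personal words with date fragments (birthdate as DDMMYYYY)."""
    day, month, year = birthdate[:2], birthdate[2:4], birthdate[4:]
    suffixes = ["", day, month, year, year[-2:], day + month, birthdate]

    # Lowercase and capitalized forms of every word, without duplicates.
    bases: List[str] = []
    for word in words:
        for variant in (word.lower(), word.capitalize()):
            if variant not in bases:
                bases.append(variant)

    candidates: List[str] = []
    for base, suffix in product(bases, suffixes):
        candidate = base + suffix
        if candidate not in candidates:
            candidates.append(candidate)
    return candidates
```

For example, profile_candidates(["raj"], "01021990") contains guesses such as Raj1990 and raj0102. Real profilers add many more inputs (pet names, partner names, special characters) on top of this basic scheme.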
Crafting Wordlists: Pydictor
Pydictor is one of those tools that both novices and pros can appreciate. It is a dictionary-building tool that is great to have in your arsenal when dealing with password strength tests. The tool offers a plethora of features that can be used to create the perfect dictionary for pretty much any kind of testing situation. Here, we set a numeric base with a length of 5 and then create a wordlist. The wordlist contains numeric entries up to 5 digits long.
GitHub: pydictor
Learn More: Comprehensive Guide on Pydictor – A Wordlist Generating Tool
Crafting Wordlists: Bopscrk
Bopscrk (Before Outset PaSsword CRacKing) is a tool to generate smart and powerful wordlists for targeted attacks. It has been part of BlackArch Linux for as long as we can remember. It takes personal information related to the target, combines every word, and transforms the combinations into possible passwords. It also contains a lyric pass module, which allows it to search for lyrics related to the target's favorite artist and include them in the wordlist.
GitHub: Bopscrk
Here, we can see that the wordlist crafted from the details we provided is neat and has a high chance of containing the actual password of the user Raj.
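One kind of transformation such tools apply is leet substitution, where letters are swapped for look-alike digits. The sketch below is a generic illustration and not Bopscrk's code; the LEET_MAP table and the function name leet_variants are hypothetical.

```python
from itertools import product
from typing import Dict, List

# Hypothetical substitution table; real tools ship larger, configurable ones.
LEET_MAP: Dict[str, str] = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5"}


def leet_variants(word: str) -> List[str]:
    """Return every combination of original and substituted characters."""
    options = []
    for ch in word:
        substitute = LEET_MAP.get(ch.lower())
        # Each position offers the original character and, if one exists,
        # its leet replacement.
        options.append((ch, substitute) if substitute else (ch,))
    return ["".join(combo) for combo in product(*options)]
```

A word with n substitutable letters produces 2ⁿ variants, so leet_variants("pass") returns 8 candidates, including p455.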
Crafting Wordlists: BEWGor
For starters, let's begin with the pronunciation: it is pronounced Booger. I know, it is not easy to wrap your head around. BEWGor is designed to help with ensuring password security. It is a Python script that prompts the user for biographical data about a person, referred to as the Subject. This data is then used to create likely passwords for that Subject. BEWGor is heavily based on Cupp, but they differ in some ways: it collects far more detailed information on the main Subject, it supports an arbitrary number of family members and pets, and it lets users generate possible passwords through permutations. BEWGor can also generate huge numbers of passwords, create Upper/Lower/Reverse variations of inputted values, save raw inputted values to a Terms file before variations are generated, set upper and lower limits on output line length, and check that an inputted birthday is valid. Birthdays must not be in the future, on a false leap day, on June 32nd, etc.
GitHub: BEWGor - Bull's Eye Wordlist Generator
After working for a while, we see
that we have a refined wordlist for the user Raj. It can now be used to
bruteforce the credentials of Raj.
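The birthday check mentioned above (no future dates, no false leap days, no June 32nd) can be sketched with Python's datetime module. This is an illustration of the validation rule, not BEWGor's own code; the function name is_valid_birthday and the optional today parameter are hypothetical.

```python
from datetime import date
from typing import Optional


def is_valid_birthday(year: int, month: int, day: int,
                      today: Optional[date] = None) -> bool:
    """Return True if the date exists on the calendar and is not in the future."""
    today = today or date.today()
    try:
        # date() raises ValueError for impossible dates such as June 32nd
        # or February 29th in a non-leap year.
        birthday = date(year, month, day)
    except ValueError:
        return False
    return birthday <= today
```

Passing today explicitly keeps the check reproducible, for example is_valid_birthday(2023, 2, 29, today=date(2024, 1, 1)) is False because 2023 is not a leap year.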
Merging Wordlists: DyMerge
DyMerge is a simple yet powerful tool, written purely in Python, that takes given wordlists and merges them into one dynamic dictionary that can then be used as ammunition for a successful dictionary-based (or bruteforce) attack.
GitHub: DyMerge - Dynamic Dictionary Merger
Learn More: Comprehensive Guide on Dymerge
Here, we have two wordlists, 1.txt and 2.txt, each containing 5 entries. We will use DyMerge to combine both wordlists.
Running DyMerge, we provide result.txt as the wordlist to be created by merging 1.txt and 2.txt. It can be observed that result.txt has 10 entries, drawn from both wordlists.
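The merge step can be sketched in Python as follows. This is not DyMerge's implementation; the function name merge_wordlists is hypothetical, and removing duplicates while keeping first-seen order is an assumption about what a useful merge should do.

```python
from typing import Iterable, List


def merge_wordlists(*wordlists: Iterable[str]) -> List[str]:
    """Merge several wordlists, keeping the first occurrence of each entry."""
    merged: List[str] = []
    seen = set()
    for wordlist in wordlists:
        for raw in wordlist:
            entry = raw.strip()
            if entry and entry not in seen:
                seen.add(entry)
                merged.append(entry)
    return merged
```

Two lists of 5 distinct entries with no overlap give 10 merged entries, matching the result above. Any entry present in both lists would be counted only once in this sketch.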
Crafting Wordlists: Mentalist
Mentalist is a GUI tool for crafting custom wordlists. It uses common human paradigms for constructing passwords. It can output the full wordlist of passwords, but it can also create rules compatible with Hashcat and John the Ripper.
It generates wordlists by joining nodes, which in turn take the shape of a chain. The initial node in the chain is called the Base Words node. Each base word is then passed to the next node in the chain as it is processed. That's how the words get modified throughout the chain. After working through the chain, it finally writes the result into the specified file or converts it into rules, as per the user's request.
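The chain of nodes described above can be modelled as a pipeline of functions, each taking a stream of words and yielding modified words. This is a conceptual sketch and not Mentalist's code; the names case_node, append_node, and run_chain are hypothetical.

```python
from typing import Callable, Iterable, Iterator, List

Node = Callable[[Iterable[str]], Iterator[str]]


def case_node(words: Iterable[str]) -> Iterator[str]:
    """Pass each word through unchanged, plus a capitalized variant."""
    for word in words:
        yield word
        if word.capitalize() != word:
            yield word.capitalize()


def append_node(suffixes: List[str]) -> Node:
    """Build a node that appends every suffix to every incoming word."""
    def node(words: Iterable[str]) -> Iterator[str]:
        for word in words:
            for suffix in suffixes:
                yield word + suffix
    return node


def run_chain(base_words: Iterable[str], nodes: List[Node]) -> List[str]:
    """Feed the base words through each node in order and collect the output."""
    stream: Iterable[str] = iter(base_words)
    for node in nodes:
        stream = node(stream)  # the output of one node feeds the next
    return list(stream)
```

For instance, run_chain(["summer"], [case_node, append_node(["", "!"])]) produces summer, summer!, Summer, and Summer!. Because every node multiplies the number of words, long chains grow quickly.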
Hashcat/John Rules
For offline cracking, there are
times where the full wordlist is too large to output as a whole. In this case,
it makes sense to output to rules so that Hashcat or John can programmatically
generate the full wordlist. Download the release from GitHub.
GitHub: Mentalist
We are using Windows here to demonstrate Mentalist. We have chosen the English Dictionary as the Base Words. It calculates that 235,886 possible keywords can be manipulated into passwords by taking the English dictionary as a base. Then we provide some additional options, such as Case, whether we want to substitute entries, and whether we want to add a special character after each entry.
After running for a while, it has
crafted a text file by the name of dict.txt. It contains all the passwords that
were possible to craft as per our requirements.
Conclusion
The point we are trying to convey through this article is that a wordlist is one of the most important assets a penetration tester can have. There are multiple resources for getting a wordlist and multiple tools for crafting one of your own. We wanted this article to serve as your go-to guide whenever you are trying to learn about or use a wordlist, or any of the tools to craft one.