Mlget - For all Your Malware Download Needs
Custom tool to download malware from a variety of sources. Can also upload to MWDB instances of your choosing; complete with comments and tags!
Source | Releases |
License Type | MIT License |
Version At Blog Release | v2.1 |
Summary
Mlget is a command line tool for downloading malware samples by file hash from multiple sources. Mlget supports the three common hash types: MD5, SHA1, and SHA256. Mlget can also upload samples to multiple MWDB instances.
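To make the three hash types concrete, here is a minimal sketch that tells them apart by the length of their hex digest (32, 40, and 64 characters). This is an illustration only and not mlget's own code; the function name classify_hash is hypothetical.

```python
import re
from typing import Optional

# Hex digest lengths for the three hash types the text lists.
HASH_TYPE_BY_LENGTH = {32: "md5", 40: "sha1", 64: "sha256"}
_HEX_ONLY = re.compile(r"[0-9a-fA-F]+")


def classify_hash(value: str) -> Optional[str]:
    """Return "md5", "sha1", or "sha256", or None if value is not such a digest."""
    value = value.strip()
    if not _HEX_ONLY.fullmatch(value):
        return None  # empty or contains non-hex characters
    return HASH_TYPE_BY_LENGTH.get(len(value))
```

A check like this is handy for filtering a list of hashes before passing them to a tool that accepts only these three types.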
Important
- The user controls the API endpoint URLs queried.
- All URLs are set in the config file (.mlget.yml).
- The user controls what MWDB instances can be uploaded to.
- These are defined using the UploadMWDB type flag in the config.
Problem Statements
This tool was born out of needing to solve challenges around downloading, uploading, and recording hashes not found when hunting for samples.
Problem Statement 1: Where is the hash?
Many samples can be found on VirusTotal, but not always; or maybe we don’t have access to VirusTotal. Mlget strives to solve the problem of where to download the hash by querying different services in the hopes one of them has it.
Currently there are 10 different services mlget will query; all but one require an API key.
- CAPE Sandbox
- InQuest
- Hybrid Analysis
- Joe Sandbox
- MalwareBazaar - No API key required for download
- Malshare
- MWDB CERT Polska - Really any up to date MWDB instance
- Polyswarm
- Triage
- VirusTotal
Any of these can be added more than once, but it does not always make sense to do so. For example, if you have access to both a local CAPE instance and the public CAPE instance, one entry for each can be added to the configuration (.mlget.yml). For others, like MalwareBazaar, it only makes sense to have one entry in the config.
Some services are missing from this list, but I hope to add more over time, such as:
- Reversing Labs
- UnpackMe
- VMRay
Problem Statement 2: How can I download and upload simultaneously?
A common use case mlget strives to solve is uploading the downloaded files to MWDB without requiring additional steps from the user. Mlget does this via the --uploaddelete and --upload flags. When either of these flags is set and one or more UploadMWDB instances are defined in the config (.mlget.yml), the sample gets uploaded to all of the UploadMWDB instances. If a hash is found on one UploadMWDB instance but not another, mlget uses that copy to upload to the other UploadMWDB instances.
Mlget can also add comments (--comment) and tags (--tag) to UploadMWDB when the sample is uploaded. Tags and comments are still added if the sample already exists. Mlget checks the UploadMWDB instances before querying the rest of the sources to see if the sample already exists. It only checks the UploadMWDB instances if either the --uploaddelete or --upload flag is present. The --uploaddelete flag will delete the local copy of the sample after it’s been successfully uploaded to the UploadMWDB instances.
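The upload behavior described above can be sketched as a small planning step. This is one reading of the text, not mlget's actual implementation: the Plan type, the plan_upload function, and the instance names are hypothetical, and the assumption that the other sources are skipped once an UploadMWDB instance already holds the sample is an inference from the description.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Plan:
    query_other_sources: bool          # fetch from the regular download services?
    copy_from: Optional[str] = None    # UploadMWDB instance that already has the sample
    upload_to: List[str] = field(default_factory=list)  # instances still missing it


def plan_upload(upload_flag_set: bool, has_sample: Dict[str, bool]) -> Plan:
    """has_sample maps each UploadMWDB instance name to whether it holds the sample."""
    if not upload_flag_set:
        # Without --upload or --uploaddelete the UploadMWDB instances are not checked.
        return Plan(query_other_sources=True)
    holders = [name for name, present in has_sample.items() if present]
    missing = [name for name, present in has_sample.items() if not present]
    if holders:
        # Assumption: an existing copy is reused instead of querying other sources.
        return Plan(False, holders[0], missing)
    return Plan(True, None, missing)
```

For example, with two instances where only the first holds the sample, the plan copies from the first and uploads to the second without touching the other services.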
Problem Statement 3: What was I looking for again?
Sadly, sometimes the hash can’t be found across the different services, so mlget records these not-found hashes to a file when either --output or --readupdate is passed to it. Mlget stores the hash, any tags, and all of the comments in the output file. The --output flag will create a new file whose name is the current date and time prepended to either _not_found_hashes.txt or the name of the file read in from --read. This file can be fed back to mlget using the --read or --readupdate flags.
As the name implies, --readupdate combines both of these actions into one. It will read in the hashes stored in the file and then update the same file with the hashes not found.
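The read-then-update cycle can be sketched as follows. This is a simplified illustration, not mlget's code: it assumes a file with one bare hash per line and ignores the tags and comments that mlget also stores, since the text does not describe their on-disk format. The names parse_hash_lines and read_update, and the download callback, are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def parse_hash_lines(text: str) -> List[str]:
    """Return the non-empty, stripped lines of a hash file."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def read_update(path: Path, download: Callable[[str], bool]) -> List[str]:
    """Try every hash in the file, then rewrite the file with those not found.

    download(hash) is a stand-in that returns True when a sample was retrieved.
    """
    not_found = [h for h in parse_hash_lines(path.read_text()) if not download(h)]
    path.write_text("".join(h + "\n" for h in not_found))
    return not_found


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        hash_file = Path(tmp) / "hashes.txt"
        hash_file.write_text("a" * 64 + "\n" + "b" * 64 + "\n")
        # Pretend only hashes starting with "a" can be found.
        print(read_update(hash_file, lambda h: h.startswith("a")))
```

Because the same file is rewritten on each run, feeding it back repeatedly shrinks it as samples become available.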
Installing
The only prerequisite is Go. The latest version should work.
Steps
- Download the latest release and extract it
- From inside the extracted folder run:
go get -u
go mod tidy
go build
- Run mlget with no parameters to have it walk through the config creation process.
- You can also create the config using your preferred text editor.
Adding mlget to your path is recommended. Also consider creating an alias if you find yourself using the same flags over and over again, like --readupdate and --uploaddelete.
Config Format
The config is stored in the user’s home directory (os.UserHomeDir()) and is called .mlget.yml. Version 2.x’s format looks like this:
repository 0:
  type: MalwareBazaar
  url: https://mb-api.abuse.ch/api/v1
  api: ""
  queryorder: 1
repository 1:
  type: CapeSandbox
  url: https://www.capesandbox.com/apiv2
  api: users_api_key_goes_here
  queryorder: 2
repository 2:
  type: UploadMWDB
  url: users_instance_of_mwdb:port/api
  api: users_api_key_goes_here
  queryorder: 0
Use queryorder to control the order the services are searched.
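As a rough illustration of how queryorder could drive the search, the sketch below sorts repository entries and stops at the first service that returns the sample. It is not mlget's code. Two points are assumptions: that lower queryorder values are queried first, and that the search stops at the first hit. The fetch callback and the function names are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

Repo = Dict[str, object]  # mirrors one "repository N" entry from the config


def in_query_order(repos: List[Repo]) -> List[Repo]:
    """Sort repositories by queryorder (assumed: lowest value is queried first)."""
    return sorted(repos, key=lambda repo: repo["queryorder"])


def first_hit(
    sample_hash: str,
    repos: List[Repo],
    fetch: Callable[[Repo, str], Optional[bytes]],
) -> Optional[Tuple[str, bytes]]:
    """Query each repository in order; return (type, data) from the first hit."""
    for repo in in_query_order(repos):
        data = fetch(repo, sample_hash)  # fetch is a stand-in for a real API call
        if data is not None:
            return str(repo["type"]), data
    return None  # no service had the sample
```

Putting a fast or key-free service such as MalwareBazaar early in the order keeps requests to rate-limited services to a minimum.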
The allowed type values are:
- CapeSandbox
- HybridAnalysis
- InQuest
- JoeSandbox
- Malshare
- MalwareBazaar
- MWDB
- Polyswarm
- Triage
- UploadMWDB
- VirusTotal
MWDB and UploadMWDB are both MWDB instances. MWDB is used for download only. UploadMWDB is the destination for uploaded samples. The only time UploadMWDB is queried for downloading a sample is when the --downloadonly flag is specified.
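The type rules above can be expressed as a short check. This sketch is illustrative rather than mlget's own logic: it validates each entry's type against the allowed values and selects which entries act as download sources, treating UploadMWDB as a source only when download-only mode is requested. The function name download_sources and the dictionary layout are hypothetical.

```python
from typing import Dict, List

# The type values the config accepts, as listed in the text.
ALLOWED_TYPES = frozenset({
    "CapeSandbox", "HybridAnalysis", "InQuest", "JoeSandbox", "Malshare",
    "MalwareBazaar", "MWDB", "Polyswarm", "Triage", "UploadMWDB", "VirusTotal",
})


def download_sources(repos: List[Dict[str, str]], download_only: bool = False) -> List[Dict[str, str]]:
    """Return the entries that may be queried when downloading a sample."""
    unknown = [repo["type"] for repo in repos if repo["type"] not in ALLOWED_TYPES]
    if unknown:
        raise ValueError(f"unknown repository type(s): {unknown}")
    # UploadMWDB entries count as download sources only with --downloadonly.
    return [repo for repo in repos if repo["type"] != "UploadMWDB" or download_only]
```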
If you use mlget to create the config, it will suggest default URL values for each type except UploadMWDB. Additional entries can be added later by either editing the config file directly or using the --addtoconfig flag.
Upgrading the Config
The config file format changed between version 1.x and version 2.x. For those who used version 1.x and are running version 2.x for the first time, mlget will prompt the user before updating the config. If Y is selected, it will back up the config to ~/.mlget-bak.yml before upgrading it. Mlget does its best to make the config update seamless, but each user should take their own backup before mlget updates the file (just in case).
Usage
Help
mlget - A command line tool to download malware from a variety of sources
Usage: ./mlget [OPTIONS] hash_arguments...
--addtoconfig Add entry to the config file
--comment strings Add comment to the sample when uploading to your own instance of MWDB.
--config Parse and print the config file
--downloadonly Download from any source, including your personal instance of MWDB.
When this flag is set; it will NOT update any output file with the hashes not found.
And it will not upload to any of the UploadMWDB instances.
--from string The service to download the malware from.
Must be one of:
- tg (Triage)
- mb (Malware Bazaar)
- ms (Malshare)
- ha (Hybird Anlysis)
- vt (VirusTotal)
- cp (Cape Sandbox)
- mw (Malware Database)
- ps (PolySwarm)
- iq (Inquest Labs)
- js (Joe Sandbox)
If omitted, all services will be tried.
--help Print the help message
--noextraction Do not extract malware from archive file.
Currently this only effects MalwareBazaar and HybridAnalysis
--output Write to a file the hashes not found (for later use with the --read flag)
--read string Read in a file of hashes (one per line)
--readupdate string Read hashes from file to download. Replace entries in the file with just the hashes that were not found (for next time).
--tag strings Tag the sample when uploading to your own instance of MWDB.
--upload Upload downloaded files to the MWDB instance specified in the mlget.yml file.
--uploaddelete Upload downloaded files to the MWDB instance specified in the mlget.yml file.
Delete the files after successful upload
Example Usage: mlget <sha256>
Example Usage: mlget --from mb <sha256>
Example Usage: mlget --tag tag_one --tag tag_two --uploaddelete <sha256> <sha1> <md5>
Here are some common uses mlget strives to address.
Note: The hash values must be the last set of arguments added.
Download without uploading to UploadMWDB
mlget [hash]
Downloading multiple hashes without uploading to UploadMWDB
mlget [hash] [hash]
Downloading and uploading to UploadMWDB and deleting from local disk
mlget --uploaddelete [hash] [hash]
Download only
This will also query any UploadMWDB instances set and will download the file from it if found on any of those instances. It will not upload anything to the UploadMWDBs. It will also not output any not found hash to a file. This flag is for download only which makes it ideal for pulling a sample down from UploadMWDB after it’s been uploaded.
mlget --downloadonly [hash]
Download and record what was not found
mlget --output [hash] [hash]
Read in hash values from a file and download them
mlget --read [filename.txt]
Download hashes from both a file and from command line arguments
mlget --read [filename.txt] [hash] [hash]
Download, upload, and record “Not Found Hashes”
This command will:
- read hashes in from a file
- get hashes from the command line
- query all services found in the config for the hashes
- upload the files to UploadMWDB
- add tags to the uploaded files in UploadMWDB
- add comments to the uploaded files in UploadMWDB
- record not found hashes back to the same file it read hashes in from
mlget --readupdate [filename.txt] --tag [tag] --tag [tag] --comment "[comment]" --comment "[comment]" --uploaddelete [hash] [hash]
Important Note: This will not add the comments and tags passed in from the command line to the samples read from --readupdate. Mlget will use the tags and comments read from the file for those hashes only. The hashes, tags, and comments passed in on the command line are grouped together. There is no crossover.
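The grouping rule in the note can be pictured with a small data model. This is a sketch under assumptions, not mlget's internals: the HashEntry type and build_work_list function are hypothetical, and the entries read from the file are passed in already parsed because the file format is not described here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HashEntry:
    hash: str
    tags: List[str] = field(default_factory=list)
    comments: List[str] = field(default_factory=list)


def build_work_list(
    file_entries: List[HashEntry],
    cli_hashes: List[str],
    cli_tags: List[str],
    cli_comments: List[str],
) -> List[HashEntry]:
    """Combine both inputs without mixing their tags and comments."""
    # Command line tags and comments attach only to command line hashes.
    cli_entries = [HashEntry(h, list(cli_tags), list(cli_comments)) for h in cli_hashes]
    # Entries from the file keep exactly the tags and comments stored with them.
    return list(file_entries) + cli_entries
```

If a hash from the file needs new tags, pass it on the command line with the tags instead.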
Download from a specific service
mlget --from vt [hash]
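For reference, the two-letter codes accepted by --from in the help text can be written as a lookup table. Pairing each code with a config type value is an assumption made for illustration; the help text only gives the service names. The name resolve_from is hypothetical.

```python
# Codes from the --from help text, paired with the config "type" values.
# The pairing itself is an assumption; mlget may map them differently.
FROM_CODES = {
    "tg": "Triage",
    "mb": "MalwareBazaar",
    "ms": "Malshare",
    "ha": "HybridAnalysis",
    "vt": "VirusTotal",
    "cp": "CapeSandbox",
    "mw": "MWDB",
    "ps": "Polyswarm",
    "iq": "InQuest",
    "js": "JoeSandbox",
}


def resolve_from(code: str) -> str:
    """Translate a --from code into a service type, rejecting unknown codes."""
    try:
        return FROM_CODES[code]
    except KeyError:
        valid = ", ".join(sorted(FROM_CODES))
        raise ValueError(f"unknown service code {code!r}; expected one of: {valid}") from None
```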