xorhex

Focus on Threat Research Things.

Mlget - For all Your Malware Download Needs

Custom tool to download malware from a variety of sources. Can also upload to MWDB instances of your choosing; complete with comments and tags!

xorhex

8-Minute Read

mlget
Source: Releases
License Type: MIT License
Version At Blog Release: v2.1

Summary

Mlget is a command line tool for downloading malware samples by file hash from multiple sources. Mlget supports the three common hash types: MD5, SHA1, and SHA256. Mlget can also upload the downloaded samples to multiple MWDB instances.
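The three hash types can be told apart by length alone. Here is a minimal sketch of that idea; hashType is a hypothetical helper written for this post, not a function taken from mlget's source.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// hashType reports which of the three supported digest types a hex string
// looks like, judging only by its decoded length.
// Hypothetical helper for illustration; not part of mlget.
func hashType(s string) (string, error) {
	raw, err := hex.DecodeString(s)
	if err != nil {
		return "", fmt.Errorf("not a hex string: %w", err)
	}
	switch len(raw) {
	case 16:
		return "MD5", nil
	case 20:
		return "SHA1", nil
	case 32:
		return "SHA256", nil
	}
	return "", fmt.Errorf("unexpected digest length: %d bytes", len(raw))
}

func main() {
	// The MD5 of empty input, used only as a well-known example value.
	kind, err := hashType("d41d8cd98f00b204e9800998ecf8427e")
	fmt.Println(kind, err)
}
```

A length check like this only says what a string could be; it cannot confirm that any service actually holds a sample with that hash.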

Important

  • The user controls the API endpoint URLs queried.
    • All URLs are set in the config file (.mlget.yml).
  • The user controls what MWDB instances can be uploaded to.
    • These are defined using the UploadMWDB type flag in the config.

Problem Statements

This tool was born out of needing to solve challenges around downloading, uploading, and recording hashes not found when hunting for samples.

Problem Statement 1: Where is the hash?

Many samples can be found on VirusTotal, but not all of them, and not everyone has access to VirusTotal. Mlget addresses the question of where to download a sample by querying different services in the hope that one of them has it.

Currently there are 10 different services mlget will query, and all but one require an API key.

Any of these can be added more than once, but it does not always make sense to do so. For example, if you have access to both a local CAPE instance and the public CAPE instance, then one entry for each can be added to the configuration (.mlget.yml). For others, like MalwareBazaar, it only makes sense to have one entry in the config.

Some services are not supported yet, but I hope to add more over time, such as:

  • Reversing Labs
  • UnpackMe
  • VMRay

Problem Statement 2: How can I download and upload simultaneously?

A common use case mlget strives to solve is uploading the downloaded files to MWDB without requiring additional steps from the user. Mlget does this via the --upload and --uploaddelete flags. When either flag is set and one or more UploadMWDB instances are defined in the config (.mlget.yml), the sample is uploaded to all of the UploadMWDB instances. If a hash is found on one instance but not another, mlget uses that copy to upload to the other UploadMWDB instances.

Mlget can also add comments (--comment) and tags (--tag) to UploadMWDB when the sample is uploaded. Tags and comments are still added if the sample already exists. Before querying the rest of the sources, mlget checks the UploadMWDB instances to see whether the sample already exists. It performs this check only if the --upload or --uploaddelete flag is present. The --uploaddelete flag deletes the local copy of the sample after it has been successfully uploaded to the UploadMWDB instances.

Problem Statement 3: What was I looking for again?

Sadly, sometimes the hash can’t be found on any of the services, so mlget records these not-found hashes to a file when either --output or --readupdate is passed to it. Mlget stores the hash, any tags, and all of the comments in the output file. The --output flag creates a new file whose name is the current date and time prepended to either _not_found_hashes.txt or the name of the file read in from --read. This file can be fed back to mlget using the --read or --readupdate flags.

Output file contents

Figure 1: Contents of the output file. Right click and open image in new tab to read the contents.

As the name implies, --readupdate combines both of these actions into one. It will read in the hashes stored in the file and then update the same file with the hashes not found.
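The bookkeeping behind this comes down to keeping only the hashes that no service returned. A minimal sketch follows, assuming the download results are tracked in a map; notFound and the placeholder hash names are illustrative and not taken from mlget.

```go
package main

import "fmt"

// notFound returns the requested hashes that were not downloaded, keeping
// their original order so the rewritten file stays easy to compare.
// Hypothetical helper for illustration; not part of mlget.
func notFound(requested []string, downloaded map[string]bool) []string {
	var missing []string
	for _, h := range requested {
		if !downloaded[h] {
			missing = append(missing, h)
		}
	}
	return missing
}

func main() {
	requested := []string{"hash_a", "hash_b", "hash_c"} // placeholder values
	downloaded := map[string]bool{"hash_b": true}
	fmt.Println(notFound(requested, downloaded)) // prints [hash_a hash_c]
}
```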

Installing

The only prerequisite is Go (golang). The latest version should work.

Steps

  • Download the latest release and extract it
  • From inside the extracted folder run:
    • go get -u
    • go mod tidy
    • go build
  • Run mlget with no parameters to have it walk through the config creation process.
    • The config can also be created using your preferred text editor.

I recommend adding mlget to your PATH. Also consider creating an alias if you find yourself using the same flags over and over again, like --readupdate and --uploaddelete.

Config Format

The config is stored in the user’s home directory (os.UserHomeDir()) and is called .mlget.yml. Version 2.x’s format looks like this:

repository 0:
  type: MalwareBazaar
  url: https://mb-api.abuse.ch/api/v1
  api: ""
  queryorder: 1
repository 1:
  type: CapeSandbox
  url: https://www.capesandbox.com/apiv2
  api: users_api_key_goes_here
  queryorder: 2
repository 2:
  type: UploadMWDB
  url: users_instance_of_mwdb:port/api
  api: users_api_key_goes_here
  queryorder: 0

Use queryorder to control the order the services are searched.
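To illustrate how queryorder could drive the search order, here is a small sketch that sorts the sample entries above. The Repository struct and byQueryOrder are hypothetical names, and the sketch assumes that lower queryorder values are queried first, which this post does not state explicitly.

```go
package main

import (
	"fmt"
	"sort"
)

// Repository mirrors one "repository N" entry from the config sample.
// The struct and its field names are assumptions made for this sketch.
type Repository struct {
	Type       string
	URL        string
	API        string
	QueryOrder int
}

// byQueryOrder returns a copy of repos sorted by ascending QueryOrder.
// Assumption: lower values are queried first.
func byQueryOrder(repos []Repository) []Repository {
	sorted := make([]Repository, len(repos))
	copy(sorted, repos)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].QueryOrder < sorted[j].QueryOrder
	})
	return sorted
}

func main() {
	repos := []Repository{
		{Type: "MalwareBazaar", URL: "https://mb-api.abuse.ch/api/v1", QueryOrder: 1},
		{Type: "CapeSandbox", URL: "https://www.capesandbox.com/apiv2", QueryOrder: 2},
		{Type: "UploadMWDB", QueryOrder: 0},
	}
	for _, r := range byQueryOrder(repos) {
		fmt.Println(r.QueryOrder, r.Type)
	}
}
```

A stable sort keeps entries with the same queryorder in the order they appear in the config.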

The allowed type values are:

  • CapeSandbox
  • HybridAnalysis
  • InQuest
  • JoeSandbox
  • Malshare
  • MalwareBazaar
  • MWDB
  • Polyswarm
  • Triage
  • UploadMWDB
  • VirusTotal

MWDB and UploadMWDB are both MWDB instances. MWDB is used for download only, while UploadMWDB is the destination for uploaded samples. The only time UploadMWDB is queried for downloading a sample is when the --downloadonly flag is specified.
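The download rule for the two MWDB types can be expressed as a simple filter. This is only a sketch of the rule as described here; downloadSources is a hypothetical helper and mlget's real logic may be organized differently.

```go
package main

import "fmt"

// uploadType is the config type name for upload targets.
const uploadType = "UploadMWDB"

// downloadSources filters the configured repository types down to the ones
// that should be queried for a download. UploadMWDB entries are skipped
// unless downloadOnly is set.
// Hypothetical helper for illustration; not part of mlget.
func downloadSources(types []string, downloadOnly bool) []string {
	var out []string
	for _, t := range types {
		if t == uploadType && !downloadOnly {
			continue
		}
		out = append(out, t)
	}
	return out
}

func main() {
	configured := []string{"MalwareBazaar", "MWDB", "UploadMWDB"}
	fmt.Println(downloadSources(configured, false)) // prints [MalwareBazaar MWDB]
	fmt.Println(downloadSources(configured, true))  // prints [MalwareBazaar MWDB UploadMWDB]
}
```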

If you use mlget to create the config, it will suggest default URL values for each different type except for UploadMWDB. It’s possible to add additional entries later by either editing the config file directly or using the --addtoconfig flag.

Add To Config

Figure 2: Add to config prompts.

Upgrading the Config

The config file format changed between version 1.x and version 2.x. For those who used version 1.x and are running version 2.x for the first time, mlget will prompt before updating the config. If Y is selected, it backs up the config to ~/.mlget-bak.yml before upgrading it. Mlget does its best to make the config update seamless, but each user should take their own backup before mlget updates the file (just in case).

Usage

Help

mlget - A command line tool to download malware from a variety of sources

Usage: ./mlget [OPTIONS] hash_arguments...
      --addtoconfig         Add entry to the config file
      --comment strings     Add comment to the sample when uploading to your own instance of MWDB.
      --config              Parse and print the config file
      --downloadonly        Download from any source, including your personal instance of MWDB.
                            When this flag is set; it will NOT update any output file with the hashes not found.
                            And it will not upload to any of the UploadMWDB instances.
      --from string         The service to download the malware from.
                              Must be one of:
                              - tg (Triage)
                              - mb (Malware Bazaar)
                              - ms (Malshare)
                              - ha (Hybird Anlysis)
                              - vt (VirusTotal)
                              - cp (Cape Sandbox)
                              - mw (Malware Database)
                              - ps (PolySwarm)
                              - iq (Inquest Labs)
                              - js (Joe Sandbox)
                            If omitted, all services will be tried.
      --help                Print the help message
      --noextraction        Do not extract malware from archive file.
                            Currently this only effects MalwareBazaar and HybridAnalysis
      --output              Write to a file the hashes not found (for later use with the --read flag)
      --read string         Read in a file of hashes (one per line)
      --readupdate string   Read hashes from file to download.  Replace entries in the file with just the hashes that were not found (for next time).
      --tag strings         Tag the sample when uploading to your own instance of MWDB.
      --upload              Upload downloaded files to the MWDB instance specified in the mlget.yml file.
      --uploaddelete        Upload downloaded files to the MWDB instance specified in the mlget.yml file.
                            Delete the files after successful upload

Example Usage: mlget <sha256>
Example Usage: mlget --from mb <sha256>
Example Usage: mlget --tag tag_one --tag tag_two --uploaddelete <sha256> <sha1> <md5>

Here are some common uses mlget strives to address.

Note: The hash values must be the last set of arguments added.
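To show the flags-first, hashes-last layout in code, here is a sketch built on Go's standard flag package. This is an illustration only: mlget may well use a different flag library, and parseArgs and stringList are hypothetical names. With the standard package, parsing stops at the first argument that is not a flag, so everything from that point on is treated as a positional value.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// stringList collects every value of a flag that may be repeated.
type stringList []string

func (s *stringList) String() string { return strings.Join(*s, ",") }

func (s *stringList) Set(v string) error {
	*s = append(*s, v)
	return nil
}

// parseArgs splits a command line into tags, the uploaddelete switch, and
// the trailing positional arguments (the hashes).
// Hypothetical helper for illustration; not part of mlget.
func parseArgs(args []string) (tags []string, uploadDelete bool, hashes []string, err error) {
	fs := flag.NewFlagSet("mlget-sketch", flag.ContinueOnError)
	var t stringList
	fs.Var(&t, "tag", "tag to apply (repeatable)")
	del := fs.Bool("uploaddelete", false, "upload, then delete the local copy")
	if err = fs.Parse(args); err != nil {
		return nil, false, nil, err
	}
	return t, *del, fs.Args(), nil
}

func main() {
	tags, del, hashes, err := parseArgs([]string{
		"--tag", "tag_one", "--tag", "tag_two", "--uploaddelete", "hash_a", "hash_b",
	})
	fmt.Println(tags, del, hashes, err)
}
```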

Download without uploading to UploadMWDB

mlget [hash]

Downloading multiple hashes without uploading to UploadMWDB

mlget [hash] [hash]

Downloading and uploading to UploadMWDB and deleting from local disk

mlget --uploaddelete [hash] [hash]

Download only

This will also query any UploadMWDB instances set and will download the file from it if found on any of those instances. It will not upload anything to the UploadMWDBs. It will also not output any not found hash to a file. This flag is for download only which makes it ideal for pulling a sample down from UploadMWDB after it’s been uploaded.

mlget --downloadonly [hash]

Download and record what was not found

mlget --output [hash] [hash]

Read in hash values from a file and download them

mlget --read [filename.txt]

Download hashes from both a file and from command line arguments

mlget --read [filename.txt] [hash] [hash]

Download, upload, and record “Not Found Hashes”

This command will:

  • read hashes in from a file
  • get hashes from the command line
  • query all services found in the config for the hashes
  • upload the files to UploadMWDB
  • add tags to the uploaded files in UploadMWDB
  • add comments to the uploaded files in UploadMWDB
  • record not found hashes back to the same file it read hashes in from

mlget --readupdate [filename.txt] --tag [tag] --tag [tag] --comment "[comment]" --comment "[comment]" --uploaddelete [hash] [hash]

Download, Upload, Delete

Figure 3: Download, Upload, Delete.

Important Note: This will not add the comments and tags passed in from the command line to the samples read from --readupdate. Mlget will use the tags and comments read from the file for those hashes only. The hashes, tags, and comments passed in on the command line are grouped together. There is no crossover between the two groups.
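One way to picture this grouping is as a work queue in which every hash carries its own tags and comments. The sketch below is hypothetical (Sample and buildQueue are not names from mlget) and only models the rule described in the note.

```go
package main

import "fmt"

// Sample pairs a hash with the tags and comments that belong to it.
// Hypothetical type for illustration; not part of mlget.
type Sample struct {
	Hash     string
	Tags     []string
	Comments []string
}

// buildQueue keeps file entries exactly as read and attaches the command
// line tags and comments only to the command line hashes.
func buildQueue(fromFile []Sample, cliHashes, cliTags, cliComments []string) []Sample {
	queue := append([]Sample(nil), fromFile...)
	for _, h := range cliHashes {
		queue = append(queue, Sample{Hash: h, Tags: cliTags, Comments: cliComments})
	}
	return queue
}

func main() {
	fromFile := []Sample{{Hash: "hash_from_file", Tags: []string{"file_tag"}}}
	queue := buildQueue(fromFile, []string{"hash_from_cli"}, []string{"cli_tag"}, []string{"cli comment"})
	for _, s := range queue {
		fmt.Println(s.Hash, s.Tags, s.Comments)
	}
}
```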

Download from a specific service

mlget --from vt [hash]
