Part 1 - MWDB
Exercise #1.0: Getting familiar with the interface
MWDB welcomes us with a list of recently uploaded samples.
The Recent views show basic information about the latest objects in the repository and let you explore the dataset interactively using Lucene-based queries and clickable fields. If a sample was uploaded within the last 72 hours, it is additionally marked with a yellowish background. The yellow color is a bit more intense if the file was uploaded within the last 24 hours.
If you click on a sample hash, you will navigate to the detailed sample view. Here you can see all the details about the file. The left side contains a few tabs. The first one, called Details, presents basic file information such as the original file name, size, file type and hash values.
On the right side of the view you can see tags, relations with other objects and the comments section. This information is added to MWDB mainly by our analysis backend, where the sample is sent on first upload. If the analysis was successful, all interesting analysis artifacts are uploaded back to MWDB.
During our tour we will go through all of these elements, starting from tags.
Exercise #1.2: Exploring sample view and hierarchy
Goal: explore the sample view, understand the object hierarchy
Copy this hash into the query field in the Samples view:
5762523a60685aafa8a681672403fd19
Click on the hash in the row to navigate to the sample details.
Here you can see all the details about the file. The left side contains three tabs: Details, Relations and Preview. The first one, called Details, presents basic file information such as the original file name, size, file type and hash values.
Blue-colored fields are clickable: you can use them to quickly search for other samples matching the given criteria.
Just beneath the file details tab, attributes are displayed. This section contains all sorts of miscellaneous information about the sample and/or analysis.
On the right side of the view you can see tags, relations with other objects and the comments section.
As you can deduce from the tags, the sample is a RAR archive (archive:rar) and comes from MalwareBazaar (feed:malwarebazaar). A link to the sample on the MalwareBazaar platform can be found in the Attributes section.
In this case, the related samples are the files that were unpacked from that archive.
See the Related samples box on the right. Go to the child sample tagged ripped:formbook.
This is the actual executable contained in the malicious archive. Based on the tags we can say that:
runnable:win32:exe - it is a Windows 32-bit executable
yara:win_formbook - one of our Yara rules matched this sample as Formbook
et:formbook - ET Pro traffic rules matched this sample as Formbook (more info in comments)
ripped:formbook - we have successfully ripped the Formbook configuration from this sample
Navigate to the next child tagged dump:win32:exe.
This is the memory dump that contains the unpacked Formbook payload. We got it by running the sample in our sandbox and performing memory dumps when specific conditions are met. It’s worth noting that while the analysis produces a bunch of memory dumps, we upload the best one, the one that allowed us to get the complete malware configuration.
Check the Static config tab. See the extracted static configuration.
Configuration is the second data type in MWDB. Malware configurations are meant to parametrize the malware behavior and they usually contain useful IoCs.
The format of configuration depends on malware family, usually deriving from the structure “proposed” by the malware author.
Go back to the Details tab and make sure you’re on MD5 8e56eee9cf853d2ec4c695282c01fe0a.
Go to the Relations tab. It presents the parents and children of the current object. Notice how two distinct samples have been unpacked into the same malicious core.
Click on the Config box in the Relations graph to expand it. Zoom out to see the whole graph.
Exercise #1.3: Looking for similar configurations
Goal: Find configurations that are similar to Formbook config
Click on the config hash (f2e216695d4ce7233f5feb846bc81b8fffe9507988c7f5caaca680c0861e5e02) in the Related configs tab.
Go to the Preview tab.
Configurations are just simple JSON objects. The only special thing is the hashing algorithm: for example, lists are hashed regardless of element order, so if the domains were ripped in a different order, the configuration hash will still be the same.
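To get a feeling for what order-insensitive hashing means, here is a minimal Python sketch. It is not MWDB's actual algorithm: the canonicalization rules, the use of SHA-256 and the names `_canonical` and `config_hash` are assumptions made for illustration, and the example configs are invented.

```python
import hashlib
import json


def _canonical(value):
    """Return a form of `value` in which dict key order and list order do not matter."""
    if isinstance(value, dict):
        return {key: _canonical(item) for key, item in sorted(value.items())}
    if isinstance(value, list):
        # Sort elements by their serialized form so that any ordering gives the same list.
        return sorted(
            (_canonical(item) for item in value),
            key=lambda item: json.dumps(item, sort_keys=True),
        )
    return value


def config_hash(config):
    """SHA-256 of a canonical JSON serialization (illustrative, not MWDB's exact scheme)."""
    serialized = json.dumps(_canonical(config), sort_keys=True)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


# Hypothetical configs: the same URLs, ripped in a different order.
first = {"family": "formbook", "urls": [{"url": "a.example/x/"}, {"url": "b.example/x/"}]}
second = {"urls": [{"url": "b.example/x/"}, {"url": "a.example/x/"}], "family": "formbook"}

print(config_hash(first) == config_hash(second))
```

Both configs produce the same digest, because the lists are sorted before serialization. Changing any value (not just its position) produces a different one.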
Go back to the Details tab. Expand urls and click on www.discorddeno.land/suod/
cfg.urls*.url:"www.discorddeno.land/suod/"
The resulting query looks for all url keys in urls lists that have the value www.discorddeno.land/suod/.
Let’s check if the /suod/ path was used in other configs as well.
Modify the query to look for other configs with the /suod/ path by replacing the domain with the wildcard *.
cfg.urls*.url:"*/suod/"
There are two configurations. What URL was used in the second configuration?
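If the urls*.url part of the query looks cryptic, the sketch below mimics it locally: walk every element of the urls list and compare its url value against a wildcard pattern. This only models the idea, since MWDB evaluates queries on the server. Python's `fnmatchcase` stands in for the query wildcard, the helper name `match_list_field` is hypothetical, and the second config is invented.

```python
from fnmatch import fnmatchcase


def match_list_field(config, list_key, field, pattern):
    """Return the `field` values from elements of config[list_key] that match `pattern`."""
    matches = []
    for element in config.get(list_key, []):
        value = element.get(field) if isinstance(element, dict) else None
        if isinstance(value, str) and fnmatchcase(value, pattern):
            matches.append(value)
    return matches


# The first URL comes from the exercise, the second config is hypothetical.
config_a = {"urls": [{"url": "www.discorddeno.land/suod/"}]}
config_b = {"urls": [{"url": "www.example.com/suod/"}, {"url": "www.example.com/other/"}]}

for config in (config_a, config_b):
    print(match_list_field(config, "urls", "url", "*/suod/"))
```

Only values stored under url inside the urls list are considered, so the same string placed elsewhere in the JSON would not match.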
Now let’s check if /suod/ occurs in other configurations regardless of the configuration structure. For that query we can use full-text search in JSON.
cfg:"*/suod/*"
Are there more configurations like that?
If not, let’s search for configurations with the .land TLD:
cfg:"*.land*"
Then click on the agenttesla config (e031b192d40f6d234756f8508f7d384db315983b57d8fc3216d20567056bd88b) - you might have to scroll down a bit.
Hint: Instead of scrolling, you can narrow the results by adding AND family:agenttesla to the query.
OK, this is not a .land TLD: .land appears as part of an e-mail address. To illustrate how full-text search works, go to Preview, press CTRL-F and type “.land” to see which parts of the JSON were matched.
How can we improve our query? Let’s add a " character at the end to match the end of the string.
cfg:"*.land\"*"
Go to the Gandcrab configuration and check in Preview what was matched.
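The effect of the trailing quote can be reproduced offline with a rough model of full-text search: serialize the config to JSON text and run a wildcard match over the whole string. This is a sketch under assumptions, not how MWDB implements the search. `fnmatchcase` stands in for the wildcard, `fulltext_match` is a hypothetical helper and both configs are invented.

```python
import json
from fnmatch import fnmatchcase


def fulltext_match(config, pattern):
    """Match a wildcard pattern against the whole config serialized as JSON text."""
    return fnmatchcase(json.dumps(config), pattern)


# Hypothetical configs, invented for illustration only.
tld_config = {"urls": [{"url": "www.example.land"}]}
email_config = {"email": "someone@mail.landing-example.com"}

# Without the quote, ".land" may match anywhere inside a string value.
print(fulltext_match(tld_config, "*.land*"))     # True
print(fulltext_match(email_config, "*.land*"))   # True

# With the quote, ".land" has to be the last thing in a JSON string.
print(fulltext_match(tld_config, '*.land"*'))    # True
print(fulltext_match(email_config, '*.land"*'))  # False
```

In the serialized JSON every string value ends with a quote character, so appending it to the pattern works as an end-of-string anchor.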
Exercise #1.4: Blobs and dynamic configurations
Goal: Familiarize yourself with the blob object type
The third object type in MWDB is the blob. While a config represents structured (JSON) data, a blob is unstructured. Blobs are just simple text files, usually containing some raw but human-readable content.
Let’s take a look at some examples.
Navigate to https://mwdb.cert.pl/blob/60c9ad80cde64e7cae9eec0c11dd98175860243aa40a3d8439bbf142d2a0e068
What we see is a bunch of decrypted strings from AgentTesla that were ripped from the malware sample.
They’re not structured because we don’t semantically analyze every string, but it’s still nice to have them in repository.
Jump to https://mwdb.cert.pl/blob/48914f0a6b9f4499da31d2217a7ee2e8c8f35f93ab5c992333f5c1aa947d9009
We’re now looking at decrypted strings from the Remcos family. Even though the data is unstructured, it can be considered a part of the static configuration and used when searching for malware similarities.
Let’s take a look at the parent of this blob: the static configuration object.
https://mwdb.cert.pl/config/29c1f3c14a446b2a77ce58cbc59619fbfe7459c56fe1c8408597538384aa56ac
Not much, just C2 host/port and password.
Let’s look for another configuration with this host by expanding the c2 key and clicking on the host address.
Oh, there is another one.
The resulting query is
cfg.c2*.host:"ongod4life.ddns.net:4344"
You should be looking at: https://mwdb.cert.pl/config/9afac348443a7aa9ca5d33cffcc984751cebf15f065cb90b48911943fb10e1f6
They’re pretty much the same and only the raw_cfg differs. How can we easily compare them?
Go to the blob (da2055f0e90355bfaf3cc932f7fdb2f82bfd79c26f95b61b23b9cd77f9b0e32d). In the blob view, find the Diff with button on the right side of the tabs and click it.
Now we can choose another blob to compare with.
The simplest way is to copy the previous blob ID to the clipboard and paste it into the query bar:
48914f0a6b9f4499da31d2217a7ee2e8c8f35f93ab5c992333f5c1aa947d9009
Then press ENTER and choose the searched blob. What’s the difference between these blobs?
Warning
This feature has a known bug in v2.10.1, so go directly to this link: https://mwdb.cert.pl/diff/48914f0a6b9f4499da31d2217a7ee2e8c8f35f93ab5c992333f5c1aa947d9009/da2055f0e90355bfaf3cc932f7fdb2f82bfd79c26f95b61b23b9cd77f9b0e32d
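If you have saved the content of two blobs as text, a similar comparison can be made locally. The sketch below uses Python's standard difflib module. The helper name `diff_blobs` and the blob contents are hypothetical (they are not the real Remcos strings), and the output format is difflib's unified diff, not the MWDB diff view.

```python
import difflib


def diff_blobs(old_text, new_text, old_name="blob-a", new_name="blob-b"):
    """Return a unified diff of two text blobs as a list of lines."""
    return list(difflib.unified_diff(
        old_text.splitlines(),
        new_text.splitlines(),
        fromfile=old_name,
        tofile=new_name,
        lineterm="",
    ))


# Hypothetical blob contents that differ in a single line.
first = "host=c2.example.net:4344\nmutex=alpha\nlicense=AAAA"
second = "host=c2.example.net:4344\nmutex=beta\nlicense=AAAA"

for line in diff_blobs(first, second):
    print(line)
```

Lines prefixed with - exist only in the first blob and lines prefixed with + only in the second. Identical blobs produce an empty diff.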
But blobs are not only strings and other unstructured static data.
Go to the Blobs list and click on dyn_cfg in the Blob type column, or manually type type:dyn_cfg. Then filter out the dynamic:mirai tag (there are lots of them but they’re not that interesting).
Check out the hancitor dynamic configuration.
https://mwdb.cert.pl/blob/3b032876cc2d77d28625b9dfee0686663e60385cda7f9031afac6cf2b0c6d6e4
Dynamic configurations also parametrize the malware behavior, but they’re fetched from an external source. In this case we have a set of commands to run second-stage malware.
The fetched second stage is linked as a child of the blob.
Other possible types of dynamic configuration are injects, mail templates for spam botnets or malware updates.
Example of more verbose Kronos dynamic configuration:
https://mwdb.cert.pl/blob/2e4d109edb8b2fa7c1f1d7592a284bbf15e3e51d24d1d9cdda91c9ae582cf05c/config
Blob children are files dropped from the C&C and structured (parsed) fragments of the dynamic configuration.
Exercise #1.5: Let’s upload something!
Goal: Learn how object sharing and access inheritance work.
All objects you’ve seen so far are shared with all MWDB accounts: the ‘public’ group.
If you use the query:
NOT shared:public
you should not get any results, because all the samples you see are public. So how do you get some ‘private’ samples? You need to upload them!
Fetch an example sample, ex5malware.zip, from GitHub. Don’t unpack it, just download it to some temporary location.
$ wget https://github.com/CERT-Polska/training-mwdb/raw/main/ex5malware.zip
Click on Upload in the navbar (https://mwdb.cert.pl/upload).
Select the sample you have just downloaded.
Take a look at the Share with options. There are four of them:
All my groups - the sample will be shared with all your private groups, so it will not be visible to the general public, e.g. it will be shared only within your organization
Single group - in case you belong to multiple user groups, you can select a specific one
Everybody - everyone will see the sample
Only me - not even your colleagues from the same organization will be able to see the sample
Upload the sample with the default All my groups option.
Take a look at the Shares box. Who has access to your sample?
Use the Relations tab to traverse to the Config. Who has access to the configuration?
In the MWDB sharing model, if you upload a private sample you get immediate access to all its descendants. So you always get all the data related to the ripped configuration, but not necessarily all of its parents.
If you want, you can always change your mind and share the sample with somebody else. But you can’t reverse the action, so if something was shared by mistake, contact the administrators.
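The inheritance rule can be pictured as a walk down the object graph. The sketch below is a simplified model based only on the description above, not MWDB's real permission code. The function `share_with_descendants`, the object names and the group names are all hypothetical.

```python
from collections import deque


def share_with_descendants(children, shares, uploaded, group):
    """Give `group` access to `uploaded` and everything reachable through child links."""
    queue = deque([uploaded])
    seen = set()
    while queue:
        obj = queue.popleft()
        if obj in seen:
            continue
        seen.add(obj)
        shares.setdefault(obj, set()).add(group)
        queue.extend(children.get(obj, []))
    return shares


# Hypothetical graph: archive -> executable -> dump -> config.
children = {
    "archive": ["executable"],
    "executable": ["dump"],
    "dump": ["config"],
}
# Somebody else uploaded the archive earlier; we upload only the executable.
shares = {"archive": {"other-org"}}
share_with_descendants(children, shares, "executable", "my-org")

for name in ("archive", "executable", "dump", "config"):
    print(name, sorted(shares.get(name, set())))
```

In this model access flows only downwards: my-org sees the executable, the dump and the config, while the parent archive stays visible to other-org only.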
Go back to the original sample and share it with public using the Share with group input field.