Monday, February 3, 2020

Hunting for APT28 malware in a stockpile of samples

Recently I wanted to do some data analysis on the APT28 malware samples I have.  Some of them are sorted and organized, but the rest sit in a pile of unsorted, encrypted zip and rar files alongside a bunch of unrelated malware samples and warez.

The question is which APT samples are hiding in my stockpile of malware samples, and which of those samples are related to APT28.



So how do we get to the juicy samples inside the thousands of password protected files?

We brute force them, of course.  Since they're malware samples, the password will more than likely be something like the following:

infected
password!
malware

or some similar variation.
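
The guesses above can be turned into a wordlist file for the brute force scripts. What follows is only a minimal sketch: the function name make_wordlist and the specific variations (capitalized, upper case, a "123" suffix) are assumptions for illustration, and the ${word^} expansions need bash 4 or newer.

```shell
#!/bin/bash
# Print each base word plus a few simple variations, one candidate per line.
# The variations below are illustrative assumptions, not a definitive list.
make_wordlist() {
    local word
    for word in "$@"
    do
        echo "$word"            # as given, e.g. infected
        echo "${word^}"         # first letter capitalized, e.g. Infected
        echo "${word^^}"        # all upper case, e.g. INFECTED
        echo "${word}123"       # a common numeric suffix
    done | awk '!seen[$0]++'    # drop duplicates, keep first-seen order
}
```

For example, make_wordlist infected 'password!' malware > wordlist.txt writes the candidates to a file that can be passed as the <wordlist> argument.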


After some googling and testing, I had a small script that worked. It uses John the Ripper to generate password candidates and unzip to try each one against the zip files.

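#!/bin/bash
# Brute force the password of every zip file in a directory.
echo "Brute all the zip files in dir"
if [ $# -ne 2 ]
then
    echo "Usage: $0 <directory_with_zip_files> <wordlist>"
    exit 1
fi
# ${1%/} strips a trailing slash so both "dir" and "dir/" work.
for f in "${1%/}"/*.zip
do
    [ -e "$f" ] || continue   # the glob matched nothing
    echo "File: $f"
    # john applies its mangling rules to the wordlist and prints one candidate per line.
    while IFS= read -r i
    do
        echo -ne "\rtrying \"$i\" "
        unzip -d zip-out -o -P "$i" "$f" >/dev/null 2>&1 </dev/null
        STATUS=$?
        if [ $STATUS -eq 0 ]; then
            echo -e "\nArchive: $f  password is: \"$i\""
            break   # stop at the first password that works
        fi
    done < <(john --wordlist="$2" --rules --stdout)
done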


Running it.

<  INSERT FORGOTTEN SCREENSHOT HERE   >

By modifying this script I was able to get a sort of hacky brute force that seems to work with rar files.

#!/bin/bash
# Brute force the password of every rar file in a directory.
echo "rar file brute"
if [ $# -ne 2 ]
then
    echo "Usage: $0 <directory_with_rar_files> <wordlist>"
    exit 1
fi
# ${1%/} strips a trailing slash so both "dir" and "dir/" work.
for f in "${1%/}"/*.rar
do
    [ -e "$f" ] || continue   # the glob matched nothing
    echo "File: $f"
    #unrar x "$f" -pinfected rar-out/ >/dev/null 2>&1
    while IFS= read -r line
    do
        echo -ne "\rtrying \"$line\" "
        # -o+ overwrites existing files; </dev/null keeps unrar from reading the wordlist on stdin.
        unrar x -o+ "-p$line" "$f" rar-out/ >/dev/null 2>&1 </dev/null
        STATUS=$?
        if [ $STATUS -eq 0 ]; then
            echo -e "\nArchive: $f  password is: \"$line\""
            break   # stop at the first password that works
        fi
    done < "$2"
done


Yes, I realize it's not perfect, but it works.



Running it.





Brute forcing the zips was a lot cleaner.


Anyway, we now have two directories with a bunch of malware samples.  I also ran the zip brute force inside the zip-out directory to get any samples that were still zipped up, and I got a few. :)
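
To see which archives are still packed inside an output directory before running another pass, something like the following could help. It is a sketch only: the function name find_nested_archives is made up for illustration, and it assumes nested archives still carry a .zip or .rar extension, so renamed archives will be missed.

```shell
#!/bin/bash
# List archives that are still packed somewhere under an extraction directory.
# Assumption: archives are recognized by extension only (.zip or .rar, any case).
find_nested_archives() {
    local dir="$1"
    find "$dir" -type f \( -iname '*.zip' -o -iname '*.rar' \) | sort
}
```

Running find_nested_archives zip-out prints one path per line; the directories that show up are the ones worth feeding back into the brute force scripts.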



So now we have all the malware samples that were decrypted from the rars and zips.


How are we going to sort through 10,000+ malware samples?


With Yara and bash of course.

Using Yara's APT rules to sort through all the samples, we find some interesting malware.



yara -p 20 -g /YARA_RULES/rules/malware/APT_*.yar -r /MALWARE 



Command breakdown:

-p 20                  use 20 threads
-g                     print tags
<yara rules>           the rule files to apply
-r                     recursive search
<malware directory>    the directory to scan

Run with the -m flag to get metadata, which will be very helpful when sorting the malware families.
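
With -g and -m, each output line has the shape RULE [tags] [metadata] /path/to/sample. As a rough sketch, a line can be split into its rule name and sample path with plain bash parameter expansion. The function name parse_yara_line is hypothetical, and the sketch assumes the sample path itself does not contain "] ".

```shell
#!/bin/bash
# Split one line of `yara -g -m` output into rule name and sample path.
# Assumed line shape: RULE [tags] [metadata] /path/to/sample
parse_yara_line() {
    local line="$1"
    local rule="${line%% *}"     # everything before the first space
    local path="${line##*] }"    # everything after the last "] "
    printf '%s\t%s\n' "$rule" "$path"
}
```

The result is a tab-separated pair, which is easier to feed to cut, sort, or a while read loop than the raw line.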


So we see there's a lot of info and a lot of various APT malware samples.  Now we need to sift out the APT28 samples.


This is where grep is our friend.

grep "APT28"

 yara -p 20 -g -m /YARA_RULES/rules/malware/APT_*.yar -r /MALWARE | grep "APT28" | sort | cut -d"/" -f1,2,3,4,5,6,7,8,9,10,16,17,18,19

You can ignore the cut command.  I just wanted to clean up the output.
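
For a quick overview of which rules fired most often, the same output can be tallied per rule. This is a small sketch, not part of the original workflow: count_hits_per_rule is a made-up name, and it relies on the rule name being the first space-separated field of each line.

```shell
#!/bin/bash
# Count how many samples each rule matched, busiest rule first.
# Reads yara output on stdin; the rule name is taken from the first field.
count_hits_per_rule() {
    awk '{ count[$1]++ } END { for (r in count) print count[r], r }' |
        sort -k1,1nr -k2,2
}
```

It can be tacked onto the end of the pipeline in place of sort and cut, for example: yara ... | grep "APT28" | count_hits_per_rule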





Now let's sort the malware into its family groups.


Basically, we want to sort the APT28 samples into family groups.
We use grep to pull out the samples related to a family name, like
grep "CORESHELL"
grep "X-Agent"
etc.

 
Using a little command line kung-fu we can pull out the sample paths and then copy those samples into the malware family directories.


I wrote a small shell script to do this.

#!/bin/bash
echo "YARA APT28 MALWARE FAMILY SORTER"
echo " Sorts CORESHELL, X-Agent, XTunnel, etc..."
list=(X-Agent CORESHELL XTunnel EVILTOSS BlackEnergy)
for i in "${list[@]}"
do
    # Sorted known APT28 files
    yara -p 20 -g -m /YARA_RULES/rules/malware/APT_*.yar -r /MALWARE-SAMPLES/APT28/ | grep "GRIZZLY-STEPPE" | grep "$i" | sort > "APT_28-$i-Family_Samples.txt"
    # Unsorted stockpile dir
    yara -p 20 -g -m /YARA_RULES/rules/malware/APT_*.yar -r /MALWARE | grep "GRIZZLY-STEPPE" | grep "$i" | sort >> "APT_28-$i-Family_Samples.txt"
    # The third "]"-separated field (after the tags and metadata) is the sample path
    cut -d"]" -f3 "APT_28-$i-Family_Samples.txt" > sample_dir.txt
    samples=sample_dir.txt
    while read -r sample
    do
        echo -e "\nFAMILY: $i"
        echo "$sample"
        cp "$sample" "APT28/Malware-Family/$i/"
    done < "$samples"
done

I manually created directories... why? Because that's just how it happened.
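
The per-family directories could also be created up front with mkdir -p, which does nothing when a directory already exists. A minimal sketch follows; the function name make_family_dirs and passing the base directory as the first argument are assumptions for illustration, while the family names and the APT28/Malware-Family layout come from the script above.

```shell
#!/bin/bash
# Create one destination directory per malware family.
# Usage: make_family_dirs <base_dir> <family> [<family> ...]
make_family_dirs() {
    local base="$1"
    shift
    local family
    for family in "$@"
    do
        mkdir -p "$base/$family"   # -p: create parents, no error if it exists
    done
}
```

For example: make_family_dirs APT28/Malware-Family X-Agent CORESHELL XTunnel EVILTOSS BlackEnergy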
Running the sorter script resulted in the following:




There you have it.  We successfully sorted through a pile of malware searching for samples from APT28 and separated out the samples into the malware families.

Next step is to use the malware sample set for some data science and machine learning fun.

Like what?

Well, like doing a little shared code analysis on the samples.

But that's for another blog post.