Tofsee Spambot features .ch DGA - Reversal and Countermeasures

Published on December 22, 2016 11:30 UTC by GovCERT.ch
Last updated on December 22, 2016 16:30 UTC

Today we came across an interesting malware sample that appeared in our malware zoo. The malware, which we identified as Tofsee, tried to spam out hundreds of emails within a couple of minutes. However, this wasn't the reason why it popped up on our radar (we analyze thousands of malware samples every single day, many of which are spambots too). What caught our attention were the domains queried by the malware. The domains appear to be algorithmically generated, and about half of them use the country code top level domain (ccTLD) of Switzerland:

[Figure: Wireshark screenshot of Tofsee DNS queries]

Domain generation algorithms (DGAs) that use the ccTLD for Switzerland are very rare. Gozi is currently the only malware covered by the DGArchive that uses .ch, and only in 1 of over 90 different known configurations.

This blog post first describes the analysis of the DGA. We then give a reimplementation of the DGA in Python, as well as a list of the generated domains for the next 52 weeks. We conclude with the measures that we took to deal with the algorithmically generated .ch domains.

Analysis

We analyzed the following Tofsee sample, with a fairly recent compile timestamp of Fri, 16 Dec 2016 07:09:11:


md5	490e121113fbadd776ca270c6788c59a
sha1	c294e79a5f0fbbffd535bb517f43bf69e9bbfb03
sha256	36704ec52701920451437a870e7d538eb409f50a4ae2f8231869500d1d6de159

Seeding

The following graph node shows the seeding of the DGA. First, the malware counts the number of seconds that have elapsed since 1 January 1974 (offset 0x40A0A0). The difference between this date and the unix epoch, 126230400 seconds, is then added to the result at 0x0040A0A8, effectively yielding the current unix time. It is unclear to us why the authors used this convoluted method to get the current time. The unix time then undergoes four integer divisions, by 60, 60, 24 and 7, to get the number of weeks since the epoch. This value is used as the seed for the domain generation algorithm. Domains are therefore valid for one week, starting on Thursday at midnight UTC, because the unix epoch (1 January 1970) fell on a Thursday.

[Figure: Seeding of the DGA]
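
The seed arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the calculation, not a transcription of the disassembly; the function and constant names are our own.

```python
import time

# Seconds between 1970-01-01 and 1974-01-01 (1461 days), as stated in the analysis
EPOCH_OFFSET_1974 = 126230400

def seed_from_seconds_since_1974(seconds_since_1974):
    """Rebuild the unix time, then reduce it to whole weeks since the epoch."""
    unix_time = seconds_since_1974 + EPOCH_OFFSET_1974
    # Four integer divisions: seconds -> minutes -> hours -> days -> weeks
    return unix_time // 60 // 60 // 24 // 7

def current_seed():
    """Seed for the current week, taken directly from the unix time."""
    return int(time.time()) // 60 // 60 // 24 // 7

# 2016-12-20 00:00:00 UTC corresponds to unix time 1482192000
print(seed_from_seconds_since_1974(1482192000 - EPOCH_OFFSET_1974))
```

Both functions return the same value for the same moment in time; the detour via 1974 has no effect on the resulting seed.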

Seeding also calls a pseudo random number generator (PRNG) and takes the result modulo 10 to get a value between 0 and 9. The random number generator is the standard linear congruential algorithm used by the Borland C/C++ compiler:


unsigned int rand()
{
    r2 = 22695477 * r2 + 1;
    return r2 >> 16;
}

The initial value of r2 is virtually unpredictable:


GetSystemTimeAsFileTime(&SystemTimeAsFileTime);
GetVolumeInformationA(0, 0, 4u, &VolumeSerialNumber, 0, 0, 0, 0);
r2 = (VolumeSerialNumber \
      ^ SystemTimeAsFileTime.dwHighDateTime \
      ^ GetTickCount()) & 0x7FFFFFFF;
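
The two snippets above can be combined into a small Python sketch of the letter selection. It assumes that r2 is a 32-bit unsigned value (hence the explicit masking); the class and function names are hypothetical, and the initial state is passed in as a parameter because the real value depends on the infected host.

```python
class BorlandRand:
    """Linear congruential generator from the analysis: r2 = 22695477 * r2 + 1."""

    def __init__(self, state):
        # On a real host this value mixes volume serial, file time and tick count
        self.state = state & 0x7FFFFFFF

    def rand(self):
        # Emulate 32-bit unsigned overflow (assumption), then return the upper 16 bits
        self.state = (22695477 * self.state + 1) & 0xFFFFFFFF
        return self.state >> 16

def first_letter(state):
    """Pick the first appended letter, 'a' to 'j', from one PRNG call modulo 10."""
    return chr(ord('a') + BorlandRand(state).rand() % 10)

print(first_letter(1))
```

Since the initial state cannot be predicted from the outside, all ten letters have to be considered as possible starting points for any given week.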

DGA

The generated domains, along with the port 443 and an unknown constant 2, are stored 69 bytes apart in memory (called target in the next graph). In total 10 domains are generated:

[Figure: Loop that generates 10 domains]

At offset 0x040A0FC a random string is generated based on the seed, i.e., the number of weeks. We will come back to this routine later. The random string is repeated once at 0x040A114, so that, for example, drs becomes drsdrs.
A random letter between "a" and "j" is then appended to the string to complete the second level domain (sld). For the first generated sld, the letter is picked using the unpredictable PRNG shown in the seeding section above. After that, the DGA picks the remaining letters in order, wrapping around from "j" to "a" (offsets 0x0040A16D to 0x0040A175). For example, if the letter "c" is appended in the first iteration, then the following iterations append "d", "e", "f", "g", "h", "i", "j", "a" and finally "b".
The top level domain (tld) is set to .ch for the first five domains, and to .biz for the remaining five domains. Any single run of the malware therefore pairs each generated second level domain with exactly one tld, although across multiple runs a second level domain will eventually be paired with both top level domains.
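
To illustrate the loop described above, the following Python sketch lists the ten domains of one single run. The function name and parameters are our own; `base` is the random string for the week and `start` stands for the PRNG result modulo 10, which is unknown in practice.

```python
def domains_for_run(base, start):
    """Return the 10 domains of one run; `start` (0..9) is the PRNG-chosen offset."""
    domains = []
    for i in range(10):
        # Letters are taken in order, wrapping around from 'j' back to 'a'
        letter = chr(ord('a') + (start + i) % 10)
        # The first five domains use .ch, the remaining five use .biz
        tld = ".ch" if i < 5 else ".biz"
        domains.append(base * 2 + letter + tld)
    return domains

for domain in domains_for_run("dqg", 2):
    print(domain)
```

With a start offset of 2, the run begins with the letter "c" on .ch and ends with the letter "b" on .biz, matching the example in the text.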
Finally, let's come back to the random string generation routine called at 0x040A0FC:

[Figure: Disassembly of the sld generation]

This routine uses the seed r, i.e., the number of weeks since January 1st, 1970, to generate a random string as follows:


string = ""
DO
    string += r % 26 + 'a'
    r = r / 26            // integer division
WHILE r
string = reverse(string)

For example, as of December 20, 2016, the number of weeks since the epoch is 2450. Dividing by 26 results in 94 with a remainder of 6, so the first letter is g. Dividing 94 by 26 results in 3 with a remainder of 16, so the second letter is q. Dividing 3 by 26 results in 0 with a remainder of 3, so we append the letter d and exit the loop. The random string gqd is then reversed, resulting in dqg.

Reimplementation

The following reimplementation of the DGA in Python prints all 20 potential domains for any given date. Please note that each actual run of the malware will only generate and test one top level domain per second level domain.


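from datetime import datetime, timezone
import argparse

def dga(r):
    # Encode the seed (weeks since the unix epoch) in base 26,
    # least significant letter first
    domain = ""
    while True:
        domain += chr(r % 26 + ord('a'))
        r //= 26
        if not r:
            break
    # Reverse and double the string, then append each letter a..j
    # and combine the result with both top level domains
    for i in range(10):
        for tld in ['.biz', '.ch']:
            yield 2*domain[::-1] + chr(i + ord('a')) + tld

def printdomains(date):
    # The malware seeds from the unix time, so treat naive dates as UTC
    if date.tzinfo is None:
        date = date.replace(tzinfo=timezone.utc)
    unixtimestamp = date.timestamp()
    seed = int(unixtimestamp // 60 // 60 // 24 // 7)

    for domain in dga(seed):
        print(domain)


if __name__=="__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("-d", "--date",
                        help="date (YYYY-MM-DD, UTC) for which to generate domains")
    args = parser.parse_args()
    if args.date:
        d = datetime.strptime(args.date, "%Y-%m-%d")
    else:
        d = datetime.now(timezone.utc)
    printdomains(d)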

List of Domains

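The following table lists all generated domains for the next 52 weeks. The domains are given as brace expansions, so dqgdqg{a..j}.{ch,biz} stands for 20 different domains. All times are given in Swiss local time (CET in winter, CEST in summer), which is why the weekly switchover moves from 01:00 to 02:00 during daylight saving time.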

start end domains
2016-12-15 01:00:00 2016-12-22 00:59:59 dqgdqg{a..j}.{ch,biz}
2016-12-22 01:00:00 2016-12-29 00:59:59 dqhdqh{a..j}.{ch,biz}
2016-12-29 01:00:00 2017-01-05 00:59:59 dqidqi{a..j}.{ch,biz}
2017-01-05 01:00:00 2017-01-12 00:59:59 dqjdqj{a..j}.{ch,biz}
2017-01-12 01:00:00 2017-01-19 00:59:59 dqkdqk{a..j}.{ch,biz}
2017-01-19 01:00:00 2017-01-26 00:59:59 dqldql{a..j}.{ch,biz}
2017-01-26 01:00:00 2017-02-02 00:59:59 dqmdqm{a..j}.{ch,biz}
2017-02-02 01:00:00 2017-02-09 00:59:59 dqndqn{a..j}.{ch,biz}
2017-02-09 01:00:00 2017-02-16 00:59:59 dqodqo{a..j}.{ch,biz}
2017-02-16 01:00:00 2017-02-23 00:59:59 dqpdqp{a..j}.{ch,biz}
2017-02-23 01:00:00 2017-03-02 00:59:59 dqqdqq{a..j}.{ch,biz}
2017-03-02 01:00:00 2017-03-09 00:59:59 dqrdqr{a..j}.{ch,biz}
2017-03-09 01:00:00 2017-03-16 00:59:59 dqsdqs{a..j}.{ch,biz}
2017-03-16 01:00:00 2017-03-23 00:59:59 dqtdqt{a..j}.{ch,biz}
2017-03-23 01:00:00 2017-03-30 01:59:59 dqudqu{a..j}.{ch,biz}
2017-03-30 02:00:00 2017-04-06 01:59:59 dqvdqv{a..j}.{ch,biz}
2017-04-06 02:00:00 2017-04-13 01:59:59 dqwdqw{a..j}.{ch,biz}
2017-04-13 02:00:00 2017-04-20 01:59:59 dqxdqx{a..j}.{ch,biz}
2017-04-20 02:00:00 2017-04-27 01:59:59 dqydqy{a..j}.{ch,biz}
2017-04-27 02:00:00 2017-05-04 01:59:59 dqzdqz{a..j}.{ch,biz}
2017-05-04 02:00:00 2017-05-11 01:59:59 dradra{a..j}.{ch,biz}
2017-05-11 02:00:00 2017-05-18 01:59:59 drbdrb{a..j}.{ch,biz}
2017-05-18 02:00:00 2017-05-25 01:59:59 drcdrc{a..j}.{ch,biz}
2017-05-25 02:00:00 2017-06-01 01:59:59 drddrd{a..j}.{ch,biz}
2017-06-01 02:00:00 2017-06-08 01:59:59 dredre{a..j}.{ch,biz}
2017-06-08 02:00:00 2017-06-15 01:59:59 drfdrf{a..j}.{ch,biz}
2017-06-15 02:00:00 2017-06-22 01:59:59 drgdrg{a..j}.{ch,biz}
2017-06-22 02:00:00 2017-06-29 01:59:59 drhdrh{a..j}.{ch,biz}
2017-06-29 02:00:00 2017-07-06 01:59:59 dridri{a..j}.{ch,biz}
2017-07-06 02:00:00 2017-07-13 01:59:59 drjdrj{a..j}.{ch,biz}
2017-07-13 02:00:00 2017-07-20 01:59:59 drkdrk{a..j}.{ch,biz}
2017-07-20 02:00:00 2017-07-27 01:59:59 drldrl{a..j}.{ch,biz}
2017-07-27 02:00:00 2017-08-03 01:59:59 drmdrm{a..j}.{ch,biz}
2017-08-03 02:00:00 2017-08-10 01:59:59 drndrn{a..j}.{ch,biz}
2017-08-10 02:00:00 2017-08-17 01:59:59 drodro{a..j}.{ch,biz}
2017-08-17 02:00:00 2017-08-24 01:59:59 drpdrp{a..j}.{ch,biz}
2017-08-24 02:00:00 2017-08-31 01:59:59 drqdrq{a..j}.{ch,biz}
2017-08-31 02:00:00 2017-09-07 01:59:59 drrdrr{a..j}.{ch,biz}
2017-09-07 02:00:00 2017-09-14 01:59:59 drsdrs{a..j}.{ch,biz}
2017-09-14 02:00:00 2017-09-21 01:59:59 drtdrt{a..j}.{ch,biz}
2017-09-21 02:00:00 2017-09-28 01:59:59 drudru{a..j}.{ch,biz}
2017-09-28 02:00:00 2017-10-05 01:59:59 drvdrv{a..j}.{ch,biz}
2017-10-05 02:00:00 2017-10-12 01:59:59 drwdrw{a..j}.{ch,biz}
2017-10-12 02:00:00 2017-10-19 01:59:59 drxdrx{a..j}.{ch,biz}
2017-10-19 02:00:00 2017-10-26 01:59:59 drydry{a..j}.{ch,biz}
2017-10-26 02:00:00 2017-11-02 00:59:59 drzdrz{a..j}.{ch,biz}
2017-11-02 01:00:00 2017-11-09 00:59:59 dsadsa{a..j}.{ch,biz}
2017-11-09 01:00:00 2017-11-16 00:59:59 dsbdsb{a..j}.{ch,biz}
2017-11-16 01:00:00 2017-11-23 00:59:59 dscdsc{a..j}.{ch,biz}
2017-11-23 01:00:00 2017-11-30 00:59:59 dsddsd{a..j}.{ch,biz}
2017-11-30 01:00:00 2017-12-07 00:59:59 dsedse{a..j}.{ch,biz}
2017-12-07 01:00:00 2017-12-14 00:59:59 dsfdsf{a..j}.{ch,biz}
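
The validity window of each row can be derived from the seed alone. The following Python sketch computes the window in UTC; the helper name `week_window` is our own. The table above shows the same instants shifted to Swiss local time, i.e., one hour later in winter and two hours later in summer.

```python
from datetime import datetime, timedelta, timezone

WEEK = timedelta(weeks=1)
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def week_window(seed):
    """Return the first and last second (UTC) during which a seed is in use."""
    start = EPOCH + seed * WEEK
    end = start + WEEK - timedelta(seconds=1)
    return start, end

# Seed 2450 belongs to the first row of the table (dqgdqg)
start, end = week_window(2450)
print(start.isoformat(), end.isoformat())
```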

Actions taken

To prevent the Tofsee botnet operators from abusing the Swiss domain name space (ccTLD .ch) for hosting their botnet Command&Control infrastructure (C&C), we have discussed further actions with the registry of ccTLD .ch (SWITCH). Together with SWITCH and the Registrar of Last Resort (RoLR), all possible DGA domain name combinations have been set to non-registrable at the registry level. It is therefore not possible to register any of the DGA domain names for the next 12 months.
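
For readers who want to build a similar list for their own monitoring or filtering, the following Python sketch enumerates the .ch names covered by the table above. It is only an illustration and not the procedure used by the registry; the function names are hypothetical, and the starting seed 2450 corresponds to the first row of the table.

```python
def week_string(seed):
    """Base-26 encoding of the seed, most significant letter first."""
    letters = ""
    while True:
        letters += chr(seed % 26 + ord('a'))
        seed //= 26
        if not seed:
            break
    return letters[::-1]

def ch_blocklist(first_seed, weeks=52):
    """List every possible .ch DGA domain for `weeks` consecutive seeds."""
    names = []
    for seed in range(first_seed, first_seed + weeks):
        base = week_string(seed) * 2
        # Any of the ten letters can end up on .ch, depending on the PRNG
        names.extend(base + chr(ord('a') + i) + ".ch" for i in range(10))
    return names

blocklist = ch_blocklist(2450)
print(len(blocklist), blocklist[0], blocklist[-1])
```

Because the starting letter is unpredictable, all ten letters per week have to be included, which gives 10 names per week and 520 names for 52 weeks.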
