Recently published blog posts:
Go to the blog archive and browse all previous blog posts we have published so far.
Subscribe to the GovCERT.ch blog RSS feed to stay up to date and get notified about new blog posts.
Recently published whitepapers:
Subscribe to the whitepapers RSS feed to stay up to date and get notified about new whitepapers.
Report an incident: incidents[at]govcert{dot}chGeneral inquiries: outreach[at]govcert{dot}ch
The following email address can be considered as point of contact for FIRST members and other CERTs/CSIRTs:incidents[at]govcert{dot}ch
GovCERT.ch PGP-Key (preferred) Alternative GovCERT.ch PGP Key (for older versions of PGP without Curve25519 support) GovCERT.ch SMIME
Published on December 15, 2016 09:30 UTC by GovCERT.ch (permalink) Last updated on December 15, 2016 10:12 UTC
When we read the blog post about Mirai’s new DGA feature, a recommended read, we decided to try to re-implement its DGA. This was quite a challenge, as we’re not too familiar with the MIPS architecture and could not easily find any x86 sample with DGA.
The main DGA generation loop did not look too difficult to implement. Here is a copy of it, with some register renaming we did to facilitate a Python implementation:
We were initially surprised by the observation that, while we see 0x61 (a lowercase ‘a’) added at the end, we don’t seem to see any modulo 26 operation, so we would expect the base name to contain non-printable characters. Nevertheless, we tried to mimic the code instruction by instruction in Python; after repeatedly fixing typos, we came up with the following abomination of code:
def calcDGA(day,month,year,seed=0): msk32 = 0xffffffff yearVal = (year*8) & msk32 yearValB = 0x3ffff00 # lowest byte is irrelevant monValB = 0x51eb851f # “|month” can be ignored msk1 = 0xfffffff0 msk2 = 0xfffffffe res = "" for i in xrange(12): daySeed = day ^ seed yearVal -= year yearVal &= msk32 monMask = month & msk2 month2 = (month * 4) & msk32 yearVal ^= year month2 ^= month dayMask = day & 0x1fff month = (monMask << 4) & msk32 monVal = year & msk1 monMask = (monMask*2) & msk32 daySeed = (daySeed*4) & msk32 month = (month-monMask) & msk32 monVal = (monVal << 17) & msk32 yearVal = yearVal >> 11 dayMask ^= daySeed month2 >>= 8 year = monVal ^ yearVal dayMask = (dayMask<<4) & msk32 yearVal = (day >> 15) & msk32 month ^= month2 day = yearVal ^ dayMask monMask = year ^ month ^ day dayMask = ((monMask * monValB) >> 32) & msk32 dayMask >>= 3 yearVal = ((seed*8 + day) << 8) & yearValB monVal = seed >> 6 seed = monVal ^ yearVal monVal = (dayMask << 3) & msk32 yearVal = (dayMask << 5) & msk32 yearVal = (yearVal - monVal + dayMask) & msk32 monMask = (monMask - yearVal) & msk32 monMask = (monMask + 0x61) & msk32 res += chr(monMask & 0xff) yearVal = (year * 8) & msk32 suffix = "" if day&1 == 0: suffix = "online" if day%3 == 0: suffix = "tech" if day%4 == 0: suffix = "support" return res,suffix
Yes, this re-implementation is ugly – more than ugly. But surprisingly it produces printable output, as you can easily try it out yourself. While playing with it, we noticed that the letter ‘z’ seems to be absent – only 25 letters of the possible 26 appear. This already triggered some memories, but unfortunately we did not follow them yet. So, where hides the mapping to lowercase letters? The solution lies in the variable monValB, which contains 0x51eb851f, approximating 8/25 modulo 32 (0x51eb851f*25 = 0x8’0000’0007); the subsequent division by 8 (right-shifting of 3 bits) results in a division by 25 (stored in dayMask) – the compiler seems to avoid actual division instructions. Just after that, dayMask is again multiplied by 25, which is done by shifting it left 5 bits (multiplication with 32), shifting the same value once more by 3 bits (multiplication with 8), taking the difference (multiplication with 32-8=24), and finally adding dayMask once more (multiplication with 25). So, the compiler optimizes multiplications as well. After a final subtraction, that results in a “modulo 25”, and then 0x61 is added – note that “z” can never be produced this way, as suspected. Mystery solved.
Actually we see a similar trick later on in the code, the multiplication with 0xAAAAAAAB is in fact an approximation of 2/3 and used to calculate the modulo 3 value in the TLD selection code.
Now we tried to beautify the code, and after several copy and paste steps in Python, we reached the following:
def calcDGA(day,month,year,seed=0): msk32 = 0xffffffff monVal = 0x51eb0000 + month yearVal = (year*8) & msk32 yearValB = 0x3ffff00 msk1 = 0xfffffff0 msk2 = 0xfffffffe res = "" for i in xrange(12): yearVal = (((yearVal - year) ^ year) & msk32) >> 11 year = (((year & msk1) << 17) & msk32) ^ yearVal dayMask = (((day & 0x1fff) ^ (((day ^ seed)*4) & msk32))<<4)&msk32 yearVal = (day >> 15) & msk32 month = ((14*(month&msk2))&msk32)^((month^((month*4)&msk32)) >> 8) day = yearVal ^ dayMask monMask = year ^ month ^ day yearVal = ((seed*8 + day) << 8) & yearValB monVal = seed >> 6 seed = monVal ^ yearVal res += chr(monMask % 25 + 0x61) yearVal = (year * 8) & msk32 suffix = "" if day%2 == 0: suffix = "online" if day%3 == 0: suffix = "tech" if day%4 == 0: suffix = "support" return res,suffix
Which still works fine. But during the modification of this code, it started to look more and more like a DGA we already knew – especially after we saw the multiplication with 14. Actually, it looked very similar to the Ranbyus DGA:
FOR i = 0 TO 13 day = (day >> 15) ^ 16 * (day & 0x1FFF ^ 4 * (seed ^ day)) year = ((year & 0xFFFFFFF0) << 17) ^ ((year ^ (7 * year)) >> 11) month = 14 * (month & 0xFFFFFFFE) ^ ((month ^ (4 * month)) >> 8) seed = (seed >> 6) ^ ((day + 8 * seed) << 8) & 0x3FFFF00 domain[i] = ((day ^ month ^ year) % 25) + 'a'
We gave it a shot, and tried the C code. The only slight modifications required were to change the length of the base names from 14 down to 12, and the different selection of TLDs. Also, dga.c produces 30 domains for one day, while Mirai only uses the first one of them. Anyway, the code indeed produces the same base names:
Date | Ranbyus DGA | Mirai Base | Mirai DGA 2016-12-04 | vmdefmnsndojtu.tw | vmdefmnsndoj | vmdefmnsndoj.tech 2016-12-05 | xpknpxmywqsrce.net | xpknpxmywqsr | xpknpxmywqsr.support 2016-12-06 | lvfjcwwobycjik.com | lvfjcwwobycj | - 2016-12-07 | nympompksmfxsf.pw | nympompksmfx | - 2016-12-08 | kedbuffigfjsfq.in | kedbuffigfjs | kedbuffigfjs.online 2016-12-09 | pjapobyvmsnesx.me | pjapobyvmsne | pjapobyvmsne.support 2016-12-10 | gjnhvqeopqkjjf.cc | gjnhvqeopqkj | - 2016-12-11 | lokwpdhjqhcvtn.su | lokwpdhjqhcv | - 2016-12-12 | frwhojupdcrcat.tw | frwhojupdcrc | - 2016-12-13 | hufuhuerwgsigx.net | hufuhuerwgsi | - 2016-12-14 | bwhrdaumwuvnrb.com | bwhrdaumwuvn | bwhrdaumwuvn.support 2016-12-15 | daldkougomttqb.pw | daldkougomtt | - 2016-12-16 | gjuyvvwwmbwvdn.in | gjuyvvwwmbwv | - 2016-12-17 | lgigrtmnssejtd.me | lgigrtmnssej | - 2016-12-18 | vsuuqkqbavvaiu.cc | vsuuqkqbavva | - 2016-12-19 | bpmsfckfkrprom.su | bpmsfckfkrpr | bpmsfckfkrpr.support 2016-12-20 | oornsduuwjlifv.tw | oornsduuwjli | oornsduuwjli.tech 2016-12-21 | qjqubpciajocbs.net | qjqubpciajoc | qjqubpciajoc.tech 2016-12-22 | exvdaajegjuror.com | exvdaajegjur | exvdaajegjur.support 2016-12-23 | gsqvwjkbmwadex.pw | gsqvwjkbmwad | - 2016-12-24 | poorcetnmjfchv.in | poorcetnmjfc | poorcetnmjfc.online 2016-12-25 | ulficqrujekplt.me | ulficqrujekp | ulficqrujekp.support 2016-12-26 | ltwehsdfhkegkh.cc | ltwehsdfhkeg | - 2016-12-27 | qqnbfqrdjyjkqb.su | qqnbfqrdjyjk | qqnbfqrdjyjk.support 2016-12-28 | xtldkfvredxhll.tw | xtldkfvredxh | - 2016-12-29 | aojpocslpwsuik.net | aojpocslpwsu | aojpocslpwsu.support 2016-12-30 | tyxdowioaygktb.com | tyxdowioaygk | - 2016-12-31 | vtrndmhsgadage.pw | vtrndmhsgada | vtrndmhsgada.online
You can find a longer list of domains here: Mirai DGA botnet domains
As described in the above blog post, the DGA is not always applied. A wrapping code checks if the current date is in November or start of December (1st to 3rd); on these days the DGA is turned off, on all other days turned on. But the programmers forgot to consider the year in their code: it will turn off the DGA again in November 2017 for another month. So this DGA selection code is a bit buggy. Maybe the programmers don’t plan to use the DGA for such a long time – for sure there will be other samples out until then.
Please note that our interpretation of the TLD selection algorithm differs from the one described in the blog post: the code ends in three separate if statements, not in one combined if-elif-elif construct. It actually checks whether the residual value is divisible by 2, 3, or 4. However, there are cases where neither matches, e.g., 0xef2f25d5 representing December 12th, and no valid domain is produced by the algorithm. This bug is not too critical though: at some later day, a valid domain will be generated anyway.
Also interesting is the distribution of the 3 TLDs support, tech, online and the days where no TLD can be generated. Theoretically one would expect 25% “online”, 25% “support”, 16.6% “tech”, and 33.3% unassigned days. But, as pointed out in the comments of the blog post, the relevant residual value can only take one of 31 values (because only the day of the month and initial seed determine it), and so it is not so surprising that the distribution is unusual. Consequently, the TLD pattern is repeating for every month.
Back to top