That MYSTERY Duqu Trojan language: Plain old C

An appeal for help from the programming community has allowed antivirus analysts to classify the unknown language used to develop key components of the Duqu Trojan. Duqu creates a backdoor on compromised SCADA-based industrial control systems. The malware is closely related to the nuke plant centrifuge-busting Stuxnet worm and …

COMMENTS

This topic is closed for new posts.

  1. g e
    Black Helicopters

    So....

    It was created by professional developers [check] quite possibly in an organised team of developers [check] where some aspects of the team worked on separate chunks of the code [check] possibly unaware of what other coding cells were working on [check]

    So, CIA / NSA, anyone (or any other shady TLA agency)? Team A with lower clearance worked on general framework while team B (and maybe C) worked on sensitive and intricate exploit code to run within it...

    1. Ilgaz

      Framework oriented, Apple like

      They say it is Apple-style code. That sounds right, but the first object-oriented operating system was NeXT, the grandfather of OS X, which was loved by the NSA and probably the CIA.

    2. Anonymous Coward
      Anonymous Coward

      Re: So....

      I agree with the first half of what you say, but I don't see why it necessarily means that someone like CIA/NSA is behind it. The approach described is quite standard in any company that likes to develop software in a sensible manner (as opposed to hacking a solution out). Having one team develop the framework while another develops the application/payload allows different people to focus on what they're good at and should result in a well-structured and reliable program.

      I would say that the obvious desire for a professional development does indicate some level of organisation and a long-term aim, but you can find those features in many non-spook organisations.

      1. Anonymous Coward
        Anonymous Coward

        Re: So....

        Actually this eliminates a number of countries.

        I know I am going to be flamed on this one but in some countries anyone who could write a framework has either long emigrated or undergone frontal lobotomy to be in charge of an outsourcing project run by Crapita, W*nkPro or someone else of that ilk.

        So yeah, 3 letter agency - maybe. 2 letter and a number agency - definitely not. Many letters agency in a place with religious headgear - quite likely.

        Objective C is not necessarily old school. It is quite likely to be multiplatform new school too. I would not be surprised if we see the same framework in trojan targeting non-Windows industrial equipment next.

      2. Jonathon Green

        Re: So....

        Has anyone considered the possibility that just because this beast uses an OO framework developed by well-disciplined, well trained professional engineers that doesn't mean the actual payload code was developed by them...

        My experience is that this sort of code is usually generated with the aid of a pretty compact set of cleverly implemented, well honed macros, data structures and the odd script, and that being able to *use* this sort of stuff doesn't necessarily mean being able to generate it - could it be that someone's got hold of someone else's toolkit (could be a published academic project, could be something that's escaped from a long-gone commercial project, could be anything), possibly as a result of working on an entirely legitimate project at a low level, and is now turning it to their own nefarious uses?

        1. John Smith 19 Gold badge

          Re: So....

          Has anyone considered the possibility that just because this beast uses an OO framework developed by well-disciplined, well trained professional engineers that doesn't mean the actual payload code was developed by them...

          Err

          The fact that the payload was fairly quickly identified as being C++ in MS developer studio suggests that already.

    3. Lee Dowling Silver badge

      Re: So....

      Or, plucking anything NOT a boring conspiracy theory out of the air:

      - any professional team of software engineers working for a company who's slightly amoral.

      Russian spammers, industrial rivals ("Look how rubbish their security is! Buy us instead!"), god-knows-what that could lead to profit indirectly. What better way to make attack code within a legitimate corporate environment than to make it so modular that no one programmer actually knows what it is while they are working on it?

      First, we don't even know what it sends back to central servers. Second, it doesn't "damage" anything, even the SCADA control systems it seeks out. Third, it steals PKE certificates. That's about all we know about it.

      My actual "conspiracy theory" explanation would be: Written by a bunch of professional programmers who usually do professional programming work, in a modular way so that few know its actual purpose or where their code ends up, or could give them away. Then distribute to key sites, "discovered" (or just wait until someone DOES discover it), then pick major nation that's a key "potential" culprit and blame them for it with zero evidence (my bet at the moment would be China but Iran's a strong candidate too), in order to justify - what? A war? A cyber-attack? Internet censorship? A Great Wall of America? Removing their nuclear ability? I don't know.

      But the media certainly seem to have hooked into those attacks fast and been quick to name potential sources and blame them without evidence past "we saw one copy spread from a China IP". By which argument, the Chinese government makes every spam message on the planet with the purpose of attacking "Western" government by a social DoS. It's just as insane and just as without proof (and, actually, probably more effective).

      All this tells us: Security analysts don't know anything (I've seen lots of posts on Duqu in the last few weeks that say it's DEFINITELY not C, etc.) and still haven't properly reverse-engineered it (apparently too busy trying to discover the language so they can google "decompiler for language X"), and the people who wrote it had done some programming in their lives enough to get used to doing things properly and modularly.

      Anything beyond that is conspiracy theory, and a bad one at that. And picking ANY source on such limited information (Hell, they could find "(C) China 2012" in the damn thing - that's still not any sort of proof it was from China, or the Chinese government, for example) is ridiculously stupid.

      Ask yourself: Why have governments been SO keen and SO quick to suggest the origin of code that we'll probably NEVER know exactly where it came from?

    4. Anonymous Coward
      Anonymous Coward

      Re: So....

      This could have been written by a software company at the behest of an intelligence agency. There are plenty of software companies founded by ex-Mossad/CIA/NSA/etc staffers who could be tapped up for this kind of work.

  2. Graham Bartlett
    Mushroom

    "more efficient, smaller, faster, more flexible and re-useable"

    Cue a re-run of the 1990s C versus C++ flame wars on whose code is most efficient, smallest and fastest!

    (Yes, the fuse is lit. I'm running now - better keep up...)

    1. Tchou
      Pint

      Re: "more efficient, smaller, faster, more flexible and re-useable"

      It is obviously about three times faster to type 'C' than 'C++'.

      1. NomNomNom

        Re: "more efficient, smaller, faster, more flexible and re-useable"

        "It is obviously about three times faster to type 'C' than 'C++'."

        That is just the kind of pitiful appeal to an irrelevant speed difference that I've come to expect from C loons.

        1. Tchou
          Pint

          Re: "more efficient, smaller, faster, more flexible and re-useable"

          Come on, didn't you catch my joke?

          Surround yourself with try/catch mate, it could save you from saying stupidities....

      2. Anonymous Coward
        Anonymous Coward

        Re: "more efficient, smaller, faster, more flexible and re-useable"

        > It is obviously about three times faster to type 'C' than 'C++'.

        Only twice as fast if you are going to waste time with quotes.

    2. itzman

      Re: "more efficient, smaller, faster, more flexible and re-useable"

      FORTH obviously :-)

      A language that vanished with cheap large capacity RAM.

      .

      1. Vic

        Re: "more efficient, smaller, faster, more flexible and re-useable"

        > A language that vanished with cheap large capacity RAM.

        A language that hasn't vanished at all.

        I've got several customers for whom I write Forth. One project in particular might end up funding my retirement in a couple of years.

        You'd be surprised where Forth crops up...

        Vic.

    3. Loyal Commenter Silver badge
      Boffin

      Re: "more efficient, smaller, faster, more flexible and re-useable"

      I would have thought it depends entirely on the compiler, since the eventual target is going to be machine language. Various compilers may wrap this machine language in various levels of cruft...

  3. Notas Badoff
    WTF?

    Tocharian A and B are not intercomprehensible

    and everybody's forgotten about Tocharian C...

    http://en.wikipedia.org/wiki/Tocharian_languages#Writing_system

    Anybody else think not being able to identify C compiled code is itself a comment on the industry? That it was object-oriented confusing them is a crock.

    1. bazza Silver badge

      Re: Tocharian A and B are not intercomprehensible

      "Anybody else think not being able to identify C compiled code is itself a comment on the industry? "

      Yes!

    2. Destroy All Monsters Silver badge
      Thumb Down

      Re: Tocharian A and B are not intercomprehensible

      "Anybody else think not being able to identify C compiled code is itself a comment on the industry?"

      I dunno. I don't think so. Can you do it? Are you trick-cyclist enough?

      There are many ways to compile source down to machine code, and if the result has something that looks like "this" pointers either on-stack or in registers, it's not obvious where it's coming from. At all.

      1. heyrick Silver badge

        Re: Tocharian A and B are not intercomprehensible

        "I dunno. I don't think so. Can you do it? Are you trick-cyclist enough?"

        Often, actually, yes. Unless they replaced the entire standard library, you will find that a lot of compilers give hints. From RISC OS "Shared C library not loaded"; Microsoft's "Microsoft Visual C++ Runtime Library"; Linux/GCC "GLIBC_2.3"; lcc is harder, but it contains "malloc", "strncmp" plus "c:\lcc\projects\...blah..."...

        Just had a look at the exe for InnoSetup uninstall program. It isn't laid out like traditional C (code first, with strings at the end). At a guess, I'd say some sort of Pascal but tending towards high level concepts a la VB. Thus, most likely shot given file layout would be something like Delphi.

        So, in answer to your question, you can often tell a lot about the file "looking at it" in a text editor (not a disassembler, not yet).
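The string-spotting heuristic described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the function names `contains` and `guess_toolchain`, the label strings, and the choice of markers (taken from the strings quoted in the comment) are hypothetical, and a real binary can easily lack or fake any of these strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the byte sequence `needle` occurs anywhere in buf[0..len). */
static int contains(const unsigned char *buf, size_t len, const char *needle)
{
    size_t n = strlen(needle);
    if (n == 0 || n > len)
        return 0;
    for (size_t i = 0; i + n <= len; i++)
        if (memcmp(buf + i, needle, n) == 0)
            return 1;
    return 0;
}

/* Hypothetical helper: guess the toolchain from marker strings left in a
 * binary image. Returns a label, or NULL if no known marker is present.
 * The markers are the ones quoted in the comment, not a complete list. */
const char *guess_toolchain(const unsigned char *buf, size_t len)
{
    static const struct { const char *marker; const char *label; } hints[] = {
        { "Microsoft Visual C++ Runtime Library", "Microsoft Visual C++" },
        { "GLIBC_",                               "GCC with glibc" },
        { "Shared C library not loaded",          "RISC OS shared C library" },
    };
    assert(buf != NULL);
    for (size_t i = 0; i < sizeof hints / sizeof hints[0]; i++)
        if (contains(buf, len, hints[i].marker))
            return hints[i].label;
    return NULL;
}
```

A caller would read the executable into memory and pass the buffer and its length. The search works on raw bytes rather than C strings because binaries contain embedded NUL bytes.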

  4. Ramazan
    Coffee/keyboard

    wtf is object orientated C?

    Do you mean OOPC? Or Objective C? Or some unknown 01d5k001 l33t OO C preprocessor framework? You don't know, so it's M3G4 F4IL, DUD3, obviously

    1. bazza Silver badge

      Re: wtf is object orientated C?

      Once upon a time C++ compilers merely translated C++ source code into C source code then ran a C compiler.

      Just because C isn't restricted to being object orientated (nor is C++ for that matter) doesn't mean that you can't write code in an object oriented way. Function pointers in structures aren't a million miles away from methods inside objects (that's probably how the early C++ compilers/translators did it). Though of course with C you can cheat, so one would generally opt to be disciplined when coding that way to stick to whatever rules you choose. I wrote an entire windowing graphics library that way back in the really old days (1980s).
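As a minimal sketch of "function pointers in structures" standing in for methods, assuming made-up names (`Shape`, `shape_rect`, `shape_triangle`, `shape_area`) that come from neither the comment nor Duqu:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "class": the struct carries a pointer to its own method. */
typedef struct Shape Shape;
struct Shape {
    double (*area)(const Shape *self);  /* "method" slot */
    double w, h;                        /* "member" data */
};

static double rect_area(const Shape *self)     { return self->w * self->h; }
static double triangle_area(const Shape *self) { return 0.5 * self->w * self->h; }

/* "Constructors": pick the implementation when the object is created. */
Shape shape_rect(double w, double h)
{
    Shape s = { rect_area, w, h };
    return s;
}

Shape shape_triangle(double w, double h)
{
    Shape s = { triangle_area, w, h };
    return s;
}

/* The caller never names the concrete function; it goes through the pointer. */
double shape_area(const Shape *s)
{
    assert(s != NULL && s->area != NULL);
    return s->area(s);
}
```

Nothing in the language stops a caller from overwriting `area` or calling `rect_area` on a triangle, which is the "you can cheat" part: the discipline is by convention only.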

      1. This post has been deleted by its author

      2. Ramazan

        Re: wtf is object orientated C?

        I prefer to use C preprocessor without forcing OO paradigm on code, but there are some who worship OO. Here's what Igor Soumenkov from Kaspersky Labs says:

        "The Duqu Framework shares many principles of libevent, but it is completely object-oriented, even all events and callbacks are wrapped in objects ... Now, there are several open-source “OO C” frameworks available, and some of them produce code constructions that are very similar to those in the Duqu code. The best match we found is SOO (Simple Object Orientation for C), however it could not have been used in Duqu, because it was only published when the Trojan was already in the wild"

        1. Charles Manning

          The whole Linux kernel is OO C

          Object oriented C really just means using function pointers etc to set up interfaces so that it is easy to plug in new behaviour/code/whatever. That's how Linux is able to plug in different file systems, device drivers, schedulers and such.

          Of course this is nothing special to Linux either. BSD, Windows and other OSs use exactly the same mechanism (though some use C++ wrappers in places too).
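The plug-in mechanism described above can be sketched as a table of function pointers that each backend fills in and registers. Everything here (`struct codec_ops`, `register_codec`, `find_codec`, the two toy backends) is a hypothetical illustration and not an actual kernel interface:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical interface: every backend supplies the same set of operations. */
struct codec_ops {
    const char *name;
    void (*encode)(char *buf);          /* transforms buf in place */
};

static void encode_upper(char *buf)
{
    for (; *buf; buf++)
        *buf = (char)toupper((unsigned char)*buf);
}

static void encode_reverse(char *buf)
{
    size_t n = strlen(buf);
    for (size_t i = 0; i < n / 2; i++) {
        char t = buf[i];
        buf[i] = buf[n - 1 - i];
        buf[n - 1 - i] = t;
    }
}

/* Two "drivers" implementing the interface. */
const struct codec_ops upper_codec   = { "upper",   encode_upper };
const struct codec_ops reverse_codec = { "reverse", encode_reverse };

#define MAX_CODECS 8
static const struct codec_ops *registry[MAX_CODECS];
static size_t registered;

/* Core code only knows the interface, never the concrete backend. */
int register_codec(const struct codec_ops *ops)
{
    assert(ops != NULL && ops->name != NULL && ops->encode != NULL);
    if (registered == MAX_CODECS)
        return -1;
    registry[registered++] = ops;
    return 0;
}

const struct codec_ops *find_codec(const char *name)
{
    for (size_t i = 0; i < registered; i++)
        if (strcmp(registry[i]->name, name) == 0)
            return registry[i];
    return NULL;
}
```

Adding a new behaviour means writing one more `struct codec_ops` and registering it. The lookup and dispatch code does not change.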

          1. Ramazan

            Re: The whole Linux kernel is OO C?

            Sometimes Linux kernel code is useful outside of kernel BTW (for instance, you can use linux/list.h in userspace projects, but you need to hack it first), but from my experience I cannot say it's wholly OO. Quite the contrary...

            1. Anonymous Coward
              Anonymous Coward

              Re: The whole Linux kernel is OO C?

              The OP was correct. The Linux kernel is object oriented C. It has data abstraction, inheritance and polymorphism. Also, because it's the kernel it can cheat and edit its own code at run time :)

              Of course "the whole linux kernel is OO C" is not the same as "the Linux kernel is wholly OO C" just as object orientedness is not the only abstraction. There is a lot of both procedural and iterative abstraction in the kernel (the previously mentioned lists are responsible for a lot of the iterators you find scattered through the kernel).

        2. John Hughes
          Happy

          Re: wtf is object orientated C?

          "however it [SOO] could not have been used in Duqu, because it was only published when the Trojan was already in the wild"

          Well, it could have been used by Duqu if Duqu was written by the same people who wrote SOO.

    2. Ken Hagan Gold badge

      Re: wtf is object orientated C?

      Amongst other things, it is clear evidence of illiteracy, since "orientated" isn't a proper word.

      1. diodesign (Written by Reg staff) Silver badge

        Re: Re: wtf is object orientated C?

        Christ, I knew today was going too well. I have no idea how that slipped past. Long day. Fixed.

        C.

      2. Anonymous Coward
        Anonymous Coward

        Re: wtf is object orientated C?

        Orientated vs oriented is a British vs American thing; both are acceptable but the American one is arguably slightly more correct, from discussion I've read. I was surprised, twenty-odd years ago, to see "object-oriented" rather than "object-orientated"; I assumed it was an American corruption but that doesn't seem to be the case; e.g., see http://forum.wordreference.com/showthread.php?t=431002

        1. Ken Hagan Gold badge
          Headmaster

          Re: wtf is object orientated C?

          An interesting link, if only to see the number of Americans who believe that the "ate" form is a Britishism. Speaking as a Brit, I can assure you that it bloody isn't. It's wrong over here too. It has *always* been object-oriented.

          Doesn't stop people making the mistake but that, in turn, doesn't mean we have to roll over and let the language get mangled again.

  5. Anonymous Coward
    Anonymous Coward

    Is it just me or

    1) Do Kaspersky look dafter with every press release?

    2) Does the idea of "professional", "old skool" developers choosing to use VC when they have other options ie when not dictated by corporate policy seem utterly utterly laughable?

    3) Does (2) support the concept that this code came from a clueless badly-micromanaged organisation e.g. corporate IT, government department, etc?

    1. SoaG

      Re: Is it just me or

      "came from a clueless badly-micromanaged organisation"

      That doesn't really narrow it down much. Where is this workplace of myth and legend that doesn't fit your description, and are they hiring in Canada?

      1. Anonymous Coward
        Anonymous Coward

        "clueless badly-micromanaged organisation"

        "Where is this workplace of myth and legend that doesn't fit your description, and are they hiring in Canada?"

        Quite. I'm afraid it's not so much "where" as "when". Back in the day when technology was driven and managed by technologists and engineers. These days, on the whole the PHBs and clueless beancounters are in charge around the globe. There are a few exceptions but the ones I know of are smaller companies (albeit with global markets) and I'm afraid I don't know any in Canada (recruiting or otherwise). Best of luck anyway.

    2. Sureo
      Gimp

      Re: Is it just me or

      Plain old C. My first love and still my favorite.

  6. Pinkerton
    Coat

    ...More efficient, smaller, faster, more flexible and re-useable code...

    So we've eliminated disgruntled Microsoft employees, then.

    Boom tish. Thangyouverymuch. I'm available for weddings, birthdays and bar mitzvahs, etcetera.

    1. Anonymous Coward
      Anonymous Coward

      Re: ...More efficient, smaller, faster, more flexible and re-useable code...

      you'll starve

  7. Anonymous Coward
    Anonymous Coward

    Missed the obvious conclusion

    The article states: "The best match we found is SOO (Simple Object Orientation for C), however it could not have been used in Duqu, because it was only published when the Trojan was already in the wild. "

    Obvious conclusion is that the VXers open-sourced their toolkit after Duqu was released!

    1. John McCallum
      Devil

      Re: Missed the obvious conclusion

      I am not in the software business but that was my thought when I read that quote. Something like "hey, how great am I".

      1. Destroy All Monsters Silver badge
        Big Brother

        Re: Missed the obvious conclusion

        I'm sure a "cleaner team" has already been dispatched to erase any suspicious traces.

  8. Anonymous Coward
    Anonymous Coward

    : (((

    "That MYSTERY Duqu Trojan language: Plain old C"

    So, NOT written in Thetan?

    : (

  9. This post has been deleted by its author

  10. Jason Hindle

    One thing's clear. It's not the work of script kiddies.....

    The pointer stuff will be a bit too hard for them. While it doesn't necessarily have the fingerprints of the CIA/NSA/Mossad, I can't think of too many people out there who'd have the motivation to steal information about industrial control systems with nuclear applications (or, as with the apparently related Stuxnet, disrupt industrial control systems that just happen to be in Iranian nuclear research facilities).

  11. Majid
    FAIL

    My money is on the virus writers.

    If something as trivial as normal C written in an object oriented way, together with non-default optimizing flags can throw the virus-hunters completely off, then any decent programmer (and I mean someone that can write some c and some assembly code) can completely outwit them.

    Oh and the oldest C++ compilers were just preprocessors, first converting it to C with some arrays of function pointers resembling the vtable. So C and C++ were originally not that far apart.
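A rough sketch of what "C with arrays of function pointers resembling the vtable" can look like when written by hand. The names (`animal`, `dog`, `bird`, `animal_vtbl`) are invented for illustration, and this is not claimed to match the output of any particular C++ translator:

```c
#include <assert.h>
#include <stddef.h>

struct animal;

/* One table of function pointers per "class", shared by all its instances. */
struct animal_vtbl {
    int (*legs)(const struct animal *self);
    int (*skill)(const struct animal *self);
};

/* "Base class": holds only the hidden pointer to the table. */
struct animal {
    const struct animal_vtbl *vptr;
};

/* "Derived classes": the base comes first, so a pointer to the derived
 * struct can be converted to a pointer to its base and back. */
struct dog  { struct animal base; int tricks; };
struct bird { struct animal base; };

static int dog_legs(const struct animal *self)  { (void)self; return 4; }
static int dog_skill(const struct animal *self) { return ((const struct dog *)self)->tricks; }
static const struct animal_vtbl dog_vtbl = { dog_legs, dog_skill };

static int bird_legs(const struct animal *self)  { (void)self; return 2; }
static int bird_skill(const struct animal *self) { (void)self; return 0; }
static const struct animal_vtbl bird_vtbl = { bird_legs, bird_skill };

void dog_init(struct dog *d, int tricks) { d->base.vptr = &dog_vtbl; d->tricks = tricks; }
void bird_init(struct bird *b)           { b->base.vptr = &bird_vtbl; }

/* "Virtual calls": look the function up in the table, pass the object as "this". */
int animal_legs(const struct animal *a)
{
    assert(a != NULL && a->vptr != NULL);
    return a->vptr->legs(a);
}

int animal_skill(const struct animal *a)
{
    assert(a != NULL && a->vptr != NULL);
    return a->vptr->skill(a);
}
```

In the compiled output, each call becomes "load a pointer from the object, load an address from a table, call indirectly with the object as first argument". That pattern looks much the same whether the source was C++ or disciplined C.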

    So I would say, learn to read assembly code and don't rely so much on reverse engineering tools, that will decompile your assembly into a language you do understand.

  12. John Smith 19 Gold badge
    Meh

    A few notes

    For those who don't know.

    In assembly language pointers-to-functions are usually called *addresses*. Pretty much *every* OS function list is a table of addresses or jump table. Even *DOS* could have been turned into a reasonable OS as the poorer versions were swapped out and better ones substituted. TSRs did it on the fly (hacking the jump table has been used in software up to mainframes).

    Historically the C++ *pre-processor* that Bell Labs originally developed mapped the methods to a virtual function table (and AFAIK most modern full compilers still do).
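To make the jump-table idea concrete, here is a small sketch in C, assuming invented names (`service_table`, `call_service`, `hook_service`, `install_demo_hook`). It models only the concept of swapping an entry on the fly and chaining to the old one, not any real DOS or mainframe interface:

```c
#include <assert.h>
#include <stddef.h>

typedef int (*service_fn)(int arg);

static int svc_double(int x) { return 2 * x; }
static int svc_negate(int x) { return -x; }

/* The "jump table": service number N is whatever address sits in slot N. */
static service_fn service_table[] = { svc_double, svc_negate };
#define SERVICE_COUNT (sizeof service_table / sizeof service_table[0])

int call_service(size_t n, int arg)
{
    assert(n < SERVICE_COUNT);
    return service_table[n](arg);
}

/* Replace one entry and hand back the old address so the hook can chain to it. */
service_fn hook_service(size_t n, service_fn replacement)
{
    assert(n < SERVICE_COUNT && replacement != NULL);
    service_fn old = service_table[n];
    service_table[n] = replacement;
    return old;
}

/* A TSR-style hook: do something extra, then call the previous handler. */
static service_fn prev_double;
static int hooked_double(int x) { return prev_double(x) + 1; }

void install_demo_hook(void)
{
    prev_double = hook_service(0, hooked_double);
}
```

Callers keep using service number 0 and never notice that the address behind it changed, which is what makes the technique useful both for upgrades and for mischief.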

    Delphi was basically object orientated Pascal and (internally) also used this. Full Pascal *did* include a way to do pointers to functions. Perhaps if Turbo Pascal had included them from first release the development landscape would be *very* different.

    It might be a bit surprising it's taken this long to spot that it's C all along but given the almost *infinite* ways the OO functions could be *implemented* (and given it's a framework how the developers *might* have overridden them by hacking the C source) perhaps the big breakthrough was to stop looking at it *as* language X but as C wrapped in framework X.

    This *totally* different approach suggests a completely separate team using its in-house developed framework, or possibly someone who has *inherited* this from a much larger effort. A little going away present taken by a disgruntled ex-employee, re-purposed for a bit of private enterprise?

    So either a big team developing a fairly complex code base from *scratch* or a team that is leveraging an *existing* way of doing business.

    While I could believe there would need to be little communication between the teams (penetration and payload) I'd find it very hard to believe the payload team would be ignorant of *what* they were building.

    BTW the Kaspersky team even hit the *version* of Visual C (2008) that was used which is pretty impressive.

    Something tells me we have not seen the last of this gang's efforts.

    1. Destroy All Monsters Silver badge
      Big Brother

      Re: A few notes

      Checking the comments on http://www.securelist.com/en/blog/667/The_Mystery_of_the_Duqu_Framework, we find:

      *A commenter called "SCooke" says:*

      Re: That code looks familiar

      It's easier to figure this out if you consider vendor sourcing. The work was probably done by a government. And, whether the software was sourced through a US agency or whether a US agency itself was the creator, the net result is the same: you're looking for a major GSA-contracted firm who A) has clearance, B) has a compiler team, C) has a track record of providing similar product to the US government, and D) has a compiler codebase that looks kind of unfamiliar and not mainstream.

      The likely suspects fitting that set of criteria are IBM, Microsoft, SAS and SAIC. All the others (remnant AT&T, HP, remnant SGI... who am I forgetting?) incorporate a considerable amount of fairly recognizable shared compiler code in their offerings. Since you've disqualified Microsoft, my bet is on IBM.

      I don't think it's SAS, because their compiler codebase is ancient. I don't think it's SAIC, because for them this would be a fairly difficult project. Three reasons why I think IBM.

      First is that IBM has a library of bizarro options to select from. There's an internal HLASM-to-C frontend. There's all the CSet descendants. They've got research versions of damn near everything. (I'd try getting ahold of the ia32 version of CSet - probably hard to come by, but out there). They've also got a Windows source license, and if you were going to write a virus, that's always handy.

      Second is that IBM has a history of doing projects like this. If there was a federal bid, they almost certainly would have been a bidder.

      Third is that the project could have been run out of IBM Haifa. A number of the old IBM AV team probably either were there or ended up there, so it wouldn't be too far out of their wheelhouse. And if you wanted to build a state-sponsored virus, you'd almost certainly want to build it in a country who already has near-active hostilities with the intended target for the virus such that those acts of aggression don't become de facto acts of war for you.

      If you want to dig into that, have someone from IBM wander through the employee-written and internal software libraries for all the preprocessor frontends for various languages and compiler backends that output to ia32. Probably none of that is inherently secret. I bet you'll find something that produces similar output.

      *A commenter on the Securelist called "2esoskwahom4" says:*

      sniffing from wrong direction, what does history tell you?

      both "As400tech" and "SCooke" handed you the best hints.

      A few years back I worked at East Fishkill long enough to meet eggs rubbing elbows with the 'black' GSA guys working down in Endicott and Watson (mostly the latter). The big topic at the time was exorbitantly hi-priced memory being frantically consumed (we knew it was NSA, we realized later for upgrading Echelon to make its data more transparent for future TIA transactions) post-911.

      A cyberop like this would inevitably end up at big blooze' shop for the reasons scooke mentions: NOTHING gets thrown away by Endicott's hacks (a somewhat frustrating problem for workers needing access to boxes), their library of tools is as incomprehensibly massive as it is old. Indeed, Watson has not infrequently sent researchers there first to get their feet wet.

      This probably initiated at Watson under NSA aegis, followed by research of tools at Endicott's library, then a handover to Haifa after payload completion. It's unrecognizable because NSA would demand that; any self-respecting beemer hack would know to hit up Endicott's libraries to make it so.

      That said, it might be a little naive thinking any ibm'er you ask is gonna be successful convincing one of the mustier Endicott hacks to pony up from their libraries. scooke is right none of it is officially secret - but it frequently is VERY proprietary for some of them. A handful of old Endicott hacks still spend more time there than at home. That should tell you something about their priorities. It's all who you know. 'n no, I don't.

  13. Dan 55 Silver badge
    Facepalm

    Visual C++ Express is now officially l33t

    Ye Gods, the decline of the industry is worse than I thought.

    Still, at least it means I should manage to hold onto a job.

  14. Gordon Fecyk
    Facepalm

    So old, it's new again!

    I have faith in the anti-virus industry, all right. Just not the kind that anti-virus sales droids like their customers to have.

  15. druck Silver badge
    Mushroom

    CFront

    It could have been C++ converted to C by CFront, an aberration which we had to go through in the late 80s before native C++ compilers. The C output from that was truly hideous.
