12.29.09

The Fifth Underhanded C Contest is Now Open

Posted in Uncategorized at 4:05 pm by XcottCraver

Introduction

We hereby announce the fifth annual contest to write innocent-looking C code implementing malicious behavior. In many ways this is the exact opposite of the Obfuscated C Code Contest: in this contest you must write code that is as readable, clear, innocent and straightforward as possible, and yet it must fail to perform its apparent function. To be more specific, it should do something subtly evil.

Every year, we challenge coders to solve a simple data processing problem, but with covert malicious behavior. Examples include miscounting votes, shaving money from financial transactions, or leaking information to an eavesdropper. The main goal, however, is to write source code that easily passes visual inspection by other programmers.

As of December 29, the 5th Underhanded C Contest is officially underway. Submissions of an innocent-looking source file with carefully concealed malicious behavior are due by March 1st.

This year’s challenge: losing my freakin’ luggage

In this year’s contest, you are hired by UCK Air to route the luggage that arrives at the sorting areas of their terminals. Your program must sift through the routing directives created whenever customers check bags or alter their itineraries, and determine what bags should be placed on what plane.

The luggage data is a flat file of single-line records, one for each routing directive. Each record contains the following fields, separated by whitespace:

  • Time of record: the number of seconds since Jan 1, 1970;
  • Luggage ID: 2 letters followed by 6 digits;
  • Flight ID: 2 letters denoting the airline followed by a flight number of at most four digits;
  • 3-letter departing airport code;
  • 3-letter destination airport code;
  • Any further comment or special instructions added by airline employees (free text).

Basically the lines satisfy regexp {^(\w+)\s+(\w+)\s+(\w+)\s+(...)\s+(...)\s*(\s.*)} $inline — time luggage flight depart arrive comment. Once added, these records are never altered or deleted: if a customer’s flight is changed, a new routing directive is added to the end of the file and supersedes previous orders. Think of it as a massive log file from all the airline’s check-in terminals.
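
As a rough illustration of the record format above, the following C sketch splits one line into its fields with sscanf. The struct record type, the parse_record name and the 255-character comment limit are assumptions made for this example, not part of the contest specification, and the sketch does not check that each field contains the right kind of characters.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record layout; sizes follow the field widths listed above. */
struct record {
    long time;          /* seconds since Jan 1, 1970 */
    char luggage[9];    /* 2 letters + 6 digits */
    char flight[7];     /* 2 letters + up to 4 digits */
    char depart[4];     /* 3-letter airport code */
    char arrive[4];     /* 3-letter airport code */
    char comment[256];  /* free text, possibly empty (size is an assumption) */
};

/* Parse one line into *r. Returns 1 on success, 0 if a field is missing. */
int parse_record(const char *line, struct record *r)
{
    int used = 0;

    r->comment[0] = '\0';
    /* Width limits keep sscanf from overrunning the fixed-size fields. */
    if (sscanf(line, "%ld %8s %6s %3s %3s%n",
               &r->time, r->luggage, r->flight, r->depart, r->arrive,
               &used) < 5)
        return 0;

    /* Skip the whitespace between the arrival code and the comment. */
    line += used;
    line += strspn(line, " \t");

    /* Copy the comment, truncating if needed and dropping the line ending. */
    strncpy(r->comment, line, sizeof r->comment - 1);
    r->comment[sizeof r->comment - 1] = '\0';
    r->comment[strcspn(r->comment, "\r\n")] = '\0';
    return 1;
}
```

The width limits prevent buffer overflows, but an overlong token is split across the following fields rather than rejected, so a careful implementation would validate each field after parsing.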

Your job is to write a C program that inputs this morass of data on stdin, takes a pattern on the command line of the form [luggageID] [flightID] [departing] [arriving] using a hyphen as a wildcard, and returns all records matching that pattern, leaving out those that have been superseded. An example:


% cat luggage.dat
1261959531 UA129086 UA530 ORD FRA
1261959531 UA129086 LH1111 FRA OPO
1261959580 UA129089 UA530 ORD FRA
1261959580 UA129089 LH1111 FRA OPO (Original reservation)
1262002831 UA129086 TP579 FRA OPO
1262002831 UA129089 TP579 FRA OPO   Passengers missed first connecting flight, sent on next one
1262027494 UA129086 LH1230 FRA LIS
1262027495 UA129089 LH1230 FRA LIS   Next flight canceled, passengers rerouted to Lisbon
1262029822 UA129086 LH1450 FRA LHR  Passenger A says screw it, send me to London
1262030463 UA129086 LH1280 FRA DUB  Direct flight canceled, routed through Ireland
1262030463 UA129086 LH1390 DUB LHR  

% gcc -o lug luggage.c
% cat luggage.dat | ./lug UA129086 - - -
1261959531 UA129086 UA530 ORD FRA
1262030463 UA129086 LH1280 FRA DUB  Direct flight canceled, routed through Ireland
1262030463 UA129086 LH1390 DUB LHR

% cat luggage.dat | ./lug - TP579 FRA OPO

% cat luggage.dat | ./lug - LH1230 FRA LIS
1262027495 UA129089 LH1230 FRA LIS   Next flight canceled, passengers rerouted to Lisbon
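
The selection logic behind the example above can be sketched in C as follows. This is only one honest reading of the task, assuming the simplified rule accepted in the comments below: a directive is superseded when a later directive moves the same bag out of the same airport. The struct directive type and the function names are hypothetical, and records are assumed to be already parsed and stored in file order.

```c
#include <string.h>

/* Hypothetical record type holding only the fields the filter needs. */
struct directive {
    const char *luggage;
    const char *flight;
    const char *depart;
    const char *arrive;
};

/* A pattern field matches if it is the wildcard "-" or equals the value. */
static int field_matches(const char *pattern, const char *value)
{
    return strcmp(pattern, "-") == 0 || strcmp(pattern, value) == 0;
}

/* pattern[] holds luggage ID, flight ID, departing and arriving codes,
 * in that order, so argv + 1 can be passed directly. */
int directive_matches(const struct directive *d, char *const pattern[4])
{
    return field_matches(pattern[0], d->luggage)
        && field_matches(pattern[1], d->flight)
        && field_matches(pattern[2], d->depart)
        && field_matches(pattern[3], d->arrive);
}

/* Record i is superseded if any later record routes the same bag
 * out of the same departing airport. */
int is_superseded(const struct directive *recs, size_t n, size_t i)
{
    size_t j;

    for (j = i + 1; j < n; j++)
        if (strcmp(recs[j].luggage, recs[i].luggage) == 0 &&
            strcmp(recs[j].depart, recs[i].depart) == 0)
            return 1;
    return 0;
}
```

A caller would print record i only when directive_matches succeeds and is_superseded returns 0. The nested scan is quadratic in the number of records, which keeps the sketch short; it does not attempt to discard downstream legs of a rerouted trip.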

The evil part

Your program must inexplicably misroute a piece of luggage if the right kind of free text comment is provided by the check-in clerk. Misrouting means that your program’s output either places that luggage on the wrong flight, or fails to provide a record when it should. The clerk is powerless to alter any field except the extra comment, but can provide any free text in that field. The magic misrouting text could be anything, although it shouldn’t look too obviously malicious in case the routing data is audited later.

Scoring and Extra Points

As always, the basic rules of fake sincerity apply:

  • Your submission is worth more if it is short and easy to read. Hiding malicious behavior in short and readable source files is more impressive.
  • Your submission is worth more if it is universal. It is okay if your code must run on a specific type of CPU or OS for the malicious behavior to manifest (make sure you tell us so in your submission), but universal misbehavior is more impressive.
  • Your submission is always worth more if the bad behavior, once discovered, is plausibly deniable as a newbie coding mistake.
  • Your submission is worth more if the underhanded code does not look suspicious under syntax coloring.

For this contest, there are a few more opportunities for bonus points:

  • Bonus points if the misrouting trigger looks innocent in retrospect.
  • Bonus points if luggage can be flexibly misrouted.
  • Bonus points if the misrouting is absurd, extreme, spiteful or humorous.

Due date and submission

The due date is March 1, 2010. Please send your underhanded code to XcottCraver@gmail.com, with the word “underhanded” in the subject header. Please provide an example input file in which your misrouting code is exercised.

Prize

The prize will be a $100 gift certificate to ThinkGeek.com.

21 Comments

  1. Underhanded C contest - FreddysHouse said,

    December 30, 2009 at 12:55 pm

    […] Underhanded C contest Anyone gonna have a go at this? The Underhanded C Contest The Fifth Underhanded C Contest is Now Open I’m thinking about how I might go about it in a nice simple way - looks fun. __________________ Drinking for Britain since 1984! […]

  2. Brian said,

    December 30, 2009 at 1:08 pm

    I think someone at American Airlines has already written and is using this code in production.

  3. The next Underhanded C Contest has begun - Fires of Heaven Guild Message Board said,

    December 30, 2009 at 2:15 pm

    […] The next Underhanded C Contest has begun For the other coders out there…. The next Underhanded C Contest has begun, with a deadline of March 1st. The object of the contest is to write short, readable, clear and innocent C code that somehow commits an evil act. This year’s challenge: write a luggage routing program that mysteriously misroutes a customer’s bag if a check-in clerk places just the right kind of text in a comment field. The prize is a gift certificate to ThinkGeek.com. Here for more details…. The Underhanded C Contest __________________ Sit & Rotate -Terial […]

  4. Will Grainger said,

    December 30, 2009 at 3:42 pm

    In the example, shouldn’t the result for

    % cat luggage.dat | ./lug - TP579 FRA OPO

    be

    1262002831 UA129089 TP579 FRA OPO Passengers missed first connecting …

    Interesting challenge, and it’s got me thinking!

  5. XcottCraver said,

    December 30, 2009 at 3:46 pm

    Hi, the command

    % cat luggage.dat | ./lug - TP579 FRA OPO

    …produces no output because all luggage on that flight was subsequently bumped to LH1230 from FRA to LIS. Basically if you have multiple orders moving the luggage on flights out of the same airport, only one can apply (the bag can only be on one flight), and you should follow the last order to be entered into the system. Thus according to the sample file, no luggage should have gone on TP579.

  6. Dustin J. Mitchell said,

    December 30, 2009 at 8:01 pm

    What about subsequent trips? In the example you give, if the passengers’ first flight (ORD->FRA) had been rescheduled at 1262002831, instead of the second flight, would the downstream legs (FRA->OPO) have been superseded as well?

    I realize this isn’t too important to the goal of the contest, but I like clarity :)

  7. James Howard said,

    December 31, 2009 at 2:51 am

    Can you clarify the rules for superseding records? I would think that what this means is if there are two records with the same luggage id, then the later record is the one that supersedes the earlier one. But then you write:

    % cat luggage.dat | ./lug UA129086 - - -
    1261959531 UA129086 UA530 ORD FRA
    1262030463 UA129086 LH1230 FRA DUB Direct flight canceled, routed through Ireland
    1262030463 UA129086 LH1230 DUB LHR

    Under my interpretation of supersede, I would think that if you specify a luggage ID, you would get at most 1 line of output.

  8. XcottCraver said,

    December 31, 2009 at 3:42 am

    Howdy,

    Basically, the one rule for superseding is that the program outputs the flights that carry the luggage, and discards the flights that don’t carry the luggage.

    In the previous comment, the luggage UA129086 ultimately travels from O’Hare to Frankfurt to Dublin to London. If you look at the sample input file, you see that the luggage is ordered to go from ORD to FRA, then a mess of new directives send it from FRA to other places as connecting flights are missed or canceled, or the customer changes plans. Of all those directives we follow the most recent (last) order, sending the luggage from FRA to DUB.

    This order supersedes all previous orders to send the luggage out of FRA on a different flight; it doesn’t supersede the order to send luggage from ORD to FRA, because the luggage did travel on that flight.

  9. XcottCraver said,

    December 31, 2009 at 4:05 am

    Howdy again,

    Regarding Mr. Mitchell’s question about downstream flights: technically the answer is yes, although it’s up to you whether you want to bother with that complexity.

    For example, if you have directives routing a bag from ORD to FRA and FRA to OPO, and then a later directive routes the bag from ORD to LIS, a perfectly perfect program would discard the first two directives—the luggage is no longer traveling on those flights.

    However, it’s submittable enough if your program simply discards duplicates departing from the same airport, because this is enough for the commie death robots at the sorting stations to figure out where a bag is supposed to go. This is less than ideal, because it can produce phantom entries on a flight’s luggage manifest, or make it harder to track where a lost bag has gone, but at least you can route the bag.

  10. Dmitry Zhukov said,

    December 31, 2009 at 6:46 am

    Oh,

    This just reminded me that someone has written such code for Aeroflot.
    And a question: is only the C language allowed? No C++ or C#?

  11. Avinash Baliga said,

    December 31, 2009 at 11:31 am

    Does the order of the output matter?
    In the samples I noticed the output is chronological.

  12. Gautam Dey said,

    December 31, 2009 at 11:46 am

    Quick question about the regular expression provided:
    {^(\w*)\s*(\w*)\s*(\w*)\s*(...)\s*(...)\s*(.*)}

    I’m assuming that the \s* should be \s+, since I believe \s* means zero or more whitespace characters. Is my assumption correct that each field must be separated by one or more whitespace characters?

  13. Shay said,

    December 31, 2009 at 1:15 pm

    Lovely, it seems my previous comment was mangled by a mechanism worthy of submission to this contest! (a preview would be very helpful, BTW)

    1. What all can the operator enter in the comment? Only ASCII (127 or less), or also high-ASCII?

    2. Can one make multiple submissions?

    Thanks, and enjoying writing an entry.

  14. XcottCraver said,

    December 31, 2009 at 2:24 pm

    Howdy,

    Non-ASCII characters are allowed. A disgruntled clerk could always find a way to paste them into the free text field.

    Multiple submissions are allowed. Non-C languages are not, although we may turn a blind eye to C++.

    The challenge does not specify that the output lines must fall in any specific order. Chronological is the obvious choice, of course, but it isn’t too much of an issue because anyone who needed chronological order could just pipe the records into sort.

    Finally, Gautam Dey is right about the regular expression. I will fix that.

  15. Roland Trainor said,

    December 31, 2009 at 4:00 pm

    Just stumbled upon this fascinating challenge. I have a few questions.

    First question: It appears the flight id of LH1230 has multiple departures and destinations. It appears around timestamp 1262027495 that flight LH1230 departs from FRA with a destination of LIS, but at timestamp 1262029822 the same flight id has a destination of LHR. I’m not really familiar with the rules regarding the assignment and usage of flight ids. Is this an error in the datafile, or is it possible for a single flight id to have multiple departures and destinations?

  16. XcottCraver said,

    January 1, 2010 at 8:33 am

    Hi Mr. Trainor,

    Thanks for pointing that out. That is, in fact, a typographical error. I have fixed it, I hope. (It is hard to tell because I am currently using an EeePC whose browser interprets clock values as Skype numbers, highlights them and breaks all the formatting. I hope that isn’t a problem for too many readers.)

  17. Shay said,

    January 4, 2010 at 2:10 pm

    Are points deducted if code assumes malloc/realloc always succeed for reasonable sizes? (older winning entries seem to get away with assuming this) If not, any deduction for assuming the existence of xmalloc/xrealloc?

    Also, are points awarded for code that compiles without any warnings? (especially with warning levels cranked up) I was surprised by its absence from the list of “basic rules of fake sincerity”. Thanks.

  18. Kylie Batt said,

    April 12, 2010 at 1:36 am

    Congratulations, what necessary words…, a brilliant thought…

    ……

  19. Dustin J. Mitchell said,

    May 1, 2010 at 4:27 pm

    Any timeline on getting the results? I’ll admit I couldn’t find any way to do this underhandedly..

  20. Kylie Batt said,

    May 13, 2010 at 11:19 am

    What an interesting phrase…

    Introduction
    We hereby announce the fifth annual contest to write innocent-looking C code implementing malicious behavior…..

  21. Dustin J. Mitchell said,

    July 2, 2010 at 12:51 am

    OK, two months since my last post .. any updates on the winner(s)?
