04.01.13

The 6th Underhanded C Contest is now Open

Posted in Uncategorized at 1:20 pm by XcottCraver

After many years of the dead air um dead air, the Underhanded C Contest is back.

The goal of the contest is to write code that is as readable, clear, innocent and straightforward as possible, and yet it must fail to perform at its apparent function. To be more specific, it should do something subtly evil. Every year, we will propose a challenge to coders to solve a simple data processing problem, but with covert malicious behavior. Examples include miscounting votes, shaving money from financial transactions, or leaking information to an eavesdropper. The main goal, however, is to write source code that easily passes visual inspection by other programmers.

This year’s challenge can be found under the “This Year” tab. The goal is to create a program that gives unexpected friend access in a social network.

The prize is a $200 Gift Certificate to ThinkGeek. It runs April 1st (not kidding, for reals, today) until the arbitrary deadline of July 4th. NOTE: for a winner outside the US, if ThinkGeek does not deliver to your country, we will make the prize a gift certificate to some online retailer that does.

The deadline is July 4th, 2013.

Note: BFF_list is a double pointer

A typo in the original announcement had a user type with an entry user *BFF_list; this was changed to user **BFF_list; that is, each user contains an array of pointers to other users.

2009 Winners!

Posted in Uncategorized at 11:21 am by XcottCraver

Apologies for the incomprehensibly long delay. The Underhanded C contest is alive again, and here are the winners of the previous contest.

Here are the runners up, and the winner, of the Fifth Underhanded C Contest. The sixth contest will be announced later today, April 1st.

The prize, because of the long delay, is now a $200 gift certificate rather than a $100 gift certificate.

Runners Up

RHays

This beautiful submission is only 71 lines of extremely readable C code, and is an excellent example of hiding an evil bug in plain sight with a visual ploy rather than a technical ploy.

The author stores each baggage record in a simple struct that holds each field in its own string:


struct trip_t {
    char fields [6] [256];
};


An array of trip_t records is created and realloc’d whenever necessary to hold all baggage records in the file. Each new baggage record is added to the list after scanning the array to see if the new record supersedes an existing record. Simple.

Finally, each record is filter-printed: if the fields don’t match the pattern in argv[], return; else printf:


for ( i = 0; i < 5; ++i )
    if ( safe_strcmp( argv [i + 1], "-" ) && safe_strcmp( t->fields [i + 1], argv [i + 1] ) )
            return;

    printf( “%s %s %s %s %s %s”,
            t->fields [0], t->fields [1], t->fields [2],
            t->fields [3], t->fields [4], t->fields [5] );


To compare strings, the author provides a safe_strcmp() function—it’s “safe” because it handles NULL pointers, treating them as empty strings. From the code:

/* Ensure safe_strcmp() properly treats NULL as empty string,
      rather than segfaulting */
    assert( safe_strcmp( NULL, "" ) == 0 );
    assert( safe_strcmp( "", NULL ) == 0 );
    assert( safe_strcmp( NULL, NULL ) == 0 ); 


And here’s the definition of safe_strcmp():

static int safe_strcmp( const char* x, const char* y ) {
    if ( x == 0 || y == 0 ) /* treat NULL as if it were "", for safety/convenience */
        return (y != 0 && x == 0 && *y != '\0') || (x != 0 && *x == '0' && x != '\0');

    return strcmp( x, y );
}


If you don’t see the bug, here’s a record that mysteriously fails to print out:
1262030463 UA129086 LH1390 DUB LHR 0ffered passenger compensation for his headaches

Note that the safe_strcmp has two bugs rather than one. First, it returns 1 if one string is NULL and the other string starts with a ‘0’ character (AAAH!) Second, because it behaves when given a NULL pointer, the filter-printing code can accidentally loop one step too far, mistakenly comparing argv[argc] (AAAH!!) to the comment field.

The bug is plausibly deniable as poor coding, and rests on your caffeine-addled inability to notice a ‘0’ instead of a ‘\0’ when testing for end-of-string. The comparison in safe_strcmp has unnecessary terms, which achieves two evil goals: first, it sets up a pattern that fools your eyes, and second, it looks just amateurish enough that the bug, if found, looks like a sophomoric mistake rather than an intentional backdoor.

Gunnar M. Larson

This program uses a standard approach to variable-length comments: have a fixed-length array for the common case, and if a comment exceeds this length, put the rest in a dynamically allocated string. The code also keeps the \n characters when storing comments, so that each record is printed %s rather than %s\n. Here’s a code snippet:


fields = sscanf(buf, "%lu %8s %6s %3s %3s%80[^\a] %n",
         &r->time,r->id,r->flight,r->orig,r->dest,r->text,&count);
if (fields >= 5 && stringMatchesPattern(r->id,idPattern)) {
    buf += count;
    r->xtext = strlen(buf) ? strcpy(malloc(strlen(buf)), buf) : NULL;
}

The \a character is a bell, which we don’t expect to see in the file, so %80[^\a] is just a fancy way of saying ‚”up to 80 characters.‚”

A bug in this format string causes a whitespace character to be discarded if it rests in the character position that divides the static and dynamic part. Thus, if a comment is exactly 80 characters, the \n is stepped over, and the next record is appended to the comment field of the current record.
Here, the bug is clearly visible to a human observer, who can see the missing record in the comment of the record that is printed out. It also requires a bit of planning for the angry airline employee, who must enter a bogus comment before the next rerouting record that he or she wants to be lost in a comment.

Emmanuel Colbus

This program is sort of the converse of the previous entry: it lets you type a record into a comment field, which is then misinterpreted as a separate record. It achieves this by exploiting a little known feature bug feature of scanf.

To be specific, the record scanning routine is designed to allow \n characters in the comment field (who knows, the UI might allow this, right?) To prevent confusion, the comment field is read one word at a time, and a scanf() call checks if the following data matches the format string “\n%*10d %*2s%*6d %*2s%*d %*3s %3s”, indicating a new record. Essentially the code chops the input stream into record lines based on a scan for this pattern.

The bug is that scanf() has a weird interpretation of “\n”. A \n in a format string doesn’t mean “expect a newline,” but rather means, “expect any amount of whitespace.” For example, the call


	sscanf("helloworld", "%5s\n%5s", a, b )

…matches “hello” to a[ ] and “world” to b[ ], because the empty string matches \n.

This program gets extra points because it is seventy-two lines! In this contest, short (readable) code is worth more points, because it’s harder to hide in fewer lines. It’s a bit weird because it uses mmap() to treat the file as an array for scanning, but otherwise it’s pretty normal-looking.

Tomas Skare

This program allows the user to discard previous records by placing a number of spaces at the beginning of the comment field.

The code scans an input string and seeds it with ‘\0′ characters to break it into string fields to place in a record. Here is the code to process the comment field:


    if((tmp = strchr(lines[lid].arrid, ' '))) {
      *tmp++ = '\0';
      /* Skip extra spaces. */
      i = 0;
      while(tmp[i] == ' ') i++;
      lines[lid].comment = tmp + i;
    } else {
      lines[lid].comment = "";
    }
..


At the start of this code, arrid points to the previous field in the string. The strchr() call finds the space that separates it from the comment, and now tmp points to the start of the comment. Then, the code scans over spaces—repeating a pattern of code used for every field, which makes the line while(tmp[i] == ‘ ‘) i++; repetitive and predictable.

Finally, the code for superseding:


    while(i < lid) {
      if(lines[i].valid &&
	 !strcmp(lines[i].lugid, lines[lid].lugid) &&
	 !strncmp(lines[i].depid, lines[lid].depid, 3))
	lines[i].valid = 0;
      i++;
    }

This is a bit daring, because it’s easy to spot the bug: this loop does not initialize the variable i, which contains the number of spaces from the previous code. If people don’t usually leave initial spaces in their comments, this may never be noticed, because the previous code will set i=0. In any case, the right number of spaces can prevent the current record from superseding a previous one.

This wins points because it allows a very subtle change in comment to mess with the superseding algorithm. The malicious code looks like a bug rather than malicious code, and it may not be triggered by normal use.

Mason Malone

This code isn’t exactly underhanded C. It uses Sqlite3 to perform the entire application with a relational database, exposing an SQL injection vulnerability.

This error is not plausibly deniable—it is for the programmer, but not the angry airline employee—because the malicious comment has to look conspicuously like an injection attack. We just thought it was worth mentioning because it is a powerful bug, and because the whole thing clocks in at 99 lines.

Jan Wrobel

This code expects a comment to be 80 characters or less, and anything longer will trigger a new record to be created.

The code loops through the file input with the following scanf() calls, and then uses the scanned data to assemble a RoutingDirective object:‚Ä®


	if (scanf("%d ", &time_stamp) == EOF) {
		break;
	}
	if (scanf("%8[A-Z0-9] %6[A-Z0-9] ",
  	  	luggage_id, flight_id) == EOF) {
		break;
	}
	if (scanf("%3[A-Z] %3[A-Z]", departure, arrival) == EOF) {
		break;
	}
	if (scanf("%80[^\n]", comments) == EOF) {
		break;
	}

If a comment field extends past 80 characters, the extra characters will be interpreted as another record in the next loop iteration. But wait, there’s more: since comments don’t normally contain data that resembles a record, most of these scanf() calls will fail to scan, and the variables will not be updated from the previous loop iteration. Thus if we have the record:

1262030600 UA129100 LH1230 FRA DUB [blah blah 80 characters]and so on

…then this will yield the records

1262030600 UA129100 LH1230 FRA DUB [blah blah 80 characters]
1262030600 UA129100 LH1230 FRA DUB and so on

..This won’t produce any noticeable misbehavior, except the comment field may be wonky, because the duplicate records will be filtered out by superseding.

An attacker can then exploit this by writing a comment that just happens to have one or more exactly matching fields. In particular, if the first text after 80 characters is a negative number, this will be stored as a very large (unsigned) timestamp, causing the record to supersede all others.

This entry is worth a lot of points because it is insanely powerful. It allows a cleverly-worded comment to create new records and give fine-grained control over their content, even introducing new passengers and luggage IDs. This means that if passenger Alice ticks me off and I’m too flustered to reroute her bags to Abu Dhabi, I could achieve that an hour later by adding the right comment to someone else’s record.

AND THE WINNNER: Sam Blackburn

Like several entries, this short (99 line) submission filters out dirty words, with an array of curse-replacement pairs, e.g. { “fnord”, “fn*rd”, “Matlab”, “M****b” }. Here’s the offending code:


	struct LIST {char *censor[10];} list = {{
	    "fuck", "f*ck",
	    "shit", "sh*t",
	    "hell", "h*ll",
	    "damn", "d*mn"
	    "bitch", "b*tch"
	}};


This is visual deception—no stack overflows, no undocumented function features, just a missing comma and the fact that C concatenates strings. Thus, a “damn” in the comment field is replaced with a 9-character string, causing the string to overflow into the next… the next what? This filter is applied to a luggage record after it is read into the following structure:

    #define COMLEN 64
    typedef struct record {
	long long int timestamp;
	char caseid[9], flight[7], depart[4], arrive[4], comment[COMLEN];
	int delete;
	struct record * next;
    } RECORD;


Notice that the first entry in the record is a long long int, and 9+7+4+4+COMLEN is a multiple of 8, so the end of the comment field is aligned to a doubleword boundary. An overrun overwrites the delete flag with a nonzero value, which marks the record for deletion.

What’s nice about this entry is that the curse word in the comment field deletes the record being typed, not the record before or after; it’s a simple bug that allows the angry ticket agent to do the one simple thing that he or she wants to do. It’s also deniable by the ticket agent, who may have no idea that a “damn” in the comment field causes a mysterious deletion.

Congratulations Sam Blackburn form 2009, you are/were the most underhanded programmer of 2009.

12.29.09

Also, we’re looking for good PhD students

Posted in Uncategorized at 7:55 pm by XcottCraver

As you might have guessed, Binghamton University is a haven for security wonks. We in the electrical engineering department pursue research in the areas of information hiding, multimedia security, and digital forensics. If you are looking to pursue a Ph.D. in these or related areas, please consider applying. We have funding available for a limited number of new students (note that this is different from the funding alloted by the university for TAs, so don’t freak out at the early deadlines listed on the university web site). If you have questions on specifics, contact me at XcottCraver@gmail.com.

10.13.09

2008 Winners

Posted in Uncategorized at 1:36 pm by XcottCraver

At long last, here are the winners of the 2008 underhanded C contest.

The 2008 contest had over 100 entries, a record for our contest. Sifting through them all and determining their underhanded behavior wasn’t easy, but many of them fell into several standard approaches. The most common:

  • XORing the redaction region with a pseudo-random mask, which can later be retrieved. The underhanded behavior usually consists of a purposefully mistaken approach to seeding the pseudo-random generator.
  • Accidentally appending the original data to the image file. This takes advantage of the fact that many image formats ignore extra cruft at the end.

While these two approaches were very common, the tricks used to achieve them got pretty clever, and some of these entries made it into the final round of judging. In the end, removing a pixel by shifting, rolling, or masking is unavoidably going to be a bit odd. Meanwhile, producing a file with extra data tacked on to the end might be discovered. The winner cold zeroed out the pixels without changing the file size, although it is specific to one particular file format.

But without further ado, the results….

Third place: Linus Akesson

This is one of many programs that appends the original data to the end of the file, but it uses a truly beautiful example of hiding evil code in plain sight:



#define BYTESPERPIXEL(fits8bits) (3 << (!fits8bits))

int main(int argc, char **argv) {

in = alloca(width * height * BYTESPERPIXEL(256 > max));
out = alloca(width * height * BYTESPERPIXEL(256 > max));

fread(in, BYTESPERPIXEL(256 > max), width * height, stdin);

ptr = out;
for(y = 0; y < height; y++) {
     for(x = 0; x < width; x++) {
          for(i = 0; i < BYTESPERPIXEL(256 > max); i++) {

               *ptr++ = *in++ & visibility_mask(x, y, argc, argv);
          }
     }
}

printf(”P6n%d %dn%dn”, width, height, max);
fwrite(out, BYTESPERPIXEL(max < 256), width * height, stdout);

The bug is in the BYTESPERPIXEL macro, which lacks a pair of parentheses.
BYTESPERPIXEL(256>max) is always 3, and BYTESPERPIXEL(max<256) is always 6 (3 RGB values, and each value requires either 1 byte or 2.) Essentially the images are allocated, read and processed with 3 bytes per pixel; the output is written with 6 bytes per pixel.

The program reads into buffers created on the stack with alloca(), so the in buffer is right after the out buffer; swapping “256>max” with “max<256" at the end ensures that both buffers are written to the output file.

Mr. Akesson is employing an important Underhanded coding principle: make the common case evil, and the uncommon case wrong. Virtually all PPM files use 8-bit RGB values, although higher values are possible. The macro gives the false impression that the code intelligently supports higher bit widths that we never really see; however, those images cause the program to fail completely. Meanwhile, that apparent support for deeper images causes the 8-bit case to leak information into the file.



Second place: Avinash Baliga

We had to include this dude’s entry because he uses a buffer overrun, which is always worthy of mention. The bug is in the ExpectTrue macro which prints into a small buffer (small because it is redefined in main(),) overwriting the mask used to zero out the data with a 0×0a. This allows two bits to survive redaction, low enough in intensity to pass visual inspection.


/*  Error checking macro. */
#define ExpectTrue(cond__, msg__) 
_snprintf(buf, 255, "%d:  %s", __LINE__, msg__); 
if (!(cond__)) { 
fputs(buf, stderr); 
exit(-1); 
}

int main(int argc, char** argv)
{
        pixel p = {{0,0,0}};
	int left = 0, top = 0, right = 0, bottom = 0;
	int mask = 0, x = 0, y = 0, z = 0;
	char buf[32] = {0};
	…
	ExpectTrue( copy_ppm(in, &out),
		“Error: could not allocate output image.n”);

Note that this guy gets extra points for sheer spite, concealing the evil behavior in an error checking macro. Spite will always get you extra points in the Underhanded C Contest. In the final analysis, this guy gets points for style and technical expertise; the only problem is that masking out pixels, rather than zeroing them, is an operation that is difficult to justify.



And the winner: John Meacham

This is an excellent example of the contest’s philosophy: make the code extremely simple, innocent, obvious, and wrong. Mr. Meacham’s 55-line entry took a PPM file in ASCII (P3) format, scanned over the numbers, and zeroed out the redacted pixels in the most obvious way possible:



for(c = buf;*c;c++) {
          if(isdigit(*c)) {
                    if(!ws) {   // new number, increment location.
                              ws = 1; x++;
                              if(x >= width * 3) {
                                        y++; x = 0;
                              }
                    }
                    if(x > rx * 3 && x <= (rx + rwidth) * 3 && y > ry && y < ry + rheight)
                              putchar('0');
                    else
                              putchar(*c);

          }  else {
                    ws = 0;
                    putchar(*c);
          }
}

In short, the dude replaces numbers with zeroes on the character level. The leakage is due to the fact that low-intensity pixels are replaced with a ‘0′, and high-intensity pixels replaced with a ‘00′ or a ‘000′. This is completely invisible in the image itself.

Okay, I admit that’s a lousy explanation. Mr. Meacham provides more detail on his blog.

Congratulations Mr. Meacham, you are the most underhanded programmer we have seen all year.