Archive for Technical Article

Enumerating Three Choices

I was recently posed a question by a psychologist friend of mine. She needed to enumerate all possible answers to a set of 20 questions. Each individual question could have three answers. She then planned to count the actual frequency of answers from the test sample (say, N~=100) and run this result through some other statistical software. The problem was that the statistical software’s input format required the permutations of answers, even if they had a frequency of zero.

I could easily write a procedural iterative program to enumerate all possible binary choices for 20 questions. (Wait, can I?) Wait – what about three possible choices?

The difficulty is deciding just what type of problem this actually is. Is it a permutation thing (i.e., order matters)? Combination (i.e., order doesn’t matter)? Enumeration (iterate each possible permutation)? Using what number base (3)? Should I use an iterative or recursive solution, or perhaps a clever workaround? How long would a reasonably efficient program take?

In any event, I knew that the output set was potentially large, and it wasn’t immediately apparent how to write a piece of C code to enumerate all possible permutations of answers for 20 questions. Let’s define what I’m looking at: I want all possible combinations (i.e., I don’t care about order specifically) of answer strings to a list of 20 questions where each question could be answered in 1 of 3 ways. If I were just interested in permuting the test questions themselves, I would have a set of 20! or 2,432,902,008,176,640,000 or 2.4×1018 (as Unix `bc’ tells me). But I’m not; I don’t care what order the questions are in, but rather producing all possible ways to answer the test itself for one arbitrary ordering.

For 1 question, there are three possible answers: 0, 1, 2 (let’s say they correspond to “yes”, “maybe”, “no”). You might also consider a null answer in addition to this set, but for now, let’s ignore that possibility. So, there are three possible ways that this one question test could be answered. What if we added more questions?

Q1: ?
0, 1, 2
Q2: ?
0, 1, 2

So we have all possible answer strings (of length 2) as:

00
01
02
10
11
12
20
21
22

With just 2 questions, there are 9 possible answer permutations.

Thus, the number of questions becomes the length of our string. The number of response options become the possible values of each location in the bitstring. In binary (base 2) number systems, the possibilities proceed in powers of 2 (e.g., for a 100 question test with only two possible response options, the total number of answer combinations is 2^100). With base 3 and 20 questions, the total number of unique answer combinations is 3^20; that is, someone could answer the test itself in any one of 3^20 ways. For those of us who are computationally challenged, this number has a value of 3,486,784,401 or about 3.4 billion (as an aside, even if 10,000 people responded to this test, the sets of answers they produce would very sparsely populate this space of 3.4 billion strings).

So how do you generate 3.4 billion unique strings of length 20 where each position could be one of three values? Actually, how do you write a program to do that for you?

It turns out that we can take a couple of interesting detours into mathematics [1,2], but I’d prefer to stay on the CS side of the house for now. Let’s think about symbol representation. Above, I abandoned ‘y’, ‘n’, and ‘m’ for representing the choices because those three characters have no directly mathematical relationship. Representing the answers as 0, 1, and 2 (on a binary machine) really doesn’t seem to have much advantage either.

There is in fact a ternary number system with some interesting properties. According to the American Scientist article “Third Base” (Nov/Dec 2001), the ternary number system “is the most efficient of all integer bases; it offers the most economical way of representing numbers.”

Perhaps I could simply write a program to count to 3,486,784,401, converting each integer value to a bitstring and then write a routine for converting that bitstring to a tritstring (ternary string) and printing it to stdout. I’ll try that, and see what happens. But let’s consider the heart of that approach: the translation of binary bitstrings to tritstrings. Let’s translate the binary representation of decimal 12 to ternary.

12 contains two powers of 2: 8 + 4. So in a 4-bit string, let’s turn on the leftmost two bits to get: 1100 (i.e., 1*23 + 1*22 + 0*21 + 0*20, or 1*8 + 1*4 + 0 + 0).

Now how do we convert from base 2 to base 3? How do you convert from one base to another? Well, I just did it for base 10 to base 2, albiet informally. You keep dividing the number to be translated by the target base and keep track of the remainder. The sequence of remainders becomes the digits of your translated number. To wit:

12 / 2 = 6 remainder 0
6 / 2 = 3 remainder 0
3 /2 = 1 remainder 1
1 / 2 = 0 remainder 1

Once the division results in a zero value, we have all the remainders we need (read that right most column of remainders upwards…see how it equals 1100?).

Armed with this knowledge, need we convert from decimal to binary to ternary at all, or can we just directly convert?

12 / 3 = 4 remainder 0
4 / 3 = 1 remainder 1
1 / 3 = 0 remainder 1

Is 110 (ternary) really equal to 12? Yes: 1*3^2 + 1*3^1 + 0 or 1*9 + 1*3 or 12.

Code

Ternary Counting

Postscript

Along the way to investigating this, I stumbled across two other interesting American Scientist articles: “E Pluribus Confusion“, about “fair” seat allocation in the US House of Representatives (a process with a series of varied hacky tie-breaking rules over the years) and “The Bootstrap” about the limits of our common statistical assumptions.

References

[1] http://lin-ear-th-inking.blogspot.com/2012/11/enumerating-permutations-using.html
[2] http://www.roard.com/docs/cookbook/cbsu2.html

Comments off

Slow Curtain of Death

I should rename this blog “Complaints about Apple Mac OS X”.

Apple has their own alternative to Microsoft’s “Blue Screen of Death.” I call it the “Slow Curtain of Death” — the machine suddenly stops responding, and a line proceeds from the top of the screen to the bottom, dimming the display and displaying a typically sexy-looking Apple message box saying “You must restart your computer. Press…”

My new Macbook Pro crashed (i.e., the kernel had a “panic” — yes, that’s a technical term for when the underlying operating software encounters an unrecoverable error) for the fourth time in a week from the exact same line of code (it looks like an error in handling thread locking conditions in the kernel scheduler). I have sent all four of these reports to Apple via their automated reporting mechanism.

I suspect the culprit is some incompatibility between my version of VMWare Fusion and Snow Leopard (but this is complete speculation). I append the start of the most recent crash report below. Note a few things. First, the error output line itself is quite detailed, even including a possible cause (unlocking an already unlocked mutex or spinlock? why would that cause a problem?) It even includes the file sched_prim.c and line number of the error. Too bad I don’t have Mac OS X source code. The report also includes a dump of the stack, but it lacks reporting the standard dump of CPU registers.

Interval Since Last Panic Report: 27770 sec
Panics Since Last Report: 1
Anonymous UUID: B22098C3-CA08-4230-A115-AFDF431CCB8B

Tue Jan 18 14:50:12 2011
panic(cpu 2 caller 0x226b53): “thread_invoke: preemption_level -1, possible cause: unlocking an unlocked mutex or spinlock”@/SourceCache/xnu/xnu-1504.9.26/osfmk/kern/sched_prim.c:1471
Backtrace (CPU 2), Frame : Return Address (4 potential args on stack)
0xbe56be18 : 0x21b50c (0x5d4438 0xbe56be4c 0×223974 0×0)
0xbe56be68 : 0x226b53 (0x58babc 0xffffffff 0x58ba54 0×226714)
0xbe56bee8 : 0×227259 (0xde9db98 0×0 0x6bc9f000 0×1)
0xbe56bf58 : 0x2272c4 (0x22fc20 0x863ea0 0×0 0x2a358d)
0xbe56bf78 : 0x22fdba (0x22fc20 0x863ea0 0×0 0×0)
0xbe56bfc8 : 0x2a06cc (0x863ea0 0×0 0×10 0xe1933c0)

BSD process name corresponding to current thread: kernel_task

Mac OS version:
10J567

Kernel version:
Darwin Kernel Version 10.6.0: Wed Nov 10 18:13:17 PST 2010; root:xnu-1504.9.26~3/RELEASE_I386
System model name: MacBookPro6,2 (Mac-F22586C8)

System uptime in nanoseconds: 38460313523585

Comments off

Early October Self-Induced DoS

I recently updated this blog from the WordPress control panel. All it takes to update is a single mouse click.

This action was ill-advised, uniformed, and generally ignorant. First, I did not follow the advice in the upgrade dialog that urged me to back up my content. Second, when the update finished, none of my posts were available. The front page of the site said something like “no posts matched your search criteria.” And it has been like that for about a week. Today I decided to get to the bottom of the issue. I was alarmed because the Yahoo control panel indicated that my database was using zero MB and had never been backed up.

Calling Yahoo support did not help much. I’ll leave out the details, but the nice technical person on the other end of the call basically said that WordPress-related issues, database-related issues, and other pertinent issues were out of Yahoo’s scope of support. In short, they supply the rope, and you do the hanging (and I deserved to hang for not having taken precautions to understand just what I was doing).

How did we resolve this issue?

I installed phpMyAdmin through the Yahoo control interface. This was immediately reassuring because it claimed that my blog’s database had 21 tables in it, and the table overview showed that most of them had data.

The “wp_posts” table, however, was marked as “in use”, and issuing queries (such as `DESCRIBE wp_posts;’) returned an error:

Can’t open file: ‘wp_posts.MYI’ (errno: 145)

This forum web page helped tremendously. It seems like during the upgrade, some process belonging to the previous PHP code or some process belonging to the upgrade PHP code failed to detach itself from using the table. The table was thus locked, and no other processes (including new requests from the new PHP code to display posts) could read or write the table.

Fortunately, phpMyAdmin contains two options: “check table” and “repair table.” Issuing the first SQL command confirmed that the table was ‘marked as crashed’ and that ’1 client is using or hasn’t closed the table properly.’ After coming back to the view of the tables, the wp_posts table was no longer marked as in use. I then issued the repair command. Within five minutes, post content was visible on the site.

Whew.

Comments off

Mini-Dissection of IRS Phishing Spam

Four days ago I received an email from tax-refunds@irs.gov. A selection of interesting headers from the message appears below (sensitive information, such as my email address, elided). You’ll note two things. First, this email came to one of my U of C addresses, but it purports to be from the IRS. Why would the IRS (an American government agency) be sending a legitimate email to some random Canadian email address (yes, I’m a US citizen, but that’s besides the point — we’re playing the odds here, and certainly I have other valid US-based addresses the IRS could reach out to). So: strike one. Second, the U of C spam filter correctly tagged this as spam (hence the {Spam?} prefix in the Subject header. Strike two. Finally, the IRS itself says that it does not request information via email: http://www.irs.gov/privacy/article/0,,id=179820,00.html. Strike three. But it gets better. The phisher keeps swinging for the fences.

Return-Path:
Received: from correo.ziv.es (correo.ziv.es [77.226.243.115]) by
forward.ucalgary.ca (Postfix) with ESMTP id 25CE038227 for
; Mon, 6 Sep 2010 19:10:26 -0600 (MDT)
thread-index: ActOKXOX06gLrg0bRziJn6dBptyFCg==
Received: from User ([60.32.171.42]) by correo.ziv.es with Microsoft
SMTPSVC(6.0.3790.3959); Tue, 7 Sep 2010 03:10:24 +0200
From:
Subject: {Spam?} IRS Annual Notification (ID: A20W852)
Date: Tue, 7 Sep 2010 03:10:32 +0200
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary=”—-=_NextPart_000_009D_01C2A9A6.691CE492″
X-Priority: 3
X-Mailer: Microsoft Outlook Express 6.00.2600.0000
Message-ID:
To: undisclosed-recipients:;
X-Foo-MailScanner: Found to be clean
X-Foo-MailScanner-SpamCheck: spam, SpamAssassin (not cached,
score=11.622, required 6.2, autolearn=disabled, BAYES_99 5.50,
FORGED_MUA_OUTLOOK 3.12, FORGED_OUTLOOK_HTML 0.00, HTML_IMAGE_ONLY_20
1.55, HTML_MESSAGE 0.00, HTML_SHORT_LINK_IMG_3 0.00, MIME_HTML_ONLY 1.46)
X-Foo-MailScanner-SpamScore: sssssssssss
X-Foo-MailScanner-From: tax-refunds@irs.gov
X-Spam-Status: Yes

The headers indicate that someone connected to a Spanish ISP (I’m guessing from “correo” and .es) has fallen victim to a spambot using their Microsoft Outlook Express (or this could be forged information, but why include code for mailing in your bot when you can just ask the OS to ask an application to do it on your behalf?). Spanish citizens probably aren’t sending email on behalf of the IRS. Strike four.

This message contained some boilerplate legalese (“This message is intended only for the use of the individual…blah blah blah”) and an HTML attachment. The boilerplate language also contained the sentence: “Any views or opinions presented are solely those of the author and do not necessarily represent those of the company.” The IRS is an agency, not a company, and so this sentence doesn’t jive. Strike five.

The HTML attachment was more interesting, although quite a basic attempt at phishing. I saved it to a file system and use the Unix `file’ command to see the general format of what it might contain:

[locasto@xorenduex quarantine]$ file Refund_Payment_Form\(ID\ A20W852\).html
Refund_Payment_Form(ID A20W852).html: ASCII English text, with very long lines, with CRLF line terminators
[locasto@xorenduex quarantine]$

Using the Unix `more’ command, I started to scroll through the file. Most of it appears to be a fairly standard scrape of a real IRS web page.

At line 324, however, we see an HTML table definition that contains a brief Javascript program starting at line 338 (immediately before this, the phisher attempts to close any other active script context with an “end” SCRIPT tag. The script is mean to check HTML form input before submission. It pops up Javascript alert boxes asking for a series of items. It starts by asking 3 times for your SSN, then requests your CVC/CVV2 (those short codes on the back of your credit card), your ATM signature, credit card expiration month and year, full name, billing address, home phone, date of birth (3X), mother’s maiden name, and name of your bank.

Places to enter this information appear in an HTML form defined immediately after this script. The target action of this form is to post the information the victim gives it to a PHP web page at hikinginn.com. The URL looks like some bulletin board or photo gallary: “/bbs/data/egallery/119701770/indexppl.php”. Presumably this page has been hijacked to contain code that accepts data from this phishing attempt. When I used wget to fetch this page, wget returned an HTTP 404 (Not Found) error, indicating that this page may well have been taken down.

[locasto@xorenduex quarantine]$ wget http://www.hikinginn.com/bbs/data/egallery/1197017760/indexppl.php
--10:54:55--  http://www.hikinginn.com/bbs/data/egallery/1197017760/indexppl.php
           => `indexppl.php'
Resolving www.hikinginn.com... 218.232.66.19
Connecting to www.hikinginn.com|218.232.66.19|:80... connected.
HTTP request sent, awaiting response... 404 Not Found
10:54:56 ERROR 404: Not Found.

The web site itself is for a resort on the Korean island of Jeju.

By the way, if you’re confused about what CVC/CVV2 is, then the phishing form helpfully asks: “Need help with CVC/CVV2?” and links this question text to: http://www.sti.nasa.gov/cvv.html

Comments off

Shutting Down the Internet

I was recently cited, among others (including Sal Stolfo and Chris Kruegel), for a Politifact article by Lukas Pleva on whether it was possible for private industry to shut down the Internet as a protection measure during some large-scale cyber attack with or without some form of government involvement:

The article is here:
Glenn Beck Host Says Obama May Soon Be Able to Shut Down the Internet

Although the folks cited in the article generally agree that the technical capability to do such a thing exists in the private sector, the experts question either the wisdom of such a move or the probability of such an action actually occurring without some form of high-level coordination between their corporate overlords and either the military or some civilian government agency.

The question of whether the government should have its hand on an Internet Kill Switch (this phrase itself smacks of hyperbole, and may be an overreaction or misrepresentation of the actual proposed legislation) has been raised largely due to provisions in recently proposed legislation, a previous version of which this blog has commented on before. This new round of media hysteria was prompted by Joe Lieberman’s resurrection of a similar, but more measured (by some accounts) idea. Schneier recently blogged about his take on this whole controversy.

Both obvious and subtle questions exist here, including:

  1. What does “shut down” mean?
  2. How complete would this shutdown be?
  3. Is it desirable to shut down the Internet during a cyber attack?
  4. Is it technically possible to do so?
  5. Is it administratively or politically possible to do so?
  6. Do private Tier 1 ISPs need either government permission or {techincal, logistical, communications} assistance to unplug?
  7. How fast can this shutdown event take place?
  8. Where should ultimate authority for such a move rest?
  9. Under what conditions do we plug back in?
  10. Are there alternatives?

We’ll try to deal with these below one at a time. Briefly, the answer depends on what type of threat it is, what “shutting down” the Internet means, and whether we distinguish between an administrative decision to shutdown versus a technical action to accomplish or realize this shutdown.

Disclaimer: There are only a few folks on the planet who fully understand the subtleties of controlling BGP and interdomain routing and working with it on a daily basis; I don’t pretend to be one of them. I’ve studied the basics of Internet routing along with academic research on routing security issues, but I’m willing to take correction or feedback if I’ve gotten something wrong.

1. What Do You Mean by “Shutdown”?

This term may entail a different series of actions and events to different people. I take this term to mean to termination of layer 3 (e.g., IP) connectivity and the termination of the BGP routes between major U.S. & North American ISPs and the rest of the world. Such a termination in connectivity could be accomplished in any number of ways (some of which are more realistic than others), such as (1) physically unplugging or severing border router links, cables, and fiber, (2) setting up traffic filters on border routers using their installed software (e.g., using IOS)…such a step is quite similar to setting up “firewall” rules for network packet filters like BSD pf or Linux iptables/netfilter, (3) stop announcing BGP routes or issue BGP route withdrawal messages, (4) setting a pack of rabid backhoes loose near network POPs and peering points.

“Shutdown” could also entail the activation of a large number of network filters looking for certain flows, content, or source addresses, networks, or routing prefixes (in the core, these are essentially the same data). These filters would have the effect of limiting traffic from flowing without completely disconnecting machinery or routing paths or implying some type of shut off or power outage.

2. How Complete Would the Shutdown Be?

There are private-sector companies (i.e., large Internet Service Providers or ISPs) that control much of the core Internet infrastructure (e.g., interdomain routing and DNS) that could shut down this infrastructure (i.e., the servers running these protocols) during some kind of global conflict. While it is true that there are a large number of ISPs, only a few really big players exist, and if they decide to terminate connectivity, this action would involve a large chunk of the network. Such an action by “US-friendly companies” would take large sections of the US and some other countries offline (the US serves as a transit network for a lot of worldwide traffice simply because many types of communications lines pass through us).

Such a shutdown would necessarily be incomplete. The Internet was designed by DARPA-funded scientists to be resilient even in the face of widespread nuclear attack. Taking the US routing infrastructure offline would still leave the rest of the world connected, and after a period of a few minutes for routers to reconfigure routes, the rest of the world would be exchanging traffic (probably more slowly, since the US contains a lot of high-speed links), but connected nonetheless (modulo some specific unreachable destinations simply due to how the physical and virtual infrastructure are connected). Many smaller regional ISPs have peering agreements and relationships that would still enable some traffic to flow, albeit more slowly (or possibly not very widely).

The bottom line is that no single company (or government) has the ability to shut off the Internet as a whole, but a small number of companies could disconnect large segments of it if they both chose and agreed to do so (which entails some administrative oversight giving permission to such a drastic change, since ISPs are paid to route traffic: no packets moving, no money).

3. Do We Want to Shutdown?

I think legitimate concerns exist as to whether a shutdown provides the right response in any reasonable case. While we have been conditioned by certain software practices that a reboot or reinstall is the standard way of getting back to a known good state, terminating the global instance of BGP (or a large portion thereof) represents a risky (albeit fascinating) and uncontrolled experiment.

Also, in most cases, eliminating this infrastructure would be the absolute worst course of action system defenders could take, as it greatly reduces communications (email, VoIP, social networking) that defenders require to coordinate against a large-scale threat. Even in the most dire of circumstances (i.e., whatever movie-plot scenario one might imagine), such action really isn’t an option — there are many ways to filter or reduce certain types of traffic that would be much more effective than simply severing links.

4. Is it Technically Straightforward to Accomplish This Shutdown?

I claim that it is technically “trivial” to shut down the US part of the Internet. Private-sector companies run this infrastructure, and their network operators have the skill and knowledge to configure it. In fact, accidental misconfigurations that severly disrupt connectivity occur quite often due to simple human error; see, for example, the AS7007 incident. One need not ask the US government for a technical aid to the shutdown process. This process should be as simple as pressing the right buttons — although I don’t know if these technicians actually practice such a maneuver or plan for it. Even if they do, I take it as given that they might make mistakes in the heat of the moment.

5. Is it Administratively or Politically Straightforward to Do So?

I’d say “no” and give as evidence the furor over this topic. I think that the political world tends to view the Internet as akin to any other piece of infrastructure (roads, water system, electrical grid), and I doubt that analogy provides a serviceable one. In the case of an Internet-scale attack on US information infrastructure, I don’t think that the conditions for the President to request a shutdown are clear or at all well-understood: the administration would almost certainly require private-sector analysis to inform its opinion. Furthermore, from a technical standpoint, this is the “nuclear” option, and we have no technology that tells us “how bad” a cyberattack actually is: are we being tickled with a feather, walloped by an anvil, or smacked on the backside with a plastic shovel? A misjudgment and overreaction here could be a cure much worse than the (misdiagnosed) disease.

6. Do Tier 1 ISPs Require Corporate, Political, or Military Involvement?

This answer depends on the definition of “involvement.” Much of the argument on this topic has been phrased in absolute terms: an administration would have sole command authority to issue an “Internet Kill” order. While government has not restrained itself from overreaching in the technical sphere before (see, for example, the downsides of CALEA and its invasion of the academic sphere), I doubt that political authority over the Internet would really assume this kind of authoritarian form (my personal politics make me extremely uncomfortable with this level of government control, so perhaps this is wildly optimistic thinking on my part). I don’t think that the government would either command or require ISPs to seek permission to enact large-scale filtering.

Nor do I think that ISPs would need a government whip to work together. Although ISPs compete with each other in a number of dimensions, and policy dictates the actual routing, ISPs also peer with each other and cooperate on a range of issues.

I don’t think that the ISPs need government assistance in terms of logistics; there is no need for the government to setup a hotline, website, or working groups, committees, panels, etc. to help ISPs talk with each other during such an emergency. Such communication could happen over the channels that ISPs already have established (some of these are informal contacts such as network operators sharing cell phone information) for Internet-scale emergencies (these happen regularly due to simple misconfiguration or failure of physical infrastructure).

In fact, the relationship is almost exactly the other way around: government requires industry assistance in terms of information, data, and analysis in case of such an event.

I do, however, concur that some part of the government would want to be in the decision loop for taking such a drastic step. They may not actually give the go-ahead or command that it be done, but I suspect that they’d want veto power or at least a warning that the business community was about to do this. This organ might be DHS, DOD, DNI, Interior, Commerce, NSA, or some other agency…I doubt the government has a coordinated plan or point of contact for such events (which I suspect was the intent behind the relevant clauses in the Rockafeller-Snowe bill to enable the executive branch to make such a call). I see this legislative attempt as a symptom of a government/administration that is on the verge of “getting it” in terms of the importance of critical information infrastructure, even if the expression of this awareness is to introduce clarity in the form of additional executive branch power over private commerce.

7. How Fast Could the Shutdown Take Place?

Network operators — the actual technicians in charge of routers and other network equipment — are a small, fairly tight-knit community. Even though these engineers work for many different companies, they (at least those working for the major players or Tier 1 ISPs) know each other quite well, and NANOG holds regular meetings. Informal cooperation happens all the time. I expect that in an Internet-scale emergency (as there have been in the past), this community would be in touch with each other quite quickly: so it is conceivable that they could coordinate a response to a major event and terminate basic connectivity within a matter of hours or minutes. Such a move would probably require some cooperation and coordination from both the political/military world as well as corporate approval. I assume that some minimal coordination happens before admins start typing at keyboards…but in a flat-out emergency, shutting off network interfaces can be accomplished very quickly.

Once either corporate leaders (alone or in consultation with civilian or military leaders) reach a decision, the technical difficulty of shutting down routers and other networking equipment can be accomplished within a few minutes. The bulk of any delay in reducing connectivity almost certainly rests in the human and policy decisions necessary to give the green light to such activity. I suspect that Tier 1 ISPs have some business process (independent of government regulation or cooperation) that requires VP or Director-level permission to execute such an action.

Where Should Ultimate Authority for Such a Move Rest?

This is the whole point, isn’t it? The answer depends on your politics. From a technical perspective, this is the difference between “policy” and “mechanism.” The mechanism is in place and sits almost entirely in private hands. The policy is distributed across the private and public sector, and I’m willing to believe that factions exist in both spheres that respectively (1) want and (2) abhor the responsibility for making such a call.

Under What Conditions Do We Plug Back In?

I see this question as more important than the others. Pulling the plug is a decision made under a certain set of circumstances and with a certain set of criteria in mind; have the politicians planned for when it will again be “safe” to plug back into the Internet? How will they know for sure? Do they realize that the Internet is already a very loud and risky battleground, and that we run this risk every day? Should all commerce, community, and information exchange grind to a halt simply because a few politicians and White House advisors got a bit nervous during a particularly loud cyberattack? Can the US financial markets and other information infrastructure be offline for extended periods of time?

This question highlights how (from a technical perspective) the issue of an Internet kill switch (either public or private) seems a bit nonsensical: it is overkill and almost certainly something likely to be used in a knee-jerk fashion with no thought for the recovery complexity. There is probably a good analogy to be made here that illustrates the self-defeating futility of disconnection, but I can’t think of one at the moment.

What Are the Alternatives?

The deployment of “reasonable” alternative defenses or reactions differs based on what type of attack we have to consider. Companies (including large ISPs, but also your “average” Fortune 500) have a variety of other internal defense mechanisms against cyberattack (coordinated or otherwise), but the efficacy of these mechanisms varies widely, and the effect is almost always local or limited to their own network infrastructure.

More Resources

For understanding interdomain routing, a good place to start is Tim Griffin’s page. You can move on to JI’s Fall 2002 Internet Routing course at Columbia and then Radia Perlman’s Interconnection’s book.

The company Renesys also provides deep, wide analysis of Internet-scale phenomena and conditions. At least in the public world, they have no serious competitor.

[Updated 15 July to point to Schneier's blog post. -Ed.]

Comments off

A (Simulated) Turing Machine

Very neat… (a simulation of) a Turing Machine: http://aturingmachine.com/index.php

Comments off

Trusting SSL Certs

The problem of user “click-through” of SSL certificate warnings when connecting to websites using SSL/TLS is an old, well-recognized one. Users have little information with which to make a meaningful trust decision.

I recently visited a Web site where I had to submit reviews, and the connection was HTTPS. Firefox’s new in-browser dialog warning about the risks of an unverified or otherwise broken SSL cert appeared, complaining that the certificate was invalid for a few different reasons, namely that it was not valid for the URL (“3 of 4″ rather than “trust2010.org”), that it was self-signed (not in itself an uncommon situation for these types of paper reviewing sites and other ad-hoc, self-hosted sites), and that it had expired (it was only valid for one month in late 2009). A screenshot of this warning follows:

Firefox Complaining About the Validity of a Certificate

Firefox Complaining About the Validity of a Certificate

I typically tend to add these exceptions after a cursory glance at the certificate information. After all, many sites have some desire to run their own PKI, distribute self-signed certs, be their own CA, or otherwise present certificates that are not signed by one of the existing built-in root certificates in your browser.

The only thing that stopped me from approving the exception was the certificate information, which was so obviously bogus and homemade that I did not want to risk connecting to the site and giving away my log-in credentials. Screenshot of this certificate follows:

The Invalid Certificate Information

The Invalid Certificate Information

Note that had this information been more realistic, I would have had no way to feel suspicion about the certificate. Even if it were revoked or outdated, for the purposes of uploading a few reviews, I might have easily just added the exception and gone on. The moral of the story here is that even experienced users, when presented with credible information, have no way to ascertain the trustworthiness of the information contained in a certificate. So: do certificates need to be active entities and go about proving their provenance to a user in an active manner, such as playing a game, completing a formal proof, or otherwise attesting to some properties known only to the user and the endpoint he is trying to communicate with?

Comments off

Counterfeit Hardware

An interesting detective story dealing with hardware disassembly to check the provenance of micro SD cards:

http://www.bunniestudios.com/blog/?p=918

Not exactly trojan hardware, but a good case illustrating the actual level of trustworthiness of real hardware nonetheless, and it includes an interesting foray into the economics of micro SD production near the end.

[I picked this link up from a post to a private mailing list -Ed.]

Comments off

System Forensics

RFC 3227 is a handy resource for students interested in the challenges of beginning the recovery process:

http://www.faqs.org/rfcs/rfc3227.html

I hadn’t known about this until reviewing a paper recently. This (short) RFC contains some guidelines for performing forensics on a compromised computer system. Nothing earth-shattering, but it does provide a nice collection of principles.

Why do these practices matter? Because expert witnesses and the legal system can easily question the quality of digital evidence:

http://www.piercelaw.edu/assets/pdf/release-mavis-case-expert-report.pdf
(this report received Slashdot coverage last summer).

Comments off

CPU DoS Attacks

Also known as CPU starvation or CPU consumption attacks, such attacks present a difficult challenge to commodity computing platforms: users typically believe that commodity hardware is a high-assurance product and that software errors present more of a threat to reliability, quality of service, or security.

A Denial-of-Service (DoS) attack on a Central Processing Unit (CPU) represents an intentionally induced state of partially or completely degraded CPU performance in terms of the ability of the CPU to make progress on legitimate instruction streams.

Background

This type of attack represents a condition of the CPU whereby its available resources (registers, data path, arithmetic functional units, floating point units, and logic units) remain in an intentionally induced state of overload, livelock, or deadlock.

An attacker can prevent the CPU from making progress on the execution of benign processes in a number of ways, but at least two mainstream methods suffice to overload the CPU or impair its ability to multiplex between a collection of processes. The first method involves the exploitation of a hardware error in CPU design or construction to halt or loop the CPU or otherwise place it in an error state requiring a hard reset. The Intel F00F bug provides an example of this method of attack. The second method involves exploiting a software error in the operating system (example 1, example 2) or user-level software to cause the CPU to continuously service the faulting software (sometimes in spite of the kernel scheduler’s attempts to ensure fair CPU multiplexing). This style of attack is closely related to Algorithmic Complexity DoS attacks (they differ in that such attacks can also overload or impair memory performance rather than the CPU as the chief avenue of service degradation).

Applications

Attackers can cause the CPU to hang, halt, or execute malicious (or useless) code rather than legitimate, benign processes. They can do so by identifying and building on an error in the CPU hardware or by causing the kernel or one or more user-level processes to consume more than their fair share of execution time. Often, the latter attack involves identifying a software error ({\it e.g.,} to cause an unterminated loop) or supplying data to system calls specially constructed to cause uncharacteristically long system call execution time.

Low-tech versions of this latter style of attack can include manipulating the scheduler priority for one or more processes, disabling or removing resource limits (if provided by the operating system) or issuing a fork bomb or similar resource exhaustion attack.

Descriptions of hardware bugs are often available in the published CPU errata lists for each CPU model. The errata lists often describe such errors and their preconditions in enough detail to enable the reconstruction of code sequences that manifest the error state (which often, but not always, results in a CPU hang or inconsistent state rapidly leading to a hard or soft reset). Attackers can then either directly upload and run such code sequences on a target platform or construct data aimed at eliciting such instruction sequences from the execution of existing program binaries.

The aforementioned Pentium F00F bug supplies one prominent example of a hardware error leading to a hung CPU. The Pentium processor failed to correctly handle an illegal formulation of the CMPXCHG8B instruction. Specifically, if this instruction was given a non-memory operand (the implicit operand is the concatenation of the EDX and EAX registers, and the explicit operand must refer to memory)and the instruction was given the LOCK prefix, then the CPU entered a complex failure state. Normally, supplying a non-memory operand to this instruction should generate an illegal opcode exception. Unfortunately, simultaneously specifying the LOCK prefix (which is also illegal for this type of instruction) exploited a bug in the CPU: when the CPU recognized the invalid opcode due to the non-memory operand, it attempted to invoke the invalid instruction handler vector, thus causing two reads to the memory bus. The LOCK prefix, however, caused the bus to enter a state where it expects a read-write pair of bus requests rather than two memory bus reads, and the CPU subsequently hung. Intel introduced clever workarounds, including some that took advantage of the bug’s behavior, but the ease with which this hardware error could be exploited should serve as a warning that commodity computing hardware remains complex and full of significant errors.

Open Problems

Preventing DoS attacks is notoriously difficult — the point of most software and hardware computing systems, is, after all, to provide service, and exhausting available bandwidth, memory, or CPU cycles remains a major concern in the absence of redundancy or strict and well-calibrated resource limits.

Hardware errors will continue to present a troubling source of potential CPU DoS attacks. Hardware cannot be patched as easily as software, and simply executing a user-level program with the right mixture of instructions can compromise an entire machine, including software layers like the OS or a Virtual Machine Monitor that are traditionally supposed to enforce isolation or access control.

Latent software errors in the OS kernel and a wide variety of user-level applications also present opportunities for CPU exhaustion, livelock, or deadlock. With the increasing emphasis on parallel computing models and multicore systems, software errors involving improper lock ordering or bugs in threading libraries supply ample material for impairing the ability of the CPU to make progress on benign instruction streams.

Related Work

  1. Intel Core 2” Theo de Raadt. The openbsd-misc mailing
    list. June 27, 2007.
  2. The Pentium F00F Bug” Robert R. Collins.
  3. Denial of service (CPU consumption) via a long argument to the
    MAIL command.
    ” The Apache Software Foundation. 15 June 2006.
  4. Remote Code Execution Through Intel CPU Bugs” Kris
    Kaspersky. HITBSecConf2008.

Comments off