The Recycle Bin

Entries tagged as ‘programming’

Holy anti-feature, batman

June 13, 2008 · Leave a Comment

I had the opportunity to attend at talk by Mark Russinovich, of Sysinternals fame, during last week’s Trustworthy Computing Conference.  The topic of the talk was about security boundaries in Windows, and more specifically, what is not a security boundary.  The talk was very interesting, and I don’t want to reveal too much here, but there was one part of it that stuck with me and has been bothering me for a little while now.  One of the technologies he addressed was Patchguard, or Kernel Patch Protection, which was introduced in 64-bit Vista and Server 2008.  Patchguard is intended to keep programs from patching, hooking, or otherwise tampering with the internals of the NT kernel.  It does this by periodically taking a checksum of some important structures in the kernel (SSDT, interrupt table, HAL tables, etc) and comparing the current value with the previous one.  Any discrepancy here will indicate that the kernel has been subverted.  If it notices any changes to these structures, it throws an exception which throws a blue screen error.  Sounds good, right?  Sounds like a great new security feature, no more rootkits!  Well, not really.  The truth is, KPP really does nothing to stop malicious code, and in fact, is pretty useless in doing so.  Mark revealed in his talk, that that was never the intention of KPP, but rather, it was conceived as a way to force legitimate developers to stop using these techniques in their own programs.  See, most anti-virus and security products will use some level of system hooking in order to get a good view of activity.  In fact, one of Mark’s very own tools, RegMon, hook’s the SSDT to watch registry activity.  He even wrote a publication about the technique!  The problem with kernel hooking is that it is entirely unsupported and significantly reduces stability.

So here’s what I don’t understand.  Microsoft has recognized that system hooking leads to instability.  They’ve decided that programmers aren’t good enough to extend a kernel function safely without throwing a blue screen exception, so now they’re not going to allow us to hook certain system structures (pfft, allow is a funny thought).  But, instead of actually fixing the gaping holes in their system, they’re going to simply watch for system hooking, and then guarantee that the system will crash, by causing the crash.  Oh yeh, and they are going to blame it on the developer.  It’s like a car company deciding that talking on your cell phone while driving is dangerous, so they’re going to create a system that detects if you’re on the phone and then drives the car off the road for you.  That will show you! 

I just don’t understand this anti-feature.  There are plenty of legitimate reasons for hook these system functions, and it can be done safely.  I know it can because Mark has done it, and I’ve done it.  If you don’t want developers to subvert the kernel, then provide a complete API that we can use to extend and monitor the system, and fix the problems with your system that allows someone to take write-protected virtual memory, map it to physical memory, strip all the restrictions off of it, and send it back patched.  Don’t just come behind perfectly valid code, throw an exception and blame it on us. 

Categories: Security · Vista · microsoft · programming
Tagged: , , , , , ,

When a File Exists, but Doesn’t

April 4, 2008 · 1 Comment

I stumbled across a rather obnoxious “feature” in Vista today that I thought I should share with everyone.  With Vista, Windows finally enforces file system permissions and separates applications from user data.  It does this by restricting write access to the C:\ directory.  A user can read and write freely in their C:\Users\$username\ directory, but when they try to write to C:\Program Files or C:\Windows, they are presented with a UAC prompt that demands administrator privileges.   

This poses a problem for legacy applications that weren’t written with Vista’s security model in mind.  When you launch a program you are starting it with your permission set, which by default doesn’t have write access to the C:\ root.  If a program tries to write to say, C:\Windows, it will be denied.  To ease compatibility problems, Vista redirects these programs to a virtual directory that the program does have write access to.  A program trying to write to C:\Windows will be redirected to C:\Users\$username\AppData\Local\VirtualStore\Windows and will never know there was a problem. 

I discovered this the hard way this morning.  I have an application that normally needs to run with elevated privileges.  It has to start a kernel-mode service, read a log file in C:\Windows and write its own log file there.  When the application starts up it loads its own log file and displays it to the screen.  I wanted to run the application with a clean slate, so I stopped the service, deleted the services log and deleted the applications log.  I started up the program and was surprised that it was still loading its log file.  I thought I had deleted it!  Both Explorer and the command prompt claimed that the file was gone.  I tried to enter the path directly into Explorer and it couldn’t load the file.  But still, my application could load it.  I thought there might have been some strange caching with .Net applications, so I wrote a Win32 C program that loads the file and that one worked too!  So, now I’ve got two applications that claim to be loading a file that doesn’t exist!  Now I’m starting to get confused.  I run my program in Debug mode and drop into the debugger when it tries to load the file.  I was hoping that the debugger could give me more information about the file that it’s loading.  Even the debugger is telling me that it’s loading a log file in C:\Windows that doesn’t exist!  How can a debugger lie?  Ok, I know the answer to that, but it really doesn’t apply here…  The most bizarre part of this is that I could read the file from the command prompt with Edit, but only if I ran as a standard user, not an administrator.

edit

As you can see, the admin prompt loads nothing, but the standard one loads the log.  What could possibly be going on here!?  It turns out, the Vista developers decided it would be better to completely lie to an application instead of allowing it to crash.  Apparently, I opened my application once as a standard user, and it got redirected to C:\Users\$usersname\AppData\Local\VirtualStore\Windows and wrote the log there.  Whenever an application tried to load a file from a protected folder, it will check that location, and then check the Virtual Store.  If the file is in the VS, then the application is given a handle to that, but the OS never tells the program where the file actually is located.  This keeps the transition invisible to legacy programs who will end up just reading and writing from the virtual store.  This would explain why Edit could only find the file if it was opened under standard user, since the admin one wouldn’t know which user’s virtual store to look into.

This was all quite aggravating to me and violates venerable UNIX law:

when you fail, fail loudly with lots on information to stderr

I guess this is what happens when you try to make changes to a system but still support backwards compatibility.  My opinion is that developers have been allowed to abuse the user permission system in Windows for too long, and don’t deserve the niceties provided but this feature.  If you want to write in a protect folder, you must run with elevated privileges, otherwise you catch the error or crash.  This Virtual Store is no more than a lie and operating systems shouldn’t lie.

Categories: Vista · programming
Tagged: , ,

Big-Oh

March 11, 2008 · Leave a Comment

The other day I was reading a certain social news aggregator and I came across a link to a blog about a new functional language being built off of .Net, called F#[I did it with .Net: Why I love F#].  I haven’t looked into F# very much so I can’t say much about it.  I can tell you that it is a functional language being developed at Microsoft Research based of the OCaml language and using the .Net 2.0 runtime.  What was more interesting was the conversation that the blog submission generated.  The post was about the time complexity of a certain function in F# compared to a more tradition imperative language.  The discussion brought to light how confusing time-complexity measurement can be for people, since it seemed most there didn’t know what they were talking about.  So gather round folks for a riveting discussion about algorithm performance rating.

When computer scientists say that a certain algorithm is fast they are usually talking about its asymptotic time complexity.  That is to say, they are measuring the number of steps an algorithm takes in respect to its input size.  To simplify things, we imagine a computer where each basic step takes a constant amount of time.  Now we can judge an algorithm based on how many times a basic step is performed.  What we are most interested in is the upper bound of the growth rate, in other words, the amount of steps it takes in the worst case scenario.  This is often written in Big-Oh notation (ex. O(n), to be read as “order n”).  Here is a graph that shows the growth rate of the most common measurements:

O(1), O(logN), O(N), O(N^2), O(2^N)

time complexity

What this graph is showing is the time it takes an algorithm to complete as the input size increases.  As you can see, O(2^N) grows very rapidly.  This is an example of a very very slow algorithm.  Linear time, O(N), is considered acceptable for most problem sets, and algorithm researchers are thrilled if they can get it any lower.

I’m going to try to explain this with a real example and some code.  To do this I need to introduce a data structure known as a linked list.  The point of a linked list is to hold a collection of similar pieces of data in an efficient way.  The data structure consists of a series of nodes that has a value and pointer to the next node in the list.  Here is a node object in c#:

public class Node
    {
        public int value;
        public Node next;

        public Node()
        {
            next = null;
        }
    }

To create a list of these nodes, all you need to do is make a pointer to the first node, or the HEAD of the list.  This way the link list can be scattered throughout memory, and nodes can be added and removed very efficiently.  Here’s a picture to demonstrate:

link

If we start at the HEAD node and follow all of the pointers, then the list would be:

1 7 10 -5 12

Like any data structure, linked-lists are good for certain tasks and bad for others.  For example, to find a particular node you would have to start at the HEAD and walk through the list until it is found.  In a worst case scenario you would have to walk through the entire list which would take n steps (n being the number of nodes).  This operation would have a time complexity of O(n).  Here’s is some typical find functions in c#

    class LinkedList
    {
        Node head;

        int getIndex(int value)
        {
            int index = 0;
            Node temp = head.next;
            while (temp != null)
            {
                if (temp.value == value)
                    return index;
                temp = temp.next;
                index++;
            }
            return -1;
        }

        Node getNode(int index)
        {
            int counter = 0;
            Node temp = head.next;
            while (temp != null)
            {
                if (counter == index)
                    return temp;
                temp = temp.next;
                counter++;
            }
            return null;
        }
    }

As you can see, both of these functions require some looping. In fact, they will loop until either the requested node is found, or the end of the list is reached.  Obviously, this is better than an O(2^N) algorithm, but it still isn’t all that fast.  To contrast, adding and removing elements are two operations that linked lists perform very quickly.  To add a node we simply adjust what the HEAD node is pointing to, and to delete a node we take the value and pointer from the next node in the list.  This is kind of hard to imagine, so here is some diagrams:

First, adding a node:

add

Second, deleting a node:

delete

The difference between these functions and the previous two is that they don’t have to loop through the list when they run.  The worst case in either function is that it takes one step to perform, so they get a time complexity of O(1), otherwise known as constant time.  Here is some code:

class LinkedList
    {
        Node head;

        void Add(int value)
        {
            Node newNode = new Node();
            newNode.next = head.next;
            newNode.value = value;
        }

        void Remove(Node n)
        {
            if (n.next == null) n = null;
            else
            {
                n.value = n.next.value;
                n.next = n.next.next;
            }
        }
    }

Is it obvious now how deleting and adding a node is faster than finding one?  How would you rate a function that removes a particular node at a specified index?  More code:

void removeAt(int index)
   {
       Node toRemove = this.getNode(index);
       this.Remove(toRemove);
   }

Code reuse makes this functions easy!  The first line calls to the function getNode(index).  We’ve already learned that finding a node at a particular index is an O(n) problem.  The second line removes that node, which we just said is a O(1) problem.  Thus, this function has the time complexity of O(n) + O(1).  When adding time complexities together, we only care about the largest complexity, and throw out the smaller one.  This is because the rating is supposed to indicate how quickly the function grows and smaller complexities don’t add anything to the upper bound.  That being said, the time complexity of the removeAt(index) function is O(n).  Get it?

Now these are just the basics of algorithm analysis.  This was intended to merely be an introduction to the are, and quick explanation of how we rate algorithms for those who don’t already know.  There is an entire field of research devoted to optimizing current solution sets.  Computer scientists strive to find the perfect algorithm and data structure for a problem set and without optimal data structures and algorithms you would never be able to find an email in a 5GB inbox, or have an object in Halo obey the laws of physics.

Categories: C# · programming
Tagged: , , ,