Sometimes, Microsoft Gets it Right (The .NET Framework 4)

“Parallel programming” are two words that programmers today do not use nearly often enough. I think this is partly because most developers answer to management types who simply “want to get the job done”. Parallel code is also highly susceptible to deadlocks, race conditions and other problems that are much easier to avoid in traditional single-threaded applications. The problem is that we’ve reached a turning point in CPU architecture where CPUs are scaling out instead of up. In other words, instead of simply running faster, they’re adding cores and executing many operations simultaneously.

The catch is that applications need to be written to capitalize on this architecture, and making multi-threaded applications easier to write is obviously on Microsoft’s mind, judging by the features announced for the upcoming .NET Framework 4.0.

One of the main features I am looking forward to is the Parallel class, which makes it easy to thread simple loops. It represents a significant advance in simplifying loop management. The .NET 4.0 team assures us that “for many common scenarios, it just works, resulting in terrific speedups”. A similar technique (Parallel.ForEach) can be used to write parallel loops over iteration spaces of non-integral objects. A simple integer loop looks like this:

Parallel.For(0, N, i=>
  {
    DoWork(i);
  });
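
To make the snippet above a little more concrete, here is a minimal sketch of both loop styles. The class and method names (ParallelLoopSketch, SquaresByIndex, TotalLength) are made up for illustration; only Parallel.For, Parallel.ForEach and Interlocked.Add come from the framework.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper class, for illustration only.
public static class ParallelLoopSketch
{
    // Integer iteration space: each iteration writes only to its own slot,
    // so the loop body needs no locking.
    public static long[] SquaresByIndex(int n)
    {
        long[] results = new long[n];
        Parallel.For(0, n, i =>
        {
            results[i] = (long)i * i;
        });
        return results;
    }

    // Non-integer iteration space: Parallel.ForEach walks any IEnumerable<T>.
    // The running total is shared between iterations, so it is updated atomically.
    public static int TotalLength(IEnumerable<string> words)
    {
        int total = 0;
        Parallel.ForEach(words, word =>
        {
            Interlocked.Add(ref total, word.Length);
        });
        return total;
    }
}
```

The point of the two methods is the difference in shared state: the first loop is safe because iterations never touch the same element, while the second has to synchronise because every iteration updates the same variable.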

There are also overhauls to the ThreadPool class (which was in dire need of serious attention) and the inclusion of “Tasks”: simple types (Task and the generic Task<TResult>) that implement IAsyncResult, which means a Task can be used as the core of a Begin/End implementation. The team has also clearly thought these improvements through, with easy and clear ways to cancel parallel operations, as well as a number of great ways to handle exceptions within parallel blocks.
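
As a rough sketch of what cancellation, exception handling and the IAsyncResult connection look like in practice (the class and method names here are hypothetical, and the token is cancelled up front purely to keep the outcome predictable):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper class, for illustration only.
public static class ParallelFailureSketch
{
    // Exceptions thrown inside the loop body surface as one AggregateException.
    public static bool LoopReportsFailure(int n)
    {
        try
        {
            Parallel.For(0, n, i =>
            {
                if (i % 2 == 1) throw new InvalidOperationException("odd index " + i);
            });
            return false;
        }
        catch (AggregateException)
        {
            return true;
        }
    }

    // A cancelled token makes the loop throw OperationCanceledException.
    public static bool LoopHonoursCancellation()
    {
        using (CancellationTokenSource cts = new CancellationTokenSource())
        {
            ParallelOptions options = new ParallelOptions { CancellationToken = cts.Token };
            cts.Cancel(); // cancelled before the loop starts, to keep the sketch deterministic
            try
            {
                Parallel.For(0, 1000, options, i => { });
                return false;
            }
            catch (OperationCanceledException)
            {
                return true;
            }
        }
    }

    // Task<TResult> implements IAsyncResult, so it can be handed to Begin/End style code.
    public static int TaskAsAsyncResult()
    {
        Task<int> task = Task.Factory.StartNew(() => 21 * 2);
        IAsyncResult asyncResult = task;
        asyncResult.AsyncWaitHandle.WaitOne();
        return task.Result;
    }
}
```

In real code the token would normally be cancelled from another thread (a Cancel button, a timeout) while the loop is running, rather than before it starts.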

There are of course other advantages to the 4.0 Framework, but it’s the big emphasis on easing multi-threaded application development that’s got me excited.

Enjoy!

C# Levenshtein Distance (Difference Between 2 Strings)

If you want to match strings approximately (fuzzy matching), use the Levenshtein distance algorithm. Many projects need this logic, including spell-checkers, search suggestions and plagiarism detectors.

In information theory and computer science, the Levenshtein distance is a metric for measuring the amount of difference between two sequences (i.e., the so-called edit distance). The Levenshtein distance between two strings is given by the minimum number of operations needed to transform one string into the other, where an operation is an insertion, deletion, or substitution of a single character. A generalization of the Levenshtein distance (the Damerau–Levenshtein distance) also allows the transposition of two adjacent characters as an operation.

I simply needed to measure the difference between two independent strings. This C# implementation I found was my saving grace:

using System;

/// <summary>
/// Contains approximate string matching
/// </summary>
static class LevenshteinDistance
{
    /// <summary>
    /// Compute the distance between two strings.
    /// </summary>
    /// <param name="s">The first of the two strings.</param>
    /// <param name="t">The second of the two strings.</param>
    /// <returns>The Levenshtein cost.</returns>
    public static int Compute(string s, string t)
    {
        int n = s.Length;
        int m = t.Length;
        int[,] d = new int[n + 1, m + 1];

        // Step 1
        if (n == 0)
        {
            return m;
        }

        if (m == 0)
        {
            return n;
        }

        // Step 2
        for (int i = 0; i <= n; d[i, 0] = i++)
        {
        }

        for (int j = 0; j <= m; d[0, j] = j++)
        {
        }

        // Step 3
        for (int i = 1; i <= n; i++)
        {
            //Step 4
            for (int j = 1; j <= m; j++)
            {
                // Step 5
                int cost = (t[j - 1] == s[i - 1]) ? 0 : 1;

                // Step 6
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);
            }
        }
        // Step 7
        return d[n, m];
    }
}
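
The implementation above allocates a full (n + 1) by (m + 1) matrix, but each cell only ever looks at the current row and the one before it. For long strings, a variant that keeps just two rows gives the same answer with far less memory. This is a sketch of that idea, not part of the implementation I found; the class name LevenshteinTwoRow is made up.

```csharp
using System;

// Hypothetical two-row variant of the matrix implementation above.
public static class LevenshteinTwoRow
{
    public static int Compute(string s, string t)
    {
        if (s.Length == 0) return t.Length;
        if (t.Length == 0) return s.Length;

        // previous holds row i - 1 of the matrix, current holds row i.
        int[] previous = new int[t.Length + 1];
        int[] current = new int[t.Length + 1];

        // Row 0: turning an empty prefix of s into t[0..j) takes j insertions.
        for (int j = 0; j <= t.Length; j++)
        {
            previous[j] = j;
        }

        for (int i = 1; i <= s.Length; i++)
        {
            // Column 0: turning s[0..i) into an empty string takes i deletions.
            current[0] = i;

            for (int j = 1; j <= t.Length; j++)
            {
                int cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
                current[j] = Math.Min(
                    Math.Min(previous[j] + 1, current[j - 1] + 1),
                    previous[j - 1] + cost);
            }

            // The row just filled in becomes the "previous" row for the next pass.
            int[] swap = previous;
            previous = current;
            current = swap;
        }

        return previous[t.Length];
    }
}
```

For example, LevenshteinTwoRow.Compute("kitten", "sitting") returns 3: substitute k with s, substitute e with i, and insert a g at the end.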

What’s the best Way for 2 Processes to Communicate in .Net?

While trawling the internet the other day, I came across this question, and thought it might be something that others might like to know.  The question was:

What’s the best (or maybe not the best — just good) way for two processes in the same machine to communicate, using .NET?

Actually the two processes in the app I’m working on aren’t even two different programs; they’re just two instances of the same EXE. I wanted to do something like a singleton app, but have it per user (meaning a Terminal Server or Citrix or App-V server with multiple users should be able to launch their own single copy of the app). If another instance is run by the same user, it should just delegate the task to the already running instance, then exit. Only one instance per user of the program should be running. So far I’ve done (thanks to StackOverflow) the part that detects whether an instance of the app is already running, using Mutex. But I need the second app instance to be able to send data to the first app instance.

I’m leaning towards using named pipes and WCF’s NetNamedPipeBinding for this, but if you have better ideas I’ll really appreciate it. Thanks 🙂

IPC is what I’ve used in the past for this, and it is surprisingly easy. .NET Remoting is a good option, but unfortunately a restricted one because you can’t, for example, use it on the Compact Framework (CF).

Below is a copy of the class I use to perform inter-process communication. You can use it in conjunction with a Mutex if you wish, but it isn’t necessary. As long as the “pMappedMemoryName” and “pNamedEventName” are the same in both processes, it should work just fine. I tried to make it as event-driven as possible.

The class looks a little like this:

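  // Requires: using System; using System.Text; using System.Threading;
  // EventsManagement, HandleManagement, MemoryMappedFileStream, IServiceContext and
  // TextualEventArgs are helper types (wrappers around Win32 calls) that are not shown here.
  public class IpcService : IDisposable {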
    private IServiceContext mContext;
    const int maxLength = 1024;
    private Thread listenerThread;
    private readonly string mMappedMemoryName;
    private readonly string mNamedEventName;
    public event EventHandler IpcEvent;
    private readonly bool mPersistantListener;

    public IpcService(bool pPersistantListener)
      : this(pPersistantListener, "IpcData", "IpcSystemEvent") {
      ;
    }

    public IpcService(bool pPersistantListener, string pMappedMemoryName, string pNamedEventName) {
      mPersistantListener = pPersistantListener;
      mMappedMemoryName = pMappedMemoryName;
      mNamedEventName = pNamedEventName;
    }

    public void Init(IServiceContext pContext) {
      mContext = pContext;
      listenerThread = new Thread(new ThreadStart(listenUsingNamedEventsAndMemoryMappedFiles));
      listenerThread.IsBackground = !mPersistantListener;
      listenerThread.Start();
    }

    private void listenUsingNamedEventsAndMemoryMappedFiles() {
      IntPtr hWnd = EventsManagement.CreateEvent(true, false, mNamedEventName);
      while (listenerThread != null) {
        if (Event.WAITOBJECT == EventsManagement.WaitForSingleObject(hWnd, 1000)) {
          string data = Peek();
          EventsManagement.ResetEvent(hWnd);
          EventHandler handler = IpcEvent;
          if (handler != null) handler(this, new TextualEventArgs(data));
        }
      }
      EventsManagement.SetEvent(hWnd);
      Thread.Sleep(500);
      HandleManagement.CloseHandle(hWnd);
    }

    public void Poke(string format, params object[] args) {
      Poke(string.Format(format, args));
    }

    public void Poke(string somedata) {
      using (MemoryMappedFileStream fs = new MemoryMappedFileStream(mMappedMemoryName, maxLength, MemoryProtection.PageReadWrite)) {
        fs.MapViewToProcessMemory(0, maxLength);
        fs.Write(Encoding.ASCII.GetBytes(somedata + "\0"), 0, somedata.Length + 1);
      }
      IntPtr hWnd = EventsManagement.CreateEvent(true, false, mNamedEventName);
      EventsManagement.SetEvent(hWnd);
      Thread.Sleep(500);
      HandleManagement.CloseHandle(hWnd);
    }

    public string Peek() {
      byte[] buffer;
      using (MemoryMappedFileStream fs = new MemoryMappedFileStream(mMappedMemoryName, maxLength, MemoryProtection.PageReadWrite)) {
        fs.MapViewToProcessMemory(0, maxLength);
        buffer = new byte[maxLength];
        fs.Read(buffer, 0, buffer.Length);
      }
      string readdata = Encoding.ASCII.GetString(buffer, 0, buffer.Length);
      return readdata.Substring(0, readdata.IndexOf('\0'));
    }

    private bool mDisposed = false;

    public void Dispose() {
      if (!mDisposed) {
        mDisposed = true;
        if (listenerThread != null) {
          listenerThread.Abort();
          listenerThread = null;
        }
      }
    }

    ~IpcService() {
      Dispose();
    }

  }

Simply use the Poke method to write data, and the Peek method to read it, although I designed it to automatically fire an event when new data is available. In this way you can simply subscribe to the IpcEvent event and not have to worry about expensive and constant polls.  Enjoy.
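
The class above leans on helper wrappers that aren’t shown. If you are on .NET 4 or later on Windows, the same Poke/Peek idea can be sketched with types that ship in the framework: MemoryMappedFile for the shared buffer and a named EventWaitHandle for the signal. Treat this as a rough equivalent under those assumptions, not a drop-in replacement; the SharedMemoryMessage class and its method names are made up for illustration.

```csharp
using System;
using System.IO.MemoryMappedFiles;
using System.Text;
using System.Threading;

// Hypothetical sketch using framework types (assumes .NET 4+ on Windows,
// where named mappings and named events are available).
public static class SharedMemoryMessage
{
    private const int MaxLength = 1024;

    // Writes a null-terminated ASCII string into the mapping, then signals the event.
    public static void Poke(MemoryMappedFile map, EventWaitHandle signal, string text)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(text + "\0");
        if (bytes.Length > MaxLength)
        {
            throw new ArgumentException("Message is too long for the shared buffer.", "text");
        }
        using (MemoryMappedViewAccessor view = map.CreateViewAccessor(0, MaxLength))
        {
            view.WriteArray(0, bytes, 0, bytes.Length);
        }
        signal.Set();
    }

    // Reads the buffer back and returns everything before the first null byte.
    public static string Peek(MemoryMappedFile map)
    {
        byte[] buffer = new byte[MaxLength];
        using (MemoryMappedViewAccessor view = map.CreateViewAccessor(0, MaxLength))
        {
            view.ReadArray(0, buffer, 0, buffer.Length);
        }
        int end = Array.IndexOf(buffer, (byte)0);
        return Encoding.ASCII.GetString(buffer, 0, end < 0 ? buffer.Length : end);
    }

    // Both halves in one process, just to show the round trip.
    public static string RoundTrip(string text)
    {
        using (MemoryMappedFile map = MemoryMappedFile.CreateOrOpen("IpcData", MaxLength))
        using (EventWaitHandle signal =
            new EventWaitHandle(false, EventResetMode.ManualReset, "IpcSystemEvent"))
        {
            Poke(map, signal, text);
            if (!signal.WaitOne(1000)) return null; // nothing arrived within a second
            signal.Reset();
            return Peek(map);
        }
    }
}
```

One thing to watch with this approach: a named mapping that isn’t backed by a file only lives as long as some process holds a handle to it, so the listening instance should keep its MemoryMappedFile open for its whole lifetime.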

Simple Solution to Illegal Cross-thread Calls in C#

A cross-thread call is “illegal” when it touches controls that were not created on the thread making the call. In that case you need a delegate, so that even if the decision about (or preparation for) the change happens on another thread, the actual modification of the control happens on the thread that created it. Basically, this is a long-winded way of saying you can’t (and shouldn’t) access controls on a form from a thread that didn’t create them in the first place.

Versions of .NET older than 2.0 (1.0 and 1.1) would actually allow this to happen, but cross-thread operations should always be dealt with cautiously and properly. The technical solution is to create a delegate and pass it to the Invoke() method on the control, which runs the code on the thread that owns the control. If you need to do this a lot, though, all the extra functions and delegates can make the code unreadable. Fortunately there are 2 “cheats”. The first is to use the ThreadStart delegate (System.Threading namespace), since it’s already set up as a simple delegate with no parameters and no return value. The second is to use anonymous delegates, which is something I have covered in the past. Combined, they look like this:

if (label1.InvokeRequired) {
  label1.Invoke(
    new ThreadStart(delegate {
      label1.Text = "some text changed from some thread";
    }));
} else {
  label1.Text = "some text changed from the form's thread";
}

Now, the more curious and perceptive of you are asking: why not just call control.Invoke() every time you change a value? Well, the truth is you could, but calling control.Invoke() has a great deal more overhead than not calling it. However, when you skip control.Invoke() and access UI objects directly from a thread other than the UI thread, you’re writing incorrect code that could cause stability problems in your application.

The problem is that a “window” object in the underlying Windowing API in Win32, as represented by the HWND handle, has thread-affinity. It must be directly accessed only from the thread that created it. If it’s not, the results are unpredictable and can cause subtle, intermittent bugs.

By all means skip control.Invoke() if you’ve only got one thread. However, if you need to affect a UI object from a worker thread, you absolutely must use control.Invoke() to transition back to the UI thread before making that call or you’re in for a world of pain.
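
If the InvokeRequired check shown earlier starts to appear all over your code, it can be folded into a single extension method. This is only a sketch: it assumes C# 3.0 or later (for extension methods and the Action delegate), the name InvokeIfRequired is made up, and it targets ISynchronizeInvoke, the interface that Control implements, so that it doesn’t need a reference to Windows Forms itself.

```csharp
using System;
using System.ComponentModel;

// Hypothetical helper; Control implements ISynchronizeInvoke, so this works on any control.
public static class SynchronizeInvokeExtensions
{
    // Runs the action on the target's owning thread: marshals via Invoke when
    // called from another thread, otherwise just calls the action directly.
    public static void InvokeIfRequired(this ISynchronizeInvoke target, Action action)
    {
        if (target.InvokeRequired)
        {
            target.Invoke(action, null);
        }
        else
        {
            action();
        }
    }
}
```

With that in place, the label example shrinks to one line that is safe from either thread: label1.InvokeIfRequired(() => label1.Text = "some text");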

WorkingSet vs WorkingSet64

Recently I’ve been doing a lot of work with the Process.GetCurrentProcess() method. Since the application I am building is being developed on the 64-bit 2.0 framework, I was worried that if I used Process.GetCurrentProcess().WorkingSet64, the application wouldn’t work on 32-bit operating systems, given that WorkingSet is marked as deprecated but still available.

After some thought, though, I realised that WorkingSet returns the amount of memory being used by the process as an int (a 32-bit signed integer). OK, so the maximum value of an int is 2,147,483,647, which is remarkably close to the 2 GB that a 32-bit process can normally have in its working set. Except there is actually a switch in Windows that allows a process to use 3 GB of memory instead of 2 GB. So when you poll WorkingSet on such a process you can get a negative number, and a very large one at that, usually in the realm of -2,147,482,342. The more perceptive of you have guessed the problem already: integer overflow. Values above 2,147,483,647 wrap around into the sign bit.
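
The wrap-around is easy to reproduce without a 3 GB process. The sketch below simply squeezes a 64-bit byte count into an int, which is the kind of truncation that produces those negative readings; the class and method names are made up for illustration.

```csharp
using System;

// Hypothetical demonstration of a 64-bit byte count truncated to 32 bits.
public static class WorkingSetOverflowSketch
{
    // unchecked keeps only the low 32 bits, so values above int.MaxValue
    // come back negative.
    public static int TruncateToInt32(long bytes)
    {
        return unchecked((int)bytes);
    }
}
```

A 3 GB working set is 3,221,225,472 bytes, and TruncateToInt32 turns that into -1,073,741,824. A working set just one byte over the int limit (2,147,483,648 bytes) comes back as int.MinValue, -2,147,483,648, which is why readings just past 2 GB sit so close to that value.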

So, you ask, why didn’t Microsoft just change the API so that WorkingSet returned an Int64 instead of an Int32? Well, they could have, except that doing so would break applications built against the 1.0 and 1.1 frameworks, as this post explains.

But after all this pondering, it turns out that WorkingSet64 does exactly what WorkingSet does, except that it returns an Int64 instead, and as such is far less prone to overflow. It works with both 32-bit and 64-bit Windows and frameworks, and all is good with the world.
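
For completeness, here is a small sketch of reading the value the safe way. WorkingSet64 and Process.GetCurrentProcess() are the real framework members; the wrapper class and the formatting helper are made up for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Globalization;

// Hypothetical wrapper around Process.WorkingSet64.
public static class WorkingSetReport
{
    // Returns the current process's working set in bytes as a 64-bit value.
    public static long CurrentWorkingSetBytes()
    {
        using (Process current = Process.GetCurrentProcess())
        {
            return current.WorkingSet64;
        }
    }

    // Formats a byte count as megabytes with one decimal place.
    public static string Describe(long bytes)
    {
        double megabytes = bytes / (1024.0 * 1024.0);
        return megabytes.ToString("F1", CultureInfo.InvariantCulture) + " MB";
    }
}
```

Calling WorkingSetReport.Describe(WorkingSetReport.CurrentWorkingSetBytes()) gives a readable figure for logging, and the long return type means it stays positive however large the process gets.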