A bug hunting story

Today I found a bug. It was so interesting that I decided to write a longer post about it.
I created a stripped-down solution with only the classes and methods I need to demonstrate the bug. That is why the story may not seem too realistic.

A long, long time ago I needed a dictionary to store some integers under a key that was based on a string but had some other features (not shown here). So I created the MyKey class for this:

[Serializable]
public class MyKey
{
    private string key = null;

    public MyKey(string key)
    {
        if (key == null)
        {
            throw new ArgumentNullException("key");
        }

        this.key = key;
    }

    private int? hashCode = null;
    public override int GetHashCode()
    {
        int ret = 0;

        if (hashCode == null)
        {
            hashCode = this.key.GetHashCode();
        }

        ret = hashCode.Value;

        return ret;
    }

    public override bool Equals(object obj)
    {
        bool ret = false;

        MyKey other = obj as MyKey;
        if (other != null)
        {
            ret = Equals(other);
        }

        return ret;
    }

    public bool Equals(MyKey other)
    {
        bool ret = false;

        if (other != null)
        {
            if (this.hashCode == other.hashCode)
            {
                if (this.key == other.key)
                {
                    ret = true;
                }
            }
        }

        return ret;
    }

    public override string ToString()
    {
        string ret = String.Concat("\"", key, "\"");
        return ret;
    }
}

It was used happily like this:

// create data
var data = new Dictionary<MyKey, int>();
data[new MyKey("alma")] = 1;

Later I wrote some code to persist these data via serialization.
Everything was working like a charm.

// serialize and save it
var serializedData = Serializer.Serialize(data);
SaveToFile(serializedData);

...

// load and deserialize data
var serializedData = LoadFromFile();
var data = Serializer.Deserialize(serializedData);

There was a use case where some of the values in data had to be changed after deserialization:

// as in deserialized data
var specificKey = new MyKey("alma");
if (data[specificKey] == 1) // a KeyNotFoundException occurs here!
{
    data[specificKey] = 2;
}

KeyNotFoundException? I was sure that every data instance should contain a value for the given key! Let’s see in QuickView:

There is an “alma” key!
Let’s comment out the line causing the exception and check data after the expected value modification to “2”:

Now there are two “alma” keys in the dictionary. Much more interesting, isn’t it?
I quickly put all the data creation, serialization, deserialization code into one unit test to have a working chunk of code I can use for bug hunting:

[TestMethod]
public void TestMethod1()
{
    var d = new Dictionary<MyKey, int>();
    d[new MyKey("alma")] = 1;

    var serialized = Serializer.Serialize(d);

    var data = Serializer.Deserialize(serialized);

    var specificKey = new MyKey("alma");
    if (data[specificKey] == 1)
    {
        data[specificKey] = 2;
    }
}

But in the unit test everything worked! I simply couldn’t reproduce the bug this way.
But when I ran App1, which created and serialized the data, and then App2, which deserialized and modified it, the bug always showed up.
How can there be a duplicate key in a Dictionary<,>? MyKey‘s implementation, especially the Equals() override, is so trivial that it cannot allow two instances created from the same string to be unequal.

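But wait a minute! Equals() compares the cached hashCode values first, and the cached value is serialized together with the key.

How can the hashCodes differ?!?!?!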

Yes. A quick search on the net answers everything. MSDN clearly describes in a big “Important” box:

The hash code itself is not guaranteed to be stable. Hash codes for identical strings can differ across versions of the .NET Framework and across platforms (such as 32-bit and 64-bit) for a single version of the .NET Framework. In some cases, they can even differ by application domain.

As a result, hash codes should never be used outside of the application domain in which they were created, they should never be used as key fields in a collection, and they should never be persisted.

App1 was running in an x86 environment and App2 in an x64 one. That’s why the string hash codes differ.

The fix is really easy. Just exclude the cached hashCode from serialization:

[Serializable]
public class MyKey
{
   ...

   [NonSerialized]
   private int? hashCode = null;
   ...
}

Now hashCode is recalculated once in every runtime environment.
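The same idea can be tried outside .NET: a cached hash is process-local state and should stay out of the serialized form. Below is a small Python analogue, not the C# class above. Python's string hashes are also randomized per process by default, and __getstate__ plays the role that [NonSerialized] plays in the fix. Comparing only the key strings in __eq__ is an assumption of this sketch, not part of the original fix.

```python
import pickle


class MyKey:
    """Python analogue of MyKey: a string key with a lazily cached hash."""

    def __init__(self, key):
        if key is None:
            raise ValueError("key must not be None")
        self.key = key
        self._hash = None  # cache, only valid inside the current process

    def __hash__(self):
        if self._hash is None:
            self._hash = hash(self.key)
        return self._hash

    def __eq__(self, other):
        # Equality depends on the key string only, never on the cached hash
        return isinstance(other, MyKey) and self.key == other.key

    def __getstate__(self):
        # Persist everything except the cached hash
        state = self.__dict__.copy()
        state["_hash"] = None
        return state


data = {MyKey("alma"): 1}
restored = pickle.loads(pickle.dumps(data))

# The lookup works because the hash is recomputed after loading
if restored[MyKey("alma")] == 1:
    restored[MyKey("alma")] = 2
```

After the round trip the dictionary still has a single “alma” entry, because the loaded key computes its hash again in the process that uses it.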

I had never thought about the possibility of unstable hash codes.
I hope I am not the only man in the world with such a wasteful brain.

IE11 on Win7: no password sent during Basic Authentication

Today I set up a download directory for one of our customers. I turned on basic authentication, set a nice /=%27gujdw765 password and tried it out from Firefox.
Everything worked. Then I tried via IE11. IE started to prompt me for credentials again and again.

I checked the error log on the web server and found password mismatch messages. I tried again with the same results.
I tried on my second machine, with the same results again.

I sent over the info to my deskmate. He tried the same in IE and it worked for him!
What the hell is happening to my machines?

I modified the Apache config to log the Authorization header value:

CustomLog ${APACHE_LOG_DIR}/access.log  "\"%{Authorization}i\""

Now the access log made it clear that my IE did not send my password over the wire!
E.g. after typing user name “myuser” and password “blahblah”:

"Basic bXl1c2VyOgo="
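The logged value is plain Base64, so it can be decoded to see what the browser really sent. A minimal Python sketch (the helper name decode_basic_auth is made up for this illustration):

```python
import base64


def decode_basic_auth(header_value):
    """Split a 'Basic <base64>' Authorization value into (user, password)."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    # Everything before the first colon is the user name, the rest is the password
    user, _, password = decoded.partition(":")
    return user, password


user, password = decode_basic_auth("Basic bXl1c2VyOgo=")
print(repr(user), repr(password))
```

For the value above the user part is myuser and the password part holds nothing but a line feed; “blahblah” is nowhere in it.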

I started some experiments and found that sometimes the password was sent and sometimes not.
But why?

You will never guess.

Because I was copying and pasting that nice password via the clipboard AND used the Shift+Insert key combination for pasting!
During my experiments I typed “12345” directly into the password field and it was sent gracefully.
My deskmate was using Ctrl+V, which worked too.
But no one was able to send a password pasted via Shift+Insert!

After this it was easy to find the official info: https://support.microsoft.com/en-us/kb/2547752
It has been a known problem since 2011 and there is a fix for it.
But why hasn’t it been distributed via Windows Update since then?

assemblyBinding not working in web.config?

Check your config’s configuration tag! Does it have a namespace? E.g.:

<configuration xmlns="http://schemas.microsoft.com/.NetConfiguration/v2.0">
   <runtime>
      <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
         <dependentAssembly>
            <assemblyIdentity name="SomeAssembly" publicKeyToken="..." />
            <bindingRedirect oldVersion="1.0.0.0" newVersion="2.0.0.0" />
         </dependentAssembly>
         ...

Remove the namespace!

<configuration>
...

Voilà, it starts working!
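If there are many config files to check, the test can be scripted. A small Python sketch, assuming the file content is already loaded into a string (root_namespace is a made-up helper name):

```python
import xml.etree.ElementTree as ET


def root_namespace(config_xml):
    """Return the namespace URI of the root element, or None if it has none."""
    root = ET.fromstring(config_xml)
    # ElementTree writes namespaced tags as '{uri}localname'
    if root.tag.startswith("{"):
        return root.tag[1:].split("}", 1)[0]
    return None


with_ns = '<configuration xmlns="http://schemas.microsoft.com/.NetConfiguration/v2.0"><runtime /></configuration>'
without_ns = "<configuration><runtime /></configuration>"

print(root_namespace(with_ns))
print(root_namespace(without_ns))
```

Any non-None result for the configuration element is the namespace to remove.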

Credits to this poster.

Drupal SSO: An unsupported mechanism was requested…

Here is a step that may help you debug the error above.

Today I was setting up SSO on a Drupal page against MS AD. Something went wrong and I found the following message in the site’s error_log:

gss_accept_sec_context() failed: An unsupported mechanism was requested (, Unknown error)

Let’s modify Apache’s config a little bit and add this at the appropriate place (global or vhost level):

 LogLevel debug

Restart apache and check the log again:

kerb_authenticate_user entered with user (NULL) and auth_type Kerberos
Acquiring creds for HTTP/intranet.kesz.hu@KESZ.HU
Verifying client data using KRB5 GSS-API
Client didn't delegate us their credential
Warning: received token seems to be NTLM, which isn't supported by the Kerberos module. Check your IE configuration.
GSS-API major_status:00010000, minor_status:00000000
gss_accept_sec_context() failed: An unsupported mechanism was requested (, Unknown error)

So the real problem was: Warning: received token seems to be NTLM, which isn’t supported by the Kerberos module. That is much more informative than the unknown error we had before!

Don’t forget to set LogLevel back after you have finished, because the log file quickly becomes really large…

hpacucli error: no controllers detected.

I wanted to check the P400 controller inside an HP ProLiant DL180 G5 server which was running x64 Linux with a 2.6.31.6 kernel. HP has a utility named hpacucli designed for this kind of task, so I looked for an appropriate download link. After finding and downloading some versions (I never understood the logic behind HP’s software version numbering) I realized that none of them could see my P400 controller:

root@xerxes:~/opt/compaq/hpacucli/bld# ./.hpacucli ctrl all show

Error: No controllers detected.

The Internet is full of solutions to this problem:
– Use the uname26 utility to hide your 3.x kernel! But I have 2.6…
– Run modprobe sg before hpacucli! But sg was already compiled into the kernel…
– etc…

After some hours of trials I ran strace on the hpacucli command and found that it tried to open its libcpqimgr library, which was placed in the same directory but not on LD_LIBRARY_PATH, so it didn’t find it. So “No controllers detected” actually means “Hey guy, I didn’t find my right hand which I can use for detecting controllers!”. Nice.

A little modification to the command line:

root@xerxes:~/opt/compaq/hpacucli/bld# LD_LIBRARY_PATH=. ./.hpacucli ctrl all show

Smart Array P400 in Slot 6                (sn: PAFGL0M9VWK002)

I understand that if I had installed the package and used the appropriate script provided by HP, I would never have met this situation. So this is an unusual case. But maybe others run into the same problem, so remember: strace is (one of) your best (non-human) friends!

CLR20r3 FileLoadException and Why Keep Settings from .config?

Today one of my colleagues tried to start a newer version of one of our .NET tools, which had just been copied from the deployment share, on her Win7 computer.

The app didn’t start, only Windows Error Reporting was doing something in the systray. In the Eventlog there was an error 22 with CLR20r3, mentioning a FileLoadException. P4’s value: PresentationFramework. A short check of the installed frameworks, etc.: everything seemed to be fine. Nothing useful was found in the generated WER file either.

The tool was running well on others’ machines. What happened with hers?

In the app’s .config file there were custom app settings in the appSettings section, which is why we kept the original file from the previous installation. The problem was that the .config also contained an assemblyBinding section with a bindingRedirect:

<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="AnAssembly" publicKeyToken="123123123123" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-1.0.33334.0" newVersion="1.0.33334.0" />
      </dependentAssembly>
     ...

The new code had a newer version of AnAssembly, which wasn’t used because of the bindingRedirect above! The new .config had the updated version numbers, but we overwrote it with the previous version of the file because we wanted to keep the correct appSettings values. The appSettings section in the .config is very handy, but using it is a bad idea because framework configuration is kept in the same place. Keeping the settings separately seems a better idea.
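Before reusing an old .config, it may be worth comparing its binding redirects with those in the freshly shipped file. A rough Python sketch of such a check: the shipped version number 1.0.40000.0 and both helper names are invented for this illustration, the rest mirrors the snippet above.

```python
import xml.etree.ElementTree as ET

ASM = "{urn:schemas-microsoft-com:asm.v1}"


def binding_redirects(config_xml):
    """Map assembly name -> newVersion for every bindingRedirect in a .config."""
    root = ET.fromstring(config_xml)
    redirects = {}
    for dep in root.iter(ASM + "dependentAssembly"):
        identity = dep.find(ASM + "assemblyIdentity")
        redirect = dep.find(ASM + "bindingRedirect")
        if identity is not None and redirect is not None:
            redirects[identity.get("name")] = redirect.get("newVersion")
    return redirects


def stale_redirects(kept_xml, shipped_xml):
    """Return {name: (kept, shipped)} where the kept file redirects elsewhere."""
    kept = binding_redirects(kept_xml)
    shipped = binding_redirects(shipped_xml)
    return {
        name: (kept[name], shipped[name])
        for name in kept
        if name in shipped and kept[name] != shipped[name]
    }


TEMPLATE = """
<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="AnAssembly" publicKeyToken="123123123123" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-{v}" newVersion="{v}" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>
"""

kept_config = TEMPLATE.format(v="1.0.33334.0")     # file kept from the old install
shipped_config = TEMPLATE.format(v="1.0.40000.0")  # hypothetical new version

print(stale_redirects(kept_config, shipped_config))
```

A non-empty result means the kept file still points at an older assembly version than the one shipped with the new build.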

PS.: What about PresentationFramework in the P4 value? Completely misleading info…

Jenkins: Publish Over CIFS plugin

If you run Jenkins on a Windows machine, I may save you some hours.

I wanted to copy some files from my job’s workspace to a network share which needed authentication. XCOPY wasn’t enough for that, so I looked for some other solution. The Publish Over CIFS plugin can handle this situation, so I installed and configured it and added a new “Send build artifact to windows share” post-build step to my job.

After a quick build I found these lines at the end of the build’s console output:

CIFS: Connecting from host [atlas]
CIFS: Connecting with configuration [fileserver DEPLOY share] ...
CIFS: Removing WINS from name resolution
CIFS: Setting response timeout [30 000]
CIFS: Setting socket timeout [35 000]
CIFS: cleaning [smb://fileserver.mecset.local/DEPLOY/ArdinTemplatingRedmineConnector/]
CIFS: Disconnecting configuration [fileserver DEPLOY share] ...
CIFS: Transferred 0 file(s)

0 files transferred? I naturally have some result files, so I checked paths, etc. No errors but no files copied.

After a short 2 hours of trial and error I realized: even though the hosting Windows isn’t, the plugin IS CASE SENSITIVE! Grrrrr…

My source files were set to:

ArdinTemplatingRedmineConnector/bin/debug/**

After I changed it to:

ArdinTemplatingRedmineConnector/bin/Debug/** 

everything worked fine.
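A quick way to spot this kind of problem is to list the files that match the pattern only when letter case is ignored. A small Python sketch: fnmatch is used here as a stand-in and is not the plugin’s own pattern matcher, and the file name Connector.dll is made up.

```python
from fnmatch import fnmatchcase


def case_only_mismatches(pattern, paths):
    """Return paths that match the pattern only when letter case is ignored."""
    return [
        p for p in paths
        if not fnmatchcase(p, pattern)
        and fnmatchcase(p.lower(), pattern.lower())
    ]


# Hypothetical workspace content
workspace_files = ["ArdinTemplatingRedmineConnector/bin/Debug/Connector.dll"]
pattern = "ArdinTemplatingRedmineConnector/bin/debug/**"

print(case_only_mismatches(pattern, workspace_files))
```

Every path it returns is a file that a case-sensitive matcher skips silently.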

Nothing was written about this on the plugin’s wiki pages, so if some search engine brought you here, I hope it helps.