I always liked Intel's hardware design and documentation. I learned a lot reading the 8086, 80286 and 80386 manuals back when they were new and shiny packages of raw power. Even some of Intel's failed technologies, such as the iAPX 432, were a great inspiration for me in those days.
Application performance was always a very important topic for me, so I was always interested in software profilers. Microsoft ships a command line profiler with Visual Studio, but it's a very awkward tool and it doesn't work well in a multi-threaded environment. I used a few other titles, such as DEC's HiProf or NuMega's TrueTime, but application profilers always suffered from a lack of interest from the development crowd and most of them didn't do well.
Naturally, when I got my hands on Intel's performance analyzer (VTune) in the summer of 2007, I was looking forward to working with another example of Intel's great work.
At first, everything was working just great. VTune instrumented 32- and 64-bit processes effortlessly, produced very detailed call graphs, accurate to the microsecond, and had more in store for profiling threads and monitoring system resources. I thought to myself that I had finally found the tool that would serve me for years.
November 16th, 2007
One day in November 2007 VTune just stopped working: no call graphs were produced. Upon closer examination, I figured out that VTune's instrumentation code was crashing while trying to fix up one of Microsoft's latest runtime libraries. I wrote a detailed report and posted it in Intel's forums.
I didn't get any response from the forum moderators, so I posted the issue on Intel's premium support website, as it was critical for me at the time to use a profiler.
November 18th, 2007
I was pleasantly surprised to hear from Intel Support on Sunday. They notified me that an engineer would look into the problem. I thought to myself: "What a great company - not only did they respond right away, but they also looked into the details and immediately assigned an engineer".
November 21st, 2007
Intel asked for a sample application that would demonstrate the problem. That's reasonable, so I created a sample executable to emulate IIS and a sample DLL to emulate the COM component that was crashing when instrumented by VTune. The simple two-liner sample was crashing immediately - support cases don't get any easier than that.
November 28th, 2007
Surprisingly, Intel couldn't reproduce the problem in their environment and they asked me for executables compiled on my machine. This was strange, because there was no trick to reproducing the crash. I suspected that Intel Support hadn't bothered to install the same version of the OS and VC runtime. That mattered, because VTune was miscalculating offsets for fix-up jumps in the instrumentation code, probably because it relied on a certain version of the VC runtime.
I packaged all executables and captured a full crash dump, so they could look at the post-mortem memory, which included all of the system DLLs, as well as the crash point - a piece of cake for somebody who knows how to look at crash dumps.
At that point I found it odd that every time I added something to the case, I would get a response only the next day, no matter how early in the day I posted. I did some investigation and it turned out that Intel had outsourced VTune support to Russia, which is 8+ hours ahead of Canada, depending on the time zone.
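For readers unfamiliar with the term, here is a minimal Python sketch of what a fix-up jump offset is on x86. It is not VTune's code, which I never saw; the function name `encode_jmp_rel32` and the addresses are made up for illustration. The only fact it relies on is that a near `jmp` (opcode `E9`) stores a 32-bit displacement relative to the address of the next instruction, so anything that patches code has to recompute that displacement for the exact layout of the binary being patched.

```python
import struct

JMP_REL32_LEN = 5  # one opcode byte (E9) plus a 4-byte displacement


def encode_jmp_rel32(source_addr, target_addr):
    """Encode an x86 near jump placed at source_addr that lands on target_addr.

    Hypothetical helper for illustration only.
    """
    # The displacement is measured from the end of the jump instruction.
    disp = target_addr - (source_addr + JMP_REL32_LEN)
    if not -2**31 <= disp < 2**31:
        raise ValueError("target is out of rel32 range")
    # Little-endian signed 32-bit displacement follows the opcode.
    return b"\xE9" + struct.pack("<i", disp)
```

If the patcher assumes the wrong source or target address, for instance because it was tuned for a different build of the runtime, the jump lands somewhere else and the process crashes.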
Intel Support went silent for a few days. Well, I thought, they must be looking for a solution.
December 5th, 2007
It was a critical issue, so I pinged Intel and asked for a status update. Here's the response I got:
OK. I will talk with our engineering to see any update. Engineering team has development work and may work on other bug reports.
Thanks for your patiences.
Wow. Apparently Intel didn't consider this issue critical after all. I created another support case and requested another engineer, preferably one located in North America, so I wouldn't lose a day with every reply.
December 7th, 2007
Intel did what I asked and assigned a support person located somewhere in North America. This got my hopes up - apparently Intel cared enough to take some action. I explained the situation again and Intel disappeared for a few days.
December 17th, 2007
I pinged Intel again and asked them where they were with this case. I also pointed out that I was not happy with the quality of support I was getting.
December 26th, 2007
I got a response that I wasn't using VTune correctly. The support person told me that IIS has a special profiling mode, which VTune uses. The fact that they misspelled IIS as ISS throughout the entire post hinted that they didn't have much experience with either of these. I suggested that we do a WebEx conference, so I could demonstrate the problem on my machine. I figured that since they were not experienced in reading crash dumps, I could just walk them through the crash and show where and how it happens.
December 28th, 2007
Intel informed me that there were two technologies used by VTune to instrument binaries. I figured that one was used to instrument DLLs, so that they could be launched by any process (this is the one that crashed my process), and the other latched onto a specified executable, such as IIS, and instrumented some of the DLLs it loaded.
They also told me that their support engineer ran my sample and was able to reproduce the problem. Isn't it amazing - it took Intel over a month to get to this point. So, all this time, Intel support had been troubleshooting a non-existent problem in the second instrumentation approach. Fine, I thought, better late than never. I was actually happy, hoping that the resolution of this problem wasn't too far off.
January 4th, 2008
Intel support told me that the first technology was obsolete and asked if I could work with the second one. Fine, I tried running the application as they suggested and VTune didn't capture anything.
Intel struggled with the fact that VTune didn't work using the second technology either and until January 10th we exchanged a few messages that can be summed up as "did you do this?"; "yes, I did".
January 11th, 2008
At that point I was quite upset with Intel's inability to troubleshoot problems and asked to talk to a manager. Here's the response I got:
Well, Andre, you wil the prize! You ARE the most difficult customer I have ever had to work with. If you would stop fighting with me and work with me, we might make some progress.
Please answer this simple question: did you or did you not set the instrumentation level of w3wp.exe to minimal BEFORE your ran the activity?!
By this point I had worked out my original performance analysis problem without VTune and was just curious to see how deep a hole Intel would dig for themselves if I simply did what they asked. So I didn't respond to this rudeness and started doing everything they asked me to try.
January 14th, 2008
Intel came up with a very elaborate, 16-item IIS launching sequence. I followed the sequence and VTune didn't capture anything. The result was the same as in my early January tests. I ran the test a few more times, loading various binaries in different combinations, and then, oh miracle, I got my call graph. It appeared that when IIS was listed first in the instrumentation, it would work.
January 16th, 2008
It turned out that I was happy too soon. VTune behaved erratically, failing most of the time and sometimes locking up the machine so badly that I had to restart it, which was a royal pain in the backside. I described all of my experiments to Intel and started to wait for feedback.
January 21st, 2008
Intel wanted me to confirm that I did everything as they requested. I did, and they asked me to provide all of my VTune projects, which contained log files Intel could use. I checked the logs and didn't see anything interesting there. Nevertheless, I packaged all of the projects and provided them to Intel.
February 6th, 2008
Intel got back to me and asked if I could use another of their tools (PTU, I think it was called) to profile my application. Apparently, they didn't have a clue what was going on with VTune. I told them that I would rather they fixed VTune.
February 7th, 2008
I was tired of following pointless instructions and decided to spend some time to get to the bottom of this. I ran a few more tests and checked the system event log, and there I saw an error stating that the system file w3wp.exe (IIS) was damaged and had been replaced from the system cache.
That was it - VTune modified IIS and the system was restoring the original. On those occasions when Windows wasn't quick enough to replace the instrumented binary, VTune created a call graph. Most of the time, though, it just ran the original un-instrumented IIS.
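This kind of silent replacement is easy to confirm by comparing file hashes before and after a run. The sketch below is a hypothetical illustration in Python, not something I ran at the time: `file_digest` and `classify_binary` are made-up helper names, and it only assumes that an instrumented binary differs byte-for-byte from the original.

```python
import hashlib


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def classify_binary(path, original_digest, instrumented_digest):
    """Report which known version of a binary is currently on disk."""
    current = file_digest(path)
    if current == original_digest:
        return "original"
    if current == instrumented_digest:
        return "instrumented"
    return "unknown"
```

Recording the digest of the executable before instrumentation, right after it, and again after the run would show whether the file on disk had been put back to the original in between.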
I let Intel know of my findings.
February 11th, 2008
Intel confirmed this to be a problem with VTune and suggested a couple of hacks. One was published by Microsoft in 2001 and was no longer applicable, and the other required a kernel debugger hooked up to the OS. Neither of these options worked for me and I asked Intel for a better resolution to my problem.
Intel went silent.
March 3rd, 2008
I pinged Intel and received no response.
March 7th, 2008
I asked for a status update once again. This time Intel came up with this response:
VTune(tm) Analyzer callgraph technology instruments or changes the disk image of the binary or executable. Where the OS or system takes special protective action to prevent changes to binaries and files, this callgraph technology, which is core to its design, will not work. As a result we will, unfortunately, be unable to provide a fix for this problem. The best work around at this time will be to use an older release of the OS so you can disable Windows File Protection (WFP).
April 26th, 2008
Intel kept quiet until this date, probably hoping that I would just go away. I reminded them of the case and Intel simply closed it, giving me this excuse for a reply:
This issue was escalated to me for review. We examined the issue and determined that we will not be updating the product to handle call graph for IIS with the new Windows File Protection (WFP) mode. Call graph was originally designed to work with simple .exe files. It was extended to work with IIS and has worked well. The purpose of WFP is to avoid changes to executables that call graph performs. We will not attempt to work around that with the current VTune call graph technology. We will be developing new call graph techniques that will be less intrusive and offer lower overhead. VTune call graph continues to work for .exes and also works where WFP is not enabled.
-Performance, Analysis and Threading customer support manager
So, out of the two technologies VTune is based on, one failed because Intel couldn't figure out how to instrument system binaries (the crash where it all started) and the other failed because they didn't realize how the system works (WFP). You would think that a giant such as Intel would contact Microsoft and try to find some kind of resolution, but no, they didn't even try.
This experience has completely shattered for me the image of Intel as a self-respecting company. In three and a half months of dealing with this support case, Intel showed no technical or troubleshooting skills, no attention to detail and no will to find a solution. What a shame.