Sunday, February 26, 2017

Top 10 C# projects errors found in 2016

To measure the efficiency of our analyzer, and also to promote the methodology of static analysis, we regularly analyze open source projects for bugs and write articles about the results. 2016 was no exception. This year is especially important as it is the year of the "growth" of the C# analyzer. PVS-Studio has obtained a large number of new C# diagnostics, an improved virtual values mechanism (symbolic execution) and much more. Based on the results of our teamwork, I compiled a kind of chart of the most interesting bugs, found in various C# projects in 2016.

Picture 3
Tenth place: when a minute doesn't always have 60 seconds
I'll start the chart with a bug from the Orchard CMS project. The description of the error can be found in this article. In general, the full list of all the articles we checked is here.
V3118 Seconds component of TimeSpan is used, which does not represent full time interval. Possibly 'TotalSeconds' value was intended instead. AssetUploader.cs 182
void IBackgroundTask.Sweep()
{ 
  ....
  // Don't flood the database with progress updates; 
  // Limit it to every 5 seconds.
  if ((_clock.UtcNow - lastUpdateUtc).Seconds >= 5)
  {
     ....
  }
}
The developer mistakenly used Seconds instead of TotalSeconds in this case. Thus, we will get not the full number of seconds between the dates _clock.UtcNow and lastUpdateUtc, as the developer expected, but only the Seconds component of the interval. For example, for the time interval of 1 minute 4 seconds it will be not 64 seconds, but 4 seconds. Incredible, but even experienced developers make such mistakes.
Ninth place: the expression is always true
The following error is from the article about the analysis of GitExtensions.
V3022 Expression 'string.IsNullOrEmpty(rev1) || string.IsNullOrEmpty(rev2)' is always true. GitUI FormFormatPatch.cs 155
string rev1 = "";
string rev2 = "";

var revisions = RevisionGrid.GetSelectedRevisions();
if (revisions.Count > 0)
{
  rev1 = ....;
  rev2 = ....;
  ....
}
else

if (string.IsNullOrEmpty(rev1) || string.IsNullOrEmpty(rev2)) // <=
{
    MessageBox.Show(....);
    return;
}
Note the else keyword. Most likely, this is not the right place for it. Inattention during refactoring, or just banal tiredness of a programmer, and we are getting a radical change in the logic of the program, which leads to unpredictable behavior. It's great that a static analyzer is never tired.
Eighth place: a possible typo
In the article about the analysis of FlashDevelop source code, we see a peculiar error, caused by a typo.
V3056 Consider reviewing the correctness of 'a1' item's usage. LzmaEncoder.cs 225
public void SetPrices(....)
{
    UInt32 a0 = _choice.GetPrice0();
    UInt32 a1 = _choice.GetPrice1();
    UInt32 b0 = a1 + _choice2.GetPrice0();  // <=
    UInt32 b1 = a1 + _choice2.GetPrice1();
    ....
}
I agree with the analyzer, as well as the author of the article. It feels like a0 should be used instead of the a1 variable in the marked line. In any case, it wouldn't hurt to give more meaningful names to variables.
Seventh place: logic error
This article was written based on the second check of the Umbraco project. An example of an interesting, in my opinion, error from it.
V3022 Expression 'name != "Min" || name != "Max"' is always true. Probably the '&&' operator should be used here. DynamicPublishedContentList.cs 415
private object Aggregate(....)
{
  ....
  if (name != "Min" || name != "Max")
  {
    throw new ArgumentException(
      "Can only use aggregate min or max methods on properties
       which are datetime");
  }
  ....
}
An exception of the ArgumentException type will be thrown for any value of name variable. It's because of the erroneous use of the || operator instead of && in the condition.
Sixth place: incorrect loop condition
The article about the check of Accord.Net has a description of several amusing bugs. I have chosen two, one of which is related to a typo again.
V3015 It is likely that the wrong variable is being compared inside the 'for' operator. Consider reviewing 'i' Accord.Audio SampleConverter.cs 611
public static void Convert(float[][] from, short[][] to)
{
  for (int i = 0; i < from.Length; i++)
    for (int j = 0; i < from[0].Length; j++)
      to[i][j] = (short)(from[i][j] * (32767f));
}
There is an error in the condition of the second for loop, whose counter is the j variable. Using the variable names i and j for the counters is some sort of classic of programming. Unfortunately, these variables are very similar looking, so the developers often make mistakes in such code. I do not think that in this case it is worth any recommendation to use more meaningful names. Nobody will do that anyway. So, here is another recommendation: use static analyzers!
Picture 5
Fifth place: using a bitwise operator instead of a logical operator
One more interesting and quite a wide-spread error from the article about the analysis of Accord.Net.
V3093 The '&' operator evaluates both operands. Perhaps a short-circuit '&&' operator should be used instead. Accord.Math JaggedSingularValueDecompositionF.cs 461
public JaggedSingularValueDecompositionF(....)
{
  ....
  if ((k < nct) & (s[k] != 0.0))
  ....
}
It is obvious that even if the first condition is true, the erroneous use of & instead of && will lead to the check of the second condition, which in turn, will cause array index out of bounds.
Fourth place: quotation mark and... a quotation mark again
The fourth place is taken by the Xamarin.Forms project; the full article about its check is here.
V3038 The first argument of 'Replace' function is equal to the second argument. ICSharpCode.Decompiler ReflectionDisassembler.cs 349
void WriteSecurityDeclarationArgument(CustomAttributeNamedArgument na)
{
  ....
  output.Write("string('{0}')",
    NRefactory.CSharp
.TextWriterTokenWriter
.ConvertString(
(string)na.Argument.Value).Replace("'", "\'"));
  ....
}
In this case, the quotation mark will be replaced with... a quotation mark. I do not think that this is what the developer wanted.
Picture 8
The analyzer did a great job!
Third place: ThreadStatic
In third place, is the Mono project that was also rich in amusing bugs. Here is the article about its check. One of these bugs is a real rarity.
V3089 Initializer of a field marked by [ThreadStatic] attribute will be called once on the first accessing thread. The field will have default value on different threads. System.Data.Linq-net_4_x Profiler.cs 16
static class Profiler
{
  [ThreadStatic]
  private static Stopwatch timer = new Stopwatch();
  ....
}
In a nutshell: the field marked by the ThreadStatic attribute is initialized incorrectly. In the documentation of the diagnostic you may see a detailed description of the situation, and some tips on how to avoid such errors. It's the perfect example of an error that is not that easy to find and fix using the usual methods.
Second place: Copy-Paste - classic!
One of the classical, to my mind, errors of the Copy-Paste type which was found in the already mentioned Mono project (see the full article).
V3012 The '?:' operator, regardless of its conditional expression, always returns one and the same value: Color.FromArgb (150, 179, 225). ProfessionalColorTable.cs 258
Here is a short code fragment, where this bug was found:
button_pressed_highlight = use_system_colors ?
                           Color.FromArgb (150, 179, 225) : 
                           Color.FromArgb (150, 179, 225);
You may ask: "Is it really such a huge mistake?" "Does this obvious mistake really deserve second place in the chart?" The thing is that this code fragment was purposefully formatted to make it more visible. And now imagine that you have the task of finding an error like this without any tools. So, those with weak nerves, we would ask not to look at this; here is a screenshot of a full code fragment with an error (click on the image to enlarge):
Picture 1
Apparently, this mistake can be done by any programmer, regardless of the qualification. Successfully finding such errors demonstrates the full power of static analysis.
First place: PVS-Studio
No, it's not an illusion. "PVS-Studio" really is written here. It took first place in our chart. Not because it is a great static analyzer, but because our team has ordinary people, who make simple human errors when writing code. This was the main topic of an article we wrote previously. With the help of PVS-Studio, we detected two errors in the code of PVS-Studio itself.
V3022 Expression 'RowsCount > 100000' is always false. ProcessingEngine.cs 559
V3022 Expression 'RowsCount > 200000' is always false. ProcessingEngine.cs 561
public void ProcessFiles(....)
{
  ....
  int RowsCount = 
    DynamicErrorListControl.Instance.Plog.NumberOfRows;
  if (RowsCount > 20000)
    DatatableUpdateInterval = 30000; //30s
  else if (RowsCount > 100000)
    DatatableUpdateInterval = 60000; //1min
  else if (RowsCount > 200000)
    DatatableUpdateInterval = 120000; //2min
  ....
}
The result of this code fragment (on condition that RowsCount > 20000) will always be a value DatatableUpdateInterval equal to 30000.
Fortunately, we have already done some work in this sphere.
Picture 4
Thanks to widely used incremental analysis in our team, the articles "bugs in PVS-Studio found by PVS-Studio" are very unlikely to appear.
Conclusion
I should note that anyone can make a personal chart of bugs found using a free version of PVS-Studio static analyzer for checking individual projects. 

Download and try PVS-Studio: http://www.viva64.com/en/pvs-studio/

To purchase a commercial license, please contact us via email. You can also write to us to get a temporary license key for a comprehensive investigation of PVS-Studio, if you want to avoid the limitations of the demo version.

Monday, February 20, 2017

Bugs from the USSR

History is experience that helps the modern generation not to make the same mistakes again. But in programming, as well as in other developing areas, such an ideal scenario is not always possible. Why? Because new languages are still appearing, a lot of processes are becoming more complex, and the machines are getting smarter. In this article, I'll tell you two real stories. What do they have in common? First of all, the time - both of them happened in the USSR. Secondly, the people. These stories could have totally different scenarios if the main characters revealed their best or worst traits. Thirdly, the programming of course, otherwise this article would not be in our blog.
Picture 6

How to prove that you're a real programmer?

The Russian domestic auto industry often gets attacked by the media. Having been an active user of it for some time, I cannot help agreeing with it: the quality of our cars could be better. I am not going to to speculate on the causes of such a situation - I don't have the necessary competence - I'll just tell a story about a bug in AvtoVAZ (Volzhsky Avtomobilny Zavod or Volga Automobile Plant).

Picture 7
It has already been published on various sites, so I will tell it briefly. In the year 1983, a talented mathematician, Murat Urtembayev, was given a job at AvtoVAZ. This young specialist was full of enthusiasm, but the chief officers weren't very encouraging to the new specialist, and gave him an ordinary position.
Then he decided to prove that he was a great programmer, and deserved some respect (a higher appointment, holidays in a spa, a good salary, etc.). His plan was such: he wrote a patch for a counter-program that is responsible for the cyclical rhythm of nodes provision to the conveyor line. The patch would cause a failure to the automatic equipment, and the necessary details wouldn't come at the right time, resulting in chaos on the production line. Right at that moment, Murat would volunteer to fix this issue, and show his bosses the unique abilities of their new, undervalued employee.
But something went wrong. It was an easy task to insert the floppy disk with a virus. The patch was supposed to start working right on the first day after his vacation, which would give him an additional alibi. Finally, he could heroically save AvtoVAZ. But apparently, Murat really was not a very good programmer, because the equipment started glitching 2 days before D-Day. The parts were getting to the conveyor in the wrong order and at the wrong time. Engineers were frantically searching for the technical error; the possibility of a bug in the code occurred to them only later. The incorrect code fragment was found, but the program was still glitching. 
Picture 10
In the end, either conscience or vanity forced Murat Urtembaev to confess. The programmer was convicted of hooliganism, given a suspended sentence, and was ordered to reimburse the cost of two "Zhiguli" cars.

Intuition vs. facts

This story was also widely known, although not all of the facts are available to the general public. Why? Because it's the story of how a nuclear war almost started. September 1983 -the world situation cannot be called 'very favorable'. Reagan, having the post of President of the United States, openly names the SOVIET UNION an "Evil empire". Any provocative action from either one of the parties could lead to the rupture of the stretched string of fragile peace, and the beginning of the war.
100 km away from Moscow, a Soviet military officer, Stanislav Yevgrafovich Petrov, started the night operational shift in the command post, 'Serpukhov-15'. The Lieutenant Colonel was personally monitoring the situation displayed on the screen by satellites.
Picture 8
The surveillance area wasUS territory. Suddenly, a warning appeared on the screen... The United States launched a missile! The siren started howling, the system began an automatic check - everything was correct, there is no mistake! Petrov had to make a well-thought out decision. There were two options:
  • Act according to instructions. Seeing that the missiles are moving towards the USSR, Petrov had to press the button. The duty officers had a nuclear briefcase ready to deliver to the head of the USSR, Yuri Andropov. They had only 30 minutes to strike back. If the USSR had launched rockets on the night of September 26, 1983, a nuclear war could have started.
  • Trust his intuition. Stanislav Yevgrafovich was thinking in the following way: ""There is no missile attack, the computer is being a fool. I am a programmer, I made them. And if I made them, they cannot be smarter than me, the creator" (the source [ru]). It was impossible to ignore the fact that: "... the missile attacks didn't start from a single base, the missiles were taking off from all of them at once".
The data from Petrov's computer was visible to the superior officers, who were at a loss.
Why is Stanislav Yevgrafovich hesitating, and not confirming the attack? They phoned him. The Lieutenant Colonel reported: "The information is false".
Petrov's intuition and experience hadn't failed him. Later, it was proven - the system glitch occurred due to the influence of external factors, which were previously not taken into account: the sensors were affected by the sunlight reflected fromhigh-altitude clouds (source). The computer didn't manage to detect a false signal.
Petrov's act had quite an ambiguous reaction. The USSR respected bureaucracy, and Stanislav Yevgrafovich behaved in contradiction to it. Logically, he should be punished. On the other hand, everybody was well aware that if he had pushed that alarm button, the USSR would have launched the missiles. And this time these would be real rockets, not just erroneous pixels on a display. As a result, Petrov was given a verbal reprimand. Soon after, he left the Army; still as a Lieutenant Colonel.
Picture 9
His act was treated quite differently by the US. He was recognized as a hero and awarded a prize by the UN "in recognition of the part he played in averting a catastrophe." But now Stanislav Yevgrafovich Petrov is still living in Russia, in a simple apartment in Fryazino. He doesn't consider himself to be a hero, but willingly speaks about 1983.
If the biography of this Soviet Officer arouse interest, you may see a documentary "The Man Who Saved the World".

Some philosophy

All modern industrial, and even more so military, installations have the required software installed. Almost all processes are automated, therefore the human presence becomes less necessary. But at the same time, the requirements for software quality increases. That's why it is highly recommended to use additional tools for the analysis of code, and I cannot help mentioning PVS-Studio analyzer. :) 
During our work we should also do the analysis of the safety of our code, in particular, we must search for undeclared functions (software back doors) with specialized tools. This will prevent a situation similar to what happened with the AvtoVaz hacker.
Picture 2

But, in conclusion, none of the programs have intuition, emotions, or free will. So can we really say that they make unbiased decisions? Can we fully trust artificial intelligence? Following the algorithm of the program, the USSR would have launched the missiles in 1983. And now, dear readers, I also suggest reflecting on the topic: "Can we trust a program in making an important decision, or should it be the responsibility of human beings?"

Thursday, February 16, 2017

Porting is a Delicate Matter: Checking Far Manager under Linux

Far Manager, which takes over from Norton Commander, created back in the times of DOS, is one of the most popular file managers on Microsoft Windows. Far Manager facilitates the file system management (file creation, editing, viewing, copying, moving, search, and deletion) and provides means to extend the standard feature set (handling of the network, archives, backup copies, and so on). Far Manager was recently ported to Linux, and there is currently an alpha version available. The PVS-Studio team couldn't ignore that event and miss the opportunity to test the quality of the ported code.
Picture 24

About Far Manager

Far Manager is a console keyboard-oriented file manager for operating systems of the Microsoft Windows family. The project inherits the two-panel layout, the standard (default) color scheme, and the set of keyboard shortcuts from a popular file manager Norton Commander and provides a convenient user interface for handling files and directories (creating, viewing, editing, copying, renaming, deleting, and so on).

Figure 1 - Far Manager 2 on Windows (click to enlarge)
Figure 1 - Far Manager 2 on Windows (click to enlarge)
Far Manager was created by Eugene Roshal. The first version was released on September 10, 1996. The last version (1.65) in whose development Roshal took part is dated June 23, 2000. After that, the Far Group in fact took over the project. It's not until March 29, 2006, that the next version, v1.70, was released. On December 13, 2008, version 2.0 was released and the program became open source. It has been distributed under the revised BSD license ever since. Far Manager versions 1.70 through 2.0 look almost the same, so users can move to newer versions without having to adapt from scratch. Unicode support was added in version 1.80. The latest release, v3.0, is dated November 4, 2016.
On August 10, 2016, the development group released the first test build of the Linux port, Far2l. This build currently features a built-in usable terminal and plugins Align, AutoWrap, Colorer, DrawLine, Editcase, FarFTP, FarLng, MultiArc, NetBox, SimpleIndent, TmpPanel. The source code is distributed under the GPLv2 license.
Figure 2 - Far Manager 2 on Linux (click to enlarge)
Figure 2 - Far Manager 2 on Linux (click to enlarge)

Enough talking

The analyzer output a total of 1038 General Analysis warnings for Far2l project. The chart below shows how the warnings are distributed across the severity levels:
Figure 1 - Warning distribution across the severity levels
Figure 1 - Warning distribution across the severity levels
Let me comment on this diagram briefly. The analyzer output 153 High-level, 336 Medium-level, and 549 Low-level warnings.
This number is relatively large, but we should keep in mind that not each warning is a real bug. Having studied High- and Medium-level messages, I found 250 cases that were very likely to be errors.
For the High and Medium levels, the rate of false positives is about 49%. In other words, every second warning points to a real defect in the code.
Now let's estimate the relative error density. The total number of source lines of code (SLOC) of this project is 538,675. Therefore, the error density is 0.464 errors per 1000 SLOC. One day, I believe, we will gather all these statistical data together and write a summary article about the code quality of different projects.
It should be noted that the error density parameter we have calculated does not reflect the general error density across the whole project: it may be both greater (if the analyzer failed to notice a real bug) and less (if the analyzer reported correct code as faulty). Other projects usually show higher error density, so you can call it a successful port from the code quality viewpoint. However, we strongly recommend that the authors fix the errors found by the analyzer, as they are far from harmless.

Analysis results

One thing you should know before reading on is that the examples discussed below were refactored to make them easier to read. Remember also that these are just the most interesting examples out of all the numerous errors found by PVS-Studio in this project.

Copy-Paste

Picture 28
PVS-Studio diagnostic message: V501 There are identical sub-expressions 'Key == MCODE_F_BM_GET' to the left and to the right of the '||' operator. macro.cpp 4819
int KeyMacro::GetKey()
{
  ....
  DWORD Key = !MR ? MCODE_OP_EXIT : GetOpCode(MR, Work.ExecLIBPos++);
  ....
  switch (Key)
  {
  ....
  case MCODE_F_BM_POP:
  {
    TVar p1, p2;

    if (Key == MCODE_F_BM_GET)
      VMStack.Pop(p2);

    if (   Key == MCODE_F_BM_GET    // <=
        || Key == MCODE_F_BM_DEL 
        || Key == MCODE_F_BM_GET    // <=
        || Key == MCODE_F_BM_GOTO)
    {
      VMStack.Pop(p1);
    }
    ....
  }
  }
}
The Key variable is compared with the MCODE_F_BM_GET constant twice. This must be a typo and the programmer actually intended to compare Key with some other constant. The analyzer found 3 more issues of this kind:
  • V501 There are identical sub-expressions '!StrCmpN(CurStr, L"!/", 2)' to the left and to the right of the '||' operator. fnparce.cpp 291
  • V501 There are identical sub-expressions '!StrCmpN(CurStr, L"!=/", 3)' to the left and to the right of the '||' operator. fnparce.cpp 291
  • V501 There are identical sub-expressions 'KEY_RCTRL' to the left and to the right of the '|' operator. keyboard.cpp 1830
PVS-Studio diagnostic message: V581 The conditional expressions of the 'if' operators situated alongside each other are identical. Check lines: 267, 268. APIStringMap.cpp 268
static BOOL WINPORT(GetStringType)( DWORD type,
                                    LPCWSTR src,
                                    INT count,
                                    LPWORD chartype )
{
  ....
  while (count--)
  {
    int c = *src;
    WORD type1, type3 = 0; /* C3_NOTAPPLICABLE */
    ....
    if ((c>=0xFFE0)&&(c<=0xFFE6)) type3 |= C3_FULLWIDTH; // <=
    if ((c>=0xFFE0)&&(c<=0xFFE6)) type3 |= C3_SYMBOL;    // <=
    ....
  }
  ....
}
The second condition looks like it was written using Copy-Paste and is identical to the first one. However, if this is a conscious decision, the code can be simplified by removing the second condition:
....
if ((c>=0xFFE0)&&(c<=0xFFE6)) type3 |= C3_FULLWIDTH | C3_SYMBOL; 
....
It was not the only error of this type:
  • V581 The conditional expressions of the 'if' operators situated alongside each other are identical. Check lines: 272, 273. APIStringMap.cpp 273
  • V581 The conditional expressions of the 'if' operators situated alongside each other are identical. Check lines: 274, 275. APIStringMap.cpp 275
  • V581 The conditional expressions of the 'if' operators situated alongside each other are identical. Check lines: 6498, 6503. macro.cpp 6503
  • V581 The conditional expressions of the 'if' operators situated alongside each other are identical. Check lines: 1800, 1810. vmenu.cpp 1810
  • V581 The conditional expressions of the 'if' operators situated alongside each other are identical. Check lines: 3353, 3355. wrap.cpp:3355
PVS-Studio diagnostic message: V523 The 'then' statement is equivalent to the 'else' statement. Queque.cpp 358
void FTP::AddToQueque(FAR_FIND_DATA* FileName, 
                      LPCSTR Path, 
                      BOOL Download)
{
  ....
  char *m;
  ....
  if(Download)
    m = strrchr(FileName->cFileName, '/'); // <=
  else
    m = strrchr(FileName->cFileName, '/'); // <=
  ....
}
The second condition in this example must have been also written with the help of "Copy-Paste": whatever the value of Download (TRUEFALSE), the 'm' pointer will be assigned the position of the last occurrence of the '/' character.

Undefined behavior

Picture 37
PVS-Studio diagnostic message: V567 Undefined behavior. The 'Item[FocusPos]->Selected' variable is modified while being used twice between sequence points. dialog.cpp 3827
int Dialog::Do_ProcessSpace()
{
  ....
  if (Item[FocusPos]->Flags & DIF_3STATE)
    (++Item[FocusPos]->Selected) %= 3;       // <=
  else
    Item[FocusPos]->Selected = !Item[FocusPos]->Selected;
  ....
}
We are obviously dealing with undefined behavior here: the Item[FocusPos]->Selected variable is modified twice in one sequence point (an increment and dividing of modulo 3 followed by an assignment).
There was one more fragment with similar undefined behavior:
  • V567 Undefined behavior. The '::ViewerID' variable is modified while being used twice between sequence points. viewer.cpp 117
PVS-Studio diagnostic message: V610 Undefined behavior. Check the shift operator '<<'. The right operand 'sizeof (wchar_t) * 8' is greater than or equal to the length in bits of the promoted left operand. RegExp.cpp 4467
#define rechar wchar_t
#define RE_CHAR_COUNT (1 << sizeof(rechar) * 8)

int RegExp::Optimize()
{
  ....
  for (op=code; ; op=op->next)
  {
    switch (OP.op)
    {
    ....
    case opType:
    {
      for (int i = 0; i < RE_CHAR_COUNT; i++)    // <=
      {
        if (ISTYPE(i, OP.type))
        {
          first[i]=1;
        }
      }
      
      break;
    }
    }
    ....
  }
  ....
}
The error has to do with the fact that the type wchar_t is 4 bytes long on Linux, so signed int (4 bytes) is shifted by 32 bits to the left. As specified by the C++11 standard, when the left operand has a signed type and a positive value, a left-shift by N bytes causes undefined behavior, if N is greater than or equal to the length in bits of the left operand. This is what the fixed version of the code should look like:
#define rechar wchar_t
#define RE_CHAR_COUNT (static_cast<int64_t>(1) << sizeof(rechar) * 8)

int RegExp::Optimize()
{
  ....
  for (int64_t i = 0; i < RE_CHAR_COUNT; i++)
  {
    ....
  }
  ....
}
The analyzer found a few more defects leading to undefined behavior related to the left-shift:
  • V610 Undefined behavior. Check the shift operator '<<'. The right operand 'sizeof (wchar_t) * 8' is greater than or equal to the length in bits of the promoted left operand. RegExp.cpp 4473
  • V610 Undefined behavior. Check the shift operator '<<'. The right operand 'sizeof (wchar_t) * 8' is greater than or equal to the length in bits of the promoted left operand. RegExp.cpp 4490
  • V610 Undefined behavior. Check the shift operator '<<'. The right operand 'sizeof (wchar_t) * 8' is greater than or equal to the length in bits of the promoted left operand. RegExp.cpp 4537
  • V610 Undefined behavior. Check the shift operator '<<'. The right operand 'sizeof (wchar_t) * 8' is greater than or equal to the length in bits of the promoted left operand. RegExp.cpp 4549
  • V610 Undefined behavior. Check the shift operator '<<'. The right operand 'sizeof (wchar_t) * 8' is greater than or equal to the length in bits of the promoted left operand. RegExp.cpp 4561

Incorrect memory handling

Picture 41
Let's start the new section with a little warm-up. Try to spot the bug in the code below by yourself (Hint: it's in the TreeItem::SetTitle function).
class UnicodeString
{
  ....
  UnicodeString(const wchar_t *lpwszData) 
  { 
    SetEUS(); 
    Copy(lpwszData); 
  }
  ....
  const wchar_t *CPtr() const { return m_pData->GetData(); }
  operator const wchar_t *() const { return m_pData->GetData(); }
  ....
}

typedef UnicodeString FARString;

struct TreeItem
{
  FARString strName;
  ....
}

TreeItem **ListData;
void TreeList::SetTitle()
{
  ....
  if (GetFocus())
  {
    FARString strTitleDir(L"{");
    const wchar_t *Ptr = ListData 
                         ? ListData[CurFile]->strName
                         : L""; 
    ....
  }
  ....
}
PVS-Studio diagnostic message: V623 Consider inspecting the '?:' operator. A temporary object of the 'UnicodeString' type is being created and subsequently destroyed. Check third operand. treelist.cpp 2093
Pretty subtle, isn't it? In this example, the ListData[CurFile]->strName variable is an instance of class UnicodeString, which contains an overloaded implicit conversion operator to type const wchar_t*. Now pay attention to the ternary operator in the function TreeList::SetTitle: the second and third operands have different types (UnicodeString and const char [1], respectively). The idea was that if the first operand returns false, then the pointer Ptr will be pointing to an empty string. Since the constructor UnicodeString is not declared as an explicit, in fact, the third operand is chosen as a temporary object, (which, in its turn, will be cast to type const wchar_t*). Further, the temporary object is destroyed and Ptr will point to invalid data. This is what the fixed code looks like:
....
const wchar_t *Ptr = ListData 
                     ? ListData[CurFile]->strName.CPtr()
                     : L"";
....
An interesting thing about the next example is that it triggered two diagnostics at once.
PVS-Studio diagnostic messages:
  • V779 Unreachable code detected. It is possible that an error is present. 7z.cpp 203
  • V773 The function was exited without releasing the 't' pointer. A memory leak is possible. 7z.cpp 202
BOOL WINAPI _export SEVENZ_OpenArchive(const char *Name,
                                       int *Type)
{
  Traverser *t = new Traverser(Name);
  if (!t->Valid()) 
  {
    return FALSE;
    delete t;
  }
  
  delete s_selected_traverser;
  s_selected_traverser = t;
  return TRUE;
}
Well, what do we have here? Firstly, there is, indeed, unreachable code in the if statement's body: if the condition is true, the function exits, returning FALSE. Secondly, that unreachable code simply caused a memory leak: the object pointed to by the pointer is not deleted. To fix these errors, the two statements within the if block need to be swapped.
The next example demonstrates how you can make a mistake when evaluating the size of an object of a class (struct) using a pointer.
PVS-Studio diagnostic messages:
  • V568 It's odd that 'sizeof()' operator evaluates the size of a pointer to a class, but not the size of the 'PInfo' class object. filelist.cpp 672
  • V568 It's odd that 'sizeof()' operator evaluates the size of a pointer to a class, but not the size of the 'PInfo' class object. filelist.cpp 673
int64_t FileList::VMProcess(int OpCode,
                            void *vParam,
                            int64_t iParam)
{
  switch (OpCode)
  {
  ....
  case MCODE_V_PPANEL_PREFIX:           // PPanel.Prefix
  {
    PluginInfo *PInfo = (PluginInfo *)vParam;
    memset(PInfo, 0, sizeof(PInfo));          // <=
    PInfo->StructSize = sizeof(PInfo);        // <=
    ....
  }
  ....
  }
}
Both errors have to do with the function sizeof(PInfo) returning the pointer size (4 or 8 bytes) instead of the expected structure size. Therefore, memset will fill with zeros only the first 4 (8) bytes of the structure, and the PInfo->StructSize field will be assigned the pointer size. Here's the fixed version:
....
PluginInfo *PInfo = (PluginInfo*)vParam;
memset(PInfo, 0, sizeof(*PInfo));
PInfo->StructSize = sizeof(*PInfo);
....
The analyzer found two more defects of this type:
  • V568 It's odd that 'sizeof()' operator evaluates the size of a pointer to a class, but not the size of the 'HistoryItem' class object. history.cpp 594
  • V568 It's odd that 'sizeof()' operator evaluates the size of a pointer to a class, but not the size of the 'handle' class object. plugins.cpp 682

Strange conditions

Picture 39
Another warm-up. Try to find the bug in the code below:
int FTP::ProcessKey(int Key, unsigned int ControlState)
{
  ....
  if(   !ShowHosts 
     && (ControlState == 0 || ControlState == PKF_SHIFT) 
     && Key == VK_F6)
  {
    FTP *ftp = OtherPlugin(this);
    int  rc;

    if(   !ftp 
       && ControlState == 0 
       && Key == VK_F6)
    {
      return FALSE;
    }
    ....
  }
  ....
}
PVS-Studio diagnostic message: V560 A part of conditional expression is always true: Key == 0x75. Key.cpp 493
Note the external and internal conditions: the Key variable is compared with the constant VK_F6. If the execution reaches the internal condition, Key is guaranteed to be equal to VK_F6, making the second check redundant. The improved code will look as follows:
....
if(   !ftp 
   && ControlState == 0)
{
  return FALSE;
}
....
This diagnostic found a few more defects like that:
  • V560 A part of conditional expression is always true: !cps. DString.cpp 47
  • V560 A part of conditional expression is always true: !ShowHosts. FGet.cpp 139
  • V560 A part of conditional expression is always false: !wsz. cnDownload.cpp 190
  • V560 A part of conditional expression is always true: !UserReject. extract.cpp 485
  • And 8 additional diagnostic messages.
PVS-Studio diagnostic message: V503 This is a nonsensical comparison: pointer <= 0. fstd_exSCPY.cpp 8
char *WINAPI StrCpy(char *dest, LPCSTR src, int dest_sz)
{
  if(dest <= 0)   // <=
    return NULL;
  ....
}
This code contains a meaningless comparison of a pointer with a negative value (pointers don't work with memory areas that have negative addresses). This is what the fixed version could look like:
....
if(dest == nullptr)
  return NULL;
....
PVS-Studio diagnostic message: V584 The 'FADC_ALLDISKS' value is present on both sides of the '==' operator. The expression is incorrect or it can be simplified. findfile.cpp 3116
enum FINDASKDLGCOMBO
{
  FADC_ALLDISKS,
  FADC_ALLBUTNET,
  ....
};

FindFiles::FindFiles()
{
  
  ....
  if (   FADC_ALLDISKS + SearchMode == FADC_ALLDISKS     // <=
      || FADC_ALLDISKS + SearchMode == FADC_ALLBUTNET)
  {
    ....
  }
  ....
}
The analyzer detected a strange condition in the first part of a compound conditional expression. Based on the FINDASKDLGCOMBO enumeration, the FADC_ALLDISKS constant has the value 0 and FADC_ALLBUTNET has the value 1. If we use the numerical values in the conditional expression, we'll get the following:
if (   0 + SearchMode == 0
    || 0 + SearchMode == 1)
{
  ....
}
Judging by this code, the whole condition can be simplified:
if (   SearchMode == FADC_ALLDISKS
    || SearchMode == FADC_ALLBUTNET)
{
  ....
}

Incorrect format string handling

Picture 40
PVS-Studio diagnostic message: V576 Incorrect format. Consider checking the fourth actual argument of the 'swprintf' function. The char type argument is expected. FarEditor.cpp 827
void FarEditor::showOutliner(Outliner *outliner)
{
  ....
  wchar_t cls = 
    Character::toLowerCase((*region)[region->indexOf(':') + 1]);
  
  si += swprintf(menuItem+si, 255-si, L"%c ", cls); // <=
  ....
}
This might be a porting error. It has to do with the fact that in Visual C++, the format-string specifiers in the functions printing wide strings are interpreted in a non-standard way: the %c specifier is expecting a wide character (wide char, wchar_t), while on Linux, as specified by the standard, %c is expecting a multibyte character (multibyte symbol, char). The fixed code should look as follows:
si += swprintf(menuItem+si, 255-si, L"%lc ", cls);
 
PVS-Studio diagnostic message: V576 Incorrect format. Consider checking the fourth actual argument of the 'swprintf' function. The pointer to string of char type symbols is expected. cmddata.cpp 257
void CommandData::ReadConfig()
{
  ....
  wchar Cmd[16];
  ....
  wchar SwName[16+ASIZE(Cmd)];
  swprintf(SwName,ASIZE(SwName), L"switches_%s=",Cmd);  // <=
  ....
}
This case is similar to the previous one: the format string contains specifier %s, so a multibyte string (char*) is expected. However, what it receives is a wide string (wchar_t*). This is the fixed code:
swprintf(SwName,ASIZE(SwName), L"switches_%ls=",Cmd);
The analyzer also reported two other instances with incorrectly passed format-string parameters:
  • V576 Incorrect format. Consider checking the third actual argument of the 'fprintf' function. The char type argument is expected. vtansi.cpp 1033
  • V576 Incorrect format. Consider checking the third actual argument of the 'fprintf' function. The char type argument is expected. vtansi.cpp 1038

Conclusion

What conclusions can we draw about the Linux port of Far Manager? True, there are many defects, but it's just an alpha version after all, and the project is still developing. Static code analysis can help you find bugs at the earliest development stage and prevent them from making it to the repository, but to fully feel its advantages, you should run it regularly (or at least during night builds).
I offer you all to try PVS-Studio and evaluate its usefulness for yourself: the analyzer can run on Microsoft Windows and supports deb/rpm-based Linux distributions, allowing you to scan projects quickly and on a regular basis. What's more, if you are a student, an individual developer, or a developer of open-source, non-commercial software, you can use PVS-Studio for free.

In this video tutorial you may see how install PVS-Studio for Linux and check your project (using Far Manager as an example). If you know an interesting project worth checking, you may suggest it on GitHub. Here are more details about that: "Propose a project for analysis by PVS-Studio: now on GitHub".