One of the things I struggle with is not using Hungarian notation. I don’t want to have to go to the variable definition just to see what type it is. When a project gets extensive, it’s nice to be able to look at a variable prefixed by ‘bool’ and know that it holds true/false rather than a 0/1 value.
I also do a lot of work in SQL Server. I prefix my stored procedures with ‘sp’ and my tables with ‘tbl’, and I prefix the variables in my database code similarly.
I see everywhere that nobody really wants to use Hungarian notation, to the point where they avoid it. My question is, what is the benefit of not using Hungarian notation, and why does the majority of developers avoid it like the plague?
Because its original intention (see http://www.joelonsoftware.com/articles/Wrong.html and http://fplanque.net/Blog/devblog/2005/05/11/hungarian_notation_on_steroids) has been misunderstood and it has been (ab)used to help people remember what type a variable is when the language they use is not statically typed. In any statically typed language you do not need the added ballast of prefixes to tell you what type a variable is. In many untyped script languages it can help, but it has often been abused to the point of becoming totally unwieldy. Unfortunately, instead of going back to the original intent of Hungarian notation, people have just made it into one of those “evil” things you should avoid.
Hungarian notation, in short, was intended to prefix variables with some semantics. For example, if you have screen coordinates (left, top, right, bottom), you would prefix variables holding absolute screen positions with “abs” and variables holding positions relative to a window with “rel”. That way it would be obvious to any reader when you passed a relative coordinate to a method requiring absolute positions.
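The intent can be sketched in a few lines (a hypothetical Java rendering with my own names – windowLeft, absFromRel and so on are not from the original). Both coordinates are plain ints; only the abs/rel prefix carries the semantics:

```java
// Apps-Hungarian intent: the prefix flags the meaning, not the machine type.
public class CoordinateDemo {
    static final int windowLeft = 100; // absolute screen x of the window's left edge

    // Convert a window-relative x position to an absolute screen position.
    static int absFromRel(int relX) {
        return windowLeft + relX;
    }

    public static void main(String[] args) {
        int relMouseX = 25;                    // relative to the window
        int absMouseX = absFromRel(relMouseX); // absolute on the screen
        System.out.println(absMouseX);         // prints 125
        // drawAtAbsolute(relMouseX);  // the "rel" prefix makes this misuse
        //                             // jump out at a reviewer
    }
}
```

The compiler sees two interchangeable ints; the prefix is what lets a human spot a relative value passed where an absolute one is required.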
update (in response to comment by delnan)
IMHO the abused version should be avoided like the plague because:
- it complicates naming. When (ab)using Hungarian notation there will always be discussions on how specific the prefixes need to be.
- it makes variable names longer without aiding the understanding of what the variable is for.
- it sort of defeats the object of the exercise when, in order to avoid long prefixes, these get shortened to abbreviations and you need a dictionary to know what each abbreviation means. Is a ui an unsigned integer? An un-reference-counted interface? Something to do with user interfaces? And those things can get long. I have seen prefixes of more than 15 seemingly random characters that are supposed to convey the exact type of the var but really only mystify.
- it gets out of date fast. When you change the type of a variable, people invariably (lol) forget to update the prefix to reflect the change, or deliberately don’t update it because that would trigger code changes everywhere the var is used…
- it complicates talking about code because, as “@g .” said: “Variable names with Hungarian notation are typically difficult-to-pronounce alphabet soup. This inhibits readability and discussing code, because you can’t ‘say’ any of the names.”
- … plenty more that I can’t recall at the moment. Maybe because I have had the pleasure of not having to deal with the abused Hungarian notation for a long while…
Hungarian notation is a naming anti-pattern in modern-day programming environments and a form of tautology.
It uselessly repeats information, with no benefit and additional maintenance overhead. What happens when you change your int to a different type like long? Now you have to search and replace your entire code base to rename all the variables, or they are now semantically wrong, which is worse than if you hadn’t duplicated the type in the name.
It violates the DRY principle. If you have to prefix your database tables with an abbreviation to remind you that it is a table, then you are definitely not naming your tables semantically descriptively enough. Same goes for every other thing you are doing this with. It is just extra typing and work for no gain or benefit with a modern development environment.
Wikipedia has a list of advantages and disadvantages of Hungarian notation and can thus probably provide the most comprehensive answer to this question. The notable opinions are also quite an interesting read.
The benefit of not using Hungarian notation is basically just the avoidance of its disadvantages:
The Hungarian notation is redundant when type-checking is done by the compiler. Compilers for languages providing type-checking ensure the usage of a variable is consistent with its type automatically; checks by eye are redundant and subject to human error.
All modern Integrated development environments display variable types on demand, and automatically flag operations which use incompatible types, making the notation largely obsolete.
Hungarian notation becomes confusing when it is used to represent several properties, as in a_crszkvc30LastNameCol: a constant reference argument, holding the contents of a database column of type varchar(30) which is part of the table’s primary key.
It may lead to inconsistency when code is modified or ported. If a variable’s type is changed, either the decoration on the name of the variable will be inconsistent with the new type, or the variable’s name must be changed. A particularly well-known example is the standard WPARAM type, and the accompanying wParam formal parameter in many Windows system function declarations. The ‘w’ stands for ‘word’, where ‘word’ is the native word size of the platform’s hardware architecture. It was originally a 16-bit type on 16-bit word architectures, but was changed to a 32-bit type on 32-bit word architectures, or a 64-bit type on 64-bit word architectures, in later versions of the operating system while retaining its original name (its true underlying type is UINT_PTR, that is, an unsigned integer large enough to hold a pointer). The semantic impedance, and hence programmer confusion and inconsistency from platform to platform, rests on the assumption that ‘w’ stands for 16-bit in those different environments.
Most of the time, knowing the use of a variable implies knowing its type. Furthermore, if the usage of a variable is not known, it can’t be deduced from its type.
Hungarian notation strongly reduces the benefits of using feature-rich code editors that support completion on variable names, for the programmer has to input the whole type specifier first.
It makes code less readable, by obfuscating the purpose of the variable with needless type and scoping prefixes.
The additional type information can insufficiently replace more descriptive names. E.g. sDatabase doesn’t tell the reader what it is; databaseName might be a more descriptive name.
When names are sufficiently descriptive, the additional type information can be redundant. E.g. firstName is most likely a string, so naming it sFirstName only adds clutter to the code.
I myself don’t use this notation, because I dislike the unnecessary technical noise. I nearly always know which type I am dealing with and I want a clean language in my domain model, but I write mostly in statically and strongly typed languages.
As an MS SQL Server-specific problem:
Any stored procedure prefixed with ‘sp_’ is first searched for in the master database rather than the one it was created in. This causes a delay in the stored procedure being executed.
IMO the biggest benefit of not using Hungarian is that it forces you to use meaningful names. If you are naming variables properly, you should immediately know what type a variable is, or be able to deduce it fairly quickly, in any well-designed system. If you need to rely on a bln or, worst of all, an obj prefix to know what type a variable is, I would argue it indicates a naming issue – either poor variable names in general, or names way too generic to convey meaning.
Ironically, from personal experience, the main scenarios I have seen Hungarian used in are either “cargo-cult” programming (i.e. other code uses it, so let’s continue to use it just because) or in VB.NET, to work around the fact that the language is case-insensitive (e.g. Person oPerson = new Person because you can’t use Person person = new Person, and Person p = new Person is too vague). I’ve also seen prefixing with “the” or “my” instead (as in Person thePerson = new Person or the uglier Person myPerson = new Person) in that particular case.
I will add that the only time I use Hungarian tends to be for ASP.NET controls, and that’s really a matter of choice. I find it very ugly to type CustomerNameTextBox versus the simpler txtCustomerName, but even that feels “dirty”. I feel some kind of naming convention should be used for controls, as there can be multiple controls that display the same data.
I’ll just focus on SQL Server since you mentioned it. I see no reason to put ‘tbl’ in front of a table. You can look at any T-SQL code and distinguish a table by how it is used. You would never Select from a stored procedure, or Select from table(with_param) like you would a UDF, or Execute tblTableOrViewName like a stored procedure.
Tables could be confused with views, but when it comes to how they are used there is no difference, so what is the point? The Hungarian notation may save you the time of looking it up in SSMS (under Tables or Views?), but that’s about it.
Variables can present a problem, but they need to be declared and really, how far from your declare statement do you plan on using a variable? Scrolling a few lines shouldn’t be that big of a deal unless you’re writing very long procedures. It may be a good idea to break up the lengthy code a little.
What you describe is a pain, but the Hungarian notation solution doesn’t really solve the problem. You can look at someone else’s code and find that the variable type may get changed, which now requires a change to the variable name – just one more thing to forget. And if I use a VarChar, you’re going to need to look at the declare statement anyway to know the size. Descriptive names will probably get you further; @PayPeriodStartDate pretty much explains itself.
To my way of looking at things, Hungarian notation is a kludge to get around an insufficiently powerful type system. In languages that allow you to define your own types, it’s relatively trivial to create a new type that encodes the behavior you’re expecting. In Joel Spolsky’s rant on Hungarian notation he gives an example of using it to detect possible XSS attacks by indicating that a variable or function is either unsafe (us) or safe (s), but that still relies on the programmer to check visually. If you instead have an extensible type system, you can just create two new types, UnsafeString and SafeString, and then use them as appropriate. As a bonus, the type of encode becomes UnsafeString → SafeString, and, short of accessing the internals of UnsafeString or using some other conversion function, encode becomes the only way to get from an UnsafeString to a SafeString. If all your output functions then only take instances of SafeString, it becomes impossible to output an unescaped string [barring shenanigans with conversions such as StringToSafeString(someUnsafeString.ToString())].
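A minimal sketch of the scheme, rendered here in Java (the UnsafeString/SafeString names come from the text above; the particular escaping rules and the render sink are my own illustrative choices):

```java
// Two distinct wrapper types so the type system, not the programmer's eye,
// tracks whether a string has been escaped.
public class SafeStringDemo {
    static final class UnsafeString {
        final String raw;
        UnsafeString(String raw) { this.raw = raw; }
    }

    static final class SafeString {
        final String escaped;
        private SafeString(String escaped) { this.escaped = escaped; }

        // The only sanctioned way from UnsafeString to SafeString.
        static SafeString encode(UnsafeString u) {
            String e = u.raw.replace("&", "&amp;")
                            .replace("<", "&lt;")
                            .replace(">", "&gt;");
            return new SafeString(e);
        }
    }

    // An output sink that only accepts SafeString.
    static String render(SafeString s) { return s.escaped; }

    public static void main(String[] args) {
        UnsafeString input = new UnsafeString("<script>alert(1)</script>");
        System.out.println(render(SafeString.encode(input)));
        // render(input);  // does not compile: an UnsafeString is not a SafeString
    }
}
```

Feeding an un-encoded string to render is now a type error the compiler catches, rather than a prefix a reviewer has to notice.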
It should be obvious why allowing the type system to sanity check your code is superior to trying to do it by hand, or maybe eye in this case.
In a language such as C of course, you’re screwed in that an int is an int is an int, and there’s not much you can do about that. You could always play games with structs but it’s debatable whether that’s an improvement or not.
As for the other interpretation of Hungarian notation, i.e. prefixing with the type of the variable, that’s just plain stupid and encourages lazy practices like naming variables uivxwFoo instead of something meaningful like countOfPeople.
Another thing to add is: what abbreviations would you use for an entire framework like .NET? Yeah, it’s so simple to remember that btn represents a button and txt represents a text box. However, what do you have in mind for something like strbld? What about CompositeWebControl? Do you use something like this:
CompositeWebControl comWCMyControl = new CompositeWebControl();
One of the inefficiencies of Hungarian notation was that, as frameworks grew larger and larger, it proved not only to add no extra benefit, but also to add more complexity for developers, because they now had to learn more and more nonstandard prefixes.
Hungarian notation is almost completely useless in a statically typed language. It’s a basic IDE feature to show the type of a variable by hovering the mouse over it, or by other means; moreover, you can see what the type is by looking a few lines up to where it was declared, if there’s no type inference. The whole point of type inference is not to have the noise of the type repeated everywhere, so Hungarian notation is usually seen as a bad thing in languages with type inference.
In dynamically typed languages, it can help sometimes, but to me it feels unidiomatic. You have already given up on your functions being restricted to exact domains/codomains; if all your variables are named with Hungarian notation, then you are just reproducing what a type system would have given you. How do you express a polymorphic variable that can be an integer or a string in Hungarian notation? “IntStringX”? “IntOrStringX”? The only place I’ve ever used Hungarian notation was in assembly code, because I was trying to get back what I’d have had with a type system – and it was the first thing I ever coded.
Anyway, I couldn’t care less what people name their variables; the code will probably still be just as incomprehensible. Developers waste way too much time on things like style and variable names, and at the end of the day you still get a ton of libraries with completely different conventions in your language. I’m developing a symbolic (i.e. non-text-based) language where there are no variable names, only unique identifiers and suggested names for variables (though most variables still have no suggested name, because there simply does not exist a reasonable name for them); when auditing untrusted code, you can’t depend on variable names.
As usual in such a case, I will post an answer before I read answers from other participants.
I see three “bugs” in your vision:
1) If you want to know the type of a variable/parameter/attribute/column, you can hover your mouse over it or click it and it will be displayed, in most modern IDEs. I don’t know what tools you’re using, but the last time I was forced to work in an environment that didn’t provide this feature was in the 20th century; the language was COBOL – oops, no, it was Fortran – and my boss didn’t understand why I left.
2) Types may change during the development cycle. A 32-bit integer may become a 64-bit integer at some point, for good reasons that had not been detected at the start of the project. So renaming intX to longX, or leaving it with a name that points to the wrong type, is bad karma.
3) What you’re asking for is in fact redundancy. Redundancy is not a very good design pattern or habit. Even humans are reluctant to too much redundancy. Even humans are reluctant to too much redundancy.
I believe being in dire need of hungarian is a symptom.
A symptom of too many global variables … or of having functions too long to be maintainable.
If your variable definition isn’t in sight, usually, you’ve got trouble.
And if your functions don’t follow some memorable convention, there again, big trouble.
That’s… pretty much the reason why many workplaces rule it out, I suppose.
It originated in languages that needed it, in the times of the global-variable bonanza (for lack of alternatives).
It served us well.
The only real use we have for it today is the Joel Spolsky one.
To track some particular attributes of the variable, like its safety.
(e.g. “Does variable safeFoobar have a green light to be injected into an SQL query?” — As it is called safe, it does.)
Some other answers talked about editor features that help you see the type of a variable as you hover over it. In my view, those too are kind of problematic for code sanity. I believe they were only meant for refactoring, like many other features (such as function folding), and should not be relied on for new code.
The reason it is avoided is Systems Hungarian, which violates DRY (the prefix is exactly the type, which the compiler and a good IDE can derive).
Apps Hungarian, OTOH, prefixes with the use of the variable (i.e. scrxMouse is an x coordinate on the screen; this can be an int, short, long or even a custom type – typedefs will even allow you to change it easily).
The misunderstanding of the Systems variant is what destroyed Hungarian as a best practice.
I think the reasons not to use Hungarian notation have been well covered by other posters. I agree with their comments.
With databases I use Hungarian notation for DDL objects that are rarely used in code but would otherwise collide in namespaces. Mainly this comes down to prefixing indexes and named constraints with their type (PK, UK, FK, and IN). Use a consistent method to name these objects and you should be able to run some validations by querying the metadata.
I found a lot of good arguments against, but one I did not see: ergonomics.
In former times, when all you had was string, int, bool and float, the prefixes s, i, b, f would have been sufficient. But with string + short, the problems begin: do you use the whole type name as the prefix, or str_name for a string? (While names are almost always strings – aren’t they?) And what about a Street class? Names get longer and longer, and even if you use CamelCase, it is hard to tell where the type prefix ends and where the variable name begins.
BorderLayout borderLayoutInnerPanel = new BorderLayout();
Okay – you could use underscores, if you aren’t already using them for something else, or use CamelCase if you have been using underscores all along:
BorderLayout borderLayout_innerPanel = new BorderLayout();
Border_layout border_layoutInner_panel = new Border_layout();
It’s monstrous, and if you do it consistently, you will have
int iCount = 0;
for (int iI = 0; iI < iMax - 1; ++iI)
    for (int iJ = iI; iJ < iMax; ++iJ)
        iCount += foo(iI, iJ);
Either way you will end up using useless prefixes for trivial cases, like loop variables or count. When did you last use a short or a long for a counter? And if you make exceptions, you will often lose time thinking about whether a prefix is needed or not.
If you have a lot of variables, they get normally grouped in an object browser which is part of your IDE. Now if 40% start with i_ for int, and 40% with s_ for string, and they are alphabetically sorted, it is hard to find the significant part of the name.
Let’s say we have a method like this (in C#):
public int GetCustomerCount()
{
    // some code
}
Now in code we call it like this:
var intStuff = GetCustomerCount();
// lots of code that culminates in adding a customer
The int doesn’t tell us very much. The mere fact that something is an int doesn’t tell us what’s in it. Now let’s suppose, instead, we call it like this:
var customerCount = GetCustomerCount();
// lots of code that culminates in adding a customer
Now we can see what the purpose of the variable is. Would it matter if we know it’s an int?
The original purpose of Hungarian, though, was to have you do something like this:
var cCustomers = GetCustomerCount();
// lots of code that culminates in adding a customer
This is fine as long as you know what c stands for. But you’d have to have a standard table of prefixes, everyone would have to know them, and any new people would have to learn them in order to understand your code. Whereas countOfCustomers is pretty obvious at first glance.
Hungarian had some purpose in VB before Option Strict On existed, because in VB6 and prior (and in VB .NET with Option Strict Off) VB would coerce types, so you could do this:
Dim someText As String = "5"
customerCount = customerCount + someText
This is bad, but the compiler wouldn’t tell you so. So if you used Hungarian, at least you’d have some indicator of what was happening:
Dim strSomeText As String = "5"
intCustomerCount = intCustomerCount + strSomeText ' that doesn't look right!
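For contrast, here is a rough sketch of the same situation in a strictly typed language (my Java analogue, not from the answer): the mixed addition simply doesn’t compile, so the conversion has to be spelled out, prefix or no prefix.

```java
// With static typing the compiler, not a prefix, flags the bad addition.
public class StrictDemo {
    public static void main(String[] args) {
        String someText = "5";
        int customerCount = 10;
        // customerCount = customerCount + someText; // compile error: the
        //   right-hand side is a String concatenation, not an int
        customerCount = customerCount + Integer.parseInt(someText); // intent explicit
        System.out.println(customerCount); // prints 15
    }
}
```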
In .NET, with static typing, this isn’t necessary. And Hungarian was too often used as a substitute for good naming. Forget Hungarian and choose good names instead.
The one place where I still regularly use either Hungarian or analogous suffixes is in contexts where the same semantic data is present in two different forms, like data conversion. This may be in cases where there are multiple units of measurement, or where there are multiple forms (e.g., String “123” and integer 123 ).
I find the reasons given here for not using it compelling for not imposing Hungarian on others, but only mildly suggestive, for deciding on your own practice.
The source code of a program is a user interface in its own right – displaying algorithms and metadata to the maintainer – and redundancy in user interfaces is a virtue, not a sin. See, e.g., the pictures in “The Design of Everyday Things”, and look at the doors labelled “Push” that look like you pull them, and at the beer taps the operators hacked onto the important nuclear reactor controls because somehow the equivalent of “hovering over them in the IDE” wasn’t quite good enough.
The “just hover in an IDE” is not a reason not to use Hungarian – only a reason some people might not find it useful. Your mileage may differ.
The idea that Hungarian imposes a significant maintenance burden when variable types change is silly – how often do you change variable types? Besides, renaming variables is easy:
Just use the IDE to rename all the occurrences. – Gander, replying to Goose
If Hungarian really helps you quickly grok a piece of code and reliably maintain it, use it. If not, don’t. If other people tell you that you’re wrong about your own experience, I’d suggest that they’re probably the ones in error.