Over the years I’ve changed my mind multiple times about code comments.
In this article, I’ll explain how my commenting practice (yes, that’s a thing :p) has evolved as well as what I currently recommend, whether you’re a junior fresh out of school, a seasoned developer or a team lead.
My journey
At the beginning of my career as a software developer, I used to write tons and tons of comments to remind me of why and how things worked.
As I grew more accustomed to the recurring patterns and got used to the weird (read: horrendous) APIs of some libraries, I progressively wrote fewer and fewer comments.
The reason behind this evolution is that as your experience grows, you need fewer and fewer explanations about the how.
What remains pretty much constant, independently of your experience level, is the need to have an understanding of the rationale/reasoning behind certain implementation details.
Having a strong understanding of the language/technology being used is key, but doesn’t tell you the whole story. Without hints about the intent of the code, things can get blurry real quick.
I once joined a really large project where nobody on the team even knew why some areas of the system were there. And that means trouble…
Nowadays, I tend to write comments mostly to explain why some sub-systems exist, why they’re structured the way they are or why a certain data structure has been chosen over another (e.g., for performance reasons).
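To make this concrete, here is a minimal sketch of a "why" comment that records a data structure choice. The `RecentEvents` class and its monitoring use case are hypothetical, invented only for illustration; the point is that the comment explains the reason for the choice, not what the code does.

```python
from collections import deque


class RecentEvents:
    """Keeps the most recent events of a (hypothetical) monitoring service."""

    def __init__(self, capacity: int) -> None:
        # Why a deque rather than a list: once the buffer is full, the
        # oldest event is evicted on every insert. list.pop(0) is O(n),
        # whereas a bounded deque drops its oldest item in O(1).
        self._events = deque(maxlen=capacity)

    def record(self, event: str) -> None:
        self._events.append(event)

    def snapshot(self) -> list[str]:
        return list(self._events)
```

Nothing in the code itself would tell a newcomer why a plain list was not good enough; the comment does.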
UML schemas and wiki documentation can also be useful for higher level explanations, but I tend to avoid creating too many of those as they’re far away from the code and really hard to maintain.
I still like to use comments to highlight the “danger zones”. That is: critical pieces of the code that should only ever be touched with care. Those are useful, as the most sensitive code paths in a system have usually been battle tested and just work. This is not to say that we can’t refactor such code, but it has to be done with care (even if automated tests are in place). Sometimes there’s a subtle bug fix; sometimes it’s a matter of performance.
What I also often do is add references as comments; for instance links to the documentation of specific APIs or features that are being used, or pointers to relevant Stack Overflow questions.
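As an illustration, a "danger zone" warning and a reference comment could look like the sketch below. The `total_amount` function and the reconciliation step it mentions are assumptions made up for the example; `math.fsum` is part of the Python standard library.

```python
import math


def total_amount(line_amounts: list[float]) -> float:
    # DANGER ZONE: do not swap math.fsum() for the built-in sum() lightly.
    # On older Python versions sum() can accumulate floating-point rounding
    # errors, which a (hypothetical) downstream reconciliation step would
    # report as a mismatch.
    # Reference: https://docs.python.org/3/library/math.html#math.fsum
    return math.fsum(line_amounts)
```

The warning says what is fragile and why, and the link saves the next reader a search.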
为什么注释写太多反而不好?
As a junior developer, when a codebase is littered with comments, you might feel safe, as you see tons of helpful messages to guide/reassure you and clear out any doubts.
As time goes by, though, you’ll realize that those comments are often out of sync with the code. Once you’ve noticed this a few times, you’ll start paying less and less attention to them, until you just ignore them. At least that’s one possible reaction; it’s related to the broken windows theory, and the same is true of bad/ugly code and a lack of attention to technical debt. The alternative is to systematically fix the comments, which is better but also has an associated cost.
The main issue with having too many comments is that they are not “safe” / “type safe” / “compiled”. Nothing apart from us, the humans reading/writing them, can make sure that they’re still correct/relevant. In a sense we’re the comment parsers; it’s up to us to keep them relevant.
Comments are metadata; they live in another “dimension”, independent of the code itself. More importantly, comments have an associated maintenance cost. Each and every comment line is one more thing to maintain in the project. To me, the decay of code comments is also technical debt of sorts.
The more comments you have in your codebase, the more costly it becomes to maintain. This fact alone is a good reason to write fewer comments.
That said, writing too few comments is not good either; it’s always a question of balance. You should at least document the rationale behind important design choices and the reasons why elements exist in your system.
Things such as who the author of something is, what the filename is, when it was modified, etc. don’t belong in comments. Source control takes care of that.
Copyright headers also don’t make any sense; if you need those, then take them out of your code and move them into your build. Create a template and let your build system insert whatever notice you need in the generated artifacts.
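A build step that adds the notice could be as small as the sketch below. The `with_notice` helper, its parameters and the `# ` comment prefix are all assumptions for illustration; a real build would use whatever templating its build system already provides.

```python
def with_notice(source: str, notice: str, comment_prefix: str = "# ") -> str:
    """Return `source` with `notice` prepended as comment lines."""
    # Each line of the notice template becomes one comment line.
    header = "\n".join(comment_prefix + line for line in notice.splitlines())
    return header + "\n\n" + source
```

A (hypothetical) packaging script would call this on each file it copies to the output directory, so the notice lives in one template instead of in every source file.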
A word about commented-out code
As surprising as it is (to me at least), many experienced software engineers tend to comment out sections of code, thinking that they might need to recover those or “re-enable” that code later on.
I’m certainly not the only one to say this, but don’t. Just don’t. Commented-out code is pure noise. Not only that, but it is even dangerous.
In addition, commented-out code is still code that has to be maintained, but in most cases it isn’t. The more commented-out code you see, the less you pay attention to it. If you do decide to uncomment those lines after some time has passed and that code hasn’t been maintained along with the rest, it might end up introducing bugs (or worse).
Whenever you think about commenting out some code, just forget it. Don’t. Delete it. Right there.
If that code was never committed, then it doesn’t matter; it was just an idea; forget about it.
If that code was previously in the codebase, then removing it altogether now doesn’t mean that it is gone forever. Your source control management system is there exactly for that purpose. If you end up needing that code ever again, then you’ll dive into the history of your project and you’ll find it there, safe and sound.
Whenever I notice commented-out code, I don’t hesitate one bit: I delete it right away. And you should do that too. Less is more.
Use logging to explain what is happening
Code comments are relevant for maintenance; they help your teammates, your successors and even your future self to know why things are there and what the rationale behind the architecture/design choices is.
On the other hand, as I’ve explained above, comments detailing what the code is doing are mostly useless, misleading and costly to maintain. In contrast, log statements that explain what the code is doing are incredibly valuable for any production system. When things go awry in production, you’ll be happy to find log files filled with useful troubleshooting information.
If you think about writing a comment to explain what the code is doing, then you should instead add a log statement, with the correct granularity (I’ll soon write an article about that!).
Nowadays, what I tend to do when I notice that there are too many comments is to immediately remove those that are useless/outdated. If I notice that there’s no or not enough logging, then I add some log statements.
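Here is a minimal sketch of that swap, using Python’s standard `logging` module. The `apply_discount` function, the `orders` logger name and the message format are hypothetical and only serve the example.

```python
import logging

logger = logging.getLogger("orders")  # hypothetical logger name


def apply_discount(order_id: str, total: float, rate: float) -> float:
    # Instead of a comment such as "apply the discount to the total",
    # the log statement below describes the same step, and it is also
    # visible when troubleshooting in production.
    discounted = round(total * (1 - rate), 2)
    logger.info(
        "Order %s: applied %.0f%% discount, total %.2f -> %.2f",
        order_id, rate * 100, total, discounted,
    )
    return discounted
```

The message only shows up once logging is configured, for instance with `logging.basicConfig(level=logging.INFO)`; choosing the level is where granularity comes in.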
Write automated tests to explain how things work
Behavior-Driven Development (BDD) is all about creating a shared understanding of how the system works. By applying BDD, you’ll create tests that double as specifications for the code they cover.
This is awesome because tests are strongly tied to the code being tested, so it is much harder to let them fall behind. Tests can be statically checked along with the rest of your code. Moreover, if a test fails, then you know that you either need to adapt the specs/tests or fix the broken code. Isn’t that great? To me it is, and certainly much more helpful than bogus comments!
Do yourself a favor; whenever you feel like writing a comment explaining the “how”, write tests instead.
By the way, BDD is awesome for many other reasons, so if you’re not familiar with that, make sure to read a few articles about it and give it a try in your next projects.
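To give a rough idea of tests that read as specifications, here is a sketch in plain Python using the Given/When/Then structure. The `Cart` class is hypothetical and exists only so there is something to specify; dedicated BDD frameworks offer richer tooling than this.

```python
class Cart:
    """Hypothetical shopping cart, only here to have something to specify."""

    def __init__(self) -> None:
        self._items = {}

    def add(self, sku: str, quantity: int = 1) -> None:
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self._items[sku] = self._items.get(sku, 0) + quantity

    def quantity_of(self, sku: str) -> int:
        return self._items.get(sku, 0)


def test_adding_the_same_item_twice_increases_its_quantity() -> None:
    # Given a cart that already contains one "book"
    cart = Cart()
    cart.add("book")
    # When the same item is added again
    cart.add("book")
    # Then the cart holds two of them instead of a duplicate entry
    assert cart.quantity_of("book") == 2


def test_adding_a_non_positive_quantity_is_rejected() -> None:
    # Given an empty cart
    cart = Cart()
    # When a quantity of zero is added, then the cart refuses it
    try:
        cart.add("book", 0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError")
    # And the cart stays empty
    assert cart.quantity_of("book") == 0
```

The test names state the expected behavior, and unlike a comment, they fail loudly as soon as the code stops matching them.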
Conclusion
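In this article, I’ve shared some thoughts about writing code comments.
It may sound like a boring subject, but comments really are an important part of your projects. They can be useful if they’re relevant and up to date, but they can easily be misleading if they’re outdated.
Useful comments are mostly about the “why”, not the “how”.
That was long, so here is the short version:
- Don’t write too many comments. When you do need them, focus on explaining the rationale/intent, or on important warnings, external references and the like.
- Don’t forget about comments when you maintain code; they’re part of your technical debt too. Don’t ignore them: delete the useless ones and update the rest when you change the code.
- If the goal is to explain what is happening, use logging instead. It will also help with troubleshooting in production.
- If the goal is to explain why something exists and how it works, write automated BDD tests.
- Don’t comment out code you no longer use; delete it.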
Translated from https://itnext.io/how-to-write-code-comments-like-a-pro-c830e68cec92 by ismdeep