Did you know that in some cases, the intellectual property driving your application can be recovered as source code within minutes? You might be surprised how easily your software could be cracked and downloaded for free, or even worse, sold by some other party.
Obfuscation is one important tool used to guard against these risks. In this article, we provide a high-level summary of this rather technical topic.
No software or system is completely impervious to attack. This is the unfortunate truth we laid bare in a previous blog post highlighting 5 Blatant Truths About Software Licensing Systems and Piracy. However, there are reasonable measures you should take to mitigate the risk of your intellectual property being reverse engineered and/or stolen, and obfuscation is one tool that plays a critical role in doing so.
What is obfuscation?
In the context of software, obfuscation is the act of concealing (or “obscuring”) source code or machine code so that it is difficult for humans to understand. Obfuscation is always relevant to protecting intellectual property, but its importance and how it is applied vary depending on the language or framework used to create your application. We can summarize this in three high-level categories: interpreted languages, intermediate languages, and native languages.
Interpreted Languages
These are languages that are interpreted directly, such as Perl, PHP, PowerShell, VBScript, and others. With these languages, the code (as written by a human) is read by an interpreter, which carries out the instructions written in code. In other words, sending someone a program written in these languages is the same as sending the source code. Consequently, using a tool that obfuscates the source code is extremely important if the source code contains sensitive information and logic.
Intermediate Languages
Intermediate languages are a specific type of interpreted language, where the code written by a human is “compiled” into intermediate language code. When the program is run, a runtime (such as the .NET Common Language Runtime or the Java Virtual Machine) reads the intermediate language code and carries out the instructions, typically by interpreting it or by compiling it to machine code just in time. Some examples of languages that are compiled to intermediate languages include .NET languages (C#, VB.NET, etc.) and Java. Since tools are available to “de-compile” intermediate language code back into something very close to the original source (often including the original class and method names), sending someone a program written in these languages involves much the same risks as sending the source code. This makes obfuscation a very important tool to use when the source code contains sensitive information and logic.
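As a rough illustration of how much survives compilation to an intermediate form, the sketch below uses Python's own bytecode as a stand-in for .NET or Java intermediate code (an analogy only; the formats and tools differ). The function `check_license` and its parameter names are hypothetical and exist only for this example.

```python
import dis

def check_license(license_key, secret_salt):
    """Hypothetical license check, used only for illustration."""
    expected_length = 16
    return len(license_key) == expected_length and license_key.endswith(secret_salt)

# The compiled code object is what the runtime actually executes.
code = check_license.__code__

# Parameter and local variable names are stored alongside the bytecode.
recovered_locals = list(code.co_varnames)

# Names of functions and methods the code refers to are stored as well.
recovered_references = list(code.co_names)

# A human-readable listing of the instructions, with names filled in.
listing = dis.Bytecode(check_license).dis()

print(recovered_locals)
print(recovered_references)
print(listing)
```

Even without the source file, the listing exposes the descriptive names a developer chose, which is exactly the kind of hint an obfuscator tries to remove by renaming.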
Native Languages
Although the term “compiled language” is more commonly used in this context, we are deliberately using a different term so as to exclude intermediate languages. Native languages (such as C and C++) are those in which source code written by a human is compiled into machine code. Since machine code is much more cumbersome for humans to read, a native program is much more difficult to reverse engineer and copy. Additionally, machine code is very difficult to revert to its original state (as written by a human).
Despite how much more difficult it is to work with, it is still possible to follow the logic of native programs by following the instructions being run. Furthermore, the number and types of security features available to these types of programs vary based on the compiler (and compiler version) used, processor architecture, and the operating system version. Consequently, it is important to obscure the application’s most sensitive information and logic.
Where do I start?
The simplest place to start is with an obfuscator, which is a tool that automatically obfuscates source code for you. With interpreted and intermediate languages, the tools available will vary based on the language used. However, when using native languages, automatic obfuscation is usually a feature of a different type of tool known as a “packer.”
Developers: Note that getting the most out of an obfuscator can require some extra care in your programming practices. For example, it’s common to make code publicly accessible (meaning other programs can see it and call it) for unit testing purposes. The drawback is that an obfuscator then cannot mangle or randomize the name of that code, even though it is not really meant to be called from outside your own source code.
Next, manual obfuscation is simply when you (or your developer) manually obfuscate code or data to make it difficult to find and reverse engineer sensitive data and logic. Even when using an obfuscator, this can be a valuable tool because even obfuscated code can potentially be reversed (or de-obfuscated).
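To illustrate why public names limit what an obfuscator can do, here is a toy renaming pass written in Python. It is only a sketch and not how any particular obfuscator works. The class `PrivateRenamer` and the functions `validate` and `_checksum` are hypothetical, and the leading underscore stands in for a “private” access modifier in languages such as C# or Java.

```python
import ast

# Hypothetical program to protect: one public entry point, one private helper.
SOURCE = '''
def validate(key):
    return _checksum(key) == 42

def _checksum(key):
    return len(key) * 7
'''

class PrivateRenamer(ast.NodeTransformer):
    """Toy renaming pass: only names starting with a single '_' are renamed."""

    def __init__(self):
        self.mapping = {}

    def _opaque(self, name):
        # Public names must keep working for outside callers, so leave them alone.
        if name.startswith("_") and not name.startswith("__"):
            return self.mapping.setdefault(name, f"_x{len(self.mapping)}")
        return name

    def visit_FunctionDef(self, node):
        node.name = self._opaque(node.name)
        self.generic_visit(node)
        return node

    def visit_Name(self, node):
        node.id = self._opaque(node.id)
        return node

tree = PrivateRenamer().visit(ast.parse(SOURCE))
obfuscated = ast.unparse(tree)  # requires Python 3.9 or later
print(obfuscated)
```

In the printed result, the private helper has become `_x0`, while `validate` keeps its descriptive name because outside code may call it. Anything made public purely for testing ends up in that second group.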
Manually obfuscated code is often best paired with an obfuscator, since it can otherwise stand out as an obvious target when next to un-obfuscated code. An example of what you might want to obfuscate manually would be private/secret encryption key data and logic that uses it.
Since doing this inherently makes the source code more complicated, you (or your developer) will have to use your best judgment to ensure the obfuscated code is maintainable and does not significantly affect your application’s performance.
Summary
Obfuscation plays a pivotal role in protecting your software from reverse engineering and intellectual property theft. Keep in mind it should be one of many tools you employ for this purpose, and it does not serve as a substitute for other tools such as encryption and verification. Furthermore, if you use any licensing system, it is best to pair it with an obfuscator to help prevent the licensing from being bypassed with a crack/patch.
Although SoftwareKey is not in the obfuscation business, we support and strongly encourage the use of obfuscation to complement the licensing features of the SoftwareKey System. If you’re currently using or considering using the SoftwareKey System, and you have questions about using obfuscation with it, our team is just a click or a call away. Contact us here.
Finally, here are some additional resources in case you are interested in doing some additional reading on this subject: