Preface to Software Encryption and Decryption

Surreptitious software is a branch of computer security research that has emerged over the past ten years. Research in this area borrows not only from computer security but also from many other fields of computer science, such as cryptography, steganography, digital watermarking, software metrics, reverse engineering and compiler optimization. We use these techniques to meet a common need that shows up in many different forms: storing secret information safely inside a computer program. The word "secret" has a broad meaning in this book, and the techniques it introduces (code obfuscation, software watermarking and fingerprinting, tamperproofing, software "birthmarks", and so on) are used to keep others from stealing the intellectual work embodied in software. For example, fingerprinting can be used to trace the source of pirated copies, code obfuscation makes it harder for attackers to reverse engineer the software, and tamperproofing makes it harder for others to produce cracked versions.

Now let's talk about why you should read this book, who uses surreptitious software, and what the book covers.

Why are you reading this book?

Unlike traditional security research, surreptitious software is not concerned with keeping computers from being infected by viruses. It is concerned with how a virus author can keep others from analyzing the virus! Likewise, we don't care whether a program contains security holes. We care about how to secretly add code to a program that executes only when the program has been tampered with. In cryptography, the security of encrypted data rests on keeping the encryption key secret; what we study is how to hide that key inside a program. Software engineering offers many software metrics for making sure a program is well structured. This book uses the same metrics to make programs convoluted and hard to read. Many of the techniques described here are algorithms built on compiler optimization technology, but whereas a compiler optimizer tries to make the generated program as small and as fast as possible, some of the techniques in this book make the generated program larger and slower. Finally, traditional digital watermarking and steganography hide information in images, audio, video or even plain text, while surreptitious software hides it in computer code.

So, why read this book? Why would you want to learn about a security technology that cannot stop viruses or worms from attacking a computer? Why study compiler optimization techniques that only make code bigger and slower? Why spend your energy on a branch of cryptography that violates the basic premise of cryptography (namely, that the attacker cannot obtain the key)?

The answer is that traditional computer security and cryptography research sometimes cannot solve security problems that urgently need solving in practice. For example, this book shows how to use software watermarking to fight software piracy. A software watermark is a unique identifier embedded in a program (such as a customer's credit card number or your copyright notice). It ties a copy of the program to you (the author) or to the customer. If you find pirated CDs of your software on the market, you can extract the watermark from a pirated copy and trace the customer who bought the original from you. You can also watermark the beta version of a newly developed game before handing it to your partners. If you suspect that someone leaked your code, you can identify the culprit among your many partners and take him to court.
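To make the idea concrete, here is a minimal Python sketch of fingerprinting. It is our own toy illustration, not an algorithm from the book: the names `SLOTS`, `TEMPLATE`, `embed`, `extract` and `run` are hypothetical, and the scheme would not survive even a mildly determined attacker. Each "slot" is a statement with two semantically equivalent spellings, and the spelling chosen for each slot encodes one bit of the customer's fingerprint.

```python
# Each slot has two semantically equivalent spellings; which one appears
# in a customer's copy encodes one bit of that customer's fingerprint.
SLOTS = [
    ("total = total + x", "total = x + total"),
    ("count = count + 1", "count = 1 + count"),
    ("if x > hi: hi = x", "if hi < x: hi = x"),
    ("if x < lo: lo = x", "if lo > x: lo = x"),
]

TEMPLATE = """def summarize(values):
    total, count = 0, 0
    lo = hi = values[0]
    for x in values:
{body}
    return total, count, lo, hi
"""


def embed(fingerprint: int) -> str:
    """Return program source carrying `fingerprint` (0 .. 2**len(SLOTS) - 1)."""
    if not 0 <= fingerprint < 2 ** len(SLOTS):
        raise ValueError("fingerprint does not fit in the available slots")
    lines = []
    for i, variants in enumerate(SLOTS):
        bit = (fingerprint >> i) & 1
        lines.append("        " + variants[bit])
    return TEMPLATE.format(body="\n".join(lines))


def extract(source: str) -> int:
    """Recover the fingerprint by checking which spelling each slot uses."""
    fingerprint = 0
    for i, (_zero, one) in enumerate(SLOTS):
        if one in source:
            fingerprint |= 1 << i
    return fingerprint


def run(source: str, values):
    """Execute a fingerprinted copy and call its summarize() function."""
    namespace = {}
    exec(source, namespace)
    return namespace["summarize"](values)
```

Every copy produced by `embed` computes the same result, but `extract` tells the copies apart. With four slots this distinguishes only 16 customers, and an attacker who compares two copies immediately sees where the bits are stored, which is exactly the kind of attack a serious scheme has to withstand.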

Here is another example. Suppose the new version of your program includes a new algorithm, and you don't want competitors to get hold of it and add it to their own software. You can obfuscate the program, making it as convoluted as possible so that reverse engineering it costs your competitor a great deal of effort. And if you do suspect that someone has copied your code, this book will also teach you how to use software "birthmarks" to substantiate your suspicion.
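As a rough illustration of what obfuscation does to code, the following Python sketch shows a small function next to a deliberately convoluted equivalent. It is a toy under our own assumptions: the function names and the discount rule are invented, and the two transformations used (a state-machine dispatch loop, often called control-flow flattening, and an opaque predicate whose outcome is fixed but not obvious) are shown in their simplest possible form.

```python
def discounted_price(price: int, loyal: bool) -> int:
    """Original: loyal customers get 10% off (integer arithmetic)."""
    return price * 90 // 100 if loyal else price


def discounted_price_obf(price: int, loyal: bool) -> int:
    """Same behaviour, hidden behind a dispatch loop and an opaque predicate."""
    state, result, n = 3, 0, price
    while state != 0:
        if state == 3:
            # Opaque predicate: n * (n + 1) is a product of consecutive
            # integers, hence always even, so state 4 is never reached.
            state = 1 if (n * (n + 1)) % 2 == 0 else 4
        elif state == 1:
            state = 2 if loyal else 5
        elif state == 2:
            result = (n * 9 * 10) // 100
            state = 0
        elif state == 4:
            result = n ^ 0x5A  # decoy code that never runs
            state = 0
        else:  # state == 5
            result = n
            state = 0
    return result
```

The obfuscated version is longer and slower than the original, which is the trade-off described earlier: the aim is to cost the reverse engineer time, not to make the program better.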

As a further example, suppose your program contains a secret and you want to make sure the program cannot run normally unless that secret is intact. You certainly don't want hackers to modify the license verification code in your program, or the key used to decrypt mp3 files in a digital rights management system. Chapter 7 discusses various tamperproofing techniques that make a tampered program stop running normally.
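One simple way to picture tamperproofing is a program that checks its own code before running it. The Python sketch below is only an illustration under our own assumptions (the helper names `code_digest`, `make_guarded` and `check_license` are hypothetical, and the check itself is unprotected): it hashes the bytecode of a function when the guard is created and refuses to call the function if the bytecode later changes.

```python
import hashlib


def code_digest(func) -> str:
    """Hash a function's bytecode and constants."""
    code = func.__code__
    material = code.co_code + repr(code.co_consts).encode()
    return hashlib.sha256(material).hexdigest()


def make_guarded(func):
    """Wrap func so that it refuses to run once its code has been altered."""
    # In a real tool the expected digest would be fixed at build time;
    # here we simply record it when the guard is created.
    expected = code_digest(func)

    def guarded(*args, **kwargs):
        if code_digest(func) != expected:
            raise RuntimeError("tampering detected, refusing to run")
        return func(*args, **kwargs)

    return guarded


def check_license(key: str) -> bool:
    """Stand-in for a license check an attacker would like to patch out."""
    return key == "VALID-KEY"
```

For example, after `guarded = make_guarded(check_license)`, replacing the function body with `check_license.__code__ = (lambda key: True).__code__` makes every later call to `guarded(...)` raise `RuntimeError`. Of course, an attacker can patch the guard just as easily as the license check, so a practical scheme has to hide and protect the checking code as well.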

We hear you object: hide the key in the executable file? What a terrible idea! Experience tells us that any "security through obscurity" approach eventually ends in failure. No matter how you hide the key in the program, it will not escape a determined reverse engineer. We have to admit that you are right. None of the techniques introduced in this book can guarantee that software will never be broken by hackers. They cannot guarantee that a secret stays secret forever, that a program will never be tampered with, or that code will never be copied. Unless there is a major breakthrough in this research field, all we can hope for is to delay the attack. Our goal is to slow the attacker down enough that attacking your software becomes too painful or too costly, so that he gives up. Or the attacker may patiently spend a long time breaking through your defenses, but by then you have already made enough money from the software, or you have moved on to a newer version of the code (so what he obtained is worthless).

For example, suppose you run a pay-TV channel and subscribers watch your programs through a set-top box. Each box is tagged: a unique ID assigned to each user is stored somewhere in its code, so that you can allow or deny a particular user access to the channel depending on whether the bill has been paid. Now a gang of hackers finds and disassembles this code, works out the algorithm that computes the user ID, and sells instructions for changing the ID cheaply online. What should you do? You may think of using a tamper-resistant smart card, but these are not as hard to crack as they look, as Chapter 11 explains. Or you may think of obfuscating the code to make it harder to analyze. Or you could use tamperproofing so that the program stops running as soon as it is modified. Most likely, you will combine these techniques to protect your code. But even with all of them in place, you have to accept that your code may still be cracked and the secret may still leak (in this case, the user ID in the set-top box may still be altered). Why? Simply because the idea of "security through obscurity" is fundamentally flawed. But if none of the techniques in this book can give you perfect, long-lasting security, why use them, and why buy a book like this? The answer is simple. The longer the code withstands attack, the more customers subscribe to the channel and the less often the set-top boxes need upgrading, so the more money you earn and the more you save.

It's that simple.

Who uses surreptitious software?

Many well-known companies are interested in surreptitious software. It is hard to know how the technology is really used in practice (most companies are tight-lipped about how they protect their code), but we can gauge their interest from the patents they apply for and own. Microsoft holds several patents on software watermarking, code obfuscation and software "birthmarks". Apple holds a patent on code obfuscation, which is probably used to protect its iTunes software. Convera, a company spun off from Intel, focuses on code tamperproofing for digital rights management [27, 268-270]. Cloakware, spun off from Canada's Nortel (Northern Telecom), is also one of the most successful companies in this field. It holds patents on what it calls "white-box cryptography" [67, 68, 182], which hides the encryption algorithm and key in the program code. In December 2007, Cloakware was acquired for $72.5 million by Irdeto, a Dutch company whose main business is pay TV. Even Sun Microsystems, a relative latecomer, has filed several patent applications in the field of code obfuscation.
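The phrase "hides the encryption algorithm and key in the program code" can be illustrated with a very small Python sketch. This is our own toy, not Cloakware's construction: `SBOX`, `keyed_step`, `bake_table` and `whitebox_step` are hypothetical names, and the "cipher" is a single table lookup. The idea shown is that a key-dependent step is precomputed into a lookup table, so the shipped code contains the table but never the key as an explicit value.

```python
import random

# A fixed, public byte permutation standing in for a cipher's substitution
# box. It is generated here only for illustration.
_rng = random.Random(2009)
SBOX = list(range(256))
_rng.shuffle(SBOX)


def keyed_step(byte: int, key: int) -> int:
    """Reference version: the key is an explicit value an attacker could read."""
    return SBOX[byte ^ key]


def bake_table(key: int) -> list:
    """Precompute the keyed step so that only a table needs to be shipped."""
    return [SBOX[b ^ key] for b in range(256)]


def whitebox_step(byte: int, table: list) -> int:
    """Shipped version: same result as keyed_step, with no key in sight."""
    return table[byte]
```

In this toy the key is still trivial to recover, since `SBOX.index(table[0])` returns it directly. Practical white-box designs add further secret encodings around such tables to make that recovery harder, and even those have a history of being attacked.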

Skype's VoIP client is also hardened against reverse engineering with code obfuscation and tamperproofing techniques similar to those of Arxan [24], Intel [27] and [89], which are covered in this book. For Skype, protecting the integrity of its client is extremely important: once someone reverse engineers the client and works out the network protocol Skype uses, hackers can write inexpensive programs that talk to Skype software directly (so people no longer need to use Skype's own client). Keeping the network protocol secret therefore helps Skype hold on to its huge user base, which is probably why eBay bought Skype for $2.6 billion in 2005. In fact, surreptitious software techniques bought Skype enough time to become the leader in VoIP. Even if the Skype protocol is analyzed now (as hackers have in fact done; see Section 7.2.4), it is hard for anyone to come up with similar software that could shake Skype's market position.

Academic researchers have studied surreptitious software from different angles. Researchers with a background in compilers and programming languages, such as us, are naturally drawn to this area, because most algorithms that transform code rely on static analysis, which is familiar territory for compiler optimization researchers. In the past most cryptographers disdained the study of "security through obscurity", but recently some of them have begun to apply cryptographic techniques to software watermarking and have established the limits of code obfuscation. Researchers in multimedia watermarking, computer security and software engineering have also published many papers on surreptitious software. Unfortunately, the lack of dedicated journals and conferences where researchers can exchange ideas has greatly slowed progress in the field. Researchers have worked hard, and are still working hard, to get their results accepted by traditional conferences and journals. Venues that have published surreptitious software research so far include the ACM Symposium on Principles of Programming Languages (POPL), the Information Hiding Workshop, IEEE software engineering publications, Advances in Cryptology (CRYPTO), the Information Security Conference (ISC) and various conferences on digital rights management. As surreptitious software research moves into the academic mainstream, we can expect journals, workshops and even conferences devoted to it, but sadly none of this has happened yet.

The military also spends a lot of energy (and taxpayers' money) on surreptitious software. For example, the patent on Cousot's software watermarking algorithm [95] is held by the French Thales Group, the ninth-largest defense contractor in the world. The following is quoted from a recent (2006) US Army solicitation [303] for research on AT (anti-tamper) technology.

At present, all US Army Program Executive Offices (PEOs) and Project Managers (PMs) must apply the AT policies established by the Army and the Department of Defense when designing and implementing their systems. Embedded software is at the core of modern weapon systems and is among the technologies most in need of protection. AT technology can effectively keep these technologies from being reverse engineered and exploited by other countries. Without AT protection, code compiled by a standard compiler is easy to reverse engineer. Reverse engineers use many tools, such as debuggers, decompilers and disassemblers, as well as a variety of static and dynamic analysis techniques. The purpose of AT technology is to make reverse engineering harder, so that the technological advantage of the United States is not stolen by other countries. In the future, a more useful, effective and diverse set of AT tools must be provided to the Army's PEOs and PMs ... The goal of developing AT technology is to provide a strong shell that resists reverse engineering and delays the enemy's attack on protected software as long as possible. This gives the United States the chance to keep its lead in high technology, or to slow the leakage of its weapons technology. In the end, the US Army can maintain its technological edge and thereby its superiority in armaments.

This solicitation comes from the US Army's Missile and Space program and focuses on protecting real-time embedded systems. We have reason to believe it was prompted by the Army's worry that a missile fired at the enemy might, for whatever reason, fail to explode on impact, giving the enemy access to the targeting and flight-control software embedded in the missile.

The following is another passage, from the US Department of Defense [115].

The Software Protection Initiative (SPI) is an undertaking of the Department of Defense (DoD) to develop and deploy protection technologies that secure computer programs containing critical information about national defense weapon systems. SPI offers a new approach to security: rather than protecting the computer or the network (as traditional security technologies do), it strengthens the security of the computer program itself. This new approach can significantly improve the DoD's information security. SPI applies broadly, and programs running on anything from desktops to supercomputers can be protected by it. It forms a complete protection layer and is an example of "defense in depth". SPI technology complements traditional security measures such as network firewalls and physical security, but does not depend on them. SPI technology has now been deployed at selected HPC centers and at more than 150 defense-related sites, as well as other military bases built and maintained by commercial companies. Wide deployment of SPI technology will effectively strengthen the protection of critical application technologies of the United States and its Department of Defense.

What does this passage mean? It shows that the US Department of Defense worries not only about missiles falling into enemy territory, but also about the safety of software running in its own high-security, high-performance computing centers. In fact, stealing secrets and guarding against theft is an eternal theme between intelligence and counterintelligence agencies. Suppose, for example, that a program on a fighter jet needs updating. Most likely someone will connect a laptop to the jet to perform the update. But what happens if that laptop is lost, or falls under the control of another government, as so often happens in the movies? The other side will immediately reverse engineer the code and use what it learns to improve the software on its own fighters. Worse, it could quietly plant a Trojan horse in your software that makes the plane fall out of the sky at a chosen moment. If we cannot be absolutely sure that this scenario is impossible, surreptitious software can at least serve as a last line of defense (at the very least, responsibility can be traced afterwards). For example, the software on the aircraft can be fingerprinted with the ID of each person who has access to it. If that code one day turns up on another country's fighters, we can extract the fingerprint and work out who was responsible for the leak.

What? We hear you ask: why should I care how governments and business giants protect their secrets? If hackers crack their software, the hackers are only earning a small reward for their own hard work. That may be so, but these protection techniques may ultimately benefit you more than they benefit the business giants. The reason is that legal protections (such as patents, trademarks and copyrights) only work for you if you have the financial resources to take the other party to court. In other words, even if you believe a big company has stolen a hugely lucrative idea by cracking your code, you cannot take Microsoft to court through a marathon lawsuit unless you have the financial strength to outlast it. The protection techniques discussed in this book (such as code obfuscation and tamperproofing) are cheap and easy to use, and are available to small and medium-sized companies as well as to business giants. And if you do sue the big company, watermarks or software "birthmarks" can provide hard evidence of code theft in court.

Finally, we must briefly mention another group that is extremely good at using surreptitious software: the bad guys. Virus writers have been very successful at using code obfuscation to disguise virus code and evade detection by anti-virus software. It is worth noting that when the good guys use these techniques (for example, to protect DVDs, games and cable TV), hackers often break them, but when hackers use them (for example, to build malware), they prove hard to counter.

The contents of this book

The goal of surreptitious software research is to invent algorithms that slow the adversary's reverse engineering as much as possible while adding as little computational overhead as possible to the protected program. We also need evaluation techniques that let us say "after applying algorithm A, hackers need T more units of time to break the new program than the original, and the new program incurs a performance overhead of O", or at least "code protected by algorithm A is harder to break than code protected by algorithm B". We must stress that surreptitious software research is still in its infancy. Although the book presents all the relevant protection and evaluation algorithms, the state of the art is still far from ideal (so don't be too disappointed).

In this book we try to collect all current research results on surreptitious software and present them systematically. Each chapter aims to cover one technique, describing its applications and the available algorithms. Chapter 1 introduces some basic concepts of the field. Chapter 2 uses adversarial demonstrations to introduce the tools and techniques hackers commonly use to reverse engineer software, and then shows how to defend against them. Chapter 3 describes in detail the techniques that both hackers and defenders use to analyze computer programs. Chapters 4, 5 and 6 cover algorithms for code obfuscation. Chapter 7 covers tamperproofing algorithms. Chapters 8 and 9 cover watermarking algorithms. Chapter 10 covers software "birthmark" algorithms. Chapter 11 describes hardware-based software protection.

If you are a manager who only wants to know the state of surreptitious software research and how these techniques could apply to your projects, just read Chapters 1 and 2. If you are a researcher with a compiler design background, we suggest skipping ahead to Chapter 3. The chapters after that are best read in order, because, for example, material from the code obfuscation chapters is used in the watermarking chapters. That said, we have tried to keep each chapter self-contained, so (given some background knowledge) skipping a chapter or two now and then does no harm. If you are an engineer who wants to use these techniques to harden your software, we strongly recommend reading all of Chapter 3 carefully and, if possible, getting hold of a compiler textbook to brush up on static program analysis. After that you can jump to whichever chapters interest you. If you are a student reading this book as course material, read it page by page, and don't forget to review at the end of the term.

We hope this book does two things. First, we want to show you that code obfuscation, software watermarking, software "birthmarks" and tamperproofing contain many wonderful ideas that are worth your time to learn, and that these techniques can be used to protect software. Second, we hope the book gathers all the useful information currently available in this field, and so provides a good starting point for deeper study of surreptitious software.

Christian Collberg and Jasvir Nagra

February 2, 2009 (Groundhog Day)

P.S. There is actually a third purpose in writing this book. If, while reading it, you have a sudden flash of insight and come up with a brilliant idea that inspires you to devote yourself to surreptitious software research, then, dear reader, our third goal has been achieved. Please tell us about your new algorithm and we will include it in the next edition!