AI blinded by a T-shirt? Lack of knowledge is the root cause!

  Reporter Xie Kaifei

  Correspondents Xu Xiaofeng and Wang Yixi

  Put on a T-shirt printed with a special pattern and you can fool an AI human-detection system, effectively becoming "invisible"?

  Recently, this scene actually played out. Researchers from Northeastern University and the Massachusetts Institute of Technology jointly designed a T-shirt based on adversarial example techniques. According to the researchers, this is the world's first physical-world adversarial attack on a non-rigid object such as a T-shirt: an AI human-detection camera cannot accurately detect a pedestrian wearing it, and the "invisibility" effect holds no matter how the fabric wrinkles or deforms.

  What is the principle behind a T-shirt that can make its wearer "invisible" to an AI human-detection system? Could this defect lead to security problems, and how can they be solved? A Science and Technology Daily reporter interviewed relevant experts.

  Special patterns can fool AI’s "eyes"

  In the experiment, a man wearing a white T-shirt and a woman wearing a black T-shirt walk toward the camera. Under the AI human-detection camera, only the woman in the black T-shirt is detected.

  How is this done? The researchers used a technique called an adversarial attack to deceive the AI. A closer look shows that blocks of different colors are printed on the white T-shirt. To the human eye, these blocks look no different from an ordinary pattern, but they interfere with the machine.

  Wang Jinqiao, a researcher at the Institute of Automation of the Chinese Academy of Sciences, explained that the researchers replaced the T-shirt's original pattern with a strongly interfering one. This changed the shirt's visual appearance just enough to confuse the AI model into mispredicting the data labels, thus achieving the attack.

  "Attackers interfere with the data by constructing imperceptible perturbations, which can make AI algorithms based on deep neural networks output whatever wrong result the attacker wants. Such perturbed input samples are called adversarial examples," Wang Jinqiao said.

  In practice, adversarial examples are mainly used to test systems with high security requirements, so as to harden AI models against possible risks through countermeasures. For example, face-recognition payment must have a certain level of resistance to attack, so that an attacker cannot crack the payment system simply by using a photo or by making targeted modifications to the original input.

  Experiments show that for a correctly classified image of a panda, the human eye still sees a panda after a specific adversarial perturbation is added, but an AI image-recognition model classifies it as a gibbon with 99% confidence.
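  The panda-to-gibbon result illustrates gradient-based perturbation. As a toy sketch only, in the style of the fast gradient sign method: for a linear model, the gradient of the score with respect to the input is simply the weight vector, so stepping each "pixel" against the sign of its weight flips the prediction. The weights, input, and step size below are invented; real attacks operate on deep networks and full images.

```python
import math

# Toy linear "classifier": sigmoid(w . x + b) > 0.5 means class "panda".
# All values are illustrative, not taken from the actual experiment.
w = [2.0, -1.0, 0.5]
b = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# A clean input the model classifies confidently as class 1.
x = [1.0, 0.5, 2.0]
clean = predict(x)

# FGSM-style step: for a linear model the gradient w.r.t. the input is w,
# so move each component by -eps * sign(w) to push the score down.
# eps is exaggerated here so the flip is visible on a 3-"pixel" toy.
eps = 1.0
x_adv = [xi - eps * (1 if wi > 0 else -1) for xi, wi in zip(x, w)]
adv = predict(x_adv)

print(round(clean, 3), round(adv, 3))
```

  The same sign-of-gradient step, kept small and spread over thousands of pixels, is what makes the perturbed panda image look unchanged to a human.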

  However, deceiving AI by printing adversarial patterns on clothing has a flaw: as soon as the angle or shape of the pattern changes, the attack is easily defeated. In the past, designers of adversarial examples typically relied on simple transformations such as scaling, translation, rotation, brightness and contrast adjustment, and adaptive noise.
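  Those simple transformations are usually applied as a random sampler: during optimization, a candidate adversarial pattern is scored under many random transforms so that it keeps working after printing and viewpoint changes. This toy version, with invented ranges, covers only brightness, contrast, and noise on a flat list of pixel values; geometric transforms are omitted for brevity.

```python
import random

def random_transform(patch, rng):
    # Sample a brightness shift and a contrast factor, then add
    # sensor-style Gaussian noise; clamp results to the valid [0, 1] range.
    brightness = rng.uniform(-0.1, 0.1)
    contrast = rng.uniform(0.8, 1.2)
    out = []
    for p in patch:
        q = (p - 0.5) * contrast + 0.5 + brightness  # contrast about mid-gray
        q += rng.gauss(0.0, 0.02)
        out.append(min(1.0, max(0.0, q)))
    return out

rng = random.Random(0)
patch = [0.2, 0.5, 0.8]          # a tiny "pattern" of pixel intensities
samples = [random_transform(patch, rng) for _ in range(5)]
```

  An optimizer would average the model's detection score over such samples before updating the pattern.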

  Wang Jinqiao explained that these simple transformations are often effective for generating adversarial examples against static targets, but they readily fail against non-rigid, moving targets such as pedestrians. As a dynamic target moves and changes pose, these simple transformations no longer match the deformation, and the adversarial examples lose their original properties.

  "Compared with adversarial examples designed in the past, this attack has a higher success rate." Dr. Ke Xiao, deputy director of the College of Mathematics and Computer Science at Fuzhou University and of the Fujian New Media Industry Technology Development Base, pointed out that to cope with the deformation a T-shirt undergoes as its wearer moves, the researchers used thin plate spline interpolation to model the possible deformations of a pedestrian. During the training stage, a checkerboard grid printed on the T-shirt is used to learn how the deformation control points move, which makes the generated adversarial examples more realistic and better fitted to human deformation.
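  A thin plate spline displaces each point through an affine part plus radial basis terms U(r) = r² log r² centered on control points. The sketch below evaluates that deformation model for one coordinate with invented control points and weights; in the actual work, the weights are fitted from the checkerboard grid tracked on the moving wearer.

```python
import math

# Thin-plate-spline radial basis: U(r) = r^2 * log(r^2), with U(0) = 0.
def tps_basis(r):
    return 0.0 if r == 0.0 else r * r * math.log(r * r)

# Warp the x-coordinate of a 2-D point: affine part plus a weighted sum
# of basis terms centered on control points. Controls and weights here
# are invented for illustration, not fitted values.
def tps_warp_x(point, controls, weights, affine):
    a0, ax, ay = affine
    x, y = point
    out = a0 + ax * x + ay * y
    for (cx, cy), w in zip(controls, weights):
        out += w * tps_basis(math.hypot(x - cx, y - cy))
    return out

controls = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
weights = [0.05, -0.02, 0.03]
affine = (0.0, 1.0, 0.0)   # identity in x before the bending terms
new_x = tps_warp_x((0.5, 0.5), controls, weights, affine)
```

  Applying such a fitted warp to the printed pattern during training lets the optimizer see the pattern roughly as a camera would see it on wrinkled, moving fabric.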

  Many factors can interfere with AI vision systems

  Besides adversarial attacks, many environmental and human factors in practical applications can cause AI human detection to fail.

  For example, in autonomous driving, bad weather (such as heavy snow or fog) or complex lighting and road conditions can blur the image of people ahead, greatly degrading detection of targets in front of the vehicle. In surveillance scenarios, suspicious individuals may interfere with the algorithm by occluding themselves with clothing, umbrellas and the like.

  "Even setting aside the emergency braking function, cars with pedestrian detection also have trouble detecting small human targets promptly and accurately." Ke Xiao cited a test by the American Automobile Association of several vehicle brands equipped with pedestrian detection, using adult and child dummies as targets. When a child appeared in front of the car, or when the car's speed reached 48 kilometers per hour, only one brand detected pedestrians with some probability; the other three brands detected no pedestrians in either scenario.

  Why are the target detection models behind AI visual recognition so fragile? "To the human eye, slight image interference does not affect the final judgment, but that is not the case for an AI model." Ke Xiao noted that experiments show a well-tested image detection and recognition classifier does not learn and understand the true underlying information of the target image as humans do; it merely builds a machine learning model that behaves well on the training samples.

  It is understood that existing AI visual recognition technology usually adopts deep neural networks, which are essentially deep mappings of features. They learn only the statistical characteristics of the data and the correlations within it, and they depend on the volume and richness of that data. The more data there is, the more discriminative the features the machine learns for identifying the target, and the better those features reflect the underlying correlations.

  But in reality, Wang Jinqiao said, data is often very limited, making it hard for a neural network model to "broaden its horizons," which leads to unsatisfactory performance on data it has never seen. On the other hand, once attackers learn or crack this statistical feature distribution and correlation, they can modify input samples in a targeted way, changing the model's output and achieving the attack.

  AI vision failures can easily cause security problems

  Wearing a special T-shirt to achieve so-called "invisibility" in fact confuses the AI's visual system. Could this defect in AI target detection technology lead to security problems?

  Ke Xiao said that in the American Automobile Association's driver-assistance tests, pedestrians were missed or detected too late, which could lead to traffic accidents. Likewise, missing dangerous people or objects in security monitoring may create safety risks, and criminals can use adversarial attacks to find loopholes in target detection systems and exploit them.

  "Security problems may stem from defects in the model itself, such as insufficient generalization, overly uniform training data, and overfitting. Here we should enrich the training data as much as possible and add technical measures against overfitting during training to improve the model's real-world performance," Wang Jinqiao said. On the other hand, practical systems often need to consider model security: enhancing the credibility of results and the robustness of the model, and adding prediction of attack models to improve the ability to discriminate adversarial examples, thereby reducing security risks.
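  One common way to harden a model against adversarial examples, consistent with the training-side measures described above, is adversarial training: perturb each input in the direction that hurts the model most, then train on the perturbed version. A minimal sketch on a one-dimensional logistic-regression toy, with invented data and hyperparameters:

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

rng = random.Random(0)
# Tiny labeled dataset: negative inputs are class 0, positive are class 1.
data = [(-2.0, 0), (-1.5, 0), (1.5, 1), (2.0, 1)]
w, b, lr, eps = 0.0, 0.0, 0.5, 0.3

for _ in range(200):
    x, y = data[rng.randrange(len(data))]
    # FGSM step on the input: the gradient of the loss w.r.t. x has the
    # sign of (p - y) * w, so nudge x that way before the weight update.
    p = sigmoid(w * x + b)
    g = (p - y) * w
    x_adv = x + eps * (1 if g > 0 else -1 if g < 0 else 0)
    # Standard logistic-regression SGD step, on the perturbed input.
    p_adv = sigmoid(w * x_adv + b)
    w -= lr * (p_adv - y) * x_adv
    b -= lr * (p_adv - y)

# Accuracy when every test input is shifted eps toward the boundary.
robust_acc = sum(
    (sigmoid(w * (x + eps * (-1 if y == 1 else 1)) + b) > 0.5) == (y == 1)
    for x, y in data
) / len(data)
```

  Training on worst-case inputs trades a little clean accuracy for resistance to exactly the perturbations the attacker would use.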

  At present, researchers keep proposing AI target detection models with higher accuracy and faster speed, to address missed detections, false detections, and poor real-time performance and robustness. What efforts are needed to build future technical security?

  Wang Jinqiao believes that artificial intelligence is still in its infancy: existing algorithms essentially learn simple mapping relationships without truly understanding the content behind the data or its underlying causal relationships. Its theoretical innovation and industrial application therefore still face many technical difficulties, requiring researchers to keep tackling key problems and achieve real intelligence so as to reduce the risks of application.

  "Second, when researching and applying new technologies, researchers should consider the various security issues as fully as possible, incorporate defenses against adversarial examples, and take corresponding measures." Wang Jinqiao also suggested establishing and improving laws and regulations related to artificial intelligence at the societal level, to guide the scope of the technology's application, provide guidance and norms for possible safety problems, and create a more comprehensive and mature environment for scientific and technological innovation.