Beijing formulates innovative measures for general artificial intelligence

To strengthen the coordinated supply of computing power, the draft for comment proposes organizing commercial computing power to meet the city's urgent needs: strengthen cooperation with market players such as leading public cloud vendors, implement a computing power partner program, determine the first batch of program members, clarify the technical standards for supply, the software and hardware service requirements, the scale of computing power supply, and preferential strategies, and publish a list of high-quality computing power suppliers for universities and small and medium-sized enterprises in Beijing.

The document also proposes efficiently advancing the construction of new computing infrastructure: incorporate new computing power construction projects into the computing power partner program; accelerate projects such as the "Beijing Artificial Intelligence Computing Platform" in Haidian District and the "Beijing Digital Economy Computing Center" in Chaoyang District; quickly build up large-scale supply of advanced computing power; and support the research and development of large language models, multimodal models, large-scale fine-grained neural network simulation models, and brain-inspired neural networks with hundreds of billions of parameters.

To improve the supply of high-quality data elements, the draft for comment notes that high-quality Chinese corpora make up too small a share of large model training data, which hampers Chinese-language expression and industrial application. In response, it proposes integrating existing open-source Chinese pre-training datasets with high-quality Chinese Internet data and cleaning them in a compliant manner. At the same time, it calls for continuing to expand high-quality multimodal data sources, building a standardized and secure pre-training corpus for large models covering Chinese text, image-text pairs, audio, and video, and opening it conditionally through the social data zone of the Beijing International Big Data Exchange.

The document also proposes accelerating the construction of a "national data basic system pilot demonstration zone" with a high level of openness for data elements, seeking to establish a national data training base, and improving the scale and quality of Beijing's artificial intelligence data annotation repository. It encourages enterprises that operate websites with high-quality data to share some de-identified high-quality data, opened in a targeted and conditional way. Enterprises and scientific research institutions would be able to apply online to use the data free of charge and to explore business cooperation based on data contribution and model application.

On the layout of the large model technology system, the document proposes research on innovative algorithms and key technologies for large models, stronger research and development of tools for collecting and governing large model training data, open benchmarks and tools for large model evaluation, and exploration of new paths toward general artificial intelligence such as embodied intelligence, general-purpose agents, and brain-inspired intelligence.

For scenario applications, the draft for comment names six areas: government services, medical care, scientific research, finance, autonomous driving, and urban governance.

The document proposes exploring and creating an inclusive and prudent regulatory environment, encouraging generative artificial intelligence products to be applied for good in non-public-service fields such as scientific research, and issuing the "Beijing Compliance Guidelines for Algorithmic Recommendation in Internet Information Services".

Attachment: Several Measures for Promoting the Innovation and Development of General Artificial Intelligence in Beijing (2023-2025) (Draft for Comment)

In order to seize the opportunity presented by the development of large models, advance general artificial intelligence, give full play to the guiding role of the government and the catalytic role of innovation platforms, integrate innovation resources, strengthen the allocation of elements, foster an innovation ecosystem, attend to risk prevention, and promote our city's innovation leadership in general artificial intelligence, the following measures are put forward:

First, strengthen the overall supply capacity of computing resources.

(1) Organize commercial computing power to meet the city's urgent needs.

Strengthen cooperation with market players such as leading public cloud vendors, implement the computing power partner program, determine the first batch of program members, and clarify the technical standards for supply, software and hardware service requirements, scale of computing power supply, preferential strategies, and so on. Publish a list of high-quality computing power suppliers for universities and small and medium-sized enterprises in Beijing.

(2) Efficiently promote the construction of new computing infrastructure.

Incorporate new computing power construction projects into the computing power partner program; accelerate projects such as the "Beijing Artificial Intelligence Computing Platform" in Haidian District and the "Beijing Digital Economy Computing Center" in Chaoyang District; quickly build up large-scale supply of advanced computing power; and support the research and development of large language models, multimodal models, large-scale fine-grained neural network simulation models, and brain-inspired neural networks with hundreds of billions of parameters.

(3) Build a unified cloud computing power scheduling platform.

Use a unified government entry point to lower the cost of procuring public cloud services, benefit small and medium-sized enterprises, and reduce the communication costs enterprises incur in dealing with different cloud vendors. To meet demand for elastic computing power, build a unified cloud computing power scheduling platform that manages and operates heterogeneous computing environments in a unified way, so that enterprises can run artificial intelligence computing tasks seamlessly, economically, and efficiently across different cloud environments. Build a direct basic optical transmission network between Beijing and Hebei, Tianjin, Shanxi, Inner Mongolia, and other provinces (cities), further improve the platform's awareness of computing resources in these four places, and explore computing power trading.

Second, improve the supply capacity of high-quality data elements.

(4) Collect high-quality basic training datasets.

High-quality Chinese corpora make up too small a share of large model training data, which hampers Chinese-language expression and industrial application. To address this, integrate existing open-source Chinese pre-training datasets with high-quality Chinese Internet data and clean them in a compliant manner. At the same time, continue to expand high-quality multimodal data sources, build a standardized and secure pre-training corpus for large models covering Chinese text, image-text pairs, audio, and video, and open it conditionally through the social data zone of the Beijing International Big Data Exchange.

(5) Create a "national data basic system pilot demonstration zone" and plan a national data training base.

Accelerate the construction of a "national data basic system pilot demonstration zone" with a high level of openness for data elements, seek to establish a national data training base, and improve the scale and quality of Beijing's artificial intelligence data annotation repository. Encourage enterprises that operate websites with high-quality data to share some de-identified high-quality data, opened in a targeted and conditional way. Enterprises and scientific research institutions can apply online to use the data free of charge and explore business cooperation based on data contribution and model application.

(6) Build a crowdsourcing service platform for fine-grained annotation of datasets.

Build a crowdsourcing service platform for instruction datasets and multimodal datasets, develop an intelligent cloud service system that integrates the relevant tools and applications, encourage and organize professionals from different disciplines to annotate training data and instruction data for general artificial intelligence models, improve the diversity of training data, reward contributors appropriately, and promote the sustained and healthy development of the platform.

Third, systematically lay out the large model technology system and continue to explore paths toward general artificial intelligence.

(7) Carry out research on innovative algorithms and key technologies for large models.

Across the full pipeline of building, training, tuning and aligning, running inference on, and deploying large language models, support research on innovative algorithms and core technologies, and form a complete and efficient training system that is released as open source. Explore multimodal general-purpose model architectures; study efficient parallel training techniques for large models as well as optimization methods such as logical and knowledge reasoning, instruction learning, and alignment with human intent; and develop efficient compression techniques that support inference for models with tens of billions of parameters.

(8) Strengthen research and development of tools for collecting and governing large model training data.

Across the five aspects of "collection, storage, management, research, and use", develop data processing tools covering data collection, cleaning, annotation, de-identification, and storage. Focus on real-time updating techniques for Internet data; methods for integrating and classifying multi-source heterogeneous data; systems for data management platforms; software tools and algorithms for data cleaning, labeling, classification, and annotation; and algorithms and tools for data content security review.

(9) Open up benchmarks and tools for large model evaluation.

Construct evaluation benchmarks and methods for multimodal, multi-dimensional foundation models. Establish a foundation model evaluation toolkit and provide adaptive evaluation tools. Establish a fair and efficient adaptive evaluation mechanism that automatically matches tools and metrics to different evaluation objectives. Study AI-assisted algorithms for evaluating model intelligence, and build automatic evaluation tools for subjective or generative tasks. Integrate multi-dimensional evaluation tools covering generality, efficiency, intelligence, and robustness, and build an online evaluation service platform for foundation models.

(10) Promote research and development of basic software and hardware systems for large models.

Support the development of distributed, efficient training systems that run parallel model training tasks efficiently and automatically. Develop a new generation of artificial intelligence compilers suited to model training scenarios, with automatic operator generation and automatic optimization, and promote broad compatibility between artificial intelligence chips and frameworks. Develop an artificial intelligence chip evaluation system that supports automated evaluation across multiple chips and multiple frameworks. Together, these provide an independently innovated software and hardware ecosystem as the foundation for large model training and application.

(11) Explore new paths toward general artificial intelligence such as embodied intelligence, general-purpose agents, and brain-inspired intelligence.

Develop a basic theoretical framework for general artificial intelligence, and strengthen basic theoretical research on the mathematical mechanisms, autonomous collaboration, and decision-making of artificial intelligence. Promote research on and application of embodied intelligent systems, and achieve breakthroughs in robot perception, cognition, and decision-making under complex conditions such as open environments, generalized scenarios, and continuous tasks. Explore new value-driven and causality-driven paths toward general artificial intelligence; build a unified theoretical framework, rating standards, and testing platform for general artificial intelligence; develop an operating system and programming language for general artificial intelligence; and promote the application of the underlying technical architecture of general-purpose agents. Explore interdisciplinary research such as brain-inspired intelligence, drawing on the connection patterns, coding mechanisms, and information processing principles of brain neurons to inspire new methods for modeling and training artificial neural networks.

Fourth, promote innovative application scenarios for general artificial intelligence technology.

(12) Promote pilot applications in government services.

Focus on government consultation, policy services, immediate handling of complaints, and openness of government affairs, and take the lead in empowering these areas with large model technology. Use the semantic understanding, autonomous learning, and intelligent reasoning capabilities of large models to improve the intelligent question answering of government consultation systems and enhance multilingual interaction. Support the construction of the "Beijing Policy" platform and optimize the standardized management and targeted delivery of policies. Help the public service hotline respond to public demands more efficiently, and deepen the effective use of big data on people's livelihoods. Make government services more convenient: help guide applicants through filling out forms, help staff at comprehensive service windows give more accurate handling instructions, help approval staff work more efficiently, and promote fuller sharing of business data and more efficient coordination of business processes.

(13) Explore demonstration applications in the medical field.

Support qualified research-oriented medical institutions in our city in refining scenario requirements for intelligent patient guidance, assisted diagnosis, and intelligent treatment; make full use of multimodal medical data such as medical documents, medical knowledge graphs, and medical images; build intelligent applications based on general data and specialized medical data; achieve accurate identification and prediction of a range of diseases and symptoms; and help medical institutions improve decision-making in disease diagnosis, treatment, and prevention.

(14) Explore demonstration applications in scientific research.

Develop scientific intelligence and accelerate the use of artificial intelligence technology to empower scientific research in fields such as new materials and innovative drugs. Support relevant laboratories in our city in the fields of energy, materials, and biology in setting up scientific research cooperation projects and carrying out joint research and development with relevant research institutions and innovative enterprises in our city; make full use of experimental data on materials, proteins, and molecular drugs; develop scientific computing models; carry out prediction tasks for new alloy materials, protein sequences, and the chemical structure sequences of innovative drugs; and shorten the cycle of scientific experiments.

(15) Promote demonstration applications in finance.

Further explore application scenarios in our city's financial industry, and systematically arrange a batch of "open the list" projects (open calls for solutions) from financial institutions. Because financial scenarios involve a heavy and rapidly changing flow of information, practitioners find it hard to obtain accurate information quickly and comprehensively; to address this, support financial technology enterprises in exploring and applying artificial intelligence to understand and analyze financial texts in depth. Focus on intelligent risk control, intelligent investment, intelligent customer service, and similar links; promote accurate analysis of long financial texts and updating of model knowledge; achieve breakthroughs in fusing complex decision logic with the information processing capabilities of models; turn complex financial information processing into investment decision suggestions; and support assisted investment decision-making in the financial field.

(16) Explore demonstration applications in autonomous driving.

Support autonomous driving enterprises in developing multimodal autonomous driving technology. Drawing on the strengths of large language models in high-dimensional semantic understanding and generalization, and using vehicle-road collaboration data together with on-board multi-sensor fusion data, improve the multi-dimensional perception and prediction performance of autonomous driving models, effectively address the long-tail problem of complex scenarios, and improve the generalization of on-board autonomous driving models. Support the construction of a vehicle-road collaboration database as part of the Beijing High-Level Autonomous Driving Demonstration Zone 3.0, and guide enterprises to train and iterate autonomous driving models on real-world scenarios. Explore testing of cloud-controlled autonomous driving models based on low-latency communication, opening up a new technical path for autonomous driving.

(17) Promote demonstration applications in urban governance.

Support artificial intelligence R&D enterprises in taking the lead in introducing large model technology into the construction of the city brain; carry out research and development on fusion processing across multiple sensing systems; break down the data silos between systems in urban governance; achieve unified perception, correlation analysis, and situation prediction for the underlying operations of the smart city; allocate government resources and administrative capacity scientifically; and provide more comprehensive decision support for urban governance.

Fifth, explore and create an inclusive and prudent regulatory environment.

(18) Continue to promote innovation in regulatory policies and processes.

Explore the creation of a stable and inclusive regulatory environment, actively promote inclusive and prudent regulation of new artificial intelligence technologies that empower traditional industries, and support independent innovation in, wider adoption of, and international cooperation on basic technologies such as artificial intelligence algorithms and frameworks. Give priority to secure and reliable software, tools, computing power, and data resources, and ensure that training datasets meet standards through improved algorithms and other technical means. Encourage generative artificial intelligence products to be applied for good in non-public-service fields such as scientific research. Actively seek support from the national cyberspace administration to establish a pilot in the core area of Zhongguancun, and promote the implementation of inclusive and prudent regulation.

(19) Establish a regular service and guidance mechanism.

Conduct sound security assessments of generative artificial intelligence products intended to serve the public, establish a regular contact, service, and guidance mechanism, and urge enterprises to comply with laws and regulations and respect social morality, public order, and good customs. Streamline the security assessment process; refine assessment standards covering large model algorithm design, screening of training data sources, content security, and manual annotation rules; provide targeted service guidance; and speed up security assessment of the relevant products of artificial intelligence enterprises in our city. Guide enterprises to establish and improve algorithm security safeguards, introduce security testing tools at the product development stage, and urge enterprises to proactively complete procedures such as algorithm filing, modification, and cancellation. Issue the "Beijing Compliance Guidelines for Algorithmic Recommendation in Internet Information Services" to guide innovators in building a sense of responsibility for security, improving management systems, strengthening technical measures, and advancing algorithm compliance in enterprises.

(20) Strengthen network service security protection and personal data protection.

Guide computing power operators to implement the Cybersecurity Law, the Data Security Law, the Personal Information Protection Law, and other laws and regulations; strengthen network and data security management; clarify who bears primary responsibility for network security, data security, and personal information protection; strengthen the establishment and enforcement of security management systems; encourage enterprises to obtain data security management certification and personal information protection certification; implement the security management system for cross-border data transfers; and comprehensively improve network security and data security protection capabilities.

(21) Continue to improve self-discipline and self-governance in the ethical governance of the artificial intelligence industry.

Carry out the tasks of building the National New Generation Artificial Intelligence Innovation and Development Pilot Zone; strengthen research on ethical and safety norms for artificial intelligence and on social governance practice; develop and deploy an artificial intelligence ethical governance service platform that serves government regulation and industry self-discipline; raise awareness of science and technology ethics norms among the responsible entities; and enhance capacity for the ethical governance of science and technology.

I. Drafting background

These measures are formulated in order to seize the opportunity presented by the development of large models, advance general artificial intelligence, give full play to the guiding role of the government and the catalytic role of innovation platforms, integrate innovation resources, strengthen the allocation of elements, foster an innovation ecosystem, attend to risk prevention, and promote our city's innovation leadership in general artificial intelligence.

II. Main content

The "Several Measures" clarify the organizational mechanism and put forward 21 specific measures in five directions: strengthening the overall supply capacity of computing resources; improving the supply capacity of high-quality data elements; systematically laying out the large model technology system and continuing to explore paths toward general artificial intelligence; promoting innovative application scenarios for general artificial intelligence technology; and exploring and creating an inclusive and prudent regulatory environment.

The first direction is strengthening the coordinated supply of computing power resources. Relying on the working mechanism of the municipal joint meeting on data center coordination, strengthen communication and cooperation among relevant units at the city level and key new-type R&D institutions, cloud service enterprises, computing power construction enterprises, basic telecommunications enterprises, and other units, and promote the pooling of existing computing power, the assessment of new projects, and the upgrading of existing projects. Three specific measures are put forward in this direction: organizing commercial computing power, building new computing power infrastructure, and building a cloud computing power scheduling platform.

The second direction is improving the supply of high-quality data elements. Work with relevant units to jointly build large-scale pre-training base datasets and high-quality fine-tuning datasets. Establish a coordination mechanism for the supply and use of training data, and strengthen communication and cooperation among the relevant industry authorities, district governments, key R&D units, platform enterprises, data trading institutions, and other market entities. Three specific measures are put forward in this direction: collecting high-quality basic training datasets; creating a "national data basic system pilot demonstration zone" and planning a national data training base; and building a crowdsourcing service platform for fine-grained annotation of datasets.

The third direction is systematically laying out the large model technology system and continuing to explore paths toward general artificial intelligence. It supports research on innovative algorithms and key technologies for large models; research and development of basic software and hardware systems, training data collection and governance tools, and evaluation tools for large models; and exploration of new paths toward general artificial intelligence. Five specific measures are put forward in this direction: carrying out research on innovative algorithms and key technologies for large models; strengthening research and development of tools for collecting and governing large model training data; opening up benchmarks and tools for large model evaluation; promoting research and development of basic software and hardware systems for large models; and exploring new paths toward general artificial intelligence.

The fourth direction is promoting innovative application scenarios for large model technology. Make full use of the strong generalization ability of large models, guide enterprises to tap domain data resources, carry out research on domain-specific large model application technology, expand the application boundaries of large models, and explore business models and innovation ecosystems for large models aimed at specific vertical fields. Six specific measures are put forward in this direction, expanding application scenarios in government services, medical care, scientific research, finance, autonomous driving, urban governance, and other fields.

The fifth direction is exploring and creating an inclusive and prudent regulatory environment. Establish a regular contact and service mechanism with large demonstration enterprises, continuously investigate and track the difficulties enterprises encounter in security assessment, strengthen communication and coordination with the national cyberspace administration, and actively seek to establish a pilot in the core area of Zhongguancun to promote the implementation of inclusive and prudent regulation. Four specific measures are put forward in this direction: continuing to promote innovation in regulatory policies and processes; establishing a regular service and guidance mechanism; strengthening network service security protection and personal data protection; and continuing to improve self-discipline and self-governance in the ethical governance of the artificial intelligence industry.