Deep Learning has made significant progress in recent years. At the 25th National Technology Law Symposium in Taipei in December 2021, I presented an update on my research applying Deep Learning techniques to the patent domain, particularly patent text generation. This blog post summarizes my presentation in three parts: (1) why Deep Learning and generative language models may apply to the patent domain and potentially other legal domains, (2) my research progress so far, and (3) a look into the future.
First, the generative language model approach in Deep Learning is a generic one. In computer science, a generative language model is generally defined as a model that can predict the next word. The idea is simple but powerful. Such a model is language-agnostic, meaning the approach can apply to different natural languages. It is also domain-agnostic, meaning the model can consume text from different domains: by training the generative model on text in a specific domain, the model can generate text in that domain. Behind the scenes, the input text is converted to numbers (tensors). The model learns the correlations among those numbers, predicts the numbers to generate next, and converts the generated numbers back to text. In this way, the mechanism can apply not only to the patent domain but also to other legal domains. In my research, I started with the patent domain because of my technical background. At this stage, it is too early to tell whether a generative model can produce a patent application that would pass a patent examiner's review and be granted as a patent. Nevertheless, it is encouraging that even patent professionals are sometimes surprised by the quality of the generated patent text, which is often quite coherent.
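The text-to-numbers mechanism described above can be illustrated with a deliberately tiny sketch. Real models such as GPT-2 use neural networks over subword tokens; the hypothetical bigram counter below only shows the pipeline: words are mapped to integer ids, the "model" learns which id tends to follow which, and a predicted id is mapped back to a word. The sample corpus is invented for illustration.

```python
from collections import Counter, defaultdict

# A toy corpus of claim-like text (invented for this sketch).
corpus = (
    "a method comprising a processor and a memory "
    "wherein the processor executes instructions stored in the memory"
).split()

# 1. Convert text to numbers: build a word-to-id vocabulary.
vocab = {w: i for i, w in enumerate(dict.fromkeys(corpus))}
inv_vocab = {i: w for w, i in vocab.items()}
ids = [vocab[w] for w in corpus]

# 2. "Learn" correlations: count which id follows which id (a bigram model).
bigrams = defaultdict(Counter)
for prev, nxt in zip(ids, ids[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Predict the most frequent next word and convert the id back to text."""
    next_id = bigrams[vocab[word]].most_common(1)[0][0]
    return inv_vocab[next_id]
```

For example, `predict_next("a")` returns `"method"`, the first continuation of "a" seen in the corpus. A neural language model replaces the bigram counts with learned probability distributions over the whole vocabulary, but the convert-predict-convert loop is the same.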
Second, at the symposium I explained and demonstrated my research on patent text generation with several examples. For example, given some input text, called a prompt (or priming text) for the model, the patent generative model can produce patent text at a preferred degree of randomness. In my research, the generative model for patent text is called PatentGPT because it is based on the popular GPT-2 model developed and released by OpenAI. Another example in my presentation was rewriting a patent claim into a patent abstract. PatentGPT accomplishes such a task by being trained on a large number of examples, from which the model learns how to perform the task. In general, a generative model can perform a downstream patent task given the right data structure for the training data and sufficient epochs of training. At this point, the patent text generation tasks in my research cover the patent title, abstract, and claims. The data scope is planned to expand to the remaining parts of the patent document, such as the description, in the future.
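The "preferred degree of randomness" mentioned above is commonly controlled by a temperature parameter. A minimal sketch, assuming made-up model scores (logits): the scores for candidate next tokens are scaled by the temperature and converted to probabilities, so a low temperature concentrates probability on the top candidate (near-deterministic output) while a high temperature flattens the distribution (more varied output).

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw model scores into a probability distribution.

    Lower temperature sharpens the distribution; higher flattens it.
    """
    scaled = [score / temperature for score in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next tokens.
logits = [2.0, 1.0, 0.1]
low_temp = softmax_with_temperature(logits, 0.5)   # near-greedy sampling
high_temp = softmax_with_temperature(logits, 2.0)  # more random sampling
```

With these example logits, the top token gets a much larger share of the probability mass at temperature 0.5 than at temperature 2.0, which is why low temperatures yield more predictable generated text.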
Last, I highlighted the progress and the race toward ever larger language models in 2021 by calling attention to five prominent successors of the GPT models: the Switch Transformer (by Google), DALL-E (for text-to-image generation, by OpenAI), LaMDA (for chatbots, by Google), MUM (for multitask, multimodal language understanding, by Google), and Wu Dao 2.0 (covering both English and Chinese, by the Beijing Academy of Artificial Intelligence). These gigantic models hold great potential for the patent domain and other legal domains if one can formulate the desired tasks and has the resources to train the models. Applying such gigantic language models to legal tasks remains uncharted territory for interdisciplinary researchers. Looking forward, it is exciting to imagine how powerful the next top five language models may become next year. The RETRO (Retrieval-Enhanced Transformer) model recently proposed by DeepMind is especially promising because it improves a language model by retrieving relevant passages from its training data. Such a retrieval mechanism could be invaluable for some legal tasks, such as prior art search in the patent domain.
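The retrieval idea can be sketched in miniature. Retrieval-enhanced systems like RETRO fetch the passages most similar to a query so the model (or a prior art search) can condition on them; real systems use dense neural embeddings, while this hypothetical sketch uses simple word overlap (Jaccard similarity) and an invented three-document corpus purely for illustration.

```python
def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity between two texts, in [0, 1]."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def retrieve(query: str, corpus: list, k: int = 1) -> list:
    """Return the k corpus passages most similar to the query."""
    return sorted(corpus, key=lambda doc: jaccard(query, doc), reverse=True)[:k]

# Hypothetical prior art snippets.
prior_art = [
    "a battery charging circuit with overcurrent protection",
    "a neural network for image classification",
    "a method for wireless battery charging",
]
hits = retrieve("wireless charging of a battery", prior_art, k=1)
```

Here the query correctly pulls back the wireless charging document. Swapping the overlap measure for learned embeddings and a fast nearest-neighbor index is what makes the same pattern work at the scale of a full patent corpus.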
I hope this blog post and my presentation at the symposium will attract more scholars and researchers to explore the possibilities of applying Deep Learning techniques to legal innovation. In the patent domain in particular, my research provides an initial proof of concept for moving toward the coming era of human-machine co-creation. Patent text generation is just one example of how generative language models can create new value and applications in a legal domain. I believe it is only a matter of time before we see more machine-generated text in other legal domains.
Jason Lee, Deep Learning for Legal Innovation in the Patent Domain, Digital Law Asia (Feb. 3, 2022), https://digital.law.nycu.edu.tw/blog-post/f4a9qw/