Project Overview
The project builds on ‘Wetlands Tracker,’ an application that streamlines the analysis of permit applications published by the US Army Corps of Engineers (USACE) in Alabama, Texas, and Louisiana. These permits cover construction projects that affect the Gulf of Mexico’s wetlands, an ecologically sensitive area. The application combines web scraping and text extraction with OpenAI’s GPT-3.5 Large Language Model (LLM) to read and interpret thousands of permit applications published online. This automation significantly reduces the time and labor such review has traditionally required, which may help environmental advocates target their efforts more effectively. The project will begin with a comparative analysis of commercial LLMs, such as OpenAI’s GPT-3.5, against the latest open-source LLMs. We then plan to develop a prototype question-answer interface that lets users query the extensive database of PDFs in natural language, making the information more accessible.
Building on the work the Massive Data Institute (MDI) has done with community partners, we want to develop more general tools for implementing and assessing LLM applications in other policy domains. We also hope to grow and engage the community of researchers doing policy-relevant LLM work.
Team
Kumar H
MS Data Science and Public Policy / Research Assistant, Massive Data Institute @ McCourt School of Public Policy, Georgetown University
Dr. Michael A Bailey
Colonel William J. Walsh Professor of American Government, Department of Government and McCourt School of Public Policy