Discussion: Web Archiving and AI

    Posted May 23, 2024 04:34 PM

    Join the SAA Web Archiving Section on Friday, June 14, 12-1 pm ET for a discussion with Matteo Cargnelutti and Kristi Mukk of the Harvard Library Innovation Lab about web archiving and AI!

     

    Description:

     

    Can the techniques used to ground and augment the responses of Large Language Models (LLMs) also help explore web archive collections? That question led us to develop and release WARC-GPT: an experimental, open-source Retrieval-Augmented Generation (RAG) tool for exploring collections of WARC files using AI. WARC-GPT serves as a highly customizable boilerplate that the web archiving community can use to explore the intersection of web archiving and AI. Specifically, WARC-GPT is a RAG pipeline: it builds a knowledge base from a set of WARC files, then draws on that knowledge base to help answer questions posed to an LLM of the user's choosing. In this session, we will demo the tool and explain how it works, discuss our experience testing it so far, and share our perspective on how web archivists can respond to this AI moment.
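
    For readers new to RAG, the general pattern described above (build a knowledge base, retrieve relevant passages, hand them to an LLM) can be sketched roughly as follows. This is a toy illustration, not WARC-GPT's actual code: the function names, the example records, and the bag-of-words "embedding" are all assumptions made for the sketch. A real pipeline would extract text from WARC records and use a proper embedding model and vector store.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and split into alphanumeric tokens.
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    # Toy stand-in for an embedding model: a bag-of-words count vector.
    return Counter(tokenize(text))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_knowledge_base(records):
    # records: (url, text) pairs, assumed to be already extracted from WARC files.
    return [(url, text, embed(text)) for url, text in records]


def retrieve(kb, question, k=2):
    # Rank stored passages by similarity to the question and keep the top k.
    q = embed(question)
    ranked = sorted(kb, key=lambda entry: cosine(q, entry[2]), reverse=True)
    return ranked[:k]


def build_prompt(question, hits):
    # Assemble the retrieved passages into context for an LLM of the user's choosing.
    context = "\n".join(f"[{url}] {text}" for url, text, _ in hits)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical records standing in for text pulled out of a WARC collection.
records = [
    ("https://example.org/a", "The library opened a new reading room in 1998."),
    ("https://example.org/b", "Web archives store captured pages in WARC files."),
    ("https://example.org/c", "The cafeteria menu changes every week."),
]

question = "What format do web archives use to store pages?"
kb = build_knowledge_base(records)
hits = retrieve(kb, question, k=1)
prompt = build_prompt(question, hits)
print(prompt)
```

    The resulting prompt would then be sent to whichever LLM the user has configured; that call is left out here because it depends on the model and provider chosen.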

     

    Blog post: https://lil.law.harvard.edu/blog/2024/02/12/warc-gpt-an-open-source-tool-for-exploring-web-archives-with-ai/

    WARC-GPT on GitHub: https://github.com/harvard-lil/warc-gpt

     

    Registration:

     

    Please register in advance for this meeting: https://harvard.zoom.us/meeting/register/tJElceGrrDwpGdBcPae2eMWwrv4j_TqDmLcT

     

    After registering, you will receive a confirmation email containing information about joining the meeting.

    This presentation will be recorded, and the recording link will be made available afterward.

     

     

    Allison Fischbach, MLIS (she/her)
    Digital Archivist
    Alan Mason Chesney Medical Archives
    Johns Hopkins University and Medicine
    afischbach@jhmi.edu
    410-735-6782

     

    medicalarchives.jhmi.edu