Show HN: Web service to clean PDFs from potentially privacy-invasive metadata

4 points

3 years ago

docleaner is a web service created at TUD-CERT for internal use to remove potentially unwanted metadata from documents. The service currently supports PDFs and removes all metadata except the fields that various PDF standards rely on (such as markers for PDF/UA-1, PDF/VT, PDF/1 etc.) and certain namespaces (e.g. XMPRights for legal stuff). It can be used via an htmx-based web frontend or a REST-like API and is released under a BSD license. Maybe someone else is interested; feel free to use it or contribute.
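
To illustrate the "remove everything except an allowlist" idea described above, here is a minimal Python sketch. It is not docleaner's actual code: the function name `strip_metadata`, the `KEEP_PREFIXES` allowlist and the sample fields are all assumptions made up for this example, and it works on a plain dict of XMP-style keys rather than on a real PDF.

```python
# Illustrative sketch only, not taken from docleaner.
# Hypothetical allowlist of XMP-style namespace prefixes to keep
# (standard conformance markers and rights information).
KEEP_PREFIXES = ("pdfaid:", "pdfuaid:", "xmpRights:")


def strip_metadata(fields, keep_prefixes=KEEP_PREFIXES):
    """Return only the fields whose key starts with an allowlisted prefix."""
    return {
        key: value
        for key, value in fields.items()
        if key.startswith(keep_prefixes)  # str.startswith accepts a tuple
    }


# Made-up sample metadata, standing in for fields extracted from a PDF.
sample = {
    "dc:creator": "Jane Doe",
    "xmp:CreatorTool": "SomeWriter 1.0",
    "pdfuaid:part": "1",
    "xmpRights:Marked": "True",
}

cleaned = strip_metadata(sample)
print(cleaned)
```

The point of an allowlist over a blocklist is that unknown or newly introduced fields are dropped by default, so only fields someone explicitly decided to keep survive.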