I used computer vision to scan an image for text, classified whether each piece of text was one of the nearly 20 forms of protected health data using a combination of ML and heuristics, then scrubbed any matches on device by drawing a box over the text. It's not 100% accurate, but it showed good results (roughly 90% by my estimate). I also added the ability to tap on the image to manually scrub an area, and to remove any incorrect classifications.
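For anyone curious about the shape of the pipeline, here is a minimal sketch of the classify-then-scrub step in Python. It is not my actual app code: it assumes the OCR step has already produced text snippets with bounding boxes, it stands in a few illustrative regex heuristics for the ML + heuristics classifier, and the names `TextBox`, `PATTERNS`, `classify`, and `scrub` are made up for this example. The image is just a grid of grayscale pixel values.

```python
import re
from dataclasses import dataclass


@dataclass
class TextBox:
    """One piece of recognized text and its bounding box (hypothetical OCR output)."""
    text: str
    x: int
    y: int
    w: int
    h: int


# Illustrative heuristics only; a real classifier would cover far more cases.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "date": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
}


def classify(text):
    """Return the first matching label, or None if nothing looks sensitive."""
    for label, pattern in PATTERNS.items():
        if pattern.search(text):
            return label
    return None


def scrub(image, boxes):
    """Black out every box whose text is classified as sensitive.

    `image` is a list of rows of grayscale values and is modified in place.
    Returns the (label, box) pairs that were scrubbed.
    """
    height = len(image)
    width = len(image[0]) if image else 0
    scrubbed = []
    for box in boxes:
        label = classify(box.text)
        if label is None:
            continue
        # Clamp the box to the image bounds before filling it with black.
        for row in range(max(0, box.y), min(height, box.y + box.h)):
            for col in range(max(0, box.x), min(width, box.x + box.w)):
                image[row][col] = 0
        scrubbed.append((label, box))
    return scrubbed


# Usage: a blank 10x20 "image" with one sensitive and one harmless snippet.
image = [[255] * 20 for _ in range(10)]
boxes = [
    TextBox("SSN: 123-45-6789", x=2, y=1, w=5, h=2),
    TextBox("Follow up in two weeks", x=0, y=5, w=10, h=2),
]
scrubbed = scrub(image, boxes)
```

The manual tap-to-scrub and undo features would sit on top of this: a tap adds a box to the scrub list regardless of classification, and removing an incorrect classification drops a box from it before the image is redrawn.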
Next up, I'd like to improve the accuracy of the model and then build a desktop client that does the same kind of detection on different documents. One piece of feedback that came up during my pitch was whether this could be used to identify other kinds of information (e.g. financial), so I'd like to experiment with generalizing the model to different domains.
Figured I'd post for more feedback - what a fun event, and it was so great to talk to aspiring founders, YC alumni, and partners!