Customizing redaction with pdfSweep

Tags: pdfSweep, redact, redaction, remove from PDF, remove text

pdfSweep also makes it easy to customize the redaction process. This example shows how to change the color of the redaction annotation so that it matches the color of the text that was redacted, which gives a more visually consistent result.

We start by defining a class that extends CharacterRenderInfo and keeps track of the stroke and fill color of each character.


class CCharacterRenderInfo extends CharacterRenderInfo {

    // Colors copied from the TextRenderInfo this character was built from
    private Color strokeColor;
    private Color fillColor;

    public CCharacterRenderInfo(TextRenderInfo tri) {
        super(tri);
        this.strokeColor = tri.getStrokeColor();
        this.fillColor = tri.getFillColor();
    }

    public Color getStrokeColor() {
        return strokeColor;
    }

    public Color getFillColor() {
        return fillColor;
    }
}

This class can then be used to build a custom strategy that knows the color of the redacted elements.


// java.util imports used below; the iText imports are omitted here
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CustomLocationExtractionStrategy extends RegexBasedLocationExtractionStrategy implements ICleanupStrategy {

    private String regex;
    // Remembers which color belongs to each rectangle that will be redacted
    private Map<Rectangle, Color> colorByRectangle = new HashMap<>();

    public CustomLocationExtractionStrategy(String regex) {
        super(regex);
        this.regex = regex;
    }

    // Wrap every character in a CCharacterRenderInfo so its color is kept
    @Override
    public List<CharacterRenderInfo> toCRI(TextRenderInfo tri) {
        List<CharacterRenderInfo> cris = new ArrayList<>();
        for (TextRenderInfo subTri : tri.getCharacterRenderInfos()) {
            cris.add(new CCharacterRenderInfo(subTri));
        }
        return cris;
    }

    // Use the fill color of the first character for all rectangles of this match
    @Override
    public List<Rectangle> toRectangles(List<CharacterRenderInfo> cris) {
        Color col = ((CCharacterRenderInfo) cris.get(0)).getFillColor();
        List<Rectangle> rects = new ArrayList<>(super.toRectangles(cris));
        for (Rectangle rect : rects) {
            colorByRectangle.put(rect, col);
        }
        return rects;
    }

    // Fall back to black when no color was recorded for the location
    @Override
    public Color getRedactionColor(IPdfTextLocation location) {
        Color col = colorByRectangle.get(location.getRectangle());
        return col != null ? col : Color.BLACK;
    }

    // Return a fresh strategy built from the same regex
    @Override
    public Object clone() {
        return new CustomLocationExtractionStrategy(regex);
    }
}
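
The bookkeeping in colorByRectangle can be tried out without iText. The sketch below is only an illustration under stated assumptions: Box is a hypothetical stand-in for a rectangle, colors are plain strings, and none of the names are part of the pdfSweep API. It shows the "remember a color per rectangle, fall back to black" pattern, and it gives the key type value-based equals and hashCode so that a lookup with an equal but distinct instance still succeeds.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class ColorLookupSketch {

    /** Hypothetical stand-in for a rectangle; NOT an iText type. */
    public static final class Box {
        final float x, y, width, height;

        public Box(float x, float y, float width, float height) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }

        // Value-based equality so that equal boxes map to the same entry
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Box)) {
                return false;
            }
            Box b = (Box) o;
            return Float.compare(x, b.x) == 0 && Float.compare(y, b.y) == 0
                    && Float.compare(width, b.width) == 0
                    && Float.compare(height, b.height) == 0;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y, width, height);
        }
    }

    // Assumed default, mirroring the black fallback in the strategy
    public static final String DEFAULT_COLOR = "black";

    private final Map<Box, String> colorByBox = new HashMap<>();

    /** Record the color that belongs to a box. */
    public void remember(Box box, String color) {
        colorByBox.put(box, color);
    }

    /** Look up the recorded color, or the default if none was recorded. */
    public String colorFor(Box box) {
        return colorByBox.getOrDefault(box, DEFAULT_COLOR);
    }
}
```

If the key type relied on identity instead of value equality, the lookup would only succeed for the exact instance that was stored, so it is worth checking how the rectangle type in your iText version compares instances.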

The custom strategy can easily be called with the following code:


// load license key
LicenseKey.loadLicenseFile(licenceFile);

String input = "iphone_user_guide_untagged.pdf";
String output = "redactIPhoneUserManualMatchColor.pdf";

CompositeCleanupStrategy strategy = new CompositeCleanupStrategy();
strategy.add(new CustomLocationExtractionStrategy("(iphone)|(iPhone)"));

PdfDocument pdf = new PdfDocument(new PdfReader(input), new PdfWriter(output));

// sweep
PdfAutoSweep autoSweep = new PdfAutoSweep(strategy);
autoSweep.cleanUp(pdf);

pdf.close();
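
Before running a sweep it can help to check what the regular expression actually matches. The sketch below uses only java.util.regex and assumes the strategy applies the pattern with the same semantics; RegexCheck and findMatches are hypothetical helper names, not part of pdfSweep. It also shows that the shorter pattern "[iI]phone" finds the same matches in this sample as "(iphone)|(iPhone)".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexCheck {

    /** Collect every substring of text that the regex matches, in order. */
    public static List<String> findMatches(String regex, String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        String text = "Your iPhone and my iphone, but not an IPHONE.";
        // Pattern used in the example above
        System.out.println(findMatches("(iphone)|(iPhone)", text)); // prints [iPhone, iphone]
        // Shorter pattern with a character class
        System.out.println(findMatches("[iI]phone", text));         // prints [iPhone, iphone]
    }
}
```

Neither pattern matches the all-caps "IPHONE", so a document that uses other capitalizations would need a broader pattern.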

This is what the example output document looks like, post redaction:

Custom redaction with pdfSweep