pdfSweep: AutoSweep example

Tags: iText 7

The basic pdfSweep workflow has just two easy steps:

  • Select the parts of the document that must be redacted, either by specifying their coordinates or by supplying a regular expression that fits your needs. pdfSweep already provides a substantial list of common regular expressions (social security numbers, phone numbers, dates, etc.) to do some of the heavy lifting for you.

  • Pass the locations to pdfSweep, or invoke it with the pattern(s) of your choice.
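
To make the pattern-based selection step more concrete, here is a minimal sketch in plain Java of how a regular expression picks out the text to redact. It uses only `java.util.regex`; the class name `PatternSelectionSketch`, the helper `findMatches`, and the social security number pattern are assumptions made for illustration and are not taken from the list of expressions that ships with pdfSweep.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatternSelectionSketch {

    // Hypothetical pattern for US social security numbers such as 123-45-6789.
    // This is an illustration, not one of the expressions provided by pdfSweep.
    static final Pattern SSN = Pattern.compile("\\b\\d{3}-\\d{2}-\\d{4}\\b");

    // Collect every substring of the text that the pattern matches.
    static List<String> findMatches(Pattern pattern, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "Applicant SSN: 123-45-6789, phone: 555-0100.";
        // Only the SSN-shaped token is selected; the phone number is left alone.
        System.out.println(findMatches(SSN, text)); // prints [123-45-6789]
    }
}
```

Testing a pattern against sample text like this before running a redaction helps confirm that it selects neither too much nor too little.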

This is an AutoSweep example that redacts the words 'Alice', 'White Rabbit', and 'Rabbit' (regardless of casing). It marks all occurrences of 'Alice' with a pink rectangle, and all occurrences of 'Rabbit' with a gray rectangle.


// Load the license to be able to use pdfSweep 
LicenseKey.loadLicenseFile(licenceFile); 
 
String input = "AliceInWonderLand.pdf"; 
String output = "AliceInWonderLandRedacted.pdf"; 

// define a composite strategy 
CompositeCleanupStrategy strategy = new CompositeCleanupStrategy(); 
strategy.add(new RegexBasedCleanupStrategy("Alice")
            .setRedactionColor(Color.PINK)); 
strategy.add(new RegexBasedCleanupStrategy("(White )*Rabbit")
            .setRedactionColor(Color.GRAY)); 
 
PdfDocument pdf = new PdfDocument(new PdfReader(input), new PdfWriter(output)); 
 
// sweep 
PdfAutoSweep autoSweep = new PdfAutoSweep(strategy); 
autoSweep.cleanUp(pdf);

// close document 
pdf.close();

This is the original document:

pdfSweep input document example

And this is ‘post redaction’:

pdfSweep output example

Figure 1: pdfSweep original input document
Figure 2: pdfSweep redacted output document

As this example document and its accompanying code show, you can define a custom color for each pattern of text to be redacted.
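
To see which strings a pattern such as "(White )*Rabbit" selects, it can be tried against sample text with plain `java.util.regex`. The sketch below assumes that the pattern string follows standard Java regex syntax; whether pdfSweep applies any additional flags when it compiles the pattern is not shown in this example. The class name `RabbitPatternSketch` and the helper `findMatches` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RabbitPatternSketch {

    // Return every match of the regex in the text, in order of appearance.
    static List<String> findMatches(String regex, String text) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(text);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "The White Rabbit ran. A rabbit-hole! Alice followed the Rabbit.";

        // Compiled without flags, a Java regex is case-sensitive.
        System.out.println(findMatches("(White )*Rabbit", text));
        // prints [White Rabbit, Rabbit]

        // The inline flag (?i) requests case-insensitive matching.
        System.out.println(findMatches("(?i)(White )*Rabbit", text));
        // prints [White Rabbit, rabbit, Rabbit]
    }
}
```

Because "(White )*" is greedy, 'White Rabbit' is selected as a single match rather than as a bare 'Rabbit', so the whole phrase ends up under one redaction rectangle.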

Similarly, you could also have specified the exact locations yourself. The following code snippet demonstrates that use case:


String input = "iphone_user_guide_untagged.pdf"; 
String output = "iphone_user_guide_redacted.pdf";        
 
// load licensekey 
LicenseKey.loadLicenseFile("itext_dev_master_license.xml"); 
 
// define rectangles 
List<Rectangle> rects = Arrays.asList(new Rectangle(60f, 80f, 460f, 65f), new Rectangle(300f, 370f, 215f, 260f)); 
 
// turn rectangles into cleanup locations 
int pagesNum = 130; 
List<PdfCleanUpLocation> cleanUpLocations = new ArrayList<>(); 
for (int i = 0; i < pagesNum; ++i) {    
    for (int j = 0; j < rects.size(); ++j) {        
        cleanUpLocations.add(new PdfCleanUpLocation(i + 1, rects.get(j)));    
    } 
} 
 
// open document 
PdfDocument pdfDocument = new PdfDocument(new PdfReader(input), new PdfWriter(output)); 
 
// perform cleanup 
PdfCleanUpTool cleaner = new PdfCleanUpTool(pdfDocument, cleanUpLocations); 
cleaner.cleanUp(); 
 
// close document 
pdfDocument.close();
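
The rectangles in the snippet above are given as x, y, width, and height, where (x, y) is the lower-left corner. If you measured two opposite corners of a region instead, the conversion is simple arithmetic. The following sketch illustrates it with stand-in types so that it runs without iText on the classpath; `LocationSketch`, `Box`, `Location`, `fromCorners`, and `onEveryPage` are hypothetical names and are not part of the pdfSweep API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LocationSketch {

    // Stand-in for a rectangle: lower-left corner plus width and height.
    static final class Box {
        final float x, y, width, height;

        Box(float x, float y, float width, float height) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    // Stand-in for a (page number, rectangle) pair; page numbers start at 1.
    static final class Location {
        final int page;
        final Box box;

        Location(int page, Box box) {
            this.page = page;
            this.box = box;
        }
    }

    // Convert two opposite corners, in any order, into x/y/width/height form.
    static Box fromCorners(float x1, float y1, float x2, float y2) {
        return new Box(Math.min(x1, x2), Math.min(y1, y2),
                Math.abs(x2 - x1), Math.abs(y2 - y1));
    }

    // Repeat every box on every page from 1 to pageCount.
    static List<Location> onEveryPage(List<Box> boxes, int pageCount) {
        List<Location> locations = new ArrayList<>();
        for (int page = 1; page <= pageCount; page++) {
            for (Box box : boxes) {
                locations.add(new Location(page, box));
            }
        }
        return locations;
    }

    public static void main(String[] args) {
        // Corners (60, 80) and (520, 145) give x=60, y=80, width=460, height=65,
        // the same values as the first rectangle in the snippet above.
        Box header = fromCorners(60f, 80f, 520f, 145f);
        List<Location> locations = onEveryPage(Arrays.asList(header), 3);
        System.out.println(locations.size()); // prints 3
    }
}
```

In real code the page count would typically come from the opened document rather than from a hard-coded number, so that the loop never refers to a page that does not exist.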