Find and Replace
You can easily navigate within your document using a keyboard and mouse, but if you have many pages to scroll through, it will take quite a while to find specific text in a long document. It will be more time consuming when you want to replace certain characters or words that you have used in your document. The “Find and replace” functionality enables you to find a sequence of characters in a document and replace it with another sequence of characters.
Aspose.Words allows you to find a specific string or regular expression pattern in your document and replace it with an alternative without installing and using additional applications such as Microsoft Word. This will speed up many typing and formatting tasks, potentially saving you hours of work.
This article explains how to apply string replacement and regular expressions with the support of metacharacters.
Ways to Find and Replace
Aspose.Words provides two ways to apply the find and replace operation by using the following:
- Simple string replacement – to find and replace a specific string with another, you need to specify a search string (alphanumeric characters) that is going to be replaced according to all occurrences with another specified replacement string. Both strings must not contain symbols. Take into account that string comparison can be case-sensitive, or you may be unsure of spelling or have several similar spellings.
- Regular expressions – to specify a regular expression to find the exact string matches and replace them according to your regular expression. Note that a word is defined as being made up of only alphanumeric characters. If a replacement is executed with only whole words being matched and the input string happens to contain symbols, then no phrases will be found.
Also, you can use special metacharacters with simple string replacement and regular expressions to specify breaks within the find and replace operation.
Aspose.Words presents the find and replace functionality with the IReplacingCallBack. You can work with many options during the find and replace process using FindReplaceOptions class.
Find and Replace Text Using Simple String Replacement
You can use one of the Replace methods to find or replace a particular string and return the number of replacements that were made. In this case, you can specify a string to be replaced, a string that will replace all its occurrences, whether the replacement is case-sensitive, and whether only stand-alone words will be affected.
The following code example shows how to find the string “CustomerName” and replace it with the string “James Bond”:
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java | |
// Load a Word DOCX document by creating an instance of the Document class. | |
Document doc = new Document(); | |
DocumentBuilder builder = new DocumentBuilder(doc); | |
builder.writeln("Hello _CustomerName_,"); | |
// Specify the search string and replace string using the Replace method. | |
doc.getRange().replace("_CustomerName_", "James Bond", new FindReplaceOptions()); | |
// Save the result. | |
doc.save(dataDir + "Range.ReplaceSimple.docx"); |
You can notice the difference between the document before applying simple string replacement:
And after applying simple string replacement:
Find and Replace Text Using Regular Expressions
A regular expression (regex) is a pattern that describes a certain sequence of text. Suppose you want to replace all double occurrences of a word with a single word occurrence. Then you can apply the following regular expression to specify the double-word pattern: ([a-zA-Z]+) \1
.
Use the other Replace method to search and replace particular character combinations by setting the Regex
parameter as the regular expression pattern to find matches.
The following code example shows how to replace strings that match a regular expression pattern with a specified replacement string:
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java | |
Document doc = new Document(); | |
DocumentBuilder builder = new DocumentBuilder(doc); | |
builder.writeln("sad mad bad"); | |
if(doc.getText().trim() == "sad mad bad") | |
{ | |
System.out.println("Strings are equal!"); | |
} | |
// Replaces all occurrences of the words "sad" or "mad" to "bad". | |
FindReplaceOptions options = new FindReplaceOptions(); | |
doc.getRange().replace(Pattern.compile("[s|m]ad"), "bad", options); | |
// Save the Word document. | |
doc.save(dataDir + "Range.ReplaceWithRegex.docx"); |
You can notice the difference between the document before applying string replacement with regular expressions:
And after applying string replacement with regular expressions:
Find and Replace String Using Metacharacters
You can use metacharacters in the search string or the replacement string if a particular text or phrase is composed of multiple paragraphs, sections, or pages. Some of the metacharacters include &p for a paragraph break, &b for a section break, &m for a page break, and &l for a line break.
The following code example shows how to replace text with paragraph and page break:
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java | |
Document doc = new Document(); | |
DocumentBuilder builder = new DocumentBuilder(doc); | |
builder.getFont().setName("Arial"); | |
builder.writeln("First section"); | |
builder.writeln(" 1st paragraph"); | |
builder.writeln(" 2nd paragraph"); | |
builder.writeln("{insert-section}"); | |
builder.writeln("Second section"); | |
builder.writeln(" 1st paragraph"); | |
FindReplaceOptions options = new FindReplaceOptions(); | |
options.getApplyParagraphFormat().setAlignment(ParagraphAlignment.CENTER); | |
// Double each paragraph break after word "section", add kind of underline and make it centered. | |
int count = doc.getRange().replace("section&p", "section&p----------------------&p", options); | |
// Insert section break instead of custom text tag. | |
count = doc.getRange().replace("{insert-section}", "&b", options); | |
doc.save(dataDir + "ReplaceTextContaingMetaCharacters_out.docx"); |
Find and Replace String in Header/Footer of a Document
You can find and replace text in the header/footer section of a Word document using the HeaderFooter class.
The following code example shows how to replace the text of the header section in your document:
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java | |
// Open the template document, containing obsolete copyright information in the footer. | |
Document doc = new Document(dataDir + "HeaderFooter.ReplaceText.doc"); | |
// Access header of the Word document. | |
HeaderFooterCollection headersFooters = doc.getFirstSection().getHeadersFooters(); | |
HeaderFooter header = headersFooters.get(HeaderFooterType.HEADER_PRIMARY); | |
// Set options. | |
FindReplaceOptions options = new FindReplaceOptions(); | |
options.setMatchCase(false); | |
options.setFindWholeWordsOnly(false); | |
// Replace text in the header of the Word document. | |
header.getRange().replace("Aspose.Words", "Remove", options); | |
// Save the Word document. | |
doc.save(dataDir + "HeaderReplace.docx"); |
You can notice the difference between the document before applying header string replacement:
And after applying header string replacement:
The code example to replace the text of the footer section in your document is very similar to the previous header code example. All you need to do is replace the following two lines:
HeaderFooter header = headersFooters.get(HeaderFooterType.HEADER_PRIMARY);
header.getRange().replace("Aspose.Words", "Remove", options);
With the following:
You can notice the difference between the document before applying footer string replacement:
And after applying footer string replacement:
Ignore Text During Find and Replace
While applying the find and replace operation, you can ignore certain segments of the text. So, certain parts of the text can be excluded from the search, and the find and replace can be applied only to the remaining parts.
Aspose.Words provides many find and replace properties for ignoring text such as IgnoreDeleted, IgnoreFieldCodes, IgnoreFields, IgnoreFootnotes, and IgnoreInserted.
The following code example shows how to ignore text inside delete revisions:
// Create new document. | |
Document doc = new Document(); | |
DocumentBuilder builder = new DocumentBuilder(doc); | |
// Insert non-revised text. | |
builder.writeln("Deleted"); | |
builder.write("Text"); | |
// Remove first paragraph with tracking revisions. | |
doc.startTrackRevisions("author", new Date()); | |
doc.getFirstSection().getBody().getFirstParagraph().remove(); | |
doc.stopTrackRevisions(); | |
Pattern regex = Pattern.compile("e", Pattern.CASE_INSENSITIVE); | |
FindReplaceOptions options = new FindReplaceOptions(); | |
// Replace 'e' in document ignoring deleted text. | |
options.setIgnoreDeleted(true); | |
doc.getRange().replace(regex, "*", options); | |
System.out.println(doc.getText()); // The output is: Deleted\rT*xt\f | |
// Replace 'e' in document NOT ignoring deleted text. | |
options.setIgnoreDeleted(false); | |
doc.getRange().replace(regex, "*", options); | |
System.out.println(doc.getText()); // The output is: D*l*t*d\rT*xt\f |
Customize Find and Replace Operation
Aspose.Words provides many different properties to find and replace text such as applying specific format with ApplyFont and ApplyParagraphFormats properties, using substitutions in replacement patterns with UseSubstitutions property, and others.
The following code example shows how to highlight a specific word in your document:
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java | |
// Highlight word "the" with yellow color. | |
FindReplaceOptions options = new FindReplaceOptions(); | |
options.getApplyFont().setHighlightColor(Color.YELLOW); | |
// Replace highlighted text. | |
doc.getRange().replace("the", "the", options); |
Aspose.Words allows you to use the IReplacingCallback interface to create and call a custom method during a replace operation. You may have some use cases where you need to customize the find and replace operation such as replacing text specified with a regular expression with HTML tags, so basically you will apply replace with inserting HTML.
If you need to replace a string with an HTML tag, apply the IReplacingCallback interface to customize the find and replace operation so the match starts at the beginning of a run with the match node of your document. Let us provide several examples of using IReplacingCallback.
The following code example shows how to replace text specified with HTML:
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java | |
public static void ReplaceWithHtml() throws Exception { | |
Document doc = new Document(); | |
DocumentBuilder builder = new DocumentBuilder(doc); | |
builder.writeln("Hello <CustomerName>,"); | |
FindReplaceOptions options = new FindReplaceOptions(); | |
options.setReplacingCallback(new ReplaceWithHtmlEvaluator()); | |
doc.getRange().replace(Pattern.compile(" <CustomerName>,"), "", options); | |
//doc.getRange().replace(" <CustomerName>,", html, options); | |
// Save the modified document. | |
doc.save(dataDir + "Range.ReplaceWithInsertHtml.doc"); | |
System.out.println("\nText replaced with meta characters successfully.\nFile saved at " + dataDir); | |
} | |
static class ReplaceWithHtmlEvaluator implements IReplacingCallback { | |
public int replacing(ReplacingArgs e) throws Exception { | |
// This is a Run node that contains either the beginning or the complete match. | |
Node currentNode = e.getMatchNode(); | |
// create Document Buidler and insert MergeField | |
DocumentBuilder builder = new DocumentBuilder((Document) e.getMatchNode().getDocument()); | |
builder.moveTo(currentNode); | |
// Replace '<CustomerName>' text with a red bold name. | |
builder.insertHtml("<b><font color='red'>James Bond, </font></b>");e.getReplacement(); | |
currentNode.remove(); | |
//Signal to the replace engine to do nothing because we have already done all what we wanted. | |
return ReplaceAction.SKIP; | |
} | |
} |
The following code example shows how to highlight positive numbers with green color and negative numbers with red color:
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java | |
// Replace and Highlight Numbers. | |
static class NumberHighlightCallback implements IReplacingCallback { | |
public int replacing (ReplacingArgs args) throws Exception { | |
Node currentNode = args.getMatchNode(); | |
// Let replacement to be the same text. | |
args.setReplacement(currentNode.getText()); | |
int val = currentNode.hashCode(); | |
// Apply either red or green color depending on the number value sign. | |
FindReplaceOptions options = new FindReplaceOptions(); | |
if(val > 0) | |
{ | |
options.getApplyFont().setColor(Color.GREEN); | |
} | |
else | |
{ | |
options.getApplyFont().setColor(Color.RED); | |
} | |
return ReplaceAction.REPLACE; | |
} | |
} |
The following code example shows how to prepend a line number to each line:
// For complete examples and data files, please go to https://github.com/aspose-words/Aspose.Words-for-Java | |
public static void TestLineCounter() throws Exception { | |
// Create a document. | |
Document doc = new Document(); | |
DocumentBuilder builder = new DocumentBuilder(doc); | |
// Add lines of text. | |
builder.writeln("This is first line"); | |
builder.writeln("Second line"); | |
builder.writeln("And last line"); | |
// Prepend each line with line number. | |
FindReplaceOptions opt = new FindReplaceOptions(); | |
opt.setReplacingCallback(new LineCounterCallback()); | |
doc.getRange().replace(Pattern.compile("[^&p]*&p"), "", opt); | |
doc.save(dataDir + "TestLineCounter.docx"); | |
} | |
static class LineCounterCallback implements IReplacingCallback | |
{ | |
private int mCounter = 1; | |
public int replacing(ReplacingArgs args) throws Exception { | |
Node currentNode = args.getMatchNode(); | |
System.out.println(currentNode.getText()); | |
args.setReplacement(mCounter++ +"."+ currentNode.getText()); | |
return ReplaceAction.REPLACE; | |
} | |
} |