Text Processor, Sample Work

The task is to complete the method bodies of the TextProcessor class.


The TextProcessor class will be used as part of an application that gathers information about word use in documents. The purpose of an instance of the class is to keep a count of how many times each word in a document is used. It must also be able to respond to questions about the words. Some examples are given below, after the descriptions of the methods.


Your implementation of TextProcessor will be tested for correctness by a test class developed by the marker, so it is worth creating your own test class as you develop the TextProcessor class.


Although your work is evaluated partly on correctness demonstrated through testing, the implementation should also be of good quality, so you will be assessed on the appropriateness of your solution as well as its correctness. Nevertheless, correctness is, in general, valued over efficiency, so don't be satisfied with a highly efficient but slightly incorrect solution.


The TextProcessor class

The class provided in the starting project consists of a constructor and four methods. None of them has a useful implementation yet, and you must add code to complete them. You must not modify the names, parameters, or return types of those elements. You may add further non-public methods if you wish, and you have complete freedom over the fields you define, provided that all instance fields are kept private. The class must be implemented to provide the following functionality:


• addWord: The parameter represents a case-sensitive word that has been found in a document. Because an instance must keep track of how many times each word has been added, the count of occurrences for that word must be increased by 1. You may assume that all strings passed to this method will consist only of one or more alphabetic characters and there is no need to check that this is the case. In other words, all parameters will be valid words.


• howMany: The parameter represents a case-sensitive word that has been found in a document. The method must return the number of times that exact word has been passed to the addWord method.


• suffixCount: This method takes a single parameter and must return the number of distinct words added to the processor so far that the parameter is a suffix of. The parameter counts as a suffix of a word if the word ends with the parameter but does not exactly match it. For instance, the parameter "thing" matches the ending of all of "Something", "something", "nothing", and "thing", but only the first three qualify as distinct suffix matches, so the result would be 3. The count must represent distinct words, so each added word that ends with the suffix must be counted only once. For instance, if "something" has been added 3 times, that counts as only 1 match for "thing".


• anagramCount: [challenging] This method takes a single parameter and must return the number of distinct case-insensitive anagrams of the given word that have been added to the processor. An anagram consists of exactly the same number of letters as the given word, with each character occurring the same number of times. However, identical words are not considered to be anagrams. For instance, "and" is an anagram of "dan", but "nan" is neither an anagram of "naan" nor of "nan". Note that because the match must be case-insensitive, "Dan" is also an anagram of "and". For the same reason, all of "Dan", "dan", and "DAN" count as only a single match for any of "and", "AND", and "And". Also, each anagram is counted only once, regardless of how many times it has been added to the processor via addWord.
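
The four behaviours described above can be sketched in Java as follows. This is only one possible reading of the specification, not the expected solution. The class name TextProcessorSketch, the int return types, and the String parameters are assumptions, because the starting project's signatures are not reproduced here. The sketch also assumes that suffix matching is case-sensitive and that two added words count as the same anagram when they are equal after lower-casing.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for TextProcessor; names and signatures are assumptions.
class TextProcessorSketch {
    // Maps each case-sensitive word to the number of times it has been added.
    private final Map<String, Integer> counts = new HashMap<>();

    // Increases the count for the word by 1, starting from 1 if it is new.
    void addWord(String word) {
        counts.merge(word, 1, Integer::sum);
    }

    // Returns how many times the exact word was added, or 0 if never added.
    int howMany(String word) {
        return counts.getOrDefault(word, 0);
    }

    // Counts distinct added words that end with the suffix but are longer
    // than it, so an exact match is excluded (assumed case-sensitive).
    int suffixCount(String suffix) {
        int matches = 0;
        for (String word : counts.keySet()) {
            if (word.length() > suffix.length() && word.endsWith(suffix)) {
                matches++;
            }
        }
        return matches;
    }

    // Counts distinct case-insensitive anagrams of the word among added words.
    int anagramCount(String word) {
        String target = word.toLowerCase(Locale.ROOT);
        String targetKey = sortedLetters(target);
        // Lower-cased forms, so "Dan", "dan", and "DAN" collapse to one entry.
        Set<String> anagrams = new HashSet<>();
        for (String added : counts.keySet()) {
            String lower = added.toLowerCase(Locale.ROOT);
            // Same letters in any order, but not the identical word.
            if (!lower.equals(target) && sortedLetters(lower).equals(targetKey)) {
                anagrams.add(lower);
            }
        }
        return anagrams.size();
    }

    // Returns the letters of an already lower-cased word in sorted order.
    private static String sortedLetters(String lower) {
        char[] letters = lower.toCharArray();
        Arrays.sort(letters);
        return new String(letters);
    }
}
```

With this sketch, adding "Something" once, "something" three times, "nothing" once, and "thing" once gives howMany("something") equal to 3 and suffixCount("thing") equal to 3. Scanning every stored word on each query favours simplicity over speed, which is in line with the brief's preference for correctness over efficiency.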


The TextProcessMain class

This class has been provided to help you check your implementation. You may change it as much as you wish, and you will not be assessed on its contents. It currently provides a small amount of literal text as a sample document and makes a few calls on the TextProcessor's methods to check their functionality. It is not a complete test class. You can pass any text document to the program as input by passing its file name to the main method as a parameter when you run the program. If a file is provided, then the text in that file will be analyzed rather than the literal text.

Example data

Consider the following text found in a document that is being analyzed using an instance of TextProcessor:

