FilePhraser
-----------

This distribution contains the FilePhraser, a component that recognises phrases in a file. It combines distribution 008 (Phraser) with all the components needed to operate directly on text files.

Usage: construct a FilePhraser with the name of the file you want processed, register any WordEvent listeners using the addWordListener() method, and start processing by calling the FilePhraser's start() method. The registered listeners receive each word as it is read, including any mark-up that signifies phrase boundaries. You can extract the data from an event by calling its getValue() method with the parameter "token" or "pos"; getValue("token") returns null once the end of the input file has been reached. Mark-up (such as sentence boundary markers) is also delivered as events; mark-up tokens are normally enclosed in pointed brackets (e.g. the tags for sentence boundaries and for phrases).

This distribution contains all the required classes, including those which tie together the separate components.

You can also run the FilePhraser as a stand-alone component. In this case, supply the names of the files to be processed as command-line parameters; each token is printed on a separate line to System.out with any associated tag.

(C) 2000 Phrasys (www.phrasys.com / www.phrasys.co.uk)

The Phraser listens to incoming WordEvents and adds mark-up tags around any phrases it can find. The form of these tags depends on which phrases are to be recognised. The mark-up creates extra WordEvents, which are added to the stream of incoming events.

Usage: construct a Phraser, and register it with a component that produces WordEvents (such as the Tokeniser or the tagger). If you are using part-of-speech tags in the resource file, you will have to use a tagger as the source in order to have access to the "pos" value of the incoming tokens.
If you want to use a resource file other than the one provided, you can also specify which one to use. Then register with the Phraser a component that will listen to WordEvents containing phrase mark-up. The end of the input is marked by an event whose "token" value is null.

For more information, please consult the javadoc-generated documentation on the website.

(C) 2000 Phrasys (www.phrasys.com / www.phrasys.co.uk)
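
The listener side of the usage notes above can be sketched as follows. This is a minimal, self-contained illustration, not the distribution's own code. Only addWordListener(), start(), getValue() and the "token" and "pos" keys come from this README. The WordListener interface, its wordReceived() callback, the WordEvent constructor, the StubWordSource class (which stands in for a FilePhraser or Phraser), the sample part-of-speech values and the "<tag>" placeholder mark-up are all assumptions made for the example; the javadoc documentation should be consulted for the real names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Entry point: wires a listener to a stub source and prints what it received.
class ListenerSketch {
    public static void main(String[] args) {
        // Sample data is invented; "<tag>" is a placeholder, not a real tag name.
        StubWordSource source = new StubWordSource(new String[][] {
            {"the", "DET"}, {"<tag>", null}, {"cat", "N"}
        });
        CollectingListener listener = new CollectingListener();
        source.addWordListener(listener);
        source.start();
        for (String line : listener.lines) {
            System.out.println(line);
        }
    }
}

// Hypothetical stand-in for WordEvent; only getValue(String) is from the README.
class WordEvent {
    private final Map<String, String> values = new HashMap<>();

    WordEvent(String token, String pos) {
        values.put("token", token);
        values.put("pos", pos);
    }

    // Returns the value stored under "token" or "pos", or null if there is none.
    String getValue(String key) {
        return values.get(key);
    }
}

// Hypothetical listener interface; the real name and callback may differ.
interface WordListener {
    void wordReceived(WordEvent event);
}

// Builds one output line per token and notes when the end of input is seen.
class CollectingListener implements WordListener {
    final List<String> lines = new ArrayList<>();
    boolean finished = false;

    @Override
    public void wordReceived(WordEvent event) {
        String token = event.getValue("token");
        if (token == null) {
            // A null "token" marks the end of the input.
            finished = true;
            return;
        }
        String pos = event.getValue("pos");
        // Mark-up tokens carry no "pos" value in this sketch, so print them bare.
        lines.add(pos == null ? token : token + "\t" + pos);
    }
}

// Stub event source imitating the addWordListener() / start() pattern.
class StubWordSource {
    private final List<WordListener> listeners = new ArrayList<>();
    private final String[][] words;

    StubWordSource(String[][] words) {
        this.words = words;
    }

    void addWordListener(WordListener listener) {
        listeners.add(listener);
    }

    // Sends every word to every listener, then an end-of-input event.
    void start() {
        for (String[] word : words) {
            fire(new WordEvent(word[0], word[1]));
        }
        fire(new WordEvent(null, null));
    }

    private void fire(WordEvent event) {
        for (WordListener listener : listeners) {
            listener.wordReceived(event);
        }
    }
}
```

In real use the stub source would be replaced by a FilePhraser constructed with a file name, or by a Phraser registered with a Tokeniser or tagger, as described above.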