PURPOSE
: JT-Filter performs a variety of "filter" operations on a selected field.
WHY
: Fields often require special processing to make them suitable for use.
Filters will be provided for such things as proper capitalization (particularly
of names), genderization, standardization of phone numbers, splitting of
city/state/zip, and many others.
USAGE
: JT-Filter Infile Ffile Outfile
Infile
- This is the tab-delimited input file containing the source data.
Ffile
- This file lists the field # and the filter to use on it.
It is a tab-delimited file giving the field # (starting with 1)
and the name of the filter to apply to that field (e.g. UPPERCASE 1).
Some filters take further arguments
(e.g. SUBSTITUTE # oldstring newstring CASE|NOCASE).
Some filters divide a field into several fields: NAME (e.g. salutation, first,
middle, last, suffix, gender, ethnicity) or CSZ (e.g. city, state, zip5, zip4, zip2).
A wide variety of filters will be developed as time and need permit.
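As an illustration of the filter file layout, here is a minimal Python sketch that parses such lines. It assumes the filter name comes first on each line, followed by the field number and any further arguments, as in the examples above. The names `FilterSpec` and `parse_filter_file` are hypothetical and are not part of JT-Filter.

```python
from typing import List, NamedTuple


class FilterSpec(NamedTuple):
    name: str        # filter name, e.g. "UPPERCASE"
    field: int       # 1-based field number, as written in the filter file
    args: List[str]  # any further arguments the filter takes


def parse_filter_file(text: str) -> List[FilterSpec]:
    """Parse tab-delimited filter lines such as 'UPPERCASE<TAB>1'.

    Assumption: the filter name is the first column and the field
    number is the second; everything after that is passed through.
    """
    specs = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, field, *args = line.split("\t")
        specs.append(FilterSpec(name.upper(), int(field), args))
    return specs


example = "UPPERCASE\t1\nSUBSTITUTE\t2\told\tnew\tNOCASE\n"
specs = parse_filter_file(example)
```

Each parsed entry keeps the field number 1-based, so code that indexes a row would subtract 1 before use.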
CURRENT FILTERS INCLUDE:
* LOWERCASE field
* UPPERCASE field
* CAPITALIZE field
* SWAP field1 field2
Exchanges the contents of field1 and field2.
* CHARACTERFIX field fname
The file fname contains lines of 2 characters each.
The first character is the search character and
the second is the replacement character (e.g. Aa, Bb).
This can be used to convert extended Roman characters
to the nearest regular letters.
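The character mapping described above can be sketched in Python as follows. This is only an illustration of the idea, not JT-Filter's own code; `load_char_map` and `character_fix` are hypothetical names, and the sample mapping lines are made up for the example.

```python
def load_char_map(lines):
    """Build a translation table from lines of 'search char + replacement char'."""
    mapping = {}
    for line in lines:
        pair = line.rstrip("\r\n")
        if len(pair) != 2:
            continue  # assumption: lines that are not exactly 2 characters are skipped
        mapping[pair[0]] = pair[1]
    return str.maketrans(mapping)


def character_fix(row, field, table):
    """Return a copy of row with the 1-based field translated through table."""
    fixed = list(row)
    fixed[field - 1] = fixed[field - 1].translate(table)
    return fixed


# Hypothetical mapping file contents, one pair per line.
table = load_char_map(["ée", "ñn", "ÜU"])
fixed_row = character_fix(["José Muñoz", "ÜBER"], 1, table)
```

Only the selected field is changed; the second field in the sample row keeps its original characters.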
* REPLACE field findstring replacestring
Finds all occurrences of findstring in
the specified field and replaces them with
replacestring. Searches are NOT case sensitive.
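A case-insensitive replacement of this kind might look like the following Python sketch. The function name `replace_field` is hypothetical, and the sketch assumes findstring is matched literally rather than as a pattern.

```python
import re


def replace_field(row, field, find, replacement):
    """Replace every case-insensitive occurrence of find in the 1-based field."""
    fixed = list(row)
    if not find:
        return fixed  # nothing to search for
    # re.escape makes the search string literal (assumption: no wildcards).
    pattern = re.compile(re.escape(find), re.IGNORECASE)
    # A function replacement keeps any backslashes in `replacement` literal.
    fixed[field - 1] = pattern.sub(lambda match: replacement, fixed[field - 1])
    return fixed


replaced_row = replace_field(["Main St. and ST. John", "st."], 1, "st.", "Street")
```

The second field is left alone even though it contains the search string, because only the named field is filtered.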
* SPLIT field column
Splits a field after the selected column
(e.g. 3 leaves 3 characters in the first half).
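For illustration, a split after a given column could be sketched in Python like this. It assumes the second half becomes a new field immediately after the first, which the text above does not state; `split_field` is a hypothetical name.

```python
def split_field(row, field, column):
    """Split the 1-based field after `column` characters.

    Assumption: the remainder is inserted as a new field right after
    the original one, shifting later fields to the right.
    """
    i = field - 1
    value = row[i]
    return row[:i] + [value[:column], value[column:]] + row[i + 1:]


split_row = split_field(["5551234567", "x"], 1, 3)
```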
A further filter standardizes telephone numbers.
It does not work for international numbers.
Spelled phone numbers are converted to digits.
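The telephone standardization described above can be sketched in Python as follows. The output format NNN-NNN-NNNN, the handling of a leading 1, and the name `standardize_phone` are all assumptions made for this example; the text does not say what format the filter produces.

```python
# Standard telephone keypad: ABC=2, DEF=3, GHI=4, JKL=5, MNO=6, PQRS=7, TUV=8, WXYZ=9.
KEYPAD = str.maketrans(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    "22233344455566677778889999",
)


def standardize_phone(raw):
    """Convert spelled numbers to digits and format 10-digit numbers.

    Assumption: anything that does not reduce to 10 digits (for example
    an international number) is returned unchanged.
    """
    converted = raw.upper().translate(KEYPAD)
    digits = "".join(ch for ch in converted if ch in "0123456789")
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop a leading long-distance 1
    if len(digits) != 10:
        return raw
    return f"{digits[:3]}-{digits[3:6]}-{digits[6:]}"


spelled = standardize_phone("1-800-FLOWERS")
```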
The CSZ filter corrects spelling, format and case.
This should improve the address correction percentage.
The data file for the city lookup is a text file that can be easily updated as needed.
Misspellings and case can be corrected by entries in that file.
There are thirteen thousand entries in that file.
* VFCFIX field len "vfc1" "vfc2" "vfc3" "vfc4" "vfc5"
This converts aberrant VFC format codes that are sometimes encountered.
Most VFC formats can be fixed with this filter.
Usually a VFC format file can be treated as a 1-field tab-delimited file:
each line is read as a single field, and the VFC codes are fixed so that
the file can be handled as a normal VFC file and converted using JT-convert.
field is the field number to fix, normally 1.
len is the length of the VFC code, usually 1, but I have seen 4.
vfc1 is the string that indicates a new page.
vfc2 is the string that indicates an overprint (usually ignored).
vfc3 is the string that indicates Skip 1 line and Print (i.e. next line).
vfc4 is the string that indicates Skip 2 lines and Print.
vfc5 is the string that indicates Skip 3 lines and Print.
Any other string will be ignored and changed to a "skip 1 line and print".
Thank you, Cody, for your assistance with this peculiar problem.
Outfile
- This is the tab-delimited output file containing the filtered data.
RESULTS
: Result status is logged to a y-m-d.log file (e.g. "05-06-06.log").
An exit code of 0 is returned on success, nonzero on failure.
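A script that calls JT-Filter may want to locate the log for a given day. The Python sketch below builds the file name, assuming that "y-m-d" means a two-digit year, month and day in that order, as the example name suggests. `log_name` is a hypothetical helper, not part of JT-Filter.

```python
from datetime import date


def log_name(day):
    """Return the log file name for a date, e.g. 05-06-06.log.

    Assumption: two-digit year, then month, then day.
    """
    return day.strftime("%y-%m-%d") + ".log"


name = log_name(date(2005, 6, 6))
```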