DataLever™ for parsing and pattern matching

DataLever's parsing and pattern matching tools help you with two problems. First, they tackle the problem of identifying entities and extracting meaning from textual data. In legacy financial records, for instance, you'll often find an embedded “relationship” between two entities (beneficiary of, trustee, custodian, etc.):

Before

JOE SMITH GUARDIAN OF JANICE SMITH JOHN DOE BENEFICIARY

The parsing tools help you analyze and extract the entities (people, places, organizations), the relationships between those entities, and any other attributes buried within the textual fields so that you can put that hidden information to work for your business.

After

PERSON1: JOE SMITH
RELATIONSHIP: GUARDIAN
PERSON2: JANICE SMITH

PERSON1: JOHN DOE
RELATIONSHIP: BENEFICIARY
PERSON2:
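
To make the Before/After example above concrete, here is a minimal Python sketch of this kind of relationship extraction. It is not DataLever's implementation. The keyword list, the `parse_relationships` function name, and the rule that every person name is exactly two uppercase words are all assumptions chosen so that the sample record parses cleanly.

```python
import re

# Hypothetical keyword list; real data would need a much larger vocabulary.
RELATIONSHIPS = ("GUARDIAN", "BENEFICIARY", "TRUSTEE", "CUSTODIAN")

# Assumption: every person name is exactly two uppercase words.
NAME = r"[A-Z]+ [A-Z]+"

PATTERN = re.compile(
    rf"\b(?P<person1>{NAME}) (?P<relationship>{'|'.join(RELATIONSHIPS)})\b"
    rf"(?: OF (?P<person2>{NAME}))?"
)


def parse_relationships(text):
    """Return one dict per 'NAME RELATIONSHIP [OF NAME]' group found in text."""
    return [
        {
            "PERSON1": m["person1"],
            "RELATIONSHIP": m["relationship"],
            # The second person is optional; an unmatched group is None.
            "PERSON2": m["person2"] or "",
        }
        for m in PATTERN.finditer(text)
    ]


for row in parse_relationships(
    "JOE SMITH GUARDIAN OF JANICE SMITH JOHN DOE BENEFICIARY"
):
    print(row)
```

A single fixed pattern like this breaks down quickly on middle initials, suffixes, or organization names, which is why real-world parsing typically relies on a library of patterns rather than one expression.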

Second, the tools help separate the useful parts of “dirty” or misplaced information. For example, you might have a “Name” column in your table that contains records like:

Before

JOHN SMITH (ext 43)
JOHN SMITH (deceased 12/01/03)
JOHN SMITH (see record 110014)


All of these records share two problems: They don't correctly identify the person, and they have extra information that could be useful, if only it were properly restructured. DataLever's parsing and pattern matching tools analyze and process cases like this and many others that arise over time when the database schema doesn't provide a place for “extra” information that system users need.

After

FIRST: JOHN
LAST: SMITH
EXT: 43
NOTE:
DATE:

FIRST: JOHN
LAST: SMITH
EXT:
NOTE: deceased
DATE: 12/01/03

FIRST: JOHN
LAST: SMITH
EXT:
NOTE: see record 110014
DATE:
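
The restructuring shown above can be sketched in a few lines of Python. This is an illustrative sketch only, not DataLever's method: the function name `parse_name_field`, the three regular expressions, and the assumption that the name is always two uppercase words followed by an optional parenthetical are all hypothetical.

```python
import re

# Assumption: "FIRST LAST" optionally followed by one parenthetical comment.
RECORD = re.compile(
    r"^(?P<first>[A-Z]+) (?P<last>[A-Z]+)(?: \((?P<extra>[^)]*)\))?$"
)
# Assumed shapes for the parenthetical: a phone extension, or a note
# that ends in an MM/DD/YY-style date.
EXT = re.compile(r"^ext (?P<ext>\d+)$")
DATED_NOTE = re.compile(r"^(?P<note>.+?) (?P<date>\d{2}/\d{2}/\d{2})$")


def parse_name_field(value):
    """Split a dirty Name value into FIRST, LAST, EXT, NOTE and DATE."""
    m = RECORD.match(value.strip())
    if m is None:
        return None  # Not covered by this sketch's single record pattern.

    out = {"FIRST": m["first"], "LAST": m["last"],
           "EXT": "", "NOTE": "", "DATE": ""}
    extra = m["extra"]
    if extra:
        ext = EXT.match(extra)
        dated = DATED_NOTE.match(extra)
        if ext:
            out["EXT"] = ext["ext"]
        elif dated:
            out["NOTE"], out["DATE"] = dated["note"], dated["date"]
        else:
            out["NOTE"] = extra  # Keep anything unrecognized as a note.
    return out


for raw in ("JOHN SMITH (ext 43)",
            "JOHN SMITH (deceased 12/01/03)",
            "JOHN SMITH (see record 110014)"):
    print(parse_name_field(raw))
```

Returning `None` for values that do not fit the pattern, rather than guessing, keeps unparsed records visible for later review.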

Many data quality problems center on the fielding and variable formatting of textual information, especially name and address data. DataLever has a full set of textual parsing tools designed to identify structured data within free-form text fields from sources like contact records with combined name/address/phone fields, catalogs, inventories, and legacy distribution records. Almost anything that appears in text fields can be parsed into usable, structured information.

For simplicity, DataLever offers several “canned” parsing macros to handle the most common business cases. As new data is processed or further analysis is performed, DataLever automatically reports any records that are not matched by the sets of patterns. Automated pattern analysis can also reveal unexpected (and previously unknown) data elements, helping you capture the maximum amount of information from unstructured data.
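
As a rough illustration of unmatched-record reporting and pattern analysis, the Python sketch below collects every record that no known pattern matches and then groups those leftovers by a simplified "shape" (runs of letters become `A`, runs of digits become `9`). This is a generic data-profiling approach offered as an assumption about how such a report could work, not a description of DataLever's macros. The names `shape` and `report_unmatched` are hypothetical.

```python
import re
from collections import Counter


def shape(value):
    """Reduce a value to its shape: letter runs -> 'A', digit runs -> '9'."""
    value = re.sub(r"[A-Za-z]+", "A", value)
    return re.sub(r"\d+", "9", value)


def report_unmatched(records, patterns):
    """Return (unmatched records, shape frequencies among the unmatched)."""
    unmatched = [
        r for r in records
        if not any(p.fullmatch(r) for p in patterns)
    ]
    return unmatched, Counter(shape(r) for r in unmatched)


# Hypothetical pattern set: only clean two-word names are "known".
known = [re.compile(r"[A-Z]+ [A-Z]+")]
records = [
    "JOHN SMITH",
    "JOHN SMITH (ext 43)",
    "JANE DOE (ext 7)",
    "JOHN SMITH (deceased 12/01/03)",
]

unmatched, shapes = report_unmatched(records, known)
for s, count in shapes.most_common():
    print(count, s)
```

Shapes that occur often among the unmatched records are good candidates for a new pattern, since each one points to a recurring data element that the current pattern set does not yet capture.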

    Copyright ©1998-2008 DataLever Corporation. All rights reserved.