InforSense
Embedding Intelligence Throughout the Enterprise
 

TEXT ANALYTICS

Fully featured text analytics toolset

 

InforSense Text Analytics modules provide a full complement of advanced analytics tools to support text mining activities, including:

  • Scientific literature analytics to interpret scientific experiments
  • Corporate intelligence to monitor your competitors' news feeds
  • Early warning systems for manufacturers to identify common faults
  • 'Voice of the customer' analysis for Retail, Financial Services and Manufacturing organizations
  • Process improvement for call centers

Text Data Mining Analytics Software

Much of the information that influences analysis and decision-making is contained in documents and publications. Several sources (http://www.b-eye-network.com/view/2098) suggest that 80-85% of all data used in businesses is unstructured text and that the amount of unstructured data is growing rapidly. However, many analytical products focus exclusively on the analysis of structured, database-driven content, while some products introduced in recent years focus exclusively on text analytics. Neither approach is sufficient on its own. The two must be used together to get the complete picture and to make the most informed decisions.

InforSense Text Analytics has a wide variety of uses, ranging from security and intelligence applications, through academic or commercial research activities, to enhanced customer relationship management. All of these uses share some common requirements. The capabilities of InforSense Text Analytics are listed here in three levels, with each level building on the processing undertaken in the previous level.

Level One

  • Information Retrieval. Retrieval and ranking of documents from a collection based on a user-defined query
  • Text Processing. Any text-in, text-out operation, usually used to get the raw text into a format suitable for further processing. It may include parsing XML documents, document filtering, word stemming, character replacement and document sectioning.
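
The two Level One capabilities can be illustrated with a minimal Python sketch. It is not the InforSense implementation: the function names preprocess and rank, the naive suffix stripping and the TF-IDF weighting are all assumptions chosen to show the general idea of a text-in, text-out step feeding a ranked query.

```python
import math
import re
from collections import Counter


def preprocess(text):
    """Text processing: lowercase, keep alphabetic tokens, strip common suffixes."""
    tokens = re.findall(r"[a-z]+", text.lower())
    stemmed = []
    for tok in tokens:
        # Naive stemming, an illustrative stand-in for a real stemmer.
        for suffix in ("ing", "ed", "s"):
            if tok.endswith(suffix) and len(tok) - len(suffix) >= 3:
                tok = tok[: -len(suffix)]
                break
        stemmed.append(tok)
    return stemmed


def rank(query, documents):
    """Information retrieval: indices of matching documents, best match first."""
    doc_counts = [Counter(preprocess(d)) for d in documents]
    n_docs = len(documents)
    query_terms = set(preprocess(query))
    scores = []
    for idx, counts in enumerate(doc_counts):
        score = 0.0
        for term in query_terms:
            df = sum(1 for c in doc_counts if term in c)
            if df == 0:
                continue
            # Smoothed inverse document frequency (one common variant).
            idf = math.log((1 + n_docs) / (1 + df)) + 1.0
            score += counts[term] * idf
        scores.append((score, idx))
    # Highest score first; ties keep the original document order.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for score, idx in scores if score > 0]
```

Calling rank("failing pumps", documents) returns the positions of the documents that mention either stemmed term, with documents that mention them more often ranked higher.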

Level Two

  • Named Entity Recognition. The identification and extraction from within the text of certain predefined entities such as dates, people's names, addresses, etc. In Life Sciences specifically, this includes gene names, chemical compounds, tissue types, etc.
  • Natural Language Processing. The linguistic processing of the text to identify the part of speech (noun, verb, preposition, etc.) of each word or phrase and to derive underlying concepts embedded in the documents.
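
As a rough illustration of Named Entity Recognition, the sketch below finds dates with a regular expression and gene names by dictionary lookup. Everything in it is an assumption made for the example: the GENE_LEXICON contents, the restriction to YYYY-MM-DD dates and the extract_entities function are hypothetical and do not describe how InforSense Text Analytics works. Part-of-speech tagging normally relies on trained language models and is not sketched here.

```python
import re

# Hypothetical mini-lexicon; a real system would load a curated vocabulary.
GENE_LEXICON = {"BRCA1", "TP53", "EGFR"}

# ISO-style dates only (YYYY-MM-DD); other date formats are out of scope here.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
TOKEN_PATTERN = re.compile(r"\b[A-Za-z0-9]+\b")


def extract_entities(text):
    """Return (entity_type, surface_text, start_offset) tuples in text order."""
    entities = []
    for match in DATE_PATTERN.finditer(text):
        entities.append(("DATE", match.group(), match.start()))
    for match in TOKEN_PATTERN.finditer(text):
        # Exact, case-sensitive lookup against the lexicon.
        if match.group() in GENE_LEXICON:
            entities.append(("GENE", match.group(), match.start()))
    return sorted(entities, key=lambda entity: entity[2])
```

Each returned tuple carries the character offset of the match, so the entities can be highlighted in the source text or passed on to later processing.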

Level Three

  • Document Classification. The categorization/assignment of documents into pre-defined classes.
  • Document Clustering. Automatic categorization of documents into groups based on some measure of text similarity. The groups are not predefined, in contrast to Document Classification.
  • Information Extraction. The extraction of specific information items from a document collection.
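
The contrast between Document Classification and Document Clustering can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the product's algorithm: bag-of-words vectors, cosine similarity, a nearest-centroid classifier and a single-pass clustering with a hypothetical threshold of 0.3 are all choices made for the example. Information Extraction is not covered.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for one document."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(text, labelled_examples):
    """Classification: pick the predefined label whose examples are most similar."""
    centroids = {}
    for label, example in labelled_examples:
        centroids.setdefault(label, Counter()).update(vectorize(example))
    vec = vectorize(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


def cluster(texts, threshold=0.3):
    """Clustering: group documents by similarity, with no predefined labels."""
    clusters = []  # each entry is (summed term counts, [document indices])
    for idx, text in enumerate(texts):
        vec = vectorize(text)
        best = max(clusters, key=lambda c: cosine(vec, c[0]), default=None)
        if best is not None and cosine(vec, best[0]) >= threshold:
            best[0].update(vec)
            best[1].append(idx)
        else:
            clusters.append((Counter(vec), [idx]))
    return [members for _, members in clusters]
```

The classifier needs labelled examples up front, whereas the clustering function discovers its groups from the documents alone, which mirrors the distinction drawn in the list above.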

Ultimately these processes must result in a meaningful way to use the data, including interactive browsing of the results and integration of the results with traditional structured data analytics.

When the results of text analytics are combined with other analytical modules, applications can be built that analyze both structured and semi-structured data.

 

Copyright© 2000-2009 InforSense Ltd. All rights reserved.
