BrainCat

BrainCat is a latest-generation automatic categorizer that is easy to program and makes it simple and fast to manage very large volumes of information. The system groups documents and information into homogeneous categories, either using classes predefined by the user or entirely automatically through an advanced clustering system. The categorizer learns and refines its results over time through a structured training system. Because classification is automatic, and therefore not distorted by differences in judgment between BrainCat operators, the system stores all corporate information, classifies it in an "intelligent" manner, and makes the company's entire know-how easily accessible and usable.

BrainCat helps with:
  • automatically accessing information that originates from different databases
  • categorizing the information
  • searching information and documents through free-text queries
  • tracking user behavior in order to automatically propose information deemed relevant to the user's profile
  • giving the organization access to its entire knowledge base

The built-in Semantic Annotation system supports several annotation methods:

METHOD 1 – Annotation based on predefined rules. The rules are written in a special language.

METHOD 2 – Annotation based on example sentences. The system searches the documents for all sentences that are sufficiently similar to the stored example sentences.

METHOD 3 – Annotation based on training. The system "trains" itself on example documents so that it can automatically recognize the annotations in new documents.

METHOD 4 – Continuous learning from quality control. The system learns constantly. Documents that cannot be annotated automatically are routed to a designated user, who completes them.
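
The idea of searching documents for sentences that are sufficiently similar to stored example sentences can be illustrated with a minimal sketch. It is not BrainCat's implementation: the similarity measure (bag-of-words cosine similarity), the `threshold` value, and the names `tokenize`, `cosine`, and `annotate` are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def annotate(document, examples, threshold=0.5):
    """Return (sentence, concept, score) for every sentence that is
    similar enough to at least one example sentence of a concept.

    `examples` maps a concept name to a list of example sentences.
    The threshold is a hypothetical default, not a BrainCat setting.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    annotations = []
    for sentence in sentences:
        vector = Counter(tokenize(sentence))
        for concept, example_sentences in examples.items():
            best = max(
                (cosine(vector, Counter(tokenize(e))) for e in example_sentences),
                default=0.0,
            )
            if best >= threshold:
                annotations.append((sentence, concept, round(best, 2)))
    return annotations
```

Calling `annotate(text, {"disability pension": ["request for a disability pension"]})` returns only the sentences of `text` whose wording overlaps enough with the example. A production system would likely use a richer similarity measure than raw word overlap.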
   
Some application examples:

  • Automatic routing of texts whose content should trigger an alert. The system is configured with a set of sentences associated with concepts that characterize the documents to be "captured" by the system
  • Recognition of the type of text based on its content. Texts are sorted by the language in which they are written or by the topic they cover (for example, "disability pensions" or "severance indemnity")
  • Recognition of the type of text based on its structure, using the layout and structure of the text
  • Automatic category definition: categories are created automatically by reading and interpreting the documents
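
As an illustration of recognizing a text by the language in which it is written, the sketch below compares character trigrams of a text against small per-language profiles. The source does not say how BrainCat detects language, so the trigram-overlap technique, the sample sentences, and the function names `char_ngrams`, `build_profiles`, and `classify` are assumptions.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Count character n-grams of the text, padded with spaces at both ends."""
    padded = f" {' '.join(text.lower().split())} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def build_profiles(samples, n=3):
    """Build one n-gram profile per label from a sample text.

    Sample texts should be of comparable length, because the overlap
    score below is not normalized by profile size.
    """
    return {label: char_ngrams(text, n) for label, text in samples.items()}


def classify(text, profiles, n=3):
    """Return the label whose profile shares the most n-grams with the text."""
    grams = char_ngrams(text, n)

    def overlap(profile):
        return sum(min(count, profile[g]) for g, count in grams.items())

    return max(profiles, key=lambda label: overlap(profiles[label]))


# Hypothetical, tiny training samples; real profiles need far more text.
profiles = build_profiles({
    "english": "the pension request was received and the document is being reviewed by the office",
    "italian": "la richiesta di pensione è stata ricevuta e il documento è in fase di revisione presso l'ufficio",
})
```

With these profiles, `classify("the office reviewed the request", profiles)` selects the English profile. The same scheme could separate topics instead of languages by using word n-grams and topic-specific samples.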


BrainCat is based on advanced technologies, including:

  • Self-learning (machine learning) systems (SVM and Fuzzy ARTMAP algorithms)
  • Statistical inference (modified Naive Bayes)
  • Computational linguistics algorithms (e.g., part-of-speech tagging, n-grams)
  • Clustering algorithms and self-organizing maps
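
To make the statistical-inference item concrete, here is a minimal sketch of categorizing documents into user-defined classes with a Naive Bayes classifier. It implements the standard multinomial variant with add-one (Laplace) smoothing, not the modified variant used by BrainCat, whose details the source does not give. The class name `NaiveBayesCategorizer` and its methods are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


class NaiveBayesCategorizer:
    """Textbook multinomial Naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word frequencies
        self.doc_counts = Counter()              # category -> number of documents
        self.vocabulary = set()

    @staticmethod
    def _tokens(text):
        return re.findall(r"[a-z]+", text.lower())

    def train(self, text, category):
        """Add one labeled example document to the model."""
        tokens = self._tokens(text)
        self.word_counts[category].update(tokens)
        self.doc_counts[category] += 1
        self.vocabulary.update(tokens)

    def categorize(self, text):
        """Return the most probable category, or None if nothing was trained."""
        tokens = self._tokens(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[category].values())
            for token in tokens:
                count = self.word_counts[category][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best, best_score = category, score
        return best
```

Training consists of repeated calls such as `model.train("disability pension medical assessment", "disability pensions")`, after which `model.categorize(new_text)` returns the best-scoring category. Working in log space avoids numeric underflow on long documents.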