Text Synthesis & Recognition

  • Emdros ("Engine for MdF Database Retrieval, Organization, and Storage". Text query-language engine for expressing linguistically relevant queries on an RDBMS) (ml) (forum)
  • Ellogon (C-based, multilingual, general-purpose text research framework. Ellogon includes a KDE-based visualization tool and API bindings for C++, Perl, Python and other programming languages.) (forum)
  • libextractor (C library for extracting keywords from files) (source)
  • DDC ("Dialing DWDS Concordancer". C++-based linguistic search engine, used by German science institutions.)
  • Traduki (Python-based text translation toolkit) (cvs) (ml)
  • dbacl (Tool that learns categories from text documents and then compares other text documents to the learned categories, using Bayesian filtering techniques.)
  • Alice (AIML/XML-based QA engine designed to pass the Turing Test) (cvs) (ml)
  • Anna (AIML/XML-based QA engine designed to pass the "Loebner Prize" Turing Test. Anna is a code fork of Alice.) (cvs)
  • Catty (Chat bot based on the Google search engine)
  • Cack (English sentence generator that builds sentences from 'random' words, using nouns, verbs, adjectives and adverbs in the correct context)
  • Dadadoo (Tool that creates 'random' sentences by analyzing a text to build Markov chains of word probabilities. Inactive project.)
  • Enca ("Extremely Naive Charset Analyser". Tool to recognize the character encoding of a text and convert it into another encoding. Inactive project.)
  • Snowball (String processing language parser for creating stemming algorithms for use in textual query systems. Inactive project.) (cvs)