Data sets tagged with "textmining"

[Wikitech-l] page counters

This provides a kind of "what pages are visited" statistics. It is applied to a Squid access-log stream and redirected to a profiling agent (webstatscollector), and the hourly snapshots are then written in a very simple format. This can be used both for noticing strange activity and for spotting trends (specific events show up really nicely), be it a movie premiere, ...
Offsite

Information Extraction: The RISE Repository of Information Sources

RISE is a distributed repository of online information sources that are used for the empirical analysis of learning algorithms that generate extraction patterns. The sources included in this repository are provided by people from the information extraction (IE) and wrapper generation (WG) communities. Both communities use machine learning algorithms to generate ...
Offsite

Big Huge Thesaurus API: Access 145,000 Words and Phrases

This site offers a very simple API for retrieving the synonyms for any word, as well as the Big Huge Thesaurus itself. License: You may use the service for any legal and non-slimy purpose* so long as you link to this site in your website or application credits as follows: "Thesaurus service provided by words.bighugelabs.com". THE SERVICE IS PROVIDED “AS IS” WITHOUT WARRANTY ...
Offsite
