Data sets tagged with "nlp"

Delicious bookmarks, September 2009

A record of all bookmarking activity on delicious.com for a roughly 10-day period in September 2009. Format is JSON, one record per line. There are 1.25 million entries. Download size is 170 MB. Sample record: {"updated": "Tue, 08 Sep 2009 08:45:00 +0000", "links": [{"href": "http://www.mcfc.co.uk/", "type": "text/html", "rel": "alternate"}], ...
Offsite
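
Because the Delicious dump is described as JSON with one record per line, it can be read as a stream instead of being loaded whole. The following is a minimal sketch of that idea, not a reference loader: the helper names iter_records and link_hrefs are made up for illustration, and only the "updated" and "links" fields visible in the sample record are assumed to exist.

```python
import json
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of line-delimited JSON."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        yield json.loads(line)


def link_hrefs(record: dict) -> list:
    """Collect the href of every entry under "links", if that field is present."""
    return [link["href"] for link in record.get("links", []) if "href" in link]


# One line built only from the fields visible in the sample record; the real
# records carry more fields, which the listing elides.
sample_line = (
    '{"updated": "Tue, 08 Sep 2009 08:45:00 +0000", '
    '"links": [{"href": "http://www.mcfc.co.uk/", '
    '"type": "text/html", "rel": "alternate"}]}'
)

for record in iter_records([sample_line]):
    print(record["updated"], link_hrefs(record))
```

To process the actual download, an open file handle can be passed in place of the list, since iterating over a text file yields one line at a time. The local filename is whatever the download was saved as.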

Big Huge Thesaurus API: Access 145,000 Words and Phrases

This site sports a very simple API for retrieving the synonyms for any word and also an actual Big Huge Thesaurus. License: You may use the service for any legal and non-slimy purpose* so long as you link to this site in your website or application credits as follows: "Thesaurus service provided by words.bighugelabs.com". THE SERVICE IS PROVIDED “AS IS” WITHOUT WARRANTY ...
Offsite

Linguistic Data Consortium (LDC) - Collection of Linguistic Corpora and Datasets

The Linguistic Data Consortium is an open consortium of universities, companies and government research laboratories. It creates, collects and distributes speech and text databases, lexicons, and other resources for research and development purposes. The University of Pennsylvania is the LDC’s host institution. The LDC was founded in 1992 with a grant from the Advanced ...
Offsite

OpenCalais API

The OpenCalais Web Service automatically creates rich semantic metadata for the content you submit – in well under a second. Using natural language processing (NLP), machine learning and other methods, Calais analyzes your document and finds the entities within it. But Calais goes well beyond classic entity identification and returns the facts and events hidden within ...
Offsite
