Data sets tagged with "lexicon"
Word List - 100,000+ Official Crossword Words (Excel readable)
A word list with over 100,000 entries that are officially permitted in crossword games like Scrabble™. This word list is available in a simple, alphabetically ordered Excel format, making it convenient for reference, spell-checking, or, in more sophisticated applications, for developers looking to build a custom spelling dictionary. The entries include variants of ...
Free
Word List - 100,000+ official crossword words (with Definitions, Excel format)
A list of 113,809 words officially permitted in crossword games like Scrabble™, with their definitions. The words are compatible with the first edition of the Official Scrabble Players Dictionary™. Because this list includes word variants (-ing, -ed, -s, and so on), it makes a good addition when building a custom spelling dictionary. It is a reference to have handy for ...
$4.00
Word List - 100,000+ official crossword words (Excel readable)
113,809 official crosswords: a list of words permitted in crossword games such as Scrabble™. Compatible with the first edition of the Official Scrabble Players Dictionary™. Because this list includes all word forms (-ing, -ed, -s, and so on), it makes a good addition when building a custom spelling dictionary.
Free
WordNet
WordNet® is a large lexical database of English, developed under the direction of George A. Miller. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations. The resulting network of meaningfully related words and concepts ...
Offsite
Word List - 1,000 Most Frequent Words from an Internet Corpus
This file consists of the 1,000 English words most frequently used on the Internet in 1992.
Free
Word List - 1,000 Most Frequently Used English Words by Frequency (with Definitions, Excel format)
This file consists of the 1,000 most frequently used English words from a wide variety of common texts, listed in decreasing order of frequency.
$4.00
Linguistic Data Consortium (LDC) - Collection of Linguistic Corpora and Datasets
The Linguistic Data Consortium is an open consortium of universities, companies and government research laboratories. It creates, collects and distributes speech and text databases, lexicons, and other resources for research and development purposes. The University of Pennsylvania is the LDC’s host institution. The LDC was founded in 1992 with a grant from the Advanced ...
Offsite
Word List - 1,000+ Most Frequent Words in King James Bible
1,185 King James Version frequent substrings (KJVfreq.txt): the 1,185 most frequently occurring substrings in the King James Version Bible, ranked and counted in order of frequency.
Free
Word List - Official Scrabble (TM) Player's Dictionary (OSPD) 2nd ed (with Definitions, Excel format)
4,160 official crosswords delta (crswd-d.txt). When combined with the 113,809 crosswords file, it produces the official crossword list compatible with the second edition of the Official Scrabble Players Dictionary. (Scrabble is a registered ...
$4.00
Word List - Official Scrabble (TM) Player's Dictionary (OSPD) 2nd ed
4,160 official crosswords delta (crswd-d.txt). When combined with the 113,809 crosswords file, it produces the official crossword list compatible with the second edition of the Official Scrabble Players Dictionary. (Scrabble is a registered trademark of Milton-Bradley licensed to Merriam-Webster.)
Free
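For readers who want to combine the delta file with the 113,809-word base list as described above, the following is a minimal Python sketch. It assumes both files are plain text with one word per line and that the delta contains only additions; neither assumption is confirmed by the listing. The function name merge_word_lists and the sample words are hypothetical and used only for illustration.

```python
def merge_word_lists(base_words, delta_words):
    """Return the sorted union of two word lists.

    Assumption: each item is one word, possibly with surrounding
    whitespace or a trailing newline. Words are lowercased so that
    duplicates differing only in case collapse into one entry.
    """
    merged = set()
    for source in (base_words, delta_words):
        for raw in source:
            word = raw.strip().lower()
            if word:  # skip blank lines
                merged.add(word)
    return sorted(merged)


# Small in-memory example standing in for the two files.
base = ["aa\n", "ab\n", "ad\n"]
delta = ["ab\n", "ae\n", "\n"]
print(merge_word_lists(base, delta))  # ['aa', 'ab', 'ad', 'ae']
```

Because an open text file yields its lines when iterated, two file objects can be passed directly in place of the sample lists.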
Google Labs - Books Ngram Viewer
Here are the datasets backing the Google Books Ngram Viewer. These datasets were generated in July 2009; we will update these datasets as our book scanning continues, and the updated versions will have distinct and persistent version identifiers (20090715 for the current set). Each of the links below will directly download a fragment of the given corpus. For instance, ...
Offsite
