The Comprehensive Knowledge Archive Network (CKAN) Collection

Description:

From their website:

CKAN is the Comprehensive Knowledge Archive Network, a registry of open knowledge packages and projects (and a few closed ones)…Those familiar with freshmeat, CPAN or PyPI can think of CKAN as providing an analogous service for open knowledge…CKAN is developed and maintained by the Open Knowledge Foundation. Both the CKAN code and data are open: free for anyone to use and reuse. To find out more check out the CKAN project at knowledgeforge.net.

CKAN is a peer in the global data commons and Infochimps is proud to be able to mirror their collection of over 300 datasets.

Created over 2 years ago by Infochimps

Updated over 2 years ago

US Census Bureau TIGER data

The US government’s ‘Topologically Integrated Geographic Encoding and Referencing’ system, usually referred to as TIGER, is based on an extensive database of US geographic information. It is county-level data that documents physical features like roads and rivers, as well as some administrative features such as Congressional districts. Data can be downloaded for ...
Offsite

National Public Transport Access Node database (NaPTAN)

From the [overview](http://naptan.org.uk/overview.htm): > NaPTAN provides a unique identifier for every point of access to public transport in the UK, together with meaningful text descriptions of the stop point and its location. This enables both computerised transport systems and the general public to find and reference the stop unambiguously. Stops can be related to ...
Offsite

National Land and Property Gazetteer

Description From main site: > The NLPG is the first, definitive, national address list that provides unique identification of properties across England and Wales and conforms to the British Standard, BS 7666. Local government, and potentially the public and private sectors, can link their information systems to this high-quality source of addresses and accurate ...
Offsite

National Street Gazetteer

Description From the [about page](http://www.thensg.org.uk/iansg/link.htm?id=100): > The National Street Gazetteer (NSG) is the definitive reference system used in the notification process and the coordination of street works. Under legislation, each local highway authority in England and Wales is required to create and maintain its own Local Street Gazetteer (LSG) ...
Offsite

Chemical Block

About: ChemBlock makes available two databases: (1) Building Blocks (fields: ID number, Structure, Chemical Name, Salt data; 4925 compounds) and (2) Screening Library (fields: ID number, Structure, Salt data; 122051 compounds). Openness: Terms of re-distribution/re-use are not mentioned on the site.
Offsite

Open Shakespeare

The Open Shakespeare package provides a full open set of Shakespeare’s works along with ancillary material, a variety of tools and a python API. Specifically, in addition to the works themselves (often in multiple versions), there is an introduction, a chronology, explanatory notes, a concordance and search facilities. All material is open source/open knowledge so that ...
Offsite

archive.org - Internet Archive

“The Internet Archive, a 501(c)(3) non-profit, is building a digital library of Internet sites and other cultural artifacts in digital form. Like a paper library, we provide free access to researchers, historians, scholars, and the general public.”
Offsite

Binding DB - The Binding Database

About > BindingDB is a public, web-accessible database of measured binding affinities, focusing chiefly on the interactions of protein considered to be drug-targets with small, drug-like molecules. Openness Not open as restricts commercial re-use: > The database you are about to use is protected under copyright and/or patent law. While you are free to use the data ...
Offsite

Planning Alerts Planning Applications Database

UK Planning Application data from a variety of councils across the UK. More information plus a full up-to-date list of councils covered can be found at: <http://www.planningalerts.com/about.php>
Offsite

Distributed Structure-Searchable Toxicity (DSSTox) Public Database Network

About > Distributed Structure-Searchable Toxicity (DSSTox) Database Network is a project of EPA’s National Center for Computational Toxicology, helping to build a public data foundation for improved structure-activity and predictive toxicology capabilities. The DSSTox website provides a public forum for publishing downloadable, structure-searchable, standardized ...
Offsite

ICONCLASS - Multilingual Thematic Classification

About From the website: > This is an experimental service that makes the ICONCLASS Iconographic Classification system available as linked-data using the SKOS vocabulary. This service is inspired by the excellent Library of Congress Subject Headings linked data service. It is intentionally copied in spirit and conventions used. The idea is to enable others to make ...
Offsite

Discogs: Discographies

Discogs is a community-built database of music information. Imagine a site with discographies of all labels, all artists, all cross-referenced, this is what Discogs strives to be. Here you will find monthly data dumps of Discogs Release, Artist, and Label data. The data is in XML format and formatted according to the API spec. License All material is in the public ...
Offsite

Securities & Exchange Commission's Public Information Server

This server features SEC public documents, information of interest to the investing public, rule-making activities, and access to the Commission’s electronic filing database, EDGAR. The public will be able to query the EDGAR database for any company currently filing electronically with the SEC. These filings are updated 24 hours after they are filed with the ...
Offsite

Wordnet

WordNet® is a large lexical database of English, developed under the direction of George A. Miller. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations. The resulting network of meaningfully related words and concepts ...
Offsite

EEA - Data service

About Overview: > The data service provides almost all data sets and applications which have been used in EEA’s periodical environmental reports. Topics include: Air emissions Air quality Corine land cover 1990 Corine land cover 2000 EEA owned data sets Land cover accounts Eurosion Nationally designated areas Point data Raster data Geospatial data ...
Offsite

World Values Survey

Description Large global surveys of ‘values’ taking place every five years since 1990 described on its website as “The world’s most comprehensive investigation of Political and Socio-Cultural Change”. Openness: Semi-Open Access: download in bulk is possible as well as analysis on the website. However have to go through terms and conditions (not ...
Offsite

DBTune

“This effort has started in the context of the Linking open data community project of the Semantic Web Education and Outreach Interest Group. Its main purpose is to make available freely available data concerning music on the semantic-web, such as Magnatune, Jamendo, Dogmazic, Mutopia, and to create links between them and other available semantic web repositories, such ...
Offsite

Correlates of War

Description The Correlates of War project hosts a variety of datasets related to the study of inter-state conflict. Details As of 2007-09-22 the following datasets were listed: State System Membership (v2004.1): This data set records the fluctuating composition of the state system since 1816. It also identifies countries corresponding to the standard ...
Offsite

Bulk.resource.org

Bulk.resource.org is a service of public.resource.org. Public.resource.org is a non-profit committed to publishing and sharing public domain materials in the United States. This system contains unsupported, as-is copies of selected U.S. government archives, including: The SEC’s EDGAR Database Commerce Business Daily U.S. Copyright Database Patent Full Text Database ...
Offsite

Statistics Canada

About From [what we do](http://www.statcan.gc.ca/about-apercu/overview-apercu-eng.htm) page: > Statistics Canada, a member of the Industry Portfolio, produces statistics that help Canadians better understand their country—its population, resources, economy, society and culture. Access/re-use [Copyright ...
Offsite

EUROPA - Register of Commission documents

About Overview > The register contains references both of documents which have already been published and of internal (unpublished) Commission documents, from the 1st January 2001. Information in register includes: the identifier or reference number, the title of the document in the languages in which it is available, the date of the document, the languages in ...
Offsite

Places of interest in the London Borough of Sutton

A CSV file of places of interest in the London Borough of Sutton as compiled by Sutton Active. Currently with 142 items. The CSV file is dynamically generated from the live database with each request. Please cache locally if you require regular access. No geotags yet but I’m working on it.
Offsite
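
The request to cache the Sutton CSV locally could be honoured along the following lines. This is only a minimal sketch: the listing does not give the file's URL, so the URL argument, the cache path, the one-day freshness window, and the function names `fetch_url` and `cached_csv` are all assumptions made for illustration.

```python
import time
from pathlib import Path
from typing import Callable
from urllib.request import urlopen


def fetch_url(url: str) -> bytes:
    """Download the raw bytes at `url` (one request to the live server)."""
    with urlopen(url) as response:
        return response.read()


def cached_csv(
    url: str,
    cache_path: Path,
    max_age_seconds: float = 86400,  # assumed freshness window: one day
    fetch: Callable[[str], bytes] = fetch_url,
) -> bytes:
    """Return the CSV bytes, reusing the local copy while it is still fresh."""
    if cache_path.exists():
        age = time.time() - cache_path.stat().st_mtime
        if age < max_age_seconds:
            # Local copy is recent enough: do not hit the live database.
            return cache_path.read_bytes()
    # No usable local copy: fetch once and store it for later calls.
    data = fetch(url)
    cache_path.write_bytes(data)
    return data
```

Calling `cached_csv(csv_url, Path("sutton_places.csv"))` repeatedly, with `csv_url` set to the real download address, would then trigger at most one request per day. The `fetch` parameter exists only so that the download step can be swapped out, for example in tests.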

GeoCommons

Description Geocommons is a website for uploading and visualizing datasets with a geospatial component (so they can be plotted on a map). Focus is on visualization rather than the data with tagline: “Explore, Create and Share Intelligent Maps and Geographic Data” Openness: PASS- License: all datasets licensed under cc by-sa 3.0 Access: data provided in kml ...
Offsite

Open History

Collection of articles – mostly about Japanese history. Started in 2001 and last updated on 2006-09-18.
Offsite

History Commons

About From [about](http://www.historycommons.org/aboutsite.jsp) page: > What is the History Commons website? > The History Commons website is run by the Center for Grassroots Oversight (“CGO”), an organization that is fiscally sponsored by The Global Center, a 501(c)3 non-profit organization. CGO was incorporated as a public benefit corporation in late 2006, and ...
Offsite

Open Text Book

“Open Text Book is a registry of textbooks and text book material that is open in accordance with the Open Knowledge Definition (OKD).”
Offsite

Wikispecies

“Wikispecies is an open, free directory of species. It covers Animalia, Plantae, Fungi, Bacteria, Archaea, Protista and all other forms of life.”
Offsite

Wikibooks

“Welcome to Wikibooks, a Wikimedia project that was started on July 10, 2003 with the mission to create a free collection of open-content textbooks that anyone can edit.”
Offsite

Wikisource

“Wikisource is an online library of free content publications collected and maintained by the community (see our inclusion policy).”
Offsite

Wikimedia Commons

Over 2 million freely usable media files to which anyone can contribute
Offsite

Given Name Frequency Project

Quite a bit of data is available for download but only individually (not in a single file). According to the web page, they have: > * GINAP – code to standardize given names and correct common problems in name samples. Such standardization is an important step in analysis of given names. > * Popular given names, US 1801 to 1999 – a collection of sets of standardized ...
Offsite

Ekopedia

Ekopedia is “the practical encyclopedia about alternative life techniques”. It is dedicated to providing information related to environmental sustainability. License Creative Commons
Offsite

ISO 639-2 - Codes for the Representation of Names of Languages

About From home page: > ISO 639 provides two sets of language codes, one as a two-letter code set (639-1) and another as a three-letter code set (this part of ISO 639) for the representation of names of languages. ISO 639-1 was devised primarily for use in terminology, lexicography and linguistics. This part of ISO 639 represents all languages contained in ISO 639-1 ...
Offsite

Wikinews

“We are a group of volunteers whose mission is to present reliable, unbiased, relevant and entertaining News. All content is released under a free license. By making our content perpetually available for free redistribution and use, we hope to contribute to a global digital commons.”
Offsite

Wiktionary

“Welcome to the English-language Wiktionary, a collaborative project to produce a free, multilingual dictionary with definitions, etymologies, pronunciations, sample quotations, synonyms, antonyms and translations. Wiktionary is the lexical companion to the open-content encyclopedia Wikipedia.”
Offsite

Wikiquote

“Welcome to Wikiquote, a free online compendium of quotations from notable people and creative works in every language, including sources (where known), translations of non-English quotes, and links to Wikipedia for further information! The English version of Wikiquote has 13,799 pages so far with many thousands of quotations and proverbs.”
Offsite

FreeBMD (Births, Marriages and Deaths)

Description From front page: “FreeBMD is an ongoing project, the aim of which is to transcribe the Civil Registration index of births, marriages and deaths for England and Wales, and to provide free Internet access to the transcribed records.” Openness: NOT OPEN 1. License: access for personal research purposes only. Full T&C below. 2. Access: single ...
Offsite

Open Media Database

About “omdb (open media database) is a free database for film media. There is no set editorial staff, but rather a large number of movie addicts and lovers who volunteer their time to provide material and develop the site. Anybody can add or change existing information on omdb once they have done the quick and simple task of signing up for their user login name. ...
Offsite

Fine Rolls of Henry III

Description From <http://www.finerollshenry3.org.uk/cocoon/frh3/content/about/about.html>: > The Henry III Fine Rolls Project is a three year enterprise commencing in April 2005, funded by the Arts and Humanities Research Council. It aims to publish the Fine Rolls of Henry III from 1216 down to 1248. It is hoped that a second three year project will complete ...
Offsite

Open-Of-Course

Open-Of-Course is a multilingual and interactive portal for open content courses and tutorials. It is based on the free software ELO “Moodle” and people are welcome to add their own open educational content to the system.
Offsite

FreeDict

About Summary from [SourceForge page](http://sourceforge.net/projects/freedict/): > Free translating dictionaries. The data is kept as XML complying to the TEI DTD. This enables to include features such as phonetics, part of speech and etymology information in a project independent format. Access/Re-use Fully open. From the [project ...
Offsite

ChemIDplus

About > This database allows users to search the NLM ChemIDplus database of over 370,000 chemicals. A user may enter compound identifiers such as Chemical Name, CAS Registry Number, Molecular Formula, Classification Code, Locator Code, and Structure or Substructure. New searchable features include search and display by Toxicity indicators such as Median Lethal Dose ...
Offsite

Ancient Geographic Information

Description Datasets produced by the [pleiades project](http://pleiades.stoa.org/about-pleiades): > Organized by the Ancient World Mapping Center at the University of North Carolina at Chapel Hill, U.S.A., Pleiades brings together a global community of scholars, students and enthusiasts to expand and enhance continually the information originally brought together by ...
Offsite

HapMap

Description The International HapMap Project is a partnership of scientists and funding agencies from Canada, China, Japan, Nigeria, the United Kingdom and the United States to develop a public resource that will help researchers find genes associated with human disease and response to pharmaceuticals. Datasets From ...
Offsite

Languages of the World (Multilingual RDF Descriptions)

Description Lingvoj means languages in Esperanto. From the frontpage of <http://www.lingvoj.org/>: http://www.lingvoj.org/lingvoj.rdf is the complete RDF file gathering currently the description of 507 languages, including all languages defined by ISO 639-1 and most of ISO 639-2 codes (a few exceptions remain, for which Wikipedia articles are not consistent with ...
Offsite

Open Font Library

Openness: OPEN License: SIL OFL (http://openfontlicense.org/) Access: yes from each page (by hand) bulk: no
Offsite
