Data sets tagged with "names"
Email Data Sets
Because of privacy concerns, large and realistic email corpora are very hard to obtain. Here you can find a few email data sets, as well as a data set of newsgroup text, annotated with personal name spans. The email corpora given here were extracted from the Enron corpus, made public by the Federal Energy Regulatory Commission. As a second type of informal text, ...
Offsite
Given Name Frequency Project
Quite a bit of data is available for download, but only as individual files (not in a single file). According to the web page, the project offers:
* GINAP: code to standardize given names and correct common problems in name samples. Such standardization is an important step in analysis of given names.
* Popular given names, US 1801 to 1999: a collection of sets of standardized ...
Offsite
Given Name Frequency Project: Analysis of Given Name Popularity
The Given Name Frequency Project provides analysis, tools, and data to spur further work on given names. The data includes popular given names in the US from 1801 to 1999, samples of names from England before 1800 from a diverse set of sources, the popularity of the name Mary over the past 800 years, and a sample of cotton workers in Manchester, England from ...
Offsite
Members of the European Parliament (5th parliamentary term - 1999 to 2004)
List of MEPs (Members of the European Parliament) in the 5th term from 1999-2004. It includes the name, web profile ID and political group membership. The data has been obtained from the official website of the parliament: http://www.europarl.europa.eu/members/archive/term5.do?language=EN. The ID can be used to access each MEP’s profile via ...
Free
Members of the European Parliament (4th parliamentary term - 1994–1999)
List of MEPs (Members of the European Parliament) in the 4th term from 1994-1999. It includes the name, web profile ID and political group membership. The data has been obtained from the official website of the parliament: http://www.europarl.europa.eu/members/archive/term4.do?language=EN. The ID can be used to access each MEP’s profile via ...
Free
Popular baby names by year, top 1000 (US Social Security Administration)
For a list of the most popular names for a particular year of birth (any year after 1879), enter the year and the length of the popularity list. Data downloaded from the Social Security Administration Popular Baby Names site (http://www.ssa.gov/OACT/babynames/index.html). You can script a download of the full list using curl: mkdir -p ...
Free
Case Closed Names
This dataset consists of a collection of Infoboxes from Wikipedia on the topic of Case Closed Names.
Free
1990 Census Name Files
Three separate datasets obtained from the 1990 census. One set contains last names, one contains male first names, and one contains female first names. Each record gives the name, frequency in percent, cumulative frequency in percent, and rank.
Offsite
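The record layout described for the 1990 Census name files (name, frequency in percent, cumulative frequency in percent, rank) could be read with a short parser like the sketch below. It assumes each line holds those four fields separated by whitespace; that layout, the `NameRecord` type, and the sample rows are illustrative assumptions, not taken from the files themselves.

```python
from typing import Iterable, List, NamedTuple


class NameRecord(NamedTuple):
    """One row of a name file: hypothetical container for the four fields."""
    name: str
    freq_pct: float      # frequency in percent
    cum_freq_pct: float  # cumulative frequency in percent
    rank: int


def parse_name_line(line: str) -> NameRecord:
    # Assumes exactly four whitespace-separated fields per line.
    name, freq, cum, rank = line.split()
    return NameRecord(name, float(freq), float(cum), int(rank))


def parse_name_file(lines: Iterable[str]) -> List[NameRecord]:
    # Skip blank lines; raise ValueError on malformed ones via parse_name_line.
    return [parse_name_line(line) for line in lines if line.strip()]


# Made-up rows for illustration only (not real census values).
sample = ["ALPHA 1.5 1.5 1", "BETA 0.5 2.0 2", ""]
records = parse_name_file(sample)
```

With real files, `parse_name_file(open(path))` would work the same way, provided the whitespace-separated assumption holds.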
Ted Pedersen - Name Discrimination Data / Name Disambiguation Data / Name Ambiguity Data / Named Ent
Contains data where ambiguous entity names in text have been disambiguated. The data has either been manually disambiguated, or created by conflating multiple names into a single ambiguous pseudo-name.
Offsite
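The pseudo-name approach mentioned in the name disambiguation entry above, conflating several distinct names into one ambiguous token while remembering which name each occurrence came from, can be sketched roughly as follows. This is a generic illustration of the idea, not the code used to build that data; the function name `conflate`, the pseudo-name string, and the example sentence are all hypothetical.

```python
import re
from typing import List, Sequence, Tuple


def conflate(text: str, names: Sequence[str], pseudo: str) -> Tuple[str, List[str]]:
    """Replace every occurrence of any name with `pseudo`.

    Returns the rewritten text and, in order of occurrence, the original
    name behind each replacement (the gold label for disambiguation).
    """
    # Try longer names first so a short name never pre-empts a longer one
    # that starts at the same position.
    ordered = sorted(names, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(n) for n in ordered))

    labels: List[str] = []

    def replace(match: "re.Match[str]") -> str:
        labels.append(match.group(0))
        return pseudo

    return pattern.sub(replace, text), labels


# Invented example sentence and names.
text = "Alice Smith met Bob Jones. Bob Jones left."
new_text, labels = conflate(text, ["Alice Smith", "Bob Jones"], "NAME_X")
```

Because the true name behind each `NAME_X` is recorded, a disambiguation method can be scored against `labels` without any manual annotation.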
German Male Forenames
List of German Male Forenames
30,000 names ordered by commonness
See Also: List of 10,000 German Female Forenames ordered by commonness
$100.00
German Female Forenames
List of German Female Forenames
10,000 names ordered by commonness
See Also: List of 30,000 German Male Forenames ordered by commonness
$100.00
Upcoming Executions | Death Penalty Information Center
List of stays of execution and of upcoming executions, updated regularly.
Offsite
1000 USA Company CEOs' contacts (primarily small companies)
Contacts for 1000 USA company CEOs (primarily small companies). Includes company name, CEO name, landline, number of employees, year founded, product, and sometimes cell phone. No emails.
$80.00
Open Street Map
All of OpenStreetMap, converted to GeoJSON entities and available for querying.
API
Geonames Places As GeoJSON
All entities in the Geonames database as GeoJSON.
API
Baby names: boys names from England and Wales, 1996 to 2010
Counts of boys names from England and Wales, by birth year, from 1996 to 2010. The data was originally published by the UK Office for National Statistics (ONS). 1996 to 2009 data available here: http://www.ons.gov.uk/ons/publications/re-reference-tables.html?edition=tcm%3A77-49222 2010 data available here: ...
Free
Baby names: girls names from England and Wales, 1996 to 2010
Counts of girls names from England and Wales, by birth year, from 1996 to 2010. The data was originally published by the UK Office for National Statistics (ONS). 1996 to 2009 data available here: http://www.ons.gov.uk/ons/publications/re-reference-tables.html?edition=tcm%3A77-49222 2010 data available here: ...
Free
Baby names: boys names from Scotland, 2002 to 2010
Counts of boys names from Scotland, by birth year, from 2002 to 2010. The data was originally published by the General Register Office for Scotland (GRO-Scotland). The original datasets for 2007 to 2010 are available here: http://www.gro-scotland.gov.uk/statistics/theme/vital-events/births/popular-names/index.html. Datasets for 2002 to 2006 obtained via email request to ...
Free


