Category: Technology

Digital Element IP Intelligence Geolocation

A geolocation API with 20 fields of search results, all customized to your IP query. Search by IP address to return data about a geographical area, including country, region, city, internet connection speed, global coordinates, postal and country codes, time zone, and even daylight saving time observance status. Looking for more dimensions of IP searchable data? Try the ...
API

Digital Element IP Intelligence Domains

A reverse IP lookup API with 5 fields of search results, all customized to your IP query. Search by IP address to return the associated domain, company, ISP, NAICS industry code, and proxy type. Looking for more dimensions of IP searchable data? Try the Geolocation API, returning up to 20 geo data points of custom query information per IP address. Or ...
API

Digital Element IP Intelligence Demographics

A geolocation API for all your demographics needs. Search by IP address to return data about a geographical area, including number of households, gender, age groups, and language. Looking for more dimensions of IP searchable data? Try the Geolocation API, returning up to 20 geo data points of custom query information per IP address. Or the Domains API that retrieves ...
API

Enron Email Dataset

A massive dataset of emails recovered from discovery documents in the Enron trials, from the CALO Project at Carnegie Mellon University. From the distribution page: This dataset was collected and prepared by the CALO Project (A Cognitive Assistant that Learns and Organizes). It contains data from about 150 users, mostly senior management of Enron, organized into ...
Offsite

Word List - 100,000+ Official Crossword Words (Excel readable)

A word list with over 100,000 entries that are officially permitted in crossword games like Scrabble™. This word list is available in a simple, alphabetically ordered Excel format, making it convenient for reference, for spell-checking, or, in more sophisticated applications, for developers looking to build a custom spelling dictionary. The entries include variants of ...
Free

Document Metadata Based on a Sample of Web Documents from the Open Directory

DMOZ100k06 is a large research data set about document metadata based on a random sample of 100,000 web documents from the Open Directory combined with data retrieved from the social bookmarking service delicious.com, the content rating system ICRA, and the search engine Google. The data set is freely available for other research. Michael G. Noll
Offsite

AOL Search Data Mirrors

This collection consists of ~20M web queries collected from ~650k users over three months. The data is sorted by anonymous user ID and sequentially arranged. The goal of this collection is to provide real query log data that is based on real users. It could be used for personalization, query reformulation or other types of search research. From AOL’s original Read-Me ...
Offsite

A list of all 22,802 words in the Scribblenauts dictionary.

List of summonable objects from the Nintendo DS game Scribblenauts, from AARDVARK, ABOMINABLE SNOWMAN and ABSCONDER to ZOMBIE, ZUNICERATOPS and ZYGOTE. Via the Scribblenauts Wikipedia entry: Scribblenauts is an emergent puzzle action video game with the tagline “Write Anything, Solve Everything”. Its objective is to complete puzzles by summoning any object (from a ...
Free

Twitter Census: Trst Rank

Twitter influence metrics with the click of a button! Trstrank measures Twitter user reputation, importance and influence in a way far more robust than counting the number of followers. It is a sophisticated measure of a user’s relative importance among the entire Twitter network. The API measures Twitter influence across two dimensions for each query: the ...
API

Marvel Universe Social Graph

A fun Marvel Comics character collaboration graph constructed by Cesc Rosselló, Ricardo Alberich, and Joe Miro from the University of the Balearic Islands. The Marvel Universe, that is, the fictional world in which the Marvel comic books take place, is an example of a social collaboration network. The authors compare the characteristics of this universe to ...
Free

Twitter Census: Influence Metrics

Twitter influence – how to measure it? Let us count the ways: enthusiasm, feedness, sway, follow churn, trstrank, followers, outflux, interesting, chattiness, follow rate, influx… How Does The Twitter Influence API Work? The Twitter Influence API performs a “scrape” of the social network to gather and evaluate individual instances of metrics like mentions, ...
API

AOL Search Data

The AOL Search Data is a collection of real query log data that is based on real users. The data set consists of 20M web queries collected from 650k users over three months. These private searches are perfect for research and mining. The data is sorted by anonymous user ID and sequentially arranged. The collection can be used for personalization, query reformulation or ...
Free