Index – Collective Intelligence in Action

Index

SYMBOL

23andme.com

A

abstraction, content types
accessing DME
accuracy of predictive model
activation function
addIndexes
ad-generation engine
adjusted cosine
adjusted cosine-based similarity
advertisements
advertising
agglomerative
AJAX
Alexa
Algorithm.
    See Waikato Environment for Knowledge Analysis (WEKA).
algorithms, key learning
AlgorithmSettings
Amazon
Analyzer.
    See also Lucene.
analyzers
analyzeText
analyzing content
AOL
APIs, JDM
application phase
application server
ApplyTask.
    See Java Data Mining (JDM).
apriori
architecture
  content integration
  tagging
arrow
articles
artificial intelligence
Asian languages
association algorithms
association rules
asynchronous
asynchronously
Atom Publishing Format
Attribute
attribute selection
attributes
  nominal
  numerical
  ordinal
AttributeType
attrition rates
authority
auto-complete
automated indexers
average-link
averages

B

back-propagation
BallTree
banner advertisements
base pairs
batch process
Bayesian belief networks (BBN)
Bayesian clustering
BBN.
    See Bayesian belief networks (BBN).
Bell, Robert.
    See BellKor.
BellKor
Bennett, Jim
BFS
Bialecki, Andrzej
BigTable
binning
bioinformatics
biological relationships
black box
blob
blog comments
blog entries
  clustering
  retrieving from Technorati
BlogAnalysisDataItem.
    See also clustering.
BlogDataSetCreatorImpl
Blogdigger
BlogEntry
Blogflux
bloggers
Bloglines
blogosphere
  searching
BlogQueryParameter
BlogQueryResult
blogs
  blog-tracking companies
  searching
BlogSearcher.
    See also blogs.
BlogSearchExample
BlogSearchResponseHandler
blog-tracking companies
Bloomberg.com
bookmark
bookmarking
BooleanQuery.
    See also Lucene.
boost factor
boosting
bots
breadth-first search
buckets
buildDataPhysicalDataSet.
    See Java Data Mining (JDM).
building, intelligent crawler
BuildSettings
business intelligence

C

C4.5
C5.0
caching
CachingWrapperFilter
Carrot2, intelligent search
CART
CARuleMiner.
    See Waikato Environment for Knowledge Analysis (WEKA).
catalog
CategorySet
cause-effect.
    See Bayesian belief networks (BBN).
.CFS extension
Chakrabarti
CharTokenizer
chat logs
chat sessions
child intelligence
child nodes
Chinese language
chromosomes
Church, George
churn in items
CI.
    See collective intelligence (CI).
Cinematch
circle of influence
classification.
    See also regression.
classification terms
ClassificationModel.
    See also Java Data Mining (JDM).
ClassificationSettings
ClassificationTestTask
classifieds
clickstream
click-through
  rates.
    See decision trees.
cloaking
cluster.
    See also Java Data Mining (JDM).
Clusterer
clusterer, creating
ClusterEvaluation.
    See also Waikato Environment for Knowledge Analysis (WEKA).
clustering
  evaluating results
  high-dimension data
  sparse data
  with JDM
  with WEKA
clustering model
ClusteringModel
ClusteringSettings
Clusty, intelligent search
CNET Networks
Cofe
Cofi
collaborative analysis
collaborative approach
collaborative filtering
  model-based
  probabilistic
collaborative-based
Collective Intelligence
collective power
commercial crawlers
commit lock
community
Compass
Compete
complete-link
complex-event-processing
composite
CompositeContentType
CompositeContentTypes
compound files.
    See also Lucene.
computational biology
computeInitialDistances
computing similarities
conditional independence.
    See Naïve Bayes and Bayesian belief networks.
conditioning methods
connecting with other users
Connection
ConnectionFactory
ConnectionMetaData
ConnectionSpec
consists
content aggregation
content visited
content-based
content-based analysis
  recommendation engine
ContentBasedBlogRecoEngine
content-centric applications
conversion rates
core competency
corporate website
correlation coefficient
correlation matrix
cosine
cosine-based similarity
CoverTree
co-visitation
crawling
  deep
  process
crawling the web
createInitialSingleItemClusters
creating search index
cross-validation
crowd sourcing
Cuil
Cutting, Doug
cycles

D

DAG.
    See directed acyclic graph (DAG).
data aggregator service
data analysis
data collection
data mining
  core concepts
  example
  JDM
  process
  vendors
  WEKA
Data Mining Group (DMG)
data mining tools vendors
data, learning dataset
database
data-based search
Datar, Mayur
DataSetApplyTask
DataSetCreator
datasets
DayPop
DBScan
deadlocks
decision trees.
    See also classification.
decodeme.com
deep web
degree of belief
del.icio.us
densely populated
derived intelligence
detecting phrases
diagonal matrix
dictionary of tags
Digg
dimensionality reduction
dimensions
directed acyclic graph (DAG)
directed graph
Directory
diversity
DME, accessing.
    See also Java Data Mining (JDM).
DMG.
    See Data Mining Group (DMG).
DNA
DNA chips
Document.
    See also Lucene.
document frequency
doorway pages
dot product
dot-com era
Dryad
dynamic navigation

E

eBay
Eclipse
eigen value
Einstein
EM.
    See expectation maximization (EM).
email
  chain
  filtering
  spam.
    See classification.
embedding intelligence
English language
Epinions.com
EqualInverseDocFreqEstimator.
    See also text analysis.
ethics
Eurekster
Evaluation
event-driven
exceptions handling
ExecutionHandle
ExecutionStatus
expectation maximization (EM)
experimental data search
Experimenter
Explanation
explicit information
exploitation
exploration
Explorer
ExportTask
external content
extract phrases
extracting URLs

F

Fair Isaac
fame factor
FAQ
FastVector
FeedForwardNeuralNetSettings
Field
FieldQueryParser
Fields
file-based locking
Filter
filter
filtering.
    See also Lucene.
firewall
Flickr
flyweight pattern
focused crawling
folders
folksonomies
FontSizeComputationStrategy
FontSizeComputationStrategyImpl
forward
framework, extending
freemium
frequency count
FSDirectory.
    See also Lucene.
fundamental concepts
FuzzyQuery.
    See Lucene.

G

Gaussian cluster
Gaussian distribution
Gaussian kernel function
gender.
    See also attribute.
General Public License (GNU)
genes
genetic algorithms
genomic sequencing
geographic location
German
GermanAnalyzer
GermanStemFilter
getBlogDetails
getSynonym
global warming
global-lock system
Gmail
GNU.
    See General Public License (GNU).
Google File System
Google News
Gospodnetic
GPL
gradient descent algorithm
gradient search
greedy recommenders
groups

H

HAC.
    See Hierarchical Agglomerative Clustering (HAC).
hackability
Hadoop
Hakia
handling exceptions
handling response XML
hard-to-replicate data
Harvard Medical School
harvest from external sites
harvest rate
hashCode
Hatcher
HDFS
Heritrix
Herren, John
Hibernate
Hibernate search
hidden layer
hidden nodes
hidden unit
Hierarchical Agglomerative Clustering (HAC)
hierarchical clustering
HierarchicalClusteringImpl
HierCluster
HierDistance
high performance
high-dimension
Hinchcliffe, Dion
HitCollector.
    See Lucene.
Hits.
    See also Lucene.
Hoffman, Kevin
Hoffmann
home address
homonym
Hornick, Mark
HTMLTagCloudDecorator
HTTPS requests
Human Genome Project
hyper linking

I

IB1
IBk
IBM
IceRocket
ID3
identity matrix
IETF
if-then rules
images
immutable
implicit information
ImportTask
incremental indexing
index
  creating
  files
  modifying
  optimizing performance
  searching
indexing
  incremental
  optimizing performance
indexing service
IndexReader
IndexSearcher
IndexUpdaterService
IndexWriter
info gain.
    See decision trees.
information entropy
information retrieval
infrastructure
InitialContext
injecting synonyms
input layer
installation guide
Instance
instant gratification
instant messengers
integration
intelligence, extracting
intelligent crawling
intelligent search
interaction history
inverse document frequency (idf)
inverse of matrix
inverse user frequency
InverseDocFreqEstimator
InverseDocFreqEstimatorImpl
inverted text index
invisible web
is
isValidPhrase
Item
item churn
item-based analysis
Items
items
item-to-item
  Amazon

J

Jaccard coefficient
JAMA
Java
Java Community Process
Java Data Mining (JDM)
  accessing DME
  architecture
  clustering
  clustering settings
  connections
  datasets
  key clustering classes
  key objects
  tasks
Java Web Start
javax.datamining
javax.datamining.clustering
javax.datamining.supervised
JDBC
JDM.
    See Java Data Mining (JDM).
JDMConnectionExample
JDMException
JNDI lookup
Johnson, Dave
journal entries
JSON APIs
JSR 247
JSR 73
JVM

K

k neighbors
KDTree
kernel function
key learning algorithms
Keyes, Ken
Keyword spamming
keywords
k-fold.
    See cross-validation.
King Ping
k-means
  implementation
KMeansSettings
k-nearest neighbor (k-NN)
knowledge repository
KnowledgeFlow.
    See Waikato Environment for Knowledge Analysis (WEKA).
Koren, Yehuda
Kosmix intelligent search
KStar
KXEN

L

language-independent
languages
large-scale systems
latent classes
latent Dirichlet allocation
latent semantic indexing (LSI)
layer
leaf cluster
learning dataset
learning models
learning phase
Lemire, Daniel
LetterTokenizer.
    See Lucene.
Lexee
Libby, Dan
life sciences
Linden, Greg
linear algebra
linear model
linear regression
LinearNNSearch
Linguistic-based search
link spamming
Linkdb
LinkedIn
links, decision tree
list
list of related items
load balancer
Locality Sensitive Hashing (LSH)
Lock
lock
log likelihood
logarithmic
LogicalAttribute
LogicalDataSet
look-to-book ratio
low-dimension
LowerCaseTokenizer
LSH.
    See Locality Sensitive Hashing (LSH).
LSI.
    See latent semantic indexing (LSI).
Lucene
  architecture
  classes
  core classes
  download
  finding similar items
  indexing
  querying
  scoring
  term vector
Lucene in Action
Lucene JDBC
Lucene scoring
LuceneTextAnalyzer
Luke

M

machine learning
machine-generated tags
mailing list
manufacturer
MapReduce
margin
marketplace
Markov Decision process
mass behavior
mathematical model
mathematics
MathWorks
Matrix
matrix
matrix inversion
Meebo
memory-based
menus
message boards
messaging infrastructure
messaging server
MetaDataExtractor
MetaDataVector
meta-search
microarrays
Microsoft
mining process
MiningObject
Minwise Independent Permutation Hashing (MinHash)
mirror sites
MLP.
    See multi-layer perceptron (MLP).
model-based
ModelDetail
MOR
MoreLikeThis
movie titles
MSN
MultiFieldQueryParser
multi-layer perceptron (MLP)
multiple fields query
multiple indexes search
multiple multiplication factor model
multiple-term tokens
MultiSearcher
multi-term phrases, detect
MultiTermQuery
music
MyRank
MyWeb

N

NaïveBayes
natural language
navigenics.com
N-dimensional vector
nearest neighbor.
    See k-nearest neighbor (k-NN).
NearestNeighbourSearch
net gain
net worth
Netflix
network effect
network topologies
neural network
NeuralNetworkModelDetail
New York Times
news feeds
news items
news site
newsfeed format
NextBio
Nielsen Net Ratings, search numbers
node
nodes, decision tree
noisy ratings
nominal attributes
nonlinear
Normalize
normalizeToken
numerical attributes
Nutch
  running
  searching with
  setting up

O

online analytic processing (OLAP)
ontology
open source crawler
OPTICS
optimizing memory settings
Oracle
OrbiMed Advisors LLC
ordinal attributes
orthogonal matrix
overfitting

P

PageRank
pandemic
ParallelMultiSearcher
parent nodes
path followed
pattern matcher
pdf
PDFBox
Pearson-r correlation
Pentaho
PerFieldAnalyzerWrapper
perpetual beta
persistence model
personal health history
personal journals
personalization
  Google News
personalized recommendations
Photo
photo
photos
phrase detection
phrase dictionary
PhraseQuery
PhrasesCache
PhysicalAttribute
PhysicalAttributeRole
PhysicalDataSet
pictures
pinging
Pingoat
Pingomatic
PLSI.
    See probabilistic latent semantic indexing (PLSI).
PMML.
    See Predictive Model Markup Language (PMML).
podcasters
podcasts
polling
polls
polysemy
Porter
PorterStemFilter
PorterStemmer
PorterStemStopWordAnalyzer
Postami
Powerset
precision
predictive model
  intelligent search
Predictive Model Markup Language (PMML)
predictive models
PredictiveApriori
PrefixFilter
PrefixQuery
price
printClusterEntries
printer
prior history
probabilistic
probabilistic latent semantic indexing (PLSI)
probabilistic methods
probabilistic networks.
    See Bayesian belief networks (BBN).
probability distribution
probability theory
products
professionally developed keywords
professionally generated
profile
profile page
Profile selections
pruneDistances
purchasing history

Q

quadratic regression
quality of the item
quality of the predictive model
Quantcast
Query.
    See Lucene.
query results
query terms
QueryFilter.
    See Lucene.
QueryParser
questions and answers

R

Racofi.
    See Cofi.
radial basis function (RBF)
RAM
RAMDirectory
random
RangeFilter
RangeQuery
RapidMiner.
    See Waikato Environment for Knowledge Analysis (WEKA).
rate
ratings
  example
  persistence model
RBF.
    See radial basis function (RBF).
RBMs
RDF
Read A Blog
Reader
recall
RecordApplyTask
recommendation
recommendation engine
  Amazon
  collaborative-based
  content-based
  high performance
  Netflix
recommendation engines
recommendation system
  hybrid
  large-scale
recommendations
reference weblogs
ReferenceWeblog.
    See blogs.
registration
regression.
    See also classification.
RegressionModel
RegressionSettings
RelevanceTextDataItem
remixability
response XML, handling
result objects, implementing
RetrievedBlogEntry
RetrievedBlogHitCollector
retrieving content
review
Reviewer
reviews
Revver
Rich Site Summary (RSS)
  integrating providers with
  parsing
  RSS 2.0
rich user experience
robots.txt
Rolex watch.
    See classification.
Rollyo
RSS.
    See Rich Site Summary (RSS).
rule induction
Russian charset
RussianAnalyzer
RussianLetterTokenizer
RussianLowerCaseFilter
RussianStemFilter

S

SaaS.
    See software-as-a-service (SaaS).
SAP
SAS
saving
SAX
scaling
Scuttle
search
  architecture
  base classes
  definition
search engine
  ranking
search engines
  community-based
search history
search index
  creating
search parameters
  implementing
search performance optimization
search service
search terms
Searcher
searching
  blogosphere
  blogs
  with Nutch
Searchme
segment
select for update
sending messages
Service-Oriented Architecture (SOA)
services, definition
setBoost
setMaxBufferedDocs
setMaxFieldLength
setMergeFactor
shopping basket
sigmoidal basis functions
Silicon Valley
similarity
similarity computation, cosine-based
similarity matrix
similarity metric
SimpleAnalyzer
SimpleBiTermStopWordStemmerMetaDataExtractor
SimpleContentType
SimpleKMeans
SimpleMetaDataExtractor
SimpleStopWordMetaDataExtractor
SimpleStopWordStemmerMetaDataExtractor
simulated annealing
single nucleotide polymorphisms (SNP)
single-link
single-signons
Singleton
singular value decomposition (SVD)
sitemaps
slop
SNP.
    See single nucleotide polymorphisms (SNP).
SOA.
    See Service-Oriented Architecture (SOA).
social networking
sociology
software-as-a-service (SaaS)
Solr
sorting
SpanQuery
sparse data
sparse matrix
sparsely populated
Sphere
spider trap
Spring
Spring bean
SPSS
spurl.net
square matrix
Stack.
    See text analysis.
standard data mining API
StandardAnalyzer
StandardTokenizer
stateless
statistical
statistics
stem
stemmer analyzers
stickier
stochastic simulation
stop terms
stop words
  removing
StopAnalyzer
Strategy
subcategory
supervised learning
SupervisedAlgorithmSettings
SupervisedModel
SupervisedSettings
support vector machine (SVM)
Surowiecki, James
SVD.
    See singular value decomposition (SVD).
SVDExample
SVM.
    See support vector machine (SVM).
SVMClassificationSettings
SVMRegressionSettings
sweet spot
Synchronous services
SynonymPhraseStopWordAnalyzer
SynonymPhraseStopWordFilter
synonyms
  injecting
SynonymsCache
synonymy

T

Tag
tag cloud
  building
  definition
TagCloud.com
TagCloudElement
tagging
  introduction
TagMagnitude
TagMagnitudeVector
tags
tan hyperbolic functions
Task
Taste.
    See collaborative filtering.
taxonomyParentId
Technorati
term frequency (TF)
term frequency vector
term vector
  infrastructure
  representation
term vectors
term-frequency
TermFreqVector
TermQuery
terms, definition
text analysis
  infrastructure
text analytics
text analyzers
  stemmer analyzer
text clustering
text parsing
text processing
TextAnalyzer
TextDataItem
The Hundredth Monkey
The Long Tail
The New Yorker
The Wall Street Journal
threshold
Time
Token
TokenFilter
tokenization
  definition
tokenize
Tokenizer
Tokenizer, European
TokenStream
Tomcat
toolkit, text analysis
tools
top 10
Top Item List
top n recommendation
Top Reviewers list
TopDocCollector
TopDocs
TopFieldDocs
topical crawlers
top-seller lists
Toxi
TP53
training process
transaction history
TreeSettings

U

UGC.
    See user-generated content (UGC).
undirected path
University of Waikato
unstructured text
unsupervised learning
URLs, extracting
user interactions
user profile
user rating
user-based analysis
user-centric applications
user-generated content (UGC)
  definition
  tags
user-item dataset
user-item matrix
Userland Software
user-user similarity matrix

V

validation window
variables
vector
  space model
viral
viral marketing
Vista
VisualizeTagCloudDecorator
vocabulary
Volinsky, Chris
voting

W

Waikato Environment for Knowledge Analysis (WEKA)
  APIs
  installation
  tutorial
watches
Web 2.0
web 2.0
Web 3.0
web application
web applications
web crawler
  building
  running
web crawling
  deep
  process
  why
web server
web spiders
Web2.0
WEKA APIs
WEKA.
    See Waikato Environment for Knowledge Analysis (WEKA).
weka.associations
weka.attributeselection
weka.classifier
weka.clusterer
weka.core
weka.filters
WEKABlogClassifier
WEKABlogDataSetClusterer
WEKABlogPredictor
WEKAPredictiveBlogDataSetCreatorImpl
White, Tim
WhitespaceAnalyzer
WhitespaceTokenizer
Wiki
Wikipedia
wikis
WildCardQuery
window of terms
Winer, Dave
Wisdom of the Crowds
word frequency
works
worksheets
world wide web
write lock

X

XML response, parsing

Y


YALE.
    See Yet Another Learning Environment (YALE).
Yet Another Learning Environment (YALE)
YouTube