IS2140
Friday, April 18, 2014
Thursday, April 10, 2014
Sunday, April 6, 2014
Friday, March 28, 2014
Unit 11 Reading Note (3/31)
IES chapter 14: Parallel Information Retrieval
Document partitioning: the collection is split into nonoverlapping subsets of documents, and each node builds and searches an index for its own subset. Every query is sent to all nodes and the per-node results are merged. But it doesn't perform well if the index is stored on disk, because every node has to do a disk seek for every query term.
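Term partitioning addresses the disk seek problem by splitting the collection into sets of terms instead of sets of documents: each node holds the complete postings lists for its share of the terms, so a query only touches the nodes that own its terms.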
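To make the difference between the two schemes concrete, here is a minimal sketch in Python. It is not from the book: the toy documents, the modulo assignment of documents and terms to nodes, and all function names are my own assumptions for illustration. It counts how many postings lists have to be read (roughly, how many disk seeks happen) for the same query under each scheme.

```python
from collections import defaultdict

# Toy collection; doc IDs and texts are made up for illustration.
DOCS = {
    0: "parallel information retrieval",
    1: "information retrieval on disk",
    2: "parallel map reduce",
    3: "disk seek cost",
}

def build_index(docs):
    """Build an inverted index: term -> sorted list of doc IDs."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        for term in sorted(set(docs[doc_id].split())):
            index[term].append(doc_id)
    return dict(index)

def document_partition(docs, num_nodes):
    """Each node indexes its own subset of documents (doc_id mod num_nodes)."""
    shards = [{} for _ in range(num_nodes)]
    for doc_id, text in docs.items():
        shards[doc_id % num_nodes][doc_id] = text
    return [build_index(shard) for shard in shards]

def term_partition(docs, num_nodes):
    """Each node holds the full postings lists for its own subset of terms."""
    full = build_index(docs)
    nodes = [{} for _ in range(num_nodes)]
    for i, term in enumerate(sorted(full)):
        nodes[i % num_nodes][term] = full[term]
    return nodes

def postings_lists_read(nodes, query_terms):
    """Count (node, term) pairs where a node must read a postings list."""
    return sum(1 for node in nodes for t in query_terms if t in node)

def search(nodes, term):
    """Merge the postings for one term across all nodes."""
    return sorted(d for node in nodes for d in node.get(term, []))

query = ["information", "retrieval"]
by_doc = document_partition(DOCS, 2)
by_term = term_partition(DOCS, 2)
print(postings_lists_read(by_doc, query))   # every node reads a list per term
print(postings_lists_read(by_term, query))  # one list per term in total
```

Both layouts return the same documents for a term; what changes is how many separate lists are fetched. With an on-disk index, each of those fetches is a seek, which is the cost term partitioning tries to cut.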
MapReduce jobs are highly parallelizable, because both the map phase and the reduce phase can be executed in parallel on many different machines.
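A small word-count sketch shows why both phases parallelize: each map call only looks at one document and each reduce call only looks at one key, so the calls are independent of each other. This is a single-process toy, assuming a thread pool as a stand-in for the cluster; the function names (`map_fn`, `shuffle`, `reduce_fn`, `word_count`) are mine, not part of any MapReduce framework.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def map_fn(doc):
    """Map: emit a (term, 1) pair for every token in one document."""
    return [(term, 1) for term in doc.split()]

def shuffle(mapped):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for pairs in mapped:
        for key, value in pairs:
            groups[key].append(value)
    return groups

def reduce_fn(item):
    """Reduce: sum the values collected for one key."""
    key, values = item
    return key, sum(values)

def word_count(docs, workers=4):
    """Run map and reduce on a thread pool (a stand-in for many machines)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_fn, docs))       # one task per document
        groups = shuffle(mapped)                    # the only global step
        reduced = list(pool.map(reduce_fn, groups.items()))  # one task per key
    return dict(reduced)

counts = word_count(["to be or not to be", "to do is to be"])
print(counts)
```

The shuffle in the middle is the one step that needs to see all map output, which is why it is the part a real system has to move over the network.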
Friday, March 21, 2014
Unit 10 Reading Note (3/24)
IIR chapters 19 and 21
In response to a query, a search engine can return web pages whose contents it has not indexed (for example, a page it only knows about through links from pages it has indexed).
Search engines generally organize their indexes in various tiers and partitions, not all of which are examined on every search.
In a Markov chain, the probability distribution over next states depends only on the current state, and not on how the chain arrived at the current state.
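A minimal sketch of that property, assuming a made-up 3-state transition matrix `P` (not an example from IIR): the next distribution is computed from the current distribution and `P` alone, with no history kept. Repeating the step is the power iteration that the random-surfer model behind PageRank relies on.

```python
def step(dist, P):
    """One transition: new_dist[j] = sum over i of dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def steady_state(P, iterations=100):
    """Apply step() repeatedly, starting with all probability on state 0."""
    dist = [1.0] + [0.0] * (len(P) - 1)
    for _ in range(iterations):
        dist = step(dist, P)
    return dist

# Hypothetical transition matrix; row i is the next-state distribution
# when the chain is in state i, so each row sums to 1.
P = [
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
]

first = step([1.0, 0.0, 0.0], P)  # distribution after one step from state 0
pi = steady_state(P)              # long-run distribution
print(first)
print([round(p, 6) for p in pi])
```

For this particular matrix the long-run distribution settles near (0.25, 0.5, 0.25) regardless of the starting state, which is the kind of steady-state vector PageRank uses as its score.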
Thursday, March 20, 2014