Abstract:
Term-partitioned indexes are generally inefficient for evaluating conjunctive queries, because they require long posting lists to be communicated between processors. Document-partitioned indexes, on the other hand, incur excessive overhead, because every processor must participate in the evaluation of every query; their scalability is therefore inadequate for real systems. We propose arranging a set of processors in a two-dimensional array, applying term partitioning at the row level and document partitioning at the column level. This paper addresses how to choose the number of rows and columns for a given number of available processors, and how to select suitable ways of partitioning the index over that topology. © 2009 Springer.
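The two-dimensional arrangement described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and everything below is an assumption made for illustration: the function names (`term_row`, `build_2d_index`, `conjunctive_query`), the modulo rule that assigns a document to a column, the CRC32 hash that assigns a term to a row, and the reading that each column holds one slice of the documents whose posting lists are then split by term across the rows of that column.

```python
import zlib
from collections import defaultdict


def term_row(term, rows):
    # Assumed rule: a deterministic hash of the term selects the row.
    return zlib.crc32(term.encode("utf-8")) % rows


def build_2d_index(docs, rows, cols):
    """Build a rows x cols grid of small inverted indexes.

    docs maps an integer doc_id to its list of terms.
    grid[r][c] maps a term to the sorted list of doc_ids that contain it,
    restricted to documents of column c and terms of row r.
    """
    grid = [[defaultdict(list) for _ in range(cols)] for _ in range(rows)]
    for doc_id in sorted(docs):
        c = doc_id % cols  # assumed rule: document partitioning by modulo
        for term in set(docs[doc_id]):
            grid[term_row(term, rows)][c][term].append(doc_id)
    return grid


def conjunctive_query(grid, terms):
    """Return (sorted matching doc_ids, set of (row, col) cells consulted)."""
    rows, cols = len(grid), len(grid[0])
    matches, touched = [], set()
    for c in range(cols):  # every column holds a different slice of documents
        postings = []
        for term in set(terms):
            r = term_row(term, rows)  # only the row owning the term is asked
            touched.add((r, c))
            postings.append(set(grid[r][c].get(term, ())))
        if postings:
            matches.extend(set.intersection(*postings))
    return sorted(matches), touched


docs = {0: ["a", "b"], 1: ["a"], 2: ["a", "b", "c"], 3: ["b", "c"]}
grid = build_2d_index(docs, rows=2, cols=2)
print(conjunctive_query(grid, ["a", "b"])[0])  # [0, 2]
```

Under these assumptions a query with q distinct terms consults at most q cells per column rather than all rows, and the posting lists exchanged within a column cover only that column's share of the documents. Which of the two effects dominates depends on how the available processors are split into rows and columns.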
Record:
Document: | Article |
Title: | Two-dimensional distributed inverted files |
Author: | Feuerstein, E.; Marin, M.; Mizrahi, M.; Gil-Costa, V.; Baeza-Yates, R. |
City: | Saariselka |
Affiliation: | Departamento de Computación, FCEyN, Universidad de Buenos Aires, Argentina; Yahoo Research Latin America, Santiago, Chile |
Keywords: | Conjunctive queries; Inverted files; Real systems; Two-dimensional arrays; Information retrieval; Information services; Two dimensional; Towers |
Year: | 2009 |
Volume: | 5721 LNCS |
First page: | 206 |
Last page: | 213 |
DOI: | http://dx.doi.org/10.1007/978-3-642-03784-9_20 |
Journal title: | 16th International Symposium on String Processing and Information Retrieval, SPIRE 2009 |
Abbreviated journal title: | Lect. Notes Comput. Sci. |
ISSN: | 03029743 |
Record: | https://bibliotecadigital.exactas.uba.ar/collection/paper/document/paper_03029743_v5721LNCS_n_p206_Feuerstein |
Citations:
---------- APA ----------
Feuerstein, E., Marin, M., Mizrahi, M., Gil-Costa, V. & Baeza-Yates, R. (2009). Two-dimensional distributed inverted files. 16th International Symposium on String Processing and Information Retrieval, SPIRE 2009, 5721 LNCS, 206-213.
http://dx.doi.org/10.1007/978-3-642-03784-9_20
---------- CHICAGO ----------
Feuerstein, E., Marin, M., Mizrahi, M., Gil-Costa, V., Baeza-Yates, R. "Two-dimensional distributed inverted files". 16th International Symposium on String Processing and Information Retrieval, SPIRE 2009 5721 LNCS (2009): 206-213.
http://dx.doi.org/10.1007/978-3-642-03784-9_20
---------- MLA ----------
Feuerstein, E., Marin, M., Mizrahi, M., Gil-Costa, V., Baeza-Yates, R. "Two-dimensional distributed inverted files". 16th International Symposium on String Processing and Information Retrieval, SPIRE 2009, vol. 5721 LNCS, 2009, pp. 206-213.
http://dx.doi.org/10.1007/978-3-642-03784-9_20
---------- VANCOUVER ----------
Feuerstein, E., Marin, M., Mizrahi, M., Gil-Costa, V., Baeza-Yates, R. Two-dimensional distributed inverted files. Lect. Notes Comput. Sci. 2009;5721 LNCS:206-213.
http://dx.doi.org/10.1007/978-3-642-03784-9_20