This page is obsolete.
The official documentation is at: http://docs.alfresco.com
Note: This design has been superseded by Index Version 2 as of Release 1.4.
A mechanism is required to search against the properties, full text, content,
and semi-structured data in the repository. The structural data takes two forms:
the parent-child relationships between nodes, and the location within
hierarchies used for categorisation.
The persistence of the data may be separate from the index used to search and locate data.
For example, external content may be indexed, or content may be stored separately from other information.
The intention is to use Lucene as the index and search engine.
It allows the production of an unstructured index with potentially repeating fields.
Each field in the index can be optionally:
Not all documents need to index the same fields.
This is a good match to the extensible content model.
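The idea of an unstructured index with repeating fields can be illustrated with a minimal sketch in plain Java. It deliberately does not use the Lucene API: `SketchDocument` and its methods are hypothetical names introduced here only to show that a field may hold several values and that different documents may carry different fields.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an index document; this is NOT a Lucene class.
final class SketchDocument {
    // Field name -> every value added under that name, in insertion order.
    private final Map<String, List<String>> fields = new LinkedHashMap<>();

    // Adding the same field name again appends a value instead of replacing it.
    SketchDocument add(String name, String value) {
        fields.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
        return this;
    }

    // Returns all values for a field, or an empty list if this document lacks it.
    List<String> values(String name) {
        return fields.getOrDefault(name, Collections.emptyList());
    }

    // Documents are free to omit fields that other documents carry.
    boolean hasField(String name) {
        return fields.containsKey(name);
    }
}
```

For instance, a node filed under two categories would add the hypothetical field `category` twice, while a node without content would simply never add a `content` field.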
It is not clear whether we should use Lucene to store document content as well as to index it.
If delayed indexing or non-storage of one attribute requires properties to be obtained via the node service, then all properties will be returned.
Lucene seems an obvious choice as it resolves the following issues:
Lucene also has disadvantages:
We should tokenise each field/attribute according to its type definition.
For example, a path should be treated in a special way.
We should map to the same analyser on the query side.
Integers and other numeric types need to be stored and tokenised in a form that allows lexicographical ordering. The same applies to dates. Timestamps need to be indexed as dates and treated specially in queries.
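One possible way to meet the ordering requirement is sketched below, assuming 64-bit signed values: flip the sign bit and write the result as fixed-width hexadecimal, so that string comparison agrees with numeric comparison. Dates are reduced to epoch milliseconds and encoded the same way. The class `SortableNumbers` and its methods are hypothetical; the sketch is not taken from the Alfresco or Lucene code.

```java
import java.time.Instant;

// Hypothetical helper: encodes numbers so that String order equals numeric order.
final class SortableNumbers {
    private SortableNumbers() {}

    // Flipping the sign bit maps Long.MIN_VALUE..Long.MAX_VALUE onto the
    // unsigned range 0..2^64-1; zero-padded hex keeps every term 16 chars wide.
    static String encode(long value) {
        return String.format("%016x", value ^ Long.MIN_VALUE);
    }

    // Reverses encode(): parse as unsigned hex, then flip the sign bit back.
    static long decode(String term) {
        return Long.parseUnsignedLong(term, 16) ^ Long.MIN_VALUE;
    }

    // Dates and timestamps are encoded through their epoch-millisecond value.
    static String encodeDate(Instant instant) {
        return encode(instant.toEpochMilli());
    }
}
```

With this encoding a range query over numbers or dates becomes a plain range over terms, at the cost of terms that are not human-readable.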
The data dictionary should control the indexing behaviour.
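As a rough illustration of type-driven tokenisation, the sketch below keeps one tokeniser per property type and exposes a single entry point, so that indexing and query parsing cannot drift apart. The `PropertyType` values and the splitting rules are assumptions made for the example; a real data dictionary would supply its own types and analysers.

```java
import java.util.Arrays;
import java.util.EnumMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical registry: one tokeniser per property type, shared by index and query.
final class TypeDrivenTokenisers {
    // Assumed property types; not taken from the Alfresco data dictionary.
    enum PropertyType { TEXT, PATH }

    private final Map<PropertyType, Function<String, List<String>>> byType =
        new EnumMap<>(PropertyType.class);

    TypeDrivenTokenisers() {
        // Free text: lower-case, then split on runs of non-word (ASCII) characters.
        byType.put(PropertyType.TEXT, value ->
            Arrays.stream(value.toLowerCase(Locale.ROOT).split("\\W+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList()));
        // Paths: split on '/' only, keeping each element intact and case-sensitive.
        byType.put(PropertyType.PATH, value ->
            Arrays.stream(value.split("/"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toList()));
    }

    // Called for both indexed values and query terms of the given type.
    List<String> tokenise(PropertyType type, String value) {
        return byType.get(type).apply(value);
    }
}
```

Routing both sides through `tokenise` is the design point: a query term is broken up exactly as the stored value was, so the two can match.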
There are two scenarios
When we are not in the two-phase commit world, we have to do more detailed error recovery.
With JTA, we will know whether we need to recover and just need to know what to do.
For each store, we need to keep the following when we prepare a transaction:
If we find a non-JTA transaction that still has info, we need to determine:
In the JTA world we are told what to recover.
To test the index state:
If an index is absent, we have to rebuild it.
If an index is partially corrupted, for example by the deletion of an index segment, it is effectively broken and should be rebuilt from scratch.
In the non-JTA world we would not commit the index before the database. There is no need to back out a change from the index.
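The database-first ordering can be sketched as follows. This is a minimal illustration under assumptions, not the actual implementation: `Database` and `IndexDelta` are hypothetical interfaces standing in for the real resources, and the pending index changes are assumed to be held apart from the main index until they are merged.

```java
// Hypothetical sketch of committing the database before the index.
final class DatabaseFirstCommit {
    private DatabaseFirstCommit() {}

    // Assumed resource interfaces; these are not Alfresco or Spring types.
    interface Database {
        void commit() throws Exception;
        void rollback();
    }

    interface IndexDelta {
        void merge();    // make the pending changes visible in the main index
        void discard();  // throw the pending changes away
    }

    // Returns true if both the database and the index were committed.
    static boolean commit(Database db, IndexDelta delta) {
        try {
            db.commit();
        } catch (Exception e) {
            // The main index was never touched, so nothing has to be backed
            // out of it: dropping the pending delta is sufficient.
            db.rollback();
            delta.discard();
            return false;
        }
        // The database is committed; only now are index changes made visible.
        delta.merge();
        return true;
    }
}
```

The sketch does not handle a failure during `merge()`. In that case the index lags the database, which is where the more detailed recovery described for the non-JTA case would be needed.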
Support for JTA.
We should switch to the Spring pattern for keeping transactional resources.
Produced by all internal factories.
Conditional on being a JTA or Hibernate transaction manager
JTA
NodeService.save()
nonJTA
This implies we have one synchronisation that optionally does the indexer stuff.
We should be done before the Spring synchronisation.
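The ordering constraint can be modelled in plain Java as below. This sketch only imitates the idea of ordered transaction callbacks and does not use Spring's own synchronisation classes; `Sync`, `indexingSync`, and the order values are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of ordered before-commit callbacks; not the Spring API.
final class OrderedSynchronisations {
    interface Sync {
        int order();          // lower values run earlier
        void beforeCommit();
    }

    private final List<Sync> syncs = new ArrayList<>();

    void register(Sync sync) {
        syncs.add(sync);
    }

    // Runs callbacks in ascending order, so a lower order value finishes first.
    void fireBeforeCommit() {
        List<Sync> sorted = new ArrayList<>(syncs);
        sorted.sort(Comparator.comparingInt(Sync::order));
        for (Sync sync : sorted) {
            sync.beforeCommit();
        }
    }

    // One synchronisation that flushes node changes and, only if an indexer
    // is configured (non-null), also performs the index work.
    static Sync indexingSync(int order, Runnable flushNodes, Runnable indexerOrNull) {
        return new Sync() {
            public int order() { return order; }
            public void beforeCommit() {
                flushNodes.run();
                if (indexerOrNull != null) {
                    indexerOrNull.run();
                }
            }
        };
    }
}
```

Registering the indexing synchronisation with a lower order value than any framework callback ensures that its work is finished before theirs begins.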
We have modified Lucene 1.4.3 to address a number of minor issues and enhancements.
These are described in Lucene Extensions and Issues.