Goal
Install Apache Tika with Solr on Platform.sh
Assumptions
- A Drupal 8 project on Platform.sh.
- Solr configured on that project.
Problems
Apache Tika extracts text and metadata from binary files (e.g. PDF files) so that their contents can be indexed and searched in Solr.
Steps
1. Install search modules using composer
Install and configure search_api and search_api_solr:
composer require drupal/search_api
composer require drupal/search_api_solr
More information on setting up Solr with Drupal 8 is available in the public documentation.
2. Install search attachments module
Install search_api_attachments using composer:
composer require drupal/search_api_attachments
The Search API Attachments module can be pointed at the Tika jar file so that the text of PDF documents is extracted and indexed.
3. Install the Tika jar
Add a build hook to .platform.app.yaml, or modify the existing one, so that it downloads the Tika jar file into the project.
# .platform.app.yaml
hooks:
    build: |
        mkdir -p /app/srv/bin
        cd /app/srv/bin && curl -OL http://download.nextag.com/apache/tika/tika-app-1.16.jar
The build hook creates the directory /app/srv/bin and downloads the Tika jar executable into it. The full .platform.app.yaml can be found in an example project.
Consult the documentation for more information about build and deploy hooks on Platform.sh.
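The curl call in the build hook does not use the -f flag, so an HTTP error from the mirror (for example a 404 page) would still be saved under the jar's file name and the build would succeed. A minimal sketch of a stricter download follows. The helper name fetch_jar is hypothetical and is not part of Platform.sh or Tika. It assumes curl is available in the build container, as the hook above already does.

```shell
# fetch_jar URL DEST_DIR
# Hypothetical helper: download a jar into DEST_DIR and return non-zero
# if the download fails or the resulting file is empty.
fetch_jar() {
  url="$1"
  dest_dir="$2"
  mkdir -p "$dest_dir" || return 1
  dest="$dest_dir/$(basename "$url")"
  # -f: fail on HTTP errors, -sS: quiet but still print errors, -L: follow redirects
  curl -fsSL -o "$dest" "$url" || return 1
  # An empty file means nothing useful was downloaded.
  [ -s "$dest" ]
}

# Possible use inside the build hook, with the same URL and directory as above:
# fetch_jar "http://download.nextag.com/apache/tika/tika-app-1.16.jar" /app/srv/bin
```

Because a failing build hook aborts the build, a broken download is caught at build time rather than later, when indexing attachments fails.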
4. Configure the search API attachments
Visit /admin/config/search/search_api_attachments in a browser and configure the extraction method, the path to the java executable, and the path to the Tika jar.
The Tika path must match the location used by the build step in the .platform.app.yaml file (/app/srv/bin/tika-app-1.16.jar in the example above).
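Before entering the paths in the form, it can help to confirm them from a shell on the environment. The following is a sketch under stated assumptions: the helper names tika_jar_path and check_tika are hypothetical, and the defaults (version 1.16, application root /app) are taken from the build hook shown earlier.

```shell
# tika_jar_path [VERSION] [APP_DIR]
# Hypothetical helper: print the jar path that the build hook is expected to produce.
tika_jar_path() {
  version="${1:-1.16}"
  app_dir="${2:-/app}"
  printf '%s/srv/bin/tika-app-%s.jar\n' "$app_dir" "$version"
}

# check_tika JAR
# Hypothetical helper: confirm that java is on PATH and that the jar exists
# and is not empty, then print both paths for copying into the form.
check_tika() {
  jar="$1"
  command -v java >/dev/null 2>&1 || { echo "java not found on PATH" >&2; return 1; }
  [ -s "$jar" ] || { echo "missing or empty jar: $jar" >&2; return 1; }
  echo "java: $(command -v java)"
  echo "tika: $jar"
}

# Possible use:
# check_tika "$(tika_jar_path)"
```

If both checks pass, running java -jar "$(tika_jar_path)" --text with the path to a sample PDF should print the extracted text, which confirms that Tika itself works before Drupal is involved.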
Conclusion
Apache Tika is now set up with Solr on Platform.sh.