Skip to content

Latest commit

 

History

History
144 lines (102 loc) · 4.58 KB

README.md

File metadata and controls

144 lines (102 loc) · 4.58 KB

Build Status Coverage Status GitHub version

was-thumbnail-service

Rails app for providing thumbnail images for web archiving seed object URIs. These thumbnails are intended to be used by discovery environments that include the seed objects, such as SearchWorks, and by access tools such as sul-embed.

The thumbnail images are derived from WARC artifacts in an openwayback machine corresponding to the seed URI. There may be one or more thumbnail images per seed URI -- particular "mementos" are chosen for images based on the simhash gem's "hamming distance" between mementos.

Background job workers are used to determine if new mementos have been added to the Wayback machine and if so, if new thumbnails should be generated.

Note that when was_robot_suite step wasSeedDissemnationWF triggers thumbnail image creation, thumbnail is added to DOR object contentMetadata. However new thumbnail images created by the background workers are put only in the digital stacks, not in DOR object contentMetadata. This allows DOR seed objects to redo their thumbnails in the digital stacks without altering the DOR object itself.

Download the code locally

git clone https://github.com/sul-dlss/was-thumbnail-service.git
cd was-thumbnail-service/
bundle install

Configuration

Database

config/database.yml

MySql
  1. Add database user/password

  2. Create the mysql databases

- mysql -e 'create database test_was_thumbnail_service;'
- mysql -e 'create database dev_was_thumbnail_service;'

or from mysql client:

create database dev_was_thumbnail_service;
create database test_was_thumbnail_service;
MySql (in Docker)
  1. Start MySql container
docker run --name test-mysql -e MYSQL_ROOT_PASSWORD=<your password> -p 3306:3306 -d mysql:5
  1. Configure config/database.yml:
default: &default
  adapter: mysql2
  encoding: utf8
  pool: 5
  username: root
  password: <your password>
  host: 127.0.0.1

And to shut it down:

docker stop test-mysql
docker rm -vf test-mysql
SQLite (for local testing)

Revise config/database.yml with these changes:

development:
  # <<: *default
  # database: dev_was_thumbnail_service
  adapter: sqlite3
  database: db/test.sqlite3
  pool: 5
  timeout: 5000

# Warning: The database defined as "test" will be erased and
# re-generated from your development database when you run "rake".
# Do not set this db to the same as development or production.
#test:
#  <<: *default
#  database: test_was_thumbnail_service
test:
  adapter: sqlite3
  database: db/test.sqlite3
  pool: 5
  timeout: 5000

Sqlite does not need a separate step to create the database.

Note about testing with SQLite

Note that some specs fail with sqlite. There appears to be a simhash value in some test fixtures that only works with mysql, not sqlite. To get around this, there is a rake task to run all tests except those that require mysql:

rake spec_sqlite

Perform database migrations (set up the schema)

rake db:migrate

Operation

Run the Rails server.

rails s

You can get an overview of the system by going to http://localhost:3000/ . This should show a table of seed objects in the database, their druids, URIs, number of mementos and number of images.

Adding a new seed

locally

In the shell, you can add a new seed to the database through the API. Choose a uri in the openwayback app you entered in your configuration.

curl -i "http://localhost:3000/api/seed/create?druid=druid:ab123cd4567&uri=http://slac.stanford.edu"

It should return 200, meaning the seed URI is added successfully. At first, the rails app table will show the seed's druid and uri, but there will be no mementos or thumbnails. Those will be added via a background process.

in our infrastructure

When a new seed is accessioned (generally via was-registrar), a workflow step from wasSeedDissemnationWF (from was_robot_suite) will call the api (as above) to add the seed to the was-thumbnail-service.

Thumbnail generation

This is done as a background process with delayed_job:

bin/delayed_job start
rake RAILS_ENV=development was_thumbnail_service:run_thumbnail_monitor