Pure is a Java EE application that uses the subset of Java EE available in servlet containers such as Tomcat and Jetty. The application is built around the Spring IoC container and the associated Spring frameworks. Pure is written 100% in Java.
The Pure core module provides functionality for working with research metadata and associated binary representations (actual publications, images etc.). The core provides standard CRUD functionality, as well as a host of other features such as reporting, transformation from/to formats (CERIF, DC, etc.), access control, workflows, multi-tenancy, and dependency management. Most other modules are built on top of the functionality provided by the core.
Pure's performance is directly linked to CPU speed, and we recommend fast processors and a performance-oriented processing architecture. Sun's SPARC architecture is not recommended: Pure will run on it, but it does not utilise the special features of this architecture, and performance is disappointing.
AREA | RECOMMENDATION |
Processors | Dual Intel Westmere X5670 CPUs |
Memory, single-box setup | 8 GB RAM; more than 16 GB will not enhance performance |
Memory, separate DB-server and application server | 6 GB RAM; more than 12 GB will not enhance performance |
Storage, Pure system data | Min. 70 GB storage |
Storage, fulltext files and other binary attachments | Depends on full text deposit policies. Storing to an expandable filesystem (SAN, etc.) is recommended. |
Server OS | Windows, Linux, UNIX (e.g. Solaris). We have equally good experiences with all three major platforms. |
Database | Oracle, Microsoft SQL Server, PostgreSQL. Recent versions. |
Virtualisation | All major platforms are supported. We use VMware ourselves. |
Pure is normally integrated with local systems that hold information about research; please see the table below.
Pure communicates with these internal systems via a built-in integration framework called PIP (Pure Integration Platform). It is a two-way data synchronisation framework that uses one synchronisation job per data source. The framework is plugin-based, so each synchronisation job can be added and maintained in a standardised way with little effort.
Integration with internal systems is a project in itself. It is carried out in collaboration between local IT staff and our implementation team, subject to planning at the start-up workshop; please also see Implementation. The division of tasks and responsibilities is straightforward:
PARTY | TASK |
Institution’s integration team | Setting up views or warehouse; data selection; data integrity; data quality and completeness |
Atira’s implementation team | Communication with views or warehouse; data mapping; data relations; testing and feedback |
Once the different synchronisation jobs are set up on the PIP, system administrators can manage them freely: set their synchronisation schedules, perform ad-hoc runs outside the schedule, monitor run results in logs, and more. Administrators can also set up entirely new jobs within the existing synchronisation scope. Please also see Systems Integration Management and the section about PIP under Interfaces.
The result of a data-integration project with internal systems is reliable, well-tested synchronisation of selected, quality-assured data from those systems into Pure, where data is mapped and related correctly according to Pure's CERIF-based datamodel.
The table below shows a small selection of the internal systems with which we have previously integrated Pure:
SYSTEM | STANDARDISED |
Agresso, Finance (UNIT4) | Yes |
Agresso Award Management (UNIT4) | Yes |
UNIT4 CRM1 | Yes |
CEDAR, Finance | No |
CODA Finance (UNIT4) | No |
SITS, Student Administration | No |
SAP, Finance | No |
Navision, Finance | No |
ScanPas, HR | No |
Meltwater, Press coverage | Yes |
InfoPaq, Press coverage | Yes |
pFACT, project costing | Yes |
ResourceLink, HR | No |
Integration with user authentication and single sign-on systems is also based on the PIP. Local IT staff need to do very little beyond granting our implementation team access to the relevant authentication or single sign-on system for a limited period of time within the project. The table below shows the user authentication and single sign-on systems that Pure has previously been integrated with:
SYSTEM | STANDARDISED |
LDAP, authentication | Yes |
Active Directory authentication | Yes |
Shibboleth authentication/SSO | Yes |
Kerberos authentication | Yes |
CAS, single sign-on | Yes |
Co-Sign, single sign-on | Yes |
Integration with repositories is not based on the PIP but on so-called repository connectors; please see under Interfaces for technical information. These repository connectors are a part of Pure, but they are maintained according to external needs, e.g. when a new version of the corresponding repository is released.
The result of integration with a repository is that metadata and full-text objects are automatically deposited in the repository at the same time as they are submitted to Pure.
For more information about repository integration, please see Fulltext and Repository functionality under Library functionality. Local IT and library staff are not involved heavily in repository integrations except when defining the rules and workflows by which content must be deposited in the repository.
The table below shows the repository systems that Pure has been integrated with:
SYSTEM | STANDARDISED |
DSpace | Yes |
ePrints | Yes |
Fedora | Yes |
Equella | Yes |
Berkeley Press Digital Commons | Yes |
Integration with external bibliographic and bibliometric data sources is based on Pure's Self-import module, which comes with standard interfaces for a number of sources; please see under Interfaces for more technical information. Integration with these sources comes with Pure out of the box and makes it possible to import bibliographic metadata, bibliometric data, and fulltext files from the sources shown in the table below:
SOURCE | DESCRIPTION | STANDARDISED | LICENSE |
PubMed | Online source | Yes | Not required |
Web of Science | Online source | Yes | Required |
Scopus | Online source | Yes | Required |
ArXiv | Online source | Yes | Not required |
CrossRef | Online source | Yes | N/A |
JournalTOCs | Online source | Yes | Not required |
WorldCat | Online source | Yes | Required |
Bibliotek.dk | Online source | Yes | Not required |
BibTex | File source | Yes | Not required |
RefMan | File source | Yes | Not required |
RIS | File source | Yes | Not required |
Arto | File source | Yes | N/A |
CABabstract | Online source | Yes | N/A |
NOTE: Adding new sources for self-import, whether file-based or API-based, is relatively simple. Please enquire.
Most institutions require legacy data to be imported; often publications from former publication bases, but it can also be other types of content. Pure's PXA Framework supports such imports.
A specialised file format called PXA is part of the Legacy Import Framework. PXA is short for Pure XML Archive. It is a container-format for metadata, text- and binary files, and relations. The PXA format will always match the data-model in Pure precisely, which allows direct import of legacy data and related fulltext into Pure.
Content can be imported to a specific stage in a workflow. When importing publications, for example, data can be imported directly to the final stage of the publication workflow if you trust the source, or to the initial stage of the publication workflow if you do not.
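The choice of target stage described above can be sketched in a few lines of Python. This is only an illustration, not Pure's implementation: the stage names, the `ImportedRecord` type, and both functions are hypothetical, and the stages of a real publication workflow are configured per installation.

```python
from dataclasses import dataclass

# Hypothetical stage names, ordered from initial to final stage.
WORKFLOW_STAGES = ["entry_in_progress", "for_validation", "validated"]


@dataclass
class ImportedRecord:
    title: str
    source: str
    stage: str


def target_stage(source_is_trusted: bool) -> str:
    """Trusted sources land in the final stage, untrusted ones in the initial stage."""
    return WORKFLOW_STAGES[-1] if source_is_trusted else WORKFLOW_STAGES[0]


def import_records(titles, source, source_is_trusted):
    """Create one record per title, all placed in the stage chosen for the source."""
    stage = target_stage(source_is_trusted)
    return [ImportedRecord(title=t, source=source, stage=stage) for t in titles]
```

Records from an untrusted source then enter the workflow at the first stage and pass through every later validation step, while records from a trusted source skip them.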
Legacy data import via PXA is usually carried out as a service by Atira during the implementation project.
In addition to the PXA framework, a generic XLS import feature is available, which allows the import of data into Pure via Excel spreadsheet files. A specific format must be observed for the file to be valid. Compliant files can just be uploaded and imported in one simple operation. This feature is built to empower research institutions to carry out legacy import themselves. Similarly, legacy import of research outputs is available by RIS and BibTex file formats.
Depending on the data quality in the legacy source, legacy import can be difficult. A common problem is that Authors in the legacy data do not have the same unique identifiers as Persons have in the HR system, which makes alternative steps necessary in order to relate legacy publication Authors to Persons properly. Many other challenges can present themselves as well. To be prepared for them, we often ask for a small data sample from the legacy system before proceeding.
Standard documentation is maintained in parallel to the continuous development of Pure and made available to all Pure owners.
TYPE | DESCRIPTION | FOR WHOM? |
Printable user documentation | General task-guides for end-users. All primary tasks covered for all user roles | All users |
Printable user documentation | Quick-start guide for researchers | Simple roles |
Built-in help | A) Built-in help pages and B) context-sensitive help-messages. Administrators can modify existing help resources and create new ones | All users |
Printable user documentation | General technical documentation for IT-staff | IT staff |
Printable user documentation | Web-Service API documentation for programmers | Programmers |
Printable user documentation | Dual data-model documentation: A) Each content-type and field is documented technically (field type, length, etc.), and B) with a user-friendly description of the field's purpose. | A) IT staff and Programmers, B) Simple roles |
It is possible to extract fully detailed datamodel documentation from Pure at any time in Word and Excel formats. It includes both a technical description (field types, etc.) and a narrative description.
Pure has a browser-based user-interface. Firefox, Internet Explorer, Safari, and Chrome are supported on Windows and Mac in relatively recent versions. These browser- and platform-combinations are guaranteed to work, and users should not expect errors.
Other browsers and operating systems will also work, but the functionality is not guaranteed, and users should expect occasional errors. All browser-support errors can be reported and will be fixed at no charge.
Pure comes with a built-in Web-Service API, which comprises two separate Web-Services; see below. All content in Pure that is available for public access is available from this API. This makes it possible to display content from Pure on the institution's own websites and to let its own web applications retrieve data from the API, so that Pure can be used as an integrated component in a local Service Oriented Architecture or even in a national SOA-based infrastructure.
Pure's Web-Service API comprises a Document/Literal service offering XML over SOAP with WSDL-support and a REST service offering just the XML payload without encapsulation.
This Web-Service API is stable; it changes only rarely, and only after prior announcement. An unstable service is also available for those in need of the latest options.
Rich method libraries are available with Pure's Web-Service API. Methods exist for all purposes and can easily be used with operators for the specific task. To publish a personal list of publications online, for example, the relevant method would be called with criteria for the desired person, the desired publication types, and the desired render style (Harvard, Vancouver, CBE, MLA, etc.).
Publication metadata sets can include pointers to related fulltext files in Pure, which makes it possible to create online publication lists with fulltext links. Such publication lists can be rendered automatically in a number of formats, including Harvard, Vancouver, CBE and MLA.
Please note that data cannot be submitted to Pure through the Web-Service API; it is read-only. This is for security reasons and because there is no need: Pure's other ingestion methods (the systems integration platform PIP, the file-import features, and the Publication Import framework) cover data submission.
OAI is short for Open Archives Initiative. This initiative is behind the OAI-PMH protocol, the Open Archives Initiative Protocol for Metadata Harvesting. Pure incorporates both an OAI data providing mechanism and an OAI data harvesting mechanism. The former lets other systems harvest metadata from Pure; the latter makes it possible to harvest metadata in bulk from outside OAI sources.
In Pure's OAI data providing mechanism, XML metadata is provided from Pure in Dublin Core format but can also be provided in CERIF-XML and MODS formats. In Pure's OAI data harvesting mechanism, XML metadata can likewise be retrieved in Dublin Core, CERIF-XML, and MODS formats.
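As a rough illustration of what an OAI-PMH harvest looks like from the client side, the Python sketch below builds a `ListRecords` request URL and pulls Dublin Core titles out of a response. The `verb` and `metadataPrefix` parameters and the `oai_dc` prefix come from the OAI-PMH protocol itself; the base URL used in the usage note, the function names, and the sample response are assumptions made for this example and do not describe an actual Pure endpoint.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set used inside oai_dc records.
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL for the given base URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def extract_titles(response_xml):
    """Return the text of every dc:title element in an OAI-PMH response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.iter("{%s}title" % DC_NS)]


# Hand-written sample response, trimmed to the parts used above.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""
```

Calling `list_records_url("https://pure.example.org/oai")` with that placeholder base URL yields a request for Dublin Core records. Prefixes for other formats such as CERIF-XML or MODS are defined by each data provider and would have to be looked up with the protocol's `ListMetadataFormats` verb.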
OAI harvesting is used by authorities in several countries for national assessment exercises and similar projects, where research organisations have data return obligations.
Dynamic integration with external systems is based on a development framework called PIP (Pure Integration Platform). It offers cost-effective setup and maintenance of data synchronisation jobs in Pure, and these jobs can be managed by system administrators.
It is possible to let the PIP handle logical operations such as aggregation of data (e.g. summarisation of multiple transactions from a finance system into one value) and selection of criteria based on variables (e.g. retrieve only full-time staff).
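The two kinds of logical operation mentioned above, aggregation and selection, can be sketched as plain functions. This is a minimal Python illustration and not PIP code: the record layouts (`project_id`, `amount`, `fte`) and the function names are hypothetical.

```python
from collections import defaultdict


def total_per_project(transactions):
    """Aggregation: sum multiple finance transactions into one value per project."""
    totals = defaultdict(float)
    for transaction in transactions:
        totals[transaction["project_id"]] += transaction["amount"]
    return dict(totals)


def full_time_only(staff):
    """Selection: keep only staff whose full-time equivalent is at least 1.0."""
    return [person for person in staff if person["fte"] >= 1.0]
```

A synchronisation job would apply steps of this kind to the rows read from the source system before the result is mapped onto Pure's data model.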
PIP will also ensure that proprietary data modelling in the source system is replicated correctly in Pure; e.g. staff moving from one department to another, or organisational units being opened, closed, merged, and taken over by other units. This happens automatically where possible, which is the case with most source systems. Pure includes tools that will allow system admins to do the same manually in Pure, where that is the only option.
More information is available under Systems integration above.
Pure interfaces with repositories via so-called connectors; not via Pure's Integration Platform. Connectors are upgraded with the repositories as necessary. A connector is a standardised, proprietary development framework. It facilitates complex two-way exchange of data with the repository. Both simple submit operations and more advanced update and delete operations are handled by the connector. Open, standardised frameworks such as SWORD will likely take over the function of Pure's connectors but currently lack maturity; the lack of an update method is one example.
The following is a description of the stages that a deposit from Pure to DSpace goes through:
Step 1: Translation from the country specific model to a generic CRIS model
System: Pure
Source: The country/customer specific Pure data model
Destination: A generic CRIS data model
Technology: In memory object-to-object mapping coded in Java
Reason: To minimise the number of model-to-model translations Pure has to support. Almost all contact with external systems is done through the generic CRIS model. This means that Pure must support N+M translations instead of N*M (where N=number of data models and M=number of external systems)
Step 2: Translation from the generic model to MODS XML
System: Pure
Source: A generic CRIS data model
Destination: MODS XML
Technology: In memory object to XML translation coded in Java
Reason: MODS (wrapped in METS) is one of the few import formats natively supported by DSpace
Step 3: The MODS XML is transferred to DSpace in a METS envelope
System: Pure + DSpace
Source: MODS XML wrapped in a METS XML envelope
Destination: MODS XML wrapped in a METS XML envelope
Technology: Web services (WebDAV)
Reason: The METS package is deposited to DSpace via DSpace's own web service interface. Minor augmentations had to be made to DSpace's web service for it to support the necessary operations.
Step 4: The METS envelope is removed and the MODS XML is translated to DIM format
System: DSpace
Source: MODS XML
Destination: DSpace Intermediate Metadata format (DIM)
Technology: XML stylesheet
Reason: The MODS XML is translated to the DSpace DIM format using an XML stylesheet. This process is carried out by the DSpace METS ingester
A deposit into ePrints goes through roughly the same steps. The only difference is that the translation from MODS to the ePrints format is done in Pure, whereas the translation from MODS to DIM is done on the DSpace server. Having the translation done in Pure gives some flexibility and reduces the need to coordinate across teams (our implementation team versus the local repository team).
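The reasoning behind Step 1, that routing everything through one generic model needs N+M translators instead of N*M, can be made concrete with a small Python sketch. It is an illustration under assumptions: the model names, field names, and translator functions are hypothetical, and the MODS fragment is reduced to a bare title.

```python
from xml.sax.saxutils import escape


def pairwise_translations(n_models, m_systems):
    """Translators needed when every data model maps directly to every external system."""
    return n_models * m_systems


def hub_translations(n_models, m_systems):
    """Translators needed when every mapping passes through one generic model."""
    return n_models + m_systems


# One translator per source data model (hypothetical field names).
TO_GENERIC = {
    "uk": lambda record: {"title": record["output_title"]},
    "dk": lambda record: {"title": record["titel"]},
}

# One translator per target format; only a minimal MODS title is produced here.
FROM_GENERIC = {
    "mods": lambda generic: "<mods><titleInfo><title>%s</title></titleInfo></mods>"
    % escape(generic["title"]),
}


def translate(record, model, target):
    """Translate in two hops: source model to generic model, then generic model to target."""
    return FROM_GENERIC[target](TO_GENERIC[model](record))
```

With, say, 5 data models and 4 external systems, the hub approach needs 9 translators where direct mappings would need 20, and supporting a new external system means adding one entry to `FROM_GENERIC` rather than one per data model.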
Legacy research outputs can be imported using the RIS and BibTeX file formats. In addition, a datamodel-specific Excel file format is supported, which allows richer records to be imported than the standard RIS and BibTeX formats do.
All three formats are available in the import interface of Pure's Self-import module. Authors are matched automatically during import, and so are organisational units, external persons, and external organisations.
Imported outputs can be enriched (e.g. adding a relation to the Project that the output was produced under) and rectified (e.g. correcting misspellings or mis-matches) as appropriate. It can be done through flexible and customisable workflows, which will ensure data quality and distribute the workload.
Comprehensive identification and handling of duplicates is also available including a merge feature, which allows users to choose the best parts from each duplicate.
The tools mentioned above all allow institutions to deal with legacy outputs themselves. In addition, Pure comes with a PXA legacy import framework, which allows unlimited import of metadata, fulltext (text and binary objects), and relations (e.g. person-to-organisation). Full field mapping can be handled here too, and custom bulk operations on data can be carried out. PXA is short for Pure XML Archive, which is the customisable file format used for import on this framework. The PXA framework is usually used by technical staff at Atira for importing legacy outputs and other data types as a service during implementation, if the institution wants such work done.
Pure's Self-import module comes with interfaces for the following on-line sources and file formats:
SOURCE | DESCRIPTION | STANDARDISED | LICENSE |
PubMed | Online source | Yes | Not required |
Web of Science | Online source | Yes | Required |
Scopus | Online source | Yes | Required |
ArXiv | Online source | Yes | Not required |
CrossRef | Online source | Yes | N/A |
JournalTOCs | Online source | Yes | Not required |
WorldCat | Online source | Yes | Required |
Bibliotek.dk | Online source | Yes | Not required |
BibTex | File source | Yes | Not required |
RefMan | File source | Yes | Not required |
RIS | File source | Yes | Not required |
Arto | File source | Yes | N/A |
CABabstract | Online source | Yes | N/A |
Sources can be customised and new ones can be added by request. This is also true for proprietary internal sources. Please enquire.
Roles, rights, and access strategies are how access to content is delegated in Pure. Together they handle the fact that different users must be given different access privileges at different stages of a data item's lifetime in Pure.
Each user will have at least one role with a certain set of rights. The role determines what functionality is available and what data can be viewed and edited. The user-interface is adaptive: users only see the functionality that they actually have rights to access.
Roles can be assigned automatically to users from an HR-system or similar, or they can be set by administrators. Each user can have several roles.
Roles can be customised to meet local requirements exactly. However, a standard Roles/Rights-model comes with Pure, which tends to match the requirements of universities well without customisation.
Roles can be Global or Organisational. Global roles grant access to data and functionality across the system while Organisational roles grant access within any specified organisations.
Users can be a Personal User, which is an organisational role and the most basic role in Pure. Users can also have one or more Editor roles and one or more Administrator roles.
Personal users create content at their own organisation (an institute or a department, for example).
Organisational Editors enrich and/or validate content created by Personal Users; Publications and other research output, Projects, Activities, Impact Cases, etc.
Global Editors create and manage globally relevant content; e.g., Publishers and Journals.
Organisational Administrators have special privileges within one or more specified organisations, which supersede those of lesser roles at the same organisation.
Global Administrators have ultimate privileges. The primary of these roles is the one referred to simply as "Administrator"; the master administrative role in Pure.
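The difference between Global and Organisational roles can be sketched as a simple permission check. The Python below is a hypothetical illustration only: the `RoleAssignment` type and `can_act` function are invented for this example, and the check ignores organisational hierarchy, workflows, and access strategies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoleAssignment:
    role: str
    # None marks a global role; otherwise the organisation the role applies to.
    organisation: Optional[str] = None


def can_act(assignments, required_role, organisation):
    """Return True if any assignment grants the required role for the organisation."""
    for assignment in assignments:
        if assignment.role != required_role:
            continue
        if assignment.organisation is None or assignment.organisation == organisation:
            return True
    return False
```

A user holding "Editor of Research output" for one department would pass the check only for that department, whereas a global role such as "Editor of Journals" passes for any organisation.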
Roles and Rights are closely related to workflows. As mentioned under "Workflows" below, there can be one workflow per content type. If advanced workflows with more than three steps are needed, more Editors with more fine-grained tasks will also be set up. This means that creation, editing, enrichment and validation of content can be concentrated on a few persons or distributed to many, depending on the business processes that must be supported in the individual situation.
A Roles/Rights model in Pure will always follow the datamodel. The one shown below is for the UK datamodel, for example:
ROLE | RIGHTS | PURPOSE |
BASIC USERS |
Personal User | Organisational | Lets users create, retrieve, update, and delete personal Research Outputs, Projects, Esteem, and Activities. |
EDITORS |
Editor of Publishers | Global | Lets users create, retrieve, update, delete, and merge Publishers across all organisations. |
Editor of Journals | Global | Lets users create, retrieve, update, delete, and merge Journals across all organisations. |
Editor of Events | Global | Lets users create, retrieve, update, delete, and merge Events across all organisations. |
Editor of External organisations | Global | Lets users create, retrieve, update, delete, and merge External Organisations across Pure. |
Editor of External persons | Global | Lets users create, retrieve, update, delete, and merge External Persons across Pure. |
Editor of Research output | Organisational | Lets users create, retrieve, update, delete, and merge Research Outputs within their own organisation. |
Editor of Projects | Organisational | Lets users create, retrieve, update, and delete Projects within their own organisation. |
Editor of Organisations | Organisational | Lets users create, retrieve, update, and delete sub-ordinate organisations within their own main-organisation. |
Editor of Activities | Organisational | Lets users create, retrieve, update, and delete Activities within their own organisation. |
Editor of Application/Award | Organisational | Lets users manage Grant applications and awards organisationally |
REF2014 |
Assessment Editor | Global | Lets users participate in the administration of the REF return at organisational level |
Editor of UoA | Global | Lets users manage Units of Assessment globally as part of the REF administration effort |
Editor of Impacts | Global | Lets users manage Impacts globally |
Editor of Staff2014 | Global | Lets users manage Staff globally as part of the REF administration effort |
REF2014 administrator | Global | Lets users participate in the administration of the REF return at the highest level at the institution. |
ADMINISTRATORS |
Administrator | Global | Grants a user global administrator privileges in Pure. |
Technical Administrator | Global | Administrator role limited to technical administration (without access to academic content) |
User administrator | Organisational | Lets a person administrate users within one organisation such as a department. |
Reporter | Organisational | Grants a user the right to run reports on all content within one organisation. |
Administrator of Research Output | Organisational | A top-level role that is allowed to administrate research outputs |
De-duplicator of Research Output | Global | A global role allowed to work with duplicates and de-duplication across Pure. |
Access strategies are the underlying rules by which access is delegated to roles. Pure comes with a pre-defined set of access strategies. It fits the needs and requirements of most research institutions, but it can be customised if necessary.
Access to content objects is normally restricted by Pure's Roles/Rights model and underlying access strategies, as described above. In addition, researchers and other users with sufficient rights can mark content as confidential. Content marked as confidential is not accessible through the normal set of access strategies: users who would ordinarily be able to access it cannot do so for any purpose. It is visible only to the users associated with the content, its relevant editors, and the institution's highest-level administrators.
Content marked as confidential will remain confidential. Only the user that originally made the setting or the other users mentioned above will be able to change the setting away from "confidential".
More information about confidentiality is available under Researcher functionality.
Field validation means defining rules for submission of content. A basic example of field validation is making the title-field on publications mandatory, but more advanced field validation can also be employed. Validation is specific for content types - making the title fields mandatory for publications doesn't mean it must be mandatory on book contributions, for example.
Field validation also works on related content types. For example, you can create a person without his or her academic title, but if an academic title is mandatory for authors of publications, the person without the title cannot be added as an author on any publication until a title has been added.
Controlled error-messages will always be shown to the user, explaining what is wrong and how to fix it. The field with faulty or missing information is scrolled into view, given focus, and marked in pink. Pure comes with default error-messages, but they are all configurable by Administrators.
TYPE | EXAMPLE |
Simple validation of field value-types | Only numerical values in number-fields |
Simple validation of field value-types | E-mail addresses must contain one @-sign and at least one .-sign. |
Validation of field-utilisation by content type | Title is required on a publication |
Validation of field-utilisation by content type | There must be at least one Author on a publication |
Conditional field-validation | If a publication is classified “External”, at least one author must be internal |
Conditional field-validation | A classification on a classification scheme must be formatted as a content-type URI if the classification scheme is of the type “Content type classification scheme” |
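Several of the rule types in the table can be expressed as small checks. The Python sketch below is illustrative only and does not reflect how validation is configured in Pure: the dictionary layout of a publication, the error texts, and the function names are hypothetical, and the e-mail rule follows the simple wording of the table rather than a full address grammar.

```python
def is_valid_email(value):
    """Simple value-type rule: exactly one @-sign and at least one dot."""
    return value.count("@") == 1 and "." in value


def validate_publication(publication):
    """Return a list of error messages; an empty list means the record is valid."""
    errors = []
    # Field-utilisation rule: title is required on a publication.
    if not publication.get("title"):
        errors.append("Title is required on a publication.")
    # Field-utilisation rule: at least one author.
    authors = publication.get("authors", [])
    if not authors:
        errors.append("There must be at least one author on a publication.")
    # Conditional rule: only checked when the classification is "External".
    if publication.get("classification") == "External" and not any(
        author.get("internal") for author in authors
    ):
        errors.append("An External publication needs at least one internal author.")
    return errors
```

Returning every failed rule at once, rather than stopping at the first, matches the idea of showing the user what is wrong and how to fix it in a single pass.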
Pure's user-interface is available in English, German, Dutch, Finnish, Swedish and Danish, and it can be made available in any number of additional languages.
Content can also be in specific languages. At Danish universities, for example, Publications must be in both Danish and English, while Activities and Projects are Danish-only. Individual fields can also be language-specific: in the Danish case, only some fields are required to be bilingual, not all. The different languages on a piece of content are referred to as the Primary and Secondary language.
COUNTRY | USER INTERFACE | CONTENT |
Germany | DE, UK | Primary/Secondary: German/English |
United Kingdom | UK | Primary: English; Optional: Welsh, Celtic (not implemented) |
Belgium | NL, UK, DE | Primary/Secondary: Dutch/English |
Sweden | SE, UK | Primary/Secondary: Swedish/English |
Finland | FI, UK, SE | Primary/Secondary/Tertiary: Finnish/English/Swedish |
Denmark | DK, EN | Primary/Secondary: Danish/English |
Pure is standardised in several ways in different areas: The technical interfaces, Publication Import interfaces, Legacy import, and the export functionality are areas standardised by data formats, protocols and other measures.
Further, standardised frameworks underlie all the areas of functionality, where much change is expected. These areas are for example Publication Import, where new sources can be added to the Publication Import framework, and the Integration framework, where integration with new sources can be added. Adding new rendering formats to Pure is also framework-based. Extending these areas is fast, cost-effective and reliable because of the use of standardised frameworks.
Finally, each of the available standard datamodels is CERIF-based, in accordance with the currently governing version.
Atira A/S
Niels Jernes Vej 10
9220 Aalborg Oest
Denmark
Phone: (+45) 96 35 61 00
VAT no. 26835526
General info: info@atira.dk
PURE support: pure-support@atira.dk
Our technical area is server-side application architecture, development, and implementation. Our business domain is Research Information Management. We supply our product Pure, an enterprise-class CERIF-based CRIS system.
Pure, released in 2003, is licensed for 47,900 research staff at our 75 references in 8 countries.
Copyright © 2012 Atira A/S, a Reed Elsevier Company. All rights reserved.