DSpace 9.0 Highlights: Multi-cloud Storage and Simplified SAML

Submitted by Saiful on

The first release candidate of DSpace 9.0 (9.0-rc1) became available last month, and it's packed with new features and improvements. Three features in particular have caught my eye, and I'm excited about them.

1. Apache JClouds Integration

The most significant advancement in DSpace 9.0 is the integration of Apache JClouds for assetstore management. This opens up a world of possibilities for institutions looking to leverage the scalability and flexibility of cloud storage.

  • Beyond Filesystem and AWS S3: Until now, the DSpace assetstore was limited to filesystem-based storage or AWS S3 (mostly due to the rigid endpoint configuration of the AWS SDK). With Apache JClouds, many more storage backends become available.
  • A Universe of Storage Providers: Apache JClouds supports a large number of cloud storage providers through one common API. The complete list of supported providers is here: https://jclouds.apache.org/reference/providers/
  • Examples of Supported Providers:
    • Google Cloud Storage
    • Azure Blob Storage
    • Amazon S3 and S3-compatible services such as MinIO and DigitalOcean Spaces
    • OpenStack Swift
    • And many other cloud storage solutions.
  • Scalability and Reliability: Cloud storage scales with your repository as it grows, and the built-in redundancy of cloud platforms improves data reliability and disaster recovery.
  • Simplified Deployment and Maintenance: Managing storage in the cloud can significantly reduce the overhead associated with local infrastructure. You can easily scale storage capacity without the need for physical hardware upgrades.

This integration simplifies deploying and maintaining DSpace, especially for organizations with a cloud-first strategy. Being able to use MinIO or another in-house S3-compatible store is also a great boon for local development and testing.
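To make the "one API, many providers" idea concrete, here is a minimal Python sketch of the design pattern behind it. It is not JClouds or DSpace code: `BlobStore`, `InMemoryStore`, `FilesystemStore`, `open_store`, and `archive` are all hypothetical names invented for illustration. The point is only that the calling code (`archive`) stays the same when the provider behind it is swapped.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class BlobStore(Protocol):
    """Hypothetical common interface that every storage backend implements."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Toy backend that keeps blobs in a dict (stands in for a test store)."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


class FilesystemStore:
    """Toy backend that writes each blob to a file under a root directory."""

    def __init__(self, root: str) -> None:
        self._root = Path(root)
        self._root.mkdir(parents=True, exist_ok=True)

    def put(self, key: str, data: bytes) -> None:
        (self._root / key).write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self._root / key).read_bytes()


# Hypothetical registry: a provider id selects the backend class.
PROVIDERS = {"memory": InMemoryStore, "filesystem": FilesystemStore}


def open_store(provider: str, **options) -> BlobStore:
    """Create a store from a provider id plus provider-specific options."""
    return PROVIDERS[provider](**options)


def archive(store: BlobStore, key: str, data: bytes) -> bytes:
    """Caller code: identical no matter which backend is plugged in."""
    store.put(key, data)
    return store.get(key)


demo_root = tempfile.mkdtemp()
print(archive(open_store("memory"), "bitstream-1", b"pdf bytes"))
print(archive(open_store("filesystem", root=demo_root), "bitstream-1", b"pdf bytes"))
```

In a real deployment the provider id and its options would come from configuration rather than code, which is what makes switching from, say, a local MinIO instance to a hosted service a configuration change.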

2. Streamlined SAML Authentication (No More Shibboleth Dependency!)

Another fantastic addition to DSpace 9.0 is the native SAML authentication support, eliminating the previous dependency on Shibboleth. This simplifies the authentication process and makes it easier to integrate DSpace with existing identity management systems.

  • Simplified Configuration: The new SAML implementation significantly reduces the complexity of setting up and configuring SAML authentication.
  • Increased Flexibility: You can now connect DSpace to a wider range of Identity Providers (IdPs) that support SAML 2.0.
  • Reduced Dependencies: Removing the Shibboleth dependency leaves fewer components to install and maintain.

This is a great step forward for ease of deployment, as many institutions already use SAML for authentication.

3. Matomo Analytics Integration (Privacy-First Statistics)

In an increasingly data-driven world, understanding how your repository is used is crucial. However, this often comes with concerns about user privacy. DSpace 9.0 addresses this by introducing integration with Matomo Analytics, a leading open-source, privacy-first analytics platform.

Beyond generic usage stats, this integration allows institutions to gain deeper insights into repository activity while maintaining full control over their data and respecting user privacy.

  • Enhanced, Privacy-Focused Insights: Matomo provides detailed statistics on item views, downloads, visitor behavior, and more, giving you a comprehensive understanding of how users interact with your repository. Crucially, Matomo is designed with privacy in mind, offering features like data anonymization and opt-out options to help institutions comply with data protection regulations like GDPR.
  • Data Ownership and Control: Unlike third-party analytics services where your data resides on external servers, the Matomo integration allows institutions to host their own analytics instance, ensuring complete ownership and control over their usage data.
  • Customizable Reporting: Matomo offers highly customizable dashboards and reports, enabling institutions to tailor their analytics to their specific needs and track key performance indicators relevant to their repository's goals.

This integration is a significant step forward for institutions that require robust usage statistics but prioritize the privacy and control of their data. It aligns perfectly with the principles of open scholarship and responsible data management.
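For readers curious what "sending usage data to your own Matomo instance" can look like on the wire, here is a small Python sketch that builds a request URL for Matomo's Tracking HTTP API. It does not show how DSpace itself talks to Matomo, which this post does not cover. The hostnames, the site id, and the helper name `build_tracking_url` are assumptions made for the example; `idsite`, `rec`, `apiv`, `url`, `action_name`, and `download` are parameters of Matomo's tracking endpoint (`matomo.php`).

```python
from urllib.parse import urlencode


def build_tracking_url(matomo_base, site_id, page_url,
                       action_name=None, download_url=None):
    """Build a Matomo Tracking HTTP API URL for a page view or a download."""
    params = {
        "idsite": site_id,  # id of the website as configured in Matomo
        "rec": 1,           # required flag: record this request
        "apiv": 1,          # tracking API version
        "url": page_url,    # full URL of the page being tracked
    }
    if action_name:
        params["action_name"] = action_name  # human-readable page title
    if download_url:
        params["download"] = download_url    # marks the hit as a file download
    return f"{matomo_base.rstrip('/')}/matomo.php?{urlencode(params)}"


# Hypothetical self-hosted Matomo instance and repository URLs.
tracking_url = build_tracking_url(
    "https://stats.example.org/",
    3,
    "https://repo.example.org/items/abc",
    download_url="https://repo.example.org/bitstreams/xyz/download",
)
print(tracking_url)
```

The sketch only builds the URL so that it runs offline; actually reporting the event would be an HTTP GET to that URL. Because the base URL points at a server the institution runs, the usage data never leaves its infrastructure.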

Looking Ahead to DSpace 9.0

These three features alone make DSpace 9.0 a highly anticipated release for me. The Apache JClouds integration, the native SAML support, and the Matomo integration are just a glimpse of the many improvements coming our way.

Disclaimer: AI tools assisted in writing parts of this blog entry.