Apache Jena

From lotico

Apache Jena is an open source Semantic Web framework for Java. It was initially developed at HP Labs in Bristol, UK (until October 2009), and after a brief period as OpenJena it became an Apache Software Foundation project. Jena provides an API to read data from and write data to Resource Description Framework (RDF) graphs, which Jena calls Models. A Model can be populated with data from files, databases, URLs or a combination of these. A Model can be queried through a query language such as SPARQL 1.1, including ARQ-specific extensions. The framework includes several internal reasoners that can be set up to work with different OWL and rules profiles.

Jena supports storing and serialising RDF graphs to a number of different backends and formats:

  • RDF databases
  • relational databases
  • Turtle
  • N-Triples
  • Notation 3
  • TriG
  • N-Quads
  • TriX
  • RDF Binary
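To give a feel for what one of the listed text formats looks like, here is a minimal sketch of N-Triples output written in plain Java. It is not Jena's writer: the `NTriplesSketch` class and its `Triple` type are hypothetical, and the sketch assumes absolute IRIs and plain string literals only (no blank nodes, datatypes or language tags, which Jena's own serialisers handle).

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: these types are hypothetical and are not Jena classes.
class NTriplesSketch {

    // A triple whose object is a plain string literal (no datatype or language tag).
    static final class Triple {
        final String subjectIri;
        final String predicateIri;
        final String literal;

        Triple(String subjectIri, String predicateIri, String literal) {
            this.subjectIri = subjectIri;
            this.predicateIri = predicateIri;
            this.literal = literal;
        }
    }

    // Escape the characters that must not appear raw inside an N-Triples string literal.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '\\': sb.append("\\\\"); break;
                case '"':  sb.append("\\\""); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    // One triple per line: <subject> <predicate> "literal" .
    static String toNTriples(List<Triple> triples) {
        StringBuilder sb = new StringBuilder();
        for (Triple t : triples) {
            sb.append('<').append(t.subjectIri).append("> ")
              .append('<').append(t.predicateIri).append("> ")
              .append('"').append(escape(t.literal)).append("\" .\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample data made up for this illustration.
        List<Triple> triples = new ArrayList<>();
        triples.add(new Triple("http://example.org/jena",
                               "http://purl.org/dc/elements/1.1/title",
                               "Apache Jena"));
        System.out.print(toNTriples(triples));
    }
}
```

The line-per-triple layout is what makes N-Triples easy to stream and diff; formats such as Turtle and TriG trade that simplicity for prefixes and more compact grouping.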

Project History

On 28 August 2000 Brian McBride, a researcher at HP Labs Bristol in the UK, announced the Jena project with a brief message to the W3C RDF Interest Group mailing list. His hope was to build “better tools [that] will help to encourage the adoption of RDF”. Although he had been calling for an Apache project for the Semantic Web for some time, little did Brian know that 20 years on Jena would be a top-level Apache project with a strong and steady developer community and almost half a million lines of code in the release. Today Jena consists of an RDF API, RDF storage solutions including a triple store called TDB, an OWL API and, last but not least, a fully fledged SPARQL server called Fuseki. Jena is now more than just a Java API to process RDF: it is shaping technology standards and has helped the wider Semantic Web community by advancing the tools needed to bring the Semantic Web vision to its full potential.

The evolution of the Jena project is described loosely along four distinct phases of the project's life cycle: an innovation phase that bootstrapped the project; an early adoption phase that followed the availability of a usable code base; a late adoption phase that evolved along a more stable release cycle with new product features and design patterns; and finally the most recent phase, which is associated with attempts to mainstream the technology and can be described as a prolonged phase of adoption in the mainstream.

Early Innovation

In 2000 the first generation of the Jena project was hosted by HP Labs (HP Laboratories) in Bristol, England. The project was driven by Brian McBride, a researcher and Semantic Web evangelist working on the implementation of the W3C's RDF specification. In December 2000 Brian McBride described Jena as an experimental Java API for manipulating RDF. The name of the project was inspired by Brian’s daughter, Jena McBride. The source code of version 1.0, the first official release, mentions that some of the ideas are “derived from the SiRPAC API, which [was most] recently maintained by Sergey Melnik. It borrows from the subject centric approach of David Megginson's DATAX API and the cascading calls style of JDom.”

In January 2001 the first Jena release, 1.0, was published and made available for public download. In April 2001 Brian became co-chair of the W3C RDFCore Working Group, which was initially part of the W3C Metadata Activity; that activity was in turn replaced by the W3C Semantic Web Activity in February 2001. The working group had a significant impact on the development of standards and technologies related to RDF.

Jena 1

The initial Jena team in 2001 comprised Brian McBride (core API and Berkeley DB support), Andy Seaborne (RDQL query support), Dave Reynolds (web hacker and SQL support), Ian Dickinson (DAML API), Chris Dollin and Jeremy Carroll (ARP). On June 6 Jeremy Carroll announced the release of version 1.1.0 on the RDF-Interest group mailing list. On October 1, 2001 Brian McBride announced the release of version 1.2, with ARP, RDQL, DAML+OIL and a persistent store, on the Jena users mailing list. The primary reader processed RDF/XML using the SAX XML API to load data and generate the RDF model. The Jena 1.2 release had three different datastores: an in-memory store, storage in a relational database (RDBMS) via Java Database Connectivity (JDBC), and the Sleepycat embedded datastore Berkeley DB. [1]

Early Adoption

Jena 2

The release of Jena 2.0 fell, most likely not coincidentally, on August 28th, 2003, only three short years after the first announcement of the Jena project. Notable changes included the introduction of the Factory pattern, a design pattern that lets callers create RDF and query model instances without being exposed to the creation logic and refer to the new objects through a common interface. The release also introduced the Graph package to Jena. Jena version 2.5 introduced Dataset in the ARQ package to manage different types of data sources.
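The Factory pattern mentioned above can be sketched in plain Java as follows. This is an illustration of the pattern only, not Jena's source: `TripleStore`, `InMemoryTripleStore` and `TripleStoreFactory` are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: the types below are hypothetical names, not Jena classes.
class FactorySketch {

    // The common interface that callers program against.
    interface TripleStore {
        void add(String subject, String predicate, String object);
        int size();
    }

    // A concrete implementation that callers never name directly.
    private static final class InMemoryTripleStore implements TripleStore {
        private final List<String[]> triples = new ArrayList<>();

        @Override
        public void add(String subject, String predicate, String object) {
            triples.add(new String[] { subject, predicate, object });
        }

        @Override
        public int size() {
            return triples.size();
        }
    }

    // The factory hides which class is instantiated and how it is set up.
    static final class TripleStoreFactory {
        private TripleStoreFactory() { }

        static TripleStore createDefaultStore() {
            return new InMemoryTripleStore();
        }
    }

    public static void main(String[] args) {
        // The caller only ever sees the TripleStore interface.
        TripleStore store = TripleStoreFactory.createDefaultStore();
        store.add("http://example.org/jena", "http://example.org/releasedIn", "2003");
        System.out.println(store.size()); // prints 1
    }
}
```

In Jena itself the corresponding entry point is `ModelFactory.createDefaultModel()`, which returns an object typed as the `Model` interface, so the backing implementation can change without affecting calling code.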

By February 2004 HP Labs reported that the Jena project had more than 17,000 downloads. Martin Merry, a Semantic Web research manager at HP Labs Bristol, described it as the “most popular toolkit for Semantic Web developers” and reported that “approximately 80% of Semantic Web applications that we have seen at recent major conferences have been based on Jena”.

Late Adoption Phase and Project Migration

From HP to OpenJena

By October 2009 the Jena project had moved its server from HP to openjena.org (a server provided by Talis) to reflect that Jena had become an open source project with developers from multiple organisations.

Apache Jena Project

In December 2011 the Jena project officially became an Apache project when Andy Seaborne announced Jena 2.7.0-incubating, the first release of Jena under the Apache License and the first release made while Jena was incubating, sponsored by the Apache Incubator PMC. In April 2012 it became a top-level project (TLP), with the status of a full Apache project and Andy Seaborne as PMC chair. Its most significant contribution was the release of Fuseki, a fully fledged SPARQL server, as a separate download in June 2012.

Apache Jena goes Mainstream

Apache Jena 3

By November 2010 the Jena project had been accepted into the Apache Incubator. It took the Jena team (now the Apache Jena team) 12 years to make it to release 3.0. While in retrospect it looks more like a maintenance release that added just a few new features, it made significant changes to the package naming and source code structure. In addition, Java 8 became the minimum requirement for using Jena. By April 2012 the Apache Jena project had "graduated" and become a TLP (Top Level Project) at the Apache Software Foundation.

Notable Developments

Persistent Storage Solutions

From its early days the storage of RDF models (the persistence layer for Jena) was an area of particular interest for the Jena team. Version 1.0 allowed API users to create RDF models and to read and write them from and to RDF/XML files. During execution the model was loaded into memory. The first “persistent storage module” mentioned on the mailing list, which allowed API users to access and manipulate RDF triples directly during execution, was made available in release 1.2 on October 1, 2001 and was based on Berkeley DB (BDB). It was later accompanied by SDB, an extended persistent store using a relational database, and finally followed by the TDB and TDB2 releases.

SDB 1.0 was announced on 28 November 2007 and supported Oracle 10g R2 and 11g including Oracle Express, IBM DB2 9 including DB Express, Microsoft SQL Server 2005 including SQL Server Express, PostgreSQL v8, MySQL 5.0 (>=5.0.22), HSQL DB 1.8 and Apache Derby 10.2. SDB 1.0 required Java 5. SDB was first mentioned on 15 February 2006 on the Jena dev mailing list and was advertised as a new database layout for SPARQL, long before SPARQL became a W3C recommendation and the dominant query language for the Semantic Web.

A Query Language for RDF

Because the RDF API lets developers interact with the RDF Model directly at the API level, calls for a more abstract form of interaction were soon voiced. In February 2001 Libby Miller made the case for an API-independent query language named SquishQL [4], a play on "SQL-ish", and referred to Guha's "Enabling Inferencing" QL98 position paper as an inspiration for the work.[3] This led to the development of a query language similar to SQL in the relational database world. RDQL was first mentioned in the 1.2 release in June 2001 as a way to query RDF, but it still worked primarily on the back of the XML processing chain. RDQL was later proposed as a W3C member submission by Andy Seaborne on 9 January 2004.[4] The submission explicitly mentions the QL98 position statement by Guha, Lassila, Miller and Brickley as an influence on the work on RDQL.[5] SPARQL, a dedicated query language for RDF, was first mentioned in 2004 and discussed as a W3C working draft on 12 October 2004.
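The difference between API-level access and a declarative query can be illustrated with a small plain-Java sketch of triple-pattern matching, the basic building block that query languages of this family share. This is a hand-rolled illustration, not RDQL, SPARQL or any Jena API: the `PatternSketch` class, its `match` method and the sample data are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: a hand-rolled matcher, not RDQL, SPARQL or a Jena API.
class PatternSketch {

    // Return every triple matching the pattern; a null component acts as a wildcard.
    static List<String[]> match(List<String[]> triples, String s, String p, String o) {
        List<String[]> result = new ArrayList<>();
        for (String[] t : triples) {
            boolean matches = (s == null || s.equals(t[0]))
                           && (p == null || p.equals(t[1]))
                           && (o == null || o.equals(t[2]));
            if (matches) {
                result.add(t);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data made up for this illustration.
        List<String[]> triples = Arrays.asList(
            new String[] { "ex:jena", "ex:createdBy", "ex:brian" },
            new String[] { "ex:jena", "ex:language", "Java" },
            new String[] { "ex:joseki", "ex:language", "Java" });

        // API-level form of the question "which subjects have ex:language Java?"
        for (String[] t : match(triples, null, "ex:language", "Java")) {
            System.out.println(t[0]);
        }
    }
}
```

A query language lets the same question be stated as text, independent of any particular API; in SPARQL it would read roughly `SELECT ?s WHERE { ?s ex:language "Java" }`, with the variable `?s` playing the role of the wildcard.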

Jena Query Syntax Support
RDQL 1.0 1.1 ARQ Function 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019
x x x x SELECT x
x x x ASK x
x x x DESCRIBE x
x x x UNION x
x x x OPTIONAL x
x x x FILTER x
x x x PREFIX x
x x COUNT x
x x MINUS x
x x PATH x
x x x AS
x x GROUP x
x x x LANG
x x EXISTS x
x x NOT
x LET x

GeoSpatial and GeoSPARQL support

In 2007 the geospatial module was introduced to the Jena project by Marco Neumann and Taylor Cowan. An online spatial query service with the name GeoSPARQL was released in 2008 at the geosparql.org site. In 2019 the Jena project introduced the GeoSPARQL module to increase compatibility with the OGC GeoSPARQL standard.

An RDF Server

The first RDF server was called Joseki and was first mentioned as a beta release on 22 March 2002. The Joseki server initially supported the query language RDQL and provided a service for updating and querying remote RDF models, based on the idea of a web protocol named RDF NetAPI. The first SPARQL server, called Fuseki, was mentioned on 12 October 2010 (and with a reference on 9 October 2010 on jena-devel@lists.sourceforge.net) as a preview release of a SPARQL 1.1 server. ARQ 2.1 beta introduced "Update" on 21 August 2007. Support for the "SPARQL HTTP Graph Store Protocol", part of the SPARQL 1.1 work, was introduced in 2011.

Software Releases

find a detailed timeline visualization here

Release Name Date Release Notes
Apache Jena 4.2.0 2021-09-16 ANN
Apache Jena 4.1.0 2021-06-04 ANN
Apache Jena 4.0.0 2021-04-01 ANN requires Java 11
Apache Jena 3.17 2020-12-01 ANN
Apache Jena 3.16 2020-07-13 ANN
Apache Jena 3.15 2020-05-19 ANN
Apache Jena 3.14 2020-01-19 ANN
Apache Jena 3.13.1 2019-10-11 ANN
Apache Jena 3.13.0 2019-09-29 ANN
Apache Jena 3.12.0 2019-06-01 ANN Apache Jena 3.12.0 with GeoSPARQL support
Apache Jena 3.11.0 2019-04-30 ANN
Apache Jena 3.10.0 2018-12-30 ANN
Apache Jena 3.9.0 2018-10-08 ANN
Apache Jena 3.8.0 2018-07-02 ANN
Apache Jena 3.7.0 2018-04-14 ANN
Apache Jena 3.6.0 2017-12-17 ANN
Apache Jena 3.5.0 2017-11-02 ANN
Apache Jena 3.4.0 2017-07-21 ANN
Apache Jena 3.3.0 2017-05-21 ANN
Apache Jena 3.2.0 2017-02-10 ANN
Apache Jena 3.1.0 2016-05-14 ANN: Apache Jena 3.1.0 released with Fuseki 2.4.0
Apache Jena 3.0.0 2015-07-29 ANN
Apache Jena 2.13.0 2015-03-13 ANN: Jena 2.13.0, Elephas and Fuseki2
Apache Jena 2.12.1 2014-10-09 ANN
Apache Jena 2.12.0 2014-08-07 ANN: Jena 2.12.0 released (Java7 is a requirement)
Apache Jena 2.11.1 2014-01-24 ANN
Apache Jena 2.11.0 2013-09-18 ANN Apache Jena 2.11.0 including Fuseki 1.0.0
Apache Jena 2.10.1 2013-05-15 ANN
Apache Jena 2.10.0 2013-02-25 ANN
Apache Jena 2.7.4 2012-10-25 ANN: Apache Jena 2.7.4 / Apache Jena Fuseki 0.2.5
Apache Jena 2.7.3 2012-08-08 ANN: Apache Jena 2.7.3 and Jena Fuseki 0.2.4 Release
Apache Jena 2.7.2 2012-07-02 ANN: Apache Jena 2.7.2 and Jena Fuseki 0.2.3 releases
Apache Jena 2.7.1 2012-06-18 ANN: Apache Jena 2.7.1 and Jena Fuseki 0.2.2 releases
Apache Jena 2.7.0 2011-12-23 ANN ANN YHOO
Jena 2.6.3 2010-06-01 ANN
Jena 2.6.2 2009-10-16 ANN: Jena 2.6.2 and ARQ 2.8.1
Jena 2.6.0 2009-05-18 ANN: Jena 2.6.0 and ARQ 2.7.0
Jena 2.5.7 2008-12-08 ANN: Jena 2.5.7 and ARQ 2.6.0
Jena 2.5.6 2008-06-12 ANN: Jena 2.5.6, ARQ 2.3 and SDB 1.1
Jena 2.5.5 2008-02-18 ANN: Jena 2.5.5 and ARQ 2.2
Jena 2.5.4 2007-09-19 ANN: Jena 2.5.4 & ARQ 2.1
Jena 2.5.3 2007-06-18 ANN
Jena 2.5.2 2007-02-07 ANN
Jena 2.5.1 2007-01-24 ANN
Jena 2.5 2007-01-18 ANN
Jena 2.4 2006-05-04 ANN
Jena 2.3 2005-10-12 ANN
Jena 2.2 2005-04-19 ANN
Jena 2.1 2004-02-10 ANN
Jena 2.0 2003-08-28 ANN
Jena 1.4.0 2002-05-04 ANN: The Jena tutorial and Jena 1.4.0
Jena 1.3.2 2002-03-13 ANN
Jena 1.3.1 2002-03-11 ANN
Jena 1.3.0 2002-01-15 ANN: Jena 1.3.0 released with persistent storage in relational databases
Jena 1.2.0 2001-10-01 ANN: Jena 1.2 released with ARP, DAML+OIL, RDQL and persistent store
Jena 1.1.0 2001-07-06 ANN
Jena 1.0 2001-01-09 first release of Jena 1.0
Jena Announcement 2000-08-28 Brian McBride's first Jena announcement


RDQL was first released in Jena 1.2.0 and is an implementation of the SquishQL RDF query language, which itself derives from rdfDB.

2001-09-17 ANN by Andy Seaborne


ARQ 2.0 2007-05-07 ANN: ARQ 2.0
ARQ 0.9.7 2005-09-30 ANN: ARQ 0.9.7 -- SPARQL for Jena
ARQ 0.9.3 2005-02-21 ANN: ARQ v0.9.3 : SPARQL for Jena
ARQ 0.9.0 2004-10-08 ANN: ARQ v0.9.0 : SPARQL for Jena


RDF server for Jena

Joseki 3.4.4 2011-05-06 ANN

Joseki first beta 2002-03-22 ANN



Fuseki 0.2.0 2011-04-21 ANN

Fuseki 0.1.1 2010-10-12 ANN


2007-02-07 ANN: SDB alpha 1
2007-05-31 ANN: SDB alpha 2


2002-01-15 The first release of the RDB module for persistent storage of Jena models in relational databases was announced with the Jena 1.3.0 release.


2008-08-07 [jena-dev] ANN: TDB-0.5


2015-06-08 [jena-dev] TDB2

Former and Current Project Members/Committers/Contributors/Community

Jena 1

Jeremy J. Carroll HP
Ian Dickinson
Chris Dollin
Brian McBride HP
Dave Reynolds HP
Kevin Wilkinson

Jena 2 & 3

Aaron Coburn (acoburn)
Adam Soroka (ajs6f)
Andy Seaborne andy_seaborne (andy) - project chair and Apache Foundation Vice-President
Benson Margulies
Brian McBride bwm
Bruno Kinoshita (kinow)
Chris Dollin chris-dollin
Chris Tomlinson (codeferret)
Claude Warren (claude)
Damian Steer pldms (damian)
Dave Johnson
Dave Reynolds der
Ian Dickinson ian_dickinson (ijd)
Jena Team jenateam
Jeremy Carroll jeremy_carroll
Kevin Wilkinson wkw
Leo Simons
Lorenz Buehmann (lbuehmann)
Marco Neumann neumarcx
Markus Stocker markus_stocker
Osma Suominen (osma)
Paolo Castagna castagna
Peter Coetzee major_error
Reto Bachmann-Gmuer rebach
Richard Cyganiak cyganiak
Rob Vesse (rvesse)
Ross Gardler
Stephen Allen (sallen)
Steve Battle steve_battle
Stuart Taylor staylorabdn
Stuart Williams skw
Taylor Cowan
Ying Jiang (jpz6311whu)

Current Project data maintainer(s) Apache

Andy Seaborne
Ian Dickinson
Dave Reynolds
Stephen Allen
Chris Dollin
Damian Steer
Paolo Castagna
Rob Vesse
Claude Warren
Bruno P. Kinoshita
Osma Suominen
Adam Soroka
Lorenz Buehmann

External links



[1] B. McBride, "Jena: a semantic Web toolkit," in IEEE Internet Computing, vol. 6, no. 6, pp. 55-59, Nov.-Dec. 2002.

[2] Seaborne, Andy: An RDF NetAPI ISWC 2002. Sardinia, Italy, 2002

[3] Brian McBride (2002) Four Steps Towards the Widespread Adoption of a Semantic Web. In: Horrocks I., Hendler J. (eds) The Semantic Web — ISWC 2002. ISWC 2002. Lecture Notes in Computer Science, vol 2342. Springer, Berlin, Heidelberg

[4] Libby Miller, Andy Seaborne, Alberto Reggiori (2002) Three Implementations of SquishQL, a Simple RDF Query Language. https://www.hpl.hp.com/techreports/2002/HPL-2002-110.pdf

[5] Peter Mikhalenko (2006) SquishQL: The simplest RDF query language. https://www.techrepublic.com/article/squishql-the-simplest-rdf-query-language/

[6] Peter Haase, Jeen Broekstra, Andreas Eberhart, Raphael Volz (2004) A comparison of RDF query languages. In Proceedings of the Third International Semantic Web Conference, Hiroshima, Japan, November 2004 https://www.aifb.kit.edu/images/b/b9/2004_522_Haase_A_comparison_of_1.pdf