SHACL for the Practitioner: Transcript
Veronika Heimsbakk, 11/24/2025
Larry Swanson:
Thanks, Marco. I'm super delighted to be here. I just saw Veronika in London last week at Connected Data London, and the highlight of the event for me was, well, have you ever had these Norwegian KitKat bars? The dirty little secret of my editorial commitment to this book is that I would have done it just for that, but I also got a nice hand-inscribed copy of the book. It was a really good time in London. I've been in the web world way too long. In fact, when Marco and I were talking a little before this event, Veronika said, "It's fun to listen to old people talk about these things." So I've been doing this a long time, and I really appreciate people's contributions to it. I've been doing it professionally, mostly in content architecture and content modeling, but I came into the semantics world more properly about five years ago. In that time I've met a lot of people, and Veronika is, first of all, the most delightful person. I was really happy to see that she was writing a book, and one thing led to another, and my background as an editor turned out to be useful, so I helped her. And I have to say, as a longtime editor of many years, this was one of the lightest edits I ever had to do. It was really just Americanizing and tidying up some English. Veronika is natively an outstanding and amazing communicator, and I just added a little sprinkle of style on top. But the book itself is awesome. It's very accessible: newcomers can get something out of it and get oriented to SHACL right away, but it also works as a reference and an onboarding for experienced practitioners and developers.
It's also a good blend of that introductory material, reference material, and several case studies at the end of the book that really show the benefits of the techniques. It was a real pleasure to work on, and I'm just so happy to get to introduce Veronika, which I will do right now.
Veronika Heimsbakk:
Thank you. Thank you so much for the kind words, Larry, and also Marco. You say the edit was easy, Larry; well, I was quite surprised that most of the edits were on commas, really, given my verbose Norwegian kind of English. But I'm happy that it's finally done and that it's a physical thing out there in the world. So thank you for having me at a Lotico session. I will share some slides with you on how the book came to be and also a bit of what you can expect. So let me see, now you should see my screen. Can anybody confirm, please?
Larry Swanson:
Yes, you're sharing your screen. Yes.
Veronika Heimsbakk:
Thank you, Larry. Okay. So hi everyone. I'll start with a couple of words about myself. My name is Veronika Heimsbakk. I'm based right outside of Oslo in Norway, and I've been in the knowledge graph and semantic tech environment for the past, well, it's been a decade now at least. I finished my degree at the University of Oslo in 2015, where I took a very traditional computer science degree. My core tools of the trade used to be C and Java, but I also fell in love with logic during my studies, and I took every single course on logic that I could find at the Department of Informatics. Unfortunately, I didn't discover the courses on logic at the Department of Philosophy, but I took a lot from the mathematical and computer science perspective. In the very last semester of my degree, I wondered: what in the world am I going to use all this logic for? I can't write proofs and Turing machines for the rest of my life. But luckily I got a phone call from a headhunter asking if I knew any semantic technologies, and I said not that much, but I knew there was a course at the university hosted by Martin Giese, so I signed up and took it, and I saw, wow, I could use all this logic for something practical. And I got the job the headhunter was calling about. I started two days after my last exam, as a Java developer in a consultancy firm in Oslo, working on projects within semantic technologies. And on one of the first projects, I developed my first SHACL engine together with a former colleague of mine.
That was before SHACL even became standardized, so its vocabulary looked a bit different from what it does today. But it was a great experience, and ever since then I've been working with SHACL in projects in Norway and abroad as a consultant, for the past ten years. The last seven of those were at Capgemini, which I left in September 2025 to join a scaleup company called Data Treehouse. At Data Treehouse we develop an open-source framework for RDF construction in Python, and we also develop a SHACL engine which has recently been benchmarked as one of the world's fastest SHACL engines, which we are very proud of. It suits my interests quite well to be a part of that journey.
So, enough about me. Let's talk a little bit about the book. I will begin this presentation with a little bit of history on how it started, then dedicate a few slides to the people who joined in as contributors and helped me along the process, and then I will go through the different parts of the book and what you can expect to find in it. I know there are a few contributors on the call, so if they would like to say a few words about their contribution when we come to that place in the presentation, feel free to do so.
Okay. So everything started in 2022, after I gave my SHACL master class at the Knowledge Graph Conference in New York. I was asked, "What is the name of your book?" Well, I hadn't written any book, but I was thinking about the idea, the abstract concept, of writing one, and thought it might not be a bad thing to try. So I talked with some friends in the community about it, keeping in mind that I've been working with SHACL for so long and everything I talk about at conferences is basically SHACL. So it wasn't really a bad idea to start writing a book. And this book has been written as a side project to everything else I do in my life. I'm a single mom every other week for two small kids, I have a full-time job, and I have tons of different things to do in my spare time. So, you know those moments when you're at a pub waiting for somebody, and you show up a bit early on purpose to have a pint before they arrive? In those moments, this book came to life. For two years it was those pub-scribbling moments, and at some point I realized I needed help finalizing it. I needed more stories than just my own. I contributed four stories to the book, but I was completely dependent on showing the breadth of SHACL's capabilities. So I reached out to the community, got a few responses, and ended up with a really nice group of people contributing stories. This presentation will also spend a bit of time on those different stories. And then there were a couple of weeks where I decided to take time off work just to isolate myself far up in the Norwegian mountains, without any way of commuting anywhere, far, far off grid. Luckily I had internet, of course, but that was about it, so I could just write and write and write and write. Oh my, that was very effective. And then we had a couple of rounds of proofreading, of course, the editing, the final layout, and print.
The book is printed locally in Oslo, and it is written in LaTeX, using TikZ for the illustrations. Much of it is based on my master classes and guest lectures on the topic, with a more practical approach to it all.
I also have to mention that the work is dedicated to my late friend and mentor Roger Antonsen, who was an associate professor at the University of Oslo. He taught the introductory courses on logic, and he was perhaps the best communicator I have ever known. So, for the love of logic and the art of communication, this book is dedicated to my friend Roger. And thank you so much to the proofreaders. I had proofreaders among the contributors, Ghislain and Thomas and Holger, and a group of friends from Norway, both in the public sector and in the consultancy business: Filip, Trine, Birgitte, and Tia. Jose from the University of Vo also contributed with proofreading. As for the guest authors, we will see more of them later, but: the team at the European Union Agency for Railways contributed their SHACL stories; Ashley Caselli contributed a SHACL story that also covers SHACL rules, which is very interesting; Benjamin from FörderFunke contributed a story; Thomas contributed two stories; and Holger contributed a story on how to use SHACL to generate and drive user interfaces. We have already heard from my editor, Larry. The foreword is written by Holger, who is also the co-inventor of SHACL, so I couldn't ask for a better person to write it. The other acknowledgements mentioned in the book are listed here as well.
By the way, if there are any questions while I'm doing this presentation, feel free to add them in the chat. I have the chat on my second screen, so I'll have them available. Okay. The book is divided into three parts: the first part is back to basics, the second part is about the core of SHACL, and the last part is SHACL stories. For the rest of the presentation I will give you a quick intro to what to expect in the different parts of the book. Part one, back to basics, starts with the wonder of logic. In very simple terms, I explain common terminology for knowledge representation and knowledge graphs: the semantics, the meaning of things, of course, and a definition of knowledge graphs. I talk about ontologies, and about abstract concepts as ideas of a thing with their terms and definitions, because as a sender of information it's important to think about how the recipient interprets that information. And of course I cover terms such as class, concept, and individual, and I also have a section on visual thinking through set theory.
The other part of part one is the building blocks of RDF. After all, SHACL is another standard in the knowledge graph stack of semantic technologies, and it's important to have a little background in RDF. Hopefully this is sufficient for readers who aren't too familiar with RDF to be able to understand and read the SHACL examples in part two of the book. When explaining RDF and its building blocks, I also use a visual notation: every URI is a circle, every predicate is an arrow between things, an edge, and every literal value is in a square box. A concrete example: the Hobbit, with an author, Tolkien, and the main title "The Hobbit, or There and Back Again". The main story of my examples in the book revolves around creative works such as books, with their affiliated roles for persons and other kinds of metadata connected to the creative works.
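As a rough sketch of the triples behind that example in Turtle (the prefix and property names here are illustrative, not necessarily the ones used in the book):

```turtle
@prefix ex: <http://example.org/> .

# Two circles joined by an arrow (URI to URI),
# plus an arrow to a square box (URI to literal).
ex:TheHobbit
    ex:author    ex:Tolkien ;
    ex:mainTitle "The Hobbit, or There and Back Again" .
```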
Right now I have a cat here 🐈 that is trying to eat my HDMI cable. Hopefully she will leave it be so the presentation doesn't cut out halfway. Sorry about that. Okay. So, part two is about the core of SHACL, and I introduce this part with a perspective from the Web Ontology Language. I bet there are several readers who know the Web Ontology Language very well and have been using it for years and years. But for those who are new to the whole stack of semantic technologies, or haven't used OWL to such an extent, this particular chapter is very safe to skip. I include it for perspective because, especially over the past year or two, there's been a lot of discussion, particularly on LinkedIn, about OWL versus SHACL and how these fit together as pieces of a knowledge graph. So I try to explain in a very informal manner what the heck inference is, what validation is, and the differences and similarities between OWL and SHACL, and I also sketch very quickly how the two work together in a data pipeline. SHACL is not here to replace OWL. They are two different standards with two different design purposes, and that's what I'm trying to highlight in this particular chapter. But for those who are not familiar with OWL, as I said, it's completely safe to skip this chapter, and I highlight that in the text as well. And then it's on to the core of SHACL. I have quite a systematic setup for every single core constraint in the SHACL vocabulary, including a visualization and examples of conforming and nonconforming data for each constraint. I'm covering targets, all kinds of targets except SPARQL-based targets, and also property paths. I also have a chapter on SHACL-SPARQL, with examples on both the node and the property level.
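To give a flavor of that per-constraint setup, here is a minimal sketch in Turtle (the names are illustrative, not taken from the book): a node shape requiring every book to have exactly one title, together with one conforming and one nonconforming node.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/> .

ex:BookShape a sh:NodeShape ;
    sh:targetClass ex:Book ;
    sh:property [
        sh:path ex:mainTitle ;
        sh:minCount 1 ;          # at least one title ...
        sh:maxCount 1 ;          # ... and at most one
        sh:datatype xsd:string ;
    ] .

# Conforming data: exactly one string-valued title.
ex:TheHobbit a ex:Book ;
    ex:mainTitle "The Hobbit, or There and Back Again" .

# Nonconforming data: no title at all, violating sh:minCount 1.
ex:MysteryBook a ex:Book .
```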
I do not cover the SHACL Advanced Features, because on purpose I've chosen to include only the things that are in the current W3C recommendation. So, who knows? Maybe a second edition will turn up when SHACL 1.2 comes along.
Yeah, just for reference. And then we have the SHACL stories. The first SHACL story is from the Norwegian Maritime Authority. This is a project I've been involved with for several years, helping with the thought work and implementation, building up an automated pipeline that takes regulations to RDF with SHACL as the descriptive language for regulatory requirements. This is the pipeline; I won't go too much into detail on it. I've also written a paper on this particular topic, which is on the web, so you can read it if you like. Otherwise, there is a chapter on this regulatory simulator in the book. What we do is represent every single regulatory requirement as a node shape, and regulatory requirements are connected to zero or more so-called scopes. In Norwegian this is called virkeområde; the most proper translation we found was "legal scope", or just "scope". For the maritime regulations, a scope could be, for example, that a requirement holds for every single fishing boat between 8 and 15 meters. So that requirement has two scopes connected to it: fishing boats, or fishing vessels, and 8 to 15 meters.
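A rough sketch of that pattern in Turtle (the vocabulary here is invented for illustration, not the NMA's actual model): one requirement as a node shape, with its scopes attached as descriptive metadata and its targets resolved from those scopes upstream in the pipeline.

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.org/nma/> .

# One regulatory requirement as a node shape.
# ex:hasScope, ex:FishingVessel, and ex:Length8to15m are hypothetical names.
ex:LifeRaftRequirement a sh:NodeShape ;
    ex:hasScope ex:FishingVessel , ex:Length8to15m ;  # descriptive: where the rule applies
    sh:targetClass ex:FishingVessel8to15m ;           # targets resolved from the scopes
    sh:property [
        sh:path ex:hasLifeRaft ;   # the requirement itself: at least one life raft
        sh:minCount 1 ;
    ] .
```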
Every single requirement is versioned, of course, because boats built in the '90s have different sets of requirements than boats built today. I also have a discussion about how I think the closed-world assumption is more applicable to the domain of law and legislation than the open-world assumption of OWL. The second story in the book, and now the cat is on me again, is about certifications and the issuing of personal certifications, also for the client, the Norwegian Maritime Authority. This time we use the regulatory requirements in SHACL not only for descriptive purposes but also for validation. We have a regulation that describes the requirements for personal certificates, for example sea captain or deck officer, and we have taken that regulation and made a detailed description in SHACL that reflects it. In the data pipeline, when a person applies for a certificate at the NMA, we harvest all the information we can get about that person, including their own input, to create a 360 profile in RDF. Then we compare it with the shapes in the shapes graph that reflect the requirements in the regulation, and the SHACL validation engine gives us an output: either the profile conforms, or you get a detailed result set back listing what is missing in order to achieve the certificate. This particular pipeline has reduced case-working time at the NMA from several weeks to two seconds, which is very nice. But you can read more about this in the book, of course.
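As a hedged sketch of that certificate check (all property names here are invented, nothing from the real NMA model): the shape encodes what the regulation demands, and validating the person's 360 profile either conforms or reports exactly what is missing.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/cert/> .

# Hypothetical requirements for a deck officer certificate.
ex:DeckOfficerShape a sh:NodeShape ;
    sh:targetClass ex:DeckOfficerApplicant ;
    sh:property [
        sh:path ex:seaServiceMonths ;
        sh:minInclusive 12 ;              # e.g. at least 12 months of sea service
        sh:datatype xsd:integer ;
    ] ;
    sh:property [
        sh:path ex:medicalCertificate ;
        sh:minCount 1 ;                   # a medical certificate is mandatory
    ] .

# A 360 profile missing the medical certificate:
# the validation report will list exactly that gap.
ex:applicant42 a ex:DeckOfficerApplicant ;
    ex:seaServiceMonths 18 .
```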
Okay, and then we have Thomas contributing a story from the European Parliament Open Data Portal, where they harmonize data from different sources and use SHACL not only for input data but also to harmonize different application profiles across different domains. It's also about how you can author and tailor SHACL shapes using a spreadsheet tool that Sparnatural has developed. So that's a very nice story about a real-life case that you can browse today. I've also included links to the different use cases described here, so you can investigate on your own later. I'll get Marco to share the slides after the session as well.
And then we have the European Union Agency for Railways, ERA. Luckily, I've been diving a bit into their shapes on another occasion, beyond them contributing a story to my book: they invited me to their Rail Data Forum this year to hold a SHACL master class for them. For that, I had to investigate their shapes a bit in order to create good exercises to go with the master class. And what I found is that their shapes are full of SHACL-SPARQL, quite hairy SHACL-SPARQL from time to time, and a bit redundant from time to time. These shapes are hard to digest for a SHACL engine, and I can imagine they are hard to maintain. So this story is especially interesting due to the number of shapes and their complexity, while also being intended as a reusable layer: they want their shapes to be reused across different infrastructure managers in Europe to improve interoperability, in the ontologies and in the SHACL shapes.
Then I have a teeny tiny story from a client I can't name, in the automotive sector, where we used SHACL as a kind of master shapes: we do not validate the content of a data graph but rather the structure of an ontology. In this case we had teams of many, many people writing ontologies and committing RDF to our repositories. So as part of the commit phase for triples, we needed to make sure that the structure of the ontologies was as described in the ontology playbook, for example that every single class should have exactly one rdfs:label in English.
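That playbook rule can be sketched as a SHACL shape over the ontology itself (a minimal sketch, assuming the rule is "exactly one English rdfs:label per class"; the shape name is invented):

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.org/playbook/> .

# Validate the ontology, not the instance data:
# every owl:Class must carry exactly one English-language label.
ex:ClassLabelShape a sh:NodeShape ;
    sh:targetClass owl:Class ;
    sh:property [
        sh:path rdfs:label ;
        sh:qualifiedValueShape [ sh:languageIn ( "en" ) ] ;
        sh:qualifiedMinCount 1 ;
        sh:qualifiedMaxCount 1 ;
    ] .
```

Because the classes themselves are instances of owl:Class, sh:targetClass here makes the ontology's own terms the focus nodes, which is what makes this "master shapes" trick work.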
I feel like I'm going really fast now, but I hope that's okay, and there will be an open discussion at the end of the presentation. The next story from Thomas is about Nakala, a repository of French research data in the social sciences and humanities. Here the Sparnatural tool stack is leveraged to drive a query UI on top of the harmonized research data in the Nakala platform. This story also contains a lot of description of how that visual query UI works; it's a very nice, useful interface, which is also open-sourced. So feel free to have a look at that; it's a very nice example of SHACL in use. And then we have Ashley with urban subsurface data. I see that Ashley is on the call, so if you want to take the mic, Ashley, feel free. If not, I can say a few words about this case. This is a project that uses SHACL to improve the quality and regulatory compliance of urban subsurface data sets integrated from various sources. On top of this, they have also used SHACL inference rules to fill in and enrich the data with missing information, and they have developed a framework that extends SHACL, called SHACL-X. I see that I forgot to add the link to SHACL-X on this slide, but I can fix that before I send it to Marco. And then there is the last story from my side, from the Norwegian Digitalization Agency back in 2016. It's more of a fun-to-have story than anything else, but it describes SHACL as a schema for existing standards. NOARK 4 and NOARK 5 are the Norwegian archive standards for journal posts, and those two are quite different. So we created a schema in SHACL to support both NOARK 4 and NOARK 5, trying to harmonize the two standards using SHACL. And the SHACL engine that we developed in this particular project was actually the starting point that led to the SHACL engine in RDF4J.
The author of the SHACL engine in RDF4J was my colleague on this project, and he was also the one who introduced me to SHACL back in the day.
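The SHACL inference rules mentioned in Ashley's story come from the SHACL Advanced Features note, which SHACL-X builds on. As a hedged sketch (the class and predicate names are hypothetical, not from the actual project), a sh:TripleRule adds a derived triple to the data graph for every target node:

```turtle
@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix ex: <http://example.org/> .

# Enrichment via a rule rather than a constraint:
# every underground pipe is inferred to be a subsurface asset.
ex:PipeShape a sh:NodeShape ;
    sh:targetClass ex:UndergroundPipe ;
    sh:rule [
        a sh:TripleRule ;
        sh:subject sh:this ;                 # the focus node itself
        sh:predicate ex:isSubsurfaceAsset ;  # hypothetical predicate
        sh:object true ;
    ] .
```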
And then we have the second-to-last story. I saw that Benjamin was on the call earlier, but it doesn't seem he is anymore. This one is about matching users to state benefits in Germany, because finding out which state benefits you might be entitled to can be quite confusing, and it's messy to gather all the information you need to explore the landscape. So Benjamin and his team at FörderFunke have created a service based on SHACL that matches your user data with the different kinds of state benefits.
And the last story is from Holger, on TopBraid EDG. I bet several of you have heard about TopBraid EDG before. It supports SHACL Core in its tooling interface for ontology and SHACL shape generation, but behind the scenes TopBraid EDG also uses a lot of SHACL to power the user interface. This story also describes advanced features, with inferencing and also GraphQL on top of your knowledge graph. It's not supposed to be an advertisement for TopBraid EDG; it's rather a story about how you can leverage SHACL as a driving force in creating user interfaces and user experiences on top of a knowledge graph. So hopefully it will serve as inspiration for other vendors and companies that develop tooling for RDF.
And then we have come as far as the appendix. The appendix isn't very large, but it contains an example comparing SHACL and OWL, because during my work on this book I was approached a lot about the differences between SHACL and OWL on a concrete example: how would it look in OWL, and how would it look in SHACL, even though they serve different purposes and design patterns. I chose an example about classifying something: classifying a star as a G-type star, where the definition of a G-type star is a star with a temperature between 5300 and 6000 Kelvin. I've taken this statement and rendered it in natural language, which is what you see here, in Description Logic notation, in Manchester Syntax, in OWL as RDF Turtle, and in SHACL as RDF Turtle. I chose to throw in the Manchester Syntax because, if you're familiar with OWL in RDF Turtle, you know it can get pretty messy pretty fast. I also have a discussion around this particular statement: how you can resolve it with inference, how you can resolve it with validation, and the differences between those two operations on the knowledge base. Here you see a snippet of the section on SHACL implementations. I was a bit unsure whether or not to add such a section, but I chose to include it because so many people were asking about it. It's a teeny tiny snippet, more about frameworks than other kinds of tooling, in this particular section of the book, or rather this appendix.
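As a hedged sketch of the SHACL side of that comparison (the property names are invented, and the book's actual example may differ): a shape that flags any star asserted to be G-type whose temperature falls outside the 5300 to 6000 Kelvin range.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex:  <http://example.org/astro/> .

# Validation view: anything asserted to be a G-type star
# must have a temperature within the G-type range.
ex:GTypeStarShape a sh:NodeShape ;
    sh:targetClass ex:GTypeStar ;
    sh:property [
        sh:path ex:temperatureKelvin ;
        sh:minInclusive 5300 ;
        sh:maxInclusive 6000 ;
        sh:datatype xsd:integer ;
    ] .
```

The OWL version of the same statement would instead let a reasoner infer that any star in that temperature range is a G-type star; the shape above only checks stars that are already asserted to be G-type, which is exactly the inference-versus-validation contrast the appendix discusses.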
So now, Marco, we are approaching the end. If you want to get a copy, you can do so here. I recently published information about the ebook. I have written this book in LaTeX, which I love; I've written my own book class file for LaTeX for this, and all my illustrations are in TikZ, something the EPUB converters absolutely hate, because EPUB is basically HTML, of course, and the TikZ material is vector graphics compiled in the LaTeX environment. So if anyone on the call has any experience converting LaTeX to EPUB, please let me know. Otherwise, the PDF is now available, and more information is on my website. And with that, I want to say thank you for listening in.
Marco Neumann: Well, thank you so much, Veronika. Well done. It's a huge task, you know, writing a book. I would like to open up the floor quickly to some questions. Maybe Larry, if you see some questions in the chat.