Search engines rely on structured data to understand what your pages are about, which can improve how your site appears in search results. JSON-LD, which stands for JSON for Linked Data, is a format that allows you to embed machine-readable information into your web pages.
How JSON-LD is used for SEO
JSON-LD is useful for search engine optimization (SEO) because it gives search engines explicit information about the content on your web pages. Instead of leaving algorithms to infer meaning from raw text and HTML, you can state your content's meaning, creator, offerings, and relationships with other online resources directly.
Schema.org provides the basis for this structured information by serving as a universal dictionary of types and their associated properties. By using Schema.org types, you ensure that search engines like Google can understand the information you provide.
Why Structured Data is Important for SEO
- Structured data makes your pages eligible for rich snippets in search results, such as star ratings, event times, and recipe instructions. These elements capture users' attention and invite them to click on your site, improving your click-through rate.
- It also tells Google and other search engines what your page content means. The search engine can better tell whether your link points to a product with a price and reviews or to a blog post by an author.
- Structured data can unlock special search features and improve your odds of appearing in knowledge panels, carousels, FAQ boxes, and voice search results.
- All these features provide a better user experience: when people can preview key information directly in search results, they are more likely to click.
- Your site stays ahead of the curve as search engines get more semantic and context-aware. Structured data helps future-proof your site by aligning your content with a format that is easier for search engines to parse.
Adding JSON-LD metadata to your site
JSON-LD metadata is added to the <head> section of your HTML document through a script tag like this:
<script type="application/ld+json">
<!-- Your structured data goes here -->
</script>
Having the JSON-LD data in the <head> section makes your HTML cleaner and easier to read for both humans and crawlers.
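As a rough illustration of what ends up inside that tag, the sketch below builds a small Schema.org object with Python's standard json module and wraps it in the script element. The render_json_ld helper and the sample values are hypothetical and only meant to show the shape of the output.

```python
import json


def render_json_ld(data: dict) -> str:
    """Wrap a JSON-LD dictionary in a script element of type application/ld+json."""
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical example data using the Schema.org "Organization" type.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co",
    "url": "https://www.example.com/",
}

snippet = render_json_ld(organization)
print(snippet)
```

The "@context" key tells consumers which vocabulary the keys come from, and "@type" names the Schema.org type being described.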
JSON-LD directly in the Django template
At Revsys, our first attempt at adding JSON-LD to our sites relied on embedding the data in the Django template. For the most part, this has worked fine and we've had good results from an SEO perspective. But in terms of maintainability, it has not been the most efficient approach. We have now started transitioning to generating the structured data using Django Ninja and Pydantic. As a result, we now have cleaner templates and better maintainability.
The code below illustrates how we used to embed our JSON-LD for our blog post page, with the data directly in the template:
{% block extra_head %}
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "BreadcrumbList",
"itemListElement": [
{
"@type": "ListItem",
"position": 1,
"name": "Home",
"item": "https://www.revsys.com/"
},
{
"@type": "ListItem",
"position": 2,
"name": "Blog",
"item": "https://www.revsys.com/blog/"
},
{
"@type": "ListItem",
"position": 3,
"name": "{{ self.title|escapejs }}",
"item": "https://www.revsys.com{{ request.path }}"
}
]
}
</script>
<script type="application/ld+json">
{
"@context": "https://schema.org",
"@type": "BlogPosting",
"headline": "{{ self.title|escapejs }}",
"description": "{{ self.get_description|escapejs }}",
"datePublished": "{{ self.first_published_at|date:'c' }}",
"dateModified": "{{ self.latest_revision_created_at|date:'c' }}",
"author": {
"@type": "Person",
"name": "{{ self.get_author_name|escapejs }}"
{% if self.author.specific.url %},"url": "{{ self.author.specific.url }}"{% elif self.author.specific.slug %},"url": "https://www.revsys.com{% routablepageurl self.get_parent.specific 'posts_by_author' self.author.specific.slug %}"{% endif %}
},
"publisher": {
"@type": "Organization",
"name": "REVSYS",
"logo": {
"@type": "ImageObject",
"url": "https://www.revsys.com{% static 'images/revsys_logo_white.png' %}"
}
},
"mainEntityOfPage": {
"@type": "WebPage",
"@id": "https://www.revsys.com{{ request.path }}"
}
}
</script>
{% endblock %}
This works, but we don't love this approach because:
- Mixing JSON-LD logic with presentation logic clutters templates, making them harder to read and maintain
- There's no way to validate that your JSON-LD is properly formatted or follows schema.org standards
- It's easy to make syntax errors in the JSON that break the structured data
- Testing the JSON-LD is harder because the output requires rendering the entire template
- The same schema logic gets duplicated across different templates
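To make the syntax-error risk concrete, here is a minimal sketch using only Python's standard library. The two strings are made-up stand-ins for rendered template output; the second contains a trailing comma of the kind a template conditional can leave behind when an optional property is skipped.

```python
import json

# Hypothetical rendered output: valid JSON-LD.
good_output = '{"@context": "https://schema.org", "@type": "Person", "name": "Jane"}'

# Hypothetical rendered output with a trailing comma, e.g. when an optional
# "url" line is skipped but the comma before it is still rendered.
bad_output = '{"@context": "https://schema.org", "@type": "Person", "name": "Jane",}'


def is_valid_json(text: str) -> bool:
    """Return True if the text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(is_valid_json(good_output))  # True
print(is_valid_json(bad_output))   # False
```

The template engine renders both strings without complaint, so a mistake like this tends to surface only when something actually parses the JSON.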
Generating JSON-LD with Pydantic and Django Ninja
We refactored our schema generation using Django Ninja and Pydantic. Now, instead of embedding logic in templates, we generate structured data server-side and pass it to templates as context variables.
This has given us several benefits:
- Our code is more modular because we keep all schema generation logic in one file per app, making the codebase more organized and easier to navigate.
- The Pydantic models we create are reusable, which is handy since many JSON-LD types use the same subtypes.
- Pydantic's type checking and validation help ensure that our structured data matches the models we defined for each Schema.org type, reducing the chance that we accidentally share invalid data with search engines.
- Our SEO is future-proofed: with our centralized approach, expanding our schema to new content types is simpler and more manageable.
By moving schema generation out of templates and into Django Ninja and Pydantic, we have created a system that is both maintainable and developer-friendly.
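As a small sketch of the validation benefit, assuming Pydantic v2, the example below uses a trimmed-down model similar to the ones we define later in schema.py. A missing required field raises an error at construction time instead of producing incomplete structured data.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class PersonSchema(BaseModel):
    type: str = Field(default="Person", alias="@type")
    name: str
    url: Optional[str] = None


# A complete object validates and can be dumped with the "@type" alias.
person = PersonSchema(name="Jane Developer")
print(person.model_dump(by_alias=True))

# Leaving out the required "name" field raises a ValidationError.
try:
    PersonSchema()
    missing_name_rejected = False
except ValidationError:
    missing_name_rejected = True

print(missing_name_rejected)  # True
```

Note that this only checks data against the fields declared on the model; it does not consult Schema.org itself, so the models still need to be written to match the types they represent.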
We created a file, schema.py, that holds the Pydantic models representing the Schema.org types we need for the data we are turning into JSON-LD:
# schema.py
from typing import Any, List, Optional

from ninja import ModelSchema
from pydantic import BaseModel, ConfigDict, Field, SerializeAsAny

class PersonSchema(BaseModel):
    type: str = Field(default="Person", alias="@type")
    name: str
    url: Optional[str] = None

# Defined before OrganizationSchema so no forward reference is needed.
class ImageObjectSchema(BaseModel):
    type: str = Field(default="ImageObject", alias="@type")
    url: str

class OrganizationSchema(BaseModel):
    type: str = Field(default="Organization", alias="@type")
    name: str
    logo: ImageObjectSchema

class WebPageSchema(BaseModel):
    # populate_by_name lets us pass id=... instead of the "@id" alias.
    model_config = ConfigDict(populate_by_name=True)
    type: str = Field(default="WebPage", alias="@type")
    id: str = Field(alias="@id")

class ListItemSchema(BaseModel):
    type: str = Field(default="ListItem", alias="@type")
    position: int
    name: str
    item: str

class BreadcrumbListSchema(BaseModel):
    context: str = Field(default="https://schema.org", alias="@context")
    type: str = Field(default="BreadcrumbList", alias="@type")
    itemListElement: List[ListItemSchema]

class BlogPostingSchema(BaseModel):
    context: Optional[str] = Field(default="https://schema.org", alias="@context")
    type: str = Field(default="BlogPosting", alias="@type")
    headline: str
    description: Optional[str] = None
    datePublished: str
    dateModified: Optional[str] = None
    author: PersonSchema
    publisher: Optional[OrganizationSchema] = None
    mainEntityOfPage: Optional[WebPageSchema] = None
    url: Optional[str] = None
    # SerializeAsAny keeps the fields of whatever BaseModel subclass is
    # assigned here; a plain BaseModel annotation would serialize as {}.
    blogPost: Optional[SerializeAsAny[BaseModel]] = None
Using Django Ninja's ModelSchema for automatic schema generation
One of the features of Django Ninja is ModelSchema, which automatically generates Pydantic schemas from your Django models. This is useful when you want to include model data in your JSON-LD without manually defining every field.
In our blog post implementation, we can use ModelSchema to automatically include blog page data alongside our structured schema:
# schema.py
from ninja import ModelSchema

import blog.models

class BlogPageSchema(ModelSchema):
    # Older Config/model_fields style; recent Django Ninja releases
    # document `class Meta` with `fields` instead.
    class Config:
        model = blog.models.BlogPage
        model_fields = [
            "title",
            "subtitle",
            "first_published_at",
            "category",
            "main_url",
            "main_url_text",
            "featured",
            "slug",
        ]
Then we integrate this ModelSchema into our blog posting schema generation:
def get_post_schema(post) -> str:
    author = PersonSchema(
        name=post.get_author_name() or "REVSYS"
    )
    # ... author URL logic ...
    # ... publisher and main_entity are built here (omitted in this excerpt) ...
    schema = BlogPostingSchema(
        headline=post.title,
        description=post.get_description(),
        datePublished=post.first_published_at.isoformat(),
        author=author,
        publisher=publisher,
        mainEntityOfPage=main_entity,
        # Include the model data as additional structured information
        blogPost=BlogPageSchema.from_orm(post)
    )
    return schema.model_dump_json(by_alias=True, indent=2)
This approach gives you the flexibility of hand-written JSON-LD schemas and the convenience of automatically generated model schemas.
We then create helper functions in schema.py to generate JSON-LD from our Pydantic models:
# schema.py
def get_breadcrumb_schema(name: str, path: str, post_title: Optional[str] = None) -> str:
    """Generate JSON-LD breadcrumb schema for navigation structure."""
    items = [
        ListItemSchema(
            position=1,
            name="Home",
            item="https://www.revsys.com/"
        ),
        ListItemSchema(
            position=2,
            name=name,
            # For a post, the second crumb is the blog index, not the post itself.
            item=f"https://www.revsys.com{path if not post_title else '/blog/'}"
        )
    ]
    if post_title:
        items.append(ListItemSchema(
            position=3,
            name=post_title,
            item=f"https://www.revsys.com{path}"
        ))
    schema = BreadcrumbListSchema(itemListElement=items)
    return schema.model_dump_json(by_alias=True, indent=2)

def get_post_schema(post: Any) -> str:
    """Generate JSON-LD schema for a blog post using Schema.org BlogPosting type."""
    author = PersonSchema(
        name=post.get_author_name() or "REVSYS"
    )
    # Prefer an explicit author URL; fall back to the author's posts page.
    if post.author and hasattr(post.author, 'specific'):
        author_specific = post.author.specific
        if hasattr(author_specific, 'url') and author_specific.url:
            author.url = author_specific.url
        elif hasattr(author_specific, 'slug') and author_specific.slug:
            parent_page = post.get_parent()
            if parent_page:
                author.url = (
                    f"https://www.revsys.com{parent_page.url}"
                    f"author/{author_specific.slug}/"
                )
    publisher = OrganizationSchema(
        name="REVSYS",
        logo=ImageObjectSchema(
            url="https://www.revsys.com/static/images/2017/revsys_logo_white.png"
        )
    )
    main_entity = WebPageSchema(
        id=f"https://www.revsys.com{post.url}"
    )
    schema = BlogPostingSchema(
        headline=post.title,
        description=post.get_description(),
        datePublished=post.first_published_at.isoformat(),
        author=author,
        publisher=publisher,
        mainEntityOfPage=main_entity
    )
    if hasattr(post, 'latest_revision_created_at') and post.latest_revision_created_at:
        schema.dateModified = post.latest_revision_created_at.isoformat()
    return schema.model_dump_json(by_alias=True, indent=2)
The model_dump_json() method is a Pydantic feature that converts your schema objects into JSON strings. We pass two arguments: by_alias=True ensures that field aliases (like @context, @type, and @id) are used instead of the Python field names, and indent=2 formats the JSON with indentation, making it easier to read and debug.
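Here's what the output looks like for a blog post, trimmed to the populated fields (by default, Pydantic serializes unset optional fields as null; passing exclude_none=True omits them):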
{
"@context": "https://schema.org",
"@type": "BlogPosting",
"headline": "Building Better Django Apps with Pydantic",
"description": "Learn how to integrate Pydantic with Django for better validation and cleaner code.",
"datePublished": "2024-01-15T10:30:00",
"url": "https://www.revsys.com/blog/building-better-django-apps-pydantic/",
"author": {
"@type": "Person",
"name": "Jane Developer",
"url": "https://www.revsys.com/blog/author/jane-developer/"
},
"publisher": {
"@type": "Organization",
"name": "REVSYS",
"logo": {
"@type": "ImageObject",
"url": "https://www.revsys.com/static/images/2017/revsys_logo_white.png"
}
},
"mainEntityOfPage": {
"@type": "WebPage",
"@id": "https://www.revsys.com/blog/building-better-django-apps-pydantic/"
}
}
Without by_alias=True, you would get Python field names like type instead of @type, which would break the JSON-LD standard.
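The difference is easy to see on a tiny model. The sketch below assumes Pydantic v2 and uses a cut-down, hypothetical variant of WebPageSchema with an extra optional name field, purely to show how by_alias and exclude_none change the dumped keys.

```python
from typing import Optional

from pydantic import BaseModel, Field


class WebPageSchema(BaseModel):
    type: str = Field(default="WebPage", alias="@type")
    id: str = Field(alias="@id")
    name: Optional[str] = None


# populate_by_name is not enabled on this model, so the value is
# passed under its alias via dictionary unpacking.
page = WebPageSchema(**{"@id": "https://www.example.com/blog/"})

print(page.model_dump(by_alias=True))
# {'@type': 'WebPage', '@id': 'https://www.example.com/blog/', 'name': None}

print(page.model_dump())
# {'type': 'WebPage', 'id': 'https://www.example.com/blog/', 'name': None}

print(page.model_dump(by_alias=True, exclude_none=True))
# {'@type': 'WebPage', '@id': 'https://www.example.com/blog/'}
```

The same keyword arguments are accepted by model_dump_json(), so the helpers can drop empty optional properties from the JSON-LD if that is preferred.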
The next step is to update our models to include schema generation in their context. At Revsys, we use Wagtail for our blog, so this example shows overriding the Page model's get_context method to add the JSON-LD schema elements we need for each blog post. If you are using regular Django models, you might create a get_schemas() method on your model, which you could then call from your Django view to pass the JSON-LD schemas into your context.
from blog.schema import get_breadcrumb_schema, get_post_schema
from wagtail.models import Page

class BlogPage(Page):
    def get_context(self, request, *args, **kwargs):
        """Add JSON-LD schema data to the page context."""
        context = super().get_context(request, *args, **kwargs)
        context["breadcrumb_schema"] = get_breadcrumb_schema(
            "Blog", request.path, post_title=self.title
        )
        context["post_schema"] = get_post_schema(self)
        return context
In our template, we removed the raw JSON-LD code and replaced it with the context variables. Our updated template is now much cleaner.
{% block extra_head %}
{% if breadcrumb_schema %}
<script type="application/ld+json">
{{ breadcrumb_schema|safe }}
</script>
{% endif %}
{% if post_schema %}
<script type="application/ld+json">
{{ post_schema|safe }}
</script>
{% endif %}
{% endblock %}
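One caveat with the safe filter: it turns off Django's autoescaping, so a title containing a closing script tag could end the element early. A possible mitigation, sketched below with the standard library only, is to replace <, > and & with JSON unicode escapes before the string reaches the template, similar in spirit to what Django's json_script filter does. The escape_for_script helper is hypothetical and not part of our schema.py.

```python
import json

# Characters that could let a string value break out of the script element.
_SCRIPT_ESCAPES = {
    ord("<"): "\\u003C",
    ord(">"): "\\u003E",
    ord("&"): "\\u0026",
}


def escape_for_script(json_text: str) -> str:
    """Replace <, > and & with JSON unicode escapes; the parsed data is unchanged."""
    return json_text.translate(_SCRIPT_ESCAPES)


raw = json.dumps({"headline": "Why </script> in a title is a problem"})
escaped = escape_for_script(raw)

print(escaped)
```

Because these characters only ever appear inside string values in JSON, the escaped text still parses to exactly the same data. In the helpers, this could be applied to the result of model_dump_json() before returning it.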
Reuse one Pydantic model for multiple schemas
We can reuse many of the same components (like PersonSchema and OrganizationSchema) in structured data for other pages, such as the pages for our conference talks and presentations. These are great candidates for structured data using the Event schema type, which can help Google show them in event carousels and highlights. We can create a new EventSchema Pydantic model that makes use of our existing schemas, since we are following the types defined by Schema.org.
# schema.py
class PlaceSchema(BaseModel):
    type: str = Field(default="Place", alias="@type")
    name: str
    address: Optional[str] = None

class EventSchema(BaseModel):
    context: str = Field(default="https://schema.org", alias="@context")
    type: str = Field(default="EducationEvent", alias="@type")
    name: str
    startDate: str
    location: Optional[PlaceSchema] = None
    performer: PersonSchema  # from our first example
    organizer: OrganizationSchema  # from our first example
    url: Optional[str] = None
Then, we add a new helper function for our Talk model:
# schema.py
def get_talk_schema(talk: Any) -> str:
    """Generate JSON-LD schema for a conference talk using Schema.org Event type."""
    speaker = PersonSchema(
        name=talk.speaker_name,
        url=getattr(talk, "speaker_url", None)
    )
    organizer = OrganizationSchema(
        name="REVSYS",
        logo=ImageObjectSchema(
            url="https://www.revsys.com/static/images/2017/revsys_logo_white.png"
        )
    )
    location = None
    # Only build a Place when a venue name is actually set; an empty or
    # missing name would fail PlaceSchema validation.
    if getattr(talk, "venue_name", None):
        location = PlaceSchema(
            name=talk.venue_name,
            address=getattr(talk, "venue_address", None)
        )
    schema = EventSchema(
        name=talk.title,
        startDate=talk.date.isoformat(),
        location=location,
        performer=speaker,
        organizer=organizer,
        url=f"https://www.revsys.com{talk.url}"
    )
    return schema.model_dump_json(by_alias=True, indent=2)
And update our Wagtail TalkPage model so we can add this new schema to the page context:
# models.py
from app.schema import get_talk_schema
from wagtail.models import Page

class TalkPage(Page):
    def get_context(self, request, *args, **kwargs):
        """Add event schema data to the page context for conference talks."""
        context = super().get_context(request, *args, **kwargs)
        context["event_schema"] = get_talk_schema(self)
        return context
Finally, update the talks template to render the new schema:
{% block extra_head %}
{% if event_schema %}
<script type="application/ld+json">
{{ event_schema|safe }}
</script>
{% endif %}
{% endblock %}
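Because the helpers are plain functions that return strings, they can be tested without rendering a template. The sketch below, assuming Pydantic v2, uses a simplified stand-in for get_talk_schema (no organizer or location) and a SimpleNamespace in place of a real TalkPage, so it runs without Django or a database. The sample talk values are made up.

```python
import json
from datetime import date
from types import SimpleNamespace
from typing import Optional

from pydantic import BaseModel, Field


class PersonSchema(BaseModel):
    type: str = Field(default="Person", alias="@type")
    name: str
    url: Optional[str] = None


class EventSchema(BaseModel):
    context: str = Field(default="https://schema.org", alias="@context")
    type: str = Field(default="EducationEvent", alias="@type")
    name: str
    startDate: str
    performer: PersonSchema


def get_talk_schema(talk) -> str:
    """Simplified stand-in for the helper above (no organizer or location)."""
    schema = EventSchema(
        name=talk.title,
        startDate=talk.date.isoformat(),
        performer=PersonSchema(name=talk.speaker_name),
    )
    return schema.model_dump_json(by_alias=True, exclude_none=True)


def test_talk_schema_has_expected_fields():
    # SimpleNamespace stands in for a TalkPage instance.
    talk = SimpleNamespace(
        title="Structured Data with Django",
        date=date(2024, 5, 1),
        speaker_name="Jane Developer",
    )
    data = json.loads(get_talk_schema(talk))
    assert data["@context"] == "https://schema.org"
    assert data["@type"] == "EducationEvent"
    assert data["startDate"] == "2024-05-01"
    assert data["performer"] == {"@type": "Person", "name": "Jane Developer"}


test_talk_schema_has_expected_fields()
```

Parsing the result with json.loads in the test also confirms that the output is well-formed JSON, which the template-only approach could not check.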