Contentstack: Content Modeling and Data Migration
Contentstack is a headless CMS that acts primarily as a content repository. It exposes content through APIs so that it can be displayed on any device, which leaves teams free to maintain a separate presentation layer for each device. Drawn by these benefits, organizations around the globe are considering migrating their existing implementations away from legacy CMS platforms. With large volumes of content living in the legacy CMS, they need an efficient strategy for data migration. In this blog, we'll go over one such use case.
Use Case: Migrating large data sets of content that live in a legacy CMS to Contentstack.
Strategy
Contentstack provides several Content Management APIs to perform CRUD operations on Content Types, Entries, Assets, Taxonomy, etc. We will use a few of these APIs to migrate content from the legacy CMS to Contentstack (CSTK). The API that creates entries of a particular content type in CSTK accepts its input as JSON. Hence, this implementation has the following parts:
- To export the content from the old CMS in JSON.
- To match the dependent assets, such as images, to the entries in the exported JSON.
- To import the exported JSON into Contentstack.
Implementation:
To export the content from the old CMS to JSON
One of the most efficient ways to export content from a CMS as JSON is with PowerShell scripts. However, the choice of technique depends on the specific CMS being exported from.
Assumptions:
- Let's assume that for our specific use case we are exporting from Sitecore XP version X, which supports Sitecore PowerShell Extensions (SPE).
- Content Type that we will export and import: News Articles
- Data Structure for News Articles looks like the below:
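As a rough sketch of that structure, the object below is inferred purely from the field names the export script emits. Every value is a placeholder, the boolean types are an assumption (the script itself writes these flags as strings), and the real content type may differ.

```javascript
// Hypothetical News Article entry; field names come from the export script,
// all values are placeholders for illustration only.
const exampleNewsArticle = {
  title: 'Example headline',
  name: 'Jane',
  date: '2023-01-15',
  summary_info: {
    summary_title: 'Example summary title',
    summary_description: 'Example summary description',
    summary_image: '', // filled in later, once assets are matched
    social_share_image: '',
  },
  global_field: {
    meta_title: 'Example meta title',
    meta_description: 'Example meta description',
  },
  // One object per child item (RichText, Quote or Image) of the article.
  modules: [
    { richtext: { rich_text_editor: '<p>Body text</p>', use_plain_text_styling: false } },
    { richtext: { rich_text_editor: '<p>Quoted text</p>', is_quote: true } },
    {
      image_full_width: {
        headline: 'Example image headline',
        Image: '', // filled in later, once assets are matched
        inline: false,
        fullbleed: true,
        no_max_height: false,
        caption: 'Example caption',
      },
    },
  ],
}

console.log(JSON.stringify(exampleNewsArticle, null, 2))
```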
PowerShell Script that will export the content from Sitecore in the required JSON format for Contentstack:
# Collect every item under the News Article folder in the web database.
# The path contains a space, so it must be quoted.
$items = Get-ChildItem -Path "web:/sitecore/content/Home/News Article" -Recurse
if ($items.Count -eq 0) {
    Show-Alert "There are no content items"
}
else {
    # Options for the list view dialog from which the JSON export is triggered.
    $props = @{
        Title           = "Out Team Json Export"
        InfoTitle       = "Export Content Items"
        InfoDescription = "Export Lists with the help of Json option"
        PageSize        = 600
    }

    # Build the global (meta) field group for an item.
    function Create-GlobalFields ($currentItem) {
        $meta_title = $currentItem.Fields["MetaTitle"]
        $meta_description = $currentItem.Fields["MetaDescription"]
        $global_field = @{
            meta_title       = "$($meta_title)"
            meta_description = "$($meta_description)"
        }
        return ($global_field | ConvertTo-Json)
    }

    # Build the summary group; image fields stay empty until assets are matched.
    function Create-SummaryInfo ($currentItem) {
        $summary_title = $currentItem.Fields["Summary Title"]
        $summary_description = $currentItem.Fields["Summary Description"]
        $summary_image = ""
        $social_share_image = ""
        $summary_info = @{
            summary_title       = "$($summary_title)"
            summary_description = "$($summary_description)"
            summary_image       = "$($summary_image)"
            social_share_image  = "$($social_share_image)"
        }
        return ($summary_info | ConvertTo-Json)
    }

    # Convert the child items (RichText, Quote, Image) into a list of modules.
    function Create-Modules ($currentItem) {
        $children = Get-ChildItem -Path $currentItem.ItemPath -Recurse -Depth 1
        [System.Collections.ArrayList]$modulesList = @()
        foreach ($childItem in $children) {
            if ($childItem.TemplateName -eq "RichText") {
                $use_plain_text_styling = ($childItem.Fields["Use Plain Text Styling"].Value -eq 1)
                $richtext = @{
                    richtext = @{
                        rich_text_editor       = "$($childItem.Fields["Content"])"
                        use_plain_text_styling = "$($use_plain_text_styling)"
                    }
                }
                $richtextJson = $richtext | ConvertTo-Json
                $modulesList.Add($richtextJson) | Out-Null
            }
            elseif ($childItem.TemplateName -eq "Quote") {
                $richtext = @{
                    richtext = @{
                        rich_text_editor = "$($childItem.Fields["Quote Content"])"
                        is_quote         = $true
                    }
                }
                $richtextJson = $richtext | ConvertTo-Json
                $modulesList.Add($richtextJson) | Out-Null
            }
            elseif ($childItem.TemplateName -eq "Image") {
                # Evaluate the flags for every item so that values from a
                # previous image do not carry over into this one.
                $inLine = ($childItem.Fields["In-Line"].Value -eq 1)
                $fullBleed = ($childItem.Fields["FullBleed"].Value -eq 1)
                $noMaxHeight = ($childItem.Fields["No Max Height"].Value -eq 1)
                $image = @{
                    image_full_width = @{
                        headline      = "$($childItem.Fields["Headline"])"
                        Image         = ""
                        inline        = "$($inLine)"
                        fullbleed     = "$($fullBleed)"
                        no_max_height = "$($noMaxHeight)"
                        caption       = "$($childItem.Fields["Caption"])"
                    }
                }
                $imageJson = $image | ConvertTo-Json
                $modulesList.Add($imageJson) | Out-Null
            }
        }
        return ($modulesList | ConvertTo-Json)
    }

    # Show only the news articles, with one column per Contentstack field.
    $items | Where-Object { $_.TemplateName -eq "NewsArticle" } |
        Show-ListView @props -Property @{Label = "title"; Expression = { $_["title"] } },
            @{Label = "name"; Expression = { $_["First Name"] } },
            @{Label = "date"; Expression = { ([Sitecore.DateUtil]::IsoDateToDateTime($_["Date"])).AddDays(-1).ToString("yyyy-MM-dd") } },
            @{Label = "summary_info"; Expression = { Create-SummaryInfo -currentItem $_ } },
            @{Label = "global_field"; Expression = { Create-GlobalFields -currentItem $_ } },
            @{Label = "modules"; Expression = { Create-Modules -currentItem $_ } }
}
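Each helper function in the script returns the output of ConvertTo-Json, so the summary_info, global_field and modules columns are likely to arrive in the exported file as JSON-encoded strings rather than nested objects. The sketch below shows one way to unwrap them before import. It assumes that is how the export looks in your case, and the names parseIfJson and normalizeRow are hypothetical helpers, not part of any library.

```javascript
// Parse a value that may be a JSON-encoded string; leave anything else as is.
function parseIfJson(value) {
  if (typeof value !== 'string') return value
  try {
    return JSON.parse(value)
  } catch {
    return value // plain text that is not JSON
  }
}

// Turn one exported row into a nested object.
function normalizeRow(row) {
  // modules may be missing, a single encoded module, or an encoded array.
  let modules = parseIfJson(row.modules)
  if (modules === undefined || modules === null || modules === '') modules = []
  if (!Array.isArray(modules)) modules = [modules]

  return {
    ...row,
    summary_info: parseIfJson(row.summary_info),
    global_field: parseIfJson(row.global_field),
    modules: modules.map(parseIfJson), // each module is itself an encoded string
  }
}

// Example with a made-up row.
const sampleRow = {
  title: 'Example headline',
  summary_info: JSON.stringify({ summary_title: 'Example' }),
  global_field: JSON.stringify({ meta_title: 'Example' }),
  modules: JSON.stringify([JSON.stringify({ richtext: { is_quote: true } })]),
}
console.log(normalizeRow(sampleRow))
```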
To match the dependent assets, such as images, to the entries in the exported JSON
The following steps meet this requirement:
- Download the assets from Sitecore by right-clicking on the asset folder, choosing script, and then downloading.
- Drag and drop the contents of the downloaded folder into the asset folder created in Contentstack.
- Use the Get assets of a specific folder Content Management API (https://{{base_url}}/v3/assets?folder=enter_your_folder_uid) to retrieve all the assets in JSON format (the images JSON file).
- Now use a PowerShell script to match the entries in the exported JSON file against the images JSON file by looking for a common pattern, for example an entry title that also appears in the image name.
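The last step can be sketched as follows. The text suggests PowerShell. JavaScript is used here only to stay consistent with the import code, and the logic carries over directly. The sketch assumes each asset object carries uid and filename properties, that the image field accepts the asset's uid, and that titles appear in file names in a slug-like form. The names slug and attachSummaryImages are hypothetical.

```javascript
// Reduce a string to lowercase words joined by hyphens, for loose comparison.
const slug = (text) =>
  String(text)
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '')

// For each entry, find the first asset whose file name contains the entry
// title and store that asset's uid as the summary image.
function attachSummaryImages(entries, assets) {
  return entries.map((entry) => {
    const key = slug(entry.title)
    const match = key
      ? assets.find((asset) => slug(asset.filename).includes(key))
      : undefined
    if (!match) return entry // leave unmatched entries untouched
    return {
      ...entry,
      summary_info: { ...entry.summary_info, summary_image: match.uid },
    }
  })
}

// Example with made-up data.
const sampleEntries = [
  { title: 'Annual Report 2023', summary_info: { summary_image: '' } },
  { title: 'No Match', summary_info: { summary_image: '' } },
]
const sampleAssets = [{ uid: 'blt111', filename: 'annual-report-2023-hero.jpg' }]
console.log(attachSummaryImages(sampleEntries, sampleAssets))
```

Entries without a matching asset are returned unchanged, so they can be listed and fixed by hand before the import.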
To import the exported JSON into Contentstack
Once the exported JSON is ready, it can be imported. Do the following in your Next.js/React solution:
- Place the exported JSON (e.g. news_articles_data.json) inside a data folder under the component that contains the data migration code.
- Create an Entry Service using the Create an entry with JSON RTE API provided by Contentstack, as shown below:
import Result from '@/core/result'
import settings from '@/common/settings'
import axios from 'axios'
import posts from '../services/data/news_articles_data.json'

// Create one entry per exported post through the Content Management API.
const createAnEntry = async () => {
  const origin = 'https://' + settings.contentstack.env.apiHost
  const pathname = settings.contentstack.entries.news_article_entries
  const url = origin + pathname
  const options = {
    headers: {
      'Content-Type': 'application/json',
      authtoken: settings.contentstack.env.authToken,
      authorization: settings.contentstack.env.authorization,
      api_key: settings.contentstack.env.apikey,
    },
  }

  try {
    // Send all requests concurrently and wait until every one has finished.
    const responses = await Promise.all(
      posts.map((post) => axios.post(url, { entry: post }, options)),
    )
    console.log('responses are', responses)
    return Result.success(responses)
  } catch (error) {
    // Promise.all rejects as soon as any single request fails.
    return Result.fail(error)
  }
}

const EntryService = {
  createAnEntry,
}

export default EntryService
The settings file would look like this:
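A minimal sketch, inferred only from the properties the service reads. The host, the content type uid news_article, and the environment variable names are all placeholder assumptions, so substitute the values for your own stack and region.

```javascript
// Hypothetical settings module; every value below is a placeholder.
const settings = {
  contentstack: {
    env: {
      apiHost: 'api.contentstack.io', // assumed host; depends on your region
      authToken: process.env.CONTENTSTACK_AUTHTOKEN,
      authorization: process.env.CONTENTSTACK_MANAGEMENT_TOKEN,
      apikey: process.env.CONTENTSTACK_API_KEY,
    },
    entries: {
      // Path of the create-entry endpoint for the assumed content type uid.
      news_article_entries: '/v3/content_types/news_article/entries',
    },
  },
}

// The service concatenates host and path like this:
const entriesUrl =
  'https://' + settings.contentstack.env.apiHost + settings.contentstack.entries.news_article_entries
console.log(entriesUrl)
```

In the project itself this object would be the default export of the module imported as '@/common/settings'. Reading credentials from environment variables keeps them out of source control.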
Lastly, call this service to create the entries in your stack. In my experience, it takes less than 5 seconds to create around one thousand entries.
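If your stack enforces request rate limits, firing every request at once may cause some of them to be rejected. One option, sketched below under that assumption, is to send the posts in fixed-size batches and collect the outcome of each request. The helper name runInBatches and the batch size are hypothetical, and the worker here is a stand-in for the real HTTP call.

```javascript
// Run worker(item) for every item, at most batchSize at a time.
// Returns one { status, value | reason } record per item, in input order.
async function runInBatches(items, batchSize, worker) {
  const results = []
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize)
    // allSettled waits for the whole batch and never rejects, so one failed
    // request does not hide the outcome of the others.
    const settled = await Promise.allSettled(batch.map(async (item) => worker(item)))
    results.push(...settled)
  }
  return results
}

// Example with a stand-in worker instead of a real API call.
const demo = runInBatches([1, 2, 3, 4, 5], 2, async (n) => {
  if (n === 3) throw new Error('simulated failure')
  return n * 2
})
demo.then((results) => {
  const failed = results.filter((r) => r.status === 'rejected').length
  console.log(`${results.length - failed} succeeded, ${failed} failed`)
})
```

Inside the service above, the worker could be `(post) => axios.post(url, { entry: post }, options)`, and the rejected records show which posts need to be retried.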