Shutterstock Expands AI Training Datasets with Long-Form Video and Specialized Multimodal Content

On March 19, 2026, Shutterstock Inc. (NYSE: SSTK) announced a significant expansion of its licensed training datasets, introducing new multimodal content categories designed to power advanced generative artificial intelligence (AI) models. The updated catalog now includes templates, fonts, long-form video content, and premium metadata, alongside specialized podcast content and scientific imagery. The move positions the company to meet growing demand for the diverse, high-fidelity data needed to continuously refine large-scale AI systems.

The expansion builds upon Shutterstock’s existing repository, which currently holds more than 600 million images, 50 million videos, 5 million music tracks and sound effects, and 1 million 3D models. By adding specialized assets like long-form video and scientific imagery, Shutterstock is targeting developers and enterprise partners who require niche datasets for complex multimodal training. The company confirmed that its data licensing business now supports the entire model training lifecycle, from initial research and development to commercial deployment and ongoing retraining.

Daniel Mandell, Senior Vice President of Data Licensing and AI at Shutterstock, stated that high-quality, rights-cleared data has become as critical to AI infrastructure as compute power. He emphasized that generative AI models require a continuous flow of fresh data to remain accurate and competitive. To facilitate this, Shutterstock offers both research and commercial licensing options, allowing startups and academic researchers to begin with lower-cost research licenses before transitioning to commercial agreements for scaled production.

Beyond raw data access, the company’s platform provides integrated services including data structuring, labeling, rights management, and training orchestration. These offerings are supported by ML-assisted evaluation tools and MLOps deployment services, ensuring that content is technically optimized for model ingestion while maintaining clear data provenance for compliance purposes. The company also highlighted its use of human-reviewed metadata to improve model performance and facilitate the transition from experimental phases to production-ready applications. This follows the recent launch of Shutterstock’s AI Services, which provides end-to-end model training and evaluation for global technology partners.

Shutterstock continues to serve as a primary data provider for major industry players, including OpenAI, Meta, NVIDIA, and Google, as well as specialized AI firms such as Runway, Black Forest Labs, and ElevenLabs. The announcement comes shortly after the UK’s Competition and Markets Authority (CMA) extended its inquiry into the proposed merger between Shutterstock and Getty Images, setting a new deadline of June 14, 2026. The company reported revenue of $990 million over the last twelve months, with a gross profit margin of 59 percent.