Introduction
Graphic design is all about communicating visually, and it plays a crucial role in marketing and advertising. Crafting an advertisement, banner, or post involves composing and arranging visual elements such as colors, shapes, images, and type. This process applies design principles like grouping and hierarchy to convey a message effectively through visual aesthetics.
Traditional design tools:
Graphic design experts have long used traditional design tools like CorelDRAW and Adobe Illustrator to craft these visuals. These tools give designers manual control over every aspect of their creations.
Template tools:
Subsequently, template-based design tools like Canva, Adobe Express, and Microsoft Designer emerged, simplifying the design process for individuals with limited expertise. However, the challenge of finding the right template led 96% of users to opt for designing from scratch.
Generative AI design tools:
The latest leap in design technology introduces generative design tools, leveraging the capabilities of generative AI. While significant progress has been made in text and image generation, only a few players like Sivi and Microsoft COLE have ventured into developing design generation models. Sivi's models stand out by offering instantly generated layered editable designs that adhere to brand guidelines, setting them apart in the evolving landscape of graphic design.
—————————————————————
You might be curious how a 16-person startup can compete with large businesses in the generative AI for design space.
TL;DR
Abundant data exists online for text and images, and image generators like Midjourney and DALL-E produce monolithic images. The lack of structured design data, coupled with the multifaceted and multi-layered nature of graphic design, makes graphic design hard to solve with generative AI.
Template-based design tools like Canva and Microsoft Designer focus on template search rather than tackling the broader challenge of design generation for any given content. Designs made by replacing template content are neither cohesive nor relevant.
At Sivi, we built custom models for layered design generation and prepared unique structured design datasets (beyond image and text datasets) to solve editable design generation.
Sivi follows atomic design principles and composes designs from scratch in any dimensions and in over 72 languages, following the creative process of a human designer.
We started the research in 2019 and we are an excellent mix of designers, data scientists, and engineers driven by a passion for disruption and pushing boundaries.
—————————————————————
Generative AI Tools
While text and image generation tools like ChatGPT, DALL-E, and Midjourney have gained prominence, design generation calls for tools tailored specifically to this intricate process.
Text Generation Tools:
Notable tools include ChatGPT and Bard, which focus on generating coherent and contextually relevant text.
Read why GPT5 or Gemini cannot generate designs.
Image Generation Tools:
DALL-E, Midjourney, and Stability AI excel in generating diverse and creative images, expanding the possibilities of visual content.
Design Generation Tools:
Some image generators with integrated text capabilities contribute to design generation, and COLE by Microsoft trains and combines different AI models to generate designs.
Cascaded diffusion models like DeepFloyd and Ideogram offer very limited text support, and their output is not editable.
DGDS by Sivi leads the way in generative AI for graphic design, providing users with the ability to create layered, editable designs on demand.
Decoding the challenges of Generative AI in Graphic Design
The world of graphic design is intricate and multifaceted, encompassing typography, text composition, ornamentation, and imagery to convey complex thoughts, emotions, and attitudes. Creating top-tier designs demands a high degree of creativity, innovation, and lateral thinking. Unlike text and image data, graphic design requires layered, structured data, making it a more challenging domain for generative AI.
Multifaceted Nature of Graphic Design: Graphic design involves various elements, including typography, composition, ornamentation, and imagery. Generating cohesive designs requires understanding and integration of these diverse components.
Multi-layered Design: Unlike simple text or image generation, graphic design involves multiple layers, such as SVG elements, images, and text elements. One unified layer is insufficient to capture the complexity of graphic design.
Element-specific Handling: Each design element must be handled differently due to different properties. Generative AI needs to adapt and understand the unique characteristics of each component in a design.
Lack of Structured Data: Unlike text and image data available abundantly on the internet, graphic design requires layered and structured data, posing challenges in training models effectively.
Brand Identity Integration: Graphic design often involves incorporating brand details and adhering to brand guidelines across various assets, such as banners, posters, and ads.
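To make the "multi-layered" and "element-specific" points concrete, here is a minimal sketch of what a layered design record could look like. It is an illustration only, not Sivi's internal format: the class names, fields, role names, and sample values are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Set, Union


@dataclass
class TextLayer:
    content: str
    role: str         # hypothetical semantic role, e.g. "headline" or "button"
    font_family: str
    color: str        # hex color such as "#E91E63"


@dataclass
class ImageLayer:
    source: str       # path or URL of a raster asset
    width: int
    height: int


@dataclass
class VectorLayer:
    svg_markup: str   # inline SVG for shapes and ornaments


Layer = Union[TextLayer, ImageLayer, VectorLayer]


@dataclass
class Design:
    width: int
    height: int
    layers: List[Layer] = field(default_factory=list)  # ordered back to front

    def editable_text(self) -> List[str]:
        """Text stays editable because each string lives in its own layer."""
        return [l.content for l in self.layers if isinstance(l, TextLayer)]

    def off_brand_colors(self, palette: Set[str]) -> List[str]:
        """Return text colors that are missing from the brand palette."""
        return [l.color for l in self.layers
                if isinstance(l, TextLayer) and l.color.upper() not in palette]


# Illustrative sample data (not taken from any real design).
design = Design(1080, 1080, [
    ImageLayer("model.jpg", 1080, 1080),
    VectorLayer('<rect width="1080" height="200" fill="#FFFFFF"/>'),
    TextLayer("New Pink Collection", "headline", "Poppins", "#E91E63"),
    TextLayer("Shop now", "button", "Poppins", "#000000"),
])
```

A monolithic image has no equivalent of `editable_text()` or `off_brand_colors()`: once text and shapes are flattened into pixels, there is nothing left to query or edit per element.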
How Sivi navigates the challenges and generates the designs
Sivi adopts a unique approach based on atomic design principles: it breaks the design problem into smaller pieces and develops models for each piece with structured data. You can input text, upload images, add brand details, and ask Sivi to generate layered, editable designs.
User Input:
Type the text or generate it using a prompt
Upload assets such as photos, logos, or generate images via a prompt
Add brand details (colors, typography)
Sivi's Generation Process:
Identifies tone, emotions, and target audience based on content
Forms text styles based on semantics
Analyzes images and forms image styles based on the analysis
Composes designs
Adds colors
Result:
Layered, editable designs that align with user specifications and preferences.
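The flow above can be pictured as a chain of small stages, each adding one kind of decision to a shared state. The sketch below is a toy illustration of that staging idea, not Sivi's implementation: every function body is a placeholder heuristic standing in for a trained model, and all names and sample inputs are assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DesignState:
    texts: List[str]
    images: List[str]
    brand_colors: List[str]
    notes: Dict[str, object] = field(default_factory=dict)


Stage = Callable[[DesignState], DesignState]


def identify_tone(state: DesignState) -> DesignState:
    # Placeholder heuristic; a real system would use a trained model here.
    excited = any("!" in t for t in state.texts)
    state.notes["tone"] = "enthusiastic" if excited else "neutral"
    return state


def form_text_styles(state: DesignState) -> DesignState:
    # Placeholder: treat the shortest text as the headline, the rest as body.
    headline = min(state.texts, key=len)
    state.notes["text_styles"] = {
        t: ("headline" if t == headline else "body") for t in state.texts
    }
    return state


def analyze_images(state: DesignState) -> DesignState:
    # Placeholder: the first image is the hero, the others are supporting.
    state.notes["image_styles"] = {
        img: ("hero" if i == 0 else "supporting")
        for i, img in enumerate(state.images)
    }
    return state


def compose(state: DesignState) -> DesignState:
    # Placeholder: stack images at the back and texts at the front.
    state.notes["layers"] = list(state.images) + list(state.texts)
    return state


def add_colors(state: DesignState) -> DesignState:
    # Assign brand colors to texts in round-robin order.
    colors = state.brand_colors or ["#000000"]
    state.notes["colors"] = {
        t: colors[i % len(colors)] for i, t in enumerate(state.texts)
    }
    return state


PIPELINE: List[Stage] = [
    identify_tone, form_text_styles, analyze_images, compose, add_colors,
]


def generate(state: DesignState) -> DesignState:
    for stage in PIPELINE:
        state = stage(state)
    return state


result = generate(DesignState(
    texts=["Pink Fashion", "I love it! The quality is so amazing!"],
    images=["collection.jpg"],
    brand_colors=["#E91E63"],
))
```

Keeping each stage separate is what makes the output layered: the text styles, image styles, composition, and colors remain individually addressable instead of being baked into one image.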
Sivi Architecture
Focus
While existing models like Midjourney and Stable Diffusion generate images, and cascaded diffusion models such as DeepFloyd IF and Ideogram provide some support for image-plus-text generation, they have limitations: they usually accommodate only one line of text and offer minimal editability.
Sivi specializes in design generation. For image and text generation, we currently use open-source models.
Models
Our approach involves employing custom diffusion models and multimodal architectures. We will share more detailed information soon.
Dataset
We have been preparing datasets for our various models since 2019. In our experience, models come and go, but datasets endure. Our datasets set us apart, serving as a distinctive value proposition and creating a robust entry barrier despite the availability of open-source models.
Layered composition dataset for each style
Colorization dataset
A large vector dataset
State of the Art of Generative AI Design Research
Let's say a clothing brand named “Pink Fashion” wants to create a customer testimonial ad showcasing a pink clothing collection and a review from their customer, Jenny Wilson:
“I recently purchased this pink collection and I love it! The quality is so amazing! I will continue to order more items from this store.”
Here’s a comparison of designs generated by different models:
Some more designs generated by Sivi:
Note: Sivi's results were generated using DGDS Version 1.7, while the images by COLE, DeepFloyd IF, SDXL, and DALL-E 3 were published in the COLE research paper. Although Sivi lets you edit every single element, all the designs presented here were downloaded directly after generation without any manual edits.
Compared with the other models, Sivi's output comes closer to human design standards. Designs generated by DGDS Version 1.7 were chosen eight out of ten times.
Limitations of COLE, Image Generators, and Template Tools
Limitations of AI Graphic Design Generators
AI graphic design generators such as COLE understand user intent, but they are still in the framework phase and have the following limitations:
The arrangement of typography blocks
Limited number of editable visual elements
The restricted diversity in typography color selection
Only supports square dimensions
No support for free text by the user
Inability to upload user images and brand assets
Limitations of AI Image Generators
Text-to-image AI tools like DALL-E 3 by OpenAI, Midjourney, SDXL, and DeepFloyd excel in image generation but fall short as generative AI for graphic design, with the following limitations:
Generates a monolithic image rather than a design
Does not provide support for free text
Supports only a few aspect ratios
Limitations of Template Tools
Template tools like Canva and Microsoft Designer begin by picking a template and counting its text elements along with their character counts.
They then use APIs to generate text that aligns with the specified number of elements and characters in the chosen template. It is a template search rather than design generation.
Finally, they replace the existing content in the template with the generated text.
Template content can be replaced only within a limited number of characters. This restricts users from adding personalized copy, brand assets, and other elements, so the final designs are neither cohesive nor relevant.
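The template flow described above can be sketched in a few lines. This is a simplified stand-in, not how any particular tool is implemented: real tools generate text sized to fit each slot, whereas this sketch simply truncates, and the `Slot` class, slot names, and limits are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Slot:
    name: str
    max_chars: int  # fixed by the template, not by the user's content


def fill_template(slots: List[Slot], copy: Dict[str, str]) -> Dict[str, str]:
    """Replace each slot's content, cutting copy that exceeds the slot limit."""
    filled = {}
    for slot in slots:
        text = copy.get(slot.name, "")
        filled[slot.name] = text[: slot.max_chars]
    return filled


# A hypothetical template with two text slots and no place for a logo.
template = [Slot("headline", 20), Slot("body", 40)]
copy = {
    "headline": "Pink Fashion customer stories",
    "body": "I recently purchased this pink collection and I love it!",
    "logo": "pink-fashion.svg",  # no matching slot, so it is silently dropped
}
result = fill_template(template, copy)
```

The template dictates the content rather than the other way around: the testimonial is cut to fit the slot, and the logo has nowhere to go.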
Advantages of Sivi over COLE and other GenAI tools
Sivi is an AI design generator, not a template tool or an image generator. It understands user intent and composes designs from scratch like a human designer.
Control
Control before generation - Users can add their own content or edit the generated content, and choose their preferences
Control after generation - fully editable designs and vector layers for seamless customization
Predictability
Component based design following atomic design principles
Allows fine-grained preferences
Style cards
Personalization
Support for brand assets such as colors, typography, logo, and elements
Full text support with semantic-specific designs like Headline, Contact, Button, Lists and more
Practical AI
Business visuals are mostly used by small businesses. Sivi is tailored for them, automating the various layers of business visuals.
Extendable framework
Generates designs for any dimensions and over 72 languages.
We built our rendering engine and models to be plug-and-play, so we can readily extend to other design areas such as newsletters, UI, and HTML generation in the near future.
Sivi is not just a research project or a framework; it is a product available in the market.
Conclusion
Generative AI for graphic design is rapidly evolving, overcoming challenges to provide users with dynamic designs. Tools like Sivi and Microsoft COLE showcase the potential of AI in transforming the design process, offering layered, editable designs that cater to the unique requirements of graphic design. As research continues to push the boundaries of what is possible, the future promises a more seamless integration of generative AI into the creative landscape, opening new avenues for designers and enthusiasts alike.
Share your thoughts on this graphic design revolution.