Image: Stable Diffusion
Updated on August 21, 2022:
Stable Diffusion is now available through a web interface. After logging in, you can generate images from text prompts, similar to DALL-E 2, and you have several additional options for adjustments. As with DALL-E 2, there are restrictions on prompts, such as sexual or violent imagery.
The open source Stable Diffusion model, which can run locally or in the cloud, will not have these restrictions. The model is expected to be released on GitHub in the next few days.
You can try the web version of Stable Diffusion for free. For the equivalent of just under $12, you can buy around 1,000 prompts. The actual number of prompts available depends on the complexity of the calculations and the resolution of your image.
Click here to go to Dreamstudio, the web interface for Stable Diffusion.
The original article is dated August 14, 2022:
Open source rival to DALL-E from OpenAI runs on your graphics card
OpenAI’s DALL-E 2 is getting free competition. Behind it are an open source AI movement and the startup Stability AI.
Artificial intelligence that can generate images from text descriptions has made rapid progress since early 2021. At the time, OpenAI showed impressive results with DALL-E 1 and CLIP. The open source community used CLIP for numerous alternative projects throughout the year. Then, in 2022, OpenAI released the impressive DALL-E 2, Google showed off Imagen and Parti, Midjourney reached millions of users, and Craiyon flooded social media with AI images.
Startup Stability AI has now announced the release of Stable Diffusion, another system similar to DALL-E 2, which will initially be made available gradually to new researchers and other groups via a Discord server.
After a testing phase, Stable Diffusion will be released for free: the code and a trained model will be released as open source. There will also be a hosted version with a web interface for users to test the system.
Stability AI finances free DALL-E 2 competitor
Stable Diffusion is the result of a collaboration between researchers from Stability AI, RunwayML, LMU Munich, EleutherAI, and LAION. The EleutherAI research collective is known for its open source language models GPT-J-6B and GPT-NeoX-20B, among others, and is also conducting research on multimodal models.
The nonprofit organization LAION (Large-Scale Artificial Intelligence Open Network) provided the training data with the open source LAION 5B dataset, which the team filtered with human feedback in an early testing phase to create the final LAION-Aesthetics training dataset.
Patrick Esser of Runway and Robin Rombach of LMU Munich led the project, building on their work in the CompVis group at the University of Heidelberg. There, they created the widely used VQGAN and Latent Diffusion. The latter, together with research from OpenAI and Google Brain, served as the basis for Stable Diffusion.
Stability AI, founded in 2020, is backed by mathematician and computer scientist Emad Mostaque. He worked as an analyst for various hedge funds for a few years before turning to public works. In 2019, he helped found Symmitree, a project that aims to reduce the cost of smartphones and internet access for disadvantaged populations.
With Stability AI and his private fortune, Mostaque aims to foster the open source AI research community. His startup previously supported the creation of the “LAION 5B” dataset, for example. To train the Stable Diffusion model, Stability AI provided servers with 4,000 Nvidia A100 GPUs.
“No one has a vote except our 75 employees, not billionaires, big funds, governments or anyone else in control of the company or the communities we support. We are completely independent,” Mostaque told TechCrunch. “We plan to use our computing to accelerate open source fundamental AI.”
Stable Diffusion is a landmark of open source
Currently, a Stable Diffusion beta test is underway, with new participants admitted in waves. The results, which can be seen on Twitter, for example, show that a real competitor to DALL-E 2 is emerging here.
Unlike DALL-E 2, Stable Diffusion can generate images of prominent people and other subjects that OpenAI forbids in DALL-E 2. Other systems like Midjourney or Pixelz.ai can also do this, but they do not achieve comparable quality with the great diversity seen in Stable Diffusion, and none of the other systems is open source.
#stablediffusion you can do really amazing interpolations between text prompts if you fix the initialization noise and slerp between the prompt conditioning vectors: pic.twitter.com/lWOOETYVZ3
– Xander Steenbrugge (@xsteenbrugge) August 7, 2022
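The "slerp" mentioned in the tweet is spherical linear interpolation, which blends two vectors along the arc between them rather than along a straight line. The following is only a minimal sketch of the general technique on plain Python lists, not the code used in the tweet. In a real pipeline the inputs would be the model's prompt conditioning tensors; the function name, the `eps` threshold, and the toy vectors are assumptions made for illustration.

```python
import math


def slerp(v0, v1, t, eps=1e-7):
    """Spherical linear interpolation between two equal-length vectors.

    t = 0 returns v0, t = 1 returns v1. Falls back to ordinary linear
    interpolation when the vectors are (anti)parallel or one has zero length.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 == 0.0 or norm1 == 0.0:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Angle between the two vectors, with the cosine clamped for safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)

    if sin_theta < eps:
        # Direction is (nearly) the same or opposite: slerp is ill-defined.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Toy stand-ins for two prompt conditioning vectors (hypothetical values).
prompt_a = [1.0, 0.0]
prompt_b = [0.0, 1.0]

# Five evenly spaced steps from prompt_a to prompt_b.
frames = [slerp(prompt_a, prompt_b, i / 4) for i in range(5)]
```

Each entry in `frames` would be fed to the image model as conditioning while the initialization noise is kept fixed, so that only the prompt blend changes from one image to the next.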
Stable Diffusion is expected to run on a single graphics card with 5.1 gigabytes of VRAM, bringing to local hardware an AI technology that was previously available only through cloud services. Stable Diffusion thus offers researchers and other interested parties without access to GPU servers the opportunity to experiment with a modern generative AI model. The model is also supposed to run on MacBooks with Apple’s M1 chip, although image generation there takes several minutes instead of seconds.
Stability AI also wants to allow companies to train their own variants of Stable Diffusion. Multimodal models are therefore following the path that large language models took before them: away from a single vendor and towards the wide availability of numerous alternatives through open source.
Runway is already investigating text-to-video editing enabled by Stable Diffusion.
#stablediffusion text-to-image checkpoints are now available for research purposes upon request at https://t.co/7SFUVKoUdl
Working on a more permissive release and inpainting checkpoints.
– Patrick Esser (@pess_r) August 11, 2022
Stable Diffusion: Pandora’s box and net benefits
Of course, with open access and the ability to run the model on a widely available GPU, the opportunity for abuse increases dramatically.
“A percentage of people are just nasty and weird, but that’s humanity,” Mostaque said. “In fact, we believe that this technology will prevail, and the paternalistic and somewhat condescending attitude of many AI fans is wrong in not trusting society.”
Mostaque emphasizes, however, that free availability allows the community to develop countermeasures.
“We are taking significant security measures, including formulating state-of-the-art tools to help mitigate potential damage at launch and to our own services. With hundreds of thousands developing this model, we are confident that the net benefit will be hugely positive, and as billions use this technology, the harms will be nullified.”
More information is available on the Stable Diffusion GitHub. You can find many examples of Stable Diffusion’s image generation capabilities on the Stable Diffusion subreddit. Go here to sign up for the beta version of Stable Diffusion.