Apple’s MGIE, Multimodal AI Glasses, and Many Other AI Advancements
Welcome to the sixth edition of the PixelBin Newsletter. Every Monday, we send you one article that will help you stay informed about the latest AI developments in Business, Product, and Design.
In Today’s Newsletter
🔥 Great News for Designers, Apple’s MGIE, AI Frame, and New Copilot Updates
🌟 The Next Chapter in Google Gemini Era: Gemini Advanced
🚀 Natural Language-Prompted Image Editing with MGIE
🔥 AI in fast lane
Great News for Designers, Apple’s MGIE, AI Frame, and New Copilot Updates
Apple has released MGIE, an open-source model that uses natural language to edit images. MGIE can crop, resize, rotate, flip, add filters, etc. Read More…
Brilliant Labs announced “Frame,” the first glasses with a built-in multimodal AI assistant capable of enhancing daily activities and interacting with both the digital and physical worlds. Perplexity also announced that it will integrate its chatbot into Frame. Read More…
Microsoft just pushed new Copilot updates, including a visual makeover, inline image editing, and more. The company also showed off its first Super Bowl ad in 4 years, putting AI center stage. Read More…
The Midjourney alpha is now available to users who have generated 10,000 images or more. Read More…
🌟 Product Innovation through AI
The Next Chapter in Google Gemini Era: Gemini Advanced
Google's Gemini model has been a hot topic in the AI world lately, with attention on its language abilities, its performance across various tasks, and how it compares to other models. Its reported strengths include handling complex reasoning tasks, generating text in non-English languages, and outperforming other models on certain benchmarks.
Gemini has now taken a step forward with its Advanced version, and Bard has been rebranded as Google Gemini. The rebranding brings a brand-new landing page, new apps, and a lot more, along with subtle warnings about the existing Google Assistant.
Here are some details:
Gemini Advanced is a paid upgrade of Gemini, offering more features for $19.99/month with the Google One AI Premium Plan, which also gives you 2TB of storage and other perks.
If you're on a Google One family plan with AI Premium, your family members do not get access to Gemini Advanced, which could slow its growth as a product.
Gemini comes in three versions: Nano, Pro, and Ultra, each designed for different uses.
Gemini works with many Google products, like the Pixel 8 phone, the chatbot formerly known as Bard, and apps like Gmail, Docs, Slides, and Sheets.
Gemini is cheaper and easier to use than before, making it available to more people.
There's a Gemini mobile app that lets you do things like create photo captions and get answers to questions about articles.
While the results achieved by Gemini are impressive, there are also discussions about whether its capabilities have been exaggerated in certain areas. Overall, the linked article offers valuable insight into Gemini's language abilities and performance across various tasks. Read More…
🚀 Innovation in AI
Natural Language-Prompted Image Editing with MGIE
Apple and UC Santa Barbara researchers have introduced MGIE, an open-source AI system that enables image editing through natural language commands. MGIE can reliably edit an image even when the user describes the desired changes in plain, everyday language.
The system can handle common Photoshop-style adjustments like cropping, rotating, and filtering, as well as more advanced object manipulations, background replacement, and photo blending. MGIE can also optimize an image globally by adjusting properties such as brightness, contrast, and sharpness.
How does MGIE use natural language prompts to improve image editing?
MGIE incorporates multimodal large language models (MLLMs) to interpret text prompts and guide pixel-level changes to photos.
The MLLMs are capable of cross-modal reasoning and responding appropriately to text, allowing MGIE to translate user commands into concise, unambiguous editing guidance.
For example, "make the sky more blue" becomes "increase the saturation of the sky region by 20%."
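To make the sky example concrete, here is a minimal Python sketch of what an explicit instruction like "increase the saturation by 20%" could mean for a single pixel. It is only an illustration built on Python's standard colorsys module. The function name boost_saturation and the sample pixel value are hypothetical, and this is not how MGIE applies edits internally, since MGIE relies on learned models and not on a hand-written rule like this one.

```python
import colorsys


def boost_saturation(rgb, factor=1.2):
    """Scale the saturation of one 8-bit RGB pixel by `factor`, capped at 1.0."""
    # colorsys works on floats in the range 0.0 to 1.0
    r, g, b = (channel / 255.0 for channel in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    # Hue and brightness stay the same; only saturation changes
    s = min(1.0, s * factor)
    r2, g2, b2 = colorsys.hsv_to_rgb(h, s, v)
    return tuple(round(channel * 255) for channel in (r2, g2, b2))


# Hypothetical pale-blue "sky" pixel, used only for illustration
sky_pixel = (120, 170, 220)
print(boost_saturation(sky_pixel))
```

A real editor would apply a rule like this only to the pixels identified as sky. Finding that region is the hard part, and it is where a model's understanding of the image comes in.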
MGIE's versatile design supports all kinds of image editing use cases, from simple adjustments to advanced manipulations.
MGIE can understand a wide range of natural language prompts for image editing, including basic prompts like:
Crop the image
Rotate the image
Apply a filter
It can also handle more complex tasks like "remove the background," "replace the sky," and "add a person to the photo."
The system can even understand ambiguous commands like "make it look better" and make appropriate edits based on the context of the image.
Benefits of natural language prompts over traditional image editing methods
Ease of Use: Natural language prompts make image editing more user-friendly. People can simply describe what they want, making the process more intuitive than traditional methods that require technical knowledge of editing software.
Enhanced Flexibility: With text-guided manipulation, users aren't limited to predefined tasks (like colorization or inpainting). They can ask for a wide range of edits, from simple adjustments to complex transformations, reflecting their specific needs more accurately.
More Accessible: By using natural language, these systems lower the barrier to advanced image editing, making powerful tools available to a wider audience without the need for specialized training.
Intuitive Interaction: The conversational approach of using language to guide edits creates a more natural and engaging user experience, making it easier for users to achieve their desired outcomes.
⚙️ Tools to Supercharge Your Productivity
5 Unique AI Tools To Try This Week
TikTok Watermark Remover: Save and enjoy your favorite TikTok videos offline, without any watermarks.
Klap: Save hours of editing by turning your long-form videos into Shorts, Reels, and TikTok videos in seconds.
Ubique: Record a single video message and personalize it for all your clients with minimal effort.
Elsa AI: Practice real-world English conversations and improve your communication skills with a smarter way to learn.
Gradient Music: Compose personalized music in your preferred genre with AI.
So, this is it.
We’ll be back next week with fresh updates on AI, which we think you’ll love.
Meanwhile, if you have something to tell us, we are all ears.
Have suggestions or questions for us? Reach out to us at social@pixelbin.io.
Follow us for everyday highlights on Twitter, LinkedIn, and Instagram. Join our PixelBin Discord community and engage in conversation with fellow AI enthusiasts.