Using AI tools as Productivity Enablers: Pieter’s list #whatAIride

“In early 2022, amidst the Covid-19 pandemic, I began using generative AI tools in the playground. To compensate for the lack of physical collaboration with my team, we developed an interactive experience featuring Isabella, an AI powered by GPT-3 (a predecessor of ChatGPT and GPT-4) and Synthesia, an AI video creation platform,” Pieter, our director of All Things Data, starts off.

So he clearly isn’t a novice when it comes to using AI tools to enhance his creativity and/or productivity. That’s why we were quick to ask him to share his list of favorite AI tools of the moment. This is what he hand-picked for you!

Pieter’s Playground Tools

These tools are used to experiment and have fun. When they prove their value, they’ll have the potential to move up a category and become true productivity enablers.

D-ID

Announcement of "TOTW 5 - The SE-AI-RCH Wars" using D-ID and ElevenLabs.

Rating: ****
Category: Video Creation
Link: https://www.d-id.com/

Description

D-ID utilizes generative AI to produce customized videos, featuring talking avatars with a simple click of a button. This technology is available to businesses and creators through the Creative Reality Studio, which incorporates the most up-to-date AI tools for generating talking avatars from images, audio, or text. Moreover, users can create videos from photos via the Live Portrait product or speaking head videos from text or audio through the Speaking Portrait product.

How I use it

It’s an interesting tool for reading out a social media post and turning it into a video. Combined with ElevenLabs (be patient, it’ll show up further down), you can even use your own voice. It’s still in my Playground for the same reason Synthesia is.

Synthesia

Rating: ****
Category: Video Creation
Link: https://www.synthesia.io

Description

Synthesia is an AI-powered video creation platform that enables businesses to quickly generate videos from plain text. It offers a web-based application with support for 65 languages, a user-friendly interface, 50+ fully customisable video templates, and an integrated screen recorder and media library.

How I use it

I utilised this tool to create a virtual interview. Although I am not an experienced video content creator, this tool may be beneficial in providing readers with videos on occasion. It is possible that it could become a productivity enhancer in the near future.

Pieter’s Productivity Enablers

I've come to realize that productivity enablers are not just tools that facilitate productivity. They increasingly serve as productivity enhancers, as they empower me to accomplish tasks I couldn't achieve without them.

Some of these tools come with a learning curve. For instance, when I first encountered Midjourney, I was captivated by the possibilities it presented. However, as my expectations grew, I found the results to be less satisfying. The more you work with these tools and familiarize yourself with their capabilities, though, the more gratifying the outcomes become. I continue to refine my skills with these tools on a daily basis.

It's worth noting a word of caution: once you become proficient in using these AI tools, you might find yourself falling down the rabbit hole, as their allure can be truly captivating!

Whisper

Rating: ****
Category: Automatic Speech Recognition / Speech-To-Text
Link: Introduction to Whisper & GitHub Repository

Description

Whisper is an open-source automatic speech recognition system that has been trained on 680,000 hours of multilingual and multitask supervised data obtained from the web. It is engineered to be resilient to accents, background noise and technical language, and can transcribe and translate speech in multiple languages into English.

This system uses an encoder-decoder Transformer architecture for its end-to-end approach. It can also perform language identification and produce phrase-level timestamps. Its design makes it easy to use with high accuracy, allowing developers to integrate voice interfaces into various applications.

How I use it

Unfortunately, commuting takes up a lot of precious time. So, besides listening to music to relax, I try to spend that time as efficiently as possible. Listening to podcasts is one way to do that. I also spend a lot of time watching videos. While both formats are very attractive, they are not handy when you want to capture the most important points or summarise what you just heard. For those cases, I use Whisper. Given the audio, it transcribes the whole podcast or video for me, and I can then use the transcription as if it were normal text.

What’s very nice (maybe that’s a remnant of being a little bit of a nerd) is that I’m able to run this locally on my Mac or Linux system. I still need to check whether it is possible to offload it to the Coral Edge TPU coprocessors I recently bought.
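For readers who want to try a local setup like the one described above, here is a minimal Python sketch. It assumes the open-source `openai-whisper` package and ffmpeg are installed. The file name `podcast.mp3`, the `base` model size and the helper names `format_timestamp`, `segments_to_text` and `transcribe_file` are illustrative assumptions, not details of Pieter’s own setup.

```python
from typing import Iterable, Mapping


def format_timestamp(seconds: float) -> str:
    """Render a number of seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def segments_to_text(segments: Iterable[Mapping]) -> str:
    """Turn Whisper-style segments into one timestamped line per phrase."""
    lines = []
    for segment in segments:
        start = format_timestamp(segment["start"])
        lines.append(f"[{start}] {segment['text'].strip()}")
    return "\n".join(lines)


def transcribe_file(path: str, model_size: str = "base") -> str:
    """Transcribe an audio file locally.

    Assumes `pip install openai-whisper` and a working ffmpeg install.
    """
    import whisper  # imported lazily so the helpers above work without it

    model = whisper.load_model(model_size)
    result = model.transcribe(path)
    # result["segments"] holds dicts with "start", "end" and "text" keys.
    return segments_to_text(result["segments"])


# Example call ("podcast.mp3" is a placeholder file name):
# print(transcribe_file("podcast.mp3"))
```

The timestamped lines make it easier to jump back to the relevant part of a podcast when summarising. Larger model sizes generally trade speed for accuracy.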

ChatGPT, GPT-3.5, GPT-4

ChatGPT improving the LinkedIn article "TOTW 8 - AI Consciousness, AI Sentience, AGI"

Rating: *****
Category: Large Language Model
Link: https://chat.openai.com/ (The chat interface just about everyone uses these days) & https://platform.openai.com/ (The API & Playground)

Description

Does it still need any introduction? I think that about half of the conscious world has been using ChatGPT for the last few months. You can start a conversation with it, ask it to write poems or complete stories, ask it for menus given the ingredients you have available. When you go deeper, you can of course start customising and embedding it in your own solutions. But that’s beyond the scope of this article.

How I use it

I utilise ChatGPT extensively, with GPT-4 as the backbone in certain instances. The quality of the latter is superior in some cases, although this could be a matter of opinion. I use it for summarisation, editing and enhancing articles and posts I write (as English is not my first language), generating posts and tweets from an article, and occasionally coming up with ideas or visual descriptions for Midjourney or Stable Diffusion. Additionally, I have access to the API (although I am still on the waiting list for access to GPT-4's API) and have several shortcuts on my iPhone using GPT-3.5 (ChatGPT). It is safe to say that Siri cannot compete!

Midjourney

Rating: **** (overall), ***** (image quality)
Category: Generative Art
Link: https://www.midjourney.com/

Description

Fantastic AI to create drawings and pictures starting from a prompt. The tool is accessed through a Discord interface. Some people hate it; I have come to appreciate it. The one thing truly lacking is an easily accessible API.

How I use it

I use Midjourney a lot! When I have some time left, I experiment with it, which leads to some ‘artwork’, although whether you can call it art is up for discussion. I don’t care: I have fun using it and, more than that, it gives me a new means to express myself or to present my ideas. For personal use, I have created artwork that adorns the walls of my home and that I often give as gifts on special occasions.

In a professional context, I used Midjourney to create visuals on the Data-theme we use for our Data Capability at AE: “Turning Data into Competitive Advantage”. We use it as background in Teams-calls, backgrounds for our screens, or slides in our presentations. For our internal kickoff, we set a target to generate every image in the keynote slide deck. Moreover, I set a target for myself to accompany every article I write with some AI-Generated art.

Stable Diffusion

Rating: **** (overall), **** (image quality)
Category: Generative Art
Link: https://stability.ai

Description

AI to create drawings and pictures starting from a prompt, very much like Midjourney. However, this is an open-sourced model. So, this tool is used in a myriad of other initiatives and even commercial tools. I did a local install on my Mac, and although it takes a lot of computation resources, I enjoy generating art on my computer. It is a little harder to use than Midjourney though.

How I use it

I use Stable Diffusion less than Midjourney. Currently, I use the InvokeAI frontend, but that may change to A1111 or Vlad Diffusion in the near future. Use cases are comparable to Midjourney’s.

Segment Anything

Selecting the mirror of JWST for the LinkedIn article "What's Your James Webb Space Telescope"

Rating: ****
Category: Image Manipulation
Link: https://segment-anything.com

Description

A new AI model from Meta AI that can "cut out" any object, in any image, with a single click.

How I use it

I only recently started using this tool, but it is a game changer. Until recently, I used Affinity Photo to cut out objects. Anyone who has done that knows it can be a tedious process. Segment Anything is fast, easy to use and frequently needed ... a true productivity enabler ;-)

Gigapixel AI

Scaling an image generated in Midjourney for printing

Rating: *****
Category: Image Manipulation
Link: https://www.topazlabs.com/gigapixel-ai

Description

An AI-powered application that upscales images by up to 600%, enhancing resolution and detail.

How I use it

Even upscaled images in Midjourney or Stable Diffusion have a relatively low resolution. That is not enough for a high-quality print or for large-screen projection. So I use Gigapixel AI every time I need a high-resolution image, whatever the reason. I also have the DeNoise AI and Sharpen AI applications (their names suggest what they do), and although I sometimes use them, I do so far less frequently than Gigapixel AI.
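To see why resolution matters for printing, a small worked calculation helps. The sketch below is illustrative only: the 2048 x 2048 pixel source image and the 300 DPI print target are assumptions, not figures from this article, and "600%" is read here as a 6x increase per side, which is also an assumption.

```python
from typing import Tuple


def upscale(width_px: int, height_px: int, percent: int = 600) -> Tuple[int, int]:
    """Return pixel dimensions after scaling each side by `percent`."""
    factor = percent / 100
    return (int(width_px * factor), int(height_px * factor))


def print_size_inches(width_px: int, height_px: int, dpi: int = 300) -> Tuple[float, float]:
    """Return the largest print size (in inches) at the given DPI."""
    return (round(width_px / dpi, 2), round(height_px / dpi, 2))


# Assumed source image: 2048 x 2048 pixels.
original = (2048, 2048)
enlarged = upscale(*original)  # 6x per side under the assumption above

print(print_size_inches(*original))  # size printable before upscaling
print(print_size_inches(*enlarged))  # size printable after upscaling
```

Under these assumptions the original image covers less than 7 inches per side at 300 DPI, while the upscaled version covers roughly 41 inches per side.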

ElevenLabs

Rating: *****
Category: Text-to-Speech
Link: https://beta.elevenlabs.io

Description

ElevenLabs' platform for generating long-form speech uses AI to create natural and compelling voices for creators and publishers.

How I use it

You can use pre-trained models on ElevenLabs. However, I trained one on my own voice, so I can now use it to read out posts. What I’m trying out now is using ElevenLabs to create a podcast from the articles I write. This would be impossible to accomplish without this AI. That’s why I categorised it as a productivity enabler, although I haven’t used it that much.

What’s next?

I demonstrated the use of AI tools to create music in a video. If there is sufficient interest, I may write a blogpost on how I produced "/IM-AI-GEN 01", my first piece of music using AI and generative tools.

And what’s next? I am currently exploring AutoGPT, Alpaca, and ControlNet, and anticipate that my toolset will have changed significantly within six months. Perhaps this could be a topic for a future blogpost?

If you would like to know more or discuss how you are using AI tools, reach out to me via LinkedIn.
