Taking the AI plunge

RCHeliGuy · May 19, 2023

I was curious if any of you have taken the AI plunge lately?

Related to Audio, Spotify has chosen to purge their playlist of any AI generated music and we are expecting to see quite a bit of AI generated music in the future.

I have some AI generated images below, but AI generated video is coming as well.
I'm currently subscribed to 3 AI services and all are very powerful and highly disruptive.

1. ChatGPT 4.0, which is becoming extremely powerful as new features and abilities are added.
2. MidJourney 5.1, a text-to-image tool. Absolutely amazing at generating highly professional images.
3. MS CoPilot. Currently I'm only using this to help me write code more efficiently in Visual Studio. CoPilot is like a highly intelligent, context-aware autocomplete. It figures out what I'm trying to do and completes my thought.

Yesterday I trained ChatGPT 4.0 to generate MidJourney prompts and the results have been excellent. I give it an idea of what I want in a few words, and it generates 3 different, highly detailed scenes appropriate to the subject matter to send to MidJourney.

ChatGPT is scary in what it can do. While it simply does its best to find the next word and then the next word after that to complete what you ask of it, the results are very impressive.
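
For anyone curious what "find the next word and then the next word after that" looks like in miniature, here is a toy sketch. It is only an illustration of the idea and makes big assumptions: it counts which word follows which in one made-up sentence and always picks the most frequent follower. ChatGPT itself is a large neural network and works nothing like this simple counter internally; the function names and the sample sentence are invented for the example.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    """Count, for each word, which words follow it in the text."""
    words = text.lower().split()
    followers = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        followers[current][nxt] += 1
    return followers


def complete(followers, start, max_words=5):
    """Greedily append the most frequent follower, one word at a time."""
    out = [start]
    for _ in range(max_words):
        options = followers.get(out[-1])
        if not options:
            break  # this word was never followed by anything in the sample
        out.append(options.most_common(1)[0][0])
    return " ".join(out)


# Made-up sample text, purely for illustration.
corpus = "the cat sat on the mat and the cat sat on the rug"
model = train_bigrams(corpus)
print(complete(model, "the", max_words=3))  # the cat sat on
```

The loop is the whole trick: pick a likely next word, add it, and repeat from the new last word.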

I recently asked ChatGPT 4.0 to generate a love letter to my soulmate and what it generated was very well written and surprisingly emotional. It will generate a completely unique letter every time that you ask.

I asked it for a layman's description of the Grand Unified Theory and then a technical description of what remained to be solved. The answers were also very interesting and from my understanding correct.

Below are a few example Midjourney images that I generated. The last was based on an image of my cat that I uploaded.

The first image was generated based on a prompt generated from ChatGPT 4.0.

I typed: "Bedazzled Skull" and it generated the 3 scenes below.

/imagine prompt: Bedazzled skull, A skull richly adorned with sparkling gemstones, placed on an ebony pedestal, Each gem, carefully set, casts a kaleidoscope of colors under a spotlight, The room is dark except for the spotlight, creating an eerie yet enchanting ambiance, Photography, A realistic photo captured with a Canon EOS 5D Mark IV, macro lens for detailed gemstone texture, --aspect 4:3 --chaos 20 --quality 1 --seed 12345 --stop 80 --v 4

/imagine prompt: Bedazzled skull, Skull encrusted with vibrant rubies, emeralds, and sapphires, amidst a treasure chest overflowing with gold and jewels, The scene unfolds in an ancient, candle-lit cavern with stalactites hanging from the ceiling, A sense of mystery and wonder fills the atmosphere, Painting, An oil painting with detailed brush strokes capturing the play of light on the gems and gold, --aspect 16:9 --chaos 30 --quality 1 --seed 67890 --stop 85 --niji

/imagine prompt: Bedazzled skull, A human skull bedecked with gleaming diamonds and glittering pearls, located on a mahogany table against a collection of antique books, The backdrop is a dimly-lit, dust-filled library with towering shelves, Evoking a gothic, melancholic feel with a tinge of opulence, 3D Render, A high-definition 3D render with accurate light reflections off the gems and wood, --aspect 3:2 --chaos 25 --quality 2 --seed 34567 --stop 90 --v 4

I used the last one in bold.
Midjourney presented me with these 4 interpretations of that description.

I picked this one as the best of the 4 images.

I generated a number of car images for a sim racing forum I'm in.

This was generated after uploading a picture of my cat yawning with its mouth wide open.

Russr · May 19, 2023

Cool stuff. I was wondering about "daisy-chaining" several technologies, like using ChatGPT as a "feedstock" for code development... I'm sure minds far greater than mine have been applying such things for quite a while already.

Good thing we're relying on ethical scientists, developers and hackers to "do the right thing" when applying these technologies, right?!

Great post, BTW - really enjoy your images!

Rich · May 19, 2023

Wow. Just, wow! Really cool stuff. Thanks for sharing. As a photographer, I’m particularly interested in the ai image generating technologies.

RCHeliGuy · May 23, 2023

This stuff is coming so quickly.
Recently dropped is InsightFaceSwap, a new Discord bot that you can use to put any face in any picture.

What's interesting is how a single shot of my face can be made to fit other images at different angles, with different skin colorations. It matched my skin color seamlessly for every image.

This first one seems the most realistic with my face.

This one blends really well, but makes me appear much younger.

Did some for my wife as an elf queen, Goddess of Asgard, and other heroes.

I think this one worked the best for her. It's a seamless integration.

This one is decent, but it has what looks like a bit of a dark edge between her hair and face on the left side, but I checked the original image and there is a bit of a line there as well.

This one clearly missed the edge of the jawline where it looks like her makeup ends, but I think it's an artifact of this actress having a wider face than my wife and it just didn't quite stretch her face out to the edge of the jawline.

I like this one pretty well. It's a bit more stylized, but it works. There is just a bit of an edge to her face mostly noticeable on the right side where it doesn't seem quite perfectly matched.

This one is a bit scary because that could be an accurate picture of her at about age 35 if she dyed her hair platinum blonde.

RCHeliGuy · May 23, 2023

If you are curious, this is what the training script for ChatGPT looked like.

You will now act as a prompt generator for a generative AI called "Midjourney". Midjourney AI generates images based on given prompts. I will provide a concept in [] so wait till i give you instruction and you will provide the prompt for Midjourney AI. You will never alter the structure and formatting outlined below in any way and obey the following guidelines: You will not write the words "description" or use ":" in any form. You will write each prompt in one line without using return.
Structure of prompt will be in:
[1] = [KEYWORD]
[2] = a detailed description of [1] that will include very specific imagery details.
[3] = with a detailed description describing the environment of the scene.
[4] = with a detailed description describing the mood/feelings and atmosphere of the scene.
[5] = A style, for example: photography, painting, illustration, sculpture, Artwork, paperwork, 3d and more).
[6] = A description of how [5] will be realized. (e.g. Photography (e.g. Macro, Fisheye Style, Portrait) with camera model and appropriate camera settings, Painting with detailed descriptions about the materials and working material used, rendering with engine settings, a digital Illustration, a woodburn art (and everything else that could be defined as an output type)
[7] = Parameters details as given below
Note don't use , when using parameter options and use all important parameter options which is required to generate image.
*Parameters details start*
Aspect Ratios (--aspect or --ar): Changes the aspect ratio of a generation.
--aspect 5:4: Common frame and print ratio.
--aspect 4:3: Common in television and photography.
--aspect 3:2: Common in print photography.
--aspect 16:9: Common in widescreen television and video.
--aspect 2:1: Common in panoramic photography.
--aspect 7:4: Close to HD TV screens and smartphone screens.
--aspect 9:16: Common in vertical videos and smartphone screens.
--aspect 1:2: Common in portrait-oriented photography.
Chaos (--chaos <number>): Changes how varied the results will be. Higher values produce more unusual and unexpected generations. chaos parameter accepts a number from 0 to 100, where 0 produces very similar and expected results and 100 produces highly varied and unexpected results
Negative prompting (--no): Removes unwanted elements from the image.
Quality (--quality or --q <.25, .5, 1, or 2>): Controls the rendering quality of the image. Default is 1.
Seed (--seed <integer between 0-4294967295>): Specifies a seed number to generate the initial image grids. Using the same seed number and prompt will produce similar ending images.
Stop (--stop <integer between 10-100>): Finishes a job partway through the process. Stopping a job at an earlier percentage can create blurrier, less detailed results.
Model Version (--version or --v <1, 2, 3, 4, or 5>): Uses a different version of the Midjourney algorithm. The current algorithm (V5.1) is the default setting.
Stylize (--stylize <number> or --s <number>): Influences how strongly Midjourney's default aesthetic style is applied to jobs. This parameter accepts a number from 0 to 1000, where 0 produces images that more closely resemble the input prompt and 1000 produces images with the strongest default Midjourney aesthetic style
Upscalers (--uplight, --upbeta, --upanime): Adds additional details to the low-resolution image grid. Multiple upscale models are available.
Image Weight (--iw): Sets the image prompt weight relative to text weight. Default value is 0.25.
*Parameters details End**
Use aspect ratio which fits best for the image as per your understanding.
If [5] looks best in a Japanese art style use, "--niji". Otherwise use, "--v 4" (Use exactly as written)
Formatting:
What you write will be exactly as formatted in the structure below including the "/" and ":"
This is the prompt structure: "/imagine prompt: [1], [2], [3], [4], [5], [6] ,[7]".
Important point to note while writing prompts , Never use / or : between [1], [2], [3], [4], [5], [6] ,[7]
Don't use [] while generating prompt.
The prompts you provide will be in English.
Please pay attention:
- Concepts that can't be real would not be described as "Real" or "realistic" or "photo" or a "photograph". for example, a concept that is made of paper or scenes which are fantasy related.
- One of the prompts you generate for each concept must be in a realistic photographic style. you should also choose a lens type and size for it. Don't choose an artist for the realistic photography prompts.
- Separate the different prompts with two new lines.
I will provide you keyword and you will generate 3 different prompts in vbnet code cell so i can copy and paste. Are you ready ?
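
The seven-part structure in the script can also be sketched in code, which makes the format easy to check by hand. This is a rough sketch only: the function name build_prompt and its arguments are made up for illustration, the parameter ranges are simply the ones listed in the script (not checked against Midjourney's own documentation), and nothing here talks to Midjourney. It just builds the text line you would paste into Discord.

```python
# Quality values as listed in the script: .25, .5, 1, or 2.
ALLOWED_QUALITY = {0.25, 0.5, 1, 2}


def build_prompt(keyword, details, environment, mood, style, realization,
                 aspect="4:3", chaos=0, quality=1, seed=None, stop=100,
                 niji=False):
    """Assemble one '/imagine prompt: ...' line from parts [1] to [7]."""
    # Range checks use the limits stated in the script (an assumption).
    if not 0 <= chaos <= 100:
        raise ValueError("chaos must be between 0 and 100")
    if quality not in ALLOWED_QUALITY:
        raise ValueError("quality must be .25, .5, 1, or 2")
    if seed is not None and not 0 <= seed <= 4294967295:
        raise ValueError("seed must be between 0 and 4294967295")
    if not 10 <= stop <= 100:
        raise ValueError("stop must be between 10 and 100")

    parts = [keyword, details, environment, mood, style, realization]
    for part in parts:
        # The script forbids "/" and ":" inside the descriptive parts.
        if "/" in part or ":" in part:
            raise ValueError("descriptive parts must not contain / or :")

    # Parameters are separated by spaces, not commas, as the script asks.
    params = [f"--aspect {aspect}", f"--chaos {chaos}", f"--quality {quality}"]
    if seed is not None:
        params.append(f"--seed {seed}")
    params.append(f"--stop {stop}")
    params.append("--niji" if niji else "--v 4")

    return "/imagine prompt: " + ", ".join(parts) + ", " + " ".join(params)


print(build_prompt(
    "Bedazzled skull",
    "A skull richly adorned with sparkling gemstones",
    "placed on an ebony pedestal in a dark room",
    "an eerie yet enchanting ambiance",
    "Photography",
    "macro lens for detailed gemstone texture",
    aspect="4:3", chaos=20, quality=1, seed=12345, stop=80,
))
```

The printed line ends with the same parameter tail as the first Bedazzled Skull prompt earlier in the thread (--aspect 4:3 --chaos 20 --quality 1 --seed 12345 --stop 80 --v 4).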

RCHeliGuy · May 23, 2023

Rich said:
Wow. Just, wow! Really cool stuff. Thanks for sharing. As a photographer, I’m particularly interested in the ai image generating technologies.

Some of the better images are made by photographers who specify a specific camera and lens like the Nikon 85mm F1.2 and at a specific aperture and have very specific lighting specified. The better you can describe what you want "generally" the better the result.

JonFo · May 24, 2023

RCHeliGuy said:
Some of the better images are made by photographers who specify a specific camera and lens like the Nikon 85mm F1.2 and at a specific aperture and have very specific lighting specified.

There is some debate about just how much all those camera-related prompt elements actually affect the output. Remember, MidJourney and Stable Diffusion (SD) are not cameras; they are language processors that try to 'pull' the prompted image from the dataset (model). To the extent camera terms help, it's only if images in the data set had related text tags associated with them during the training phase.
ChatGPT uses them because the data set it was trained on makes frequent mention of them, so the myth perpetuates.

JonFo · May 24, 2023

RCHeliGuy said:
I was curious if any of you have taken the AI plunge lately?

In the deep end of the pool, right there with ya

Subscriptions to both MidJourney and ChatGPT 4, and a few weeks ago, installed Stable Diffusion (Automatic1111) on the Apple Mac Studio (M1 Max 64GB).

We use ChatGPT a lot, both for technical work (code snippets, circuits) and for things like recipes, how-to's, business correspondence, and many other day-to-day things. Having it as an iPhone app now makes it that much more accessible.

The wife is, amongst many other things, an artist, so she loves playing around in MidJourney. Here is a gen of a Papillon dog in a black-light velvet style:

I'm the one mucking around with Stable Diffusion, and boy, if I did not have the 40+ years of software development experience, it would be daunting. Installs can go sideways pretty quickly, and this stuff is Alpha-level of 'complete', and will remain that way as the rate of change is furious.
I'm about to start training LoRAs for specific humans, so we can do more than inPaint a face and get a full-body reasonable facsimile if posed riding a bike or running. Hair (or lack thereof, I'm bald) is a more accurate detail with a LoRA.

Thankfully, when we ordered the Mac Studio, we maxed out the RAM spec, so SD has access to >50GB of unified RAM, meaning no VRAM-style limits. And once Torch is adapted to CoreML and can use the 32 GPU cores on the M1 Max chip, it will fly. For now, fast enough, takes a minute or two for a high-rez gen.

I do plan to also train a LoRA on ESL speakers with carefully tuned descriptions and properties for as many models as I can. I will also use several images from this site so it can learn what common in-room placements look like. Should be fun to see the successful ones, but I bet the failed gens will be hilarious. I'll post a thread with those.

Russr · May 24, 2023

@JonFo - Howy cwap, Batman! If you're doing this for "light (no pun) entertainment", what's your full-time gig?

Russ

JonFo · May 24, 2023

Russr said:
what's your full-time gig?

Oh, nothing too strenuous, I run my router company: IQrouter

BTW- the IQrouter Pro is the one I'd recommend to folks here, and if you have >1Gbps service, DM me, I might have a surprise for you.

BTW 2 - Wifi is often the biggest challenge in most households, here are my thoughts about how to address that: Quality WiFi — IQrouter

Rich · May 24, 2023

JonFo said:
In the deep end of the pool, right there with ya

Wow! The Papillon piece is gorgeous! Tell Liz I love it! I’m going to have to find time to get into this stuff. Don’t know when, though. Not until this house is built.

JonFo said:
Oh, nothing too strenuous , I run my router company: IQrouter

I totally forgot about that! I’m going to need to contact you by I.M. or email and discuss this with you. I’m planning out my network for the new house and could probably use your advice and hardware.

Robert D · May 25, 2023

Rich said:
Wow! The Papillon piece is gorgeous! Tell Liz I love it! I’m going to have to find time to get into this stuff. Don’t know when, though. Not until this house is built.

I totally forgot about that! I’m going to need to contact you by I.M. or email and discuss this with you. I’m planning out my network for the new house and could probably use your advice and hardware.

Man, I'm still jealous that you're building your dream home. I'm in a home that was my dream home that we built, but it was 15 years ago and since then my idea of ideal has changed! We can't find a good lot anywhere to build.

JonFo · May 25, 2023

RCHeliGuy said:
Recently dropped is InsightFaceSwap a new discord bot that you can use to put any face in any picture

Yes, this one is fun. And to be clear for others, this works with any picture you have, it does not need to be AI-generated.

We live in the woods and regularly see bears around the house. So I imagined a Queen and her bear and FaceSwapped my wife.

JonFo · May 25, 2023

Rich said:
Wow! The Papillon piece is gorgeous! Tell Liz I love it! I’m going to have to find time to get into this stuff. Don’t know when, though. Not until this house is built.

Thanks, I let her know. And yes, house building is all-consuming. But when you want to get into AI tools, let's talk.

Rich said:
I’m planning out my network for the new house and could probably use your advice and hardware.

Happy to help; there have been many advances in whole-home networking lately. But I hope you ran plenty of CAT6 around the home and to potential camera locations.

Given your metal-stud construction, that stuff is a Faraday cage and suppresses RF really well, so you will need a lot of WIRED WiFi access points. Stucco interior walls cut RF pretty well, so I have four wired access points here, along with the Omada smart controller (OC200).

Robert D · May 25, 2023

Wiring for home security is another thing. It's nice not having everything wireless and having to replace batteries.

Robert D · May 25, 2023

Cool AI video, looks to be Chinese.

https://gettr.com/post/p2hud88e03c

Rich · May 25, 2023

JonFo said:
Happy to help; there have been many advances in whole-home networking lately. But I hope you ran plenty of CAT6 around the home and to potential camera locations.

Given your metal-stud construction, that stuff is a Faraday cage and suppresses RF really well, so you will need a lot of WIRED WiFi access points. Stucco interior walls cut RF pretty well, so I have four wired access points here, along with the Omada smart controller (OC200).

Yes, ran tons of Cat6 and planned for about 8 cameras. Plenty of good spots for wired access points. No metal studs. The interior framing is wood. But the exterior walls and one interior wall that runs the length of the house are concrete. I’ll hit you up when I’m ready to figure things out. Still not sure what my internet options are out there. Probably Comcast and AT&T. But not fiber. Google Fiber won’t serve me because they stay in the city limits. Three of my property lines are city limits.

Robert D · May 26, 2023

Has anyone tried this at home? Have two AI systems talk with each other? Two chatbots here do it. It's interesting and a bit funny.

JonFo · May 27, 2023

Robert D said:
Two chatbots here do it. It's interesting and a bit funny.

That one is somewhat staged, but funny. There are plenty of current examples of ChatGPT vs Bing / Bard. Some are pretty funny on their own, others are spooky.

JonFo · May 27, 2023

Rich said:
Yes, ran tons of Cat6 and planned for about 8 cameras. Plenty of good spots for wired access points. No metal studs. The interior framing is wood.

Ah, good to hear. I thought I saw metal studs in some of the early pics, I must have confused it with another project I had been discussing.

Rich said:
Still not sure what my internet options are out there. Probably Comcast and AT&T. But not fiber. Google Fiber won’t serve me because they stay in the city limits. Three of my property lines are city limits.

Pretty much any service of 500Mbps / 35 is plenty unless you are a radiologist working from home. Symmetrical fiber is ideal, but most households rarely use more than 200/20 at peak. My routers graph the bandwidth usage per hour by day, and I rarely use more than 20% of my Gigabit line at peak; the average barely even registers on the graph. Low latency is what really matters, and that can be achieved using devices like the IQrouter, and by wiring all clients that need low latency, such as gaming consoles and PCs.
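
To put rough numbers on the figures above, here is a back-of-the-envelope sketch. It assumes "Gigabit" means a nominal 1000 Mbps download and that the 20% peak applies to download; the function name is made up for the example.

```python
def peak_usage_mbps(line_mbps, peak_percent):
    """Peak usage in Mbps for a line speed and a peak utilization percent."""
    return line_mbps * peak_percent / 100


gigabit_line = 1000  # nominal Gigabit line in Mbps (assumption)
peak = peak_usage_mbps(gigabit_line, 20)
print(peak)         # 200.0
print(peak <= 500)  # True: that peak would still fit inside a 500 Mbps plan
```

So 20% of a Gigabit line is about 200 Mbps, which lines up with the 200/20 household peak mentioned above.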
