Specific-Chicken5419 t1_jdrvy8m wrote on March 26, 2023 at 6:31 PM

#2,374,652

lol

Grecu69 t1_jds1x7q wrote on March 26, 2023 at 7:13 PM

#2,375,356

This looks like a slightly better version of siri imo

sumane12 t1_jds5lwr wrote on March 26, 2023 at 7:39 PM

#2,375,788

That delay kills me, far too long. I'm guessing GPT-5 will have to be multimodal with sound, so it can recognise words directly and doesn't need to convert speech into text first.

stupidcasey t1_jdsff4l wrote on March 26, 2023 at 8:48 PM

#2,377,035

Replying to sumane12 (#2,375,788)

I expect GPT-5 or 6 to be super multimodal, where they train it on anything and everything we have data for: audio, sure; video, of course; crossword puzzles, hell yeah; Pong, yup; car driving, why not. I think the only thing stopping us is that it takes too long, and we’ll have more processing power by then.

NWCoffeenut t1_jdsgb83 wrote on March 26, 2023 at 8:54 PM

#2,377,160

Replying to sumane12 (#2,375,788)

I think a good part of the latency was with the TTS system. The actual text response for the most part came back reasonably quickly.

Dwanyelle t1_jdsjgfb wrote on March 26, 2023 at 9:16 PM

#2,377,550

Yeah, I'd be surprised if we don't have something like that available publicly before the end of the year (if only because big tech is slow and unwieldy and things need to work their way through the proper paperwork).

pokeuser61 t1_jdskrfs wrote on March 26, 2023 at 9:26 PM

#2,377,722

Replying to sumane12 (#2,375,788)

If you ran this on the hardware that gpt5 will require, it wouldn’t have a delay.

pokeuser61 t1_jdskvem wrote on March 26, 2023 at 9:27 PM

#2,377,731

Replying to Dwanyelle (#2,377,550)

It is both public and open source

Dwanyelle t1_jdsmha6 wrote on March 26, 2023 at 9:38 PM

#2,377,917

Replying to pokeuser61 (#2,377,731)

I should clarify, it will be a packaged product from a big tech company.

I could do this, sure, I can putz around on computers a bit, but once you can just click an "install" button in the Microsoft store, that's it

illathon t1_jdsoud8 wrote on March 26, 2023 at 9:55 PM

#2,378,168

Replying to NWCoffeenut (#2,377,160)

No, most implementations of Whisper are slow.

micseydel t1_jdsr6vx wrote on March 26, 2023 at 10:13 PM

#2,378,452

Replying to Dwanyelle (#2,377,917)

Big tech will offer it as a service instead of a locally-running system. That will mean latency, increased data use, and other... differences 😅

Dwanyelle t1_jdss5uk wrote on March 26, 2023 at 10:21 PM

#2,378,603

Replying to micseydel (#2,378,452)

Oh, there will definitely be a ton of downsides, but convenience will not be one of them.

HarbingerDe t1_jdst2fh wrote on March 26, 2023 at 10:27 PM

#2,378,737

Replying to Grecu69 (#2,375,356)

It's a significantly better version of Siri.

GPT-4 can borderline pass the Turing Test and Siri can barely do... anything?

axidentalaeronautic t1_jdsuc3k wrote on March 26, 2023 at 10:37 PM

#2,378,910

YESSSS 😫 this has been my dream for years.

kevinzvilt t1_jdsv83o wrote on March 26, 2023 at 10:44 PM

#2,379,028

Replying to HarbingerDe (#2,378,737)

Me: Siri, set my alarm for 7am.

Siri: Here is a list of videos titled Tom Tom Solo by River Banks!

Sigma_Atheist t1_jdswekv wrote on March 26, 2023 at 10:53 PM

#2,379,184

Marvel is cringe. Can we use some other name to compare stuff like this to?

moonpumper t1_jdsxn21 wrote on March 26, 2023 at 11:02 PM

#2,379,362

I just want a screen free phone that's basically just Jarvis. Read my texts to me, look shit up for me, keep track of and make appointments for me, give me stock quotes, tell me the news, just don't suck me into an infinite scroll anymore. If I need to see something cast it to a screen in my house. Done with phone screens.

UnexpectedVader t1_jdsyi5s wrote on March 26, 2023 at 11:09 PM

#2,379,486

Replying to Sigma_Atheist (#2,379,184)

I’m much more in favour of HAL 9000.

Anjz t1_jdsynyi wrote on March 26, 2023 at 11:10 PM

#2,379,506

Replying to Sigma_Atheist (#2,379,184)

You mean you don't like MODOK?

How about we just name it Dan? Dan's a cool guy.

Tobislu t1_jdt046n wrote on March 26, 2023 at 11:21 PM

#2,379,693

Which means Ultron isn't far behind 👀

itsnotlupus t1_jdt280v wrote on March 26, 2023 at 11:37 PM

#2,379,978

Replying to illathon (#2,378,168)

Whisper is the speech recognition component.
I don't think he said what he's using for TTS; it might be macOS's built-in thingy.

itsnotlupus t1_jdt2igm wrote on March 26, 2023 at 11:39 PM

#2,380,033

Replying to sumane12 (#2,375,788)

The model text output is(/can be) a stream, so it ought to be possible to pipe that text stream into a warmed up TTS system and start getting audio before the text is fully generated.
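
A minimal sketch of that idea, assuming the model output arrives as an iterable of text chunks. The names `sentences_from_stream` and `speak` are hypothetical, and `speak` is only a placeholder for whatever TTS engine is actually in use, not a real API. The point is just that each sentence can be handed to TTS as soon as it is complete, instead of waiting for the whole response.

```python
import re
from typing import Iterable, Iterator

# Treat ., ! or ? followed by whitespace as a sentence boundary.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a text stream."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last one is a finished sentence.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        # The last part may still be growing, so keep it buffered.
        buffer = parts[-1]
    # Flush whatever is left once the model stops generating.
    if buffer.strip():
        yield buffer.strip()


def speak(sentence: str) -> None:
    # Hypothetical stand-in for a real (ideally already warmed-up) TTS call.
    print(f"[TTS] {sentence}")


if __name__ == "__main__":
    # Simulated token stream; a real one would come from the model.
    fake_model_output = ["Hel", "lo there. ", "How can ", "I help", " you today? ", "Bye"]
    for s in sentences_from_stream(fake_model_output):
        speak(s)
```

Splitting on sentence boundaries is one choice among several; smaller units (clauses, fixed word counts) would start audio sooner, at the cost of less natural prosody.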

SkyeandJett t1_jdt2zli wrote on March 26, 2023 at 11:43 PM

#2,380,108

Replying to moonpumper (#2,379,362)

That was my thought. No more phone. Just the smart watch.

averyminya t1_jdt33w0 wrote on March 26, 2023 at 11:44 PM

#2,380,126

It's using LLaMA and Alpaca

eggsnomellettes t1_jdt5dxl wrote on March 27, 2023 at 12:02 AM

#2,380,440

Replying to itsnotlupus (#2,379,978)

They're using ElevenLabs, which isn't local, hence a slow API call.

JDP87 t1_jdt6uhc wrote on March 27, 2023 at 12:13 AM

#2,380,624

Replying to kevinzvilt (#2,379,028)

At least you're getting an answer.

Working on that. Something went wrong. Please try again.

[deleted] t1_jdt7noy wrote on March 27, 2023 at 12:20 AM

#2,380,748

Replying to JDP87 (#2,380,624)

[deleted]

_dekappatated t1_jdt8e99 wrote on March 27, 2023 at 12:26 AM

#2,380,860

TIL there was a B programming language

[deleted] t1_jdta7a5 wrote on March 27, 2023 at 12:40 AM

#2,381,132

Samantha >>>>>>>>>>

InfoOnAI t1_jdtab19 wrote on March 27, 2023 at 12:41 AM

#2,381,147

I've been trying to set something similar up.

fuck_your_diploma t1_jdtdkjw wrote on March 27, 2023 at 1:08 AM

#2,381,596

Replying to Tobislu (#2,379,693)

Don’t tease me like this

Burgundy_and_Pearl t1_jdtf8ms wrote on March 27, 2023 at 1:22 AM

#2,381,861

Replying to Tobislu (#2,379,693)

As long as we don’t prompt it with Pinocchio.

darien_gap t1_jdthhsi wrote on March 27, 2023 at 1:40 AM

#2,382,211

I've been waiting for this since Apple's concept video in 1987: https://www.youtube.com/watch?v=umJsITGzXd0

the_funambule t1_jdtjah6 wrote on March 27, 2023 at 1:56 AM

#2,382,491

Replying to [deleted] (#2,381,132)

ChatGPT states Samantha is the most accurate representation of AI in movies

RedditLovingSun t1_jdtn0z9 wrote on March 27, 2023 at 2:28 AM

#2,382,977

Replying to sumane12 (#2,375,788)

From the title bar, it looks like he's using the Whisper API to transcribe his audio into a text query. That has to send an API request with the audio and wait for the text to come back over the internet. I'm sure a local speech-to-text transcriber would be considerably faster.

Edit: nvm, Whisper can be run locally, so he's probably doing that.

RedditLovingSun t1_jdtnafr wrote on March 27, 2023 at 2:30 AM

#2,383,012

Replying to moonpumper (#2,379,362)

Can't wait till we get there with a better Alpaca model + local transcription and audio generation + ChatGPT-style plugins for operating apps. All possible today; we just have to wait for it to be developed.

tortoise888 t1_jdtp8yj wrote on March 27, 2023 at 2:47 AM

#2,383,324

Replying to eggsnomellettes (#2,380,440)

If we eventually get open source Elevenlabs quality models running locally it's gonna be insane.

HesThePianoMan t1_jdtr4iy wrote on March 27, 2023 at 3:03 AM

#2,383,603

This is nothing special, just sounds like Google assistant

DaffyDuck t1_jdu15vr wrote on March 27, 2023 at 4:41 AM

#2,384,931

Replying to HarbingerDe (#2,378,737)

13b parameter llama is not as good as GPT4.

Genesis_Fractiliza t1_jdu2f6x wrote on March 27, 2023 at 4:55 AM

#2,385,087

Replying to tortoise888 (#2,383,324)

!remind me 1 month

RemindMeBot t1_jdu2nlm wrote on March 27, 2023 at 4:57 AM

#2,385,119

Replying to Genesis_Fractiliza (#2,385,087)

I will be messaging you in 1 month on 2023-04-27 04:55:12 UTC to remind you of this link

GoSouthYoungMan t1_jdu54fg wrote on March 27, 2023 at 5:26 AM

#2,385,385

Replying to _dekappatated (#2,380,860)

And before there was B, there was APL: A Programming Language. (This is not a joke.)

imlaggingsobad t1_jdu555l wrote on March 27, 2023 at 5:26 AM

#2,385,390

I'm like 100% certain that Apple, Google and Meta are making a JARVIS assistant that connects to AR glasses. It would be a revolutionary product and it's actually feasible imo.

SnipingNinja t1_jdv68wu wrote on March 27, 2023 at 1:07 PM

#2,390,214

Replying to SkyeandJett (#2,380,108)

I actually have a concept in mind. I don't have all the skills needed, but I will be learning things over the next few months. Hopefully I'm not too late by the time I've turned my idea into reality.

SnipingNinja t1_jdv6hr0 wrote on March 27, 2023 at 1:09 PM

#2,390,253

Replying to Burgundy_and_Pearl (#2,381,861)

strings?

SkyeandJett t1_jdv6nju wrote on March 27, 2023 at 1:11 PM

#2,390,277

Replying to SnipingNinja (#2,390,214)

This is probably just my anxiety but I feel like anything we think of or try to execute is going to be eclipsed before it can be realized. We're going to go overnight from this moment to indistinguishable from human androids and FDVR. This past couple of weeks has been overwhelming in the extreme.

[deleted] t1_jdv9fmu wrote on March 27, 2023 at 1:33 PM

#2,390,738

Replying to moonpumper (#2,379,362)

[deleted]

moonpumper t1_jdva8ow wrote on March 27, 2023 at 1:40 PM

#2,390,872

Replying to [deleted] (#2,390,738)

With ChatGPT-type stuff, how would it sound much different from a phone conversation? The whole idea is that the OS responds to natural language, like talking to a personal assistant or secretary.

[deleted] t1_jdvfk27 wrote on March 27, 2023 at 2:19 PM

#2,391,781

Replying to moonpumper (#2,390,872)

[deleted]

ebolathrowawayy t1_jdvfmrk wrote on March 27, 2023 at 2:19 PM

#2,391,792

Replying to eggsnomellettes (#2,380,440)

There's also Tortoise TTS which can be run locally but idk how fast it is.

SnipingNinja t1_jdvg55n wrote on March 27, 2023 at 2:23 PM

#2,391,880

Replying to SkyeandJett (#2,390,277)

You're right, but I think that issue isn't relevant here: having a locally running AI would be useful regardless of other innovations, and there's something to be said for the cyberpunkness of such a device.

GoldenRain t1_jdvlweg wrote on March 27, 2023 at 3:02 PM

#2,392,800

Replying to Dwanyelle (#2,377,917)

Like https://chat.d-id.com/ which already exists?

LevelWriting t1_jdvn9mt wrote on March 27, 2023 at 3:12 PM

#2,392,979

Replying to imlaggingsobad (#2,385,390)

I would give up my phone if I could replace it with AR.

czmax t1_jdwcbuq wrote on March 27, 2023 at 5:53 PM

#2,396,487

Replying to moonpumper (#2,379,362)

I was hoping that wearables (like a watch) could do this for me. Or at least force development in that direction.

(Seems to not be panning out… but I still have hope. I’d love to only carry a watch for most of my day. Initially I’d go through screen withdrawal, but in the long run I think life would be better.)

Drown_The_Gods t1_jdww8zc wrote on March 27, 2023 at 7:59 PM

#2,399,191

Replying to sumane12 (#2,375,788)

Use Talon Voice. The developer has their own engine that blows Whisper out of the water. Never worry about speed again. Don’t thank me, but do chuck them a few dollars if you find it useful.

_Alasdair t1_jdy9mfd wrote on March 28, 2023 at 1:48 AM

#2,406,617

I built something exactly like this back when the GPT-3 API came out. It was pretty cool, but I eventually got bored with it because it couldn't do anything. I tried hooking it up to external APIs to get real-world live data, but by the end everything was so complicated and slow that I gave up.

Hopefully with the GPT4 plugins we can now make something actually useful. It's gonna be awesome.
