Comments

Grecu69 t1_jds1x7q wrote

This looks like a slightly better version of Siri imo

5

sumane12 t1_jds5lwr wrote

That delay kills me, far too long. I'm guessing GPT-5 will have to be multimodal with sound, so it can recognise words directly and doesn't need to convert speech into text first.

69

stupidcasey t1_jdsff4l wrote

I expect GPT-5 or 6 to be super multimodal, where they train it on anything and everything we have data for: audio, sure; video, of course; crossword puzzles, hell yeah; Pong, yup; car driving, why not. I think the only thing stopping us is that it takes too long, and we'll have more processing power by then.

13

Dwanyelle t1_jdsjgfb wrote

Yeah, I'd be surprised if we don't have something like that available publicly before the end of the year (if only because big tech is slow and unwieldy and things need to work their way through the proper paperwork).

29

Dwanyelle t1_jdsmha6 wrote

I should clarify, it will be a packaged product from a big tech company.

I could do this, sure, I can putz around on computers a bit, but once you can just click an "install" button in the Microsoft Store, that's it.

13

Sigma_Atheist t1_jdswekv wrote

Marvel is cringe. Can we use some other name to compare stuff like this to?

−17

moonpumper t1_jdsxn21 wrote

I just want a screen-free phone that's basically just Jarvis. Read my texts to me, look shit up for me, keep track of and make appointments for me, give me stock quotes, tell me the news, just don't suck me into an infinite scroll anymore. If I need to see something, cast it to a screen in my house. Done with phone screens.

40

Tobislu t1_jdt046n wrote

Which means Ultron isn't far behind 👀

13

itsnotlupus t1_jdt2igm wrote

The model's text output is (or can be) a stream, so it ought to be possible to pipe that text stream into a warmed-up TTS system and start getting audio before the text is fully generated.
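
A rough sketch of that idea in Python, not tied to any particular model or TTS engine. Everything named here is hypothetical: `sentences_from_stream` and `speak_stream` are made-up helpers, `chunks` stands in for whatever token stream the model API gives you, and `speak` is a placeholder for whatever function hands text to your TTS system. The point is only that you can cut the stream at sentence boundaries and start speaking the first sentence while the rest is still being generated.

```python
import re
from typing import Callable, Iterable, Iterator

# Sentence boundary: '.', '!' or '?' followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_stream(chunks: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they show up in a text stream."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        parts = _SENTENCE_END.split(buffer)
        # Every part except the last one is a finished sentence.
        for sentence in parts[:-1]:
            if sentence:
                yield sentence
        # The last part may still be mid-sentence, so keep buffering it.
        buffer = parts[-1]
    # Flush whatever is left once the stream ends.
    if buffer.strip():
        yield buffer.strip()


def speak_stream(chunks: Iterable[str], speak: Callable[[str], None]) -> None:
    """Pass each finished sentence to `speak` (hypothetical TTS hook)."""
    for sentence in sentences_from_stream(chunks):
        speak(sentence)


if __name__ == "__main__":
    # Fake model output arriving in arbitrary pieces; print stands in for TTS.
    fake_stream = ["Hello the", "re. How are", " you? Fine"]
    speak_stream(fake_stream, print)
```

Cutting at sentence boundaries is just one choice; shorter chunks would lower latency further but tend to sound choppier with most TTS engines.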

3

InfoOnAI t1_jdtab19 wrote

I've been trying to set something similar up.

2

RedditLovingSun t1_jdtn0z9 wrote

From the title bar, it looks like he's using the Whisper API to transcribe his audio into a text query. That means sending an API request with the audio and waiting for the text to come back over the internet. I'm sure a local speech-to-text transcriber would be considerably faster.

Edit: nvm, Whisper can be run locally, so he's probably doing that.

4

RedditLovingSun t1_jdtnafr wrote

Can't wait till we get there with a better Alpaca model + local transcription and audio generation + ChatGPT-style plugins for operating apps. All of this is possible today; we just have to wait for it to be developed.

10

HesThePianoMan t1_jdtr4iy wrote

This is nothing special, it just sounds like Google Assistant

−4

imlaggingsobad t1_jdu555l wrote

I'm like 100% certain that Apple, Google and Meta are making a JARVIS assistant that connects to AR glasses. It would be a revolutionary product and it's actually feasible imo.

10

SkyeandJett t1_jdv6nju wrote

This is probably just my anxiety, but I feel like anything we think of or try to execute is going to be eclipsed before it can be realized. We're going to go overnight from this moment to androids indistinguishable from humans and FDVR. These past couple of weeks have been overwhelming in the extreme.

1

moonpumper t1_jdva8ow wrote

With ChatGPT-type stuff, how would it sound much different from a phone conversation? The whole idea is that the OS responds to natural language, like talking to a personal assistant or secretary.

1

SnipingNinja t1_jdvg55n wrote

You're right, but I think that issue isn't relevant here. Having a locally running AI would be useful regardless of other innovations, and there's something to be said for the cyberpunkness of such a device.

1

czmax t1_jdwcbuq wrote

I was hoping that wearables (like a watch) could do this for me. Or at least force development in that direction.

(Seems to not be panning out… but I still have hope. I’d love to only carry a watch for most of my day. Initially I’d go through screen withdrawal, but in the long run I think life would be better.)

1

Drown_The_Gods t1_jdww8zc wrote

Use Talon Voice. The developer has their own engine that blows Whisper out of the water. Never worry about speed again. Don’t thank me, but do chuck them a few dollars if you find it useful.

2

_Alasdair t1_jdy9mfd wrote

I built something exactly like this back when the GPT-3 API came out. It was pretty cool, but I eventually got bored with it because it couldn't do anything. I tried hooking it up to external APIs to get real-world live data, but by the end everything was so complicated and slow that I gave up.

Hopefully with the GPT-4 plugins we can now make something actually useful. It's gonna be awesome.

2