Submitted by minimaxir t3_11fbccz in MachineLearning

https://openai.com/blog/introducing-chatgpt-and-whisper-apis

> It is priced at $0.002 per 1k tokens, which is 10x cheaper than our existing GPT-3.5 models.

This is a massive, massive deal. For context, the reason GPT-3 apps took off over the past few months, before ChatGPT went viral, is that a) text-davinci-003 was released and was a significant performance increase, and b) the cost was cut from $0.06/1k tokens to $0.02/1k tokens, which made consumer applications feasible without a large upfront cost.

A much better model at 1/10th the cost completely warps the economics, to the point that it may be better than in-house finetuned LLMs.

I have no idea how OpenAI can make money on this. This has to be a loss-leader to lock out competitors before they even get off the ground.

574

Comments


LetterRip t1_jaj1kp3 wrote

> I have no idea how OpenAI can make money on this.

Quantizing to mixed int8/int4 - 70% hardware reduction and 3x speed increase compared to float16 with essentially no loss in quality.

A × 0.3 / 3 = 0.1A, i.e. 10% of the cost.

Switch from quadratic to memory efficient attention. 10x-20x increase in batch size.

So we're talking about it taking roughly 1% of the resources, against a 10x price reduction - they should be 90% more profitable per token compared to when they introduced GPT-3.

edit - see MS DeepSpeed-MII, which shows a 40x per-token cost reduction for BLOOM-176B vs the default implementation

https://github.com/microsoft/DeepSpeed-MII

Also, there are additional ways to reduce cost not covered above - pruning, graph optimization, teacher-student distillation. I think teacher-student distillation is extremely likely, given reports that it has difficulty with more complex prompts.
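For intuition, here is what the simplest form of int8 weight quantization looks like - a minimal "absmax" sketch in NumPy, purely illustrative rather than OpenAI's actual pipeline (production schemes like LLM.int8() also special-case outliers, and int4 typically needs per-group scales):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    # Symmetric "absmax" quantization: map [-max|w|, +max|w|] onto [-127, 127]
    scale = np.abs(w).max() / 127.0
    return np.round(w / scale).astype(np.int8), scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(4096, 4096).astype(np.float32)  # stand-in for one weight matrix
q, scale = quantize_int8(w)
print("storage vs fp16:", q.nbytes / (w.size * 2))  # 0.5 -> half the memory
print("mean abs error:", np.abs(w - dequantize(q, scale)).mean())
```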

252

Thunderbird120 t1_jajok9y wrote

I'm curious which memory efficient transformer variant they've figured out how to leverage at scale. They're obviously using one of them since they're offering models with 32k context but it's not clear which one.

59

lucidraisin t1_jakb7h4 wrote

67

Thunderbird120 t1_jakbyew wrote

You're better qualified to know than nearly anyone who posts here, but is flash attention really all that's necessary to make that feasible?

24

lucidraisin t1_jakdtf7 wrote

yes

edit: it was also used to train LLaMA. there is no reason not to use it at this point, for both training and fine-tuning / inference

46

fmai t1_jalcs0x wrote

AFAIK, flash attention is just a very efficient implementation of attention, so still quadratic in the sequence length. Can this be a sustainable solution for when context windows go to 100s of thousands?

14

lucidraisin t1_jamtx7b wrote

it cannot, the compute still scales quadratically although the memory bottleneck is now gone. however, i see everyone training at 8k or even 16k within two years, which is more than plenty for previously inaccessible problems. for context lengths at the next order of magnitude (say genomics at million basepairs), we will have to see if linear attention (rwkv) pans out, or if recurrent + memory architectures make a comeback.

14

LetterRip t1_janljeo wrote

Ah, I'd not seen the Block Recurrent Transformers paper before, interesting.

3

visarga t1_jalg9iu wrote

I think the main pain point was memory usage.

6

Dekans t1_jamokhr wrote

> We also extend FlashAttention to block-sparse attention, yielding an approximate attention algorithm that is faster than any existing approximate attention method.

...

> FlashAttention and block-sparse FlashAttention enable longer context in Transformers, yielding higher quality models (0.7 better perplexity on GPT-2 and 6.4 points of lift on long-document classification) and entirely new capabilities: the first Transformers to achieve better-than-chance performance on the Path-X challenge (seq. length 16K, 61.4% accuracy) and Path-256 (seq. length 64K, 63.1% accuracy).

In the paper, the bolded results use the block-sparse version. The Path-X result (seq. length 16K) uses regular FlashAttention.

4

Hsemar t1_jalp8as wrote

but does flash attention help with auto-regressive generation? My understanding was that it prevents materializing the large QK^T attention matrix during training. At inference (one token at a time) with KV caching, this shouldn't be that relevant, right?
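(For readers following along, a toy sketch, purely illustrative, of the KV caching being referred to: each decode step attends one new query against all cached keys/values, so no full L x L attention matrix is ever materialized at inference.)

```python
import torch

d = 64                     # head dimension
k_cache, v_cache = [], []  # grows by one entry per generated token

for step in range(8):
    q = torch.randn(1, d)              # query for the newest token only
    k_cache.append(torch.randn(1, d))  # in a real model these come from the layer
    v_cache.append(torch.randn(1, d))
    K, V = torch.cat(k_cache), torch.cat(v_cache)      # (step+1, d) each
    # One row of attention scores per step, not an (L, L) matrix:
    out = torch.softmax(q @ K.T / d**0.5, dim=-1) @ V  # (1, d)
```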

0

minimaxir OP t1_jajcf4s wrote

It's safe to assume that some of those techniques were already used in previous iterations of GPT-3/ChatGPT.

30

LetterRip t1_jajezib wrote

June 11, 2020 is the date the GPT-3 API was introduced. There was no int4 support, and the Ampere architecture with int8 support had been introduced only weeks prior. So the pricing was set based on float16 hardware.

Memory efficient attention is from a few months ago.

ChatGPT was just introduced a few months ago.

The question was how OpenAI could be making a profit. If they were making a profit on GPT-3's 2020 pricing, then they should be making 90% more profit per token on the new pricing.

52

jinnyjuice t1_jalkbvu wrote

How do we know these technical improvements result in 90% extra revenue? I feel I'm missing some link here.

0

Smallpaul t1_jam673c wrote

I think you are using the word revenue when you mean profit.

3

LetterRip t1_jani50o wrote

We don't know the supply demand curve, so we can't know for sure that the revenue increased.

1

andreichiffa t1_jajuk03 wrote

That, and the fact that OpenAI/MS want to completely dominate LLM market, in the same way Microsoft dominated OS/browser market in the late 90s/early 2000s.

23

Smallpaul t1_jam6et8 wrote

They’ll need a stronger story around lock-in if that’s their strategy. One way would be to add structured and unstructured data storage to the APIs.

5

bjergerk1ng t1_jakszgr wrote

Is it possible that they also switched from non-Chinchilla-optimal davinci to a Chinchilla-optimal ChatGPT? That would be at least 4x smaller.

8

LetterRip t1_jal4y8i wrote

Certainly that is also a possibility. Or they might have done teacher student distillation.

6

[deleted] t1_jamt0wc wrote

[deleted]

8

Pikalima t1_janc14v wrote

I’d say we need an /r/VXJunkies equivalent for statistical learning theory, but the real deal is close enough.

4

cv4u t1_jakzhqj wrote

LLMs can quantize to 8 bit or 4 bit?

4

LetterRip t1_jal4vgs wrote

Yep, or a mix between the two.

GLM-130B quantized to int4; OPT and BLOOM to int8:

https://arxiv.org/pdf/2210.02414.pdf

Often you'll want to keep the first and last layers as int8 and can do everything else in int4. You can quantize based on each layer's sensitivity, etc. I also (vaguely) recall a mix of 8 bits for weights and 4 bits for biases (or vice versa?).

Here is a survey on quantization methods; for mixed int8/int4, see section IV, "Advanced Concepts: Quantization Below 8 Bits":

https://arxiv.org/pdf/2103.13630.pdf

Here is a talk on auto48 (automatic mixed int4/int8 quantization)

https://www.nvidia.com/en-us/on-demand/session/gtcspring22-s41611/

11

londons_explorer t1_jam6oyr wrote

Aren't biases only a tiny tiny fraction of the total memory usage? Is it even worth trying to quantize them more than weights?

7

CellWithoutCulture t1_javhjpc wrote

I mean... why were they not doing this already? They would have to code it but it seems like low hanging fruit

> memory efficient attention. 10x-20x increase in batch size.

That seems large, which paper has that?

1

LetterRip t1_javpxbv wrote

> I mean... why were they not doing this already? They would have to code it but it seems like low hanging fruit

GPT-3 came out in 2020 (they had their initial price then a modest price drop early on).

Flash attention is June of 2022.

Quantization is something we've only recently figured out how to do fairly losslessly (especially int4). Tim Dettmers' LLM.int8() is from August 2022.

https://arxiv.org/abs/2208.07339

> That seems large, which paper has that?

See

https://github.com/HazyResearch/flash-attention/raw/main/assets/flashattn_memory.jpg

>We show memory savings in this graph (note that memory footprint is the same no matter if you use dropout or masking). Memory savings are proportional to sequence length -- since standard attention has memory quadratic in sequence length, whereas FlashAttention has memory linear in sequence length. We see 10X memory savings at sequence length 2K, and 20X at 4K. As a result, FlashAttention can scale to much longer sequence lengths.

https://github.com/HazyResearch/flash-attention
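For anyone who wants to try these kernels without pulling in the repo above: PyTorch 2.0 ships fused flash / memory-efficient attention behind torch.nn.functional.scaled_dot_product_attention. A sketch (requires a CUDA GPU; which fused kernel actually runs depends on your hardware and dtype):

```python
import torch
import torch.nn.functional as F

B, H, L, D = 4, 16, 4096, 64  # batch, heads, sequence length, head dim
q = torch.randn(B, H, L, D, device="cuda", dtype=torch.float16)
k, v = torch.randn_like(q), torch.randn_like(q)

# Naive attention would materialize a (B, H, L, L) score matrix, ~2 GiB in
# fp16 at these shapes; the fused kernels never build it, so activation
# memory grows linearly in L.
with torch.backends.cuda.sdp_kernel(enable_flash=True, enable_mem_efficient=True,
                                    enable_math=False):
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
```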

1

CellWithoutCulture t1_javqw9s wrote

Fantastic reply, it's great to see all those concrete advances that made it into prod. Thanks for sharing.

1

harharveryfunny t1_jairuhd wrote

It says they've cut their costs by 90%, and are passing that saving onto the user. I'd have to guess that they are making money on this, not just treating it as a loss-leader for other more expensive models.

The way the API works is that you have to send the entire conversation each time, and the tokens you will be billed for include both those you send and the API's response (which you are likely to append to the conversation and send back to them, getting billed again and again as the conversation progresses). By the time you've hit the 4K token limit of this API, there will have been a bunch of back and forth - you'll have paid a lot more than 4K * 0.2c/1K for the conversation. It's easy to imagine chat-based API's becoming very widespread and the billable volume becoming huge. OpenAI are using Microsoft Azure compute, who may see a large spike in usage/profits out of this.
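To make that billing pattern concrete, a back-of-the-envelope sketch (the 150-token turn size is an assumption, purely for illustration):

```python
PRICE_PER_TOKEN = 0.002 / 1000  # gpt-3.5-turbo pricing
TURN = 150                      # assumed tokens per user message and per reply

billed, history = 0, 0
while history + 2 * TURN <= 4096:  # stop when the 4K context window is full
    prompt = history + TURN        # the whole conversation is resent each turn
    billed += prompt + TURN        # both prompt and completion are billable
    history = prompt + TURN

print(f"final context: {history} tokens; "
      f"total billed: {billed} tokens (${billed * PRICE_PER_TOKEN:.3f})")
```

Under those assumptions the conversation ends at ~3.9K tokens of context but bills ~27K tokens along the way - roughly 7x the final context size, and still only about five cents at the new price.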

It'll be interesting to see how this pricing, and that of competitors evolves. Interesting to see also some of OpenAI's annual price plans outlined elsewhere such as $800K/yr for their 8K token limit "DV" model (DaVinci 4.0?), and $1.5M/yr for the 32K token limit "DV" model.

69

luckyj t1_jajaz53 wrote

But that (sending all or part of the conversation history) is exactly what we had to do with text-davinci if we wanted to give it some type of memory. It's the same thing in a different format, at 10% of the price... And having tested it, it's more like ChatGPT ("I'm sorry, I'm a language model"-type replies), which I'm not very fond of. But the price... hard to resist. I've just ported my bot to this new model and will play with it for a few days.

24

currentscurrents t1_jajg818 wrote

> It says they've cut their costs by 90%

Honestly this seems very possible. The original GPT-3 made very inefficient use of its parameters, and since then people have come up with a lot of ways to optimize LLMs.

16

visarga t1_jaj4bqs wrote

> $1.5M/yr

The inference cost is probably 10% of that.

5

xGovernor t1_jaksopw wrote

Oh boy what I got away with. I have been using hundreds of thousands of tokens, augmenting parameters and only ever spent 20 bucks. I feel pretty lucky.

3

Im2bored17 t1_jam6y5y wrote

$20.00 / ($0.002 / 1K tokens) = 10M tokens. If you only used a few hundred K, you got scammed hard lol

8

xGovernor t1_jasx7r9 wrote

You needed the secret API key, included with the Plus edition. Prior to Whisper I don't believe you could obtain a secret key. It also gave early access to new features and got me turbo on day one. Also, I've used it much more and got turbo to work with my Plus subscription.

Had to find a workaround. Don't feel scammed. Plus I've been having too much fun with it.

1

Educational-Net303 t1_jair4wf wrote

Definitely a loss-leader to cut off Claude/Bard; electricity alone would cost more than that. Expect a rise in price in 1 or 2 months.

68

JackBlemming t1_jaisvp4 wrote

Definitely. This is so they can become entrenched and collect massive amounts of data. It also discourages competition, since they won't be able to compete against these artificially low prices. This is not good for the community. This would be equivalent to opening up a restaurant and giving away food for free, then jacking up prices when the adjacent restaurants go bankrupt. OpenAI are not good guys.

I will rescind my comment and personally apologize if they release ChatGPT code, but we all know that will never happen, unless they have a better product lined up.

68

jturp-sc t1_jaj45ek wrote

The entry costs have always been so high that LLMs as a service was going to be a winner-take-most marketplace.

I think the best hope is to see other major players enter the space either commercially or as FOSS. I think the former is more likely, and I was really hoping that we would see PaLM on GCP or even something crazier like a Meta-Amazon partnership for LLaMA on AWS.

Unfortunately, I don't think any of those orgs will pivot fast enough until some damage is done.

27

badabummbadabing t1_jajdjmr wrote

Honestly, I have become a lot more optimistic regarding the prospect of monopolies in this space.

When we were still in the phase of 'just add even more parameters', the future seemed to be headed that way. With Chinchilla scaling (and looking at results of e.g. LLaMA), things look quite a bit more optimistic. Consider that ChatGPT is reportedly much lighter than GPT-3. At some point, the availability of data will be the bottleneck (which is where an early entry into the market can help with collecting said data), whereas compute will become cheaper and cheaper.

The training costs lie in the low millions ($10M was the cited number for GPT-3), which is a joke compared to the startup costs of many, many industries. So while this won't be something that anyone can train, I think it's more likely that there will be a few big players (rather than a single one) going forward.

I think one big question is whether OpenAI can leverage user interaction for training purposes -- if that is the case, they can gain an advantage that will be much harder to catch up to.

24

farmingvillein t1_jajw0yj wrote

> The training costs lie in the low millions (10M was the cited number for GPT3), which is a joke compared to the startup costs of many, many industries. So while this won't be something that anyone can train, I think it's more likely that there will be a few big players (rather than a single one) going forward.

Yeah, I think there are two big additional unknowns here:

  1. How hard is it to optimize inference costs? If--for sake of argument--for $100M you can drop your inference unit costs by 10x, that could end up being a very large and very hidden barrier to entry.

  2. How much will SOTA LLMs really cost to train in, say, 1-2-3 years? And how much will SOTA matter?

The current generation will, presumably, get cheaper and easier to train.

But if it turns out that, say, multimodal training at scale is critical to leveling up performance across all modes, that could jack up training costs really, really quickly--e.g., think the costs to suck down and train against a large subset of public video. Potentially layer in synthetic data from agents exploring worlds (basically, videogames...), as well.

Now, it could be that the incremental gains to, say, language are not that high--in which case the LLM (at least as these models exist right now) business probably heavily commoditizes over the next few years.

9

Derpy_Snout t1_jajfxrw wrote

> This would be equivalent to opening up a restaurant and giving away food for free, then jacking up prices when the adjacent restaurants go bankrupt.

The good old Walmart strategy

17

VertexMachine t1_jajjq8b wrote

Yeah, but one thing is not adding up: it's not like I can go to a competitor and get access to an API of similar quality.

Plus, if it's a price war... with Google... that would be stupid. Even with Microsoft's money, Alphabet Inc. is not someone you want to get into a price-undercutting war with.

Also, they updated their policies on using users' data, so the data-gathering argument doesn't seem valid either (if you trust them).


Edit: ah, btw, I'm not saying that there is no ulterior motive here. I don't really trust "Open"AI since the "GPT-2-is-too-dangerous-to-release" BS (and the corporate restructuring). Just that I don't think it's that simple.

13

farmingvillein t1_jajtmly wrote

> Plus if it's a price war... with Google.. that would be stupid

If it is a price war strategy...my guess is that they're not worried about Google.

Or, put another way, if it is Google versus OpenAI, openai is pretty happy about the resulting duopoly. Crushing everyone else in the womb, though, would be valuable.

11

astrange t1_jajpps3 wrote

"They're just gathering data" is literally never true. That kind of data isn't good for anything.

−5

Purplekeyboard t1_jajcnb5 wrote

> This is not good for the community.

When GPT-3 first came out and prices were posted, everyone complained about how expensive it was, and that it was prohibitively expensive for a lot of uses. Now it's too cheap? What is the acceptable price range?

6

JackBlemming t1_jajg4dz wrote

It's not about the price, it's about the strategy. The Google Maps API was dirt cheap so nobody competed; then they cranked up prices 1400% once they had years of advantage and market lock-in. That's not OK.

If OpenAI keeps prices stable, nobody will complain, but this is likely a market capturing play. They even said they were losing money on every request, but maybe that's not true anymore.

18

Beli_Mawrr t1_jajvgax wrote

I use the API as a dev. I can say that if Bard works anything like OpenAI, it will be super easy to switch.

5

[deleted] t1_jajgqsv wrote

[deleted]

3

bmc2 t1_jajjjvd wrote

Training based on submitted data is going to be curtailed according to their announcement:

“Data submitted through the API is no longer used for service improvements (including model training) unless the organization opts in”

5

lostmsu t1_jaj0dw2 wrote

I would love an electricity estimate for running GPT-3-sized models with optimal configuration.

According to my own estimate, the electricity cost over the lifetime (~5y) of a 350W GPU is between $1k and $1.6k, which means that for enterprise-class GPUs, electricity is dwarfed by the cost of the GPU itself.
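The arithmetic behind that range, with the electricity price as the stated assumption:

```python
kwh = 0.350 * 24 * 365 * 5          # a 350W GPU running flat-out for 5 years: ~15,330 kWh
for usd_per_kwh in (0.065, 0.105):  # assumed range of electricity prices
    print(f"${usd_per_kwh}/kWh -> ${kwh * usd_per_kwh:,.0f}")
# ~$1,000 to ~$1,600, versus five figures for an enterprise-class GPU
```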

14

currentscurrents t1_jajfjr5 wrote

Problem is we don't actually know how big ChatGPT is.

I strongly doubt they're running the full 175B model; you can prune/distill a lot without affecting performance.

11

MysteryInc152 t1_jal7d3p wrote

Distillation doesn't work for token-predicting language models, for some reason.

3

currentscurrents t1_jalajj3 wrote

DistilBERT worked though?

2

MysteryInc152 t1_jalau7e wrote

Sorry, I meant the really large-scale models. Nobody has gotten a GPT-3/Chinchilla-scale model to actually distill properly.

6

harharveryfunny t1_jaj8bk2 wrote

Could you put any numbers to that?

What are the FLOPs per token of inference for a given prompt length (for a given model)?

What do those FLOPs translate to in terms of run time on Azure's GPUs (V100s?)?

What are the GPU power consumption and data center electricity costs?

Even with these numbers, can we really relate this to their $/token pricing scheme? The pricing page mentions this 90% cost reduction being for the "gpt-3.5-turbo" model vs the earlier text-davinci-003 (?) one - do we even know the architectural details to get the FLOPs?

4

WarProfessional3278 t1_jaj9nnt wrote

Rough estimate: with one 400W GPU and $0.14/hr electricity, you are looking at ~$0.00016/sec here. That's the price for running the GPU alone, not accounting for server costs etc.

I'm not sure if there are any reliable estimates of FLOPs per token of inference, though I will be happy to be proven wrong :)

3

bmc2 t1_jajj03y wrote

They raised $10B. They can afford to eat the costs.

2

Smallpaul t1_jam6mjl wrote

1 or 2 months??? How would that short a time achieve the goal against well-funded competitors?

It would need to be multiple years of undercutting, and even that might not be enough to lock Google out.

2

WarAndGeese t1_jalq339 wrote

Don't let it demotivate competitors. They are making money somehow, and planning to make massive amounts more. Hence the space is ripe for tons of competition, and those other companies would also be on track to make tons of money. So jump in, competitors: the market is waiting for you.

−1

Smallpaul t1_jam7abr wrote

> Don't let it demotivate competitors. They are making money somehow,

What makes you so confident?

2

MonstarGaming t1_japbd46 wrote

>They are making money somehow

Extremely doubtful. Microsoft went in for $10B at a $29B valuation. We have seen pre-revenue companies IPO for far more than that. Microsoft's $10B deal is probably the only thing keeping them afloat.

>Hence the space is ripe for tons of competition

I think you should look up which big tech companies already offer chatbots. You'll find the space is already very competitive. Sure, they aren't large, generative language models, but they target the B2C market that ChatGPT is attempting to compete in.

1

[deleted] t1_jak0est wrote

[removed]

31

elsrda t1_jak6drt wrote

Indeed, at least not for now.

EDIT: source

26

[deleted] t1_jak7jf3 wrote

[removed]

42

qqYn7PIE57zkf6kn t1_japrx5u wrote

What does system message mean?

1

earslap t1_jb0qamw wrote

When you feed messages into the API, there are different "roles" to tag each message with ("assistant", "user", "system"). So you provide the content and tell it which "role" the content comes from. The model continues from there in the "assistant" role. There is a token limit (set by the model), so if your conversation exceeds it (the combined token size of all roles), you'll need to inject the salient context from the conversation using the appropriate role.
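A minimal sketch of what that looks like with the openai Python package as of this announcement (the key and message contents are placeholders):

```python
import openai

openai.api_key = "sk-..."  # placeholder

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a terse, helpful assistant."},
        {"role": "user", "content": "What does a system message do?"},
        # To continue a conversation, append the prior "assistant" reply and
        # the next "user" message here; the full list is resent on every call.
    ],
)
print(response["choices"][0]["message"]["content"])
```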

2

jturp-sc t1_jaj2w4j wrote

Glad to see them make ChatGPT accessible via API and go back to update their documentation to be more clear on which model is which.

I had an exhausting number of conversations with confused product managers, engineers and marketing managers on "No, we're not using ChatGPT".

19

ShowerVagina t1_jamiqb4 wrote

> I had an exhausting number of conversations with confused product managers, engineers and marketing managers on “No, we’re not using ChatGPT”.

They use your conversations for further training which means if you use it to help you with proprietary code or documentation, you're effectively disclosing that.

1

---AI--- t1_jamo555 wrote

OpenAI updated their page to promise they will stop doing that.

2

ShowerVagina t1_jamts00 wrote

Is that for everyone or just API/Enterprise users?

3

---AI--- t1_jasgezh wrote

I only saw it mentioned in the context of API/Enterprise users.

2

Timdegreat t1_jaj3gpr wrote

Will we be able to generate embeddings using the ChatGPT API?

9

visarga t1_jaj4lxx wrote

Not this time. Still text-embedding-ada-002

9

NoLifeGamer2 t1_jaj9i1b wrote

Gotta love getting those "Model currently busy" errors for only a single request

7

sebzim4500 t1_jan01xr wrote

Would you even want to? Sounds like overkill to me, but maybe I am missing some use case of the embeddings.

2

Timdegreat t1_jan7sel wrote

You can use the embeddings to search through documents. First, create embeddings of your documents. Then create an embedding of your search query. Do a similarity measurement between the document embeddings and the search embedding. Surface the top N documents.
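A minimal sketch of that flow, using the text-embedding-ada-002 endpoint mentioned elsewhere in the thread (the documents and query are made up):

```python
import numpy as np
import openai

def embed(texts):
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return np.array([item["embedding"] for item in resp["data"]])

docs = ["How to cancel a subscription", "Our refund policy", "Shipping times"]
doc_vecs = embed(docs)
query_vec = embed(["I want my money back"])[0]

# ada-002 vectors are unit length, so a dot product is cosine similarity
scores = doc_vecs @ query_vec
for i in np.argsort(scores)[::-1][:2]:  # surface the top-2 documents
    print(f"{scores[i]:.3f}  {docs[i]}")
```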

1

sebzim4500 t1_jan85s7 wrote

Yeah, I get that embeddings are used for semantic search, but would you really want to use a model as big as ChatGPT to compute the embeddings? (Given how cheap and effective Ada is)

2

Timdegreat t1_jangbi7 wrote

You got a point there! I haven't given it too much thought really -- I def need to check out ada.

But wouldn't the ChatGPT embeddings still be better? Given that they're cheap, why not use the better option?

1

farmingvillein t1_japqcq1 wrote

> But wouldn't the ChatGPT embeddings still be better? Given that they're cheap, why not use the better option?

Usually, to get the best embeddings, you need to train them somewhat differently than you do a "normal" LLM. So ChatGPT may not(?) be "best" right now, for that application.

2

londons_explorer t1_jam8409 wrote

It was an interesting business decision to make a blog post announcing two rather different products (ChatGPT API and Whisper) at the same time...

ChatGPT is a best-in-class, or even only-in-class, chatbot API, while Whisper is one of many hosted speech-to-text solutions.

4

harharveryfunny t1_jamab7m wrote

The two pair up very well though - now that there's a natural language API, you could leverage that for speech->text->ChatGPT. From what I've seen of the Whisper demos, it seems to be the best out there by quite a margin. Does anything else perform as well?

4

fasttosmile t1_janaaex wrote

GCP, Speechmatics, Rev, Otter.ai, AssemblyAI, etc. offer similar or better performance, as well as streaming and much richer output.

5

MonstarGaming t1_jap8605 wrote

That seems to be the gist of this entire thread. This is the first API most of /r/machinelearning have heard of, so it must be the best on the market. /s

To your point, there are companies who have been developing speech-to-text for decades. The capability is so unremarkable that most (all?) cloud providers have a speech-to-text offering already and it easily integrates with their other services.

I know this is a hot take, but I don't think OpenAI has a business strategy. They're deploying expensive models that directly compete with entrenched, big tech companies. They can't be thinking they're going to take market share away from GCP, AWS, Azure with technologies that all three offer already, right? Right???

1

fasttosmile t1_japaes4 wrote

To be fair, they are technically very competent and the pricing is very cheap. And their marketing is great.

But yeah dealing with B2B customers (where the money is) and integrating feedback from them is a very different thing than what they've been doing so far. They might be angling to serve as a platform for AI companies that then have to deal with average customers. That way they get to only deal with people who understand the limitations of AI. Could work. Will change the company to be less researchy though.

1

soobardo t1_japo5w5 wrote

Yes, they pair up perfectly. Whisper detects anything I babble at it, English or French, and it's surprisingly fast. I've wrapped a loop that:

listens to the mic -> Whisper STT -> ChatGPT -> language detect -> Google TTS -> speaker

With noise/silence detection, it's a complete hands-off experience, like chatting with a real person. Delay is ~ 5s for all calls. "Glueing" the APIs is straightforward and intuitive.
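For the curious, a single-turn sketch of such a loop, assuming the mic audio is already recorded to a file; gTTS and langdetect here are stand-ins for whichever TTS and language-detection services are actually used:

```python
import openai
from gtts import gTTS          # stand-in for Google TTS
from langdetect import detect  # stand-in language detection

# 1. Speech-to-text with the Whisper API
with open("utterance.wav", "rb") as f:
    text = openai.Audio.transcribe("whisper-1", f)["text"]

# 2. Chat completion on the transcript
reply = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": text}],
)["choices"][0]["message"]["content"]

# 3. Detect the reply's language and synthesize speech for playback
gTTS(reply, lang=detect(reply)).save("reply.mp3")
```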

2

xGovernor t1_jaksctz wrote

I've been tinkering with DaVinci, but even with turbo/premium, using the gpt-3.5-turbo API requires a credit card added to the account. Excited to fool with it; however, I typically use 2048-4000 tokens on DaVinci 3.

3

Lychee7 t1_jalbr7l wrote

What are the criteria for tokens? The longer and more complex the prompt, the more tokens it'll use?

1

Trotskyist t1_jalk4j5 wrote

A token is (roughly) 4 characters. Both prompt and result are counted.
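You can check exact counts with OpenAI's tiktoken library, e.g.:

```python
import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")
text = "A token is (roughly) 4 characters."
print(len(text), "chars ->", len(enc.encode(text)), "tokens")
```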

5

iTrooz_ t1_jale5ca wrote

I hope the API doesn't have the same restrictions as https://chat.openai.com

1

Stakbrok t1_jam0bpq wrote

You can edit what it replied of course (and then hope it builds off of that and keeps that specific vibe going, which always works in the playground) but damn, they locked it down tight. 😅

Even when you edit the primer/setup into something crazy (you are a grumpy or deranged or whatever assistant) and change some things it said into something crazy, it overrides the custom mood you set for it and goes right back to its ever-serious ChatGPT mode. Sometimes it even apologizes for saying something out of character (and by that it means the thing you 'made it say' by editing, so it believes it said that).

5

ShowerVagina t1_jamyp12 wrote

I might be in the minority, but I strongly believe in unfiltered AI (or a minimal filter, only blocking things like directions to cook drugs or make weapons). I know they filter it for liability reasons, but I wish they didn't.

3

Sea_Alarm_4725 t1_janmlir wrote

I can't seem to find anywhere what the token limit per request is. With davinci it's something like 4K tokens; what about this new ChatGPT API?

1

Bluebotlabs t1_jar58e4 wrote

Doesn't the number of tokens increase exponentially with chat history?

1

minimaxir OP t1_jaru4ch wrote

More cumulative than exponential, but yes.

With the new prices that's not a big deal.

1

Bluebotlabs t1_jarufrq wrote

My mistake, I was confused with the system I was using for chat history lol

1

bdambrosio94563 t1_jb2ct4n wrote

I've spent the last week exploring gpt-3.5-turbo, then went back to text-davinci. (1) gpt-3.5-turbo is incredibly heavily censored. For example, good luck getting anything medical out of it other than 'consult your local medical professional'. It is also much more reluctant to play a role. (2) As is well documented, it is much more resistant to few-shot training. Since I use it in several roles, including Google-search information extraction and response composition, I find it very disappointing.

Luckily, my use case is as my personal companion / advisor / coach, so my usage is low enough that I can afford text-davinci. Sure wish there was a middle ground, though.

1

Akbartus t1_jbs0hkp wrote

Cannot agree. It is not a deal at all. Such a pricing strategy is, in the long term, very profitable for its creators. But it does not matter for those who would like to use it but, due to their financial situation, cannot afford to use such APIs for a long period of time (think of people beyond rich countries). Moreover, 1k tokens can be consumed in a single bit of small talk, in a matter of a few seconds...

1

MonstarGaming t1_jakqs01 wrote

>I have no idea how OpenAI can make money on this.

Personally, I don't think they can. What is the main use case for chatbots? How many people are going to pay $20/month to talk to a chatbot? I mean, chatbots aren't exactly new... anybody who wanted to chat with one before ChatGPT could have, and yet there wasn't an industry for it. Couple that with it not being possible to know whether its answers are fact or fiction, and I just don't see the major value proposition.

I'm not overly concerned one way or another, I just don't think the business case is very strong.

−14

Smallpaul t1_jam83rb wrote

I guess you haven’t visited any B2C websites in the last 5 years.

But also: there is a world model behind the chatbot which can translate between human languages, between computer languages, can compose marketing copy, summarise text...

4

MonstarGaming t1_jap3jzc wrote

>I guess you haven’t visited any B2C websites in the last 5 years.

I have and that is exactly my point. The main use case is B2C websites, NOT individuals, and there are already very mature products in that space. OpenAI needs to develop a lot of bells, whistles, and integration points with existing technologies (Salesforce, ServiceNow, etc.) before they can be competitive in that market.

>can translate between human languages

Very valuable, but Google and Microsoft both offer this for free.

>between computer languages

This is niche, but it does seem like an untapped, albeit small, market.

>can compose marketing

Also niche. That being said, would it save time? Marketing materials are highly curated.

>summarise text...

Is this a problem a regular person would pay to have fixed? The maximum input size is 2048 tokens / ~1,500 words / three pages. Assuming an average person pastes in the maximum input, they're summarizing material that would take them 6 minutes to read (Google says the average person reads 250 words per minute). Mind you, it isn't saving 6 minutes; they still need to read all of the content ChatGPT produces. Wouldn't the average person just skim the document if they wanted to save time?

To your point, it is clearly a capable technology, but that wasn't my argument. There have been troves of capable technologies that were ultimately unprofitable. While I believe it can be successful in the B2C market, I don't think the value proposition is nearly as strong for individuals.

Anyhow, only time will tell.

−3

[deleted] t1_jap8ttt wrote

[removed]

3

MonstarGaming t1_japjnn4 wrote

Nice, nothing demonstrates the Dunning-Kruger effect quite like a string of insults.

For whatever it's worth, that argument is exceedingly weak. I'll let you brainstorm on why that might be. I have no interest in debating someone who so obviously lacks tact.

−2

caedin8 t1_jakcasg wrote

It's exciting to see that ChatGPT's cost is 1/10th that of GPT-3 API, which is a huge advantage for developers who are looking for high-quality language models at an affordable price. OpenAI's commitment to providing top-notch AI tools while keeping costs low is commendable and will undoubtedly attract more developers to the platform. It's clear that ChatGPT is a superior option for developers, and OpenAI's dedication to innovation and affordability is sure to make it a top choice for many in the AI community.

−15