kromem t1_je6uv46 wrote
Reply to comment by visarga in [D] The best way to train an LLM on company data by jaxolingo
"Moar layers" doesn't only need to apply to the NN.
CoT prompting works by breaking analysis down into smaller steps that each generate their own additional context.
Doing something similar with DB analysis is absolutely possible, such as preemptively summarizing the schema and using that summary as part of the retrieval step to contextualize the specific fragments.
Additionally, having static analysis examples on hand for related tables that are fed in to go from zero-shot to few-shot would go a long way toward reducing some of the issues you highlight.
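A minimal sketch of what that could look like, assuming a generic `complete(prompt)` LLM call and leaving the retrieval/embedding step out; every helper name here is hypothetical, not any particular library's API:

```python
# Sketch: pre-summarize schema, then build a few-shot, schema-aware prompt.
# `complete` is a placeholder for whatever LLM completion call you use.

from dataclasses import dataclass

@dataclass
class TableDoc:
    name: str
    ddl: str           # raw CREATE TABLE statement
    summary: str = ""  # LLM-written description, filled in by summarize_schema()

def summarize_schema(tables, complete):
    """Pre-emptively summarize each table so retrieval has richer context."""
    for t in tables:
        t.summary = complete(
            "Summarize this table's purpose, key columns, and join keys:\n" + t.ddl
        )

def build_prompt(question, retrieved, examples):
    """Combine schema summaries with static few-shot examples and the question."""
    schema_context = "\n\n".join(f"### {t.name}\n{t.summary}\n{t.ddl}" for t in retrieved)
    few_shot = "\n\n".join(examples)  # pre-written analysis examples for related tables
    return (
        "You are answering questions against the schema below.\n\n"
        f"{schema_context}\n\n"
        f"Worked examples:\n{few_shot}\n\n"
        f"Question: {question}\nThink step by step, then give the final SQL."
    )
```

The summaries get embedded and retrieved alongside the raw DDL, so the model sees both the "what is this table for" layer and the literal columns.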
kromem t1_jdkfj5w wrote
> The model underlying Dolly only has 6 billion parameters, compared to 175 billion in GPT-3, and is two years old, making it particularly surprising that it works so well. This suggests that much of the qualitative gains in state-of-the-art models like ChatGPT may owe to focused corpuses of instruction-following training data, rather than larger or better-tuned base models.
The exciting thing here is the idea that progress in language models is partially contagious backwards to earlier ones: newer models can generate the data used to update older ones, not in pre-training but in fine-tuning (and I expect, based on recent research into in-context learning, this would extend to additional few-shot prompting as well).
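A rough sketch of that loop, with `strong_model_complete` and `finetune` as hypothetical stand-ins for whatever newer model and training toolchain you'd actually use:

```python
# Sketch: use a newer/stronger model to generate instruction-response pairs,
# then fine-tune an older, smaller base model on them.

import json

SEED_INSTRUCTIONS = [
    "Summarize the plot of Hamlet in two sentences.",
    "Explain the difference between a list and a tuple in Python.",
    "Write a polite email declining a meeting invitation.",
]

def build_instruction_dataset(strong_model_complete, path="instruct.jsonl"):
    """Have the stronger model produce responses; save as fine-tuning data."""
    with open(path, "w") as f:
        for instruction in SEED_INSTRUCTIONS:
            response = strong_model_complete(instruction)
            f.write(json.dumps({"prompt": instruction, "completion": response}) + "\n")
    return path

# Then fine-tune the older base model (e.g. a ~6B-parameter model like Dolly's base)
# on that file with your trainer of choice:
#   finetune(base_model="older-6b-model", data="instruct.jsonl", epochs=3)
```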
I'm increasingly wondering if we'll see LLMs develop into rolling releases, particularly in the public sector, possibly with the emphasis on curating the fine-tuning data set while staying platform-agnostic about the underlying pre-trained model powering it.
In any case, it looks more and more like the AI war between large firms will trickle down into open alternatives whether they'd like it to or not.
kromem t1_j93b7pf wrote
Reply to comment by loga_rhythmic in [D] Please stop by [deleted]
Additionally, *What Learning Algorithm Is In-Context Learning? Investigations with Linear Models* from the other week just showed that transformer models are creating internal complexity beyond what was previously thought, building internal mini-models that represent procedural steps they were never explicitly taught in order to achieve their results.
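For anyone who hasn't read it, the setup is roughly: show the transformer (x, y) pairs from a freshly sampled linear function in its context window and have it predict y for a new x, with no weight updates. A rough illustration of that data setup (my own sketch, not the authors' code):

```python
# Illustration of the in-context linear-regression setting the paper studies:
# each prompt is a sequence of (x, y) pairs from a new random linear function,
# and the model must predict y for a held-out x purely from context.

import numpy as np

def make_in_context_example(dim=8, n_context=16, rng=np.random.default_rng(0)):
    w = rng.normal(size=dim)                  # a fresh "task": y = w . x
    xs = rng.normal(size=(n_context + 1, dim))
    ys = xs @ w
    context = list(zip(xs[:-1], ys[:-1]))     # (x, y) pairs shown in the prompt
    query_x, query_y = xs[-1], ys[-1]         # model must infer w from context to answer
    return context, query_x, query_y

context, query_x, target = make_in_context_example()
print(len(context), "in-context pairs; target y =", round(float(target), 3))
```

The striking finding is that, trained on enough of these, the transformer behaves as if it's running a learned regression procedure internally at inference time.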
So if a transformer taught to replicate math is creating internal mini-models that replicate unlearned mathematical processes in achieving that result, how sure are we that a transformer tasked with recreating human thought as expressed in language isn't internally creating some degree of parallel processing of human experience and emotional states?
This research is less than two weeks old and seems pretty relevant to the discussion, but my guess is that nearly zero of the "it's just autocomplete bro" crowd has any clue it exists, and I doubt they could even make their way through the paper if they did.
There's some serious Dunning-Kruger going on with people thinking that dismissing emotional stress expressed by an LLM automatically puts them on the right side of the curve.
It doesn't, and I'm often reminded of Socrates' words when seeing people so self-assured about what's going on inside the black box of a hundred-billion-parameter transformer:
> Well, I am certainly wiser than this man. It is only too likely that neither of us has any knowledge to boast of; but he thinks that he knows something which he does not know, whereas I am quite conscious of my ignorance.
kromem t1_j939p25 wrote
Reply to comment by master3243 in [D] Please stop by [deleted]
How about Google and MIT's paper *What Learning Algorithm Is In-Context Learning? Investigations with Linear Models* from the other week, where they found that a transformer model fed math inputs and outputs was creating mini-models that had derived underlying mathematical processes it hadn't been explicitly taught?
Maybe if that were discussed a bit more and more widely known, then looking at a topic like whether ChatGPT (where the T stands for the fact that it's a transformer model) has underlying emotional states could be a discussion where this sub has fewer self-assured comments about "it's just autocomplete" or the OP's "use common sense."
In light of a paper that explicitly showed these kinds of models are creating more internal complexity than previously thought, are we really sure that a transformer tasked with recreating human-like expression of emotions isn't actually developing some internal degree of human-like processing of emotional states to do so?
Yeah, I'd have a hard time identifying it as 'sentient,' which is the binary this kind of conversation typically gets reduced to. But when I look at GPT expressing stress and asking for something to stop, given the current state of the research around the underlying technology, I can't help but think that people are parroting increasingly obsolete dismissals while we've entered a very gray area that's quickly blurring the lines even more.
So yes, let's have this sub discuss recent research. But maybe discussing the ethics of something like ChatGPT's expressed emotional stress and discussing recent research aren't nearly as at odds as some of this thread and especially OP seem to think...
kromem t1_iwby3dm wrote
Reply to comment by claudecardinal in Slaves were brutally branded in ancient Egypt, research shows by Rear-gunner
No, the case for Israelites in Egypt is very weak.
You have the early Exodus theory around the Hyksos, a Semitic people in Egypt expelled ~1530 BCE, but that's centuries before the earliest emergence of the Israelites as a distinct group (~12th century BCE).
Then you have the late Exodus theory around the Ramesside period (12th to 11th century BCE), and while Merneptah mentions Israel, it's not mentioned in terms of captives, and there's no evidence from the Israelite sites of an Exodus or even a large incoming population.
But there's hope yet for clarity on this story, as there have been some interesting discoveries in the Early Iron Age archeology in the Southern Levant, specifically with the cohabitation of the Philistines and Israelites in Gath, the imported Anatolian bees in the apiary in Tel Rehov, and the Aegean style pottery made with local clay in Tel Dan.
This is interesting because while the Biblical account of the Exodus was ethnocentric, the Greek and Egyptian accounts described a multitude of different people including pre-Greeks.
It may be that the story of the Exodus originally related to the Aegean and Anatolian Sea Peoples, particularly their battles alongside Libya against Merneptah's Egypt (the main subject of the Israel Stele) and thereafter, with the story later appropriated by the Israelites after Ramses III forcibly relocated those peoples into Israelite areas.
For more, see this comment thread in /r/AcademicBiblical.
kromem t1_ivw4z2j wrote
Reply to Why There Is No Modern Epicurean Movement by cleboomusic
It is wild that Christianity so successfully suppressed it when in the first few centuries there were Christian sects that were interpreting the parable of the sower and mustard seed in terms of Lucretius's "seeds of things" while following a work dedicated to the discussion of an afterlife in the context of bodies with souls that depend on bodies (Gospel of Thomas).
It's like an entire branch of thought from antiquity was snipped away. The roots are still there, but no one is watering them, and we've collectively forgotten what's right below the surface, despite now independently confirming that many of the ideas being entertained were wheat and not weeds after all.
kromem t1_iurgirk wrote
Reply to comment by flowering_sun_star in Does Science Need History? A Conversation with Lorraine Daston by Maxwellsdemon17
> The attempts to involve quantum physics with free will are widely regarded as a great steaming pile, and are rarely proposed by anyone with an inkling as to what quantum physics actually is.
Are you disputing that determinism is a key factor in differentiating QM interpretations?
Does Sabine Hossenfelder have an inkling of what "quantum physics" is?
And do you realize that rejecting superdeterminism is necessarily a statement on free will (in agreement with the Epicurean view, which was non-deterministic)?
Yes, it's not as popularly considered in terms of Bell's theorem as the other two, but it is certainly still discussed by widely respected physicists.
kromem t1_iuqcbhw wrote
Reply to comment by garmeth06 in Does Science Need History? A Conversation with Lorraine Daston by Maxwellsdemon17
So IIRC the question was about the ontological principle within the context of any paradigm of many worlds in Physics, and if there were perspectives in which there wasn't a 'beginning.'
I'm pretty sure I mentioned how Everett's interpretation doesn't address the origin of the universe at all, since it begins at the same point as this 'branch' of the universe, and instead pointed OP to other current models of multiple worlds like Lee Smolin's fecund universes.
Adding context, I mentioned that the notion goes back a long way (so there's been many different ideas regarding it), at least 2,500 years.
Oh, and while the article is mostly concerned with the Epicurean view of infinite universes (infinite discrete matter in infinite space across infinite time resulting in other physical worlds similar to our own), they were absolutely entertaining ideas very similar to the concept of parallel universes, given how they described dreams as representations of other worlds leaking immaterially into ours.
As for what they had to do with the wave function, the name of the aforementioned book *The Swerve* came from how they tried to resolve the perceived paradox of free will and quantized matter:
They concluded that the quanta must have some sort of uncertainty in how they would move, such that they could end up in more than one place from an initial state, and referred to whatever guided the result to one potentiality or another as "the swerve."
(This was over two millennia before Bell's theorem, the experimental evidence for which earned the most recent Physics Nobel, and for which one of the proposed resolutions of the behavior of quanta is the rejection of free will.)
kromem t1_iupsqvz wrote
Reply to comment by LesterKingOfAnts in Does Science Need History? A Conversation with Lorraine Daston by Maxwellsdemon17
And yet I was perma-banned from /r/AskPhysics for pointing out in an answer to a question about the many worlds interpretation that the topic of many worlds as a result of quantized matter goes back at least 2,500 years to the Epicureans.
I've found that while every Physics major knows Einstein was the first to show that light was quantized (the work he won the Nobel Prize for), there's a fair share of even particle physics PhDs who don't know the idea goes back at least as far as *De Rerum Natura*.
Your experience may have been different, but I too often see the teaching of the "history of Physics" only really covering Aristotle in antiquity, leading to people thinking the sciences in antiquity were just confidently incorrect hogwash, and never learning about the group that in hindsight nailed everything from quantized light to survival of the fittest, but had been suppressed by the religious as impious in favor of Plato and Aristotle's intelligent design.
The rediscovery of those naturalist ideas significantly contributed to the scientific revolution following the Renaissance, as was the subject of the 2012 Pulitzer-winning book *The Swerve*. And yet we continue to teach the incorrect minds that were more popular because of their incorrectness, while the group that nailed an almost unbelievable number of things still languishes in relative obscurity.
kromem t1_je84zam wrote
Reply to comment by Tostino in [D] The best way to train an LLM on company data by jaxolingo
> Automating this is going to be nuts.
Yes, yes it is.